Difference between revisions of "How to subtitle offline"
Revision as of 14:48, 18 April 2017
To work offline, you can download subtitles from Amara, use subtitling software on your computer, and finally upload your work to Amara and complete the task. You can translate, transcribe and review subtitles offline when you know you won’t have access to an Internet connection or when you wish to leverage the advanced editing features offered by some free and paid subtitling software. This guide contains a few caveats related to working offline, as well as instructions for using Subtitle Edit, an example of freeware subtitling software available for Windows systems. Aegisub is a subtitle editor with comparable functionality that also offers Mac OS X and Linux versions, complete with excellent documentation.
- 1 How to download subtitles
- 2 How to download the talk
- 3 How to upload subtitles
- 4 Always complete the task in the online editor
- 5 Using SubtitleEdit to subtitle offline
- 6 Other offline subtitling software
How to download subtitles
How to download the talk
For TEDTalks, play the video on Amara, right-click anywhere on the video in the player, and select “Save video as.” Note that videos downloaded from TED.com will be offset by the duration of the TED intro, so the subtitles you download from Amara will not be synchronized. Because of that, for translation, use the video downloaded from the Amara player. TEDx and TED-Ed are hosted on YouTube, and YouTube does not allow users to download videos offline.
How to upload subtitles
Add the missing paragraph breaks when working on TEDTalks
Note that paragraph breaks are not the same as line breaks; line breaks break lines within a single subtitle, and they are preserved when using .srt files. Paragraph breaks are only visible in the transcript view on TED.com; the transcript is manually divided into a few paragraphs for easier reading.
Note: Subtitles in Amara’s native formats (Timed Text / .dfxp or WebVTT / .vtt) will preserve the paragraph breaks on upload. Even if your subtitling software supports vtt/dfxp subtitles, make sure that it can also retain the paragraph-break information. Subtitle Edit fully supports dfxp/vtt (and preserves these paragraph breaks) starting from version 3.4.6.
Always complete the task in the online editor
Using SubtitleEdit to subtitle offline
SubtitleEdit is just one example of free subtitling software that runs on Windows systems. In many cases, it allows advanced users to transcribe and review transcripts and translations more easily than by using the online editor. If you decide to use this software, please bear in mind that it is in no way endorsed by TED or by Amara. Below, you will find advice on how to use the software to subtitle offline. For more detailed instructions, consult the software’s help page.
Installation, settings and features
After installing the latest version of Subtitle Edit, also install VLC media player. This free software is necessary to activate some of Subtitle Edit’s functionality.
For the last bit of initial configuration, go to “Spell check/Get dictionaries,” and download spellchecking dictionaries for all the languages that you work in.
Opening the video
Go to “Video/Open video file” and select the video you downloaded. Afterwards, select “Video/Show/hide video,” if the video player is not visible, and “Show/hide waveform” to pull up the box that will contain the waveform representation of the talk’s audio. Finally, click anywhere in the waveform box to generate the waveform.
Working with the list view
The Duration field in the list view will be highlighted if the subtitle does not meet the reading speed standards. The text field will be highlighted if the subtitle does not meet the length standards (for line length, total subtitle length, or both).
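The two checks behind those highlights can be approximated outside the editor. The following is only a rough sketch, not Subtitle Edit's actual logic: the `Subtitle` class and the `check` function are hypothetical names, the 42 and 84 character limits are taken from later sections of this guide, and the reading speed limit is left as a parameter because the value depends on the standard you follow. Counting spaces but not line breaks is also an assumption.

```python
from dataclasses import dataclass

# Assumed limits, borrowed from later sections of this guide.
MAX_LINE_CHARS = 42
MAX_TOTAL_CHARS = 84


@dataclass
class Subtitle:
    """Hypothetical container for one subtitle."""
    start_ms: int
    end_ms: int
    text: str  # lines separated by "\n"


def reading_speed(sub: Subtitle) -> float:
    """Characters per second, counting spaces but not line breaks."""
    chars = len(sub.text.replace("\n", ""))
    seconds = (sub.end_ms - sub.start_ms) / 1000
    return chars / seconds if seconds > 0 else float("inf")


def check(sub: Subtitle, max_cps: float) -> list[str]:
    """Return the names of the standards this subtitle fails to meet."""
    problems = []
    if reading_speed(sub) > max_cps:
        problems.append("reading speed")
    lines = sub.text.split("\n")
    if any(len(line) > MAX_LINE_CHARS for line in lines):
        problems.append("line length")
    if sum(len(line) for line in lines) > MAX_TOTAL_CHARS:
        problems.append("total length")
    return problems
```

Calling `check(sub, max_cps=...)` with your own reading speed limit returns an empty list for a subtitle that passes all three checks.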
Double-clicking a subtitle in the list view seeks the video and waveform to that subtitle’s position. After selecting two subtitles (using Shift or Ctrl), right-click to merge them. If you’re reviewing subtitles and find that the translator missed one and as a result, left the rest of them unsynced, right-click the subtitle they missed and go to “Column/Insert empty text and shift cells down.” To switch to the next line in the list view, use Alt+Down arrow.
You can extend the duration of a subtitle precisely by editing the value in the “Duration” field under the list view box, or by clicking the up and down arrows next to that field to extend or shorten the duration by the default increment of 100 ms. If the subtitle overlaps adjacent subtitles, this area will also show the degree of overlap. Very often, you’ll be able to make quick reading speed fixes by extending the duration of one subtitle using the arrows until the reading speed is fine, and then going to the next subtitle and adjusting its start time until it doesn’t overlap the previous subtitle that you just extended over it.
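The quick fix described above (extend one subtitle in small steps, then move the next subtitle's start time so the two no longer overlap) can be sketched as follows. This is an illustration under stated assumptions, not a description of how Subtitle Edit works internally: subtitles are plain dictionaries with hypothetical `start`, `end` (milliseconds) and `text` keys, and `max_cps` is whatever reading speed limit you apply.

```python
def chars_per_second(sub):
    """Reading speed of one subtitle, ignoring line breaks."""
    chars = len(sub["text"].replace("\n", ""))
    seconds = (sub["end"] - sub["start"]) / 1000
    return chars / seconds if seconds > 0 else float("inf")


def extend_until_readable(subs, i, max_cps, step_ms=100):
    """Lengthen subs[i] in step_ms increments until its reading speed is
    at or below max_cps, then move the start of the next subtitle so the
    two do not overlap. Modifies the list in place and returns it."""
    if max_cps <= 0 or step_ms <= 0:
        raise ValueError("max_cps and step_ms must be positive")
    current = subs[i]
    while chars_per_second(current) > max_cps:
        current["end"] += step_ms
    if i + 1 < len(subs) and subs[i + 1]["start"] < current["end"]:
        # The next subtitle gets shorter, so its own reading speed
        # should be checked afterwards.
        subs[i + 1]["start"] = current["end"]
    return subs
```

As in the manual workflow, the subtitle that was pushed back may now be too short itself, so it is worth re-checking it after each adjustment.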
Leveraging the waveform
Spellcheck and global fixes
After you've made your edits, do a global spellcheck, and then go over the subtitles once again, making sure the changes made during spellcheck did not extend the text to a degree that would break the subtitle reading speed or length standards.
In “Tools/Fix common errors,” you will find a number of useful fixes. The recommended ones to use for the OTP are “Remove empty lines/unused line breaks,” "Fix short display times," “Remove unneeded spaces,” “Remove unneeded periods,” “Fix missing spaces,” “Remove line breaks in short texts with only one sentence” and “Remove line breaks in short texts (all except dialogues).”
In the “Verify fixes” window, you can click the header of the “Function” column to sort the fix list by type of issue, and uncheck the fixes that you do not want implemented.
“Fix overlapping display times” will usually modify all subtitles, since most TED and TEDx translations and transcripts do not employ a fixed-duration break between consecutive subtitles. Subtitles modified in this manner will show 100% edits in the revision comparison view on Amara, so this fix is not recommended for the reviewing or approval step.
“Fix short display times” is useful in fixing small reading speed errors through extending the subtitle's duration and adapting adjacent subtitles. IMPORTANT: before you use this option, go to Options/Settings/Tools and check "Fix short display time - allow move of start time." Do NOT use “Fix long display times” - instead, split subtitles longer than 7 seconds by hand, depending on where it's best to break the sentence into two subtitles (while respecting clause boundaries).
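Since subtitles longer than 7 seconds should be split by hand, it can help to list them first. The sketch below scans the text of an .srt file for timing lines and reports the ones that exceed the limit; the function names are hypothetical, and the parser is deliberately minimal (it only looks at lines of the form `00:00:01,000 --> 00:00:03,500`).

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def long_subtitles(srt_text, max_seconds=7.0):
    """Return (start_ms, end_ms) for every cue lasting longer than max_seconds."""
    found = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
        if end - start > max_seconds * 1000:
            found.append((start, end))
    return found
```

The reported start times can then be looked up in the list view, where each long subtitle is split at a suitable clause boundary.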
“Break long lines” is again a fix that is better done manually, since a human needs to decide whether a break would split a linguistic unit. However, you can run this test and then go over the results in the “Verify fixes” window, unchecking all the breaks that split grammatical units or proper names, and leaving the ones that happen to be correct. IMPORTANT: if you're working with English subtitles, go to Options/Settings/Tools and check "Use do-not-break-after list (for auto-br)." This will make the suggested breaks follow a set of rules and limit the number of breaks you need to input manually.
How to translate in Subtitle Edit
To begin translating, open the original subtitles in Subtitle Edit, and then select “Tools/Make new empty translation from current subtitle.” Save the newly created set of subtitles. Alternatively, when you’re starting from a set of translated subtitles and want to pull up the original, go to “File/Open original subtitle (translator mode)...” You can also close the original subtitle preview by going to “File/Close original subtitle.”
How to transcribe in Subtitle Edit
If you happen to have access to an offline copy of the video with the talk, you can use Subtitle Edit to transcribe the talk. Use Ctrl-P to play and pause the video, click in the waveform where you want to insert your subtitle (usually where you see that the given utterance begins in the audio representation), and hit F9 to insert a subtitle and start typing. The timing will automatically adjust to a convenient reading speed below 15 characters/second; you can manually adjust it by dragging or clicking in the waveform (Ctrl-left click, Shift-left click). Other useful shortcuts include: F4 to skip between the beginning and end of the subtitle’s duration (skip to the end, insert the next subtitle) and F5 (play the video from the beginning of the current subtitle and pause a little after it ends displaying).
After you’re done transcribing, you can improve your work by selecting “Tools/Minimum display time between subtitles…” (use 24 ms) and then “Tools/Bridge gap in durations…” (use 100 ms). These edits are not required while working on Amara, but they improve the transcript by making it easier to follow the subtitles (the little “flicker” signals that a new subtitle is going to appear) and making them safely compatible with various offline players (without any breaks between subtitles, some players will display consecutive subtitles at the same time).
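To make the effect of these two steps concrete, here is a rough sketch of one plausible reading of them. It is not Subtitle Edit's actual algorithm: the function names and the dictionary layout (`start` and `end` in milliseconds) are assumptions made for illustration. The first function makes sure at least 24 ms separate consecutive subtitles; the second closes pauses shorter than 100 ms by extending the earlier subtitle while keeping that minimum gap.

```python
def enforce_minimum_gap(subs, min_gap_ms=24):
    """Shorten a subtitle that ends less than min_gap_ms before the
    next one starts, so a brief flicker separates the two."""
    for current, following in zip(subs, subs[1:]):
        latest_end = following["start"] - min_gap_ms
        if current["end"] > latest_end:
            current["end"] = max(current["start"], latest_end)
    return subs


def bridge_small_gaps(subs, max_gap_ms=100, min_gap_ms=24):
    """Where the pause between two subtitles is shorter than max_gap_ms,
    extend the earlier subtitle so that only min_gap_ms remains."""
    for current, following in zip(subs, subs[1:]):
        gap = following["start"] - current["end"]
        if min_gap_ms < gap < max_gap_ms:
            current["end"] = following["start"] - min_gap_ms
    return subs
```

Run in this order, the two functions leave every pair of neighbouring subtitles either exactly 24 ms apart or separated by a pause of at least 100 ms.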
How to review/approve in Subtitle Edit
Download the translation or transcript and the video. For translations, also download the original subtitles. In addition to all of the other global fixes and general features of Subtitle Edit described above, there are some features that are especially useful during the review/approval stage:
- Do a global spellcheck: Subtitle Edit offers a global spellcheck under "Spell check." To download a spellchecking dictionary for your language, go to Spell check/Get dictionaries...
- Globally fix reading speed problems: Go to Options/Settings/Tools, and check "Fix short display time – allow move of start time." Then, running the fix at Tools/Fix common errors/Fix short display times will allow you to fix most reading speed problems globally by extending the duration of the affected subtitle and moving the following subtitle ahead a little if necessary.
- Merge short consecutive subtitles: Subtitles are easier to read and to translate into other languages when sentences aren't split up into too many parts. You should not have the end of one sentence and the beginning of another in the same subtitle, and you should always make sure that there isn't a reason for splitting up the sentence (e.g. the speaker is pausing for dramatic effect). Bearing that in mind, you can go to Tools/Merge short lines and set the number under "Maximum characters in one paragraph" to 84, and then click anywhere in the window to refresh the list of results with subtitles that can be merged.
- Merge identical subtitles: Go to "Tools/Merge lines with same text..." to merge identical subtitles. Some translators may duplicate text in subtitles instead of merging them on Amara, and this feature will automatically merge them for you. While using this feature, increase the maximum break between subtitles to 500 ms, to allow the software to detect duplicate subtitles with longer breaks in between.
- Unbreak unnecessarily broken subtitles: Some subtitles may be broken into two lines even though the subtitle's total length is not over 42 characters. You can unbreak them by running two fixes at Tools/Fix common errors: "Remove line breaks in short texts with only one sentence" and "Remove line breaks in short texts (all except dialogues)."
- Globally insert semantic line breaks (in English): If you're working on English subtitles, go to Options/Settings/Tools, and check "Use do-not-break-after list (for auto-br)." The fix at Tools/Fix common errors/Break long lines will break most lines "semantically," based on the list of words after which the line should not be broken (like "the" or "of"). You will still need to review the suggested line breaks before you implement the fix, as some lines will need to get broken manually.
- Globally fix closing-quote punctuation (in English): In American English (where double quotes are used), periods and commas should be placed before the closing quote, while in British English (single quotes), they should follow the closing quote. You can globally fix mistakes regarding this by going to Edit/Replace (or hitting Cmd/Ctrl-h) and replacing all the instances of the mistake (e.g. changing ". to ."). Note: in the OTP, volunteers can use either British or American English rules, as long as they are consistent. Respect the choice made by the person who created the subtitles you are working on. To learn more, see the English Style Guide.
- Run fixes for other common mistakes: Including the fixes described in detail above, the fixes in Tools/Fix common errors that are useful when reviewing or approving OTP transcripts or translations are: Remove empty lines/unused line breaks, Fix short display times, Remove unneeded spaces, Remove unneeded periods, Fix missing spaces, Break long lines, Remove line breaks in short texts with only one sentence, Remove line breaks in short texts (all except dialogues) and Fix double apostrophe characters ('') to a single quote (").
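As an illustration of one item in the list above, merging identical subtitles, the following sketch joins consecutive subtitles that carry the same text when the break between them is at most 500 ms. It only shows the idea and does not reproduce Subtitle Edit's implementation; the function name and the dictionary keys (`start`, `end`, `text`) are hypothetical.

```python
def merge_identical(subs, max_gap_ms=500):
    """Merge consecutive subtitles with the same text when the break
    between them is no longer than max_gap_ms. Returns a new list."""
    merged = []
    for sub in subs:
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and previous["text"].strip() == sub["text"].strip()
            and sub["start"] - previous["end"] <= max_gap_ms
        ):
            # Same text shortly after: stretch the earlier subtitle
            # to cover the duplicate instead of keeping both.
            previous["end"] = max(previous["end"], sub["end"])
        else:
            merged.append(dict(sub))
    return merged
```

With a smaller `max_gap_ms`, duplicates separated by longer breaks are left alone, which is why the list above suggests raising the value to 500 ms.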
Other offline subtitling software
There is a wide variety of freeware offline subtitling software to choose from. In each case, please do a test to make sure the software properly supports Amara's TimedText/.dfxp or WebVTT/.vtt format (necessary to use when working with TEDTalk translations, in order to preserve the paragraph divisions). Even if the description of the features of the tool does indicate TimedText support, it is a good idea to download some dfxp subtitles from Amara (e.g. using a TED Talk transcript), make a few edits, save the file and then compare the original and the edited version. If you notice unexpected timing differences, or if Amara doesn't accept the file you created when you attempt to upload it, this may indicate that the software you chose does not fully support the format used on Amara. All subtitling software should support SubRip/.srt subtitle files, which is fine for working with TEDx and TED-Ed videos.
- Aegisub - see this excellent documentation page
- Gnome Subtitles
- Jubler - see this guide
- Subtitle Editor
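The round-trip test suggested at the start of this section (download a dfxp file, edit it, save it, compare) can be partly automated. The sketch below is a minimal example that assumes each subtitle is a `p` element carrying `begin` and `end` attributes; the function names are hypothetical. It compares the raw attribute strings, so a tool that merely rewrites timestamps in a different notation will also show up as a difference, and it does not check paragraph divisions.

```python
import xml.etree.ElementTree as ET


def cue_timings(dfxp_text):
    """Return (begin, end) attribute pairs for every <p> element,
    ignoring whichever XML namespace the file uses."""
    root = ET.fromstring(dfxp_text)
    timings = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "p":
            timings.append((element.get("begin"), element.get("end")))
    return timings


def timing_differences(original_text, edited_text):
    """List cues whose timing changed between the two files, plus a
    final entry if the number of cues differs."""
    before = cue_timings(original_text)
    after = cue_timings(edited_text)
    diffs = [
        (index, old, new)
        for index, (old, new) in enumerate(zip(before, after))
        if old != new
    ]
    if len(before) != len(after):
        diffs.append(("cue count", len(before), len(after)))
    return diffs
```

An empty result for a file in which you only edited text suggests that the software left the timing untouched; it is still worth uploading the file to Amara to confirm that it is accepted.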