Back to the source is my first feature project, so there are a lot of things in it that I’m doing for the first time. But not knowing how to do something has rarely been an obstacle for me; in fact, solving technical problems is something I enjoy quite a bit. Case in point: foreign language interviews. In this post I will explain the method I use to prepare them for editing.

Note that this is not a tutorial and I’m making the assumption that you’re fairly comfortable manipulating text files.

In the documentary, I have three interviews that aren’t in English: one in French, one in Polish and one in Hungarian. The question was “How the hell am I going to edit this?” I speak French, so I can deal with that one pretty easily, but what about the other two? I obviously had a translator on location so that I could actually understand my interviewee, and the translation was recorded, but to make a decent edit I need something much more accurate than that. I need to know exactly what the person is saying, and when.

The basic idea is to create a version of the interview with subtitles and use this to edit. After the edit is locked, I replace the subtitled version with the original clip, and voilà. I had no idea at the time whether this would actually work, but it was the only idea I had, so it was worth trying.

Normally, when you make subtitles, you would take the transcript of the video, translate it, break it into chunks and display those chunks at specific times. This wouldn’t do for me.

Firstly, it takes ages, and I was looking for a way of accelerating the process. But more importantly, this wasn’t good enough for editing: different languages put words in different orders, so a proper translation wouldn’t be any help if I needed to cut in the middle of a sentence for whatever reason.

For a line of subtitles, you need two pieces of information: when to start displaying the line (the start timecode) and when to stop displaying it (the end timecode). That’s when the penny dropped: a sequence is basically a series of start and end timecodes from one or several source clips, so I thought “What if I could use the editing tools in Premiere to generate my timecodes?”
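To make that concrete, a single subtitle line boils down to something like this (a sketch with made-up values):

```python
# One subtitle line: the source timecode where it starts being displayed,
# the timecode where it stops, and the text to show (made-up values).
subtitle = {
    "start": "01:02:10:04",
    "end": "01:02:12:19",
    "text": "When I was a child, my grandfather...",
}
```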

Generating timecodes

Basically, you want to put your interview on a timeline and add cuts that will give you the start and end timecodes for your subtitles.

How often should you cut? As often as reasonably possible. Remember that the point of this exercise is to create an edit-friendly clip, not subtitles aimed at the viewer. The subtitling needs to be a pretty much word-for-word transcription/translation of what’s being said. But how do you know where to cut when you don’t understand what’s being said?

[Image: slices]

You can use the audio waveform to find pauses in the speech.

Regardless of the language, pretty much nobody stops talking in the middle of a word unless they’re interrupted. So you can safely assume that every time they stop talking or make a pause, you can add a cut. You might be cutting in the middle of a sentence, but that’s OK. Even if you don’t understand the language, with a bit of practice you’ll be able to guess when the person is hesitating, repeating words, or saying other things you’re not really interested in, and you’ll learn to cut these out.

[Image: sliced]

Once the slicing is finished, the clip should look like it’s been through a meat slicer. You might end up with 200+ cuts in your timeline, each slice representing a portion of the original file, with a reference to the start and end timecodes in the original file it comes from. All that’s left to do is extract these timecodes. This is the trickiest bit.

There aren’t many ways to export a sequence in text format, but EDL is a fairly simple one that contains the information we need in an easily accessible way. The fiddly part is removing all the information we don’t need. This might sound daunting, but if you know your regular expressions, you should be able to make quick work of this step.

[Image: edl]

We’re only interested in the first two timecodes on the V tracks.
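To give you an idea, here is roughly how that extraction could be scripted. This is a minimal sketch in Python rather than exactly what I did: the file names are placeholders, it assumes a CMX3600-style EDL with straight cuts only, and you’ll want to sanity-check the result against your own export.

```python
import csv
import re

# Matches timecodes like 01:00:05:12.
timecode = re.compile(r"\d{2}:\d{2}:\d{2}:\d{2}")

rows = []
with open("interview.edl", encoding="utf-8") as edl:  # placeholder file name
    for line in edl:
        fields = line.split()
        # Event lines start with the event number; the third field is the
        # track ("V", "AA/V", ...). The first two timecodes on the line are
        # the source in and out points we're after.
        if len(fields) >= 8 and fields[0].isdigit() and "V" in fields[2]:
            source_in, source_out = timecode.findall(line)[:2]
            rows.append((source_in, source_out))

# Write the semicolon-delimited spreadsheet to hand to the translator,
# with empty columns for the transcription and the translation.
with open("timecodes.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out, delimiter=";")
    writer.writerow(["start", "end", "transcription", "translation"])
    for source_in, source_out in rows:
        writer.writerow([source_in, source_out, "", ""])
```

The output is essentially the translation spreadsheet described below, with the text columns left blank for the translator to fill in.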

I’m sure it’s possible to write a plugin for Premiere that would generate a file with only the timecodes we need, but for this project it wasn’t worth my time trying to write one. Once this is done, you can easily put the timecodes in a spreadsheet that you’re going to give to whoever is going to do the translation, along with an unsliced version of the clip with a timecode overlay.

[Image: translation_spreadsheet]

I always ask for a transcription as well as a translation so that I can try to identify the words to help refine my edits.

Note that as the timecodes might cut through a sentence, you might get a translation in broken English, but that’s okay for the purpose it serves. Once the translation is done, you can create two CSV files, one for the translation and one for the transcription. They should look like the example below:

[Image: text_csv]

I’m actually using a semicolon as the delimiter, as the text might contain commas.
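For illustration, a few lines of the translation file might look something like this (the timecodes, the text, and even the exact column layout are made up; match whatever your import script expects):

```
start;end;text
01:02:10:04;01:02:12:19;When I was a child, my grandfather
01:02:12:20;01:02:15:08;used to take me to the river to fish
```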

Now that you have your subtitle files, it’s time to add them to the interview.

Importing the subtitles

A bit of research on the web led me to this blog post by August Bering, whose script creates animated text in After Effects from an .SRT subtitle file. As I’m using a much simpler format, I modified his script to accommodate it and made a few improvements. You can download my version of the script here, and you’re free to use it for all your projects (do so at your own risk though).

[Image: result in After Effects]

After running the CSV files through the script, you end up with an After Effects composition with perfectly synchronised subtitles. You can then simply import that composition into Premiere and use it to edit. However, as it’s a relatively complex composition (what with its hundreds of keyframes), I would actually recommend rendering the video before using it in Premiere.

[Image: result in Premiere]

And that’s about it. Without proper tools, this workflow feels a bit MacGyverish and the slicing step can be a bit of a chore (it took me a couple of hours to slice 30 minutes of interview), but it proved to be a much faster way to create the subtitles than writing down timecodes by hand.

What about you? What is your workflow for editing foreign language interviews? If you know more efficient ways to do it, please leave a comment below.