Thoughts on Contextual Line Breaking

Closed captions are an essential part of video production. Whether the media that needs closed captioning is a web video, a documentary film, a training video for a business, or a television show, it’s an unfinished product unless it has closed captions.

Closed captions are more than a means for people who are Deaf or hard of hearing to experience the video. They also make the video more accessible to people who don’t speak English fluently, to viewers in environments where hearing the audio isn’t possible, and to search engines crawling the Internet. High-quality closed captions increase the amount of information the viewer retains after watching the video. In many cases, closed captions are required by federal regulation.

But not all closed captions are created equal. For professional-quality productions, using automatic or cheap closed captions is like shooting on 35mm film but recording the audio with a cell phone. It just doesn’t make sense to invest the time, effort, money, and talent to make a video, only to add closed captions riddled with timing errors, misspelled names, missing words, and the other well-documented problems that come from automatic captions or from captioning companies that outsource their labor to sweatshops in Third World countries.

For some people, closed captions are an afterthought. If you can clearly hear the soundtrack of the program you’re watching, and you have no questions about what is being said, you might not have any reason to pay attention to the closed captions. But if you need closed captions, you will notice the differences in quality from one source to another.

Done correctly, closed captions match what is being voiced. They are verbatim, with every word properly spelled, including the names of people, chemicals, businesses, and so on.

The captions are formatted so they are easy to read, they appear on-screen at the correct time, and they stay up long enough to be read.

If our aim as captioners is equivalency, then let’s take a look at one aspect of how we interact with closed captions: how the text is broken into display lines and phrases.

Two approaches can be taken. The first is to break lines automatically at a fixed character count, determined by the required output format. The second is to break lines contextually, while keeping an eye on the character limit.
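To make the first approach concrete, here is a minimal Python sketch of a purely mechanical break. The function name `hard_break` and the 32-character default are assumptions chosen for illustration, not part of any particular captioning tool.

```python
def hard_break(text: str, width: int = 32) -> list[str]:
    """Cut text into chunks of `width` characters (the last chunk may be shorter).

    Nothing here looks at words, so a cut can land in the middle of one.
    """
    return [text[i:i + width] for i in range(0, len(text), width)]


sample = "We had to come up with new techniques and novel ways to do analysis"
for line in hard_break(sample):
    print(repr(line))
```

Where the cuts fall depends on every character in the input, including spaces and punctuation, which is why this approach splits words so freely.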

Let’s look at the same text formatted in different ways, each limited to 32 characters per line. First, broken strictly by character count:

We had to come up with new tech
niques and novel ways to do anal

ysis of the chemical signature,
said Mr. Ojha, the lead author

of the Nature Geoscience article
.

You might notice that some words are broken awkwardly, and the final period lands on a line of its own, making the text harder to read. Let’s see it without any words being broken, while still staying within our 32-character limit. Bear in mind that each two-line caption is displayed on its own.

We had to come up with new
techniques and novel ways to do

analysis of the chemical
signature, said Mr. Ojha, the

lead author of the Nature
Geoscience article.
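The word-aware breaks above are what a greedy wrap produces: keep adding words to a line until the next word would exceed the limit. As a sketch, Python’s standard `textwrap` module behaves this way. The helper name `two_line_captions` and the pairing of lines into two-line captions are assumptions for illustration.

```python
import textwrap

TEXT = (
    "We had to come up with new techniques and novel ways to do analysis "
    "of the chemical signature, said Mr. Ojha, the lead author of the "
    "Nature Geoscience article."
)


def two_line_captions(text: str, width: int = 32) -> list[list[str]]:
    """Greedy word wrap, then group the lines into captions of up to two lines."""
    lines = textwrap.wrap(text, width=width)
    return [lines[i:i + 2] for i in range(0, len(lines), 2)]


# Print each caption as its own block, separated by a blank line.
for caption in two_line_captions(TEXT):
    print("\n".join(caption))
    print()
```

No word is split, but the wrap only counts characters, so it has no way to tell that "said Mr. Ojha" belongs together or that "the" should not be left dangling at the end of a caption.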

This word-aware version is definitely better than the first example, but it is not quite ideal. Let’s try to break up the text into captions that more clearly convey what’s being said.

We had to come up with
new techniques and novel ways

to do analysis of the
chemical signature,

said Mr. Ojha,
the lead author

of the Nature
Geoscience article.

In this approach we see natural phrasing while still adhering to the 32-character line limit.
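Breaks chosen by hand can still be checked mechanically against the format’s limits. The sketch below is one way to run that check. The name `check_captions`, the two-line maximum, and the 32-character default are assumptions for illustration.

```python
def check_captions(captions, width=32, max_lines=2):
    """Return a list of problems found; an empty list means every caption fits."""
    problems = []
    for number, caption in enumerate(captions, start=1):
        if len(caption) > max_lines:
            problems.append(f"caption {number}: {len(caption)} lines (max {max_lines})")
        for line in caption:
            if len(line) > width:
                problems.append(f"caption {number}: {line!r} is {len(line)} characters")
    return problems


# The hand-broken captions from the example above.
HAND_BROKEN = [
    ["We had to come up with", "new techniques and novel ways"],
    ["to do analysis of the", "chemical signature,"],
    ["said Mr. Ojha,", "the lead author"],
    ["of the Nature", "Geoscience article."],
]

print(check_captions(HAND_BROKEN))  # [] because every line is within 32 characters
```

The check only confirms the limits. Deciding where a phrase naturally ends remains the captioner’s job.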

Line breaking for closed captioning is not about character counts; it is about context.

It’s about breaking up the line as a human being would naturally read it.

Be a human, be a hero. Say no to the robot. Take the time and break the line correctly.


Update: For the video of “Rapid Roy,” performed by Jim Croce, we created two separate caption tracks. Both can be found by clicking the gear icon and then Subtitles.

The first track is broken to follow Jim Croce’s speech pattern, and the second is broken for context.


Closed captioning for everyone, The Closed Captioning Project LLC.