Tag Archives: transcription

Autocaptioned YouTube Videos

An example of an autocaptioned YouTube video. It IS time.

Almost a year ago, I wrote about advances in captioning technology and how they could make online video exponentially more useful. Captions are obviously very important for people with hearing impairments, but they can also be useful for students studying a language.

At the time, work was being done on automating the process, because transcribing the text and then synchronizing it takes several hours for each hour of video. Manually captioning all of the video that exists online, or even all that is currently being created, is simply not possible. What a difference a year makes.

YouTube recently announced the addition of an automatic captions feature. The announcement was picked up by Mashable and echoed through the Twittersphere. My first reaction was, “Finally!” followed by the question, “I wonder how accurate it is.”

Ken Petri, Program Director for the Ohio State Web Accessibility Center, addressed these concerns in an email to the OSU Exploring Learning Technologies community:

If you have ever seen the results from Google Voice’s automatic transcription you know they are usually not perfect. For an educational context, a perfect or close to perfect transcript is usually necessary. This and the fact that most of you will not have access to the automated transcription feature in YouTube means that, while it is an exciting announcement, it is not a panacea.

Fortunately, you can opt to upload your own transcript and have YouTube auto-align it to your video. If the video is scripted (as opposed to improvised), it can be easy to obtain a transcript. Transcribing a video can take a long, long time, so automating even one step in the process is helpful.

In general, every step that is automated will increase speed and efficiency while lowering costs, but will also introduce inaccuracies. As each step is improved, we will get closer and closer to the goal of captioning every online video.

Filed under Resources

Searchable Video – Enter the Dragon

I caught the tail end of the monthly Exploring Learning Technologies community meeting recently and became intrigued by the topic: accessibility. (Full disclosure: I’m part of the committee that plans these meetings.)

This is an important topic because, as more and more educational video is put online for class use (lectures, for example), accessibility becomes a greater issue. The more I heard, the more I thought about how processes like captioning video can be helpful for second language learners as well as people with hearing impairments. So, in general, the conclusion was that it’s good to make captioning part of your practice if you post video online. The most difficult part of this process, obviously, is transcribing the text.

There are several ways to do this but, in general, you will need to pay someone to listen and type. Whether you hire someone yourself or use an online service, the cost increases with the accuracy and speed of the transcription.

Dragon speech recognition software.

An interesting alternative incorporates Dragon speech recognition software. Unfortunately, you can’t just have this software listen to the video and produce a transcript. Background noise and other issues make this impossible. But you can have someone watch the video and repeat the transcript for the software. In effect, this someone becomes a biological interface between two digital entities! For a moment, I was distracted by images from the Matrix movies, in which machines use humans disposably, but then I started to realize the most useful feature of captioned video: searchability.

If you want to find a phrase in your favorite movie, you likely have to guess where it is and then skip forward and backward until you find it. This is difficult. Now imagine looking for the same phrase in a movie you have never seen. Or searching a dozen movies. By searching a text transcript that is linked to the timeline of the movie, it would be extremely easy to find the phrase. Looking for “classroom technology”? The phrase is used at 03:58 and again at 17:22.
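To make the idea concrete, here is a minimal sketch in Python of what searching a time-linked transcript could look like. Everything in it is an assumption for illustration: the caption data is made up, the `(start_seconds, text)` layout is just one simple way to store captions, and `find_phrase` and `format_timestamp` are hypothetical helper names, not part of YouTube or any captioning tool.

```python
from typing import List, Tuple

# Hypothetical caption format: (start time in seconds, caption text).
Caption = Tuple[int, str]


def format_timestamp(seconds: int) -> str:
    """Turn a number of seconds into an MM:SS string."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def find_phrase(captions: List[Caption], phrase: str) -> List[str]:
    """Return the timestamp of every caption that contains the phrase."""
    needle = phrase.lower()
    return [
        format_timestamp(start)
        for start, text in captions
        if needle in text.lower()  # case-insensitive substring match
    ]


# Made-up sample data for illustration only.
captions = [
    (12, "Welcome to today's lecture."),
    (238, "Let's talk about classroom technology and how we use it."),
    (1042, "That brings us back to classroom technology."),
]

print(find_phrase(captions, "classroom technology"))  # ['03:58', '17:22']
```

This simple version only finds a phrase that sits entirely inside one caption. A phrase split across two consecutive captions would be missed, so a more careful search would need to join neighboring captions before matching.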

This process is costly and labor-intensive now, but eventually, whether speech recognition software is able to scan video and automatically transcribe speech accurately, or there is an offshore matrix of Borg-like transcribers scouring YouTube, all video will be transcribed in a searchable way. This will make video useful and accessible in the same way that the Internet has made texts useful and accessible. And we’ll look back and say, why did we wait so long to do this?


Filed under Inspiration