2001 Conference Proceedings



Greg Banks
1310 13th St.
Menomonie WI 54751

Dick Banks
121 3rd St. W
Menomonie WI 54751

Norman Coombs
590 Harvard St.
Rochester NY 14607


The use of multimedia as a means of disseminating information on the Web has exploded in the past three years. Audio, video, animation, and narrated PowerPoint presentations are just some of the wide variety of media options. Educators are now beginning to consider seriously using various forms of media delivery to enhance the educational experience of their students. Many people with disabilities, however, find themselves left out of the multimedia experience because developers do not include transcriptions, audio descriptions, and captioning in their work.

This presentation will discuss EASI's efforts to ensure that all of its weekly webcasts and multimedia workshop materials are transcribed or captioned for Web delivery. Although the process is complex, nearly anyone can do it. With the World Wide Web Consortium's adoption of SMIL (Synchronized Multimedia Integration Language), developers are now able to accommodate people with disabilities. SAMI (Synchronized Accessible Media Interchange), developed by Microsoft, is another method of transcription and captioning for Web delivery.

The cost of producing accessible media can be prohibitive for many developers who might otherwise consider rendering their multimedia accessible. EASI has developed a method of creating transcriptions and captions that can be used by anyone willing to invest the time it takes to learn.

Problems in Creating Accessible Multimedia

When audio and video became a reality on the Web, no consideration was given to people with disabilities. There was no way for developers to caption for the Internet, and no language permitted streaming text alongside audio or video. Player plugins such as RealPlayer and Windows Media Player were not capable of rendering more than audio or audio/video.

Captioned media made for VCR videos might seem to be a simple answer for media delivered on the Web. The problem with captions made for VCR delivery is a universal one: all Internet media methods compress audio/video so that file sizes remain manageable, and reducing file size sacrifices quality. In many cases, captions from VCR videos are nearly impossible to read after the video has been encoded for Internet delivery, even at the highest bandwidth encoding.

Solutions in Creating Accessible Multimedia

The World Wide Web Consortium developed a language called SMIL, which allows media to be streamed in a parallel or sequential way. The parallel method lets the developer stream more than one track of media at the same time. If you have a video with an audio track, you can create a text file of the audio and stream the text in parallel with the video. The end user is presented with what appears to be one stream of information. In reality it is two separate files controlled by a SMIL file, which gives the plugin the information it needs to deliver and synchronize the media files.
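As a sketch, the parallel arrangement described above can be written in SMIL roughly as follows. The file names, region ids, and dimensions here are illustrative assumptions, not EASI's actual files:

```xml
<smil>
  <head>
    <layout>
      <root-layout height="163" width="239"/>
      <!-- region for the video picture -->
      <region id="video_region" top="0" height="120" width="160"/>
      <!-- region for the caption text, placed below the video -->
      <region id="text_region" top="120" height="43" width="214"/>
    </layout>
  </head>
  <body>
    <!-- <par> plays its children at the same time -->
    <par>
      <video src="interview.rm" region="video_region"/>
      <textstream src="captions.rt" region="text_region"/>
    </par>
  </body>
</smil>
```

A `<seq>` element, by contrast, would play its children one after another. The player plugin reads this one control file, then fetches and synchronizes the two media streams itself.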

There are many commercial products that create SMIL files for displaying media events on the Web, but none of them facilitates streaming text that would serve as captions for the deaf. To date there is only one software program that will insert time codes for streaming text captions: MAGpie, developed by the accessible media team at WGBH in Boston with funding from the Trace Research Center.

MAGpie takes a text file of a transcription and allows the developer to insert time intervals into the transcribed text. You can then preview the results of the timing against the audio or video. When you are satisfied with the results, MAGpie will save the streaming text file in a number of popular media formats, including RealMedia, WindowsMedia and QuickTime. This presentation will show the process and end result of using MAGpie to create captioned media.

The EASI Process

EASI has been delivering a weekly Webcast for the past two years. Nearly all of these Webcasts are audio interviews. We transcribe these Webcasts and deliver them as HTML files linked from the weekly Webcast page. This allows deaf or hearing-impaired users to access the interview.

The audio portion of each video is transcribed first and then formatted for importing into MAGpie. All transcriptions are done by listening to the audio and using voice recognition to turn it into ASCII text. By plugging a small earphone into a tape recorder or the audio output of a VCR, you can listen to the audio/video and simultaneously speak into the dictation headset. As you might imagine, mastering this process takes time, a lot of concentration and dedication.

Once the transcription has been completed, the ASCII text files need to be spell-checked and formatted so that the margins are compatible with streaming. The text must fit into a small window.

MAGpie is the software that helps the multimedia designer synchronize the text and the video. MAGpie displays the video in a pop-up window on the computer monitor, with another window showing the prepared text. The designer watches the video with one eye and the text with the other. When the video reaches the point where new text is required for the new pictures, the designer presses a keystroke, which inserts a timing stamp. The designer is now working with three files: the video is one, the text for captioning is another, and the SMIL file containing the directions that control the other two is the third. As simple as this sounds, it takes experience to feed the new text and synchronize it with the video. In fact, the designer must have watched the video more than once beforehand to be able to mark the text at just the right spot. Sitting with fixed attention for a prolonged period while concentrating on both the text and the video requires considerable patience.

Internet captioning provides a wide variety of options for the captioned text. The designer can adjust font type, font size, font style, and font and background colors. This enhances the accessibility of the captions: deaf users may also have low vision or a learning disability, and this control over the font makes it possible to meet the needs of someone with multiple disabilities.

The examples that follow show the makeup of a media SMIL file and the streaming text file. Comments are placed in the examples and are prefaced with the words AUTHOR'S COMMENT.

SMIL Example

The Media Control File

<root-layout height="163" width="239"/>
AUTHOR'S COMMENT: This is the overall height and width of the media presentation window.
<region id="rgtthng1_Region" left="39" top="0" height="120" width="160" z-index="0"/>
AUTHOR'S COMMENT: This identifies the video region of the media presentation.
<region id="rgthng1_Region" left="25" top="120" height="43" width="214" z-index="0"/>
AUTHOR'S COMMENT: This is the region where the streaming text appears.
</layout>
<video id="rgtthng1" src="rgtthng1.rm" region="rgtthng1_Region" system-bitrate="70529"/>
<textstream id="rgthng1" src="rgthng1.rt" region="rgthng1_Region" system-bitrate="212"/>
AUTHOR'S COMMENT: These elements point to the actual files that make up the presentation.

SMIL Example

The Streaming Text File

<window bgcolor="000000" wordwrap="true" duration="00:10:25.35">
<font size="1" face="Arial" color="#FFFFFF">
AUTHOR'S COMMENT: This portion of the streaming text file gives directions to the player plugin on how the text is to be presented and the length of the entire presentation.

<time begin="00:00:00.00"/><clear/>
My phrase for the pain of having
to keep up on things, we call it

AUTHOR'S COMMENT: Each chunk of text has a starting and ending point. These chunks of text are timed with the speaking in the video.

<time begin="00:00:05.11"/><clear/>
upgrade fatigue syndrome, which I
suffer from greatly. So these are the

<time begin="00:00:12.88"/><clear/>
list of things that we plan for you to
be able to take away with you today.

<time begin="00:00:18.56"/><clear/>
We are going to focus a lot today on
background material. There are

Now that the Internet has gone multimedia, it has become a source of frustration for the Deaf. It is presently difficult to find any multimedia on the Web that provides captions. People have been slow to wake up to this important need. Even distance learning programs at schools and colleges have neglected to provide captions, in spite of captioning being mandated by Federal legislation.

While these skills can be learned and mastered, they take time and commitment. Institutions that must provide a lot of captioned material will be able to commit the resources to produce Internet-captioned media themselves. Schools and colleges doing Internet captioning only occasionally may benefit from outsourcing the project. Because EASI captions material regularly, we do our own, and we are also ready to provide our resources and services to educational and non-profit institutions.

To learn about EASI's Internet captioning service, go to http://www.rit.edu/~easi/caption.htm


Reprinted with author(s) permission. Author(s) retain copyright.