2000 Conference Proceedings


Advances in Accessible Web-based Multimedia

Geoff Freed
Project Manager, Web Access Project
CPB/WGBH National Center for Accessible Media
WGBH Educational Foundation
125 Western Ave.
Boston, MA 02134
Phone: 617 300-4223
FAX: 617 300-1035
Email: geoff_freed@wgbh.org

The CPB/WGBH National Center for Accessible Media (NCAM) has made great progress in the past year towards making different types of multimedia accessible to deaf and blind users. NCAM has developed techniques which apply broadcast-based accessibility technologies, including closed captions and audio descriptions, to the Web. This paper will focus on three of those methods, using Apple's QuickTime (TM) software, the World Wide Web Consortium's (W3C) Synchronized Multimedia Integration Language (SMIL), and WGBH's MAGpie caption-authoring software.

QuickTime and Captioning

Apple's QuickTime 4.0 and the QuickTime Player allow captions and descriptions to be added to a movie using either a Macintosh or PC. A QuickTime movie is made up of separate video and audio tracks. Currently, only Apple's multimedia players (MoviePlayer version 2.1 or higher, or the QuickTime Player) allow the user to toggle these media tracks on and off. Because the tracks are discrete, a movie may have multiple audio and video tracks, any number of which may be selected by the user. For example, a user can select the appropriate language track at the time of playback.

In addition to video and audio tracks, one or more text tracks may be included with the clip. A text track becomes, for access purposes, a caption track, but it can also provide foreign-language subtitles or even serve as a search index keyed by keywords. If the user views the movie clip directly from the Web site using streaming software, the caption track is open; that is, it can't be turned off. However, if the clip is downloaded and played locally using either the QuickTime Player or MoviePlayer, the caption track may be toggled on or off, thus simulating closed captions. If the clip is downloaded and played using any other multimedia player, the captions remain open.
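As a rough illustration of what a caption text track contains, the following Python sketch writes timed captions in the style of a QuickTime text descriptor file (a "{QTtext}" header followed by bracketed timestamps and caption lines). The function names, the header values and the sample dimensions are assumptions made for this example and are not taken from NCAM's process; the exact descriptors a given QuickTime version accepts should be checked against Apple's documentation.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as [HH:MM:SS.mmm] (time scale of 1000)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}]"


def build_text_track(captions, end_time, width=320, height=60):
    """Build the text of a caption track.

    captions: list of (start_seconds, caption_text) pairs.
    end_time: time in seconds at which the last caption is cleared.
    The header values below are illustrative assumptions.
    """
    header = (
        "{QTtext} {font:Geneva} {plain} {size:12} {justify:center} "
        "{timeScale:1000} {timeStamps:absolute} "
        f"{{width:{width}}} {{height:{height}}}"
    )
    lines = [header]
    for start, text in sorted(captions):
        lines.append(format_timestamp(start))
        lines.append(text)
    # A closing timestamp marks where the final caption ends.
    lines.append(format_timestamp(end_time))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [(0.0, "Welcome to the demonstration."), (2.5, "[music]")]
    print(build_text_track(sample, end_time=5.0))
```

A file produced this way would then be imported as a text track and placed below the video track, which is the layout the following section finds least demanding on the processor.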

The few captioned movies available on the Web usually display the text track in a small window below the video. In extensive testing in 1999, however, NCAM effectively used keyed text (text displayed over a transparent background) to position captions over the video track itself, similar to broadcast captions but without the black box behind the text. Captions shown in this manner place a heavy load on a computer's processor, slowing down the movie considerably. NCAM has also experimented with colored text and moving colored backgrounds which highlight text within each caption as the words are spoken (similar, in many respects, to karaoke). As with keyed text, though, these special features bog down the entire presentation, resulting in clipped audio and jerky video. Despite the speed of today's personal computers, it may still be most efficient to place captions below the video track instead of over it.

Sample captioned QuickTime movie clips and step-by-step details of the captioning process may be found at NCAM's Web site. Samples of a movie with keyed-text captions over video may also be found at this site.

QuickTime and Audio Descriptions

Not only is it possible to add text tracks to a QuickTime movie clip, it is also possible to add extra audio tracks-- specifically, an audio description track, which increases a movie clip's accessibility for people who are blind or visually impaired.

Audio descriptions in QuickTime clips are similar to those found on certain television programs or home videos. Brief narration describing key visual elements is inserted into the pauses of the dialog. This narration makes it easier for blind or visually impaired users to follow the action of a movie. The narration track is recorded separately and simply pasted into the movie. (QuickTime 4.0's QuickTime Player will add audio tracks using either a PC or Macintosh; earlier versions of MoviePlayer will add audio tracks using a Macintosh only.) As with captioned QuickTime movies, the user may toggle the audio description track on and off, depending on the movie playback device being used. Examples of described movie clips and step-by-step details of the audio-description process may be found at NCAM's Web site.

The W3C's Synchronized Multimedia Integration Language (SMIL)

To ease the authoring process of TV-like multimedia presentations on the Web, the W3C has designed the Synchronized Multimedia Integration Language (SMIL). SMIL allows for the creation of time-based, streaming multimedia presentations that combine audio, video, images and text. The SMIL specification defines an XML-based language that allows control over separate elements in a multimedia presentation with a simple, clear markup similar to HTML.

While SMIL was not created specifically for designing accessible multimedia, it can be used to assemble accessible multimedia presentations. The display of captions is somewhat flexible, and SMIL has the capability to play extra audio channels, thus supporting audio descriptions. Media files (audio, video, text) are stored separately and are synchronized at the time of playback. Several companies currently market software which can play or create SMIL files: RealNetworks' G2 player and SMIL Wizard, and CWI's GRiNS player and editor, are among the products available.
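To make the idea of separately stored, synchronized media concrete, here is a minimal Python sketch that assembles a SMIL 1.0-style document in which a video, a caption text stream and an optional audio description play in parallel. The file names, region sizes and the build_smil function are hypothetical and chosen only for illustration; the element and attribute names (par, textstream, region, system-captions) follow the SMIL 1.0 specification, but individual players may support them differently.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, captions_src, description_src=None):
    """Return a SMIL document (as a string) with captions below the video."""
    smil = ET.Element("smil")
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    # Illustrative layout: a 320x240 video above a 60-pixel caption strip.
    ET.SubElement(layout, "root-layout", width="320", height="300")
    ET.SubElement(layout, "region", id="video",
                  top="0", left="0", width="320", height="240")
    ET.SubElement(layout, "region", id="captions",
                  top="240", left="0", width="320", height="60")

    body = ET.SubElement(smil, "body")
    # Children of <par> are rendered at the same time.
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="video")
    # system-captions="on" asks the player to show this stream only
    # when the user has requested captions.
    ET.SubElement(par, "textstream", {
        "src": captions_src,
        "region": "captions",
        "system-captions": "on",
    })
    if description_src is not None:
        # Extra audio channel carrying the audio description.
        ET.SubElement(par, "audio", src=description_src)
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical media file names.
    print(build_smil("lecture.rm", "lecture-captions.rt", "lecture-described.ra"))
```

Because each medium is a separate file, captions or descriptions can be revised without re-encoding the video.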

NCAM will be making extensive use of accessible SMIL presentations in a collaborative project with the Massachusetts Institute of Technology's Center for Advanced Educational Services. The project, known as "Access to PIVOT" (Physics Interactive Video Tutor), will develop, test and implement multimedia access solutions to make distance learning accessible to blind, low-vision, deaf and hard-of-hearing students.

In addition to developing cost-effective methods to navigate the Web site, NCAM will also research and apply methods to create captions and audio descriptions for the course's dozens of hours of multimedia. Conventionally, audio descriptions are placed in the natural pauses of a program's regular audio track. However, these pauses are frequently not long enough to allow for adequate transmission of descriptive information. NCAM is experimenting with pausing the video track of a SMIL presentation to accommodate lengthy, detailed descriptions, then resuming play of the video and regular program audio. It is hoped that these extended audio descriptions will prove an effective method of describing complex on-screen equations, graphs and scientific demonstrations.
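The pausing approach described above can be sketched as a simple timeline calculation. The Python example below is an assumption-laden illustration, not NCAM's implementation: it supposes that each description starts at a known point in the video, that the natural pause at that point has a known length, and that descriptions do not overlap one another. When a description is longer than its pause, the video is frozen for the overflow and then resumed.

```python
from dataclasses import dataclass


@dataclass
class Description:
    at: float        # video time (seconds) at which the description starts
    duration: float  # length of the recorded description (seconds)
    gap: float       # length of the natural pause available at that point


def build_timeline(video_duration, descriptions):
    """Return playback steps as tuples.

    ("play", start, end)    plays the video from start to end.
    ("freeze", at, seconds) holds the frame at `at` for `seconds`
                            while the description finishes.
    """
    steps = []
    cursor = 0.0
    for d in sorted(descriptions, key=lambda item: item.at):
        overflow = d.duration - d.gap
        if overflow <= 0:
            continue  # the description fits inside the natural pause
        # Freeze where the natural pause ends, never before the cursor
        # and never past the end of the video.
        freeze_at = max(cursor, min(d.at + d.gap, video_duration))
        if freeze_at > cursor:
            steps.append(("play", cursor, freeze_at))
        steps.append(("freeze", freeze_at, overflow))
        cursor = freeze_at
    if cursor < video_duration:
        steps.append(("play", cursor, video_duration))
    return steps


def total_running_time(steps):
    """Running time of the extended presentation in seconds."""
    total = 0.0
    for kind, first, second in steps:
        total += (second - first) if kind == "play" else second
    return total


if __name__ == "__main__":
    described = [Description(at=10, duration=8, gap=3),
                 Description(at=30, duration=2, gap=4)]
    for step in build_timeline(60, described):
        print(step)
```

In this sample, only the first description exceeds its pause, so the presentation runs five seconds longer than the original sixty-second clip.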

Funding for Access to PIVOT is provided by The Mitsubishi Electric America Foundation, a non-profit foundation jointly funded by Mitsubishi Electric Corporation of Japan and its American affiliates with the mission of contributing to a better world for us all by helping young people with disabilities, through technology, to maximize their potential and participation in society. Future support is anticipated from the National Science Foundation. For more information about SMIL, visit the W3C's Multimedia site. Accessible SMIL clips may be found at NCAM's Web site.


The Media Access Generator (MAGpie)

While it is possible to add captions and audio descriptions to QuickTime and SMIL multimedia, no single editor or application currently exists for creating both captions and audio descriptions. In view of this need, the WGBH Media Access Technologies group has developed the Media Access Generator, or MAGpie.

The MAGpie editor allows the user to type in or import text, format it as captions, add color to both the text and background, and make use of multiple fonts and styles. Once text has been entered and formatted, it can be easily timed by playing the digitized video and pressing a single key once per caption to insert the appropriate timecode. The user can review the captioned movie without exiting the application, make any necessary changes directly in the editor and then see these changes immediately. Once the captions have all been timed and reviewed, the user can output the text in three formats: SMIL, QuickTime and Microsoft's SAMI (Synchronized Accessible Media Interchange format). The current version of MAGpie allows only for the creation of captions. Future releases will incorporate support for audio descriptions.
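The timing step described above (one key press per caption) amounts to pairing each caption with the playback time at which the key was pressed, then writing the result in an output format. The following Python sketch illustrates that idea only; it is not MAGpie's code, the function names are hypothetical, and the output is a simplified SAMI-style document whose style block and class name (ENUSCC) are assumptions for this example.

```python
import html


def assign_timecodes(captions, key_press_times):
    """Pair each caption with the time (seconds) its key press occurred."""
    if len(captions) != len(key_press_times):
        raise ValueError("need exactly one key press per caption")
    if any(later < earlier
           for earlier, later in zip(key_press_times, key_press_times[1:])):
        raise ValueError("key press times must not go backwards")
    return list(zip(key_press_times, captions))


def to_sami(timed_captions, class_name="ENUSCC"):
    """Render timed captions as a simplified SAMI-style document."""
    lines = [
        "<SAMI>",
        "<HEAD>",
        '<STYLE TYPE="text/css"><!--',
        f".{class_name} {{Name: English; lang: en-US;}}",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for seconds, text in timed_captions:
        start_ms = round(seconds * 1000)  # SYNC start times are in milliseconds
        lines.append(
            f"<SYNC Start={start_ms}>"
            f"<P Class={class_name}>{html.escape(text)}</P></SYNC>"
        )
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    timed = assign_timecodes(["Good evening.", "Tonight's top story."], [0.0, 2.5])
    print(to_sami(timed))
```

Keeping the timed list separate from the rendering function mirrors the workflow in the text: captions are timed once and can then be written out in more than one format.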

Funding for MAGpie comes from the Trace Research and Development Center at the University of Wisconsin, as part of its Information Technology Access Rehabilitation Engineering Research Center, which is itself funded by the U.S. Department of Education's National Institute on Disability and Rehabilitation Research. MAGpie may be downloaded from NCAM or Trace, free of charge.


Reprinted with author(s) permission. Author(s) retain copyright.