2006 Conference General Sessions



David Klein
Law, Health Policy & Disability Center
280-1 Boyd Law Building, College of Law
Iowa City IA 52242
Day Phone: 319-335-6748
Fax: 319-335-9764
Email: david-klein@uiowa.edu

Presenter #2
Kenneth Thompson
Law, Health Policy & Disability Center
280-1 Boyd Law Building, College of Law
Iowa City IA 52242
Day Phone: 319-335-6748
Email: kenneth-d-thompson@uiowa.edu


Sentence Summary: Captioning of video is essential for full Web accessibility. This hands-on session will address captioning protocols and how to assemble pieces for accessible web video.


The Law, Health Policy & Disability Center is committed to presenting all of its web-based content in accessible formats. When we began to include video-based training on our site, we were confronted with the question of how to make it available to all users. The biggest challenge has been that multimedia players and other web delivery applications differ in how they deliver accessible video. To provide the best user experience, video developers have to account for these differences.

Although organizations such as the World Wide Web Consortium (W3C) produce specifications and guidelines intended to let applications interoperate and to let developers produce materials that integrate easily across applications, end-user applications differ in how they comply with these specifications from one brand to another, from one OS to another, and from version to version of the same application.

The Players

Multimedia playback on computers is handled by software applications such as Windows Media Player, QuickTime, RealMedia, Flash, Java and others. All these players are capable of displaying caption text (or other media elements) synchronized with the playback of video. However, no single multimedia application is used by a majority of people. According to a 2002 study by Nielsen/NetRatings, not even half of work computers (45%) showed usage of video/multimedia applications, and that group was shared by Windows Media Player (28%), RealMedia (27%), and QuickTime (13%).(1) The numbers for home computer usage were lower. A more current estimate by NPD Online Research (Macromedia, 2005) counts the multimedia applications actually installed on Internet-connected computers, which shows potential for use rather than actual use. By that estimate, nearly all computers have some multimedia application, with Flash installed on 97.6%, Windows Media Player on 84.3%, QuickTime on 64.1%, and Real on 58.9%. Although Flash is not a standard video player per se, it has a large installation base because different versions of the player have been part of most browser installation packages since the mid-1990s.

The Protocols

Current web-based video technologies require a complex orchestration of applications and media files.  Browsers must launch multimedia applications, which often stream or download video, captioning, and audio description information simultaneously.  This disparate information must be synchronized and displayed smoothly within a variety of browser windows.
The computer playback of multimedia synchronized with text is typically controlled by a script or text file interpreted by the player software. Of the several scripting protocols for synchronized multimedia, the one that is used depends on the choice of player (or players).
Synchronized Multimedia Integration Language (SMIL) for QuickTime or RealMedia and Synchronized Accessible Media Interchange (SAMI) for Windows Media Player are the two dominant protocols. Developers can also customize interfaces using other multimedia applications, such as Flash or Java, to enhance their web delivery. Both SMIL and SAMI solutions use multiple files to achieve the synchronized final playback for the end user. The video file is one component, the caption text file with timing markers is another and the coordinating script file identifying the location of the other files is the third.
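As an illustration of the "caption text file with timing markers," the following Python sketch writes a minimal SAMI document from a list of timed captions. It is not the output of any particular captioning tool; the function names, the class name ENUSCC, and the style values are assumptions chosen for the example.

```python
from html import escape


def to_milliseconds(seconds):
    """SAMI SYNC markers give start times in integer milliseconds."""
    return int(round(seconds * 1000))


def build_sami(captions, title="Captions"):
    """Build a minimal SAMI document.

    captions: list of (start_seconds, text) tuples in playback order.
    The ENUSCC class name and the CSS values are illustrative assumptions.
    """
    lines = [
        "<SAMI>",
        "<HEAD>",
        f"<TITLE>{escape(title, quote=False)}</TITLE>",
        '<STYLE TYPE="text/css"><!--',
        "P { font-family: Arial; color: white; background-color: black; }",
        ".ENUSCC { Name: English; lang: en-US; SAMIType: CC; }",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for start, text in captions:
        # One SYNC element per caption; the text is escaped so that
        # characters such as & do not break the markup.
        lines.append(
            f"<SYNC Start={to_milliseconds(start)}>"
            f"<P Class=ENUSCC>{escape(text, quote=False)}</P></SYNC>"
        )
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sami([(0.0, "Hello & welcome."), (2.5, "Second caption.")]))
```

Each SYNC element marks the moment at which the player replaces the caption on screen, so the list must be in playback order.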
In each case, the HTML file on the web points to the coordinating script file, which lets the player combine timed text and video on the fly. Often the HTML embeds the player within other text and graphics, all of which must be displayed in a browser window correctly and accessibly.
Embedding multimedia players in a web page with HTML can be tricky. Microsoft, Apple, and Macromedia provide models that usually use both the OBJECT and EMBED tags. However, this model does not validate as HTML 4.0 because EMBED is not part of that specification, which creates an accessibility issue. An alternative, called the Satay method, does validate, is more parsimonious, and seems to render as well in various browsers as the conventional models.
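A rough sketch of the Satay approach, as it is commonly described for Flash, is shown below in Python: a single OBJECT element carries the MIME type and the movie URL, with no EMBED tag, and fallback text inside the element is shown when the player is unavailable. The function name, file name, and dimensions are hypothetical, and real pages may need additional PARAM elements.

```python
from html import escape


def satay_object(movie_url, width, height, fallback_text,
                 mime_type="application/x-shockwave-flash"):
    """Return an OBJECT-only embed (no EMBED tag) for a movie.

    The markup follows the commonly described Satay pattern; all
    argument values used in the example below are assumptions.
    """
    url = escape(movie_url, quote=True)
    return (
        f'<object type="{mime_type}" data="{url}" '
        f'width="{width}" height="{height}">\n'
        f'  <param name="movie" value="{url}">\n'
        # Fallback content is rendered only if the object cannot be shown.
        f'  <p>{escape(fallback_text, quote=False)}</p>\n'
        '</object>'
    )


if __name__ == "__main__":
    print(satay_object("player.swf", 320, 240,
                       "A transcript of this video is available below."))
```

The fallback paragraph is a convenient place for a link to the transcript, so that users without the player still reach the content.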

The Process

At the Law, Health Policy & Disability Center, to deliver a large number of training videos to a disparate audience, we have chosen multiple formats, using Flash as our primary multimedia application but also distributing the load to Windows Media Player and QuickTime.(2) We do our own video production, editing and compression into web formats. We generate and proofread the transcripts of our videos. With a completed video and transcript in hand, we divide the transcript into caption-sized chunks in Microsoft Word, eliminate characters that render poorly in the multimedia players (e.g., smart quotes, em dashes), and save the result as a text file.
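The cleanup and chunking step described above is done by hand in a word processor, but the idea can be sketched in Python. The replacement table, the 32-character line width, and the two-line caption limit are assumptions for illustration, not values taken from our workflow.

```python
import textwrap

# Characters that often render poorly in players, mapped to plain ASCII.
# This table is an illustrative assumption, not an exhaustive list.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # single smart quotes
    "\u201c": '"', "\u201d": '"',   # double smart quotes
    "\u2013": "-", "\u2014": "--",  # en dash, em dash
    "\u2026": "...",                # ellipsis
}


def clean_text(text):
    """Replace typographic characters with plain ASCII equivalents."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    return text


def chunk_captions(transcript, max_chars=32, max_lines=2):
    """Split a transcript into captions of at most max_lines lines,
    each line at most max_chars characters (longer single words are split)."""
    lines = textwrap.wrap(clean_text(transcript), width=max_chars)
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]


if __name__ == "__main__":
    sample = "\u201cCaptioning of video is essential,\u201d she said."
    for caption in chunk_captions(sample):
        print(caption)
        print("---")
```

Automatic wrapping ignores sentence and phrase boundaries, so an operator would still want to review the chunks before time-stamping them.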
The caption text file is imported into captioning software such as MAGpie and time-stamped by an operator in real time as the video plays. The software exports the appropriate files (SAMI and QuickTime text files), which contain the caption text and timestamps readable by the player software. Next, the script file that combines the movie and the caption text file is created. For the Windows Media Player version, this is an .asx file that points to the Windows Media video file (.wmv) and the SAMI file. For the QuickTime (and RealMedia) version, it is a SMIL file that points to the QuickTime movie (.mov) and the QuickTime text file (.txt). These files are uploaded to the server, then tested and proofed within their web pages a final time before they are made available to the public.
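The two coordinating script files can be sketched as follows. This Python example writes a minimal .asx playlist that attaches a SAMI file to a Windows Media video through a "?sami=" query on the video URL, and a minimal SMIL file that plays a QuickTime movie in parallel with a text stream. The URLs, region names, and dimensions are assumptions, and production files typically carry more metadata.

```python
from xml.sax.saxutils import escape, quoteattr


def build_asx(video_url, sami_url, title):
    """Minimal ASX playlist pointing to a .wmv file and its SAMI captions."""
    href = f"{video_url}?sami={sami_url}"
    return "\n".join([
        '<ASX version="3.0">',
        f"  <TITLE>{escape(title)}</TITLE>",
        "  <ENTRY>",
        f"    <REF HREF={quoteattr(href)} />",
        "  </ENTRY>",
        "</ASX>",
    ])


def build_smil(movie_url, text_url, width=320, video_height=240,
               caption_height=60):
    """Minimal SMIL file: video region on top, caption region below.

    The <par> element tells the player to present its children in parallel.
    Region ids and sizes are illustrative assumptions.
    """
    total_height = video_height + caption_height
    return "\n".join([
        "<smil>",
        "  <head>",
        "    <layout>",
        f'      <root-layout width="{width}" height="{total_height}" '
        'background-color="black"/>',
        f'      <region id="videoregion" top="0" left="0" '
        f'width="{width}" height="{video_height}"/>',
        f'      <region id="textregion" top="{video_height}" left="0" '
        f'width="{width}" height="{caption_height}"/>',
        "    </layout>",
        "  </head>",
        "  <body>",
        "    <par>",
        f'      <video src={quoteattr(movie_url)} region="videoregion"/>',
        f'      <textstream src={quoteattr(text_url)} region="textregion"/>',
        "    </par>",
        "  </body>",
        "</smil>",
    ])


if __name__ == "__main__":
    print(build_asx("http://example.org/talk.wmv",
                    "http://example.org/talk.smi", "Sample talk"))
    print(build_smil("talk.mov", "talk.txt"))
```

In both cases the web page links to the script file rather than to the video itself, and the player fetches the video and captions from the locations the script names.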

Our Flash Video Player

Our decision to focus on Flash was based on a variety of reasons. The most obvious was the large installation base of the application, which would allow us to deliver video in a standard-looking interface to a wide audience. Although the accessibility features of Flash have ranged from nonexistent to good but not complete, we felt that by version 7 the application had matured enough in its accessibility that we could develop a simple interface that was easily as accessible as any other multimedia player available. An added advantage of Flash is its speed. Where the QuickTime player may take ten to fifteen seconds to open, Flash can open in two or three seconds on even a slow computer.
Flash also has some advantages that mitigate some of its disadvantages. We can generate a caption file for either QuickTime or Windows Media Player (using MAGpie or Hi-Caption) and use the same file for the Flash player, which saves development time. We convert an existing QuickTime (.mov) movie to Flash (.flv) video, which can be done either in the Flash development environment or through third-party compression software. The movie metadata are added to a small XML file, which tells the Flash player the location of the video and caption files and how large to make the screen. Then the pieces are uploaded to the server. In other words, once the video is produced and the caption file is created for QuickTime or Windows Media Player, deploying a Flash video takes only a few extra minutes.
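To make the role of the small XML file concrete, the Python sketch below writes one with the three pieces of information named above: the video location, the caption file location, and the screen size. The element and attribute names are hypothetical; the actual schema read by our Flash player is not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_player_config(video_url, caption_url, width, height):
    """Write a small XML description of one movie.

    The element names (movie, video, captions, size) are hypothetical
    placeholders, not the schema of any real player.
    """
    root = ET.Element("movie")
    ET.SubElement(root, "video", src=video_url)
    ET.SubElement(root, "captions", src=caption_url)
    ET.SubElement(root, "size", width=str(width), height=str(height))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names; the caption file is the one already
    # produced for Windows Media Player or QuickTime.
    print(build_player_config("talk.flv", "talk.smi", 320, 240))
```

Keeping these values outside the player means one compiled player can serve every video, with only the XML file changing from movie to movie.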


In this session, we will discuss differences among web-based multimedia players, concepts behind closed captioning, the basics of the SMIL and SAMI protocols, and the nuts and bolts of embedding accessible video in web browsers. Participants will use captioning software to generate a caption file (SMIL or SAMI format) from a transcript. Then they will create the scripting files needed to synchronize web-based video. Finally, they will assemble these files to produce a web-enabled QuickTime video, a Windows Media Player video, and a Flash video, using different HTML coding techniques for embedding the video into web pages.


Macromedia, Inc. (2005). Macromedia Flash Player Statistics. Retrieved September 29, 2004, from http://www.macromedia.com/software/player_census/npd/.

Nielsen/NetRatings. (April, 2002). Nielsen//NetRatings launches new Web multimedia format report, tracking RealMedia, Windows Media and QuickTime.  Retrieved September 23, 2004, from http://www.nielsen-netratings.com/pr/pr_020620.pdf.

World Wide Web Consortium. (1999, May 5). Web Content Accessibility Guidelines 1.0. Retrieved July 3, 2002, from http://www.w3.org/TR/WCAG10/.


 (1)One can assume that the numbers quoted by Macromedia will show its product, Flash, in the best light, as compared to its major competitors. Note the Web log, Downey, M. (2004). [Video] Why deploy video with Flash instead of the other guys? In Macromedia Weblogs. Retrieved September 28, 2004, from http://www.markme.com/md/archives/004841.cfm and the following reply by "Jensa." The numbers may not be accurate for actual usage or installation of these players nor would they be useful for specific venues, such as the workplace or public libraries.  However, the numbers are roughly useful for comparisons of the software installations.

 (2)For examples of videos we have produced using this system, see http://disability.law.uiowa.edu/dpn/video/dpn_112004/dpn_112004_index.html


Reprinted with author(s) permission. Author(s) retain copyright