Return to 2004 Table of Contents
Presenter(s)
Marikka Rypa, Ph.D.
Kevin Erler, Ph.D.
Brent Robertson, M.A.Sc., M.B.A.
Automatic Sync Technologies
Email: Info@automaticsync.com
1 Closed Captioning: Addressing the Need
Closed captioning of popular media, including video, web-based instructional material, and broadcast television, has been a great boon to the hearing-impaired community. The benefits of closed captioning have also been shown for other disadvantaged groups, such as limited English proficiency (LEP) populations, as well as for adults and children learning to read.[1] Although federal regulations have continued to foster access to media for the disabled (e.g., the Americans with Disabilities Act of 1990 and the Telecommunications Act of 1996[2]), obstacles to full captioning coverage and availability remain. High costs and prohibitive turnaround times for closed captioning services have hampered efforts to provide the hearing-impaired with universal access to video resources.
New and innovative techniques for automating the production of closed captioning show great promise in eliminating these obstacles. Speech processing techniques can be used to build an automated web-based system that accepts the electronic submission of pre-recorded video or webcast content and automatically returns closed captioning results. By automating a process that is still largely manual, such a system can drastically reduce turnaround times and significantly decrease costs.
2 Reaching a Wider Community of Users
2.1 The Hearing-Impaired Population
According to a report by the National Center for Health Statistics[3], there are currently over 34 million Americans who are hearing-impaired. Individuals with hearing problems are limited in their access to a wide range of informational, educational, and entertainment resources.
Access to training, employment, and career advancement opportunities is similarly limited. Furthermore, as a result of the aging population, it is estimated that between 1990 and 2050, the number of hearing-impaired Americans will increase at a faster rate than the growth of the population.[4]
2.2 The Educational Benefits of Closed Captioning
2.2.1 Students with Learning Differences
Video and captioning are powerful educational tools for students with disabilities when effectively integrated into instruction. Increasingly, educators are experimenting with video and captioning techniques to bolster literacy skills in students who are deaf and hard of hearing and/or who have learning disabilities.[5]
Closed captioning also benefits a wider audience. Research cited by the National Center to Improve Practice in Special Education[6], conducted with deaf students, students reading below grade level, and students learning English as a second language, has shown that captions play a strong role in improving language skills for all groups.
2.2.2 Universal Design and Online Learning
Captioning is also pivotal in the concept of universal design[7] in education, which seeks to make educational environments as usable as possible by as many people as possible regardless of age, ability, or situation.
Universal design does not imply one optimal solution for everyone. Rather, it reflects an awareness of the unique nature of each learner and the need to accommodate differences, creating learning experiences that suit the learner and maximize his or her ability to progress. Universal design shifts previous assumptions about teaching and learning in four fundamental ways:[8]
Captioning technology can promote the inclusion of the increasing number of learners whose backgrounds, skills, abilities/disabilities, and interests do not fit traditional "mainstream" models of learning. By using captioning technology originally conceived to better accommodate students with hearing disabilities, we can enhance the online learning experience for all students.
In offering an alternate representation of core material and a very thorough and convenient way to search for key information, captioning broadens the options for learners in how they choose to interact with information. Automated captioning allows content creators to provide effective methods of high-level semantic querying for research and retrieval of relevant learning material.
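As a rough illustration of how time-stamped captions make video content searchable, the sketch below runs a plain keyword match over a list of captions and returns the times at which matching passages occur. It is far simpler than the semantic querying described above. The `Caption` class, the `search_captions` function, and the sample lecture text are all hypothetical and are not part of the AST system.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # seconds from the start of the program
    end: float    # seconds from the start of the program
    text: str


def search_captions(captions, query):
    """Return (start_time, text) for each caption containing every query word.

    Matching is case-insensitive substring matching, an assumption made
    to keep the sketch short.
    """
    words = query.lower().split()
    hits = []
    for cap in captions:
        text = cap.text.lower()
        if all(word in text for word in words):
            hits.append((cap.start, cap.text))
    return hits


# Hypothetical captions for a short lecture clip.
captions = [
    Caption(0.0, 3.5, "Welcome to the lecture on cell biology."),
    Caption(3.5, 7.0, "Today we look at how mitosis works."),
    Caption(7.0, 11.0, "Mitosis has four main phases."),
]

for start, text in search_captions(captions, "mitosis"):
    print(f"{start:6.1f}s  {text}")
```

Because each hit carries a start time, a player could jump directly to the matching point in the video instead of requiring the learner to scan the whole program.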
3 Speech Processing and Closed Captioning
3.1 The Role of Speech Technology
Advances in automatic speech recognition (ASR) software over the past ten years have allowed researchers to begin to apply ASR technology to the problem of automating the captioning process. Almost all of these attempts[9] use commercially available software to perform recognition on a program audio stream, and still require a constrained speaker or linguistic domain to function at an acceptable level.
As a result, previous efforts have not provided fully automated captioning systems. Instead, they have largely resulted in assistive devices that can help speed the captioning process; the process itself remains essentially manual.
3.2 AST: Advances in Automated Closed Captioning
3.2.1 The Current Process
The following steps represent a brief summary of the process of pre-recorded captioning:
A range of commercial software is available to assist in the transcribing and captioning process, but caption editing and timing (Step 2 above) must still be done manually. Using a caption preparation workstation, the caption editor watches and listens to a prerecorded program and then breaks the text into discrete captions; s/he assigns appropriate screen placement to each caption and times the appearance and disappearance of each.
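The text-breaking part of the caption editor's task can be sketched in a few lines. The example below is only an assumption-laden illustration: it splits a transcript into captions of at most two lines of 32 characters each, limits chosen here for illustration rather than taken from the process described above. The function name `split_into_captions` is hypothetical, and timing and screen placement are not covered.

```python
import textwrap


def split_into_captions(transcript, max_chars=32, max_lines=2):
    """Break transcript text into captions, each a list of short lines.

    Words are never split, so a single word longer than max_chars
    would produce an over-long line.
    """
    lines = textwrap.wrap(
        transcript,
        width=max_chars,
        break_long_words=False,
        break_on_hyphens=False,
    )
    # Group consecutive lines into captions of at most max_lines lines.
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


# Sample text borrowed from the description above.
sample = (
    "The caption editor watches and listens to a prerecorded program "
    "and then breaks the text into discrete captions."
)

for number, caption in enumerate(split_into_captions(sample), start=1):
    print(f"Caption {number}:")
    for line in caption:
        print("   ", line)
```

A human editor would also break captions at natural phrase boundaries; this sketch breaks purely on length.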
3.2.2 AST Advances
Steps 1 and 4 above are both relatively straightforward and are undertaken no matter what process is employed in captioning. Step 2, the segmentation of the text and the proper alignment of audio and video, represents the most time-consuming and expensive part of the process.
The AST system addresses the effective automation of this part of the process; we have chosen pre-recorded, or off-line captioning, as the immediate focus of our automatic closed captioning work in order to leverage transcription in segmenting and aligning the captions with the audio.
Our system has been designed to accept the electronic submission of the program and its transcript as generated in Step 1. It returns a standard-format caption file (for traditional videotape, DVD authoring, or webcasting), which can then be used for subsequent quality review and encoding. In order to create such a system, four major areas of research and development must be addressed.
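To make the output of such a system more concrete, the sketch below turns a list of time-aligned words into a caption file. Everything in it is an assumption for illustration: the word timings are invented stand-ins for the result of an alignment step, the SubRip (SRT) layout is used only as one familiar example of a caption file format, and the functions `format_timestamp` and `words_to_srt` are hypothetical. It does not describe how the AST system itself is implemented.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(timed_words, max_chars=32):
    """Group (word, start, end) tuples into captions and render SRT text."""
    captions = []  # each entry: [start, end, text]
    for word, start, end in timed_words:
        if captions and len(captions[-1][2]) + 1 + len(word) <= max_chars:
            # The word still fits: extend the current caption.
            captions[-1][1] = end
            captions[-1][2] += " " + word
        else:
            # Start a new caption at this word's start time.
            captions.append([start, end, word])
    blocks = []
    for index, (start, end, text) in enumerate(captions, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical alignment output: (word, start_seconds, end_seconds).
timed_words = [
    ("Closed", 0.0, 0.4),
    ("captioning", 0.4, 1.0),
    ("helps", 1.0, 1.3),
    ("everyone", 1.3, 1.9),
]

print(words_to_srt(timed_words, max_chars=17))
```

Each caption inherits its appearance time from its first word and its disappearance time from its last word, which is why an accurate alignment between transcript and audio matters so much for the finished captions.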
3.2.3 Four Areas of Technology Focus
There are four key areas of technology focus in our automated captioning system:
4 Current Capabilities: Examples and a Demonstration
To conclude our presentation, we will show examples to illustrate the range of our captioning work with educational institutions and broadcast television. This includes videos produced by educational television, videos produced by the schools themselves, traditional television content, and webcast classroom lectures. To respond to the needs of various institutions, AST automated captioning can readily move among various types of media, from traditional videotapes to DVDs to streaming electronic media for webcasting.
[1] Website: National Center to Improve Practice in Special Education
[2] See http://www.fcc.gov/cgb/dro/ccrules.html
[3] National Center for Health Statistics report, 1997
[4] Wisconsin Self Help for Hard of Hearing People, Inc. Mission Statement. http://www.wi.sshh.org
[5] Koskinen, P.S., Wilson, R.M., Gambrell, L.B. & Neuman, S.B. (1993). Captioned video and vocabulary learning: an innovative practice in literacy instruction. The Reading Teacher, 47(1), 36-43.
[6] Website: National Center to Improve Practice in Special Education
[7] E.g., http://www.cast.org/ and www.design.ncsu.edu/cud/
[9] http://www.robson.org/gary/writing/cr-speechrecognition.html