2001 Conference Proceedings
Click the Captions, Select the Descriptions: Making
Captioning and Video Description Essential for Any Learner in
Broadband Education
Jutta Treviranus
Adaptive Technology Resource Centre
University of Toronto
jutta@utoronto.ca
16.10.00
Abstract
The goals of equal access and on-line interactive learning are
converging. The shifting landscape in delivery of education by
broadcasters provides an opportunity to establish new conventions
and paradigms. This paper describes a project that uses enhanced
access tools to provide interactivity and personalization of
learning materials for learners with or without disabilities. It
is hoped that access tools will become an essential, rather than
a special, component of all broadband learning environments.
The Context
Television and the Web are converging. For educational TV
broadcasters this opens the possibility for new learning
paradigms. Access tools such as captioning and video description
provide an ideal vehicle for interactivity as well as a method of
accommodating a large range of learning styles and levels. The
Adaptive Technology Resource Centre in collaboration with
Canadian Learning Television and eight partner organizations is
developing tools that will establish captioning and video
description as essential components of interactive educational
video over broadband for all learners.
Combining the Old with the New
Educational broadcasters have predominantly used a lecturer
paradigm, augmented with audio-visual demonstrations. This
paradigm was well suited to the unidirectional nature of
broadcasting. When done well, this approach combined
entertainment, story-telling and theatre to engage the learner.
Unfortunately, due to its linear and inflexible nature there are
many instances where learning breaks down when using this
approach. This breakdown can be attributed to communication
problems: "I didn't hear an important concept, I missed part of
an argument, I don't understand a word or term used." It can be
due to a mismatch in the assumptions regarding the prior
knowledge of the learner: "What do you mean by terminal
velocity?" It can be due to a mismatch in learning styles: "Can
you show me that rather than describing it? Can you illustrate it
in another way?" It can be due to a mismatch in pace: "You're
going too fast, you've lost me. You're going too slow, I'm
bored." Whatever the cause of the breakdown, once the learner is
disengaged from the process it is very difficult to pick up the
threads again to achieve the learning goals.
The quality and quantity of learning material produced by
educational broadcasters are very impressive. At a time when we
are searching for content for on-line learning, educational
videos offer a rich store of resources if we can successfully
adapt and re-purpose them in an on-line environment.
The overall goal of the project, entitled "Creating Barrier-free
Broadband Learning Environments," is fourfold: to identify
potential barriers to access in broadband education delivery
systems for learners with disabilities; to develop solutions to
those barriers; to advance alternative or multi-modal display and
control mechanisms that are only possible in broadband
environments; and to create tools that allow learners to
customize the learning experience to their individual learning
styles and needs. In meeting these goals, the project will also
develop a means of creatively re-purposing quality traditional
educational programming while addressing the problems that cause
breakdown in learning.
The Essential Role of Captioning
A general objective of the project is to adapt the material to
meet the needs of the broadest range of learners, both from an
equal access perspective and from a knowledge level perspective
(e.g., making college level physics accessible to grade 9
students); and to make it highly interactive and responsive to
the specific needs of the learner. Captioning plays an essential
role in this objective. The verbatim captions serve several
purposes. They are used to structure and mark up the video. This
structure is then used to navigate within the video and to
condense or expand the material. For example, if a learner wants
to go back to every mention of a specific term, the captions
would be searched for that term and the start time codes of the
matching segments sorted for playback.
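
The term lookup described above can be sketched in a few lines of
Python. This is a minimal illustration, not the project's
implementation: the CaptionSegment fields, the find_mentions
function and the sample captions are all hypothetical, and time
codes are assumed to be plain seconds.

```python
from dataclasses import dataclass


@dataclass
class CaptionSegment:
    """One timed caption; field names are hypothetical, not the project's schema."""
    start: float  # start time code, in seconds
    end: float    # end time code, in seconds
    text: str     # verbatim caption text


def find_mentions(segments, term):
    """Return the start time codes of segments mentioning `term`, in playback order."""
    needle = term.lower()
    hits = [seg.start for seg in segments if needle in seg.text.lower()]
    return sorted(hits)


# Invented sample captions for a short physics clip.
captions = [
    CaptionSegment(0.0, 4.5, "Today we look at falling objects."),
    CaptionSegment(4.5, 9.0, "Air resistance limits the terminal velocity."),
    CaptionSegment(9.0, 14.0, "A feather falls slowly."),
    CaptionSegment(14.0, 19.5, "At terminal velocity the net force is zero."),
]

print(find_mentions(captions, "terminal velocity"))  # [4.5, 14.0]
```

A viewer could then seek to each returned time code in turn,
letting the learner step through every mention of the term.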
The traditional "Line 21" captioning is replaced with enhanced
multimedia overlay captions. Standard captions are limited to
text and restricted to either the bottom or top of the screen.
This makes it difficult to communicate the source of the sound or
speech. It also makes it very difficult to communicate non-speech
sound-based information such as inflection, tone of voice, speech
rate, music, and other non-speech sounds. By allowing multimedia
overlay captioning, comic book conventions can be adopted to
indicate the source of sound. Color, animation and graphics can
be used to indicate affect, music and non-speech sounds. A video
window can be invoked to provide ASL/LSQ translation. Captions
can also be used to label visual objects or highlight a part of
the video frame.
Most importantly, the captions will be used for hyper-linking.
Thus, if the learner wants more information about a term, a
definition, background material, related material or an
interactive exercise that further illustrates a concept, they
would click on the term or phrase in the caption, which would
pause the video and take them to the supportive material.
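
One way to picture caption hyper-linking is as a lookup from
caption terms to supporting material. The Python sketch below is
an assumption-laden illustration only: the LINKS table, its
target paths, the Player stand-in and both functions are
hypothetical and do not describe the project's tools.

```python
import re

# Hypothetical link table: lowercase caption term -> supporting material.
LINKS = {
    "terminal velocity": "definitions/terminal-velocity",
    "air resistance": "exercises/air-resistance",
}


def link_caption(text, links):
    """Split caption text into (fragment, target) pairs; target is None for plain text."""
    if not links:
        return [(text, None)]
    # Longest terms first so a longer phrase wins over a shorter one it contains.
    terms = sorted(links, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    parts, pos = [], 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            parts.append((text[pos:match.start()], None))
        # Keys in `links` are assumed to be lowercase.
        parts.append((match.group(0), links[match.group(0).lower()]))
        pos = match.end()
    if pos < len(text):
        parts.append((text[pos:], None))
    return parts


class Player:
    """Stand-in for a video player; it only tracks whether playback is paused."""

    def __init__(self):
        self.paused = False

    def pause(self):
        self.paused = True


def on_click(player, target):
    """Pause the video and return the material to open."""
    player.pause()
    return target


for fragment, target in link_caption("Air resistance limits the terminal velocity.", LINKS):
    print(repr(fragment), "->", target)
```

Fragments with a target would be rendered as clickable caption
text; clicking one calls on_click, which pauses playback before
the supporting material is shown.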
Tell Me More with Video Descriptions
Video description, beyond making the video accessible to learners
who are blind, will be used to elucidate, provide further detail
or clarify. Thus for a learner having difficulty in following the
steps in a chemistry experiment, the video description would
elucidate the steps in greater detail than provided by the
original video.
Putting these pieces together, you can imagine the following
scenario: "While you are watching a physics lecture by an eminent
physicist, a phenomenon is referred to that you know little
about. You turn on captioning and click on the term in the text
caption, which links you to a definition of the term. To find out
more about the phenomenon, you click on an interactive exercise
that illustrates the concept. To better understand the forces at
play, you turn on haptic rendering and use your force feedback
joystick to feel what is happening. Once you are confident that
you understand the term, you return to the lecture. The lecturer
moves to a demonstration, some of which you find difficult to
follow, so you turn on descriptive video, which provides a
subnarrative in the audio pauses further describing what is
happening. For additional help you turn on overlay captions that
provide text labels of the objects and processes occurring in the
demonstration. You control the interface using a simple set of
voice commands."
Education that Fits the Learner
One important step in preventing the breakdown of learning is to
ensure that the education fits the learner. The learner should be
able to customize the amount and type of background given, the
detail or verbosity of the dialogue, the reading level required
to follow the material, the pace of the teaching/learning, the
methods used to demonstrate concepts, and the learning outcomes
to be achieved (e.g., the general theory versus applying the
actual algorithm). A properly structured broadband learning
environment should allow this customization.
XML Schema and Practicing what we Preach
To allow customization of the learning material the ATRC and its
partners are creating XML schema for Captioning and Video
Description. The schema will be partially based upon the DAISY
standard. The XML schema combined with XSLT style sheets will
allow user specification of how captions are displayed (e.g.,
font, color, position, hyperlink, etc.). A meta-tagging scheme
(based upon international learning object meta-tagging standards)
will be developed to allow the storage and retrieval of caption
tracks for different reading levels, languages, levels of
verbosity, etc. A similar classification will be possible for the
video description audio tracks. Thus a learner can take a
specific lesson, choose a caption track suited to their reading
level, displayed the way they want it, with the desired level of
description in both the captioning and video description, and the
desired amount and type of background material in-line and linked
to the lesson. Alternatively, a teacher can re-purpose a lesson
for the needs of a specific class, allowing further individual
customization for specific students.
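
The retrieval of a caption track by reading level, language and
verbosity can be sketched as a filter over track metadata. The
Python example below is illustrative only: the CaptionTrack
fields, the verbosity labels, the sample URIs and the ranking
rule in choose_track are assumptions, not part of the schema or
meta-tagging scheme under development.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionTrack:
    """Metadata for one stored caption track; all fields are illustrative."""
    uri: str
    language: str       # e.g. "en", "fr"
    reading_level: int  # school grade the text is written for
    verbosity: str      # "brief", "verbatim" or "expanded" in this sketch


def choose_track(tracks, language, max_reading_level, verbosity):
    """Pick a track in the learner's language at or below their reading level.

    Among eligible tracks, prefer the requested verbosity, then the highest
    reading level the learner can follow. Returns None if nothing is eligible.
    """
    eligible = [
        t for t in tracks
        if t.language == language and t.reading_level <= max_reading_level
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda t: (t.verbosity == verbosity, t.reading_level))


# Invented tracks for a single lesson.
tracks = [
    CaptionTrack("physics-01/en-g12-verbatim", "en", 12, "verbatim"),
    CaptionTrack("physics-01/en-g9-brief", "en", 9, "brief"),
    CaptionTrack("physics-01/en-g9-expanded", "en", 9, "expanded"),
    CaptionTrack("physics-01/fr-g9-brief", "fr", 9, "brief"),
]

chosen = choose_track(tracks, language="en", max_reading_level=9, verbosity="expanded")
print(chosen.uri)  # physics-01/en-g9-expanded
```

The same kind of lookup could apply to video description audio
tracks, and a teacher could fix some of the arguments for a whole
class while leaving others to individual students.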
The Necessary Tools
To make the above described scenario possible the ATRC and its
partners are developing a number of tools. These include the
authoring tools needed to create the enhanced captioning and
video description, the browser/viewer required to specify user
preferences and view the enhanced materials and the learning
repository needed to store and retrieve the associated learning
objects.
The enhanced caption and video description authoring and mark-up
tools will be created as modular components to be added to
existing video authoring tools for the web. The files created
will be SMIL compliant. Initially, QuickTime sprites will be used
to create the more advanced interactive components. It is hoped
that MAGpie, developed by NCAM, can act as the base for the
caption authoring tool. To make the authoring process realistic
for the typical educator a number of intelligent preprocessing
and script or text track management tools will be included.
To let learners express their preferences, and to assemble
learning objects on the fly based upon those preferences, a
browser/viewer will be created using Mozilla and XUL. Thus the
learner can adapt the browser interface, the type
and order of learning objects and how they are displayed. The
captions and video descriptions would act as anchors to related
resources, interactive exercises, and other learning
resources.
It is hoped that the functionality required of the learning
repository can be integrated into national learning repositories
presently under development. The project will model the
functionality for the purposes of the project and advocate for
its inclusion with national and international groups governing
large learning repositories. Necessary functionality includes the
facility to create a number of assembly objects that can call
collections of atomic objects (caption tracks, URLs, audio
description tracks).
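
The relationship between assembly objects and atomic objects can
be sketched as follows. This Python example is a hypothetical
model for illustration: the class names, fields, identifiers and
URIs are invented and do not reflect any repository
specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AtomicObject:
    """A single stored item, such as a caption track, URL or description track."""
    object_id: str
    kind: str  # "caption", "url" or "description" in this sketch
    uri: str


@dataclass
class AssemblyObject:
    """A named collection that refers to atomic objects by id."""
    title: str
    member_ids: list = field(default_factory=list)

    def resolve(self, repository):
        """Look up each member in the repository, skipping ids that are not stored."""
        return [repository[i] for i in self.member_ids if i in repository]


# Invented repository contents, keyed by object id.
repository = {
    obj.object_id: obj
    for obj in [
        AtomicObject("cap-1", "caption", "physics-01/en-g9-brief"),
        AtomicObject("desc-1", "description", "physics-01/description-expanded"),
        AtomicObject("url-1", "url", "exercises/air-resistance"),
    ]
}

# Two lessons that reuse the same caption track.
grade9_lesson = AssemblyObject("Falling objects, grade 9", ["cap-1", "url-1"])
described_lesson = AssemblyObject("Falling objects, described", ["cap-1", "desc-1"])

print([obj.kind for obj in grade9_lesson.resolve(repository)])  # ['caption', 'url']
```

Because assemblies hold only references, the same atomic object
can appear in several learning modules without being duplicated.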
Educators and producers of learning objects would collectively
contribute to the learning repository, thereby reducing the
amount of development required for a specific learning module.
Once the objects in the repository have reached a critical mass,
the objects can be reused for several learning modules.
Conclusions
Equal access and successful learning share a principle: allow a
broad and flexible range of display, control and interaction
techniques. The terminology may have differed in the two fields,
but the functionality is the same. The field of equal
access has had much more experience in implementing these
principles in a technical environment. In this time of converging
technologies and shifting paradigms, we have much to teach
educators. If done correctly, access tools can become an
essential and not a special component of broadband learning
environments.
References
Further information about the project and related material can be
found at the project web site:
Acknowledgements
The project is partially funded by Canarie Inc.
The author would like to acknowledge all project partners and
staff. For a complete list please refer to: http://snow.utoronto.ca/barrier-free
Reprinted with author(s) permission. Author(s) retain copyright.