2001 Conference Proceedings
DIGITAL TALKING BOOKS: DEVELOPING A USER INTERFACE
John Cookson (jcoo@loc.gov), Tom McLaughlin, and Lloyd Rasmussen
National Library Service for the Blind and Physically Handicapped
The Library of Congress
1291 Taylor St. NW
Washington, DC 20542
ABSTRACT
On the basis of the NLS-sponsored NISO standard for a digital
talking book (DTB), we propose a method to develop a comfortable
and effective user interface. Our proposed method is: Select
several books of moderate complexity, format them in conformance
with the standard, and send them to patron-evaluators on
individually registered DVDs. Develop simple, PC-hosted player
software for distribution to the evaluators. Modify player
software in response to evaluator suggestions while gradually
increasing book complexity and player capability. Select a
feature subset, build a flexible haptic interface and iterate
that design.
INTRODUCTION
The National Library Service for the Blind and Physically
Handicapped (NLS) presently produces about 2000 recorded books
per year in approximately 1000 copies each on specially formatted
cassette tape, and distributes them free of charge to a
readership of 764,000 eligible readers via a nationwide network
of participating libraries. Books and players are delivered "free
matter" through the U.S. Postal Service. NLS is presently engaged
in a long-term effort to convert its audio book and magazine
services from analog to digital methods. Outline plans for the
conversion are expressed in terms of a 20-step program and other
summary documents that may be found on our web site,
http://www.loc.gov/nls/dtb.html. Two recent papers summarize what
progress has been made and what further work needs to be done.
They may be found in the Aug. 2000 issue of Information
Technology and Disabilities, http://www.rit.edu/~easi/itd/itdv07n1/contents.html.
For a complementary approach to player evaluation, please see
also, at this site, Dr. Sara Morley's report on the evaluation of
the Daisy player.
Readers of this CSUN paper should understand that it is a
proposal. It does not constitute an implication of commitment,
funding, or approval by Library management. It does not represent
an intent to make a significant collection available via the
internet or on DVDs. It does not imply any intent to require or
provide computers or internet access to NLS patrons. It does
represent an effort by NLS researchers to explore ways to design
comfortable, convenient, and efficient patron access to future
products. It proposes a way to design software and hardware
controls for future DTB players. While the authors are not
soliciting volunteer participants at this time, they hope that
this proposal can be strengthened or enhanced through community
review and comment.
OVERVIEW
In summary, we propose a three-step process that includes an
evaluator feedback loop:
1. Build at least ten sample DTBs that have various levels of
complexity.
2. Build simple player software and give it to evaluators having
sufficient resources.
3. Change the software in response to evaluator suggestions and
repeat the process.
STEP ONE
Build at least ten sample DTBs that have various levels of
structural complexity.
To evaluate a book player, a set of sample books will be
necessary. We propose a collection of at least ten DTBs built in
conformance with the NISO standard. This standard is presently
posted for public comment at http://www.loc.gov/nls/niso.
A book built to the standard consists of a file set, or folder,
that includes audio files (probably MPEG-encoded), an optional XML
text file, an index file for rapid random access, and a text/audio
synchronization file. To this set we would add a readme file to
help evaluators get started. In terms of print-book structure,
the standard supports various levels of complexity. The range can
go from very simple, such as a novel with only a table of
contents and chapters, to very complex, such as a cookbook with
many categories and perhaps hundreds of recipes. Please see our
standards paper at http://www.rit.edu/~easi/itd/itdv07n1/contents.html,
for a summary description of DTB components and how they relate
to book complexity. We would not include in our initial sample
set the full range of complexity. The first ten would emphasize
leisure reading such as novels with a table of contents (TOC)
listing only chapters. We would also include a few titles of more
elaborate structural complexity such as a collection of short
stories, each having chapters and sub-sections and perhaps having
footnotes containing analytical comments. The average duration of
an audio book is 12 hours, or about 250 MB of compressed audio.
Using a 56 Kbps modem, it would take about 12 hours to download
the book, which is neither a reliable nor an appealing delivery
method. The most practical method at present would be to mail a
DVD, one per book.
As a copyright protection mechanism, the DVDs would be registered
to the individual evaluator and controlled by NLS. Although NLS
patrons are very familiar with the need to safeguard copyrighted
material and have a long history of successfully doing so, each
book's readme file would remind evaluators to safeguard the
material. This DVD set might be available as early as
July 2001.
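The download estimate above can be checked with a short calculation. The sketch below is illustrative only: the function names and the 80 percent effective-throughput figure are assumptions, not values taken from the standard or from NLS measurements. It also derives the average audio bitrate implied by 250 MB for 12 hours of playing time.

```python
def download_hours(size_mb: float, link_kbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to transfer size_mb (decimal megabytes) over a link_kbps link.

    efficiency is the assumed fraction of the nominal line rate actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_mb * 1_000_000 * 8                    # megabytes to bits
    seconds = bits / (link_kbps * 1000 * efficiency)  # Kbps to bits per second
    return seconds / 3600


def audio_bitrate_kbps(size_mb: float, duration_hours: float) -> float:
    """Average encoded bitrate implied by a file size and a playing time."""
    bits = size_mb * 1_000_000 * 8
    return bits / (duration_hours * 3600) / 1000


if __name__ == "__main__":
    print(f"nominal 56 Kbps:  {download_hours(250, 56):.1f} h")
    print(f"80% of line rate: {download_hours(250, 56, 0.8):.1f} h")
    print(f"implied bitrate:  {audio_bitrate_kbps(250, 12):.1f} Kbps")
```

At the nominal line rate the transfer takes a little under 10 hours; at the assumed 80 percent effective rate it takes a little over 12. The implied audio bitrate of roughly 46 Kbps is close to the modem's capacity, which is why downloading a book takes about as long as listening to it.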
STEP TWO
Build simple player software and give it to evaluators having
sufficient resources.
A simple player implements the basic features cited in the
navigation and playback features lists found at http://www.loc.gov/nls/niso.
It supports stop (return to the navigation center), play, pause,
fast forward, and fast reverse. In addition, it:
- presents the audio files in proper order from beginning to end
without user intervention;
- supports multiple bookmarks (set, identify, and clear), subject
to disk space limitations;
- provides rapid access to major divisions that are explicit in
the navigation center, such as chapters;
- is based on a common browser such as IE5;
- is fully keyboard-accessible;
- is self-voicing, so no screen reader is needed for its use;
- has a variable presentation rate;
- offers optional reading or skipping of entities such as
footnotes.
It has "alpha" status, meaning it is tested on the designer's
system but proper and harmless operation on other systems cannot
be guaranteed. A simple player does not support: full text
search; dictionary (define, spell, alternate pronunciations,
etc.); user preference profiles; audio annotation; speech
recognition for voice control (although a knowledgeable user
could add it); jump to page; highlighting; text attribute
indication (italics, bold, etc.); and other complex navigation
features that depend on the presence of full text.
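Several of the simple-player features described above (ordered playback, bookmarks, chapter access, and optional skipping of footnotes) can be sketched as a small state model. This is a minimal illustration, not the proposed player: the class names, field names, and audio file names are hypothetical, and no audio is actually played.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Segment:
    """One audio file in reading order (hypothetical model of a book part)."""
    audio_file: str
    kind: str = "body"              # "body" or "footnote"
    chapter: Optional[str] = None   # set when the segment starts a chapter


@dataclass
class SimplePlayer:
    segments: List[Segment]
    skip_footnotes: bool = False
    position: int = 0               # index of the current segment
    bookmarks: Dict[str, int] = field(default_factory=dict)

    def playlist(self) -> List[str]:
        """Audio files in proper order, honoring the footnote setting."""
        return [s.audio_file for s in self.segments
                if not (self.skip_footnotes and s.kind == "footnote")]

    def set_bookmark(self, name: str) -> None:
        self.bookmarks[name] = self.position

    def list_bookmarks(self) -> List[str]:
        return sorted(self.bookmarks)

    def clear_bookmark(self, name: str) -> None:
        self.bookmarks.pop(name, None)

    def go_to_bookmark(self, name: str) -> None:
        self.position = self.bookmarks[name]

    def go_to_chapter(self, title: str) -> None:
        """Jump to the first segment that starts the named chapter."""
        for index, segment in enumerate(self.segments):
            if segment.chapter == title:
                self.position = index
                return
        raise KeyError(title)


if __name__ == "__main__":
    book = [
        Segment("ch1.mp3", chapter="Chapter 1"),
        Segment("fn1.mp3", kind="footnote"),
        Segment("ch2.mp3", chapter="Chapter 2"),
    ]
    player = SimplePlayer(book, skip_footnotes=True)
    print(player.playlist())
```

Keeping position and bookmarks as plain indexes into the segment list is one simple design choice; a real player would more likely record offsets within the audio, using the book's index and synchronization files.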
A manageable number of evaluators to begin with might be about
20. This number would be increased when the player becomes
stable and, ultimately, many thousands of patrons would be
involved in field testing of a hardware implementation that is
based on the results of this evaluation. Resources needed by
evaluators include the following:
- Minimum computer hardware: Pentium III, 366 MHz; 10GB
available disk space; audio output; DVD reader; keyboard;
internet access.
- Minimum computer software: Windows 98 and versions of Windows
Media Player and Internet Explorer that support MP3 (MPEG audio
Layer 3) and XML, respectively.
- Patron status: a registered NLS patron selected as a volunteer
evaluator.
Player software could be ready for distribution to evaluators by
July 2001. To facilitate timely updates, playback software
distribution would be by download from a password-protected web
site. A self-installer and a readme file would be included.
STEP THREE
Change the software in response to evaluator suggestions and
repeat the process.
Evaluators must agree to listen to significant portions of all
of the sample books, exercise all player features, and provide
designers with a brief overview and a few terse suggestions for
improvement or addition of features. This report could be a
one-page email note.
In response to evaluator suggestions, designers will make
changes and evaluators will repeat step three. Experience with
iterative software development suggests that this process should
yield a rich feature set, user-tested and reasonably bug-free,
after about five
iterations. There will also be a need for designers to provide
more complex test books to exercise the feature set as it becomes
more elaborate.
After multiple iterations, evaluators with special expertise
would be encouraged to suggest a haptic interface for a hand-held
player that implements a subset of the software feature set. We
expect that the evaluation-improvement cycle will reveal what
features are most important to users and thus what feature
subset should have the highest priority for hardware
implementation. This insight will come from "hands-on", in-home
use by patrons familiar with present methods. We recognize that
with this approach there is some danger that results will favor
the more technically savvy sector of our patron population. Being
forewarned, however, we can take special care to make evaluators
as representative as possible and use results as guidance for
follow-on studies. In the follow-on studies we would introduce an
interface to the computer via a special purpose switch matrix
that emulates keyboard commands. Such devices are affordable
enough to allow for multiple copies of several varieties that
could be improved using an iterative process similar to the
software development method proposed in this paper. A remote
control might also be considered so that the user need not even
be aware of the computer involved in the development process. We
plan to have a generic prototype switch matrix in laboratory
evaluation by July 2001. When player software evaluators reach a
consensus on how to control the full feature set, what the haptic
interface should begin with, and what subset it should implement,
we would then recommend a wider study involving perhaps thousands
of representative patrons using a prototype switch matrix. The
details of this step are in the proposal stage.
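The idea of a switch matrix that emulates keyboard commands can be sketched as a lookup from switch position to key name. Everything specific here is an assumption made for illustration: the 2 x 3 layout, the key assignments, and the function names are hypothetical and do not describe the prototype.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]

# Hypothetical 2 x 3 layout: (row, column) of a closed switch -> emulated key.
KEYMAP: Dict[Position, str] = {
    (0, 0): "space",   # play / pause
    (0, 1): "s",       # stop (return to the navigation center)
    (0, 2): "b",       # set bookmark
    (1, 0): "left",    # fast reverse
    (1, 1): "right",   # fast forward
    (1, 2): "n",       # next chapter
}


def closed_switches(matrix: List[List[bool]]) -> List[Position]:
    """Scan the matrix row by row and return the positions of closed switches."""
    return [(r, c)
            for r, row in enumerate(matrix)
            for c, closed in enumerate(row)
            if closed]


def emulate_keys(matrix: List[List[bool]],
                 keymap: Dict[Position, str] = KEYMAP) -> List[str]:
    """Translate closed switches into key names; unmapped switches are ignored."""
    return [keymap[pos] for pos in closed_switches(matrix) if pos in keymap]


if __name__ == "__main__":
    snapshot = [[False, True, False],
                [False, False, False]]
    print(emulate_keys(snapshot))
```

Holding the mapping in a table rather than in the player means each hardware variety can be remapped between evaluation rounds without changing the player software, which suits the iterative process proposed here.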
CONCLUSION
The "1, 2, 3" iterative player software and hardware development
process proposed here yields two DTB interfaces, rigorously
tested and evaluated "hands-on" by NLS patrons in their own homes
or other everyday settings. The first interface will be
computer-based, full-featured, and capable of exploiting complex
DTBs. The second will be haptic, where the evaluator need not be
aware of a computer's presence. It will support a feature subset,
emphasize leisure reading and provide clear guidance for the
design of a portable DTB reader.
Reprinted with author(s) permission. Author(s) retain copyright.