2001 Conference Proceedings



DIGITAL TALKING BOOKS: DEVELOPING A USER INTERFACE

John Cookson (jcoo@loc.gov), Tom McLaughlin, and Lloyd Rasmussen
National Library Service for the Blind and Physically Handicapped
The Library of Congress
1291 Taylor St. NW
Washington, DC 20542

ABSTRACT

On the basis of the NLS-sponsored NISO standard for a digital talking book (DTB), we propose a method to develop a comfortable and effective user interface. Our proposed method is: Select several books of moderate complexity, format them in conformance with the standard, and send them to patron-evaluators on individually registered DVDs. Develop simple, PC-hosted, player software for distribution to the evaluators. Modify player software in response to evaluator suggestions while gradually increasing book complexity and player capability. Select a feature subset, build a flexible haptic interface and iterate that design.

INTRODUCTION

The National Library Service for the Blind and Physically Handicapped (NLS) presently produces about 2000 recorded books per year, in approximately 1000 copies each, on specially formatted cassette tape, and distributes them free of charge to a readership of 764,000 eligible readers via a nationwide network of participating libraries. Books and players are delivered "free matter" through the U.S. Postal Service. NLS is presently engaged in a long-term effort to convert its audio book and magazine services from analog to digital methods. Outline plans for the conversion are expressed in terms of a 20-step program and other summary documents that may be found on our web site, http://www.loc.gov/nls/dtb.html. Two recent papers summarize what progress has been made and what further work needs to be done. They may be found in the Aug. 2000 issue of Information Technology and Disabilities, http://www.rit.edu/~easi/itd/itdv07n1/contents.html. For a complementary approach to player evaluation, please see also, at this site, Dr. Sara Morley's report on the evaluation of the Daisy player.

Readers of this CSUN paper should understand that it is a proposal. It does not constitute an implication of commitment, funding, or approval by Library management. It does not represent an intent to make a significant collection available via the internet or on DVDs. It does not imply any intent to require or provide computers or internet access to NLS patrons. It does represent an effort by NLS researchers to explore ways to design comfortable, convenient, and efficient patron access to future products. It proposes a way to design software and hardware controls for future DTB players. While the authors are not soliciting volunteer participants at this time, they hope that this proposal can be strengthened or enhanced through community review and comment.

OVERVIEW

In summary, we propose a three-step process that includes an evaluator feedback loop:

1. Build at least ten sample DTBs that have various levels of complexity.

2. Build simple player software and give it to evaluators having sufficient resources.

3. Change the software in response to evaluator suggestions and repeat the process.

STEP ONE

Build at least ten sample DTBs that have various levels of structural complexity.

To evaluate a book player, a set of sample books will be necessary. We propose a collection of at least ten DTBs built in conformance with the NISO standard. This standard is presently posted for public comment at http://www.loc.gov/nls/niso. A book built to the standard consists of a file set, or folder, that includes audio files (probably MPEG encoded), an optional XML text file, an index file for rapid random access, and a text/audio synchronization file. To this set we would add a readme file to help evaluators get started.

In terms of print-book structure, the standard supports various levels of complexity. The range can go from very simple, such as a novel with only a table of contents and chapters, to very complex, such as a cookbook with many categories and perhaps hundreds of recipes. Please see our standards paper at http://www.rit.edu/~easi/itd/itdv07n1/contents.html for a summary description of DTB components and how they relate to book complexity. We would not include the full range of complexity in our initial sample set. The first ten would emphasize leisure reading, such as novels with a table of contents (TOC) listing only chapters. We would also include a few titles of more elaborate structural complexity, such as a collection of short stories, each having chapters and sub-sections and perhaps having footnotes containing analytical comments.

The average duration of an audio book is 12 hours, or about 250MB of compressed audio. Using a 56 Kbps modem, it would take about 12 hours to download the book, which is neither a reliable nor an appealing delivery method. The most practical method at present would be to mail a DVD, one per book. As a copyright protection mechanism, the DVDs would be registered to the individual evaluator and controlled by NLS. Although NLS patrons are very familiar with the need to safeguard copyrighted material and have a long history of successfully doing so, the book's readme file would remind evaluators to safeguard the material. This DVD set might be available as early as July 2001.
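The delivery arithmetic above can be checked with a short calculation. The sketch below is illustrative only: it assumes that 1 MB means 1,000,000 bytes, and the `efficiency` parameter is a hypothetical allowance for real-world line conditions rather than a figure from NLS.

```python
def implied_bitrate_kbps(size_mb: float, hours: float) -> float:
    """Average audio bitrate implied by a file size and a playing time."""
    bits = size_mb * 1_000_000 * 8          # assumes 1 MB = 1,000,000 bytes
    return bits / (hours * 3600) / 1000


def download_hours(size_mb: float, link_kbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to transfer size_mb over a link of link_kbps.

    efficiency is an assumed fraction of the nominal link rate that is
    actually achieved (1.0 means the full nominal rate).
    """
    bits = size_mb * 1_000_000 * 8
    return bits / (link_kbps * 1000 * efficiency) / 3600


if __name__ == "__main__":
    print(round(implied_bitrate_kbps(250, 12), 1))   # audio rate of the book
    print(round(download_hours(250, 56), 1))         # nominal modem rate
    print(round(download_hours(250, 56, 0.8), 1))    # assumed 80% of nominal
```

Under these assumptions, a 12-hour book in 250 MB implies an average audio rate of roughly 46 kbps. The download takes roughly 10 hours at the nominal modem rate and somewhat over 12 hours at 80 percent of nominal, which is in line with the estimate given above.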

STEP TWO

Build simple player software and give it to evaluators having sufficient resources.

A simple player implements the basic features cited in the navigation and playback feature lists found at http://www.loc.gov/nls/niso. It supports stop (return to the navigation center), play, pause, fast forward, and fast reverse. It presents the audio files in proper order from beginning to end without user intervention; supports multiple bookmarks (set, identify, and clear), subject to disk space limitations; provides rapid access to major divisions that are explicit in the navigation center, such as chapters; is based on a common browser such as IE5; is fully keyboard-accessible; is self-voicing, so no screen reader is needed for its use; has a variable presentation rate; and offers optional reading or skipping of entities such as footnotes. It has "alpha" status, meaning that it has been tested on the designer's system but proper and harmless operation on other systems cannot be guaranteed.

A simple player does not support full text search; dictionary functions (define, spell, alternate pronunciations, etc.); user preference profiles; audio annotation; speech recognition for voice control (although a knowledgeable user could add it); jump to page; highlighting; text attribute indication (italics, bold, etc.); or other complex navigation features that depend on the presence of full text.
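The behavior described above can be modeled in a few lines. The following is a minimal sketch of the control logic only, with no audio output; the class names, state names, and the "chapter"/"footnote" labels are hypothetical and are not taken from the NISO standard or from any NLS software.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One audio file of the book; 'kind' is a hypothetical label."""
    title: str
    kind: str = "chapter"  # "chapter" or "footnote" in this sketch


@dataclass
class SimplePlayer:
    segments: list
    skip_footnotes: bool = True
    position: int = 0
    state: str = "navigation"  # "navigation", "playing", or "paused"
    bookmarks: dict = field(default_factory=dict)

    def play(self):
        self.state = "playing"

    def pause(self):
        if self.state == "playing":
            self.state = "paused"

    def stop(self):
        # Stop returns the listener to the navigation center.
        self.state = "navigation"

    def playback_order(self):
        """Titles in the order they would be presented, start to end."""
        return [s.title for s in self.segments
                if not (self.skip_footnotes and s.kind == "footnote")]

    def go_to_chapter(self, title):
        """Rapid access to a major division such as a chapter."""
        for i, s in enumerate(self.segments):
            if s.kind == "chapter" and s.title == title:
                self.position = i
                return
        raise KeyError(title)

    def set_bookmark(self, name):
        self.bookmarks[name] = self.position

    def clear_bookmark(self, name):
        self.bookmarks.pop(name, None)

    def go_to_bookmark(self, name):
        self.position = self.bookmarks[name]
```

A real player would attach audio playback, fast forward and reverse, and variable presentation rate to these states; the sketch only shows how in-order presentation, optional footnote skipping, chapter access, and bookmarks might fit together.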

A manageable number of evaluators to begin with might be about 20. This number would be increased once the player has stabilized, and ultimately many thousands of patrons would be involved in field testing of a hardware implementation based on the results of this evaluation. Resources needed by evaluators include the following:

STEP THREE

Change the software in response to evaluator suggestions and repeat the process.

Evaluators must agree to listen to significant portions of all of the sample books, exercise all player features, and provide designers with a brief overview and a few terse suggestions for improvement or addition of features. This report could be a one-page email note.

In response to evaluator suggestions, designers will make changes and evaluators will repeat step three. The history of software suggests that this process will yield a very rich feature set, user-tested and reasonably bug-free after about 5 iterations. There will also be a need for designers to provide more complex test books to exercise the feature set as it becomes more elaborate.
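One way designers might summarize evaluator reports between iterations is a simple tally of which features evaluators mention. The sketch below is only an illustration: the report format (a list of feature names per evaluator) and the function names are hypothetical, and the paper does not prescribe any particular method of analysis.

```python
from collections import Counter


def rank_features(reports):
    """Rank features by how many evaluators mentioned them.

    reports is a list with one entry per evaluator; each entry is a list
    of feature names taken from that evaluator's note. A feature is
    counted once per evaluator. Ties are broken alphabetically so the
    result is deterministic.
    """
    counts = Counter()
    for features in reports:
        counts.update(set(features))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def top_subset(reports, n):
    """Names of the n most frequently mentioned features."""
    return [name for name, _ in rank_features(reports)[:n]]
```

A ranking of this kind could help identify a candidate feature subset, though it would complement rather than replace the designers' reading of the evaluators' comments.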

After multiple iterations, evaluators with special expertise would be encouraged to suggest a haptic interface for a hand-held player that implements a subset of the software feature set. We expect that the evaluation-improvement cycle will reveal which features are most important to users and thus which feature subset should have the highest priority for hardware implementation. This insight will come from "hands-on", in-home use by patrons familiar with present methods. We recognize that with this approach there is some danger that results will favor the more technically savvy sector of our patron population. Being forewarned, however, we can take special care to make evaluators as representative as possible and use results as guidance for follow-on studies.

In the follow-on studies we would introduce an interface to the computer via a special-purpose switch matrix that emulates keyboard commands. Such devices are affordable enough to allow for multiple copies of several varieties that could be improved using an iterative process similar to the software development method proposed in this paper. A remote control might also be considered, so that the user need not even be aware of the computer involved in the development process. We plan to have a generic prototype switch matrix in laboratory evaluation by July 2001. When player software evaluators reach a consensus on how to control the full feature set, what the haptic interface should begin with, and what subset it should implement, we would then recommend a wider study involving perhaps thousands of representative patrons using a prototype switch matrix. The details of this step are in the proposal stage.
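The idea of a switch matrix that emulates keyboard commands can be sketched as two lookup tables: one from switch position to key, and one from key to player command. Everything named here (the key names, the command names, and the 2-by-3 layout) is hypothetical; the paper does not specify a layout or key bindings.

```python
# Hypothetical keyboard bindings of the PC-hosted player: key -> command.
KEY_COMMANDS = {
    "space": "play_pause",
    "s": "stop",
    "right": "fast_forward",
    "left": "fast_reverse",
    "b": "set_bookmark",
}

# Hypothetical 2-by-3 switch matrix: (row, column) -> emulated key.
# Position (1, 2) is deliberately left unassigned.
SWITCH_KEYS = {
    (0, 0): "left",
    (0, 1): "space",
    (0, 2): "right",
    (1, 0): "s",
    (1, 1): "b",
}


def command_for_switch(row, col, layout=SWITCH_KEYS, bindings=KEY_COMMANDS):
    """Return the player command triggered by a switch, or None."""
    key = layout.get((row, col))
    if key is None:
        return None
    return bindings.get(key)
```

Because the player only sees keystrokes, a different physical layout can be tried by passing another `layout` table, without changing the player software. That separation is what would let several varieties of switch matrix be compared in the way the text describes.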

CONCLUSION

The "1, 2, 3" iterative player software and hardware development process proposed here yields two DTB interfaces, rigorously tested and evaluated "hands-on" by NLS patrons in their own homes or other everyday settings. The first interface will be computer-based, full-featured, and capable of exploiting complex DTBs. The second will be haptic, where the evaluator need not be aware of a computer's presence. It will support a feature subset, emphasize leisure reading, and provide clear guidance for the design of a portable DTB reader.




Reprinted with author(s) permission. Author(s) retain copyright.