Return to 2000 Table of Contents
Chieko Asakawa, Hironobu Takagi and Takashi Itoh
chie@jp.ibm.com, takagih@jp.ibm.com and JL03313@jp.ibm.com
IBM Japan Ltd., Tokyo Research Laboratory
1624-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502,
Japan
Personal computers have played an important role in enabling
the blind to access printed documents. Blind users can
communicate with others by e-mail and can read and write
electronic documents by using screen readers and other
assistive technology software. These days, however, electronic
documents are becoming increasingly visual, with various fonts,
colors, and illustrations. In addition, their layouts are
becoming more complex, since authors are paying more attention
to visual appearance. These visual characteristics are making
electronic documents increasingly inaccessible to blind users.
They have a hard time reading such formatted documents with
screen readers, and need to know how to operate a variety of
applications just to read documents. Consequently, they tend to
ask sighted people to provide documents in plain text format.
However, automatic conversion of formatted documents into plain
text format sometimes destroys the original logical structure.
Sighted people therefore have to spend time checking whether
the conversion was successful or not.
Another issue that affects blind users is the difficulty of
using a screen reader to access presentation materials created
by using software such as Lotus Freelance and MS PowerPoint(R).
They are often unable to include visual elements in their own
presentations or to follow presentations given by others
without assistance, because they cannot access presentation
software designed for sighted people.
We therefore decided to develop a document reader that allows
blind users to access various types of formatted documents
through a single user interface. Our system uses the object
model of each application, so it can present the contents of a
document without regard to its layout. Users can also navigate
the logical structure of a document, so it is easy to locate
titles, headings, list items, paragraphs, sentences, and so
on.
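The idea of reading a document by its logical structure rather than its layout can be illustrated with a minimal Python sketch. This is not the prototype's code: the `Element` and `LogicalDocument` names, the element kinds, and the sample content are all hypothetical, and a real adapter would obtain the elements from the application's object model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    kind: str  # hypothetical kinds: "title", "heading", "list_item", "paragraph"
    text: str


class LogicalDocument:
    """Hypothetical layout-free view of a document: an ordered list of elements."""

    def __init__(self, elements: List[Element]) -> None:
        self.elements = elements
        self.position = 0

    def current(self) -> Element:
        return self.elements[self.position]

    def jump_to_next(self, kind: str) -> Optional[Element]:
        """Move to the next element of the given kind; stay put if there is none."""
        for i in range(self.position + 1, len(self.elements)):
            if self.elements[i].kind == kind:
                self.position = i
                return self.elements[i]
        return None


# Made-up sample content, used only to show the navigation.
doc = LogicalDocument([
    Element("title", "Quarterly report"),
    Element("paragraph", "Sales rose in the first quarter."),
    Element("heading", "Outlook"),
    Element("list_item", "Expand to two new regions."),
])
```

Calling `doc.jump_to_next("heading")` moves the reading position to the "Outlook" heading regardless of where that heading sits on the page, which is the property the object-model approach relies on.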
Since the user interface for navigating through a document is
universal, users do not need to operate any word processors or
presentation software to read formatted documents. In the
following section, we will describe the user interface of our
prototype document reader.
Overview of the system
The figure above shows the structure of
our document reader. Documents should be opened through
Explorer. When a document is opened, it is listed in the
document selector. The system uses a numeric keypad for command
input. When the minus key is pressed, the document selector
becomes active. The name of any document in the selector can be
announced by pressing the up/down cursor key. When the Enter
key is pressed, the document becomes ready for reading.
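The behavior of the document selector described above can be sketched as a small state machine. The sketch below is an illustration under stated assumptions, not the prototype's implementation: the `DocumentSelector` class and the key names are hypothetical, and stopping at the ends of the list (rather than wrapping around) is an assumption the text does not confirm.

```python
from typing import List, Optional


class DocumentSelector:
    """Hypothetical model of the document selector driven by keypad input."""

    def __init__(self) -> None:
        self.names: List[str] = []
        self.index = 0
        self.active = False
        self.selected: Optional[str] = None

    def add(self, name: str) -> None:
        """Called when a document is opened; it is then listed in the selector."""
        self.names.append(name)

    def press(self, key: str) -> Optional[str]:
        """Return the text to announce for this key press, or None."""
        if key == "minus":
            self.active = True  # the minus key activates the selector
            return None
        if not self.active or not self.names:
            return None
        if key == "up":
            # Assumption: the cursor stops at the first entry.
            self.index = max(self.index - 1, 0)
            return self.names[self.index]
        if key == "down":
            # Assumption: the cursor stops at the last entry.
            self.index = min(self.index + 1, len(self.names) - 1)
            return self.names[self.index]
        if key == "enter":
            # The chosen document becomes ready for reading.
            self.selected = self.names[self.index]
            self.active = False
        return None
```

Returning the announcement as a string keeps the selector independent of any particular speech engine, so the same logic could feed speech or braille output.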
Currently, our prototype system supports two word processors,
MS Word (hereafter Word) and Lotus WordPro(TM) (hereafter
WordPro), and two presentation packages, MS PowerPoint(R)
(hereafter PowerPoint) and Lotus Freelance (hereafter
Freelance).
When a user presses one of the navigation keys on the numeric
keypad, the document reading handler queries the application's
object model for the requested text and sends the result to a
text-to-speech (TTS) engine. Three TTS engines are currently
available with the system: ViaVoice Outloud, L&H, and ProTalker
for Japanese. The user can switch engines by pressing the Enter
key followed by the asterisk key. When a braille pin display is
connected, the system can also produce braille output. Pressing
the Enter key stops the speech output.
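The flow from key press to speech output can be sketched as follows. This is a simplified illustration, not the prototype's code: `FakeEngine` stands in for a real TTS engine and only records text, the `commands` dictionary stands in for queries against an application's object model, and all names are hypothetical. Treating the Enter key as both "stop speech" and the first half of the engine-switch sequence is an assumption about how the two functions coexist.

```python
from typing import Callable, Dict, List


class FakeEngine:
    """Stand-in for a TTS engine; it records text instead of speaking."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.spoken: List[str] = []
        self.stop_count = 0

    def speak(self, text: str) -> None:
        self.spoken.append(text)

    def stop(self) -> None:
        self.stop_count += 1


class ReadingHandler:
    """Hypothetical handler: key press -> object model query -> TTS engine."""

    def __init__(self, commands: Dict[str, Callable[[], str]],
                 engines: List[FakeEngine]) -> None:
        self.commands = commands      # key -> function that fetches text
        self.engines = engines
        self.engine_index = 0
        self.enter_pending = False    # True right after Enter was pressed

    @property
    def engine(self) -> FakeEngine:
        return self.engines[self.engine_index]

    def press(self, key: str) -> None:
        if key == "enter":
            self.engine.stop()        # Enter stops the current speech output
            self.enter_pending = True
            return
        if key == "*" and self.enter_pending:
            # Enter followed by asterisk switches to the next engine.
            self.enter_pending = False
            self.engine_index = (self.engine_index + 1) % len(self.engines)
            return
        self.enter_pending = False
        query = self.commands.get(key)
        if query is not None:
            self.engine.speak(query())  # fetch the text, then speak it
```

Because the handler only sees functions that return text, the same handler could sit in front of any application adapter without knowing which application supplied the text.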
Navigation functions for word processors
Table 1 shows the navigation keys for word processors. Keys are
provided for the most common ways of reading a document, such
as paragraph by paragraph, sentence by sentence, word by word,
page by page, and heading by heading, in both Word and
WordPro(TM).
The functions available for each application differ somewhat,
because the applications have different characteristics and
their object models expose different functions. For example,
the WordPro object model has no function for determining the
current character, so the system cannot provide a character
jump key for WordPro.
Table 1: Navigation keys for MS Word and Lotus WordPro
Key        Function
1          Jump to previous page
2          Read current page
3          Jump to next page
4          Jump to previous sentence/paragraph
5          Read current sentence/paragraph
6          Jump to next sentence/paragraph
7          Jump to previous word
8          Read current word
9          Jump to next word
0          Play from current sentence
Plus + 1   Jump to first page
Plus + 2   Toggle jump mode (page/heading)
Plus + 3   Jump to last page
Plus + 4   Play from top of document
Plus + 5   Toggle jump mode (sentence/paragraph)
Plus + 6   Jump to last sentence/paragraph
Plus + 8   [Word only] Toggle jump mode (word/character)
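The key assignments in Table 1 can be expressed as a lookup table, which also shows one way to handle a function that exists for Word but not for WordPro. The command names and the `resolve` function below are hypothetical; only the key-to-function pairing is taken from the table.

```python
from typing import Dict, Optional, Tuple

# (plus_prefix, key) -> command name, transcribed from Table 1.
# The command names themselves are hypothetical labels.
KEYMAP: Dict[Tuple[bool, str], str] = {
    (False, "1"): "jump_previous_page",
    (False, "2"): "read_current_page",
    (False, "3"): "jump_next_page",
    (False, "4"): "jump_previous_sentence_or_paragraph",
    (False, "5"): "read_current_sentence_or_paragraph",
    (False, "6"): "jump_next_sentence_or_paragraph",
    (False, "7"): "jump_previous_word",
    (False, "8"): "read_current_word",
    (False, "9"): "jump_next_word",
    (False, "0"): "play_from_current_sentence",
    (True, "1"): "jump_first_page",
    (True, "2"): "toggle_page_heading_mode",
    (True, "3"): "jump_last_page",
    (True, "4"): "play_from_top_of_document",
    (True, "5"): "toggle_sentence_paragraph_mode",
    (True, "6"): "jump_last_sentence_or_paragraph",
    (True, "8"): "toggle_word_character_mode",
}

# Commands marked "[Word only]" in Table 1.
WORD_ONLY = {"toggle_word_character_mode"}


def resolve(app: str, key: str, plus: bool = False) -> Optional[str]:
    """Return the command for a key press, or None if it is unassigned
    or not available for the given application."""
    command = KEYMAP.get((plus, key))
    if command in WORD_ONLY and app != "Word":
        return None
    return command
```

For example, `resolve("Word", "8", plus=True)` yields the word/character toggle, while the same keys for `"WordPro"` yield `None`, mirroring the missing character function in the WordPro object model.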
Navigation functions for presentation software
Table 2 shows the navigation keys for presentation software. A
unique feature of the system is its support for slideshow mode.
It is usually very hard for blind presenters to know which
slide is currently on the screen, and they have to pay careful
attention to avoid making mistakes. In slideshow mode, the
system reads out the title of each new slide when it appears on
the screen, so presenters can keep track of their slides as
they advance. This makes giving a presentation much easier for
blind users.
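Announcing the title whenever a new slide appears can be sketched as a simple callback. This is an assumption-laden illustration: how the prototype learns that the slide has changed is not described, so `on_slide_changed` is a hypothetical hook, and the slide titles are made-up sample data.

```python
from typing import Callable, List


class SlideshowAnnouncer:
    """Hypothetical helper that speaks a slide's title when it appears."""

    def __init__(self, titles: List[str],
                 speak: Callable[[str], None]) -> None:
        self.titles = titles
        self.speak = speak
        self.current = 0

    def on_slide_changed(self, index: int) -> None:
        """Call this whenever a new slide appears on the screen."""
        self.current = index
        self.speak(self.titles[index])


# Collect the announcements in a list instead of sending them to a TTS engine.
spoken: List[str] = []
announcer = SlideshowAnnouncer(
    ["Introduction", "System overview", "Plans"], spoken.append
)
announcer.on_slide_changed(1)
```

After the call above, `spoken` holds the single announcement for the second slide, which is what the presenter would hear when that slide comes up.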
Presentation packages are especially difficult to read through
screen readers. Our system reads all the text information
contained in a slide, just as if it were a text document. The
logical structure of a slide can be navigated by using each
object model's capabilities.
Table 2: Navigation keys for MS PowerPoint and Lotus Freelance
Key        Function
1          Jump to previous slide and announce its title
2          Announce title of current slide
3          Jump to next slide and announce its title
4          Jump to previous shape
5          Read current shape
6          Jump to next shape
7          Jump to previous paragraph in shape
8          Read current paragraph
9          Jump to next paragraph in shape
0          Play from current shape
*          [PowerPoint] Toggle slideshow mode/normal mode
           [Freelance] Slideshow mode (no resume)
Plus + 1   Jump to first slide
Plus + 3   Jump to last slide
Plus + 4   Play from start of slide
Plus + 6   Jump to last shape
Plus + 7   Jump to first paragraph in shape
Plus + 8   [PowerPoint] Toggle jump mode (paragraph/sentence/word/character)
           [Freelance] Toggle jump mode (paragraph/character)
Plus + 9   Jump to last paragraph in shape
Shape: a group of graphic objects
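The shape and paragraph navigation in Table 2 can be sketched with a cursor over a slide modeled as a list of shapes, each holding a list of paragraphs. The `SlideCursor` class is hypothetical, and two behaviors are assumptions not stated in the table: jumps stop at the first and last item, and jumping to another shape resets the paragraph position to the start of that shape.

```python
from typing import List


class SlideCursor:
    """Hypothetical cursor over one slide: shapes containing paragraphs."""

    def __init__(self, shapes: List[List[str]]) -> None:
        self.shapes = shapes  # each shape is a list of paragraph strings
        self.shape = 0
        self.paragraph = 0

    def previous_shape(self) -> None:        # key 4
        if self.shape > 0:
            self.shape -= 1
            self.paragraph = 0

    def next_shape(self) -> None:            # key 6
        if self.shape < len(self.shapes) - 1:
            self.shape += 1
            self.paragraph = 0

    def read_shape(self) -> str:             # key 5
        return " ".join(self.shapes[self.shape])

    def previous_paragraph(self) -> None:    # key 7
        if self.paragraph > 0:
            self.paragraph -= 1

    def next_paragraph(self) -> None:        # key 9
        if self.paragraph < len(self.shapes[self.shape]) - 1:
            self.paragraph += 1

    def read_paragraph(self) -> str:         # key 8
        return self.shapes[self.shape][self.paragraph]

    def play_from_current_shape(self) -> str:  # key 0
        """Concatenate all text from the current shape to the end of the slide."""
        return " ".join(" ".join(shape) for shape in self.shapes[self.shape:])
```

Keeping the jump and read operations separate follows the table, where keys 4 and 6 move between shapes and key 5 reads the current one.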
Plans
The prototype document reader provides a nonvisual
universal user interface for four applications. Users do not
need to know how to operate each application, and they can read
documents formatted for those applications by using only the
numeric keypad. This approach will enable even computer novices
to read formatted documents quickly and easily.
We will keep studying object models for other applications
such as spreadsheets, mail software, databases, and Web
browsers, to provide the same user interface for as many
applications as possible. We will also add other navigation
functions to make it possible to read a document more quickly
and precisely by taking advantage of the object model's
capabilities.
Our next goal is to provide a universal nonvisual method for
writing and editing documents, including rich text information,
in the format of any type of application, such as Word,
WordPro(TM), PowerPoint(R), Freelance, Lotus 1-2-3, or MS Excel.
Such a method will enable blind users to create visual
presentation packages and various kinds of documents without
any assistance, through the universal user interface.
After creating a document, they will be able to check how it
looks by using our system's document-reading capabilities.