2001 Conference Proceedings

CONSIDERATIONS FOR USER INTERACTION WITH TALKING SCREEN READERS

Paul Blenkhorn and Gareth Evans
Centre for Rehabilitation Engineering, Speech and Sensory Technology (CRESST)
Department of Computation, UMIST
PO Box 88, Manchester, M60 1QD, United Kingdom.
p.blenkhorn@co.umist.ac.uk

Introduction

Talking screen readers are widely used by blind people to access modern 'windowed' Graphical User Interfaces (GUIs). In many respects the operation of these systems is standardized. However, there are a number of areas in which there are alternative methods of providing the user with access to information, and methods by which additional information can be given. This paper examines some of the pertinent user interaction issues in the context of a new, low-cost Windows screen reader called LookOUT, which has been developed by the authors.

There are three principles of screen reader design that are widely accepted. These are:

Many commercially available screen readers are very effective and go a long way toward achieving these goals. However, there are some areas that warrant further investigation and discussion. These include:

Each of these areas is considered in more detail below with particular reference to the way the issues are addressed in the LookOUT screen reader.

Additional Information

When a user moves the cursor around the screen using the cursor keys, most screen readers speak information appropriate to the action. Consider reading a document within a word processor: when the up and down cursor keys are used, the complete line is spoken. When the left and right cursor keys are used, the next character is spoken. When the Control key is used with the left and right cursor keys, movement is from word to word, and in this case the new word is spoken. These modes are well established and give good access to the text. However, they do not indicate to the user the position of the cursor on the screen, which can be useful for determining the layout of a document. Moreover, they do not indicate to the user the type of character under the cursor. The screen reader could easily speak this position and character type information, but this would compromise the maxim of 'maximum information, minimum speech'. In LookOUT, this position and character type information is optionally provided by tones that are played at the same time as the speech.

LookOUT takes advantage of a modern sound card's ability to play wave and MIDI information at the same time. The MIDI channel is used for the tones. To allow the user to distinguish between vertical and horizontal movement, different MIDI instruments are used. Once the user is familiar with the tones, he/she can determine screen position and action (horizontal or vertical movement) quite easily.
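As a minimal sketch of this idea, and not LookOUT's actual implementation, the following Python turns a cursor position into raw MIDI messages. The instrument numbers, the pitch range, the linear scaling, and the function name position_tone are all assumptions made for illustration; the paper does not specify them.

```python
# Hypothetical mapping from cursor movement to MIDI messages.
# Instrument choices and pitch range are assumptions, not LookOUT's values.

HORIZONTAL_INSTRUMENT = 0     # General MIDI program 1 (Acoustic Grand Piano)
VERTICAL_INSTRUMENT = 73      # General MIDI program 74 (Flute)
LOW_NOTE, HIGH_NOTE = 48, 84  # an arbitrary three-octave range


def position_tone(index, extent, horizontal, channel=0, velocity=64):
    """Return (program_change, note_on) byte strings for a cursor position.

    index  -- zero-based column (horizontal) or row (vertical)
    extent -- total number of columns or rows on the screen
    """
    if not 0 <= index < extent:
        raise ValueError("index outside the screen")
    # Scale the position linearly onto the note range.
    fraction = index / (extent - 1) if extent > 1 else 0.0
    note = LOW_NOTE + round(fraction * (HIGH_NOTE - LOW_NOTE))
    if not horizontal:
        # Rows count downward, so invert: top of the screen = highest pitch.
        note = HIGH_NOTE - (note - LOW_NOTE)
    program = HORIZONTAL_INSTRUMENT if horizontal else VERTICAL_INSTRUMENT
    program_change = bytes([0xC0 | channel, program])  # select the instrument
    note_on = bytes([0x90 | channel, note, velocity])  # start the tone
    return program_change, note_on
```

Because the instrument changes with the direction of movement and the pitch changes with the position, a listener could in principle recover both from a single tone, which is the property the paragraph above describes.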

The modes for position are:

LookOUT can also use tones to indicate 'character type' in parallel with speech. In this case different tones are used to indicate different character types and the user can quickly review the document.

This approach can be extended to give additional information; for example, it can be used to indicate whether a character is bold, underlined, or italic by changing the instrument accordingly. It could also be augmented to indicate changes in font, with different fonts being represented by different instruments.
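The character type tones might be sketched as follows. The set of character types, the note numbers, and the function names are assumptions made for illustration; the paper does not list the types or tones that LookOUT uses.

```python
import unicodedata

# Hypothetical tone table: MIDI note numbers chosen only for illustration.
CHARACTER_TYPE_NOTES = {
    "upper": 72,
    "lower": 60,
    "digit": 67,
    "space": 48,
    "punctuation": 76,
    "other": 55,
}


def character_type(ch):
    """Classify a single character into one of the assumed types."""
    if ch.isspace():
        return "space"
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    if unicodedata.category(ch).startswith("P"):
        return "punctuation"
    return "other"


def tones_for_line(text):
    """Return the note to play alongside the speech for each character."""
    return [CHARACTER_TYPE_NOTES[character_type(ch)] for ch in text]
```

Extending the table with a second dimension (for example, an instrument per font or per bold/underlined/italic attribute) would follow the same pattern.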

Another approach to giving character type information in parallel with speech is to use 'force feedback' joysticks or mice, which will allow the user to 'feel' the format of the document. It is believed that considerable research and evaluation work is necessary before this type of interface becomes commercially available.

Varying the Amount of Speech

What satisfies the concept of 'maximum information, minimum speech' varies with the experience of the user. The key issue is how familiar the user is with the operating system. For example, when presented with a checkbox in Windows, the experienced user needs to know the text associated with the checkbox, the fact that it is a checkbox (rather than, say, a radio button) and its status (whether it is checked or not). The novice user will also wish to know the type of operation he/she can perform with the interface element. For example, in the case of a checkbox he/she needs to know that the status of the checkbox can be changed by the space bar. LookOUT supports two modes of operation: standard, which assumes that the user is familiar with Windows, and novice, which is a much more verbose mode that explains the options for each possible operation. For a novice user, LookOUT first speaks the information that it presents in the same circumstances to a standard user before providing the additional information. This strategy is adopted to support a novice user's migration to a standard user.
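The two modes can be illustrated with a short sketch. The function name and the exact wording of the messages are assumptions; only the ordering (standard information first, extra guidance afterwards) comes from the description above.

```python
def describe_checkbox(label, checked, novice=False):
    """Build the utterance for a checkbox in standard or novice mode."""
    state = "checked" if checked else "not checked"
    # Standard mode: label, control type, status.
    message = f"{label}, checkbox, {state}"
    if novice:
        # Novice mode appends guidance after the standard information,
        # so the start of the utterance is identical in both modes.
        message += ". Press the space bar to change."
    return message
```

Keeping the standard utterance as a prefix of the novice one means a user who switches modes hears nothing unfamiliar, only less.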

When a user types, most screen readers support either character or word echo; the user decides which. LookOUT, in addition to supporting these modes, provides a combined character and word echo mode. In this mode, each character is spoken as the user types it and, when the space bar is pressed, the whole word is spoken. This mode has been provided for novice users and was prompted by the experience of teachers and trainers.
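A minimal sketch of the three echo modes follows. The class and mode names are hypothetical, and the handling of the space character in plain character mode (nothing is spoken) is a simplifying assumption.

```python
MODES = ("character", "word", "character and word")


class TypingEcho:
    """Decide what to speak as each character is typed."""

    def __init__(self, mode="character and word"):
        if mode not in MODES:
            raise ValueError(f"unknown echo mode: {mode}")
        self.mode = mode
        self._word = []  # characters typed since the last space

    def key_typed(self, ch):
        """Return the list of utterances for one typed character."""
        spoken = []
        if ch == " ":
            # The space bar ends a word; speak it in the word-based modes.
            if self.mode in ("word", "character and word") and self._word:
                spoken.append("".join(self._word))
            self._word = []
        else:
            self._word.append(ch)
            if self.mode in ("character", "character and word"):
                spoken.append(ch)
        return spoken
```

For example, typing "hi" followed by a space in the combined mode yields "h", then "i", then "hi".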

Speech is sometimes required to confirm a user's action. For example, the Insert key toggles the mode of operation for a word processor between inserting text at the caret and overtyping text. The user needs to be aware that he/she has hit the key, and also needs to be aware of the mode that has been entered. Of course, other keys, such as the Shift key, can modify the operation of the Insert key. If Shift and Insert are pressed together, a paste operation is performed; the user is informed of the paste operation and the text that is to be pasted is read.
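The confirmation behavior for the Insert key can be sketched as a small state machine. The class name and the spoken phrases are assumptions; the paper states only that the new mode, or the paste operation and the pasted text, are announced.

```python
class InsertKeyFeedback:
    """Track insert/overtype mode and produce the spoken confirmation."""

    def __init__(self):
        self.overtype = False  # assume the editor starts in insert mode

    def key_pressed(self, shift=False, clipboard=""):
        """Return the utterance for Insert, or Shift+Insert when shift=True."""
        if shift:
            # Shift+Insert pastes: announce the operation and read the text.
            return f"Paste: {clipboard}"
        # Plain Insert toggles the mode: announce the mode just entered.
        self.overtype = not self.overtype
        return "Overtype" if self.overtype else "Insert"
```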

Finally, the user needs to be aware of screen changes, especially new windows that appear. New windows may appear due to user action (for example, when the user has entered the command to save a document) or due to system messages (such as a lost network connection). LookOUT tries to deal with the different types of windows in an appropriate manner. When a window results from a user action, its title is read and the user can then use the 'screen review' keys to investigate the window further. When a system message appears, LookOUT reads the whole window. LookOUT distinguishes between user-prompted windows and system messages based on size: a window smaller than a certain size is deemed to be a system message.
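The size heuristic can be sketched as follows. The threshold value, the use of area as the measure of size, and the Window type are all assumptions; the paper says only "a certain size".

```python
from dataclasses import dataclass

# Assumed threshold in square pixels; not a value taken from LookOUT.
SYSTEM_MESSAGE_MAX_AREA = 400 * 200


@dataclass
class Window:
    title: str
    text: str
    width: int
    height: int


def announce(window):
    """Return what to speak when a new window appears."""
    if window.width * window.height < SYSTEM_MESSAGE_MAX_AREA:
        # Small window: treat as a system message and read all of it.
        return f"{window.title}. {window.text}"
    # Larger window: read the title only and leave the rest to screen review.
    return window.title
```

A heuristic of this kind is cheap to apply but can misclassify, for instance a small dialog opened by the user would be read in full.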

Configuration

As stated above, one goal when designing a screen reader can be to make any standard application appear to be a specialized application that was written for blind users. An important issue to be addressed, therefore, is how the screen reader can be configured to give this illusion for a particular application. Screen readers typically use two approaches to this problem. One is to use screen markers and the other is to use scripts.

A screen marker is a location or area of the screen that is associated with a particular combination of keys. When the user presses the key combination, the cursor is moved to the location. For a given application, screen markers can be set up relatively easily. This can be achieved either by the blind user (exploring the screen using his/her screen reader in screen reading mode) or by a sighted colleague or friend. Thus, for an application that is not supported by the screen reader, a relatively unskilled person can develop a reasonably efficient interface. However, markers only examine appropriate places on the screen and, whilst useful, they do not wholly fulfill the goal of making the application appear to be a specialized talking application. LookOUT supports the use of markers, and markers can be saved for subsequent use with an application.
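A marker set for one application might be represented as in the sketch below. The class name, the key-combination strings, and the JSON storage format are assumptions made for illustration; the paper does not describe how LookOUT stores markers.

```python
import json


class MarkerSet:
    """Screen markers for one application: key combination -> (x, y)."""

    def __init__(self, application):
        self.application = application
        self._markers = {}

    def set_marker(self, keys, x, y):
        """Associate a key combination with a screen location."""
        self._markers[keys] = (x, y)

    def cursor_target(self, keys):
        """Return the location for a key combination, or None if unset."""
        return self._markers.get(keys)

    def dumps(self):
        """Serialize the markers so they can be saved with the application."""
        return json.dumps(
            {"application": self.application,
             "markers": {k: list(v) for k, v in self._markers.items()}},
            sort_keys=True,
        )

    @classmethod
    def loads(cls, text):
        """Rebuild a marker set from its saved form."""
        data = json.loads(text)
        markers = cls(data["application"])
        for keys, (x, y) in data["markers"].items():
            markers.set_marker(keys, x, y)
        return markers
```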

One can view scripting as a more general method of specializing the behavior of a screen reader so that it supports a particular application. Scripts are written in a programming language and are loaded together with an application. When the user interacts with an application through the keyboard, code in the script is executed before information is passed to the screen reader. Thus, the functionality of the screen reader can be adapted for a particular application. The big advantage of this approach is that, given a well-written script, the standard application can really appear as if it were written for a blind user. In effect, the interface has been rewritten. Functionality far in excess of simply reading the screen can be incorporated. For example, LookOUT's Microsoft Excel script distinguishes between numbers and formulae. A cell with a number will have its row and column location and then its value read, whereas for a cell with a formula the script will read the location, the result and the formula used to compute the result. If the cell has a comment, this will also be read. Another example of a LookOUT script is the one that is used to control the CD player in Windows. Graphically, this has command buttons that allow the player to be started, stopped, paused, etc. This interface is completely remapped to the numeric keypad by the LookOUT script, which uses the '4' key to start, the '5' key to stop, etc. In this way a completely new interface is created. Script files can also contain help information that explains the interface.
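The Excel behavior described above can be sketched in a few lines. This is written in Python purely for illustration (LookOUT's scripts use Visual Basic Script), and the function name and spoken wording are assumptions.

```python
def describe_cell(row, column, value, formula=None, comment=None):
    """Build the utterance for a spreadsheet cell.

    A plain number is read as location then value; a formula cell is read
    as location, result, then the formula. A comment, if any, comes last.
    """
    parts = [f"row {row} column {column}"]
    if formula is None:
        parts.append(str(value))
    else:
        parts.append(f"result {value}")
        parts.append(f"formula {formula}")
    if comment:
        parts.append(f"comment {comment}")
    return ", ".join(parts)
```

The point of the sketch is that the script, not the screen, determines what is spoken: the formula is not visible in the cell itself, yet it is part of the utterance.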

The problem with scripts is that scripting is relatively complicated and, unless the vendor supplies a script, few people have the expertise to write new scripts. This problem is partially ameliorated in LookOUT by using Microsoft Visual Basic Script rather than a proprietary scripting language. Because Visual Basic is a very widely used programming language, it is thought that a reasonably competent programmer can develop new scripts.

As a final point, a number of blind people are employed in 'call sites' where they interact with customers via the telephone and a computer system using a screen reader. The software used by call sites can be complex and require an operator to navigate a large number of complex screens. This poses a problem for the screen reader user, who needs to locate information and screens quickly and efficiently. Scripting can alleviate these problems by providing an alternative interface that allows the user to reference forms and fields directly through the keyboard. However, given that call site software may be very complex, there may be a large number of operations that an operator needs to perform. Generally, scripts re-map interfaces to keystroke combinations; with many operations, however, remembering the keystroke combinations can be difficult. LookOUT supports the idea of 'script strings', which allow memorable command names, rather than esoteric keystroke chains, to be associated with operations and should make operation much easier.
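The idea of script strings can be sketched as a lookup from a typed command name to an operation. The class name, the example command, and the matching rules (case-insensitive, surrounding spaces ignored) are assumptions; the paper does not describe how LookOUT matches names.

```python
class ScriptCommands:
    """Associate memorable command names with script operations."""

    def __init__(self):
        self._by_name = {}

    def register(self, name, operation):
        """Bind a command name to a callable that performs the operation."""
        self._by_name[name.lower()] = operation

    def run(self, typed):
        """Run the operation named by the user and return what to speak."""
        name = typed.strip()
        operation = self._by_name.get(name.lower())
        if operation is None:
            return f"Unknown command: {name}"
        return operation()


# Hypothetical usage for a call site form.
commands = ScriptCommands()
commands.register("Customer Address", lambda: "Moved to customer address field")
```

With many operations, a name such as "customer address" is likely to be easier to recall than a multi-key chord assigned to the same field.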

As an aside, it should be noted that call sites may be based on platforms other than Windows. However, in this case the blind user can use a terminal emulator running on a Windows PC together with his/her screen reader.


Reprinted with author(s) permission. Author(s) retain copyright.