2004 Conference Proceedings



Karen McCall, M.Ed.
Karlen Communications
Adaptive Technology and Consulting Practice
Phone: 905-510-6014
Website: http://www.iprimus.ca/~martha/ 
Email: martha@iprimus.ca 


As screen readers and voice recognition software evolve, it becomes easier to integrate the technologies for clients who need blended solutions. This paper demonstrates the technologies and discusses techniques for best practices.


In the past, people have confused voice recognition with screen reading. The idea of a computer that will "talk to me" is alluring for people who are losing their vision, or who are blind or visually disabled. Clients have traditionally described voice recognition as a way to avoid learning keyboarding. Although we are not at the point where we can expect a totally hands-free experience combining voice recognition with screen reading, the two technologies now work together more collaboratively. Window-Eyes by GW Micro works on its own with Dragon NaturallySpeaking; JAWS from Freedom Scientific can work on its own or in combination with JawBone from Next Generation Technologies.

Knowing that the technologies work in a more collaborative way, the next step is to determine how to assess clients for the blended technology. Learning how to use voice recognition adds another layer of core computer competency to a client's skills base. What kinds of skills does the client need to bring to the use of this blended technology? Is there a best practices process for learning computer skills, screen reader skills and voice recognition skills? What benchmarks and outcomes can be expected for clients using these multiple layers of technology?

This paper will focus on the use of Dragon NaturallySpeaking with screen reading technology. IBM ViaVoice will be discussed as the alternative tool for voice recognition. The advantage of Dragon NaturallySpeaking Professional is the broader range of clients with disabilities who might be able to use the product: NaturallySpeaking has more keyboard support and a higher level of customization through the creation of macros.

The process begins with the needs analysis. This is broken down into the daily living tasks the client needs to perform, the academic tasks that need to be fulfilled, and the more recreational tasks that a client might want to do. A second component of the needs analysis is the cognitive and coordinative abilities of the client. This includes an ability to understand and work with the Windows XP platform, as well as an ability to maintain focus and reference points within the application the client is working in. A third critical component is the ability of the client to interact with the voice recognition software.

Occasionally voice recognition software has difficulty with some voices because of their tone, speech patterns or audibility. Some clients also have difficulty creating sentences and structure by dictation and review rather than typing and review. As you type, you can pause for reflection; as you dictate, there may be a tendency to lose your place in your thoughts or in the document. These issues need to be addressed in the assessment process. The client needs to be able to create a voice file and work with the software on a guided, cursory level.


The assessment process begins with an "interview" of the client. This can be done either over the phone or in person. If the interview is conducted in person, it is beneficial to have a computer available with the adaptive technology the client uses, or is most likely to use.

There are three layers of computer competency in learning to use voice recognition with screen reading, or with screen magnification technology for that matter: the computer itself, the screen reader or magnifier, and the voice recognition software. The client must be able to demonstrate basic computer literacy. This can be explored in the initial interview when talking with the client. In this interview, the assessor is also looking for the ability to put thoughts together without pen, paper or computer. Sustained vocal strength is required for voice recognition. A client will also need to be able to speak loudly without shouting and to speak clearly into the microphone.

One of the benefits of using voice recognition for anyone is that it forces people to enunciate and pronounce words properly rather than maintaining lazy speaking habits such as slurring words together, ending words prematurely, and using a "stuttering slang" (for example, "ya know, like, like"). Our ears may be forgiving, but the computer takes down every word and sound. Good speech patterns avoid hours of editing.

During this interview, it is a good idea to have the client demonstrate an ability to use the computer. If the client has used a computer before, they should have a level of competency. If they haven't used a computer, it is easy to create a rough voice file and prompt the client through some basic tasks to evaluate their ability to work with the computer. If a screen reader is available, a cursory evaluation of the client's ability to balance all three technologies can be made. Voice recognition is not for everyone. It is not a solution for clients who simply do not want to learn keyboarding. This first interview might take up to two hours if done on-site with computer equipment. At this point, an assessor is not matching a tool to a task, but is looking for an ability to use and combine technologies.

Once an assessor has an idea of the client's ability to combine the technologies, a more formal assessment can be scheduled. This is the time to match the tools to the tasks identified by the client. This is also the time to establish the outcomes against which a client's success can be measured once the client has been using the equipment for six months. The assessor would already have a list of tasks, and during the formal assessment the client can be walked through them to confirm that they can perform them successfully.


Training is a critical component of using voice recognition with screen reading. It is a critical component of any use of adaptive technology on a computer system. Clients are trying to balance and combine the Windows operating system with a screen reader and a voice recognition tool. This means that there are three things that can "go wrong". When the client gets an error message, which one of the three tools is giving the error? Troubleshooting is an integral part of training. If the client requires technical support from a friend or family member, that friend or family member should be present during training.

A client should be in their home on their own system for training. If the equipment is to be used in a school or work environment, this will give the client time to become familiar with the equipment before it is moved into the school or work environment.

A trainer should create a curriculum that allows the client to work with the computer, the screen reader and the voice recognition software. Although a lesson might seem to be seamless and autonomous, a trainer needs to be able to break down the skills to ensure that the client is able to master all of the skills for a lesson before moving on.

Clients should also be encouraged to create new voice files as they become more familiar with the technology. Many clients believe that once their voice file is created, they are "stuck with it" and don't think they can create another one. During the creation of the voice file, it is helpful to review the text with a client. Although the computer doesn't care if things make sense, we read for comprehension, and reading words in isolation often results in a longer voice file creation process and in frustrated and confused clients. Without understanding what they are reading and why they are reading it, clients often don't make a connection with the voice recognition technology. They are, in effect, following instructions without access to the visual representation that clients who don't use screen readers have.

Ensuring that a client works at their own speed during training is also an important piece of the process. Training should be no longer than two hours per session and done once or twice a week, depending on the client's computer literacy level. Clients need time to practice and use the technology, so providing them with a single day of training and then leaving them on their own often results in failure to use the technology. Clients also need homework! Trainers should leave exercises that will reinforce what has been learned during a training session.


It is important that clients be matched to the technology. It is as important as matching the tool to the task. The interview introduces the client to technology they may have only read about or seen demonstrated. Having the technology explained by someone who is knowledgeable in the three layers of computer skills needed to make the technology work collaboratively often gives clients a more realistic view of the technology. It is a good opportunity for both client and assessor to determine whether voice recognition is a good solution.

The assessment further refines and defines the use of the technology. It provides the client with a better understanding of the work involved in using a computer, a screen reader and voice recognition technology.

Without training, clients are abandoned and doomed to failure in using voice recognition with screen reading. It is a complex balance of cognitive skills. Having a curriculum and exercises to work from between training sessions is a key component for success.

Throughout this process, detailed notes by the assessor are essential. A good working relationship between the assessor and the trainer is critical. The assessor can provide insight and information on how the client approaches the technology, and where the areas of strength and weakness are. The assessment is, after all, the framework for the training. It creates the goals, objectives and outcomes that the client's success will be measured against. Communication and collaboration between the assessor, trainer and client are just as important as communication and collaboration between the operating system, screen reader and voice recognition technology.




Reprinted with author(s) permission. Author(s) retain copyright.