Return to 2000 Table of Contents
Terry Thompson
Computer Learning Center Coordinator
Independence, Inc.
Lawrence, KS
Phone: 785-841-0333
Email: tt@peakware.com
Website: http://www.sunflower.com/~indepinc
Kevin L. Price
Disabilities and Computing Program, UCLA
Los Angeles, CA
Phone: 310-206-7133
Email: PriceK@ucla.edu
Website: http://www.dcp.ucla.edu
The purpose of this paper is to provide a comparative
overview of current speech recognition products. These products
can benefit everyone, including persons with disabilities, but
they have differences in their accuracy, user interfaces and
usability. For anyone interested in speech recognition, it is
important to understand these differences, and how they may
impact one’s success at using the products. At the CSUN
conference, we will propose criteria for evaluating these
speech recognition products, including the following:
How much training does it take to be productive at using the speech recognition program?
Is the user interface easy to understand and logically arranged for the new user of the product?
How “hands-free” is the system? How easily can a user interact with the computer system using only voice?
Are the training and correction of individual speech files easy and intuitive for a wide range of users?
Does the program support a wide range of speech patterns and accents?
Is the program customizable to allow adjustment for different speaking styles of individuals?
Does the program integrate transparently and without slowing recognition within a wide variety of applications?
Can the speed of a person’s speech vary without reducing the program’s recognition accuracy?
Does the program include speech output or integrate with other speech output programs to help people with learning disabilities or visual impairments?
Is the program well supported by the company?
Is the program affordable?
Is the program still being developed for increased usability?

Speech recognition is a technology that is constantly evolving. It
is experiencing tremendous growth in the commercial market, apart
from its original niche as an assistive technology product. There
are presently three major companies with speech recognition
products: Dragon Systems, Lernout & Hauspie (L&H), and IBM.
Stiff competition among these companies and growing demand from
consumer and business markets have led to a tremendous drop in
prices over the last few years. Competition has also fueled the
development of a plethora of new products. Each company has
several products available, ranging in price, features, and the
applications that they support. This paper seeks to make sense
of the overwhelming array of products so that persons who are
shopping for speech recognition will have a better
understanding of their choices.
At the time of this paper’s submission (October 1999),
new products were on the horizon for each of the companies
profiled here. Our presentation at CSUN will include any new
products that have been released in Fourth Quarter 1999 or
First Quarter 2000.
What are the Types of Speech Recognition?
Discrete
  Slower dictation process - better for persons with difficulty in language processing or in fluid speech
  Word-by-word style, rather than phrases, reflects the way beginning writers form sentences
Continuous
  Processes speech by phrase
  Takes context into account
  Is less accurate if phrases are interrupted
  Advantages: Speed and accuracy (for most users)

Who Can Benefit from Speech Recognition?
Persons with mobility impairments or injuries that prevent keyboard access
Persons who have or who are seeking to prevent repetitive stress injuries
Persons with writing difficulties
Any person who wants hands-free access to the computer
Any person who wants to increase their typing speed (reportedly up to 160 wpm)

What is Required to Use Speech Recognition?
A powerful computer
Consistent speech (not necessarily intelligible)
Fluid speech (i.e., not pausing between words), desirable for use of continuous speech products
Patience
Basic knowledge of computers
Fairly high cognitive ability

What Do You Mean by Fairly High Cognitive Ability?
Ability to voice appropriate capitalization and punctuation
Ability to assess the accuracy of the dictation (text-to-speech
is available in some products, but no highlighting of
words)
Ability to correct incorrect dictation, which usually requires
the ability to spell at least the first few letters of a word,
and the ability to recognize a correct spelling among similarly
spelled words
Ability to memorize commands and procedures
Current Speech Recognition Products (October 1999)
As mentioned above, the major players in the speech recognition
market are Dragon Systems, Lernout & Hauspie (L&H), and
IBM. Each company offers several products, ranging in price and
features. Because of the variety of products available,
shopping for a speech recognition system can be an overwhelming
experience. This presentation will describe the differences
among the products that are presently available. The
following is a brief summary of the products available as of
October 1999:
Company: Dragon Systems
Web: www.dragonsys.com
Phone: 1-800-TALKTYP (1-800-825-5897)
Dragon’s original product, Dragon Dictate, is currently the only
product that uses the discrete speech model. Discrete speech, as
mentioned above, is the best solution for persons with difficulty
in language processing or in fluid speech, or who form sentences
one word at a time, rather than in phrases. The latest version,
3.0 Classic, offers fully functional voice control across all
applications. It is the only current speech recognition product
that supports Windows 3.x. Because it uses discrete speech, it
is better than current continuous speech products at
recognizing the speech patterns of persons who naturally pause
between words, and seems to be better at learning to recognize
persons with unique speech patterns. Unfortunately, Dragon
Systems has discontinued development of this product, as the
company’s focus is now on continuous speech products,
which are more viable in the larger commercial market.
Dragon’s current continuous speech product line, known
as Dragon NaturallySpeaking, includes a Standard, Preferred,
and Professional edition, listed in order from low end to high
end. The Preferred edition includes dictation playback and
text-to-speech, features that distinguish it from the Standard
edition. The Preferred edition also supports input from an
external recording device, although no recording device is
provided. A special version of the Preferred edition, Dragon
NaturallySpeaking Mobile, does include a digital recording
device at additional cost. On the high end of Dragon’s
NaturallySpeaking product line, the Professional edition is
distinguished by its expanded macro and scripting capabilities,
which allow users to dictate long sections of text or complex
computer operations with simple commands. The Professional
edition also comes in Legal and Medical versions, which feature
custom vocabularies for these disciplines.
Dragon has also developed a teen version, which includes
special teen voice models and an easier-to-use interface,
including simpler documentation and on-line help.
As of October 1999, a Macintosh version of Dragon
NaturallySpeaking was scheduled for release near the end of 1999.
Company: Lernout & Hauspie (L&H)
Web: www.lhs.com/voicexpress/
Phone: 800-380-1234
L&H products are based on speech recognition technology
developed by Kurzweil, a major pioneer in speech recognition.
The current L&H product line, called VoiceXpress, includes
a Standard, Advanced, and Professional edition. The differences
among these editions are fairly straightforward. In the Standard
edition, VoiceXpress’s natural language command interface
works only in L&H’s own word processing application,
called XpressPad. The Advanced edition extends natural language
support to include Microsoft Word. The Professional edition
further extends natural language support to encompass the
entire Microsoft Office suite, plus Internet Explorer. The
Professional edition also provides support for recorded
dictation, and includes a bundled digital recorder.
Company: IBM
Web: www.software.ibm.com/speech/
Phone: 1-800-825-5263 (IBM Speech Systems)
IBM has been a major player in speech recognition for many
years. Its discrete speech product, IBM VoiceType, was a major
competitor of Dragon Dictate. However, IBM has discontinued
this product and is now focusing all its efforts on developing
continuous speech products. Its current product line, IBM
ViaVoice Millennium, includes a Standard, Web, and Professional
edition. The Web edition features natural language commands for
Internet Explorer, Netscape Communicator, and America Online.
The Web edition also features a specialized vocabulary for
on-line chats. The Professional edition provides most of the
features of the Web edition, but also provides natural language
commands for the entire Microsoft Office suite, and specialized
business, finance, and computer vocabularies.
Although speech recognition got its start as an assistive
technology product, the commercial market has fueled its rapid
development in recent years, and the primary target market of
each of the companies described above is now the general
public, rather than persons with disabilities. In this
presentation, we will return our attention to speech
recognition as a tool for persons with disabilities. A person
who has a disability or who works with persons with
disabilities will come out of this presentation with a clearer
picture of which speech recognition products will work best
for them. There is a lot of confusion today about speech
recognition products. The main focus of this presentation will
be to clarify many of these issues and ultimately to guide
people with disabilities to the programs that suit them best.