2004 Conference Proceedings



SCAN-TO-SPEECH TECHNOLOGY: A BASIC OVERVIEW

Presenter(s)

David Flament
Manager, Computer Training Center
Guild for the Blind
180 N. Michigan Ave., Suite 1700
Chicago, IL 60601-7463
Phone: 312-236-8569
Email: davidf@guildfortheblind.org
Web: www.guildfortheblind.org

Scan-to-speech technology allows printed material to be read using synthetic speech. This technology requires both hardware (a standard flatbed scanner) and software. This presentation will focus on the most common scan-to-speech software programs currently available, as well as some others that may be more affordable.

Scan-to-speech software is based on Optical Character Recognition (OCR) and text-to-speech technologies. OCR involves recognizing characters from a scanned image and turning printed material into electronic text. Text-to-speech involves reading electronic text with synthetic speech.
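The two stages described above can be sketched as a small pipeline. This is only an illustration: `toy_ocr` and `toy_tts` are hypothetical stand-ins, not the engines used by any product in this overview, and the "scanned page" is modeled as a list of text lines rather than a real image.

```python
from typing import Callable, List

# Hypothetical stand-in types. A real system would receive image data from a
# scanner driver and send audio to a speech synthesizer.
ScannedPage = List[str]                      # one entry per line of the "image"
OcrEngine = Callable[[ScannedPage], str]     # image -> electronic text
SpeechEngine = Callable[[str], List[str]]    # electronic text -> spoken units


def toy_ocr(page: ScannedPage) -> str:
    """Stand-in for OCR: joins the non-empty lines into one text string."""
    return " ".join(line.strip() for line in page if line.strip())


def toy_tts(text: str) -> List[str]:
    """Stand-in for text-to-speech: returns the sentences that would be spoken."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [s + "." for s in sentences]


def scan_to_speech(page: ScannedPage,
                   ocr: OcrEngine = toy_ocr,
                   tts: SpeechEngine = toy_tts) -> List[str]:
    """Run the two stages in order: OCR first, then text-to-speech."""
    text = ocr(page)      # stage 1: printed material -> electronic text
    return tts(text)      # stage 2: electronic text -> synthetic speech


if __name__ == "__main__":
    page = ["  Scan-to-speech reads print aloud.  ", "",
            "It needs a scanner and software."]
    for utterance in scan_to_speech(page):
        print(utterance)
```

Keeping the two stages as separate, swappable functions mirrors how the products below pair different OCR engines with different speech synthesizers.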

Kurzweil 1000(tm)
Kurzweil 1000(tm) is a scan-to-speech program designed for those who are blind or visually impaired. It can read almost any printed material that can fit on a scanner. This software descends directly from the very first reading machine invented by Ray Kurzweil in 1976. Kurzweil 1000(tm) is a product of Kurzweil Educational Systems.

Features:
* State-of-the-art synthetic speech options
* Optimal Scanning Solution
* Accurate OCR
* Flexible writing and editing tools
* Study tools
* Customizable user interface
* Online e-book and magazine search and retrieval
* Low vision features
* Sending files to portable devices
* Sending files to e-mail
* Audio file creation while working
* Kurzweil Virtual Printer
* Familiar office applications - calculator, photocopier, and fax
* Braille embosser compatibility
* Kurzweil news and product enhancements

System Requirements:
* 200 MHz or higher Pentium(r) processor
* 64 MB of RAM
* 300 MB of available hard disk space
* CD-ROM drive
* 16-bit or better Sound Blaster(r)-compatible sound card
* Full-size keyboard or 17-button Kurzweil 1000(tm) keypad
* A compatible TWAIN scanner
* Windows(r) 95, 98, NT(r) Service Pack 6, 2000, ME, or XP(r) operating systems

Retail Price: $995.00

Advantages and Disadvantages:
Kurzweil 1000(tm) is an excellent OCR program. It is very stable and feature-packed. Since it has been specifically designed for the blind and visually impaired, its features are designed to be of the most use to this community. Most features can be controlled by using the keyboard's numeric keypad. A built-in speech synthesizer is included so no additional software is required for scan-to-speech.

The main disadvantage to Kurzweil 1000(tm) is its price.

Kurzweil 3000(tm)
Kurzweil 3000(tm) is designed for educational use. It is intended to help students and adults with learning disabilities, attention deficit disorder, and other literacy problems. It does not have all the low vision features that Kurzweil 1000(tm) has, but it does have other features such as help with test taking, building study guides, highlighting tools, and note posting.

OpenBook
OpenBook, a Freedom Scientific product, is a scan-to-speech program also designed for those who are blind or visually impaired. OpenBook users sometimes call it "Ruby" for its Ruby edition, or "Arkenstone" for its original developer. Using the FineReader 6 OCR engine, OpenBook will scan printed material, including text embedded in graphics.

Features:
* Includes FineReader 6 as well as Caere and Recognita scanning engines
* Reads documents quickly
* Easy movement to preferred locations in the document
* Includes a dictionary that can search for additional definitions, synonyms, and phrase definitions
* Low vision features
* Create MP3 or WAV files
* Emboss in Computer and Grade II Braille
* Convert OpenBook(r) files to .brf and .brl Grade II Braille formats
* Supports keyboard layouts that mimic popular screen readers
* Connect Outloud included
* Easily configurable
* New Help System
* DAISY File Format available
* Modify the layout of a current document
* Set and arrange pre-defined template fields

System Requirements:
* IBM-compatible Pentium(r) or higher processor
* 32 MB of RAM
* CD-ROM drive
* 200 MB of available hard disk space
* SSIL-supported speech synthesizer, or a SoundBlaster 16-compatible sound card
* Supported TWAIN-compatible scanner

Retail Price: $995.00

Advantages and Disadvantages:
OpenBook is a very good OCR program with a loyal following. It is designed for those who are blind and visually impaired. It comes with a built-in speech synthesizer so no additional software is required for scan-to-speech.

As with Kurzweil 1000(tm), one of the main disadvantages is its price. Also, OpenBook uses different reading commands than JAWS(r) for Windows. This can frustrate JAWS(r) users who expected the programs to share similar reading commands because both products were developed by Freedom Scientific. Some users who have been using OpenBook since the Arkenstone days also seem to have trouble with the changes Freedom Scientific has made to the software. How well these changes actually work is in question.

WYNN
WYNN, like Kurzweil 3000(tm), is much more than just an OCR program. WYNN is intended to help anyone with a reading problem. It does not offer all the low vision features of OpenBook, but it does offer features such as full editing, study tools, and word prediction.

OmniPage(r)
OmniPage(r) is a commercially available program intended for use by the general public; it can, however, be used with adaptive technology. OmniPage(r) was created by ScanSoft, a Xerox-owned company formerly known as KCP (Kurzweil Computer Products).

Features:
* New Flowing Page Output
* Unlock PDF
* Batch process files
* Create searchable archives
* A redesigned user interface
* Cut scanning time in half

System Requirements:
* Intel(r) Pentium(r) processor or equivalent
* 128 MB RAM
* 140 MB of free hard disk space
* SVGA monitor with 256 colors and 800x600 pixel resolution
* CD-ROM drive
* Compatible Scanner
* Windows(r) 98SE, 2000, ME, XP(r), or Windows NT(r) 4.0, SP 6.0 or higher

Retail Price: $199.00 (upgrade)/$599.00

Advantages and Disadvantages:
Price is certainly an advantage for OmniPage(r). The upgrade price of $199.00 applies to any OCR software, including the software that comes with the scanner. While OmniPage(r) is not specifically designed for the blind and visually impaired, it does have features that can be very useful. OmniPage(r) has a basic built-in speech synthesizer, but only for reading the scanned document. Speech users will also need a screen reader to use this software. Documents can be scanned into MS Word(r) directly from an "Acquire Text" menu choice under the File menu.

The main disadvantage of OmniPage(r) is that it is not designed for the blind or visually impaired. As such, some features are not easily accessible. Although this program can work very effectively with adaptive technology, it is probably not for the novice speech user.

FineReader
FineReader is a commercially available program designed for the general public. The FineReader OCR engine is one of the most widely used engines in OCR software. It will work with most adaptive technology and is very affordable. FineReader is an Abbyy Software House product.

Retail Price: $299.00

Included Software
Flatbed scanners come bundled with scanning software, and most include OCR software. Printed materials can be scanned with the included OCR software. Screen reader software will then read the converted text.

Compatibility between the included software and the adaptive technology being used must be checked. Included software will not offer all the adaptive solutions that some of the other scan-to-speech technologies offer. Recognition of printed material and conversion of text into the desired format may also not be as good as with the other programs available.

Conclusion
Scan-to-speech programs can be divided into two groups: solutions designed specifically for the blind and visually impaired, and solutions designed for the general public. Although the software programs designed for the general public are more cost-effective, those who use adaptive technology may not be able to access all features. Software designed for the blind and visually impaired offers more features and allows full access, but the cost may be prohibitive. Answering the following questions will help determine the best product to meet one's needs:

1. How will the scan-to-speech technology be used?
2. Does the scan-to-speech technology itself need to be accessible?
3. Is any additional software needed to use this solution?
4. What adaptive technology features are needed?
5. How important is price?
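As a rough illustration, questions 2, 3, and 5 can be treated as simple filters over the products covered in this overview. The sketch below uses the retail prices quoted above. The `Product` fields and the `shortlist` function are hypothetical names introduced for this example. Two entries are assumptions rather than statements from the overview: OmniPage(r) is listed at the higher of its two quoted prices, and FineReader is assumed to need a screen reader because it is a general-public program.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Product:
    name: str
    price: float               # retail price quoted in this overview (USD)
    designed_for_blind: bool   # built specifically for blind/low-vision users
    needs_screen_reader: bool  # additional software needed for speech access


PRODUCTS = [
    Product("Kurzweil 1000", 995.00, True, False),
    Product("OpenBook", 995.00, True, False),
    # Assumption: the higher quoted figure is the full (non-upgrade) price.
    Product("OmniPage", 599.00, False, True),
    # Assumption: as a general-public program, it needs a screen reader.
    Product("FineReader", 299.00, False, True),
]


def shortlist(max_price: Optional[float] = None,
              must_be_accessible: bool = False,
              allow_extra_software: bool = True) -> List[str]:
    """Return the names of products that pass the selected filters."""
    names = []
    for p in PRODUCTS:
        if max_price is not None and p.price > max_price:
            continue  # question 5: price
        if must_be_accessible and not p.designed_for_blind:
            continue  # question 2: accessibility of the program itself
        if not allow_extra_software and p.needs_screen_reader:
            continue  # question 3: additional software
        names.append(p.name)
    return names


if __name__ == "__main__":
    print(shortlist(max_price=600))
    print(shortlist(must_be_accessible=True))
```

Questions 1 and 4 depend on individual needs and are not modeled here. An empty result, such as asking for a product under $600 that needs no additional software, reflects the cost trade-off described in the conclusion.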

References

Abbyy Software House, FineReader, Online
http://www.finereader.com

Freedom Scientific, OPENBOOK, Online
http://www.freedomscientific.com/fs_products/software_open.asp

Freedom Scientific, WYNN, Online
http://www.freedomscientific.com/WYNN/index.asp

Kurzweil Educational Systems, KURZWEIL 1000/3000, Online
http://www.kurzweiledu.com/products.asp

ScanSoft Inc, OmniPage, Online
http://www.scansoft.com/omnipage




Reprinted with author(s) permission. Author(s) retain copyright.