1999 Conference Proceedings


VOICE RECOGNITION AND PHONEMIC PROCESSING: YEAR THREE

Joe Reid
Roosevelt High School
456 S. Mathew St.
Los Angeles, CA 90033
(213) 268-7241
FAX: (213) 269-5473
Email: joetennis@hotmail.com

As computer hardware and software become more affordable and more powerful, they increase the possibilities for assisting individuals with disabilities. One such area is voice recognition. Voice recognition offers a way in which such individuals can write without needing to do the difficult phonemic processing. The smallest unit that needs to be processed is the whole word.

When used as a writing tool, such a system can provide assistance to persons with phonemic processing deficits. Several of our students use voice recognition to take their tests and to complete lab reports for their science classes and book reports for their English classes. These individuals feel much more confident and independent with such assistive technology, and they report that their reading is improving.

One manifestation of this deficit is in the area of processing sounds of words and knowing phoneme-grapheme correspondence rules. The individual might use "he saw" for "she was" or "ter" for a word that starts with "tre". Campbell and Butterworth (1985) conducted a case study of a very literate subject with a deficit in phonological processing. They reported that she could easily read aloud irregularly spelled words such as "placebo" and "idyll", but had great difficulty with simple nonwords like "bant". One area of weakness noted involved rhyme judgments. If the words looked like they rhymed and they did rhyme, she was correct every time. She also got 100% correct when the words looked like they did not rhyme and they did not. If the words looked like they rhymed but did not (lost/post), she had a score of 63%. Her score dropped to 11% when the words rhymed but appeared not to rhyme (true/shoe).
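The four rhyme conditions can be laid out as a small sketch. The table of word endings below is a hypothetical illustration, not material from the case study; the sound labels are informal ARPAbet-style symbols chosen only for this example. The scores are the ones reported above.

```python
# Hypothetical table: word -> (spelled ending, pronounced ending).
# The pronounced endings use informal ARPAbet-style labels (an assumption).
ENDINGS = {
    "lost": ("ost", "AO S T"),
    "post": ("ost", "OW S T"),
    "true": ("ue", "UW"),
    "shoe": ("oe", "UW"),
}

def looks_like_rhyme(a, b):
    """True when the two words share the same spelled ending."""
    return ENDINGS[a][0] == ENDINGS[b][0]

def sounds_like_rhyme(a, b):
    """True when the two words share the same pronounced ending."""
    return ENDINGS[a][1] == ENDINGS[b][1]

# Percent correct reported for each (looks, sounds) condition in the text.
REPORTED_SCORE = {
    (True, True): 100,
    (False, False): 100,
    (True, False): 63,
    (False, True): 11,
}

for pair in [("lost", "post"), ("true", "shoe")]:
    condition = (looks_like_rhyme(*pair), sounds_like_rhyme(*pair))
    print(pair, condition, REPORTED_SCORE[condition])
```

The two mismatched conditions are the ones where spelling and sound disagree, which is where the subject's scores fell.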

Another area of weakness for the subject involved an auditory acronym test. The subject was asked to take the first sound of each word in a phrase to form a word. For example, using the phrase "hold aching toes" the word would be "hate". If the subject used orthographic clues and took the first letter of each word, the word formed would be "hat". On 21 such test items this particular subject responded with the orthographic option on all 21 items.
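The difference between the two strategies on the acronym test can be sketched as follows. The phoneme lists and the lookup table are hypothetical and hand-built for this one phrase; the symbols are informal ARPAbet-style labels, assumed here only for illustration.

```python
# Hypothetical mini-lexicon: word -> list of phoneme labels (an assumption).
PHONEMES = {
    "hold": ["HH", "OW", "L", "D"],
    "aching": ["EY", "K", "IH", "NG"],
    "toes": ["T", "OW", "Z"],
}

# Hypothetical table giving the word formed by a sequence of sounds.
SOUNDS_TO_WORD = {("HH", "EY", "T"): "hate"}

def orthographic_answer(phrase):
    """Join the first letter of each word (the spelling-based strategy)."""
    return "".join(word[0] for word in phrase.split())

def phonemic_answer(phrase):
    """Join the first sound of each word, then look up the word it forms."""
    first_sounds = tuple(PHONEMES[word][0] for word in phrase.split())
    return SOUNDS_TO_WORD.get(first_sounds)

phrase = "hold aching toes"
print(orthographic_answer(phrase))  # hat
print(phonemic_answer(phrase))      # hate
```

The subject's responses correspond to `orthographic_answer` on every item, even though the task asked for the result of `phonemic_answer`.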

The authors postulated that the deficit in associating phonemes with letters may play a relatively more important role in writing than in reading. On the spelling test, the subject's errors were more often phonemically implausible. This hypothesis is supported by Perin (1983), who found that phonological awareness is more closely tied to spelling than to reading ability. Higgins and Raskind (1995) noted evidence that more than 90% of adults with learning disabilities report significant problems with writing and/or spelling.

Phonological awareness deficits tend to stay with students as adults and continue even when reading levels improve. Campbell and Butterworth concluded that the subject was lacking in all aspects of phonemic representation: segmentation, manipulation, awareness and assembly. This was in spite of the fact that she was very literate. Bruck (1992) concluded that adults with childhood histories of learning disabilities never approximate levels of performance on phonological awareness tasks that are appropriate for their age or reading level. One phonological awareness task used by Bruck in testing for a weakness involved asking the subject to delete a phoneme from a word and say what is left. For example, "small" would become "mall" not "all", and "there" would be "ere" not "here". Another phonemic task is counting the number of phonemes in a given word. Some words have a digraph and thus have more letters than phonemes. For example, "chin" has four letters but three phonemes.
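The deletion and counting tasks can be illustrated with a short sketch. The segmentations below are a hypothetical, hand-built table with one letter group per phoneme; they are an assumption made for this example and are not taken from Bruck's materials.

```python
# Hypothetical table: word -> letter groups, one group per phoneme.
SEGMENTS = {
    "small": ["s", "m", "a", "ll"],
    "there": ["th", "e", "re"],
    "chin": ["ch", "i", "n"],
}

def delete_first_phoneme(word):
    """Drop the letters that spell the first sound and return the rest."""
    return "".join(SEGMENTS[word][1:])

def delete_first_letter(word):
    """Drop only the first letter (the spelling-based error)."""
    return word[1:]

def count_phonemes(word):
    """Count sounds, which can be fewer than the number of letters."""
    return len(SEGMENTS[word])

print(delete_first_phoneme("there"))       # ere
print(delete_first_letter("there"))        # here
print(len("chin"), count_phonemes("chin"))  # 4 3
```

The digraphs "th" and "ch" each spell a single sound, which is why working letter by letter gives the wrong answer on both tasks.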

In conclusion, phonological processing deficits continue even after improvements in reading. A prevalent manifestation is in the area of processing sounds of words and knowing phoneme-grapheme correspondence rules. What is now available is a computer system that can perform this processing for the user.

A voice recognition system takes the sounds the student makes as he or she is talking, matches these sounds with the sounds in the English language, and computes the word the student is saying. In processing what the student says, the acoustical sequence is divided by a small pause to indicate the break between words. An analysis is done, using a sequence of transformations, to obtain a maximum likelihood estimation. Part of this analysis is performed by the sound card which is hardwired for voice recognition enabling more timely responses. The phonemes extracted from this analysis are compared to phoneme reference patterns obtained during training sessions with this user. The similarities between these reference phonemes and the input speech are calculated. A dictionary containing words represented as a sequence of phonemes is searched for the word that yields the maximum similarity.
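The final dictionary search can be sketched in a few lines. This is only a toy illustration under stated assumptions: the lexicon and its phoneme labels are hypothetical, and the similarity measure is Python's `difflib.SequenceMatcher` ratio, swapped in here for the likelihood-based comparison against trained reference patterns that an actual recognizer would use.

```python
from difflib import SequenceMatcher

# Hypothetical dictionary: word -> phoneme sequence (labels are assumptions).
LEXICON = {
    "chin": ["CH", "IH", "N"],
    "shin": ["SH", "IH", "N"],
    "small": ["S", "M", "AO", "L"],
    "mall": ["M", "AO", "L"],
}

def similarity(heard, reference):
    """Stand-in similarity score between two phoneme sequences (0.0 to 1.0)."""
    return SequenceMatcher(None, heard, reference).ratio()

def recognize(heard):
    """Return the dictionary word whose phonemes best match the input."""
    return max(LEXICON, key=lambda word: similarity(heard, LEXICON[word]))

print(recognize(["S", "M", "AO", "L"]))  # small
print(recognize(["CH", "IH", "M"]))      # chin (last phoneme misheard)
```

The second call shows why the search is useful: even when one extracted phoneme is wrong, the closest dictionary entry is still returned as a correctly spelled whole word.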

The effectiveness of using a voice recognition system to assist in writing was tested by Higgins and Raskind (1995). The subjects were undergraduate students with learning disabilities who were asked to take a writing test similar to a proficiency exam required for graduation. The essays the students wrote using voice recognition were compared to the essays they wrote without assistance. The scores on the essays written with the voice recognition system were significantly higher. The authors also noted that the students used bigger words and wrote longer essays. One advantage frequently mentioned by the students in the study was freedom from the mental distraction of having to check spelling or think of an easier word to use. This mental energy could instead be spent on content and organization.

This is our third year using voice recognition. We started with one machine and software that needed ten hours of training to reach an accuracy rate of 90%. We now use software that needs no training and can be more than 90% accurate the first time. We also use continuous speech software that requires about an hour of training and is also very accurate. A Digital High School grant will soon enable us to place voice recognition and speech to text capabilities in all 150 classrooms in the school. Students will be able to use these assistive devices without leaving their general education classroom.

When voice recognition is used along with text-to-speech programs, students can take tests from their general education classes by having each question read to them by the text-to-speech program and then answering it with the voice recognition program. Students can concentrate more on the content of the class, and they can also work independently. Students who have used the programs for over a year are commenting that their reading is improving. This would certainly be a very positive side effect.

Voice recognition is assisting a group of individuals in the area of their greatest weakness. People who have difficulties processing the phonemes of words and making the phoneme-grapheme connections can use a system to accomplish this very task. Individuals with phoneme processing deficits can use their energy on the conceptual issue of what they want to say and how they want to organize it. Because the commercial drive for voice I/O is strong, technological advances have been very rapid. With proper planning and preparation, the elementary school students of today should experience much greater success in high school and beyond than do the high school students of today.

REFERENCES

Bruck, M. (1992). Persistence of Dyslexics' Phonological Awareness Deficits. Developmental Psychology, 28(5), 874-886.

Campbell, R. & Butterworth, B. (1985). Phonological Dyslexia and Dysgraphia in a Highly Literate Subject: A Developmental Case with Associated Deficits of Phonemic Processing and Awareness. The Quarterly Journal of Experimental Psychology, 37A, 435-475.

Higgins, E. & Raskind, M. (1995). Compensatory Effectiveness of Speech Recognition on the Written Composition Performance of Postsecondary Students with Learning Disabilities. Learning Disability Quarterly, 18, 159-174.

Perin, D. (1983). Phonemic Segmentation and Spelling. British Journal of Psychology, 74, 129-144.

Saito, S. & Nakata, K. (1985). Fundamentals of Speech Signal Processing. Tokyo: Academic Press.

Zhang, Y., Togneri, R. & Alder, M. (1997). Phonemic-Based Vector Quantization in a Discrete HMM Speech Recognizer. IEEE Transactions on Speech and Audio Processing, 5(1), 26-32.




Reprinted with author(s) permission. Author(s) retain copyright.