Return to 2001 Table of Contents
Gregg Vanderheiden, Ph.D., Trace R&D Center, University of Wisconsin
In the past, products were designed to operate in a rather fixed fashion and were targeted toward people who fell in the center or high center of the ability curve; that is, they were designed for people who had all of their abilities, or even slightly above-average abilities. People with disabilities typically gained access to these products either by purchasing special versions (where available) or by attaching assistive technologies or other special interfaces. A number of things are changing, however, that will dramatically alter both the way products can be designed and the potential of future products for accessibility. This paper outlines some of these changes and describes current and future technologies that demonstrate this new flexibility and potential for built-in accessibility.
First, almost all products today are controlled by a microprocessor running a program. In the past, products were more electromechanical in nature: if a company wanted a product to behave differently, it had to physically design it differently. Today, a phone's microprocessor controls what the phone does based on its instructions (its program). As a result, it is easy to give phones different behaviors simply by changing those instructions. It is also possible to have phones behave differently for different users (or for the same user in different situations) simply by allowing the user to select different behaviors. With processor speeds increasing and the cost of processors and memory continuing to fall at a steady rate, it is possible to build flexibility into products in ways that were unheard of before. The Nintendo game machine of a few years ago had the processing capacity of a 1985 Cray supercomputer, and those Nintendos are now slow and obsolete by today's standards.
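The point that behavior now lives in software rather than hardware can be sketched in a few lines. This is a minimal, hypothetical illustration (the profile names and behaviors are invented for the example): the same device dispatches to different alerting behaviors simply by consulting a user-selected profile, with no physical redesign.

```python
# Hypothetical sketch: one device, several user-selectable behaviors.
# Changing behavior means changing data/instructions, not hardware.

RING_BEHAVIORS = {
    "audible": lambda: "play ringtone",
    "visual":  lambda: "flash screen",
    "tactile": lambda: "vibrate",
}

def ring(profile: str) -> str:
    """Alert the user of an incoming call using their chosen profile."""
    # Fall back to the default audible alert for unknown profiles.
    return RING_BEHAVIORS.get(profile, RING_BEHAVIORS["audible"])()

print(ring("tactile"))  # a deaf user selects vibration instead of sound
```

The same mechanism generalizes to any microprocessor-controlled product: adding a new behavior for a new situation is a table entry, not a new circuit.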
Mobile computing has also driven the need for more flexible interfaces and systems. People want to access their systems at their desktops, while walking, while driving (eyes busy), while doing other things (hands busy), in noisy environments (where they cannot hear), and in quiet environments such as libraries or meetings (where they cannot use sound). They therefore want systems that let them easily change the way they interact with a product based on their needs and situation.
Finally, the rapid advances in voice technologies are making verbal (e.g., words) control of products and the use of natural language to operate products an increasing reality.
All these trends are coming together to provide all users with systems which will have unprecedented flexibility and ability to adapt to the different needs of users at different times and in different environments. This same flexibility will have profound effects on the ability of products to adapt to the individual situations faced by people who have temporary or permanent disabilities. This paper/presentation will describe some recent and future technologies and demonstrate how these technology advances can dramatically change the design and natural accessibility of future information and telecommunication products.
When digital cellular phones were introduced, their compression algorithms distorted TTY signals, making it difficult or impossible to transmit clear TTY messages over digital cellular phones. To address this problem, phone manufacturers changed the software in the phone so that it would recognize the TTY tones and convert them to a more easily transmitted form.
This meant that all of these digital cellular phones would be capable of sending and receiving TTY characters as data. As a result, these standard off-the-shelf phones could, with little additional effort, present the TTY characters on the phone's display as they are received. Individuals who are deaf and who can speak would then be able to use any of the standard phones without any special modifications. If the cell phone also had some type of character entry mode (as many do these days), the individual who is deaf could also use the standard cell phone as a bi-directional TTY.
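A rough sketch of what "presenting TTY characters on the display" involves: TTYs send 5-bit Baudot (ITA2) characters over audio tones, with each character framed by a start bit and a stop bit. The sketch below assumes the tones have already been demodulated into a bit stream, and the two-entry LETTERS table is a hypothetical placeholder for illustration, not the actual ITA2 code assignments.

```python
# Sketch of turning a demodulated TTY bit stream into display characters.
# Assumption: 1 start bit (0) + 5 data bits (least-significant first) +
# 1 stop bit per character. LETTERS is an illustrative placeholder table,
# NOT the real ITA2 assignments.

LETTERS = {0b00001: "H", 0b00010: "I"}  # illustrative subset only

def decode_frames(bits):
    """Consume start + 5 data bits + stop per character; return text."""
    chars = []
    i = 0
    while i + 7 <= len(bits):
        if bits[i] != 0:       # hunt for a start bit
            i += 1
            continue
        code = 0
        for b in bits[i + 1 : i + 6]:   # data bits, LSB sent first
            code = (code >> 1) | (b << 4)
        chars.append(LETTERS.get(code, "?"))
        i += 7                 # skip past the stop bit
    return "".join(chars)
```

Once the phone's software exposes the decoded characters this way, routing them to the display (or accepting keypad input and running the process in reverse) is straightforward.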
This is a feature that might be useful to many people, not just individuals who are deaf. Phones could be designed with a "silent operation" mode which would allow people to communicate with others when they are in a meeting or another location where they cannot talk on a phone.

Captel - Captioned Telephone Conversations

The Captel technology is already available today in limited markets on a pilot basis and will soon be rolling out nationwide. The service is called Captel, which is short for "captioned telephone."

Figure 1: The Captel phone looks like a standard desk phone with the handset on the left and oversized number buttons. Above the keypad are six small buttons, and above them is an LCD screen. To the right of the keypad are four small buttons and one large button. At the front edge of the phone, below the keypad, is a flap which folds down to reveal a small hidden keyboard.

The Captel telephone looks much like a standard telephone, as shown in Figure 1. The key difference is that it has a small "caption" button on it. Pressing this button before dialing causes the phone call to be captioned. Some day this captioning will be done entirely with computer-based speech recognition. Today, however, speech recognition is not nearly good enough and does not accommodate the variety of callers, including people with accents. Instead, the Captel system operates by having a third person listen in on the call; pressing the caption button before dialing links this person automatically into the call. This third person listens to everything said by the other party and simultaneously re-speaks the same words, very distinctly, into a computer that performs voice recognition. Because this person has trained the computer to recognize their voice and does this all day, they achieve an extremely high recognition rate.
In addition, whenever the computer makes a mistake, they can see it on screen and correct it before it appears on the display of the user's Captel telephone. For more information see www.ultratec.com.
The "try harder" technique can be used with any of the technologies described in this paper. It is basically a mechanism that eases the transition from present technical capabilities to future ones. Applied to speech recognition, it might look like the following. An individual would like to use fully automatic, computer-based speech recognition; today, however, this works only in certain circumstances and with certain speakers. A "try harder" button on the product would let the user first try the automatic (and much less expensive) speech recognition to see whether it works. If it does not, the person simply pushes the "try harder" button and a human being is brought in who views the output of the automatic speech recognition and makes corrections. Pushing the "try harder" button again could shift to a human re-voicing, using shorthand, or using some other technique to provide more accurate speech-to-text translation. Each press of the button could bring in more skilled individuals at a higher cost.
With the "try harder" technique in place, we can deploy systems that work today and that naturally migrate to and incorporate newer technologies (in this case, speech recognition) as they become available, yielding a faster and lower-cost system over time. By having a system that works today and leans into tomorrow, we can create both the services that are needed now and the infrastructure to incorporate advances automatically as they arrive.
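The escalation ladder described above can be sketched as follows. The tier names and per-minute costs are invented for illustration; the point is only the mechanism: each press of the "try harder" button moves the session to a more capable, more expensive tier, and as automatic recognition improves, fewer sessions ever need the upper tiers.

```python
# Hypothetical sketch of the "try harder" escalation ladder.
# Tier names and per-minute costs are invented for the example.

TIERS = [
    ("automatic recognition",       0.10),
    ("human-corrected recognition", 0.50),
    ("human re-voicing",            1.50),
    ("skilled stenographer",        3.00),
]

class TryHarder:
    def __init__(self):
        self.level = 0          # always start with the cheapest tier

    def current(self):
        return TIERS[self.level]

    def press(self):
        """User pushes the 'try harder' button: escalate if possible."""
        if self.level < len(TIERS) - 1:
            self.level += 1
        return self.current()
```

Starting every session at the cheapest tier and escalating only on demand is what lets the same infrastructure absorb better recognition technology transparently: improving tier 0 silently reduces how often the button is pressed.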
A more general form of the speech-to-text concept has been described by multiple researchers and prognosticators over the years: a personal portable interpreter for deaf users. One example (Vanderheiden, 1995) took the form of a hypothetical device consisting of a small directional microphone (which might take the appearance of a pen) and a small alphanumeric display built into a pair of glasses. A deaf person could carry the device inconspicuously. Whenever they got into a conversation, they would simply angle the pen toward the mouth of the person speaking. The speech would be picked up and transmitted to a central location, where it would be transcribed into text. The transcribed text would then be sent back and presented to the user on the display built into the glasses. This device would give individuals who are deaf the ability to carry on face-to-face, eye-to-eye conversation with anyone talking to them. As technology advances, the central transcribing service could become more and more automated and eventually be entirely automatic. (In fact, in the far distant future the processing power to do perfect recognition may fit into the pen itself.) In September 2000, Ultratec introduced a new service called "Instant Captioning" which brings this concept to reality. Currently in field testing, it will provide a capability similar to the one described here, using different technology formats (www.ultratec.com).
Remote sign language interpretation operates in a fashion similar to the listening pen except that instead of text being sent back, the image of someone using sign language is sent back and projected for the individual. The individual who is deaf can then see someone floating in space providing a sign language interpretation of whatever is being said.
Initial versions of this have already been implemented using a live person providing the sign language interpretation from a remote location over Internet2 (Barnicle et al., 2000). An alternative to sending a complete video signal, however, would be the use of an avatar (a computer-generated person). The avatar's motion could be abstracted from a real (live) signer and reconstructed, or it could be generated by direct audio-to-avatar conversion. Research is currently underway both on text-to-sign-language translation and on software avatars that can carry out sign language gestures with sufficient accuracy and fluidity. The impact of a personal interpreter system such as those described here would be tremendous. Live remote interpretation also lets interpreters work out of their homes on demand, greatly increasing the number of interpreters available, allowing them to work in shorter time spans, and decreasing interpretation costs for people with disabilities.
These concepts can be extended further when individuals are involved in communications over telephones. Using morphing technology it is possible to take an image of a person and combine it with a signing avatar so that one ends up with an image which looks exactly like the individual themselves doing the signing. In this fashion, it is possible to send an image of someone signing over a picture phone even though the person does not know how to sign at all.
Similarly, it is possible to take an image of a person and morph their lips so that it looks like they are talking even though they are not. Thus, a person (who cannot speak) could type on a keyboard and project an image of themselves sitting there calmly talking with their lips moving and speech apparently coming from them.
Combining these two applications, an individual on one end of a videophone call who is deaf (and cannot speak) could appear to speak to others on the phone, and when the others talk back, the deaf individual would see them signing. Eventually, gesture recognition will be good enough that the individual who is deaf could actually be signing and seeing the other person sign back, while the other person talks and sees the deaf person talking back. In this fashion, each individual can choose to present themselves, and to view the other person, in whatever form is easiest for them to interact with.
In the above examples, the concept of an "assistant on demand" was introduced. Basically, an "assistant on demand" is an individual who could be called up to assist someone with a disability anytime they required it, but who would not be around the rest of the time. In the examples above, the assistant was an interpreter helping someone who is deaf deal with the auditory environment. Assistants on demand can also be used by individuals with visual or cognitive disabilities.
For example, someone with low vision or blindness might find it very convenient to use a small camera and a video telephony link to call on a remote person who could act as a vision assistant whenever needed. An individual may be sitting in a lecture (or taking part in a teleconference call), most of which can be followed simply by listening to the speaker. At a certain point, however, the lecturer suddenly begins referring to a chart or diagram. The individual who is blind could push a button on their Visual Assistant on Demand (VAOD) and instantly have someone help them interpret the chart through verbal description. Because the teleconference is electronically mediated, it is also possible to keep a 30- or 60-second loop running. Thus, when the visual assistant is brought into the conference, they can actually begin 60 seconds in the past, so that the visual reference or exhibit which prompted the call for the VAOD can be viewed by the assistant and described to the individual who is blind. A technique that might be called "freeze and catch up" could also be used, whereby the video conference is frozen (only for the person who is blind) while the visual exhibit is being described. The conference would then resume and be played back to the individual in a slightly delayed fashion but at a slightly accelerated speed, so that after a short while the individual catches back up to real time.
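The "freeze and catch up" idea reduces to simple arithmetic. If playback is frozen for `pause` seconds and then resumed at `rate` times real time (rate greater than 1), the viewer's lag shrinks by (rate - 1) seconds for every second of playback, so catching up takes pause / (rate - 1) seconds. A minimal sketch (the function name and example numbers are ours, chosen for illustration):

```python
# "Freeze and catch up": after a pause, accelerated playback absorbs the lag.
# Lag shrinks by (rate - 1) seconds per second of playback.

def catch_up_time(pause: float, rate: float) -> float:
    """Seconds of accelerated playback needed to return to real time."""
    if rate <= 1.0:
        raise ValueError("playback rate must exceed 1.0 to catch up")
    return pause / (rate - 1.0)

# e.g., a 30-second frozen description replayed at 1.25x real time:
print(catch_up_time(30.0, 1.25))  # 120.0 seconds to rejoin real time
```

At 1.25x, a 30-second freeze takes two minutes of slightly accelerated playback to absorb; a faster rate catches up sooner, at the cost of a more noticeable speed-up for the listener.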
An example of a hypothetical device, which provided assistance to individuals with cognitive disabilities, was included in Rowitz's 1992 "Mental Retardation in the Year 2000" (in a chapter titled, A brief look at technology and mental retardation in the 21st century, Vanderheiden, 1992).
Companion/tool technologies are those which not only extend or enhance an individual's cognitive abilities, but also augment them with a second, separate cognitive entity. One example is a hypothetical device called the "Companion." The Companion consists of a small device approximately the size of a large wallet. It has four or five large buttons, brightly and distinctly colored and marked with symbols. One button stands for "Help." Two other buttons stand for "Yes" and "No." Another button is a request button. The Companion has voice output and speech recognition. It has an artificial intelligence system programmed within it which is specifically designed to facilitate problem-solving and crisis resolution. In addition, the Companion acts as a reminder and monitoring system for the individual. The Companion has a built-in GPS system which allows it to keep track of its exact position using signals from navigation satellites. Finally, the Companion has a cellular communication system similar to a cellular telephone, allowing it to put the individual into instant contact with a crisis line in case of an emergency which cannot be easily handled by the Companion itself. The purpose of the Companion would, of course, be to allow individuals with mental retardation to live more independently. If the Companion could enable an individual to live safely in a less supervised or more independent fashion, the cost savings would very quickly cover the cost of the Companion.
The final scenario is a description of a hypothetical multi-purpose communication and information device from the not-too-distant future (see Figure 2). All of the techniques and technologies described in the product exist today in laboratories.

Figure 2: The "Window to the World" is a 3.5" x 4.5" device which is 1/4" thick when closed. The device opens up much like a men's billfold to form a device 3.5" x 9" which is 1/8" thick, except for a 1/2" strip down the right edge which remains 1/4" thick and contains two cameras, a microphone, and controls. When open, most of the device is a large touch-screen interface which covers the entire inside of the unfolded device, with the exception of the aforementioned 1/2" strip down the side. The device has four small buttons, two on each of the extreme bottom sides.

With the Window to the World (WTTW), a person can access any information they have permission to access anywhere in the world: any book that has been published; any movie; any document; any report; all of their own personal information, files, letters, and records; their children's report cards and lunch menus; movie times; prices for the products at all of the stores; instruction manuals for any product; and so on. The WTTW connects wirelessly to the network. The device also acts as a telecommunication device, and network-based translation services make it possible to carry on a conversation with anybody speaking any language and automatically have the conversation interpreted into each person's native tongue. The device can be operated entirely verbally or visually, and the input controls are flexible. The device also has a capture-and-read function that can also translate. With its optional heads-up display and earbud, it provides great input and output flexibility to support its use while walking, at a reception, in a meeting, in noisy environments, while driving, etc.
The result is a product that, though designed for flexible, mobile use by all people, also happens to be cross-disability accessible. The general trend in future technologies is toward a flexibility of input and output that makes them either naturally cross-disability accessible or very close to it. We need to capitalize on this trend.
This paper is based on research carried out under funding from the National Institute on Disability and Rehabilitation Research (NIDRR) of the US Dept of Education under grant #H133A60030. The opinions contained in this publication are those of the grantee and do not necessarily reflect those of the Department of Education.
[Barnicle et al., 2000] Barnicle, Kitch; Vanderheiden, Gregg; Gilman, Al. "On-Demand" Remote Sign Language Interpretation. In: 9th International World Wide Web Conference, Poster Proceedings, Amsterdam, May 15 - 19, 2000. http://www9.org/final-posters/poster26.html.
[Vanderheiden, 1992] Vanderheiden, Gregg. A brief look at technology and mental retardation in the 21st century. In L. Rowitz (Ed.), Mental Retardation in the Year 2000 (pp. 268-278). New York, NY: Springer-Verlag New York, Inc., 1992.
[Vanderheiden, 1995] Vanderheiden, Gregg. Access to global information infrastructure (GII) and next-generation information systems. In A. Weisel, Ed., Proceedings of the 18th International Congress on Education of the Deaf - 1995. Tel Aviv, Israel: Ramot Publications - Tel Aviv University, 1995.
[Vanderheiden, in press] Vanderheiden, Gregg. Telecommunications -- Accessibility and Future Directions. In: Julio Abascal, Colette Nicolle (eds.), Inclusive Guidelines for HCI. Taylor & Francis Ltd., in press.