1994 VR Conference Proceedings



Information Modeling Aspects in Applications for Blind Persons

By: Lars Reichert, Martin Kurze
Free University of Berlin
Department of Mathematics and Informatics
Takustr. 9
D-14195 Berlin, Germany
Tel: +49 30 838 75 134
Fax: +49 30 838 75 109
E-mail: {reichert, kurze}@inf.fu-berlin.de

and

Thomas Strothotte
Otto-von-Guericke-Universität Magdeburg
Department of Simulation and Graphics
Universitätsplatz 2
D-39106 Magdeburg, Germany
Tel: +49 391 5592 3772
Fax: +49 391 5592 2447
E-mail: tstr@isg.cs.tu-magdeburg.de

Abstract

VR substantially depends on the modeling of the virtual worlds that are presented to the users. While much effort has been put into developing techniques to present these models to sighted users, methods for creating these models efficiently, especially for the needs of blind people, are rare. In this paper we focus on two aspects of information modeling that can be used in a prototypical electronic travel aid (ETA) using VR technologies to enhance blind people's independent mobility.

There is wide scope for incorporating modeling technologies to give blind users optimal assistance in such an ETA. One problem, however, is finding a Geographic Information System (GIS) suited to the special needs of blind persons. In addition, most relevant information cannot be acquired automatically because it is coded in graphical form. Therefore, we introduce a method that combines automatic data recognition with input from a sighted person, who provides the system with information relevant to the blind end users. As a case study we describe a new modeling technique for producing 2-D theatre wings. Finally, we introduce a methodology for presenting this information to blind users through acoustic and tactile output, which can be used in VR environments.

1. Selected Applications of Information Modeling for Blind Persons

The world - as sighted people know it - is full of objects (and concepts) that can easily be grasped visually. The visual sense is capable of acquiring information in various representations. Since (sighted) people receive most of their information using their eyes, information presentation is often organized to fit into this perception method. In this chapter, we analyse the problems resulting from the real world's organization being adapted to sighted people but not fitting for blind persons.

Many documents and texts (on screen as well as on paper) contain diagrams in various forms. Fig. 1 shows an example of a fancy 3D bar chart, also called a junk-chart. The subject matter of this chart is of no importance in our context.

Fig. 1: A typical diagram as it can be found in magazines.

These diagrams convey two classes of information to the sighted user:

a) the numerical data on which the diagram is based and

b) the message in which the author implicitly shows his intention.

While the former could be conveyed to the blind user by reading the figures to him, the latter must either be explained explicitly or ignored altogether. So we can identify at least two problems which diagrams in texts pose for the blind user:

a) Which figures (data) form the basis for the diagram (the "table")?

b) What is the author's intention, and how can we convey it implicitly to the blind person, much as it is presented to the sighted one?

This problem is addressed by [Kurze94a].

Probably the most challenging type of picture is the photograph (or photorealistic image). In (virtual and non-virtual) reality, complex scenes may contain various objects of several sizes, types and importance. Fig. 2 shows an early version of a "photorealistic" image.

Fig. 2: Medieval drawing of a scene with objects of different sizes.

The most exact way to represent these objects would be to create a 3D model of the complete scene. Of course, this is not a realistic approach for regular photos and most other pictures. We have to reduce the exactness of the picture without losing key features of the scene. First of all, we have to analyze which features are the key ones, and then we have to think about an appropriate implementation (see chapter 3 for details).

Very big or very small objects are another problem. Many sighted people, even some physicists, have difficulty building a mental model of a galaxy or an atom. For blind people this is even more difficult. As they cannot see the size of a building in relation to a man, they have to take at least one more mental step: in a solid model (wooden, paper or plastic) they can feel the size relation of man and building, but they still have to rescale the modeled man to the size of a real-world man.

A somewhat special problem arises with the spread of computers with graphical user interfaces. These interfaces are quite easy for a sighted person to use because they address visual 2D capabilities, which enable advanced interaction techniques like mouse selection and direct manipulation. Blind users, who cannot draw on these capabilities, are put at a disadvantage. This problem is addressed by the European project GUIB (Graphical User Interfaces for Blind People). In this project an off-screen model is used and presented to the blind person using some techniques known from conventional screen readers and some new forms of interaction. See [GUIB93] for details.

Another class of problems for blind and visually disabled people is restricted access to the tools that normally help people to be independently mobile. For instance, maps of unfamiliar environments help sighted people find their way. However, even sighted people may have considerable problems in interpreting the more or less abstract representation of streets and other localities in a city. Even though they can visually compare the map representation with reality, many people get lost in unfamiliar places. Tactile maps of big cities are already available. However, the problem for blind and visually disabled people is that the visual feedback which could easily verify the actual location is missing or highly restricted. Alternative strategies such as listening to sounds or tactile feedback with the long cane are harder to manage and slower to carry out. One solution to this whole problem is an electronic travel aid which helps the disabled person to navigate by using a Global Positioning System (GPS) receiver and an electronically available map (compare [Loomis93]). Although such systems are already being used more or less successfully for car navigation, there are additional problems (which we will address in detail in chapter 2) with the Geographic Information Systems (GISs) and the map representation if used for blind pedestrians.

In the context of all these problems which visually disabled people have to face in everyday life another point seems to be interesting: the use of the TV. Though at first glance TV seems to be a highly visual medium, there is a strong desire of visually impaired people to watch TV. On the one hand it is the most popular leisure activity and on the other hand it serves the information needs of the population. Being excluded from the use of this medium would mean a big restriction in the social lives of many visually impaired people.

Although some programs have a high aural content (e.g. news), watching TV depends a lot on the visual channel. Visually disabled people therefore miss a lot of information that is presented through pictures alone, which sometimes makes it hard to follow the program. The solution to this problem is to install a separate commentary channel for the visually transmitted action such as scenery, facial expressions, body language, etc. In the USA an extra sound channel, the Secondary Audio Program (SAP), has already been added to the television system. In Europe a project named AUDETEL (Audio Description of Television) is investigating the problem that European television sound systems make it impossible to add an extra audio channel with reasonable bandwidth (for details see [AUDETEL93]).

Among the various applications of information modeling for blind people mentioned above, we now wish to focus mainly on two fields: the modeling of geographical information in chapter 2 and the modeling of pictorial information in chapter 3. In chapter 4 we describe methods for presenting graphical information to blind users. In chapter 5 we draw conclusions and outline future research.

2. Modeling Geographical Information

As mentioned in the previous chapter, one severe problem for blind people is the lack of independent mobility in unfamiliar environments. People find their way by orientation and navigation, but blind and visually disabled people not only have problems in orientation but also have no access to navigation tools such as maps. A solution to this problem could be Electronic Travel Aids which use electronic maps and Geographic Information Systems (GISs), as suggested for instance by Loomis et al. in [Loomis93].

A GIS is a database system supporting spatially-referenced data. Typically the data are connected logically with coordinates. In addition, the algorithms which are operating on these data, for example for data capture, storage, checking, integration, manipulation and analysis, are normally included in the GIS. Spatial data ranges from information about the boundaries and ownership of land parcels to location and arrangement of streets.

A GIS uses a data model which is a formal definition of the relevant objects and appropriate operations and integrity rules on them. For spatial information, two data models are normally used: the raster model and the vector model. The raster model is based on a regular division of the space into cells. The vector model uses the concept of topology so that operations like scaling can be easily realized. For operations on the model, such as path finding, a vector model is essential.
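As a minimal sketch of why a vector model lends itself to path finding, the fragment below stores a small street network as nodes with coordinates and edges between them, and searches it with Dijkstra's algorithm. The node names, the coordinates, and the choice of Dijkstra's algorithm are illustrative assumptions and are not taken from the system described in this paper.

```python
import heapq
import math

# Hypothetical vector-model street network: nodes carry coordinates,
# edges connect node names. All names and values are invented.
NODES = {"A": (0.0, 0.0), "B": (60.0, 0.0), "C": (60.0, 80.0), "D": (30.0, 40.0)}
EDGES = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]


def build_graph(nodes, edges):
    """Return an adjacency map with Euclidean edge lengths."""
    graph = {name: [] for name in nodes}
    for a, b in edges:
        length = math.dist(nodes[a], nodes[b])
        graph[a].append((b, length))
        graph[b].append((a, length))
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (total_length, node_list) or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in graph[node]:
            if neighbour not in visited:
                heapq.heappush(queue, (dist + length, neighbour, path + [neighbour]))
    return None


GRAPH = build_graph(NODES, EDGES)
ROUTE = shortest_path(GRAPH, "A", "C")
```

A raster model would first have to be traced into such nodes and edges before the same search could be applied.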

GISs were independently designed for various purposes, including civil administration, area planning, geodesy, traffic control, etc., which makes it hard to integrate different GISs. In accordance with various purposes of the GISs, the electronic maps represented in them are also realized quite differently. On the one hand, the land registry offices are only interested in parcels of land, their use and to whom they belong and therefore have no notion of streets, street furniture or public transport facilities as independent entities. On the other hand, these features are extremely important for applications of traffic control where information about buildings is of minor interest. In addition, GISs vary substantially in their user interface which influences their capabilities for access to data, maintaining existing data and acquiring new data (compare [Voisard94]).

The complexity of different kinds of data included in these electronic maps makes it essential that several layers be introduced which can be viewed and edited independently but which can still communicate with each other. Examples of different layers are borderline layers, street layers, public transport line layers, building site layers, individual data layers. For blind users there is a need for even more layers, for example one which represents street-crossing facilities such as whether there is a special traffic light for blind persons, a zebra crossing or a subway crossing.
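The layer idea can be sketched as follows. Each layer is stored and edited on its own, and the layers are related to each other through shared location identifiers. The class names, the identifier "x17", and the attribute names are hypothetical, and linking layers by a shared identifier is only one possible way to let them communicate.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One independently editable map layer: location id -> attributes."""
    name: str
    features: dict = field(default_factory=dict)


@dataclass
class LayeredMap:
    """Layers are stored separately but share location ids."""
    layers: dict = field(default_factory=dict)

    def add_feature(self, layer_name, location_id, **attributes):
        """Add or overwrite one feature in the named layer."""
        layer = self.layers.setdefault(layer_name, Layer(layer_name))
        layer.features[location_id] = attributes

    def describe(self, location_id):
        """Collect what every layer knows about one location."""
        return {
            name: layer.features[location_id]
            for name, layer in self.layers.items()
            if location_id in layer.features
        }


# Invented sample data: one intersection known to two layers.
city = LayeredMap()
city.add_feature("streets", "x17", streets=("First Street", "Second Street"))
city.add_feature("crossings", "x17", kind="traffic light", acoustic_signal=True)
info = city.describe("x17")
```

Here the "crossings" layer plays the role of the additional layer for blind users: it can be maintained separately and is merged with the street layer only when a location is queried.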

The bottleneck in using a GIS for the navigation of blind people is its dependence on reliable data. Existing maps are usually too imprecise to meet their special needs. When small-scale maps are digitized, for example, polygons that should coincide often drift apart. Another irregularity is that houses cannot always be identified unambiguously: some houses have several addresses, which makes data acquisition more complex.
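One common remedy for drifting polygons, shown here only as an illustration and not as the method used by the authors, is to snap vertices that lie within a small tolerance of each other onto a single position. The function name, the tolerance value, and the sample parcels are assumptions.

```python
import math


def snap_vertices(polygons, tolerance):
    """Move vertices that lie within `tolerance` of an earlier vertex onto it.

    `polygons` is a list of polygons, each a list of (x, y) tuples.
    Returns new polygons; the input is left unchanged.
    This naive version compares every vertex with all earlier anchors.
    """
    anchors = []  # representative positions seen so far
    snapped = []
    for polygon in polygons:
        new_polygon = []
        for vertex in polygon:
            for anchor in anchors:
                if math.dist(vertex, anchor) <= tolerance:
                    vertex = anchor
                    break
            else:
                # No anchor was close enough: this vertex becomes a new anchor.
                anchors.append(vertex)
            new_polygon.append(vertex)
        snapped.append(new_polygon)
    return snapped


# Two invented parcels whose shared edge drifted apart during digitizing.
parcel_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
parcel_b = [(10.2, 0.1), (20.0, 0.0), (20.0, 10.0), (9.9, 10.1)]
fixed_a, fixed_b = snap_vertices([parcel_a, parcel_b], tolerance=0.5)
```

The tolerance must stay well below the smallest real distance between distinct features, otherwise separate objects would be merged.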

Blind and visually disabled people do not usually have high incomes. Therefore, an electronic travel aid must be reasonably priced to penetrate the market. The digitizing of maps, however, is a time consuming and difficult procedure making most commercially available GISs quite expensive. This requires the use of existing maps (from public services, etc.) and sophisticated methods to integrate them.

3. Modeling Pictorial Information

As mentioned above, pictorial information can in some cases be represented as a 3D data model. These models are common in CAD (Computer Aided Design) and Virtual Reality. In both cases they are restricted to quite simple geometric shapes or relatively few objects in the scene or both. Even with this restriction it is hard work to model a sector of the real world and some objects cannot be modeled at all (trees, animals with a coat or feathers, and human beings).

People in the VR business are familiar with these problems. In addition, the 3D model, once it is completed, requires a large amount of calculating power to be rendered. VR applications require even more power: the visual aspect of VR is one of the most challenging applications for graphics supercomputing.

The acoustic aspect must also be taken into account. The problems arising here are in several respects even more complicated than those of the visual sense: it is easy to lie to the eye but it is not easy to lie to the ear. The human sense of hearing is quite sensitive to unnatural sounds or effects. (We will not cover this topic here any further.)

To reduce the workload of man and machine, we propose an approach which allows a (sighted) person to model a scene quickly and easily and still results in a 3D scene.

The first decision to be made in this field is which information is most essential in pictures. To find a solution which is applicable to a large proportion of all pictures, we need a universal classification of the contents of pictures. As long as we only look at non-abstract pictures, i.e. pictures showing real scenes or photos, we come to the following solution:

  1. Objects represented: type of object ("house"), name of specific object ("my hotel")
  2. Shape of object in 2D: rectangular, circular, narrow, ...
  3. Shape in 3D: cubic, spherical, cylindrical, ...
  4. Position on 2D picture
  5. Position in 3D space
  6. Position relative to other objects
  7. Direction in 3D space

Some information can be derived from other information, e.g. the position relative to other objects can be derived from the 3D position. For most purposes some of the information is not essential, e.g. the shape in 3D if the type is known: you do not need to know that a tree has a cylindrical trunk and a spherical top as this is part of your experience. You also do not really need to know the exact position in 3D. If you want to have an overview of the scene displayed, you may be satisfied if you know what is in the foreground, in the middle and in the background.
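The following sketch illustrates how derived information can be computed from stored information. It keeps only some items of the classification (type, name, 2D shape, 2D position, 3D position), reduces the exact depth to foreground, middle and background, and derives the relative position of two objects from their 3D positions. The record layout, the coordinate convention (z grows with distance from the viewer), and the two depth limits are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PictureObject:
    """Items 1, 2, 4 and 5 of the classification (3D shape is omitted)."""
    kind: str            # type of object, e.g. "house"
    name: str            # name of the specific object, e.g. "my hotel"
    shape_2d: str        # e.g. "rectangular"
    position_2d: tuple   # (x, y) in the picture
    position_3d: tuple   # (x, y, z); z grows with distance from the viewer


def depth_band(obj, near_limit, far_limit):
    """Reduce an exact depth to foreground / middle / background."""
    z = obj.position_3d[2]
    if z < near_limit:
        return "foreground"
    if z < far_limit:
        return "middle"
    return "background"


def relative_position(a, b):
    """Derive item 6 (position relative to another object) from item 5.

    Ties in x or z fall into the second alternative of each pair.
    """
    horizontal = "left of" if a.position_3d[0] < b.position_3d[0] else "right of"
    depth = "in front of" if a.position_3d[2] < b.position_3d[2] else "behind"
    return f"{a.name} is {horizontal} and {depth} {b.name}"


# Invented sample objects.
tree = PictureObject("tree", "old oak", "narrow", (120, 200), (-2.0, 0.0, 5.0))
hotel = PictureObject("house", "my hotel", "rectangular", (300, 150), (4.0, 0.0, 40.0))
```

With limits of 10 and 30, the tree falls into the foreground and the hotel into the background, which is the level of detail the overview described above requires.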

These observations lead us to the concept of wings as they are used in the theatre and in some (old) movies. These wings are flat pieces of cardboard, so you could look upon them as 2D objects, possibly with a 2D shape. These wings are located in 3D space; they have a position relative to other wings and objects (as well as actors) and they can be moved in 3D to change the scene.

This concept has advantages in terms of efficiency, too: It is quite easy to generate wings from 2D input (e.g. photos). This can be done automatically (in simple cases) or interactively using a pointing device like a mouse, a joystick or the pen of a pen-computer.

Once you have created the 2D shapes (bounding polygons) of the objects in the scene, you can set the z-coordinate using a tool that displays a scale. The user moves this scale along the z-axis and over the picture. This helps him to identify the z-coordinate (distance) of the objects next to the scale. Whenever the scale has the right size compared with one of the wings, the user marks this wing. The system can then derive the wing's z-coordinate (directly from the scale's z-coordinate) and its size (height and width), by comparing the scale's length in the 2D picture with the wing's height and width in the picture.
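The derivation can be sketched under the assumption of a simple pinhole projection, in which an object of real length S at distance z appears with length f * S / z in the picture (f being the focal length in pixels). The paper does not state which projection the tool uses, so the model, the function names, the units, and the numbers below are all assumptions.

```python
def scale_length_px(scale_length_m, z, focal_length_px):
    """Apparent length of the scale when placed at distance z (pinhole model)."""
    return focal_length_px * scale_length_m / z


def wing_from_scale(wing_width_px, wing_height_px, scale_length_m, scale_z, focal_length_px):
    """Derive a wing's distance and real size from the matching scale position.

    The wing takes over the scale's z-coordinate. Its real size follows from
    the ratio between the scale's real length and its length in the picture.
    """
    metres_per_pixel = scale_length_m / scale_length_px(
        scale_length_m, scale_z, focal_length_px
    )
    return {
        "z": scale_z,
        "width": wing_width_px * metres_per_pixel,
        "height": wing_height_px * metres_per_pixel,
    }


# Invented example: a 2 m scale placed 20 m away appears 80 px long,
# so one pixel at that depth corresponds to 0.025 m.
wing = wing_from_scale(
    wing_width_px=400, wing_height_px=600,
    scale_length_m=2.0, scale_z=20.0, focal_length_px=800.0,
)
```

In this example a wing that covers 400 by 600 pixels is placed at a distance of 20 m and is about 10 m wide and 15 m high.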

The information derived by this method may not be very exact, but it is as exact as a human being's judgment of distances and sizes in a picture. So basically we can determine all the information which a sighted user can derive from a picture.

The primary result of the procedure is a 3D model containing 2D objects (wings). The model also contains a description of the scene as the user is forced to enter a name for each wing.

The secondary result can be a modified model. Once you have a 3D model at hand, you can modify it. You can delete wings (take care! you may have surprising effects if you don't have a complete background!), add new ones (from a library) and move wings to other places. This is quite a simple way to model a scene using a picture (e.g. a photo) as input.
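A minimal sketch of such a modifiable wing model is given below. It supports adding, deleting and moving wings and can list them from back to front. The class and field names are hypothetical and the sample wings are invented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Wing:
    """A flat 2D shape placed at distance z in the 3D scene."""
    name: str        # name entered by the sighted modeler
    polygon: tuple   # 2D bounding polygon as ((x, y), ...)
    z: float         # distance from the viewer


class WingModel:
    """A scene as a collection of named wings."""

    def __init__(self):
        self.wings = {}

    def add(self, wing):
        """Add a wing, e.g. one taken from a library."""
        self.wings[wing.name] = wing

    def delete(self, name):
        """Remove a wing and return it."""
        return self.wings.pop(name)

    def move(self, name, dx=0.0, dy=0.0, dz=0.0):
        """Shift a wing sideways, vertically, or in depth."""
        wing = self.wings[name]
        polygon = tuple((x + dx, y + dy) for x, y in wing.polygon)
        self.wings[name] = replace(wing, polygon=polygon, z=wing.z + dz)

    def back_to_front(self):
        """Wings ordered from farthest to nearest."""
        return sorted(self.wings.values(), key=lambda wing: wing.z, reverse=True)


# Invented sample scene.
model = WingModel()
model.add(Wing("backdrop", ((0, 0), (10, 0), (10, 6), (0, 6)), 50.0))
model.add(Wing("tree", ((2, 0), (3, 0), (3, 4), (2, 4)), 10.0))
model.move("tree", dx=1.0, dz=5.0)
```

Deleting the "backdrop" wing in this example would leave the scene without a background, which is the situation the warning above refers to.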

The concept of wings is very useful even in rendering tasks. You can use wings instead of solid 3D objects for less important parts of the scene. These wings can be polygons onto which textures (raster graphics derived from the picture) can be mapped with fast algorithms or even hardware. This opens a wide field of applications for wing-based scene descriptions.

Obviously this concept can be used in fields other than access to graphics for blind people. However, our main interest remains the use in this field of application. We use the wing model, derived from the interactive process described above to create a scene description, an abstract representation which the blind user can explore interactively. One way to explore it could be the Image Description Machine (see next chapter or [Strothotte93] for details).

All methods described below rely on a good abstract representation of the pictorial information. This chapter and the previous one describe two examples of methods to generate the abstract representation of graphical information. We call this procedure "modeling" and the result the "model". The model can then be used to give blind people access to the information displayed graphically for the sighted user.

4. Presenting Graphical Information to Blind Users

This chapter deals with the problem of giving blind people access to the now available model (the abstract representation). Since the model may be 3D, even the presentation as a 2D graphic for the sighted user implies some loss of accuracy. Sighted people can live with a deficit of information as they do not have a real 3D input channel. The eye (one eye) is capable of receiving 2D information only and this does not change very much with two eyes. Basically we live with 2D input.

Blind persons do not have an input channel for 2D information. The ear has only very limited capabilities for 2D sound, and the tactile sense of the skin is neither very accurate nor does it cover larger regions. So all information for blind persons must be presented as 1D. This 1D presentation can be a sequence of characters in a text (on a Braille display or on paper), spoken language, non-verbal acoustics or tactile drawings (on paper or presented using a 2D pin device which is described by [Schweikhardt85]). Even this last, seemingly 2D presentation is perceived one-dimensionally (over time).

There are two classes of methods to present graphics to blind persons:

  1. The (possibly suitably edited) graphics can be presented in a tactile form. The blind person then has to explore the raised lines (or pins) with his fingers section by section to grasp the content and meaning. Several devices and technologies already exist to produce such tactile output. For details see for example [Edman92].
  2. The graphics can be explained verbally. This explanation could either be a one-directional process in which a ready-made text is presented to the user, or it could be an interactive process which allows the blind person to ask questions about the graphics. At least two devices can be used to present this text to the blind person: a text-to-speech synthesizer can read the text to the blind person aloud, or a Braille display can produce a string of Braille letters which the blind person can read.

The more difficult problem is the pre-processing of the data on which the graphic is based (for example the map or the scene description). To be able to (pre-)process this data it must first be available. Therefore, we need modeling techniques such as those described above. More details about the processing of the derived model can be found in [Kurze94b].
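The interactive form of verbal explanation can be sketched as a small query interface over a wing model. Each answer is a short text that could be passed to a speech synthesizer or a Braille display. The class, its three queries, and the sample scene are hypothetical; this is not the Image Description Machine.

```python
class SceneExplorer:
    """Answers simple questions about a wing model, one short text at a time."""

    def __init__(self, wings):
        # wings: mapping from wing name to (x, z); x < 0 means left of centre,
        # z is the distance from the viewer. Values are invented here.
        self.wings = dict(wings)

    def list_objects(self):
        """Names ordered from nearest to farthest."""
        return sorted(self.wings, key=lambda name: self.wings[name][1])

    def where_is(self, name):
        """One sentence giving the side and distance of a wing."""
        if name not in self.wings:
            return f"There is no {name} in the scene."
        x, z = self.wings[name]
        side = "on the left" if x < 0 else "on the right" if x > 0 else "in the centre"
        return f"{name} is {side}, at distance {z:g}."

    def what_is_behind(self, name):
        """Names of all wings farther away than the given one."""
        if name not in self.wings:
            return []
        z = self.wings[name][1]
        return [other for other in self.list_objects() if self.wings[other][1] > z]


explorer = SceneExplorer({"old oak": (-2, 5), "path": (0, 12), "my hotel": (4, 40)})
```

Because every answer is a single sentence or a short list, the output remains a 1D sequence, in line with the restriction discussed at the beginning of this chapter.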

5. Conclusion and Future Work

We have introduced a variety of problems which have to be dealt with in modeling information for VR applications for blind people. Some problems arise because these persons need additional information which sighted people do not need (as in the case of electronic maps). Others arise because information for sighted people is implicitly coded in a picture not accessible for blind persons.

In many cases (such as applications for blind people) it is not necessary to have the exact and complete data about the displayed objects at hand; it is often sufficient to know which objects are in a specific scene and where they are located. Our newly introduced concept of wings provides an easy way to gain such information. Refinements on this concept are presently being examined, especially to generate textual scene descriptions.

A set of methods has been described and classified to present graphical information such as maps and wing sceneries to blind people. A problem still to be solved is to use a 2D tactile output device simultaneously as an input device.

As we have shown in previous chapters, modeling for blind people requires some special techniques which are not used in other situations. However, the methods developed for our purposes can be used in many other situations outside the field of modeling for blind people.


Acknowledgments

We would like to thank Axel Schalt for his investigations on electronic maps and the GUIB consortium for their valuable hints.


References



Reprinted with author(s) permission. Author(s) retain copyright.