2003 Conference Proceedings



Antonio F. Rodríguez Hernández.
Departamento de Fisiología, Universidad de La Laguna, España
Email: afrguez@ull.es

Rodríguez Hernández, A.F.¹; Rodríguez Ramos, L.F.³; Chulani, H.M.³; Burunat, E.²; González-Mora, J.L.¹

¹Departamento de Fisiología, Universidad de La Laguna.
²Departamento de Neuropsicobiología, Universidad de La Laguna.
³Departamento de Electrónica, Instituto Astrofísico de Canarias. Tenerife, España.


We summarize in this paper the aims and primary results of a research and development project in the field of sensory substitution using sounds, addressed to blind persons' mobility, orientation and perception of the environment.


Sighted people perceive a visual image of the world from the light rays coming from every coordinate of the space and objects present in the field of view. In a similar way, from the touched coordinates of an object, using the tactile and proprioceptive senses, a sequential tactile image of the object can be gathered. Applying the idea that the experience of an object's image is generated, in short, from spatially significant information, we have encoded the environment using spatial sounds, i.e., sounds that are perceived as coming from a particular location in external space. Using real or virtual sound sources, we make the surfaces of the objects inside a certain field of view in front of the subject emit sound, and explore the kind of perception that is generated.

Different devices offering distinct levels of environmental information, intended for blind persons' orientation and mobility using sounds, have been developed [1]. Some of them pursue object shape recognition using sounds [2] and others using the tactile sense [3], [4]. Our main objective has been not only to explore the possible capability to recognize spatial features of objects using spatial sounds, but also whether a sensation of perceiving a whole picture of the object can really be generated, present all at once and extending in 3D space, as could be said of the subjective experience of a visual image [5], [6], [7].

We first explored the perception and the ability to recognize spatial patterns using real sound sources, that is, spatial configurations of loudspeakers that we make sound in a particular way. Later, because of the good results, a device called the Virtual Acoustic Space device (VAS) was developed, which generates an auditory stimulus, delivered through headphones, that creates in the user the illusion that the objects in front of him are covered by sound sources, in this case not real but virtual ones. Such an effect is obtained by combining modern artificial vision techniques [8], for recognizing the environment, with 3D sound processing based on HRTFs (Head Related Transfer Functions), for sound externalization [9]. In order to explore the practical potential of this kind of stimulus for blind persons, preliminary studies have just been initiated with the latest version of the VAS, a portable one that functions outside the laboratory.

We present next the research devices developed to test the idea, a general overview of the main results with the blind persons, and a brief reference to several multidisciplinary research and development lines initiated around this subject. Finally, we will say some words about the fact that, for some subjects, the auditory spatial perceptions have been accompanied by simultaneous visual sensations called phosphenes.


Sound system for the excitation of a matrix of real sound sources. A commercial DSP (Digital Signal Processing) based device has been specifically programmed to allow the reproduction of up to 56 simultaneous sounds through the same number of loudspeakers (fig. 1).

It is possible to send an independent sound through every channel, controlling the order of presentation (the sequence in which the loudspeakers sound) and the time interval between them.
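The control just described (which loudspeakers sound, in what order, and at what interval) can be sketched as a simple schedule of (onset time, channel) events. This is a minimal illustration under our own naming, not the actual API of the DSP device:

```python
def schedule_matrix(order, interval_ms):
    """Build a playback schedule for the loudspeaker matrix: 'order'
    lists channel indices (0-55) in the sequence they should sound,
    and 'interval_ms' is the pause between consecutive onsets."""
    return [(i * interval_ms, channel) for i, channel in enumerate(order)]

# Example: fire the first four loudspeakers in a row, 100 ms apart.
events = schedule_matrix([0, 1, 2, 3], interval_ms=100)
```

Each event would then be dispatched to the corresponding output channel at its onset time.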

Virtual Acoustic Space generating device.

Different consecutive versions of the device have been developed, the first just for laboratory conditions [5], the latest a portable one. It consists of two subsystems (fig. 5), the first for visual scene capture and analysis, and the second for acoustic transduction and reproduction to the subject. Two micro-cameras are carried, one at each side of a pair of spectacles. From the images they capture, and using different possible types of artificial vision algorithms (geometric feature detection, stereovision) [8], the so-called depth map is calculated, that is, the information about the position in 3D space of every coordinate of the surfaces of the objects present in the scene. This yields, for every patch of an object's surface, its location in height, azimuth (horizontal position), and distance (fig. 2). About 6-7 depth maps per second are generated online.
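For the stereovision approach, once the two camera images are calibrated and matched, the depth map reduces to the standard triangulation Z = f·B/d. The sketch below only illustrates that last conversion step; the focal length and baseline values are made up, not the project's actual parameters:

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a stereo disparity map (in pixels) into metric depth
    via Z = f * B / d. Pixels with zero disparity (no match) are
    marked invalid with infinity."""
    d = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(d > 0, focal_px * baseline_m / d, np.inf)

# Toy 2x2 disparity map; 700 px focal length and 6 cm baseline are assumptions.
z = disparity_to_depth([[42.0, 21.0], [0.0, 84.0]], focal_px=700.0, baseline_m=0.06)
```

Closer surfaces produce larger disparities and therefore smaller depth values.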

The acoustic subsystem then reproduces a random sequence of brief sounds, one for each of the positions previously registered in the depth map. Every sound has been previously "spatialized" [10], that is, processed in such a way that, listened to through headphones, it seems to come from a certain position in the environment.
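The random sequencing can be sketched as a shuffle over the occupied depth-map positions, so the cloud of clicks follows no systematic scanning pattern. The function and data names here are ours; the device's actual scheduling details are not given in the text:

```python
import random

def playback_order(occupied_positions, seed=None):
    """Return the occupied depth-map positions in a random playback
    order; one brief spatialized click is emitted per position."""
    order = list(occupied_positions)
    random.Random(seed).shuffle(order)
    return order

# Each position is an (azimuth, elevation, distance) index triple.
occupied = [(3, 4, 2), (9, 1, 5), (0, 8, 7)]
playlist = playback_order(occupied)
```

Every position still sounds exactly once per depth map; only the order changes between frames.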

Up to now, a short impulsive sound, a sort of click without pitch quality, has been used to encode every position. This type of sound is highly localizable. The overall effect of listening to this stimulus could be described as perceiving a multitude of raindrops striking a window pane, or salt grains exploding in a fire. A field of 80° along the horizontal axis by 45° along the vertical one is spatially resolved into 17 (horizontal) by 9 (vertical) by 8 (distance) coordinates or pixels, which, because they contain a depth dimension, we usually refer to as "stereopixels".
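The quantization of the 80° × 45° field of view into the 17 × 9 × 8 grid can be sketched as follows. The grid sizes and angular spans come from the text; the far limit of the distance range (max_dist) is our assumption, since the paper does not state it:

```python
def stereopixel_index(azimuth_deg, elevation_deg, distance_m,
                      n_az=17, n_el=9, n_dist=8,
                      az_span=80.0, el_span=45.0, max_dist=4.0):
    """Quantize a 3D position into the 17 x 9 x 8 stereopixel grid.
    Angles are measured from the centre of the field of view;
    max_dist is a hypothetical far limit for the distance range."""
    az = max(0, min(int((azimuth_deg + az_span / 2) / az_span * n_az), n_az - 1))
    el = max(0, min(int((elevation_deg + el_span / 2) / el_span * n_el), n_el - 1))
    di = max(0, min(int(distance_m / max_dist * n_dist), n_dist - 1))
    return az, el, di

# A point straight ahead at 1 m falls in the central column of the grid.
idx = stereopixel_index(0.0, 0.0, 1.0)
```

Out-of-range positions are clamped to the nearest edge cell rather than discarded.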

To obtain the effect that a sound sent through headphones seems to come from a certain point in space, it has been processed off line with the so-called HRTF technique [9]. An HRTF is a function that captures the effect that the head and the rest of the body have on any sound wave arriving at the listener; this effect is different for each ear and for every direction of arrival. These functions are measured for every user. They are obtained from the signals recorded by a pair of small microphones located at the entrance of the blocked ear canal, while a sound wave is played from a loudspeaker located at every position we want to simulate.
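In the time domain, this processing amounts to convolving the click with the per-ear impulse responses (HRIRs, the time-domain form of the HRTFs) measured for the desired direction. A minimal sketch with placeholder filter values, not measured data:

```python
import numpy as np

def spatialize(mono_click, hrir_left, hrir_right):
    """Render a mono click at a virtual direction by convolving it
    with that direction's head-related impulse responses, one per
    ear, producing the binaural signal sent to the headphones."""
    return np.convolve(mono_click, hrir_left), np.convolve(mono_click, hrir_right)

click = np.array([1.0, -0.5])   # brief impulsive sound
h_left = np.array([0.9, 0.1])   # placeholder left-ear impulse response
h_right = np.array([0.4, 0.3])  # placeholder right-ear impulse response
left, right = spatialize(click, h_left, h_right)
```

In practice this convolution is precomputed off line for every stereopixel position, as the text describes.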


Many blind persons have tested the virtual device and the loudspeakers matrix, and at least seven of them participated in objective, quantified studies. They differ in age and in blindness onset, duration and degree of severity, and most of them have a fine sense of hearing and of orientation in known environments.

Subjects are given some indications about which aspects of this novel stimulus must be attended to, concretely, where the sound is coming from. The figure perception is then gathered after a short period of exploration.

We have recorded the verbal responses describing what the subject is experiencing when exposed to the stimulus and, when present, the drawing in the air of the image they report perceiving. They have also undergone several orientation and mobility tasks.

Several kinds of objects to perceive have been presented: single points; lines of about half a meter in several directions; lines composing different figures such as a U or an inverted U, a C or an inverted C; plane and curved surfaces; or holes in the center of a sounding surface. We have presented to one person a room with four walls, an entrance, a window, a column in some part of it, and a table.

The spatial features of these objects, when presented to the subjects, can be correctly described by the great majority of them. A few persons show a special difficulty, already present at the stage of achieving the externalization effect. The subjects can draw in the air where the objects seem to extend. Regarding the quality of their subjective experience, they report perceiving a whole and temporally sustained picture of the object, like a kind of figure against a background. In the first presentation of a figure this happens after a short exploration period of about 1 or 2 minutes; in subsequent presentations, the figure perception is more immediate. It is possible to distinguish the presence or absence of discontinuities in the figure's spatial extension, for example, a sounding line versus just two sounding points at the extremes of the same line, or a sounding plane surface with or without a 35° solid-angle hole in its center. Subjects usually report it by saying they could not pass through such a place. Objective tests have been carried out to obtain quantitative measurements of the subjects' responses. Fig. 4 shows some examples, comparing the drawing made by the subject using the virtual reality device with a reference one drawn visually. Shown are the drawings by different subjects of 50 cm lines presented in different orientations (vertical, horizontal and diagonal), and of the borders of a 30 by 30 cm square, all of them sounding at a distance of about one arm's length.

Two walls along a path are perceived as sounding objects globally present at both sides of the subject, with their vertical and in-depth dimensions, leaving in the centre a silent space, with its own dimensions, through which the subject knows she or he can pass, and actually does.

We have tested just one subject in an experimental room. Without previous knowledge of the environment, and without using the sense of touch, she was able to perceive the presence of its different features, move across the room, and give a correct verbal and graphical description of it and of the relative position of every object and surface.

The results suggest that it is possible, from an acoustic stimulus, to perceive the presence of objects that do not themselves emit sound, at their position and with their dimensions, at least when single objects are presented.

It is not merely that one locates many sounding positions here and there, and then deduces what the shape and other spatial features of a supposed object might be. We believe that what is perceived is a whole object, extending in space with its defined spatial features.

Perception of auditory-evoked phosphenes.

Some of the blind persons, and even one sighted person during a migraine episode, report perceiving luminous sparkles (described as little lights or stars), simultaneous with the auditory sensation, precisely at the spatial location where they localize a simple sound source. There are a few precedents of this strange phenomenon in the literature [11], [12], known as phosphenes induced or evoked by sounds, but what stands out here is their clear spatial location, and the fact that the phenomenon has remained present over the years. It seems to depend both on individual factors and on the special nature of the stimulus, which we are attempting to elucidate.

Research lines initiated.

Optimisation of the acoustic input: psychoacoustic studies; measurement of large sets of individual HRTFs (robot; individual mannequin; interpolation; other approaches); signal processing (reverberation processing; sound synthesis). Study of the brain mechanisms involved in the perception of figures using sounds: fMRI studies; TMS studies. Neuropsychological evaluation of the effects of ordinary use of this kind of stimulus.

Fig.1. Loudspeakers matrix in the variable-reverberation room.

Fig.2. A scene is converted into a number of pixels, each one with a specific distance value; we refer to these as stereopixels.

Fig.3. A person wearing the VAS device, pointing at a continuous vertical line.

Fig.4. Some figures drawn from visual (continuous blue line) and auditory (dotted black line) information.

Fig.5. Scheme showing the two subsystems constituting the VAS device. On the right, a photograph of a subject wearing one of the first prototypes.


[1] Kay, L. "Electronic aids for blind persons: an interdisciplinary subject". IEE Proceedings, Vol. 131, Pt. A, No. 7, pp. 559-576. September 1984.

[2] Capelle, C.H. et al. "A real time experimental prototype for enhancement of vision rehabilitation using auditory substitution". IEEE Transactions on Biomedical Engineering. Vol. 45, nº 10, pp 1279-1293. Oct. 1998.

[3] Bach-y-Rita, P. et al. "Visual substitution by tactile image projection". Nature, Vol. 221, pp. 963-964. 1969.

[4] Sampaio, E., Maris, S., Bach-y-Rita, P. "Brain plasticity: 'visual' acuity of blind persons via the tongue". Brain Research, Vol. 908, No. 2, pp. 204-207. 2001.

[5] Rodríguez-Ramos L.F., Chulani H.M., Díaz-Saco L., Sosa N., Rodríguez-Hernández A., González Mora J.L. "Image And Sound Processing For The Creation Of A Virtual Acoustic Space For The Blind People". Signal Processing and Communications, 472-475,1997

[6] González Mora, J.L., Rodríguez-Hernández, A., Rodríguez-Ramos, L.F., Díaz-Saco, L., Sosa, N. "Development of a new space perception system for blind people, based on the creation of a virtual acoustic space". Proceedings of the International Work Conference on Artificial and Natural networks. Vol. 2, pp 321-330. Springer. 1999

[7] Rodríguez Hernández, A; Sosa, N , Rodríguez Ramos, L.F. , Chulani, H. Díaz Saco, L, y González-Mora, J.L.. "Percepción del entorno en personas ciegas a través de un estímulo sonoro espacial virtual generado por computador". Actas del Congreso Iberoamericano 3º de CAA, 1º de Tecnologías de Apoyo para la Discapacidad, Oct. 2000. pp 81-84. Ramón Cerés y Comité Organizador Editores.

[8] Faugeras. "Three dimensional computer vision. a geometric viewpoint". The MIT Press. 1993.

[9] Wightman, F., Kistler, D.J. "Headphone simulation of free field listening I: stimulus synthesis". Journal of the Acoustical Society of America, Vol. 85, pp. 858-867. 1989.

[10] Begault, D. R. "3-D sound for virtual reality and multimedia". 1994. AP Professional.

[11] Lessell, S., Cohen M.M. "Phosphenes induced by sound". Neurology, V 29, 1524-1527. 1979.

[12] Page N. G. R. et al. "Auditory evoked phosphenes in optic nerve disease". Journal of Neurology, Neurosurgery and Psychiatry. V. 45, 7-12. 1982.

The project team is deeply indebted to the blind volunteers for their collaboration. Different R&D areas of this work have been carried out by many team members, whose important contributions can be found on our web page WWW.iac.es/eav. This work is supported by grants from the Spanish Science and Technology Ministry, European Funds (FEDER) and the Spanish Blind Persons National Association (ONCE).


Reprinted with author(s) permission. Author(s) retain copyright.