2006 Conference General Sessions

ICARE INTERACTION ASSISTANT: A WEARABLE FACE RECOGNITION DEVICE TO FACILITATE SOCIAL INTERACTION

 

 

Presenter(s)
Sreekar Krishna
Center for Cognitive Ubiquitous Computing
Arizona State University
699 South Mill Avenue
Tempe, AZ 85281
Day Phone: 480-326-6334
Fax: 480-965-1885
Email: sreekar.krishna@asu.edu

Presenter #2
John Black
Center for Cognitive Ubiquitous Computing
Arizona State University
699 South Mill Avenue
Tempe, AZ 85281
Day Phone: 480-727-7985
Fax: 480-965-1885
Email: john.black@asu.edu


In the last decade, digital camera technology has transformed photography, making it possible to capture and process images in real time. In addition, the size and weight of digital cameras have been shrinking drastically, allowing the design of small, unobtrusive assistive devices for people who are blind or visually impaired. Some researchers, such as those at the Kyoto Institute of Technology, have created wearable devices to help people who are blind navigate streets [1]. However, the consumers we have consulted have told us that they already have canes and dogs to help them navigate, and have encouraged us to work on other problems. One of the problems discussed in our focus groups was day-to-day social encounters. Social protocols dictate that a person greet a friend or acquaintance when meeting them unexpectedly. However, a person who is blind might not be immediately aware of such an encounter, and must rely upon the other person to initiate social contact.

Even after the initial contact is made, and a conversation is started, a person who is blind does not have access to many non-verbal communication cues (such as facial expression or eye contact) that sighted people take for granted. It is with these problems in mind that we have undertaken the development of the iCARE Interaction Assistant, a novel camera-based wearable device designed specifically to facilitate the social interactions of users who are blind, by allowing them to more readily initiate social interactions and to perceive non-verbal cues during subsequent verbal interactions.

The iCARE Interaction Assistant hardware includes a tiny analog CCD camera with a 1/3-inch CCD that has a light sensitivity of 0.2 Lux. The camera's 92-degree field of view provides good coverage of the space in front of the user. This camera (which is mounted in the nosepiece of a pair of glasses worn by the user) is powered by a 9V battery, and generates an NTSC analog video signal. That signal is routed to an Adaptec(R) video digitizer, which converts the input signal into a compressed AVI video stream, and transmits that stream over a USB cable. Because the digitizer output is compliant with the standard Windows Driver Model (WDM), it appears to an application programmer as a generic video capture device on the Windows(R) operating system.
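For illustration, the following minimal sketch shows how an application might read frames from such a generic capture device. It assumes OpenCV's Python bindings and a device index of 0; the actual prototype used its own capture code written against the WDM interface.

    import cv2

    # The digitizer appears as a generic capture device; index 0 is assumed.
    capture = cv2.VideoCapture(0)
    if not capture.isOpened():
        raise RuntimeError("video capture device not found")

    while True:
        ok, frame = capture.read()   # one frame decoded from the video stream
        if not ok:
            break
        # ... hand the frame to the face detection stage described below ...

    capture.release()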

A laptop computer (which can be carried in a backpack) then executes a face recognition algorithm. We used a tablet PC with an Intel(R) Centrino 1.5 GHz processor and 512 MBytes of RAM. (This particular laptop was chosen because of its compact form factor.)

The video captured by the camera might (or might not) contain any faces. Since the laptop computer provides a limited amount of processing power, it is important to identify which regions (if any) of the video frames contain a human face. Since the video frames must be scanned in real time, it is important that the method for doing this is highly optimized. We used an adaptive boosting algorithm, which starts by quickly scanning an entire video frame (to rule out regions that are unlikely to contain faces) and then iteratively processes the remaining regions, gradually reducing the candidate regions until a final decision is made for each region.
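A minimal sketch of this kind of boosted-cascade face detection is shown below, using OpenCV's pretrained Haar cascade as a stand-in; the cascade file, parameters, and library are assumptions, not the detector actually used in the prototype.

    import cv2

    # A cascade of boosted classifiers: cheap early stages reject most
    # regions quickly, later stages examine the remaining candidates.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def detect_faces(frame):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # scaleFactor and minNeighbors trade detection rate against speed
        return cascade.detectMultiScale(gray, scaleFactor=1.2,
                                        minNeighbors=5, minSize=(40, 40))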

Once a region in a video frame is identified as a face, all of the processing resources are focused on analyzing it, to recognize the person. In the iCARE Interaction Assistant we have employed two different face recognition algorithms to recognize people: (1) a well known method called Principal Component Analysis (PCA) [2], and (2) our own novel method called Distinctive Feature Analysis (DFA), which recognizes each person based on distinctive facial features that distinguish him/her from others in the database. Both of these algorithms compare the face image captured by the video camera to face images in an onboard database, which were captured by that same video camera in the past. A similarity measure is used to compute which of the faces in the database are most similar to the newly captured face image, and the name of the person best matching the newly captured face is chosen.
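The PCA (Eigenfaces) comparison can be sketched as follows; the array shapes, the number of components, and the Euclidean distance measure are illustrative assumptions, and the DFA method is not shown.

    import numpy as np

    def train_pca(face_vectors, n_components=20):
        """face_vectors: (num_faces, num_pixels) flattened database images."""
        mean = face_vectors.mean(axis=0)
        centered = face_vectors - mean
        # The right singular vectors of the centered data are the eigenfaces
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        eigenfaces = vt[:n_components]
        projections = centered @ eigenfaces.T
        return mean, eigenfaces, projections

    def recognize(new_face, mean, eigenfaces, projections, names):
        # Project the newly captured face and pick the nearest database face
        query = (new_face - mean) @ eigenfaces.T
        distances = np.linalg.norm(projections - query, axis=1)
        return names[int(np.argmin(distances))]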

The name of that person is then communicated to the user as synthesized speech, generated by the Microsoft Speech Engine. The speech signal is routed to a sound emitter in the earpiece of the glasses (rather than through an in-the-ear device) to avoid altering the user's perception of environmental audio.
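A sketch of this speech step, accessing the Microsoft Speech API through the pywin32 bindings (an assumption; the prototype calls the speech engine directly), might look like:

    import win32com.client

    # SAPI voice; output is routed to the sound emitter in the earpiece
    voice = win32com.client.Dispatch("SAPI.SpVoice")

    def announce(name):
        voice.Speak(name)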

 

Face recognition algorithms have historically had to deal with two persistent problems, neither of which has yet been fully solved. First, the illumination present when the device is trying to recognize a person can differ from the illumination under which the database images were captured. (Differences are especially great between indoor and outdoor environments.) This problem can be partially solved by populating the database over time with a diverse collection of face images for each person, taken under various lighting conditions. The second problem is that the person being recognized might not be looking directly at the camera, thus slightly altering the appearance of his/her face image, moment by moment.

 

While testing the Interaction Assistant, we found that, although a majority of the frames in a video stream might be identified correctly, these two types of variations sometimes caused the device to sporadically recognize people incorrectly. To minimize user confusion, the device is configured to wait until the face recognition algorithm recognizes the same person in five consecutive frames before it speaks the name of that person.
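The five-consecutive-frame rule can be sketched as a small filter like the one below; the class name and the threshold parameter are illustrative.

    class ConsecutiveFilter:
        """Report a name only after it is seen in N consecutive frames."""

        def __init__(self, required=5):
            self.required = required
            self.current = None
            self.count = 0

        def update(self, name):
            if name == self.current:
                self.count += 1
            else:
                self.current, self.count = name, 1
            # Return the name exactly once, on the fifth consecutive match
            return name if self.count == self.required else None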

 

The current Interaction Assistant prototype device recognizes and speaks the names of people that it has previously stored in its database. The name of the person standing in front of the device is normally delivered discreetly to the user, but during demonstrations it is played through speakers, to make it audible to the audience. The prototype can be configured to use either of two different face recognition algorithms: Principal Component Analysis (PCA) or Distinctive Feature Analysis (DFA). The DFA algorithm is more reliable, but requires considerable processing time as each new face is added to the database. (Faces captured during a particular day might be added to the database overnight, to allow time for the intensive processing.) The PCA algorithm is typically used during demonstrations, because it permits members of the audience to come forward, have the device capture images of their faces, and then demonstrate that it can recognize them by speaking their names. (This "capture and learn" process takes about 30 seconds per person.)

 

In conclusion, the current implementation of the iCARE Interaction Assistant is aimed at recognizing faces to facilitate initial encounters, thus allowing a user to initiate social interactions [3]. Ongoing work is aimed at facilitating subsequent verbal interactions by interpreting non-verbal cues, such as eye contact, facial expressions, and gestures. The Interaction Assistant is just one component of the larger iCARE project [4], which is expected to produce relevant and practical knowledge for the future design of assistive devices that go beyond navigational aids to facilitate learning, social interaction, and communication, all of which are vital to success in today's professional world.

 

References

  1. M. A. Uddin and T. Shioyama. Measurement of Pedestrian Crossing Length Using Vector Geometry: An Image Based Technique. Proceedings of the IEEE International Midwest Symposium on Circuits and Systems (MWSCAS '04), 2004.
  2. M. Turk and A. Pentland. Face Recognition Using Eigenfaces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586-591, 1991.
  3. S. Krishna, G. Little, J. Black, and S. Panchanathan. A Wearable Face Recognition System. Proceedings of the International ACM SIGACCESS Conference on Computers and Accessibility, Baltimore, Oct. 9-12, 2005.
  4. http://cubic.asu.edu (Click on iCARE Projects)

 




Reprinted with author(s) permission. Author(s) retain copyright