2002 Conference Proceedings

CONDUCTING USABILITY TESTING WITH COMPUTER USERS WHO ARE BLIND OR VISUALLY IMPAIRED

Elaine Gerber, Ph.D.
Senior Research Associate
Policy Research and Program Evaluation Department
American Foundation for the Blind
11 Penn Plaza, Suite 300
New York NY 10001
phone (212) 502-7644
fax: (212) 502-7773
email: gerber@afb.net

Summary:

This paper presents findings from extensive usability and user's experience research conducted with computer users who are blind or visually impaired. It is also designed as an instructional seminar for web designers, software and assistive technology developers, researchers, and anyone else interested in learning how to conduct these tests; methodological considerations of different techniques with this population are discussed.

Body of Paper:

I. Introduction: Making The Most Out of Section 508.

Section 508 of the Rehabilitation Act now requires, among other things, that all web sites used by federal employees and members of the public seeking information and services from the federal government be accessible. The World Wide Web Consortium (W3C), through its Web Accessibility Initiative (WAI), has published guidelines that go beyond the legal requirements. Additionally, a number of automated tools (such as Bobby and WAVE) now check compliance with these standards. Although the field of accessible web design is growing, as any computer user knows, a site must also be usable in order to be truly accessible. There has been very little focus thus far on measuring whether what is technically "accessible" for individuals with visual impairments is actually usable. Very little research exists; only one presentation at the previous two CSUN conferences discussed usability findings. My research begins to redress this imbalance.

This paper presents findings from extensive usability and user's experience research conducted with computer users who are blind or visually impaired. I review previous research in this arena, illustrate how its methods can be adapted for this population, and present generalizable lessons drawn from three different website tests involving over 100 research participants. The paper is also designed as an instructional seminar for web designers, software and assistive technology developers, researchers, and anyone else interested in learning how to conduct these tests; methodological considerations of different techniques with this population are discussed.

II. Background: Introducing Strict "Usability" and "User's Experience"

The field of mainstream web design has made usability testing a mainstay; research methodologies have been tested and metrics developed in large part through the efforts of Jakob Nielsen (see <http://www.useit.com/alertbox> for a review of many of these studies). Nielsen has found that five users is generally a sufficient sample size to identify 80% of site-level usability problems (see, for example, the Alertbox issues of March 19, 2000, and May 3, 1998). There are, however, some exceptions, and I would add to his preexisting list some additional considerations based on my research among individuals with visual impairments. Specific methodological considerations are discussed below.
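
For readers who want the arithmetic behind the five-user heuristic, the short sketch below applies the problem-discovery model popularized by Nielsen and Landauer, in which the proportion of usability problems found by n users is 1 - (1 - L)^n. The value L = 0.31 (the chance that any single user exposes a given problem) is the average Nielsen reports for mainstream projects, assumed here for illustration rather than derived from our own tests:

    # Problem-discovery model: share of usability problems found by n users.
    # L = 0.31 is Nielsen's reported average, assumed here for illustration.
    def proportion_found(n_users, L=0.31):
        return 1.0 - (1.0 - L) ** n_users

    for n in range(1, 9):
        print(f"{n} users: {proportion_found(n):.0%} of problems found")
    # Under this model, five users uncover roughly 84% of problems,
    # which is the basis of the "five users is enough" rule of thumb.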

Nielsen and others in the usability field tend to refer to objective, quantitative tests as "usability studies": these gather objective metrics such as the time a task requires, the error rate, which keystrokes were used for navigation, and so on. Studies that expand the concept of usability more broadly are often said to gather "user's experience" data: they measure users' subjective satisfaction, why they would visit one site as opposed to others, and, for our purposes, how people with visual impairments conceptualize the web. Generally speaking, gathering "user's experience" data is more ethnographic in its approach; my emphasis was to understand how users approach and use a site, what they like and dislike about it qualitatively, and mainly, whether they perceive it as "accessible."
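
To make the distinction concrete, the sketch below shows the kind of per-task record a strict "usability study" might capture; the field names and sample values are illustrative assumptions, not a schema used in the research described here. "User's experience" data, by contrast, would live largely in free-form notes and transcripts rather than in fields like these:

    # Illustrative per-task usability record; field names are hypothetical.
    from dataclasses import dataclass, field

    @dataclass
    class TaskMetrics:
        participant_id: str
        task: str
        seconds_elapsed: float                 # time the task required
        errors: int                            # wrong links, dead ends, etc.
        keystrokes: list = field(default_factory=list)  # navigation keys used
        completed: bool = False

    record = TaskMetrics(
        participant_id="P01",
        task="Find the site's search feature",
        seconds_elapsed=312.0,
        errors=2,
        keystrokes=["Tab", "Tab", "Enter"],
        completed=True,
    )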

Previous usability and user's experience research has, almost without exception, been conducted with an able-bodied population, or at least with people whose use of assistive technologies was not apparent. To be fair, some of the work coming out of the Nielsen Norman Group does appear to have begun looking at people with disabilities. However, it is unclear at this point how extensive these studies are, what types of disabilities were examined, how the tests were conducted, or what the results were.

They are scheduled to present a one-day seminar at a conference in November (as I am planning to attend, I will update this paper in time for CSUN). While their studies are important because they provide a wealth of insight about use of the web, their results must be applied with caution, since they may not hold for a population that accesses the screen, and these sites, via a different medium. Below I discuss the details of test methodology and how it can be adapted to the needs of a sample of computer users who are visually impaired.

The only other research to date conducted with people who are blind or visually impaired was a pilot study involving four test subjects. In this work, Barnicle identified a number of questions pertinent to the field: How must testing techniques be adapted to accommodate the needs of participants? Will the study yield useful (i.e., generalizable) data? And how can the researcher know whether the obstacles encountered were due to the mainstream software application, the assistive technology, or the unique characteristics of an individual user? I hope to address some of those questions in this paper as well.

The research on which this paper is based consists of three rounds of web usability tests: two conducted for the purpose of revising the American Foundation for the Blind's (AFB) website <www.afb.org>, and (at least) one round of testing on the Centers for Medicare and Medicaid Services' (CMS, formerly HCFA) website <www.medicare.gov>. All of these tests included individuals who accessed the screen using screen reading and screen magnification technologies. AFB's testing involved 27 and 29 individual interviews in its two rounds; CMS testing included a minimum of 7 at-home individual interviews and 8 focus groups involving 43 participants. A total of 106 people have participated in the testing thus far (we anticipate further testing of the Medicare site prior to the CSUN conference, but it had not been completed by the time of this submission). Our selection process, the justification for our sampling, and obstacles to recruitment will be discussed in more detail during the presentation.

III. Methods: Everything You Need To Know To Conduct Usability Testing

The Policy Research and Program Evaluation Department (PRPE) at AFB is convinced that a combination of methods, in particular focus groups and individual interviews, works best: the ideal design should incorporate both quantitative and qualitative measures (i.e., usability and user's experience data), particularly because so little is known about the preferences of this population.

In-depth, in-person individual interviews (ideally conducted at the subject's natural workstation) take advantage of what is known in the field as the "thinking aloud" method. Because the data are gathered through observation, subjects are literally asked to "think aloud," telling the researcher what they are doing and why as they perform a variety of predetermined tasks. Tasks should be based on one's research needs as well as geared toward the research subject's interests; better data are collected when the individual involved is highly motivated. The benefits of in-depth interviews (or IDIs) are not necessarily different for a visually impaired sample than for a sighted one. That is, the researcher can observe what errors are being made, whether the subject is "lost" (thinking they are somewhere they are not), and numerous other scenarios where a difference in perspective between user and observer may arise. Additionally, clarification can be sought for vague descriptions used by participants, such as "over here" or "I like that part there." Similarly, being present while someone is working gives the opportunity to probe any new, unexpected issues that arise while they are actively engaged on a site. There is a great need, as identified by Barnicle, for further research to take place in "real life" settings.
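
As a rough illustration of how such a session might be recorded, the sketch below pairs predetermined tasks with timestamped observer notes; the task wording and note categories are hypothetical, not drawn from the AFB or CMS protocols:

    # Hypothetical think-aloud session log: predetermined tasks plus
    # timestamped observer notes (categories are illustrative).
    import time

    TASKS = [
        "Locate the organization's contact information",
        "Find and open the site's search feature",
    ]

    def log_note(log, category, detail):
        log.append({"time": time.strftime("%H:%M:%S"),
                    "category": category, "detail": detail})

    session_log = []
    log_note(session_log, "lost",
             "Subject believes she is on the home page; actually on the FAQ")
    log_note(session_log, "clarification",
             "Asked what 'that part there' referred to: the navigation bar")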

The second strategy, and one which deviates from the standard usability literature, is to test using telephone focus groups. Nielsen warns against using focus groups (he refers to in-person groups rather than ones conducted by phone) in part because the results can be misleading: individuals tend to focus on the hypothetical, or on "cool," fancy features which often obstruct the ability to use a site efficiently. We circumvented this difficulty by assigning practical "tasks" in advance of the focus groups: individuals were randomly assigned two tasks to complete and asked to spend about 10 minutes just "surfing." The concern that people bend the truth toward what they think you want to hear, or what is socially acceptable, was also addressed by this design, as only two people per group were assigned the same task, and it was clear that we, as members of the blindness community, understood consumers and had their best interests at heart.
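
A sketch of that assignment design appears below: tasks are dealt out at random so that no more than two participants in a group receive the same task. The task names, group size, and cap are placeholders chosen for illustration:

    # Random task assignment with at most two participants per group
    # sharing a task; uses rejection sampling for simplicity.
    import random

    def assign_tasks(participants, tasks, per_task_cap=2, tasks_each=2):
        while True:  # retry until every participant's tasks are distinct
            pool = tasks * per_task_cap
            random.shuffle(pool)
            hands = [pool[i * tasks_each:(i + 1) * tasks_each]
                     for i in range(len(participants))]
            if all(len(set(hand)) == tasks_each for hand in hands):
                return dict(zip(participants, hands))

    group = [f"P{i}" for i in range(1, 6)]          # one five-person group
    tasks = ["Use the site's search", "Find contact information",
             "Read a feature article", "Locate the glossary",
             "Sign up for the newsletter"]
    print(assign_tasks(group, tasks))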

The last two reasons that Nielsen warns against the use of focus groups may not apply to the majority of computer users who are visually impaired. First, he suggests that in focus groups users tell you what they believe they did, not what they actually did. Although I agree that memory is highly fallible, I would argue that this population is unique in that its members have trained themselves to be more dependent on memory than the average sighted user (both in terms of computer use and, most likely, life skills generally). Our data indicate that users can remember with a high degree of accuracy exactly which steps were taken to accomplish a particular task, which keystrokes or commands were used, and the wording of the error messages they received as a result. Examples from our research will be presented.

The second and major reason that this population differs from those on whom other usability research has been conducted is that these individuals are highly motivated. Being blind, this population has had limited access to graphical user interfaces (GUIs) and is accustomed to incompatible software and adaptive equipment; in other words, they are used to struggling to get the information they need. Most importantly, because this medium allows users to access information independently (some for the first time), they are extremely motivated. While we encouraged users not to spend more than a half hour on their task assignment, users usually could not complete the tasks in that time; however, very few stopped at a half hour, and some continued until they finished, taking upwards of 10 to 14 hours.

Because telephone focus groups can readily overcome the barriers to obtaining data from individuals with such a low-prevalence condition, it is important to try to overcome their limitations. I will outline methodological considerations unique to telephone focus groups. Other considerations specific to a visually impaired sample, such as ensuring an appropriate level of expertise, a range of visual impairments, and a range of software and hardware configurations, will also be discussed.

On the other hand, there are unique challenges to working with this population that ought to be considered as well. For example, the "five users" rule may not apply when testing with users of screen magnification software. We believe that far more users are required, as residual vision creates such varied experience that the preferences of this group (if it can even be called a single group) are also quite varied. For example, our results indicate that some individuals prefer lettering in all capitals because it is easier to read, while for others all caps make it harder to distinguish between the letters.
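
To give a feel for how quickly the required sample grows, the sketch below inverts the discovery model cited earlier, solving for the number of users needed to reach 80% problem coverage as the per-user discovery probability L falls, as the varied experience of residual vision would suggest. The L values are illustrative assumptions, not estimates from our testing:

    # Smallest n with 1 - (1 - L)**n >= target coverage.
    import math

    def users_needed(target, L):
        return math.ceil(math.log(1.0 - target) / math.log(1.0 - L))

    for L in (0.31, 0.15, 0.05):   # increasingly heterogeneous population
        print(f"L = {L:.2f}: {users_needed(0.80, L)} users for 80% coverage")
    # Prints 5, 10, and 32 users respectively: as experiences diverge,
    # the "five users" rule breaks down quickly.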

IV. Results and Beyond

Because one of the projects was still underway at the time of this submission, my preliminary conclusions are intentionally left somewhat vague. However, I can promise that concerns with navigation, with balancing the needs of content and design, and with PDF documents will be discussed. Additional results of these tests will be presented as generalizable findings, so that web designers and developers can make use of them.

In order to understand why a web page is considered "inaccessible," all the elements and their characteristics need to be examined by the parties involved: the webmaster, the screen reader manufacturer, and the end user. Yet, the only way to understand whether a web page is "usable" for the average user is to have a number of different people try it. The ultimate goal would be to build "user friendly" concerns into accessibility guidelines.


Reprinted with author(s) permission. Author(s) retain copyright.