2001 Conference Proceedings


Transcoding System for the Non-Visual Web Access (2) -- Annotation-based Transcoding --

Chieko Asakawa
Hironobu Takagi
IBM Japan Ltd.
Tokyo Research Laboratory
1623-14, Shimotsuruma
Yamato-shi, Kanagawa-ken 242-8502


The role of the Web keeps widening, and Web authors tend to present as much information as possible on one page. A news site, for example, contains not only articles but also shopping lists, hobby-related information, advertisements and so on. This information is visually fragmented into groupings by means of visual effects such as different background colors, different fonts, layout tables and spacing. Blind users read Web content in tag order, but visually fragmented groupings are not accessible through tag-order reading, so this authoring trend has been making non-visual Web access harder.

Therefore, we decided to develop a transcoding system to improve non-visual Web access. It works as an intermediary between a server and a user. Our system consists of two parts, one for automatic transcoding and one for annotation-based transcoding. Both methods have pros and cons. Automatic transcoding can simplify a Web page without any manually produced annotations, but its use is sometimes limited because it cannot deal with visually fragmented groupings or with the distinct roles of the groups. Annotation-based transcoding can create an accessible page for voice output without removing any content, but it requires external annotations. We therefore use both methods to improve non-visual Web access. In this paper, we focus on annotation-based transcoding. The most important objective is to transcode existing Web pages, which are presented as two-dimensional information, so that they become accessible as one-dimensional information. After introducing the system architecture, we describe our proposed annotations, which can be provided by both sighted and blind annotators. We then show examples of transcoded pages with the annotations and discuss conclusions and plans.

Annotation-based transcoding

Architecture

Figure 1 -- Architecture of the Annotation-based Transcoder

Figure 1 shows the system architecture. The proxy server is the main component that transcodes a target HTML document. Users access our system simply by setting a proxy server for their browser. We use the IBM WebSphere Transcoding Publisher (WTP) as a proxy server, and our transcoding system is implemented as a plug-in for WTP. It consists of three main components: a transcoding module, an annotation manager and an annotation database. When the transcoding module receives a target HTML document, the annotation manager searches the annotation database for the document's URL, and WTP transcodes the target HTML document using the annotation file returned from the annotation database. The annotation server (not pictured) receives annotation files created by sighted annotators and registers them in the annotation database. Each annotation file is basically linked to one URL. However, one page is sometimes similar to others, and in such cases the annotation file can be shared among those pages. This is possible because our system evaluates the similarity of HTML documents.
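The lookup described above, including the sharing of one annotation file among similar pages, could be sketched roughly as follows. This is only an illustration: the paper does not say how the system evaluates similarity, so the comparison of start-tag sequences, the 0.9 threshold, and the names AnnotationDatabase, register and lookup are all assumptions, not the WTP plug-in's actual interface.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Collect the names of all start tags, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    parser = TagSequence()
    parser.feed(html)
    parser.close()
    return parser.tags


def similarity(html_a, html_b):
    """Assumed similarity measure: ratio in [0, 1] over the two tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()


class AnnotationDatabase:
    """Hypothetical in-memory stand-in for the annotation database."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off for "similar enough"
        self.entries = {}           # url -> (annotated page's HTML, annotation)

    def register(self, url, html, annotation):
        self.entries[url] = (html, annotation)

    def lookup(self, url, html):
        # An annotation registered for exactly this URL takes precedence.
        if url in self.entries:
            return self.entries[url][1]
        # Otherwise fall back to the most similar annotated page, if any.
        best, best_score = None, self.threshold
        for known_html, annotation in self.entries.values():
            score = similarity(html, known_html)
            if score >= best_score:
                best, best_score = annotation, score
        return best
```

With such a structure, two article pages generated from the same template would share one annotation, while a page with a different layout would get none and could be left to automatic transcoding.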

Visually-specified annotations

Visually-specified annotations have two components: structural annotations and commentary annotations.

Structural annotations

The system uses the structural annotations to recognize visually fragmented groupings and to record the importance and basic role of each group. Basically, there are three XML tags in this kind of annotation: <member>, <role> and <importance>. A <member> tag contains the one or more HTML elements that belong to one group. A <role> tag describes the role of the group. A visually fragmented group generally has a role such as main content, header or footer of a page, index, advertisement, and so on. An <importance> tag indicates the importance of the group.
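To make the three tags concrete, the following sketch shows what a structural annotation file might look like and how it could be read. Only the tag names <member>, <role> and <importance> come from the text above. The <annotation>, <group> and <element> wrappers, the path notation for addressing HTML elements, and the numeric importance scale are hypothetical choices made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file: wrapper elements, paths and numbers are assumed.
ANNOTATION = """
<annotation url="http://example.com/news">
  <group>
    <member><element path="/html/body/table[1]/tr[1]/td[2]"/></member>
    <role>main content</role>
    <importance>1</importance>
  </group>
  <group>
    <member>
      <element path="/html/body/table[1]/tr[1]/td[1]"/>
      <element path="/html/body/table[2]"/>
    </member>
    <role>index</role>
    <importance>2</importance>
  </group>
  <group>
    <member><element path="/html/body/div[1]"/></member>
    <role>advertisement</role>
    <importance>5</importance>
  </group>
</annotation>
"""


def parse_groups(xml_text):
    """Return one dict per annotated group: member paths, role, importance."""
    groups = []
    for group in ET.fromstring(xml_text).findall("group"):
        groups.append({
            "members": [e.get("path") for e in group.findall("member/element")],
            "role": group.findtext("role"),
            "importance": int(group.findtext("importance")),
        })
    return groups


groups = parse_groups(ANNOTATION)
```

Here the second group shows the case where two HTML elements that are far apart in tag order are declared to be one visual grouping.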

We have been prototyping a WYSIWYG authoring tool for this purpose. With this tool, sighted annotators can indicate each grouping using a mouse and a keyboard while looking at the IE screen. First, a sighted annotator collects the elements of a visually fragmented group. After all of the elements in one group are selected, they can be registered in a <member> tag as one group. Next, he or she annotates the grouping with a role. The system provides default roles as selection items, but when none of them fits the target group, annotators can describe the role themselves. When a role is selected from the selection items, an <importance> tag is defined automatically. For example, the main content is assigned as the most important group, while an advertisement has the lowest importance. When annotators describe a role themselves, they can also indicate its importance.
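The automatic assignment of importance might work along these lines. The text only states that main content ranks highest and advertisements lowest, so the intermediate ranks, the numeric scale (1 is most important) and the fallback value are assumptions of this sketch.

```python
# Assumed default ranks; only the two extremes are given in the text.
DEFAULT_IMPORTANCE = {
    "main content": 1,
    "index": 2,
    "header": 3,
    "footer": 4,
    "advertisement": 5,
}
FALLBACK_IMPORTANCE = 3  # assumed rank for a role the annotator typed in


def importance_for(role, override=None):
    """Use the annotator's explicit value if given, else the role's default."""
    if override is not None:
        return override
    return DEFAULT_IMPORTANCE.get(role, FALLBACK_IMPORTANCE)
```

For example, importance_for("main content") yields the top rank, while importance_for("weather box", override=2) keeps the value chosen by the annotator for a role that is not among the defaults.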

Commentary annotations

Commentary annotations are used to give a useful description of each group. They can also be used to describe HTML tags, such as <form> tags, <img> tags, <area> tags and so on. In the case of an <img> tag, there is currently no way to give users a description of an image on the fly when the image has no alt attribute (text description). Our annotation is described externally, so when an annotator thinks it is important to annotate an image with an explanation, it can be done easily.

In the case of a <form> tag, it might be annotated as "With this form, you can search for titles of the video library," or "For this input box, you can only input numbers." These are just examples, and actual commentary annotations are written freely. Any comment that improves non-visual Web access is helpful and appreciated.

Basically, there are two XML tags for commentary annotation: <member> and <comment>. A <member> tag contains the one or more HTML elements that form one grouping. A <comment> tag provides a comment for a <member> tag. With the authoring tool, annotators first select a group for a commentary annotation in the same way they select visually fragmented elements for a structural annotation. Then they describe the group in a comment.
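As an illustration, a commentary annotation and its use as a fallback description for an image without an alt attribute could look like the sketch below. The <member> and <comment> tag names and the first comment text are taken from the paper. The wrapper elements, the path notation, the second comment and the describe function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout; only <member> and <comment> are named in the paper.
COMMENTARY = """
<annotation>
  <group>
    <member><element path="/html/body/form[1]"/></member>
    <comment>With this form, you can search for titles of the video library.</comment>
  </group>
  <group>
    <member><element path="/html/body/img[2]"/></member>
    <comment>Map of the conference venue.</comment>
  </group>
</annotation>
"""


def load_comments(xml_text):
    """Map each annotated element path to the comment of its group."""
    comments = {}
    for group in ET.fromstring(xml_text).findall("group"):
        text = group.findtext("comment")
        for element in group.findall("member/element"):
            comments[element.get("path")] = text
    return comments


def describe(path, tag, attrs, comments):
    """Return the extra text a voice browser could read for an element."""
    comment = comments.get(path)
    if tag == "img":
        # Prefer the author's own alt text; fall back to the external comment.
        return attrs.get("alt") or comment or ""
    return comment or ""


comments = load_comments(COMMENTARY)
```

Because the comments live outside the page, the original HTML does not have to change for the image to receive a description.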

Both structural and commentary annotations are saved as external XML documents and registered into the annotation database.

User annotation

User annotations consist of two kinds of annotations, one for selecting the main content and one for selecting the most useful form. Any page that is shown via our transcoding proxy has a "Settings" link at the bottom of the page. When a user selects this link, the two links for "user annotation" appear in the settings menu.

An annotation authoring tool for blind users differs from one used by sighted annotators. We have been prototyping it as a form of Web application. This allows blind users to use it in an integrated way, as an extension to their regular surfing of the Net.

Selecting main content

When a user selects the "main content" command, the system inserts links carrying commands to the proxy at certain candidate elements on the screen. When a user selects one of these links, the system registers that position as the starting point of the main content in that page. The system chooses the candidate locations heuristically, based on our experience of where main content tends to begin. For example, a string which has more than 40 characters without any link, or a string which starts under a horizontal separation line, might be the start of the main content. Once the starting point of the main content in a page is registered, it can be used in two ways. One use is as the target of an image link with "skip to main content" in its alt attribute. The other is for moving information that was originally above the main content to the bottom of the page. In this way, a user can find the main content of the page very easily.
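The two example heuristics above could be approximated as in the following sketch. It is a simplification and not the system's actual analysis: it treats each uninterrupted text run as a "string", so a sentence split by inline markup is seen as several shorter runs, and the names CandidateFinder and find_candidates are invented for the example.

```python
from html.parser import HTMLParser

MIN_LENGTH = 40  # "more than 40 characters", as stated in the text


class CandidateFinder(HTMLParser):
    """Collect text runs that might mark the start of the main content."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0      # greater than 0 while inside an <a> element
        self.after_rule = False  # True until the first text following an <hr>
        self.candidates = []     # list of (reason, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
        elif tag == "hr":
            self.after_rule = True

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text:
            return  # ignore whitespace between tags
        if self.after_rule:
            self.candidates.append(("after horizontal rule", text))
        elif self.link_depth == 0 and len(text) > MIN_LENGTH:
            self.candidates.append(("long unlinked text", text))
        self.after_rule = False


def find_candidates(html):
    finder = CandidateFinder()
    finder.feed(html)
    finder.close()
    return finder.candidates
```

In the prototype described here, each candidate would become a link that the user can select to register the main content position. The sketch only returns the candidates and the reason each was chosen.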

Selecting the most useful form

When the "most useful form" option is selected, a link appears before the beginning of each <form> tag. By selecting one of these links, the user can choose the form that he or she uses most frequently in that page. After it is selected, the form always appears at the top of that page. If a user prefers to have only an input box and a submit button in the form, he or she can choose that option from the settings menu. In this way, a user can easily use Web search engines.
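Moving the chosen form to the top of the page might be sketched as follows. The regular-expression approach is an assumption chosen for brevity and is not how the proxy is described as working. It relies on the fact that HTML forms cannot be nested, and move_form_to_top is a hypothetical name.

```python
import re

# Forms cannot be nested in HTML, so a non-greedy match ends at the right tag.
FORM_RE = re.compile(r"<form\b.*?</form\s*>", re.IGNORECASE | re.DOTALL)
BODY_RE = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def move_form_to_top(html, form_index):
    """Return html with the form_index-th form (0-based) moved after <body>."""
    forms = list(FORM_RE.finditer(html))
    if not 0 <= form_index < len(forms):
        return html  # nothing registered, or the page has changed
    chosen = forms[form_index]
    form_html = chosen.group(0)
    remainder = html[:chosen.start()] + html[chosen.end():]
    body = BODY_RE.search(remainder)
    insert_at = body.end() if body else 0
    return remainder[:insert_at] + form_html + remainder[insert_at:]
```

The user's selection would be stored per page as the index of the form, so that the same form can be moved on every later visit.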

Examples of annotation-based transcoding
Figure 2 -- Original and Transcoded Web Pages

Figure 2-1 shows an original page of "ABC News". The page contains 8 visually fragmented groups. These groupings should be marked by sighted annotators. The system reorders the groups by role and importance, referring to the corresponding annotation file. Figure 2-2 shows the transcoded page. The bar labeled (A) in Figure 2-2 is an image map, and each area in the map links to the beginning of a group. The alt attribute of each <area> element contains the role of its group. In this way, a user can get an overview of a page and find the logical grouping of its contents.
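The reordering and the image-map index can be illustrated with the sketch below. It assumes the groups have already been extracted as HTML fragments, sorts by importance only (the paper also takes the role into account, in a way it does not spell out), and omits the image and area coordinates that a real image map needs. The transcode function and the anchor names are hypothetical.

```python
from html import escape


def transcode(groups):
    """Emit an index map followed by the groups, most important first."""
    # sorted() is stable, so groups of equal importance keep page order.
    ordered = sorted(groups, key=lambda g: g["importance"])
    areas, sections = [], []
    for number, group in enumerate(ordered, start=1):
        anchor = f"group{number}"
        # The alt text carries the role, so a voice browser reads it aloud.
        areas.append(f'<area href="#{anchor}" alt="{escape(group["role"])}">')
        sections.append(f'<a name="{anchor}"></a>{group["html"]}')
    index = '<map name="overview">' + "".join(areas) + "</map>"
    return index + "".join(sections)


page_groups = [
    {"role": "advertisement", "importance": 5, "html": "<div>Buy</div>"},
    {"role": "main content", "importance": 1, "html": "<p>Story</p>"},
    {"role": "index", "importance": 2, "html": "<ul></ul>"},
]
transcoded = transcode(page_groups)
```

Reading the resulting page in tag order, a user first hears the list of roles and then reaches the main content before the index and the advertisement.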

Conclusion and plans

We have prototyped both automatic and annotation-based transcoding systems and started user evaluations. On the Web there are many HTML documents with unexpected formats. Therefore, we need to test as many pages as possible before making the system more widely available. We plan to make this proxy system publicly available so its functions can be evaluated in practical use. At the same time, we need to develop a mechanism to encourage annotations by sighted volunteers. Our next planned feature is support for low vision requirements. We have started studying and prototyping color and size fitting for text and images.

Transcoding systems that are intended to make existing Web pages totally accessible will have some limitations. We hope that our annotation authoring tool will be used directly by Web authors. If they annotate their own pages with accessibility information, Web pages in general could be transcoded by our system into accessible pages based on the authors' annotations. Therefore we need to keep emphasizing the importance of Web accessibility to Web authors. Without their cooperation, full accessibility of the Web cannot be realized.


Reprinted with author(s) permission. Author(s) retain copyright.