Using XSLT to render accessible documents to the web
Dr. Carlos A. Velasco
GMD - German National Research Center for Information
Technology
Institute for Applied Information Technology (FIT.HEB)
Schloß Birlinghoven, D-53757 Sankt Augustin (Germany)
Carlos.Velasco-Nunez@gmd.de
CSUN 2001, 19-24 March 2001
Abstract:
This session introduces XML and its transformation language,
XSLT. It discusses the characteristics of XML, how to incorporate
accessibility requirements when defining XML vocabularies, and
how the properties of XSLT can help to render accessible
documents on the Internet.
1. Introduction
The landscape of new languages based upon the Extensible Markup
Language (XML) [1] of the World Wide Web Consortium (W3C) is
growing at a fast pace. The flexibility of these markup languages
makes them especially suitable for developing applications and
services for upcoming technologies.
Given the great variety of applications and rendering devices
foreseen, a key role will be played by XSL Transformations
(XSLT), the transformation language of the Extensible Stylesheet
Language (XSL), which transforms XML into other formats. XSLT is
important because HTML retains a dominant position over any new
technology seeking to take over the net, especially on the client
side, where the adoption of XML by most available user agents is
slow. From the accessibility standpoint, there are many elements
to consider. This paper discusses issues related to the creation
of accessible XML vocabularies, how to implement XSL
Transformations, and some of the available tools.
2. XML and Accessibility
The Extensible Markup Language is a markup meta-language
developed by the World Wide Web Consortium on the basis of the
Standard Generalized Markup Language (SGML, ISO 8879) [2] to
extend the capabilities of HTML and of the web. Although SGML is
a widespread industry standard that has been in use since the
late eighties and provides strong document management
capabilities, it was not developed with the Internet in mind. The
flexibility of SGML came at the cost of complexity, and the W3C
sought a simplification of SGML for web development.
As defined in the W3C recommendation, XML is a subset of SGML
developed to enable generic SGML to be served, received, and
processed on the Web in the way that is now possible with HTML.
Despite an initially slow take-up, the industry is adopting XML
at an increasing pace. XML will shortly become a de facto
standard for e-commerce and distributed applications. Several
factors make the future of XML promising:
It is well suited for data interchange between different
organizations and individuals. It will represent the «end
of proprietary file formats,» where the structure of your
documents is decided by the vendor of the tool used in your
organization. Even if the Document Type Definition (DTD) is not
available, XML documents can be read and exchanged, because they
are standard Unicode text files and their well-formedness can be
checked against the recommendation, which defines an unambiguous
mechanism for constraining the structure. Thus XML bridges the
gap between human-readable and machine-readable documents,
allowing a smooth and seamless interchange of information.
It allows information to be stored in a hierarchical format. It
must be stressed that XML is better adapted to exchanging
information with object databases, although these have not yet
reached the maturity and spread of relational databases.
It enables the use of smart software agents, and provides
Internet and local search engines with meaningful elements to
classify and sort documents with the help of the document markup
and content. It also allows the use of metadata [3,4].
It has a strict and consistent syntax that makes documents easy
to process.
The need for a new meta-language to create new markup languages
is justified by several shortcomings of HTML: it is not
extensible by the author (although browser manufacturers defined
proprietary tags in the past); it is display-centric (a
well-known accessibility hurdle); it is not directly reusable; it
provides only one view of the data; it has little or no semantic
structure; and it is not suitable for data exchange.
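Two of the properties listed above, namely that well-formedness
can be checked without access to a DTD and that information is
stored hierarchically, can be illustrated with a minimal sketch.
It is written in Python with the standard xml.etree.ElementTree
module, which is an assumption of this example and not a tool
discussed in this paper. The sample documents and their element
names (report, section, para) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the element names are illustrative only.
WELL_FORMED = """<?xml version="1.0"?>
<report>
  <title>Quarterly summary</title>
  <section>
    <heading>Sales</heading>
    <para>Sales grew in the first quarter.</para>
  </section>
</report>"""

# The <para> element is never closed, so this text is not well-formed.
NOT_WELL_FORMED = "<report><title>Quarterly summary</title><para>Oops</report>"


def is_well_formed(text):
    """Return True if text parses as well-formed XML. No DTD is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def outline(text):
    """Return (depth, tag) pairs describing the element hierarchy."""
    result = []

    def walk(element, depth):
        result.append((depth, element.tag))
        for child in element:
            walk(child, depth + 1)

    walk(ET.fromstring(text), 0)
    return result
```

Here is_well_formed accepts the first sample and rejects the
second, while outline reports the nesting depth of every element
in document order. Neither step needs to know the vocabulary in
advance, which is the point made in the list above.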
The W3C tried to eliminate many of the accessibility barriers
presented by HTML's display-centric elements by creating
Cascading Style Sheets [5] and deprecating most of the
presentational elements in the latest specification of HTML [6].
XML goes one step further because it allows the creation of
languages with a clear distinction between content and
presentation. This
characteristic is emphasized in the latest draft of the document
on XML accessibility published by the W3C [7]. The document
outlines four general guidelines to develop accessible DTDs and
accessible documents. It also defines an accessible document as a
document that can be equally understood by its targeted audience
regardless of the device used to access it.
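The distinction between content and presentation can be sketched
as follows. In practice an XSLT stylesheet would perform this
step; since the Python standard library has no XSLT processor,
the sketch below uses xml.etree.ElementTree as a stand-in and
should not be read as XSLT itself. The vocabulary (note, title,
para, emph) and both rendering functions are hypothetical and
serve only to show one source document feeding two presentations.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical vocabulary: <note>, <title>, <para>, and <emph> describe
# meaning only; nothing in the source says how they should look.
SOURCE = """<note>
  <title>Schedule change</title>
  <para>The session now starts at <emph>10:00</emph>.</para>
</note>"""


def inline_html(element):
    """Render the text and inline children of an element as HTML."""
    parts = [escape(element.text or "")]
    for child in element:
        if child.tag == "emph":
            parts.append("<em>" + escape(child.text or "") + "</em>")
        else:
            # Unknown inline elements fall back to their text content.
            parts.append(escape("".join(child.itertext())))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def to_html(root):
    """One presentation: an HTML fragment for a graphical browser."""
    lines = ["<h1>" + escape(root.findtext("title", "")) + "</h1>"]
    for para in root.findall("para"):
        lines.append("<p>" + inline_html(para) + "</p>")
    return "\n".join(lines)


def to_text(root):
    """Another presentation: plain text, e.g. for a text-only device."""
    lines = [root.findtext("title", "").upper()]
    for para in root.findall("para"):
        lines.append("".join(para.itertext()))
    return "\n".join(lines)
```

Parsing SOURCE once with ET.fromstring and passing the result to
to_html and to_text yields two renderings of the same content.
The source document is unchanged in both cases, so a new device
only requires a new rendering function, not a new document.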
From our point of view, these guidelines can be classified into
two groups: those related to documentation access, and those
related to design. Within the first group lies the need to export
the semantics of the Document Type Definition (DTD) used. This
implies documenting it in an accessible way (HTML or text),
publishing the specifications in known repositories (Schema.net
or XML.org), or making use of those already published by the W3C
and others. The rest of the recommendations lead to a set of
practical techniques to implement the guidelines:
Clearly identify multimedia elements and their formats. If the use
of W3C recommendations (SMIL [8], SVG [9]) is not feasible,
provide alternative content similarly to the