Go to previous article
Go to next article
Return to 2002 Table of Contents
Kynn Bartlett, Director, Accessible Web Authoring
HTML Writers Guild, Inc.
Resources and Education Center, HTML Writers Guild
110 E. Wilshire Avenue, Suite G-1
Fullerton, California 92832
Fax (714) 526-4972
Extensible Markup Language (XML) is not quite a language itself, but rather a set of rules and restrictions for constructing other languages: a metalanguage, or data format. Files created according to these rules can be processed by XML-aware programs, which means you can use similar tools and processes with any XML-based language.
XML is structured and hierarchical in nature, which means it looks like a tree of data, with child and parent nodes -- similar to a file structure on a hard drive. Like HTML, XML uses the concept of "tags" which surround text and define elements. Elements can have attributes, text values, and child elements.
The rules for defining an XML language are encoded in an XML Document Type Definition (DTD), or in an XML Schema. A DTD is a file which describes what types of elements and attributes constitute the language and what values they can take. It's interesting to note that a DTD (or Schema) is optional; as long as a document follows the rules of XML, it's an XML document even if the specifics of the language aren't fully defined.
Here is a simple example of an XML file:
    <?xml version="1.0"?>
    <family>
      <human sex="male">
        <name>Kynn</name>
        <age>33</age>
      </human>
      <human sex="female">
        <name>Liz</name>
        <age>unknown</age>
      </human>
      <dog sex="male">
        <name>Kim</name>
        <age>12</age>
      </dog>
      <dog sex="female">
        <name>Angie</name>
        <age>12</age>
      </dog>
      <dog sex="female">
        <name>Nying</name>
        <age>12</age>
      </dog>
    </family>
This file defines a list of family members -- two humans and three dogs -- and lists their genders, names, and ages. The tagging system used here should be familiar to anyone who has used HTML before, although the elements themselves are not HTML. In this case, this would be something like "Family Markup Language" and we may or may not have a formal DTD describing the language.
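To make the "XML-aware programs" point concrete, here is a minimal Python sketch that reads the family file with the standard library's xml.etree.ElementTree parser. Nothing in it is specific to "Family Markup Language"; the function name list_members and the variable names are assumptions chosen for illustration, and no DTD is involved.

```python
import xml.etree.ElementTree as ET

# The family document from the example above, condensed.
FAMILY_XML = """<?xml version="1.0"?>
<family>
  <human sex="male"><name>Kynn</name><age>33</age></human>
  <human sex="female"><name>Liz</name><age>unknown</age></human>
  <dog sex="male"><name>Kim</name><age>12</age></dog>
  <dog sex="female"><name>Angie</name><age>12</age></dog>
  <dog sex="female"><name>Nying</name><age>12</age></dog>
</family>"""


def list_members(xml_text):
    """Return (element name, sex attribute, name text, age text) per child of <family>."""
    root = ET.fromstring(xml_text)
    return [
        (member.tag, member.get("sex"), member.findtext("name"), member.findtext("age"))
        for member in root
    ]


members = list_members(FAMILY_XML)
for kind, sex, name, age in members:
    print(f"{kind}: {name} ({sex}, age {age})")
```

The same generic parser would read any other well-formed XML language in the same way; only the element and attribute names the program looks for would change.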
XML and HTML both come from the same source -- SGML, Standard Generalized Markup Language, which is also a metalanguage. XML is simpler and more restrictive than SGML. The "rules" for XML include:
Extensible Hypertext Markup Language (XHTML) is simply HTML rewritten to conform to the rules given above. This means that instead of writing <BODY> or <Body>, authors writing in XHTML would use <body>. (Why lowercase? When the XHTML specification was written, the case-sensitive nature of XML meant that a standard way of writing HTML tags had to be chosen, and so it was arbitrarily decided that XHTML would be all lower-case. It wasn't quite a coin-toss, but it was close.)
XHTML is stricter than HTML, because of the rules listed above -- for example, you can't simply "forget" to close a <p> element. Also, all attribute values need to be quoted, and empty elements, such as <img>, <hr>, or <br> -- which don't "contain" any content -- have to be written with slashes: <img/>, <hr/>, and <br/>.
The tricky part about XHTML, however, is that it's not inherently backwards compatible with HTML. Tags like <hr/> confuse older HTML browsers, which don't understand the XML way of closing an element by putting a slash in the tag -- they think it's a tag called "aitch arr slash", which they ignore since it's not understood. However, by adding a space before the slash -- so it reads <hr /> ("aitch arr space slash") -- the HTML browser is pleased and displays the HTML naturally.
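The stricter rules can be checked mechanically. The sketch below, which assumes Python's standard xml.etree.ElementTree parser and a hypothetical helper named is_well_formed, feeds a few fragments to an XML parser and reports which ones it accepts. It tests only XML well-formedness, not validity against the XHTML DTD, so it would not catch a consistently uppercase <BODY>...</BODY> pair.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments; the file name and alt text are made up.
samples = {
    "closed paragraph": "<p>Hello</p>",
    "unclosed paragraph": "<p>Hello",
    "empty element with slash": "<div><hr /></div>",
    "empty element without slash": "<div><hr></div>",
    "quoted attribute": '<img src="dog.png" alt="Kim" />',
    "unquoted attribute": "<img src=dog.png alt=Kim />",
    "mismatched case": "<BODY><p>Hi</p></body>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```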
Here's a short and simple XHTML page:
    <?xml version="1.0"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "DTD/xhtml1-transitional.dtd">
    <html xml:lang="en">
      <head>
        <title>This is my page</title>
      </head>
      <body>
        <p>
          This is my page!
        </p>
        <hr />
        <p>
          <a href="mailto:firstname.lastname@example.org">Kynn</a>
        </p>
      </body>
    </html>
By itself, XHTML is not necessarily any more accessible than HTML; depending on how you create the page and what elements and attributes you use, you could create a highly accessible page, or a highly inaccessible page. The use of XHTML itself (or XML) does not automatically guarantee a page's accessibility.
Another XML-based language is Extensible Stylesheet Language Transformations, or XSLT for short. The name is a bit misleading, especially if you are used to Cascading Style Sheets (CSS) -- XSLT actually specifies a way to transform from one XML-based format to another. For example, you could write an XSLT transformation which changes from Family Markup Language to XHTML, as they both follow the rules of XML languages. In such a case, Family ML would be a "data language" that contains information on the specifics of the content's structure, while XHTML would be a "delivery language" sent to a browser for display to a user.
This type of transformation capability has a number of useful functions in web accessibility, specifically in the ability to separate content (data) from presentation (delivery).
While it's beyond the scope of this paper to describe the specifics of XSLT syntax, it's possible to define the types of transformations possible. XSLT allows for complete restructuring of the content, adding or removing content, selecting specific pieces or large portions of the data document and creating an entirely new delivery document. The use of differing stylesheets allows for that content to be transformed for any number of purposes, including specialized interfaces for users with specific requirements.
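The idea of turning a data document into a delivery document can be sketched without XSLT syntax. The Python below is a stand-in for a stylesheet, not XSLT itself: it reads Family ML and builds a small XHTML-style page. The function name family_to_xhtml, the page title, and the choice of a bulleted list are assumptions made for illustration, and the output omits the DOCTYPE and namespace declarations a complete XHTML page would carry.

```python
import xml.etree.ElementTree as ET

# Data document: the family file from earlier in the paper, condensed.
FAMILY_XML = """<family>
  <human sex="male"><name>Kynn</name><age>33</age></human>
  <human sex="female"><name>Liz</name><age>unknown</age></human>
  <dog sex="male"><name>Kim</name><age>12</age></dog>
  <dog sex="female"><name>Angie</name><age>12</age></dog>
  <dog sex="female"><name>Nying</name><age>12</age></dog>
</family>"""


def family_to_xhtml(xml_text):
    """Restructure Family ML into a delivery document: one list item per member."""
    family = ET.fromstring(xml_text)

    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = "Family members"
    body = ET.SubElement(html, "body")
    items = ET.SubElement(body, "ul")

    for member in family:
        item = ET.SubElement(items, "li")
        item.text = (
            f"{member.findtext('name')} "
            f"({member.tag}, {member.get('sex')}, age {member.findtext('age')})"
        )

    return ET.tostring(html, encoding="unicode")


page = family_to_xhtml(FAMILY_XML)
print(page)
```

A different function over the same source could select only the dogs, reorder the entries, or add explanatory text, which is the kind of restructuring a different stylesheet would perform.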
Composite Capability/Preference Profiles -- known as CC/PP -- is a W3C specification under development which allows users to record information about the way their system can display or gather information, and then transmit that to a web server. For example, a CC/PP profile could contain a statement, "I prefer not to see images" or "I don't have a sound card in my computer."
A CC/PP-enabled server is then able to respond with an appropriate version of the web page, tailored to the user's stated needs and desires. For example, it could remove images and send textual equivalents, if the user requested "no images."
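The "no images" case might look roughly like the following Python sketch. It makes no attempt to parse a real CC/PP profile: the profile here is a plain dictionary with a hypothetical "images" key, and the page fragment, file name, and function name tailor are likewise assumptions. It shows only the server-side step of swapping each image for its textual equivalent.

```python
import xml.etree.ElementTree as ET

# Illustrative page fragment; the image file and alt text are made up.
PAGE = '<body><p>Our dog: <img src="kim.jpg" alt="Kim, a 12-year-old dog" /></p></body>'


def tailor(page_xml, profile):
    """Return the page, replacing images with their alt text if the profile asks for it."""
    root = ET.fromstring(page_xml)
    if profile.get("images", True):
        return ET.tostring(root, encoding="unicode")

    # Take a snapshot of all elements first so the tree can be edited safely.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "img":
                replacement = ET.Element("span")
                replacement.text = child.get("alt", "")
                replacement.tail = child.tail  # keep any text that followed the image
                parent[index] = replacement

    return ET.tostring(root, encoding="unicode")


with_images = tailor(PAGE, {"images": True})
text_only = tailor(PAGE, {"images": False})
print(text_only)
```

The result is only as useful as the alt text the author supplied, so a server like this does not replace careful authoring.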
CC/PP was originally developed with the W3C by developers of cellular phones and programs that run on them, which need to know the physical characteristics of tiny displays in order to effect the best presentation for hundreds of different phone types. The same technology can also be used to deliver web pages to people with disabilities -- if they are willing to provide appropriate CC/PP profiles to those servers.
Okay, so let's put it all together and describe a system by which XML, XSLT, XHTML, and CC/PP can work together to produce an accessible user interface for users with specific needs.
The accessibility benefit of this approach, which is a single source, multiple interface model (rather than the traditional single source, single interface model of earlier web design), is that it allows each user to receive an optimal user interface -- one which is not merely "accessible" but also "usable." Rather than the screenreader version being a derivative of the graphical user's design, the screenreader user receives her own interface to the same content, made to work with her needs and preferences. Conflicts between the accommodations necessary for different disability types can be mitigated with such an approach, since different transformations can be used for different users.
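A single source, multiple interface model can be sketched as one data document plus a table of transformations, with the user's profile deciding which one runs. In the Python below, the "interface" profile key, the two transformation functions, and the markup each one emits are all hypothetical; a real system would use XSLT stylesheets and a CC/PP profile rather than Python functions and a dictionary.

```python
import xml.etree.ElementTree as ET

# Single source: a condensed Family ML document.
SOURCE = """<family>
  <human sex="male"><name>Kynn</name><age>33</age></human>
  <dog sex="male"><name>Kim</name><age>12</age></dog>
</family>"""


def as_table(family):
    """Tabular interface: one row per family member."""
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for label in ("Name", "Kind", "Sex", "Age"):
        ET.SubElement(header, "th").text = label
    for member in family:
        row = ET.SubElement(table, "tr")
        for value in (member.findtext("name"), member.tag,
                      member.get("sex"), member.findtext("age")):
            ET.SubElement(row, "td").text = value
    return table


def as_sentences(family):
    """Linear interface: one plain sentence per family member."""
    div = ET.Element("div")
    for member in family:
        ET.SubElement(div, "p").text = (
            f"{member.findtext('name')} is a {member.get('sex')} "
            f"{member.tag}, age {member.findtext('age')}."
        )
    return div


# Multiple interfaces: each profile value maps to its own transformation.
TRANSFORMS = {"visual": as_table, "speech": as_sentences}


def deliver(source_xml, profile):
    """Pick a transformation from the profile and apply it to the single source."""
    family = ET.fromstring(source_xml)
    transform = TRANSFORMS.get(profile.get("interface"), as_table)
    return ET.tostring(transform(family), encoding="unicode")


visual_page = deliver(SOURCE, {"interface": "visual"})
speech_page = deliver(SOURCE, {"interface": "speech"})
print(visual_page)
print(speech_page)
```

Neither output is derived from the other; both come directly from the same source, so a change to the data appears in every interface.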
XML 1.0, W3C Recommendation, 10 February 1998, updated 6 October 2000 http://www.w3.org/TR/2000/REC-xml-20001006
XHTML 1.0, W3C Recommendation, 26 January 2000 http://www.w3.org/TR/xhtml1
XSL Transformations, W3C Recommendation, 16 November 1999 http://www.w3.org/TR/xslt
CC/PP Structure and Vocabularies, W3C Working Draft, 15 March 2001 http://www.w3.org/TR/CCPP-struct-vocab/
What is CC/PP?, Kynn Bartlett, 1999 http://www.ccpp.org/
Principles of Device Independence, W3C Working Draft, 18 September 2001 http://www.w3.org/TR/2001/WD-di-princ-20010918/