Basic XML

The W3Schools "What is XML" tutorial (especially the first two pages) is an excellent introduction to the concepts of XML and how it differs from HTML.

In essence, the two languages differ in philosophy. Web browsers are designed to compensate for coding errors and to display a web page even if its HTML is malformed, which makes their handling of bad markup complex and sometimes unpredictable. XML parsers, by contrast, are required to stop processing and report an error when they encounter markup that is not well formed. An XML document must therefore be entirely well formed.

This document is intended to be a quick summary for you to refer to so that you can write well-formed XML.

Rules for Well-Formed XML

1. XML elements must have a closing tag.

  • Elements with content look like this: <name>John Smith</name>.
  • Empty elements (such as one for a line break) look like this: <lb/>.

2. XML tags are case sensitive. The element <p> is a different element from <P>, and an opening tag and its closing tag must match in case. So <p>This is a paragraph</P> is not well formed.

3. XML tags must be properly nested:

  • <p><name>John Smith</p></name> -- Not well formed
  • <p><name>John Smith</name></p> -- Well formed

4. XML attribute values must be enclosed in quotes:

  • <name type=last>Smith</name> -- Not well formed
  • <name type="last">Smith</name> -- Well formed
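The four rules above can be checked mechanically by handing each snippet to an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an illustrative assumption, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(snippet: str) -> bool:
    """Return True if the parser accepts the snippet, False if it raises ParseError.

    is_well_formed is a hypothetical helper written for this illustration.
    """
    try:
        ET.fromstring(snippet)
        return True
    except ET.ParseError:
        return False


# The snippets used in rules 1 to 4 above.
examples = [
    "<name>John Smith</name>",           # rule 1: element with content
    "<lb/>",                             # rule 1: empty element
    "<p>This is a paragraph</P>",        # rule 2: case mismatch
    "<p><name>John Smith</p></name>",    # rule 3: badly nested
    "<p><name>John Smith</name></p>",    # rule 3: properly nested
    "<name type=last>Smith</name>",      # rule 4: unquoted attribute value
    '<name type="last">Smith</name>',    # rule 4: quoted attribute value
]

for snippet in examples:
    verdict = "well formed" if is_well_formed(snippet) else "NOT well formed"
    print(f"{snippet}  --  {verdict}")
```

A parser only reports the first error it meets, so a document with several problems has to be fixed and re-checked one error at a time.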

Other Considerations

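1. An XML document should begin with the XML declaration. The declaration is optional in XML 1.0, but if it is present it must be the very first thing in the file, with nothing (not even white space) before it: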

<?xml version="1.0"?>

2. XML documents must have exactly one root element (also called the root node) -- an element that contains all other elements. In (X)HTML, the root element is <html>, but in XML it can have any name:

<?xml version="1.0"?>
<root>
... Content goes here ..
</root>
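As a rough illustration of the single-root requirement, the sketch below parses a small document with Python's standard xml.etree.ElementTree module and then tries a second document that has two top-level elements. The element names root, a, and b are placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# A document with the XML declaration and a single root element.
document = '<?xml version="1.0"?>\n<root>\n  <name>John Smith</name>\n</root>'
root = ET.fromstring(document)
print("Root element:", root.tag)
print("Children:", [child.tag for child in root])

# Two top-level elements: there is no single root, so parsing fails.
two_roots = '<?xml version="1.0"?>\n<a/>\n<b/>'
try:
    ET.fromstring(two_roots)
    two_roots_accepted = True
except ET.ParseError as error:
    two_roots_accepted = False
    print("Rejected:", error)
```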

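3. As in (X)HTML, some characters have special meanings in XML. If you want to use them in your content, you need to replace them with entity references. Only < and & must always be replaced; the others need replacing only in certain contexts (for example, a quotation mark inside an attribute value that is itself delimited by quotation marks), but replacing all five is a safe habit.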

Character   XML Entity   Character Name
<           &lt;         less than
>           &gt;         greater than
&           &amp;        ampersand
'           &apos;       apostrophe
"           &quot;       quotation mark

For further information on special characters used for Old English, see Special Characters in XML.
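When XML is generated by a program, the entity references in the table above are usually applied by a library rather than by hand. The sketch below assumes Python's standard xml.sax.saxutils.escape function, which replaces &, < and > by default; the two quotation characters are passed in as extra entities. The sample sentence is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Invented sample text containing several of the special characters.
text = 'Tom & Jerry say "1 < 2"'

# escape() handles &, < and > itself; the mapping adds the two quote characters.
escaped = escape(text, {"'": "&apos;", '"': "&quot;"})
print(escaped)

# A parser turns the entity references back into the original characters.
element = ET.fromstring(f"<p>{escaped}</p>")
print(element.text)
```

Escaping the ampersand first matters: otherwise the & inside an already inserted &lt; would itself be escaped a second time. The library function takes care of that ordering.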

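4. In XML, white space is preserved and passed on to the program reading the document, unlike in (X)HTML, where runs of white space are collapsed into a single space when the page is rendered: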

Code                This is a      big space.
(X)HTML Rendering   This is a big space.
XML Rendering       This is a      big space.
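The difference can be seen by reading the text back from a parser. This is a small sketch using Python's standard xml.etree.ElementTree module; the collapsing step only imitates what a browser does when rendering (X)HTML and is not part of XML processing.

```python
import xml.etree.ElementTree as ET

# Build the sample sentence with exactly six spaces between "a" and "big".
original = "This is a" + " " * 6 + "big space."
element = ET.fromstring(f"<p>{original}</p>")

# The parser hands back the text with every space intact.
print(repr(element.text))

# Imitation of browser-style rendering: runs of white space become one space.
collapsed = " ".join(element.text.split())
print(repr(collapsed))
```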

5. XML elements are extensible. That is, the names of the elements are pre-defined in a schema, to which you can add new elements. If you wish to extend the schema with your own elements, see the XML Schools advice on defining element and attribute names and their usage.
