(X)HTML Tutorial

A Basic Web Page and Its Structure

A basic web page consists of a document type declaration, a <head> element, and a <body> element. The code looks like this:



   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
         <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
         <title>A Basic Web Page</title>
      </head>
      <body>
         <p>Hello!</p>
      </body>
   </html>



The Document Type Declaration (DOCTYPE)

The first line of this code (the one beginning with <!DOCTYPE) declares the document type, that is, which version of (X)HTML you are using. By declaring the document type at the beginning of the file, you indicate to the browser which language it is supposed to be interpreting. You will note that the version here is XHTML 1.0 Transitional. There is considerable debate about which version of HTML should be used, or whether you should use HTML or XHTML. This tutorial does not delve into those issues.
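
Without taking sides in that debate, the declarations for two other common versions are sketched below purely for comparison. A real file contains only one declaration, placed before anything else; the comments are there to label the examples and are not part of the declarations.

   <!-- XHTML 1.0 Strict -->
   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

   <!-- HTML 4.01 Transitional -->
   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">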

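You will notice that the document type declaration is contained within angle brackets, like the tags that follow it. Unlike them, it is not an element of the page; it is an instruction that tells the browser (or a validator) which set of rules the rest of the markup follows. Note also that the abbreviation "DTD" properly refers to the Document Type Definition, the file of rules that the declaration points to, rather than to the declaration itself. The first actual element is the <html> element, which indicates where the (X)HTML code begins and ends. Generally, web pages should begin and end with an <html> element. In the example above, the <html> element also has an attribute called "xmlns". This stands for XML Namespace. It identifies the vocabulary that the element names belong to (here, XHTML), so that an XML processor can tell them apart from elements with the same names in other XML-based languages.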

Note that you will see many web pages which lack the document type declaration and/or namespace, but which still display. This is because web browsers are designed to tolerate missing or incorrect markup. However, this comes at some cost. Without a document type declaration, most browsers fall back to a backward-compatible "quirks" rendering mode, in which the layout of a page can differ from one browser to another. In general, you should stick with best practices in your coding in order to ensure that your web pages will display consistently and correctly.


The <head> Element

The next element is the <head> element. A variety of information can be contained within the <head> element, but the important thing to remember is that nothing inside it will appear on the web page. In the example above, the <head> element contains two elements, a <meta> element and a <title> element. The <meta> element can take a variety of attributes; in the example above, the most important piece of information is the "charset" (character set) value inside the "content" attribute. The code "iso-8859-1" indicates ISO 8859-1, the Western European (Latin-1) character set published by the International Organization for Standardization. It is gradually being replaced by UTF-8 ("utf-8"), an encoding of the Unicode character set, which allows the browser to display accented letters and characters from many other writing systems not available in older character sets. If the <meta> element is not provided, and the web server does not specify a character set either, most browsers will default to whatever is in their option settings (generally a character set suited to the language of the country where the browser is distributed).
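
As a minimal sketch, assuming the file itself is saved in UTF-8, the <head> element from the example above could declare the newer encoding like this. Only the "charset" value changes; the declared value should match the encoding the file is actually saved in, or some characters may display incorrectly.

   <head>
      <!-- Assumption: this file is saved in UTF-8 -->
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      <title>A Basic Web Page</title>
   </head>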

The <title> element tells the browser what the title of the web page is. However, this title is not displayed on the web page itself. It generally appears in the browser's title bar or on the browser tab, depending upon which browser you are using.

The <head> element may also contain any of the following:

  1. Further <meta> elements containing various types of information, such as descriptions and keywords used by search engines for their indexes.
  2. Stylesheets or links to stylesheets written in languages like CSS. A typical stylesheet might define the size of the text or other formatting features on the page.
  3. Scripts written in languages such as JavaScript. A typical script does something on the page such as pop up an alert when an element in the browser window is clicked.
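
The three kinds of content listed above might be combined as in the following sketch. The description text, the file name "styles.css", and the function name "greet" are hypothetical, chosen only for illustration.

   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
      <!-- 1. A further <meta> element (the description text is made up) -->
      <meta name="description" content="A short example of basic web page structure." />
      <title>A Basic Web Page</title>
      <!-- 2. A link to an external stylesheet (hypothetical file name) -->
      <link rel="stylesheet" type="text/css" href="styles.css" />
      <!-- 2. A stylesheet written directly in the page: sets the paragraph text size -->
      <style type="text/css">
         p { font-size: 1.2em; }
      </style>
      <!-- 3. A script: defines a hypothetical function that pops up an alert -->
      <script type="text/javascript">
         function greet() { alert("Hello!"); }
      </script>
   </head>

With this in place, an element in the body could call the function when clicked, for example <p onclick="greet()">Hello!</p>.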

Browsers will display a page whose <head> element is empty, but a document must contain a <title> element to be valid XHTML 1.0. It is also good practice to include a <meta> element indicating the page's character set.


The <body> Element

The <body> element contains the content which will actually appear on the web page. In the example above, the <body> element has a <p> element inside it, which contains the message "Hello!". The name "p" stands for "paragraph", although this message is not, strictly speaking, a paragraph. This shows that (X)HTML markup is not always used in a strictly semantic way.
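
As a rough illustration of markup that matches the meaning of the content more closely, the body below uses a heading element for a heading and a <p> element for a full sentence. The wording of the text is invented for the example.

   <body>
      <!-- A heading is marked up as a heading rather than as a paragraph -->
      <h1>Greetings</h1>
      <!-- A complete sentence fits the meaning of "paragraph" more closely -->
      <p>Hello, and welcome to this basic web page.</p>
   </body>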