(X)HTML Tutorial -- The Semantic Web

The Semantic Web

At this point, we need to return to the history of (X)HTML. As early as 1999, there was talk of a Web 2.0 (necessitating the invention of a previously existing Web 1.0) as the future of the internet. Whereas Web 1.0 largely involved using web browsers to retrieve information in the form of web pages, Web 2.0 would involve greater interaction between users and the data, as well as the formation of communities around sets of data. This vision of Web 2.0 has largely come to pass, both commercially and socially, as we can see from web sites like Amazon.com, Facebook, or innumerable blogs. One requirement of this sort of internet interactivity is attention to coding standards and, increasingly, to methods of representing data that can be analysed by computers. Here is an example of what this means. Consider the following code:


   <p>Box of Apples: $5.99</p>

This would naturally be displayed in the web browser without the <p> tags. To a human, the meaning is obvious: a box of apples costs $5.99. But a computer can only say that the string of characters "$5.99" follows the string of characters "Box of Apples", or that "of Apples: $5.99" follows "Box ", and so on. None of these observations tells the computer what the text means. Nothing in the markup identifies the product (apples) or its cost, so there is no reliable way to ask a computer to find either one on a large web page. At best, we can ask it to find a paragraph, but that might not have the information we want.

This is why semantic coding has taken on increasing importance in Web 2.0. The more we can describe data using semantic elements, the more that information can be analysed by computers. The importance of this coding method has even led people to begin to envision a Web 3.0, in which all data on the web would be described semantically, opening all of it to computer-based analysis and processing. This is known as the Semantic Web. There is considerable debate about whether such a semantic web is possible (or desirable), but one thing is clear: (X)HTML is not the technology to produce it. Although, as we have seen, (X)HTML has some semantic elements, there are not nearly enough to describe all the possible types of data. A much better technology for Web 3.0 is XML, where the data can have an unlimited number of elements and is accompanied by a description of those elements.
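
As a rough sketch of what this could look like, here is the box of apples again in XML. The element names (product, name, price) and the currency attribute are invented for this illustration and do not come from any standard vocabulary. The lines inside the DOCTYPE declaration are one possible form of the accompanying description of those elements.

   <?xml version="1.0" encoding="UTF-8"?>
   <!DOCTYPE product [
      <!-- Hypothetical description: a product is a name followed by a price -->
      <!ELEMENT product (name, price)>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT price (#PCDATA)>
      <!-- Every price must say which currency it is in -->
      <!ATTLIST price currency CDATA #REQUIRED>
   ]>
   <product>
      <name>Box of Apples</name>
      <price currency="USD">5.99</price>
   </product>

With markup like this, a program can be asked for the content of the price element rather than guessing which characters on the page might be a cost.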

Nevertheless, the thinking behind the semantic web has ramifications for how we produce web pages. The most important is the principle of separating content from presentation. Wherever possible, content should be represented in (preferably semantic) (X)HTML elements. The presentation, style, or appearance of that content should be conveyed using a different system of markup--normally CSS--which can be ignored by computers mining the text for data. Whilst we may not be able to fulfill the vision of the semantic web through (X)HTML, we can bring our coding practices closer to it, thus ensuring that our web pages have greater longevity and compatibility with emerging Web 2.0 (and Web 3.0) technologies.
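
The difference can be sketched with the box of apples once more. This is only a preview, since CSS syntax is covered later. The class name "sale" is made up for the example, and the page is assumed to be XHTML 1.0 Transitional, where the older font element is still permitted.

   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <title>Apples</title>
      <style type="text/css">
         /* Presentation lives here, apart from the content */
         strong.sale { color: red; }
      </style>
   </head>
   <body>
      <!-- Presentational: the appearance is mixed into the content -->
      <p><font color="red"><b>Sale:</b></font> Box of Apples: $5.99</p>

      <!-- Semantic: the element says what the text is; the CSS above says how it looks -->
      <p><strong class="sale">Sale:</strong> Box of Apples: $5.99</p>
   </body>
   </html>

In a typical browser both paragraphs should look much the same, but only the second tells a program that "Sale:" is emphasised text, and its colour can be changed in the style rule without touching the content.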

Learning (X)HTML therefore requires learning the CSS alternatives to traditional (X)HTML forms of presentational markup. As a result, this tutorial will discuss some basic CSS syntax before introducing further (X)HTML techniques such as changing fonts or creating tables.