The TEI schema specifies that the document root be the <TEI> element. Within this element, there is a <teiHeader> and a <text>. The header contains descriptive information about the document (e.g. authors, titles, revision history). It is distinct both from the prolog (the XML declaration) and from the front matter of the text itself, which is contained within the <text> element.
The <text> element contains a <front> element (title page, table of contents, etc.), a <body> element (the main body of the text), and a <back> element (appendices, indices, etc.).
So a basic TEI document would look like this:
<TEI>
  <teiHeader>
    <!-- [ description of the file ... ] -->
  </teiHeader>
  <text>
    <front>
      <!-- [ front matter ... ] -->
    </front>
    <body>
      <!-- [ body of text ... ] -->
    </body>
    <back>
      <!-- [ back matter ... ] -->
    </back>
  </text>
</TEI>
Additionally, the <text> element can be placed inside a <group> element for sequences of continuous but distinct texts (e.g. in anthologies).
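As a rough sketch of the anthology case, a composite text might be arranged as follows. In current TEI practice the <group> itself sits inside an outer <text>, which can carry front and back matter shared by the whole collection. The comments are placeholders rather than real content.

```xml
<text>
  <front>
    <!-- [ front matter for the whole anthology ... ] -->
  </front>
  <group>
    <text>
      <body>
        <!-- [ first text ... ] -->
      </body>
    </text>
    <text>
      <body>
        <!-- [ second text ... ] -->
      </body>
    </text>
  </group>
</text>
```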
<p> | Marks paragraphs in prose texts. |
<div> | Marks divisions in texts without specifying the semantic nature of the division. |
<div> elements can be nested inside each other, and <p> elements can go inside <div> elements. Two other elements may be useful:
<head> | Used for material at the beginning of a <div> (e.g. a heading). |
<trailer> | Used for material at the end of a <div> (e.g. a closing title or footer). |
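A minimal sketch of how these four elements might fit together is given below. The wording of the heading, paragraphs, and trailer is invented for illustration. Note the order inside a <div>: the <head> comes first, then paragraphs, then any nested <div>, and the <trailer> closes the division.

```xml
<div>
  <head>Chapter One</head>
  <p>First paragraph of the chapter ...</p>
  <div>
    <head>A nested section</head>
    <p>A paragraph inside the nested division ...</p>
  </div>
  <trailer>Here ends Chapter One</trailer>
</div>
```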
As in (x)html, elements can have attributes. Some attributes can be applied to any element. These are known as global attributes. They are listed below:
type | Classifies the element, with values like “chapter”, “story”, “poem”, “song”, etc. Strictly speaking it is not global, but it is available on many elements, including <div>. |
xml:id | The value is a unique identifier for the element. Useful for links and cross references. |
n | The value is a number or label for the element. It can be more descriptive than an ID and need not be unique. |
xml:lang | The value indicates the language of the element. Conventional language codes are: en: English, enm: Middle English, fr or fra: French, la or lat: Latin |
Attributes are referred to using the @ sign (e.g. @type). However, when placed in an element, they are coded as in (x)html, e.g. <div xml:lang="enm">.
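The following sketch shows several of these attributes on a single element. The identifier and label values ("poem01", "1") are hypothetical. Attribute values must be written in straight quotation marks, not typographic ones.

```xml
<div type="poem" xml:id="poem01" n="1" xml:lang="enm">
  <head>...</head>
  <p>...</p>
</div>
```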
<l> | Indicates a verse line. |
<lg> | Indicates a group of lines functioning as a formal unit (e.g. a stanza). |
<pb/> | Indicates a typographic page break. |
<lb/> | Indicates a typographic line break. |
<hi> | Indicates that the text is graphically distinct from surrounding text (but does not specify a reason). |
<emph> | Indicates that the text is more linguistically or rhetorically emphatic. |
<foreign> | Indicates that the language of the text is different from the surrounding text. |
<gloss> | Indicates that the text is a gloss. |
<title> | Indicates that the text is a title. |
<note> | Indicates that the text is a note, footnote, or endnote. |
<label> | Indicates that the text is a label. |
<g> | Represents a glyph or non-standard character. |
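As an illustration, a short passage combining verse and phrase-level tagging might look like the sketch below. The verse lines are the opening of Chaucer's General Prologue. The prose sentence, the note, and the attribute values are invented for the example.

```xml
<div>
  <lg type="couplet" xml:lang="enm">
    <l n="1">Whan that Aprill with his shoures soote</l>
    <l n="2">The droghte of March hath perced to the roote,</l>
  </lg>
  <pb n="2"/>
  <p>The phrase <foreign xml:lang="la">in medias res</foreign> is
    <emph>not</emph> a title, but <title>The Canterbury Tales</title> is.
    <note>An editorial note on the sentence above.</note></p>
</div>
```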
If it becomes necessary to specify how the text should appear (e.g. when reproducing meaningful formatting from a printed edition), it is possible to use the @rend attribute. For instance, <hi rend="italic">The Great Gatsby</hi>. Note that in most cases it would be better to use the <title> element, e.g. <title>The Great Gatsby</title>.
Cross references and links to other documents can be created with reference, pointer, or anchor elements.
<ref> | Indicates that the text is a reference to an id or URL. |
<ptr/> | Indicates a pointer to an id or URL before or after an element. |
<anchor/> | Marks a point in the text, not attached to a specific element, so that a link can point to it. |
<seg> | Indicates a segment or sequence of words to which a link can be attached. |
Note that the <ref> element contains text, e.g. ‘See <ref target="#SEC12">Section 12</ref>’. By contrast, <ptr/> is an empty element, e.g. ‘See <ptr target="#SEC12"/>’. If the place being referred to is not a specific element, it can be marked with an <anchor/> element.
The segment (<seg>) element can also be used to attach a reference to a sequence of words not otherwise tagged: e.g. <seg type="target" xml:id="EFGH">.
The @target attribute may also contain a URL (a link to another document).
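The sketch below puts the linking elements together. The identifiers SEC12 and EFGH come from the examples above. The identifier ABCD, the sample wording, and the choice of URL are assumptions made for illustration. A target beginning with # points to an xml:id in the same document.

```xml
<div xml:id="SEC12">
  <head>Section 12</head>
  <p>Some text with a marked point <anchor xml:id="ABCD"/> and
    <seg type="target" xml:id="EFGH">a tagged sequence of words</seg>.</p>
</div>
<div>
  <p>See <ref target="#SEC12">Section 12</ref>, or simply <ptr target="#SEC12"/>.
    See also <ref target="#EFGH">this phrase</ref>, <ptr target="#ABCD"/>, and
    <ref target="https://www.tei-c.org/">the TEI website</ref>.</p>
</div>
```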
<corr> | Indicates that the text is a correction. |
<sic> | Indicates that the text is reproduced from the original, even though it is apparently incorrect or inaccurate. |
<reg> | Indicates that the text has been regularised. |
<orig> | Indicates that the text is an original reading that has been kept. |
The last two can be combined:
<choice>
  <orig>Reading A</orig>
  <reg>Reading B</reg>
</choice>
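The first two elements in the table can be paired in the same way. In this sketch the misspelling is invented for illustration.

```xml
<choice>
  <sic>recieve</sic>
  <corr>receive</corr>
</choice>
```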
<add> | Indicates that the text is a scribal addition. |
<gap> | Indicates that the text has been left out because of illegibility or editorial decision. |
<del> | Indicates that the text is a scribal deletion. |
<unclear> | Indicates that the text is hard to read, so that the transcription is uncertain. |
@resp | Indicates which editor is responsible for the change. |
@hand | Indicates which hand (scribe or writer) made the change. |
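A hedged sketch of these elements and attributes in use follows. The sentence and the pointers #scribeB and #ed1 are hypothetical; they would need to match identifiers defined elsewhere in the document, typically in the header. The attributes @place, @reason, @quantity, and @unit are not described above but are standard TEI attributes for recording where an addition sits and how much text is missing.

```xml
<p>The king <del hand="#scribeB">rode</del>
  <add place="above" hand="#scribeB">walked</add>
  to the <unclear>castle</unclear> and
  <gap reason="illegible" quantity="2" unit="word"/>
  at <corr resp="#ed1">dawn</corr>.</p>
```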
<abbr> | Indicates that the text is an abbreviation. |
<expan> | Indicates that the text is an expansion of an abbreviation. |
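An abbreviation and its expansion can be recorded side by side with <choice>, as in this small sketch:

```xml
<choice>
  <abbr>Dr.</abbr>
  <expan>Doctor</expan>
</choice>
```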
<name> | Indicates that the text is a name or proper noun. |
<date> | Indicates that the text is a date. |
<num> | Indicates that the text is a number. |
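These elements can also carry a normalised value alongside the text as written. The sketch below assumes the standard TEI attributes @when (on <date>) and @value (on <num>), which are not described above. The type value "person" is one possible classification.

```xml
<p>On <date when="1925-04-10">10 April 1925</date>,
  <name type="person">F. Scott Fitzgerald</name> published a novel in
  <num value="9">nine</num> chapters.</p>
```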
<bibl> | Indicates that the text is a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. |
<author> | In a bibliographic reference, contains the name of the author(s), personal or corporate, of a work. |
<biblScope> | Defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. |
<date> | Contains a date in any format. |
<editor> | Indicates an editor. |
<publisher> | Indicates a publisher. |
<pubPlace> | Indicates a place of publication. |
<title> | Indicates a title. |
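As an illustration, a citation tagged with these elements might look like the following sketch. The page range in <biblScope> is a placeholder, and <editor> is omitted because this work has none to record. Punctuation between the parts is kept as ordinary text inside <bibl>.

```xml
<bibl>
  <author>F. Scott Fitzgerald</author>,
  <title>The Great Gatsby</title>
  (<pubPlace>New York</pubPlace>:
  <publisher>Charles Scribner's Sons</publisher>,
  <date when="1925">1925</date>),
  <biblScope unit="page">1-10</biblScope>.
</bibl>
```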
<s> | Indicates an orthographic sentence. |
<seg> | Indicates a larger structure containing sentences. |
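A brief sketch of sentence-level segmentation is shown below. The sentences and the type value "exchange" are invented for illustration.

```xml
<p>
  <seg type="exchange">
    <s n="1">Who goes there?</s>
    <s n="2">A friend.</s>
  </seg>
  <s n="3">The gate opened.</s>
</p>
```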
@scribe | Indicates the scribe of the text. |
@script | Indicates the script of the text. |
@medium | Indicates the tint or type of ink. |
@scope | Indicates whether the hand is the sole, major, or minor hand in the manuscript. |
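The sketch below assumes these attributes are placed on <handNote> elements inside a <handDesc>, which is where the TEI manuscript description module provides them. The identifiers, scribe names, script names, and ink descriptions are all hypothetical.

```xml
<handDesc>
  <handNote xml:id="handA" scribe="scribeA" script="anglicana"
            medium="brown-ink" scope="major">
    Writes the main text.
  </handNote>
  <handNote xml:id="handB" scribe="scribeB" script="secretary"
            medium="red-ink" scope="minor">
    Adds corrections and marginal notes.
  </handNote>
</handDesc>
```

An addition in the text could then point back to one of these hands, e.g. <add hand="#handB">.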