%oddext;
%dtdmods;
]>
Skeleton file for TEI P2, Chapter for Prose Base.
22 Jun 92 : CMSMcQ : made file to drive LB's text
Base Tag Sets
Base Tag Set for Prose
This section describes the basic tag set for prose documents. The
textual features considered here include such basic structural features
as the chapters into which the body of a modern prose text is divided;
the sections and subsections into which chapters are in turn divided;
the index, glossary, bibliography, and other parts of the back matter;
the title page, contents, preface and other components which together
make up the front matter.
For discussions of these parts of the
book see, e.g., Esdaile's Manual of
Bibliography, revised edition by Roy Stokes (London: George
Allen and Unwin, Ltd., 1931; 1967); The
Chicago Manual of Style (Chicago and London: University of
Chicago Press, 1969; 13th edition 1982), especially chapter
1. The TEI core structural elements tag set was developed with special
attention to Annex E of ISO 8879, the SGML standard, and to BK-1, the
SGML document type definition for books and dissertations included in
the AAP tag set, for which see AAP (Association of American
Publishers), Author's Guide to Electronic Manuscript Preparation
and Markup, Version 2.0 Revised Edition ([Dublin, Ohio]:
Electronic Publishing Special Interest Group (EPSIG), 1989).
Prose texts may be regarded either as unitary, that is,
forming an organic whole, or as composite, that is,
consisting of several components which are in some important sense
independent of each other. The distinction is not always entirely
obvious: for example a collection of essays might be regarded as a
single item in some circumstances, or as a number of distinct items in
others. In such borderline cases, the encoder must choose whether to
treat the text as unitary or composite; each may have advantages and
disadvantages in a given situation.
Whether unitary or composite, the text is marked with the
text tag and may contain front matter, a text body, and back
matter. In unitary texts, the text body is tagged body; in
composite texts, where the text body consists of a series of subordinate
or included texts, it is tagged group. The overall structure
of any text, unitary or composite, is thus defined by the following
elements:
text: contains a single text of any kind, whether unitary or
  composite, for example a poem, drama, collection of essays, or novel.
front: contains any prefatory matter (headers, title page, prefaces,
  dedications, etc.) found before the start of a text proper.
body: contains the whole body of a single unitary text, excluding any
  front or back matter.
group: contains the body of a composite text, grouping together a
  sequence of distinct texts (or groups of such texts) which are regarded
  as a unit for some purpose, for example the collected works of an
  author, a sequence of prose essays, etc.
back: contains any appendixes etc. following the main part of a text.
For further discussion of composite texts and corpora, see chapter . The remainder of this section deals primarily with unitary
texts consisting of prose.
The overall structure of a unitary text is thus:
...
]]>
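The distinction between unitary and composite texts can be sketched in a few lines of Python. This is only an illustration: the samples are written in XML syntax rather than SGML so that a standard parser can read them, and the sample content and the function name classify are assumptions made for this sketch, not part of the tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical unitary text: front matter, a body, and back matter.
UNITARY = """
<text>
  <front><div type="preface"><p>...</p></div></front>
  <body><p>...</p></body>
  <back><div type="appendix"><p>...</p></div></back>
</text>
"""

# Hypothetical composite text: the body is replaced by a group of texts.
COMPOSITE = """
<text>
  <front><div type="preface"><p>...</p></div></front>
  <group>
    <text><body><p>...</p></body></text>
    <text><body><p>...</p></body></text>
  </group>
</text>
"""

def classify(source: str) -> str:
    """Return 'unitary' if the text has a body, 'composite' if it has a group."""
    root = ET.fromstring(source)
    if root.tag != "text":
        raise ValueError("expected a text element")
    children = [child.tag for child in root]
    has_body = "body" in children
    has_group = "group" in children
    if has_body == has_group:
        raise ValueError("a text needs exactly one of body or group")
    return "unitary" if has_body else "composite"

print(classify(UNITARY))
print(classify(COMPOSITE))
```

The check looks only at the immediate children of the outermost text, so the body elements of the texts inside a group do not affect the result.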
Each of these elements and their content is further described
in the following subsections. The formal declarations for
text and body are as follows:
]]>
Other textual elements, such as paragraphs, lists or phrases, which nest
within these major structural elements, are discussed in
chapter .
Divisions of the Body
In some texts, the body consists simply of a sequence of low-level
structural items such as the paragraphs or lists discussed in chapter
. In many cases however sequences of such elements will
be grouped together hierarchically into textual divisions and
subdivisions, such as chapters or sections. The names used for these
structural subdivisions of texts vary with the genre and period of the
text, or even with the whim of the author, editor or publisher. For
example, a major subdivision of an epic or of the Bible is generally
called a book, that of a report is usually called a part,
that of a novel a chapter (unless it is an epistolary novel,
in which case it may be called a letter).
To cater for this variety, these Guidelines propose that all such
textual divisions be regarded as occurrences of the same
neutrally-named elements, with an attribute type
used to categorize further elements at a given hierarchic level.
Two alternative styles are provided for the naming of these
neutral divisions: numbered and un-numbered.
Numbered divisions are named div0, div1,
div2, etc., where the number indicates the depth of this
particular division within the hierarchy, the largest such division
being div0, any subdivision within it being div1, any
further sub-sub-division being div2 and so on. Un-numbered
divisions are simply named div, and allowed to nest recursively
to indicate their hierarchic depth.
Un-numbered Divisions
The following element is used to identify textual subdivisions
in the un-numbered style:
div: contains a subdivision of the front, body or back of a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'act',
    'volume', 'book', 'section'
Using this style, the body of a text containing two parts, each
composed of two chapters, might be represented as follows:
]]>
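As a rough illustration of the un-numbered style, the following Python sketch encodes a body of two parts, each with two chapters, and recovers the depth of every div from its nesting alone. The sample is in XML syntax (so every type value is written out and every end-tag is present); the function name div_depths is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical body: two parts, each composed of two chapters.
BODY = """
<body>
  <div type="part">
    <div type="chapter"><p>...</p></div>
    <div type="chapter"><p>...</p></div>
  </div>
  <div type="part">
    <div type="chapter"><p>...</p></div>
    <div type="chapter"><p>...</p></div>
  </div>
</body>
"""

def div_depths(element, depth=0):
    """Yield (depth, type) for every div below element, in document order."""
    for child in element:
        if child.tag == "div":
            yield depth + 1, child.get("type")
            yield from div_depths(child, depth + 1)

outline = list(div_depths(ET.fromstring(BODY)))
for depth, div_type in outline:
    print("  " * (depth - 1) + f"div depth={depth} type={div_type}")
```

Because the element name is always div, the depth is known only from the nesting, which is why a processor has to track the document structure when this style is used.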
Note that end-tags are mandatory for un-numbered divisions, to avoid
ambiguity. Note also that the type attribute must be
specified each time its value changes, for reasons discussed in section
below.
The div element has the following formal definition:
]]>
Numbered Divisions
The following elements are used to identify textual subdivisions
in the numbered style:
div0: contains the largest possible subdivision of the body of a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'act',
    'volume', 'book', 'section', 'part'.
div1: contains a first-level subdivision of the front, body or back
  of a text (the largest, if div0 is not used, the second largest if
  it is).
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'chapter',
    'act', 'volume', 'book', 'section'
div2: contains a second-level subdivision of the front, body or back of
  a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'scene',
    'part', 'chapter' etc.
div3: contains a third-level subdivision of the front, body or back of a
  text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g.
    'subsection', 'canto'
div4: contains a fourth-level subdivision of the front, body or back of
  a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'clause',
    'stanza' etc.
div5: contains a fifth-level subdivision of the front, body or
  back of a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'clause',
    'stanza' etc.
div6: contains a sixth-level subdivision of the front, body or
  back of a text.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'clause',
    'stanza' etc.
div7: contains the smallest possible subdivision of the front, body or
  back of a text, larger than a paragraph.
  Attributes include:
  type: specifies a name for this level of subdivision, e.g. 'clause',
    'stanza' etc.
The largest possible subdivision of the body may be tagged either
div0 or div1, and the smallest possible div7. (The choice between div0
and div1 corresponds with the idea that a type-set document may begin
either with a level 0 or a level 1 heading; it is provided for
convenience and compatibility with some widely used formatting
systems.) If numbered divisions are in use, a division at any one level
(say, div3) may contain only numbered divisions at the next lower level
(in this case, div4).
Using this style, the body of a text containing two parts, each
composed of two chapters, might be represented as follows:
]]>
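The nesting rule for numbered divisions (a divN may contain only divisions numbered N+1) lends itself to a mechanical check. The sketch below is one possible way to express it in Python, using XML-syntax samples; the names nesting_errors, GOOD and BAD are assumptions made for this illustration, and a validating SGML parser would normally enforce the rule from the DTD instead.

```python
import re
import xml.etree.ElementTree as ET

NUMBERED = re.compile(r"div([0-7])$")

def nesting_errors(element):
    """Return messages for numbered divisions nested at the wrong level."""
    errors = []
    parent = NUMBERED.match(element.tag)
    for child in element:
        nested = NUMBERED.match(child.tag)
        if parent and nested:
            expected = int(parent.group(1)) + 1
            if int(nested.group(1)) != expected:
                errors.append(
                    f"<{child.tag}> inside <{element.tag}>, expected <div{expected}>"
                )
        errors.extend(nesting_errors(child))
    return errors

# Hypothetical body: two parts (div1), each with two chapters (div2).
GOOD = ET.fromstring("""
<body>
  <div1 type="part">
    <div2 type="chapter"><p>...</p></div2>
    <div2 type="chapter"><p>...</p></div2>
  </div1>
  <div1 type="part">
    <div2 type="chapter"><p>...</p></div2>
    <div2 type="chapter"><p>...</p></div2>
  </div1>
</body>
""")

# Hypothetical error: a level is skipped.
BAD = ET.fromstring('<body><div1 type="part"><div3 type="section"/></div1></body>')

print(nesting_errors(GOOD))
print(nesting_errors(BAD))
```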
Formal definitions for these elements are as follows:
]]>
Numbered or Un-numbered?
The choice between numbered and un-numbered divisions will depend to
some extent on the complexity of the material: un-numbered divisions
allow for an arbitrary depth of nesting, while numbered divisions limit
the depth of the tree which can be constructed. Where divisions at
different levels should be processed differently (chapters, but not
sections, for example, beginning on new pages), numbered divisions
simplify the task of defining the desired processing for each level.
Some software may find numbered divisions easier to process, as there is
no need to maintain knowledge of the whole document structure in order
to know the level at which a division occurs; such software may however
find it difficult to cope with some other aspects of the TEI scheme.
The two styles may not be mixed within the same front,
body or back element.
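The prohibition on mixing the two styles can be tested in the same spirit. The following Python sketch is an illustration only: it works on XML-syntax samples, and the function name mixes_styles is an assumption of this example.

```python
import re
import xml.etree.ElementTree as ET

NUMBERED = re.compile(r"div[0-7]$")

def mixes_styles(container):
    """True if numbered and un-numbered divisions both occur in container."""
    tags = {element.tag for element in container.iter()}
    has_unnumbered = "div" in tags
    has_numbered = any(NUMBERED.match(tag) for tag in tags)
    return has_unnumbered and has_numbered

# Hypothetical samples: one consistent body, one that mixes the styles.
CONSISTENT = ET.fromstring(
    '<body><div1 type="part"><div2 type="chapter"/></div1></body>'
)
MIXED = ET.fromstring(
    '<body><div1 type="part"><div type="chapter"/></div1></body>'
)

print(mixes_styles(CONSISTENT), mixes_styles(MIXED))
```

The function would be applied separately to each front, body or back element, since the restriction is stated per element rather than per document.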
Whichever style is used, the global n and id
attributes (section ) should be used where appropriate
to provide reference strings for each division of a text which is
regarded as significant for referencing purposes (on reference systems,
see further section ). As indicated above, the
type attribute is used to provide a name or description for
the division. Typical values might be book, chapter,
section, part, or (for verse texts) book,
canto, stanza, or (for dramatic texts) act,
scene. This attribute has a declared value of #CURRENT,
which implies that if defaulted, the value used will be that most
recently specified on any element of the same kind, scanning the text
left to right. Hence, if un-numbered divisions are used, the
appropriate value must be specified each time a change of level occurs,
both down and up the document hierarchy.
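The effect of a #CURRENT default can be simulated in a few lines. The Python sketch below is a simplified model, not an SGML parser: start-tags are given as (element name, type value) pairs in document order, None stands for an omitted attribute, and the function name resolve_current is an assumption of this example. It also assumes that the value must be specified on the first occurrence of each element, as SGML requires for #CURRENT attributes.

```python
def resolve_current(start_tags):
    """Fill in omitted type values as a #CURRENT default would."""
    most_recent = {}   # element name -> value most recently specified
    resolved = []
    for name, type_value in start_tags:
        if type_value is None:
            if name not in most_recent:
                raise ValueError(f"type must be specified on the first <{name}>")
            type_value = most_recent[name]
        else:
            most_recent[name] = type_value
        resolved.append((name, type_value))
    return resolved

# Numbered divisions: each level is a different element, so the
# defaults are remembered separately for div1 and div2.
numbered = resolve_current([
    ("div1", "book"), ("div2", "chapter"), ("div2", None),
    ("div1", None), ("div2", None),
])

# Un-numbered divisions: every level is a div, so the last start-tag,
# intended here as a second part, silently inherits 'chapter'.
unnumbered = resolve_current([
    ("div", "part"), ("div", "chapter"), ("div", None), ("div", None),
])

print(numbered)
print(unnumbered)
```

The second case shows why, with un-numbered divisions, the value has to be respecified at every change of level.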
The following extended example uses numbered divisions to indicate
the structure of a novel, and illustrates the use of the attributes
discussed above. It also uses some elements discussed in the next
section () and the p element discussed in
section .
Book I.
Of writing lives in general, and particularly of
Pamela, with a word by the bye of Colley
Cibber and others.
It is a trite but true observation, that examples work
more forcibly on the mind than precepts: ...
Of Mr. Joseph Andrews, his birth, parentage, education,
and great endowments; with a word or two concerning ancestors.
Mr. Joseph Andrews, the hero of our ensuing history, was
esteemed to be the only son of Gaffar and Gammar Andrews,
and brother to the illustrious Pamela, whose virtue is at present
so famous ...
The end of the first Book
Book II
Of divisions in authors
There are certain mysteries or secrets in all
trades, from the highest to the lowest, from that of
prime-ministering, to this of
authoring, which are seldom discovered
unless to members of the same calling ...
I will dismiss this chapter with the following
observation: that it becomes an author generally to
divide a book, as it does a butcher to joint his meat,
for such assistance is of great help to both the reader
and the carver. And now having indulged myself a little
I will endeavour to indulge the curiosity of my reader,
who is no doubt impatient to know what he will find
in the subsequent chapters of this book.
A surprising instance of Mr Adams's short memory, with
the unfortunate consequences which it brought on Joseph.
Mr. Adams and Joseph were now ready to depart different
ways ...
]]>
Contents of Prose Divisions
The divisions of any kind of text may sometimes begin with a brief
heading or descriptive title, with or without a byline, an epigraph or
brief quotation, or a salutation such as one finds at the start of a
letter. They may also conclude with a brief trailer, byline, or
signature. The following special-purpose elements are provided to mark
these features:
head: contains any title or heading appearing at the start of a
  division of a text, list, etc.
epigraph: contains a quotation, anonymous or attributed, sometimes used
  at the start of a section or chapter, or on a title page.
salute: contains a salutation or greeting prefixed to a foreword,
  dedicatory epistle or other division of a text.
trailer: contains a closing title or footer appearing at the end of
  a division of a text.
signed: contains the closing salutation etc. suffixed to a foreword,
  dedicatory epistle or other division of a text.
byline: contains the primary statement of responsibility given for a
  work on its title page, at the head of the work, or at the end of the
  work.
The following example demonstrates the use of the epigraph
element:
Chapter 19
I pity the man who can travel
from Dan to Beersheba, and say 'Tis all
barren; and so is all the world to him
who will not cultivate the fruits it offers.
Sterne: Sentimental Journey.
To say that Deronda was romantic would be to
misrepresent him: but under his calm and somewhat
self-repressed exterior ...
]]>
Other than these elements, a division element (numbered or
un-numbered) may contain an optional sequence of ungrouped lower-level
structural elements (paragraphs, lists, etc., in the case of prose;
speeches or stage directions in the case of drama; verse lines in the
case of verse), followed by other division elements. The lower-level
structural elements appropriate to texts of specific types are defined
in the chapters of Part III devoted to those types; the basic
lower-level structural unit of prose is the paragraph, which is
described in section . Since texts of all types may
have prose in their front matter, back matter, notes, etc.,
the p element is available in all TEI DTDs.
The elements discussed in this section are formally defined as
follows:
]]>
Front Matter
The front matter of an encoded text should not be confused with
the TEI header described in chapter , which
serves as a kind of front matter for the computer file itself,
not the text it encodes.
With the exception of the title page (on which see section ), the contents of the front matter should be encoded using
the same elements as the rest of a text. As with the divisions of the
text body, no other specific tags are proposed here for the various
kinds of subdivision which may appear within front matter: instead
either numbered or un-numbered div elements should be used.
The following suggested values for the type attribute may be used to
distinguish various kinds of division characteristic of front matter.
As with all lists of suggested values for attributes, it is recommended
that software written to handle TEI-conformant texts be prepared to
recognize and handle these values when they occur, without limiting the
user to the values in this list.
The following extended example demonstrates how various parts of the
front matter of a text may be encoded. The front part begins with a
title page, which is presented in section below. This
is followed by a dedication and a preface, each of which is encoded as a
distinct div:
To my parents, Ida and Max Fish
Preface
The answer this book gives to its title question is
there is and there isn't.
...
Chapters 1-12 have been previously published in the
following journals and collections:
chapters 1 and 3 in New literary
History
...
chapter 10 in Boundary II (1980). I am grateful for permission to reprint.
S.F.
]]>
The front matter concludes with another div element,
shown in the next example,
this time containing a table of contents, which contains a
list element (as described in section ).
Note the use of the ptr element to provide page-references:
the implication here is that the target identifiers supplied
(P1, P68 etc.) may correspond with identifiers used either for
div elements representing chapters of the text, or
for pb elements marking page divisions of the text.
Alternatively, the literal page numbers present in the source text might
be transcribed, but they are likely to be of little direct use in work
with the electronic text.
Contents
Introduction, or How I stopped Worrying and Learned
to Love Interpretation Part One: Literature in the Reader
Literature in the Reader: Affective
Stylistics What is Stylistics and Why Are They
Saying Such Terrible Things About It?
...
]]>
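Whether the targets of such pointers correspond to identifiers on div or pb elements can be checked mechanically. The Python sketch below is illustrative only: it uses an XML-syntax sample with invented entries, and it assumes that the pointer carries its identifier in an attribute named target and that divisions and page breaks carry theirs in id. The function name unresolved_targets is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a contents list whose pointers refer to page
# breaks or chapters in the body. P97 is deliberately left undefined.
SAMPLE = """
<text>
  <front>
    <div type="contents">
      <list>
        <item>First chapter <ptr target="P1"/></item>
        <item>Second chapter <ptr target="P68"/></item>
        <item>Third chapter <ptr target="P97"/></item>
      </list>
    </div>
  </front>
  <body>
    <pb id="P1"/>
    <div type="chapter" id="C1"><p>...</p></div>
    <div type="chapter" id="P68"><p>...</p></div>
  </body>
</text>
"""

def unresolved_targets(root):
    """Return ptr targets that match no id on a div or pb element."""
    known_ids = {
        element.get("id")
        for element in root.iter()
        if element.tag in ("div", "pb") and element.get("id")
    }
    return [
        ptr.get("target")
        for ptr in root.iter("ptr")
        if ptr.get("target") not in known_ids
    ]

missing = unresolved_targets(ET.fromstring(SAMPLE))
print(missing)
```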
The following example uses numbered divisions to mark up the front
matter of a medieval text. (Entity references are used to represent the
characters thorn, yogh, and ampersand, as discussed in section .) Note that in this case no title page in the modern sense
occurs; the title is simply given as a heading at the start of the front
matter. Note also the use of the type attribute to indicate
document elements comparatively unusual in modern books such as the
initial prayer:
Here bygynni&th; a book of contemplacyon,
&th;e whiche is clepyd &Th;E CLOWDE OF VNKNOWYNG,
in &th;e whiche a soule is onyd wi&th; GOD.
Here biginne&th; &th;e preyer on &th;e prologe.
God, unto whom alle hertes ben open, & unto
whome alle wille speki&th;, & unto whom no priue
&th;ing is hid: I beseche &th;ee so for to clense &th;e
entent of myn hert wi&th; &th;e unspekable &yog;ift of
&th;i grace, &th;at I may parfiteliche loue &th;ee
& wor&th;ilich preise &th;ee. Amen.
Here biginne&th; &th;e prolog.
In &th;e name of &th;e Fader & of &th;e Sone &
of &th;e Holy Goost.
I charge &th;ee & I beseeche &th;ee, wi&th; as moche
power & vertewe as &th;e bonde of charite is sufficient
to suffre, what-so-euer &th;ou be &th;at &th;is book schalt
haue in possession ...
Here biginne&th; a table of &th;e chapitres.
&th;e first chapitre
Of foure degrees of Cristen mens leuing;
& of &th;e cours of his cleping &th;at &th;is
book was maad vnto.
&th;e secound chapitre
A schort stering to meeknes & to &th;e
werk of &th;is book
...
&th;e fiue and seuenti chapitre
Of somme certein tokenes bi &th;e whiche
a man may proue whe&th;er he be clepid of God
to worche in &th;is werk.
& here eende&th; &th;e table of &th;e chapitres.
]]>
Front matter tags are defined in file TEIfron2.dtd:
]]>
Title Pages
Detailed analysis of the title page and other
preliminaries of older printed books and
manuscripts is of major importance in descriptive bibliography
and the cataloguing of printed books: such analysis may require a
rather more detailed tag set than that proposed here.
Definition
of such a tag set remains a work item for the TEI; such tag sets
for contemporary printed matter already exist or are being
created within the publishing industry, for example
the Majour (Modular Application for Journals) Project of the
European Workgroup on SGML. See for example
MAJOUR: Modular
Application for Journals: DTD for Article Headers
([n.p.]: EWS, 1991).
The following elements are therefore proposed as an interim
measure; they constitute a useful descriptive tag set
for the major features of most title pages.
titlePage: contains the title page of a text, appearing within the front
  or back matter.
docTitle: contains the title and all its constituents, as given on a
  title page.
titlePart: contains a subsection or division of the title of a work, as
  indicated on a title page.
  Attributes include:
  type: specifies the role of this subdivision of the title.
    Sample values include:
    main: main title of the work
    sub: subtitle of the work
    alt: alternative title of the work
    desc: descriptive paraphrase of the work included in title
byline: contains the primary statement of responsibility given for a
  work on its title page, at the head of the work, or at the end of the
  work.
docAuthor: contains the name of the author of the document, as given on
  the title page.
epigraph: contains a quotation, anonymous or attributed, sometimes used
  at the start of a section or chapter, or on a title page.
imprimatur: contains a formal statement authorizing the publication of
  a work, sometimes required to appear on a title page.
docImprint: contains the imprint statement (place and date of
  publication, publisher name), as given (usually) at the foot of a
  title page.
docEdition: contains an edition statement as presented on a title page
  of a document.
ornament: marks the position of a printer's device, ornament or figure,
  for example on a title page or elsewhere in a printed text.
Two examples of the use of these elements follow. First, the title page
of the work discussed earlier in this section:
Is There a Text in This Class?
The Authority of Interpretive Communities
Stanley Fish
Harvard University Press
Cambridge, Massachusetts
London, England
]]>
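To suggest how software might use these elements, the Python sketch below reads a title page and collects its title parts by type. The markup is a guess at one plausible arrangement in XML syntax: in particular, the assignment of the type values main and sub to the two title lines, and the helper name title_parts, are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of the title page above (XML syntax).
TITLE_PAGE = """
<titlePage>
  <docTitle>
    <titlePart type="main">Is There a Text in This Class?</titlePart>
    <titlePart type="sub">The Authority of Interpretive Communities</titlePart>
  </docTitle>
  <docAuthor>Stanley Fish</docAuthor>
  <docImprint>Harvard University Press
    Cambridge, Massachusetts
    London, England</docImprint>
</titlePage>
"""

def title_parts(title_page):
    """Map each titlePart type to its text, with whitespace normalized."""
    return {
        part.get("type"): " ".join("".join(part.itertext()).split())
        for part in title_page.iter("titlePart")
    }

page = ET.fromstring(TITLE_PAGE)
parts = title_parts(page)
author = page.findtext("docAuthor")
print(parts["main"], "/", parts["sub"], "/", author)
```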
Second, a characteristically verbose 17th century example:
THE
Pilgrim's Progress
FROM
THIS WORLD,
TO
That which is to come:
Delivered under the Similitude of a
DREAM
Wherein is Discovered,
The manner of his setting out,
His Dangerous Journey; And safe
Arrival at the Desired Countrey.
I have used Similitudes,
Hos. 12.10
By John Bunyan.
Licensed and Entred according to Order.
LONDON,
Printed for Nath. Ponder
at the Peacock in the Poultrey
near Cornhil, 1678.
]]>
Those elements in the above list which are
not defined elsewhere have the following formal declarations:
]]>
Where title pages are encoded at all, their
physical rendition is often of considerable importance. One
approach to this requirement would be to use the s tag,
described in section ,
to segment the typographic content of each part of the title page, and
then use the global rend attribute to specify its rendition.
Another would be to use a tag set specialized for the description of
typographic entities such as pages, lines, rules etc., bearing special
purpose attributes to describe line height, leading, degree of kerning,
font etc. For more information, see chapter
.
Back Matter
Conventions vary as to which elements are grouped as back matter
and which as front. For example, some books place the table of
contents at the front, and others at the back. Even title pages
may appear at the back of a book as well as at the front. The
content models for the back and front elements are
therefore identical.
The following suggested values for
the type attribute may be used to distinguish
various kinds of division characteristic of back matter:
appendix: An ancillary self-contained section of
  a work, often providing additional but in some sense extra-canonical
  text
glossary: A list of terms associated with definition texts
  (glosses): this should be encoded as a list type=gloss
  (see section )
notes: A section in which textual or
  other kinds of notes are gathered together
bibliogr: A list of bibliographic citations: this should be encoded
  as a list.citn (see section )
index: Any form of index to the work.
colophon: A statement appearing at the end of a book describing the
  conditions of its physical production, in modern books often with
  details of the fonts and design used.
No additional elements are proposed for the encoding of back matter
at present. Some characteristic examples follow:
Index
Actors, public, paid for the contempt attending
their profession, Africa, cause assigned for the barbarous state of
the interior parts of that continent, Agriculture
ancient policy of Europe unfavourable to, artificers necessary to carry it on, cattle and tillage mutually improve each other,
...
wealth arising from more solid than that which proceeds
from commerce Alehouses, not the efficient cause of drunkenness,
...
...
]]>
A letter written to his wife, founde with this booke
after his death.
The remembrance of the many wrongs offred thee, and thy
unreproued vertues, adde greater sorrow to my miserable state,
than I can utter or thou conceiue....
... yet trust I in the world to come to find mercie, by the
merites of my Sauiour to whom I commend thee, and commit
my soule.
Thy repentant husband for his disloyaltie,
Robert Greene.
Faelicem fuisse infaustum
FINIS
...
]]>
Addenda
M. Scriblerus Lectori
Once more, gentle reader I appeal unto thee, from the
shameful ignorance of the Editor, by whom Our own Specimen
of Virgil hath been mangled in such miserable
manner, that scarce without tears can we behold it. At the
very entrance, Instead of prolegomena,
lo! prolegumena with an Omega!
and in the same line consulâas
with a circumflex! In the next page thou findest
leviter perlabere, which his ignorance took to be the
infinitive mood of perlabor but ought to
be perlabi ... Wipe away all these monsters,
Reader, with thy quill.
]]>
The back element is defined in file
TEIback2.dtd; since there are no other specialized
back-matter tags, nothing else is defined there:
]]>
Specifying the Prose Base
To make the prose base accessible within a TEI document, the document
should include the following parameter entity declaration (or the
equivalent) in its document-type-declaration subset:
]]>
The overall document structure might thus be as follows:
]>
... ...
]]>
Within prose texts, as within texts of any type, smaller texts may be
embedded. In many cases, these may be easy to treat as quotations,
using the elements defined in section . In other
cases, however, the embedded text may not be prose: it may be a poem,
or a fragment of a dictionary, or a scene from a play. Such embedded
texts should be treated as texts in their own right and tagged with the
text element, as described in chapter .
If the embedded text is not of the same type as the enclosing text
(if a poem is embedded in a prose text, for example), the base for
mixed-type texts should be used instead of the base for prose. For a
prose text with an embedded text in verse, the document type declaration
might look like this:
]>
]]>
For more information on TEI base tag sets and their invocation, consult
section and section . For more
information on the base tag set for mixed-type texts, consult chapter
.
Overall Structure of the Prose DTD
The TEI tag set for prose is found in file teipros2.dtd;
it has the following overall structure:
%TEI.front;
%TEI.back;
]]>