Skeleton file for TEI P2, Chapter for Prose Base. No source; created in m-r form. 22 Jun 92 : CMSMcQ : made file to drive LB's text
Base Tag Sets

Base Tag Set for Prose

This section describes the basic tag set for prose documents. The textual features considered here include such basic structural features as the chapters into which the body of a modern prose text is divided; the sections and subsections into which chapters are in turn divided; the index, glossary, bibliography, and other parts of the back matter; the title page, contents, preface and other components which together make up the front matter.

For discussions of these parts of the book see e.g. Esdaile's Manual of Bibliography, revised edition by Roy Stokes (London: George Allen and Unwin, Ltd., 1931; 1967); The Chicago Manual of Style (Chicago and London: University of Chicago Press, 1969; 13th edition 1982), especially chapter 1. The TEI core structural elements tag set was developed with special attention to Annex E of ISO 8879, the SGML standard, and to BK-1, the SGML document type definition for books and dissertations included in the AAP tag set, for which see AAP (Association of American Publishers), Author's Guide to Electronic Manuscript Preparation and Markup, Version 2.0 Revised Edition ([Dublin, Ohio]: Electronic Publishing Special Interest Group (EPSIG), 1989).

Prose texts may be regarded either as unitary, that is, forming an organic whole, or as composite, that is, consisting of several components which are in some important sense independent of each other. The distinction is not always entirely obvious: for example, a collection of essays might be regarded as a single item in some circumstances, or as a number of distinct items in others. In such borderline cases, the encoder must choose whether to treat the text as unitary or composite; each choice may have advantages and disadvantages in a given situation.

Whether unitary or composite, the text is marked with the text tag and may contain front matter, a text body, and back matter. In unitary texts, the text body is tagged body; in composite texts, where the text body consists of a series of subordinate or included texts, it is tagged group. The overall structure of any text, unitary or composite, is thus defined by the following elements:

text: contains a single text of any kind, whether unitary or composite, for example a poem, drama, collection of essays, or novel.
front: contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found before the start of a text proper.
body: contains the whole body of a single unitary text, excluding any front or back matter.
group: contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.
back: contains any appendixes etc. following the main part of a text.

For further discussion of composite texts and corpora, see chapter . The remainder of this section deals primarily with unitary texts consisting of prose.

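The overall structure of a unitary text is thus a text element containing, in order, optional front matter (front), the body of the text (body), and optional back matter (back).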
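The distinction between unitary and composite texts can be illustrated with a small sketch. It assumes an XML-style rendering of the markup (the Guidelines themselves are defined in SGML), and the helper name `classify_text` is hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def classify_text(text_el):
    """Return 'unitary' if the text has a body, 'composite' if it has a group."""
    if text_el.tag != "text":
        raise ValueError("expected a <text> element")
    children = [child.tag for child in text_el]
    if "body" in children:
        return "unitary"
    if "group" in children:
        return "composite"
    raise ValueError("a <text> needs either a <body> or a <group>")


# Hypothetical minimal documents: front and back matter are optional.
unitary = ET.fromstring("<text><front/><body/><back/></text>")
composite = ET.fromstring(
    "<text><front/><group>"
    "<text><body/></text><text><body/></text>"
    "</group></text>"
)

print(classify_text(unitary))    # unitary
print(classify_text(composite))  # composite
```

The check looks only at the direct children of the outer text element, so the texts nested inside a group are classified separately if required.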

Each of these elements and their content is further described in the following subsections. The formal declarations for text and body are as follows:

Other textual elements, such as paragraphs, lists or phrases, which nest within these major structural elements, are discussed in chapter .

Divisions of the Body

In some texts, the body consists simply of a sequence of low-level structural items such as the paragraphs or lists discussed in chapter . In many cases, however, sequences of such elements will be grouped together hierarchically into textual divisions and subdivisions, such as chapters or sections. The names used for these structural subdivisions of texts vary with the genre and period of the text, or even with the whim of the author, editor or publisher. For example, a major subdivision of an epic or of the Bible is generally called a book, that of a report is usually called a part, and that of a novel a chapter, unless it is an epistolary novel, in which case it may be called a letter.

To cater for this variety, these Guidelines propose that all such textual divisions be regarded as occurrences of the same neutrally-named elements, with an attribute type used to categorize further elements at a given hierarchic level. Two alternative styles are provided for the naming of these neutral divisions: numbered and un-numbered. Numbered divisions are named div0, div1, div2, etc., where the number indicates the depth of this particular division within the hierarchy, the largest such division being div0, any subdivision within it being div1, any further sub-sub-division being div2 and so on. Un-numbered divisions are simply named div, and allowed to nest recursively to indicate their hierarchic depth.

Un-numbered Divisions

The following element is used to identify textual subdivisions in the un-numbered style:

div: contains a subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'act', 'volume', 'book', 'section'.

Using this style, the body of a text containing two parts, each composed of two chapters, might be represented as follows:


Note that end-tags are mandatory for un-numbered divisions, to avoid ambiguity. Note also that the type attribute must be specified each time its value changes, for reasons discussed in section below.
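The two-part, two-chapter body described above can be sketched as follows. This is an illustration only: it assumes an XML-style rendering in which every div carries an explicit type value, and the function name `div_depths` is hypothetical. The hierarchic depth of each un-numbered division is recovered purely from its nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-style rendering of a body with two parts of two chapters each.
BODY = """
<body>
  <div type="part">
    <div type="chapter"/>
    <div type="chapter"/>
  </div>
  <div type="part">
    <div type="chapter"/>
    <div type="chapter"/>
  </div>
</body>
"""


def div_depths(element, depth=0):
    """Yield (depth, type) for every nested <div>, in document order."""
    for child in element:
        if child.tag == "div":
            yield depth + 1, child.get("type")
            yield from div_depths(child, depth + 1)


outline = list(div_depths(ET.fromstring(BODY)))
for depth, kind in outline:
    # Indent each division by its depth to show the hierarchy.
    print("  " * (depth - 1) + kind)
```

Because the depth is not recorded in the element name, a processor has to track the nesting itself, which is the trade-off taken up in the comparison of the two styles later in this section.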

The div element has the following formal definition:

Numbered Divisions

The following elements are used to identify textual subdivisions in the numbered style:

div0: contains the largest possible subdivision of the body of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'act', 'volume', 'book', 'section', 'part'.
div1: contains a first-level subdivision of the front, body or back of a text (the largest, if div0 is not used, the second largest if it is). Attributes include type, which specifies a name for this level of subdivision, e.g. 'chapter', 'act', 'volume', 'book', 'section'.
div2: contains a second-level subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'scene', 'part', 'chapter' etc.
div3: contains a third-level subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'subsection', 'canto'.
div4: contains a fourth-level subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'clause', 'stanza' etc.
div5: contains a fifth-level subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'clause', 'stanza' etc.
div6: contains a sixth-level subdivision of the front, body or back of a text. Attributes include type, which specifies a name for this level of subdivision, e.g. 'clause', 'stanza' etc.
div7: contains the smallest possible subdivision of the front, body or back of a text, larger than a paragraph. Attributes include type, which specifies a name for this level of subdivision, e.g. 'clause', 'stanza' etc.

The largest possible subdivision of the body may be tagged either div0 or div1, and the smallest possible div7. (This convention, corresponding with the idea that a type-set document may begin either with a level 0 or a level 1 heading, is provided for convenience and compatibility with some widely used formatting systems.) If numbered divisions are in use, a division at any one level (say, div3) may contain only numbered divisions at the next lower level (in this case, div4).

Using this style, the body of a text containing two parts, each composed of two chapters, might be represented as follows:
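The nesting rule for numbered divisions lends itself to a mechanical check. The sketch below is only an illustration under stated assumptions: it uses an XML-style rendering of the two-part, two-chapter body, the function name `check_numbered` is hypothetical, and the check covers only div0 to div7 elements that are direct children of the body or of another numbered division.

```python
import re
import xml.etree.ElementTree as ET

DIV_N = re.compile(r"div([0-7])$")


def check_numbered(element, parent_level=None):
    """Return a list of nesting violations for numbered divisions under element."""
    problems = []
    for child in element:
        match = DIV_N.match(child.tag)
        if not match:
            continue  # paragraphs, headings, etc. are not checked here
        level = int(match.group(1))
        if parent_level is None:
            # The largest division may be tagged either div0 or div1.
            if level not in (0, 1):
                problems.append(f"top-level division is div{level}")
        elif level != parent_level + 1:
            problems.append(f"div{level} found inside div{parent_level}")
        problems.extend(check_numbered(child, level))
    return problems


good = ET.fromstring(
    '<body>'
    '<div1 type="part"><div2 type="chapter"/><div2 type="chapter"/></div1>'
    '<div1 type="part"><div2 type="chapter"/><div2 type="chapter"/></div1>'
    '</body>'
)
bad = ET.fromstring('<body><div1 type="part"><div3 type="section"/></div1></body>')

print(check_numbered(good))  # []
print(check_numbered(bad))   # ['div3 found inside div1']
```

In an SGML system the same constraint would normally be enforced by the content models in the DTD rather than by separate code; the function merely makes the rule explicit.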

Formal definitions for these elements are as follows:

Numbered or Un-numbered?

The choice between numbered and un-numbered divisions will depend to some extent on the complexity of the material: un-numbered divisions allow for an arbitrary depth of nesting, while numbered divisions limit the depth of the tree which can be constructed. Where divisions at different levels should be processed differently (chapters, but not sections, for example, beginning on new pages), numbered divisions simplify the task of defining the desired processing for each level. Some software may find numbered divisions easier to process, as there is no need to maintain knowledge of the whole document structure in order to know the level at which a division occurs; such software may however find it difficult to cope with some other aspects of the TEI scheme. The two styles may not be mixed within the same front, body or back element.

Whichever style is used, the global n and id attributes (section ) should be used where appropriate to provide reference strings for each division of a text which is regarded as significant for referencing purposes (on reference systems, see further section ). As indicated above, the type attribute is used to provide a name or description for the division. Typical values might be book, chapter, section, part, or (for verse texts) book, canto, stanza, or (for dramatic texts) act, scene. This attribute has a declared value of #CURRENT, which implies that if defaulted, the value used will be that most recently specified on any element of the same kind, scanning the text left to right. Hence, if un-numbered divisions are used, the appropriate value must be specified each time a change of level occurs, both down and up the document hierarchy.
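The effect of the #CURRENT default can be modelled in a few lines. This is a simplified sketch, not a description of any particular SGML parser: it assumes the document has been reduced to a list of start-tags in document order, and the function name `resolve_current` and the sample data are hypothetical.

```python
def resolve_current(events):
    """Resolve #CURRENT-style defaults for the type attribute.

    events is a list of (element_name, type_or_None) pairs, one per start-tag,
    in document order. A None type takes the value most recently specified on
    an element of the same name.
    """
    current = {}
    resolved = []
    for name, given in events:
        if given is not None:
            current[name] = given
        elif name not in current:
            raise ValueError(f"first <{name}> must specify a type")
        resolved.append((name, current[name]))
    return resolved


# Un-numbered style: every division is a <div>, so one value carries across levels.
unnumbered = [("div", "part"), ("div", "chapter"), ("div", None), ("div", None)]
# Numbered style: div1 and div2 each remember their own most recent value.
numbered = [("div1", "part"), ("div2", "chapter"), ("div2", None), ("div1", None)]

print(resolve_current(unnumbered))
print(resolve_current(numbered))
```

In the un-numbered list, if the last div was meant to open a second part, it would silently be typed 'chapter'. That is why the value has to be restated at every change of level when un-numbered divisions are used, while the numbered list resolves the final div1 to 'part' without help.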

The following extended example uses numbered divisions to indicate the structure of a novel, and illustrates the use of the attributes discussed above. It also uses some elements discussed in the next section () and the p element discussed in section .

Book I.

Of writing lives in general, and particularly of Pamela, with a word by the bye of Colley Cibber and others.

It is a trite but true observation, that examples work more forcibly on the mind than precepts: ...

Of Mr. Joseph Andrews, his birth, parentage, education, and great endowments; with a word or two concerning ancestors.

Mr. Joseph Andrews, the hero of our ensuing history, was esteemed to be the only son of Gaffar and Gammar Andrews, and brother to the illustrious Pamela, whose virtue is at present so famous ...

The end of the first Book

Book II

Of divisions in authors

There are certain mysteries or secrets in all trades, from the highest to the lowest, from that of prime-ministering, to this of authoring, which are seldom discovered unless to members of the same calling ...

I will dismiss this chapter with the following observation: that it becomes an author generally to divide a book, as it does a butcher to joint his meat, for such assistance is of great help to both the reader and the carver. And now having indulged myself a little I will endeavour to indulge the curiosity of my reader, who is no doubt impatient to know what he will find in the subsequent chapters of this book.

A surprising instance of Mr Adams's short memory, with the unfortunate consequences which it brought on Joseph.

Mr. Adams and Joseph were now ready to depart different ways ...

Contents of Prose Divisions

The divisions of any kind of text may sometimes begin with a brief heading or descriptive title, with or without a byline, an epigraph or brief quotation, or a salutation such as one finds at the start of a letter. They may also conclude with a brief trailer, byline, or signature. The following special-purpose elements are provided to mark these features:

head: contains any title or heading appearing at the start of a division of a text, list, etc.
epigraph: contains a quotation, anonymous or attributed, sometimes used at the start of a section or chapter, or on a title page.
salute: contains a salutation or greeting prefixed to a foreword, dedicatory epistle or other division of a text.
trailer: contains a closing title or footer appearing at the end of a division of a text.
signed: contains the closing salutation etc. suffixed to a foreword, dedicatory epistle or other division of a text.
byline: contains the primary statement of responsibility given for a work on its title page, at the head of the work, or at the end of the work.

The following example demonstrates the use of the epigraph element:

Chapter 19

I pity the man who can travel from Dan to Beersheba, and say 'Tis all barren; and so is all the world to him who will not cultivate the fruits it offers.

Sterne: Sentimental Journey.

To say that Deronda was romantic would be to misrepresent him: but under his calm and somewhat self-repressed exterior ...

Other than these elements, a division element (numbered or un-numbered) may contain an optional sequence of ungrouped lower-level structural elements (paragraphs, lists, etc., in the case of prose; speeches or stage directions in the case of drama; verse lines in the case of verse), followed by other division elements. The lower-level structural elements appropriate to texts of specific types are defined in the chapters of Part III devoted to those types; the basic lower-level structural unit of prose is the paragraph, which is described in section . Since texts of all types may have prose in their front matter, back matter, notes, etc., the p element is available in all TEI DTDs.

The elements discussed in this section are formally defined as follows:

Front Matter

The front matter of an encoded text should not be confused with the TEI header described in chapter , which serves as a kind of front matter for the computer file itself, not the text it encodes.

With the exception of the title page (on which see section ), the contents of the front matter should be encoded using the same elements as the rest of a text. As with the divisions of the text body, no other specific tags are proposed here for the various kinds of subdivision which may appear within front matter: instead either numbered or un-numbered div elements should be used. The following suggested values for the type attribute may be used to distinguish various kinds of division characteristic of front matter (as with all lists of suggested values for attributes, it is recommended that software written to handle TEI-conformant texts be prepared to recognize and handle these values when they occur, without limiting the user to the values in this list):

The following extended example demonstrates how various parts of the front matter of a text may be encoded. The front part begins with a title page, which is presented in section below. This is followed by a dedication and a preface, each of which is encoded as a distinct div:

To my parents, Ida and Max Fish

Preface

The answer this book gives to its title question is there is and there isn't. ...

Chapters 1-12 have been previously published in the following journals and collections: chapters 1 and 3 in New Literary History ... chapter 10 in Boundary II (1980). I am grateful for permission to reprint.

S.F.


The front matter concludes with another div element, shown in the next example, this time containing a table of contents, which contains a list element (as described in section ). Note the use of the ptr element to provide page-references: the implication here is that the target identifiers supplied (P1, P68 etc.) may correspond with identifiers used either for div elements representing chapters of the text, or for pb elements marking page divisions of the text. Alternatively, the literal page numbers present in the source text might be transcribed, but they are likely to be of little direct use in work with the electronic text.

Contents
Introduction, or How I Stopped Worrying and Learned to Love Interpretation
Part One: Literature in the Reader
Literature in the Reader: Affective Stylistics
What is Stylistics and Why Are They Saying Such Terrible Things About It?
...
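The correspondence between ptr targets and identifiers elsewhere in the text can be checked mechanically. The sketch below is an illustration under several assumptions: an XML-style rendering, a ptr attribute named target, generic placeholder entries in the contents list, and a hypothetical function name `unresolved_targets`. Only the identifiers P1 and P68 come from the discussion above; P999 is invented to show a failing case.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a contents list whose pointers refer to a page break
# and to a chapter division, plus one pointer with no matching identifier.
DOC = """
<text>
  <front>
    <div type="contents">
      <list>
        <item>First entry <ptr target="P1"/></item>
        <item>Second entry <ptr target="P68"/></item>
        <item>Third entry <ptr target="P999"/></item>
      </list>
    </div>
  </front>
  <body>
    <pb id="P1"/>
    <div type="chapter" id="P68"/>
  </body>
</text>
"""


def unresolved_targets(root):
    """Return ptr targets that match no id on any element in the document."""
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}
    return [
        ptr.get("target")
        for ptr in root.iter("ptr")
        if ptr.get("target") not in known_ids
    ]


root = ET.fromstring(DOC)
print(unresolved_targets(root))  # ['P999']
```

The function deliberately does not care whether a target names a div or a pb, which mirrors the point that either kind of element may carry the identifier.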

The following example uses numbered divisions to mark up the front matter of a medieval text. (Entity references are used to represent the characters thorn, yogh, and ampersand, as discussed in section .) Note that in this case no title page in the modern sense occurs; the title is simply given as a heading at the start of the front matter. Note also the use of the type attribute to indicate document elements comparatively unusual in modern books such as the initial prayer:

Here bygynni&th; a book of contemplacyon, &th;e whiche is clepyd &Th;E CLOWDE OF VNKNOWYNG, in &th;e whiche a soule is onyd wi&th; GOD.

Here biginne&th; &th;e preyer on &th;e prologe.

God, unto whom alle hertes ben open, & unto whome alle wille speki&th;, & unto whom no priue &th;ing is hid: I beseche &th;ee so for to clense &th;e entent of myn hert wi&th; &th;e unspekable &yog;ift of &th;i grace, &th;at I may parfiteliche loue &th;ee & wor&th;ilich preise &th;ee. Amen.

Here biginne&th; &th;e prolog.

In &th;e name of &th;e Fader & of &th;e Sone & of &th;e Holy Goost.

I charge &th;ee & I beseeche &th;ee, wi&th; as moche power & vertewe as &th;e bonde of charite is sufficient to suffre, what-so-euer &th;ou be &th;at &th;is book schalt haue in possession ...

Here biginne&th; a table of &th;e chapitres.

& here eende&th; &th;e table of &th;e chapitres.

Front matter tags are defined in file TEIfron2.dtd:

Title Pages

Detailed analysis of the title page and other preliminaries of older printed books and manuscripts is of major importance in descriptive bibliography and the cataloguing of printed books: such analysis may require a rather more detailed tag set than that proposed here.

Definition of such a tag set remains a work item for the TEI; such tag sets for contemporary printed matter already exist or are being created within the publishing industry, for example the MAJOUR (Modular Application for Journals) Project of the European Workgroup on SGML. See for example MAJOUR: Modular Application for Journals: DTD for Article Headers ([n.p.]: EWS, 1991). The following elements are therefore proposed as an interim measure; they constitute a useful descriptive tag set for the major features of most title pages.

contains the title page of a text, appearing within the front or back matter.
contains the title and all its constituents, as given on a title page.
contains a subsection or division of the title of a work, as indicated on a title page. Attributes include: specifies the role of this subdivision of the title. Sample values include: main title of the work; subtitle of the work; alternative title of the work; descriptive paraphrase of the work included in title.
contains the primary statement of responsibility given for a work on its title page, at the head of the work, or at the end of the work.
contains the name of the author of the document, as given on the title page.
contains a quotation, anonymous or attributed, sometimes used at the start of a section or chapter, or on a title page.
contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page.
contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.
contains an edition statement as presented on a title page of a document.
marks the position of a printer's device, ornament or figure, for example on a title page or elsewhere in a printed text.

Two examples of the use of these elements follow. First, the title page of the work discussed earlier in this section:

Is There a Text in This Class?
The Authority of Interpretive Communities
Stanley Fish
Harvard University Press
Cambridge, Massachusetts
London, England

Second, a characteristically verbose 17th-century example:

THE Pilgrim's Progress FROM THIS WORLD, TO That which is to come:
Delivered under the Similitude of a DREAM
Wherein is Discovered, The manner of his setting out, His Dangerous Journey; And safe Arrival at the Desired Countrey.

I have used Similitudes,

Hos. 12.10

By John Bunyan.
Licensed and Entred according to Order.
LONDON, Printed for Nath. Ponder at the Peacock in the Poultrey near Cornhil, 1678.

Those elements in the above list which are not defined elsewhere have the following formal declarations:

Where title pages are encoded at all, their physical rendition is often of considerable importance. One approach to this requirement would be to use the s tag, described in section , to segment the typographic content of each part of the title page, and then use the global rend attribute to specify its rendition. Another would be to use a tag set specialized for the description of typographic entities such as pages, lines, rules etc., bearing special purpose attributes to describe line height, leading, degree of kerning, font etc. For more information, see chapter .

Back Matter

Conventions vary as to which elements are grouped as back matter and which as front. For example, some books place the table of contents at the front, and others at the back. Even title pages may appear at the back of a book as well as at the front. The content models for the back and front elements are therefore identical.

The following suggested values for the type attribute may be used to distinguish various kinds of division characteristic of back matter:

No additional elements are proposed for the encoding of back matter at present. Some characteristic examples follow:

Index

Actors, public, paid for the contempt attending their profession,
Africa, cause assigned for the barbarous state of the interior parts of that continent,
Agriculture
  ancient policy of Europe unfavourable to,
  artificers necessary to carry it on,
  cattle and tillage mutually improve each other,
  ...
  wealth arising from more solid than that which proceeds from commerce
Alehouses, not the efficient cause of drunkenness,
...

...
A letter written to his wife, founde with this booke after his death.

The remembrance of the many wrongs offred thee, and thy unreproued vertues, adde greater sorrow to my miserable state, than I can utter or thou conceiue.... ... yet trust I in the world to come to find mercie, by the merites of my Saiour to whom I commend thee, and commit my soule.

Thy repentant husband for his disloyaltie, Robert Greene.

Faelicem fuisse infaustum

FINIS

...
Addenda

M. Scriblerus Lectori

Once more, gentle reader I appeal unto thee, from the shameful ignorance of the Editor, by whom Our own Specimen of Virgil hath been mangled in such miserable manner, that scarce without tears can we behold it. At the very entrance, Instead of prolegomena, lo! prolegumena with an Omega! and in the same line consulâas with a circumflex! In the next page thou findest leviter perlabere, which his ignorance took to be the infinitive mood of perlabor but ought to be perlabi ... Wipe away all these monsters, Reader, with thy quill.


The back element is defined in file TEIback2.dtd; since there are no other specialized back-matter tags, nothing else is defined there:

Specifying the Prose Base

To make the prose base accessible within a TEI document, the document should include the following parameter entity declaration (or the equivalent) in its document-type-declaration subset:

The overall document structure might thus be as follows:

Within prose texts, as within texts of any type, smaller texts may be embedded. In many cases, these may be easy to treat as quotations, using the elements defined in section . In other cases, however, the embedded text may not be prose: it may be a poem, or a fragment of a dictionary, or a scene from a play. Such embedded texts should be treated as texts in their own right and tagged with the text element, as described in chapter .

If the embedded text is not of the same type as the enclosing text (if a poem is embedded in a prose text, for example), the base for mixed-type texts should be used instead of the base for prose. For a prose text with an embedded text in verse, the document type declaration might look like this:

For more information on TEI base tag sets and their invocation, consult section and section . For more information on the base tag set for mixed-type texts, consult chapter .

Overall Structure of the Prose DTD

The TEI tag set for prose is found in file teipros2.dtd; it has the following overall structure:

%TEI.front;
%TEI.back;