Digital Renaissance Editions

Appendix: Tags

The objective of this tagging scheme is to simplify the task of creating texts that are effectively tagged and at the same time reasonably readable. The tags are based on similar conventions in HTML; at a later stage the files will be converted automatically to both XML and HTML versions. Editors will be provided with a model "base" text, usually consisting of a few pages, with much of the tagging in place, from which to get started and become familiar with the structure.

A 1. Transcriptions

This tag set is derived originally from the "Encoding Guidelines" for the Renaissance Electronic Texts, developed by Ian Lancashire. The XML version was developed by Peter van Hardenberg. Please note that all tags supplied in the base texts should be carefully proofread.

Where an editor wishes to add additional information not anticipated by these tags, he or she should correspond with the Coordinating Editor to see if the tagset should be expanded. In general, however, it will be sufficient to add the additional information in the form of a comment, using the SGML/HTML convention of the exclamation mark followed by two hyphens, thus:

<!-- NOTE: there is a printing ambiguity in the Quarto.
The previous word may be "quallitie" or "quallirie". -->

The comment is closed by two hyphens before the final angle bracket.

A 1.1. Header

Each document will have a "header" providing metadata about the document as a whole. The structure of the metadata conforms to the Dublin Core standard, with some additional items developed specifically for DRE/ISE/QME texts. Metadata describe in detail the provenance of the text, the names of those involved in developing it, and a list of tags and abbreviations used. The header will be compiled by the Coordinating Editor from information supplied by the editor.

A 1.2. Structural elements

<WORK> </WORK> The XML document root for all texts.
<FRONTMATTER> </FRONTMATTER> Surrounds frontmatter.
<BACKMATTER> </BACKMATTER> Surrounds backmatter.
<DIV> </DIV> Divisions of the FRONTMATTER and/or BACKMATTER, indicating the type of material contained.
name="[name]" Title page | Prologue | Dedication | Epistle | Dumb Show | Epilogue | List of Characters | etc.
<ACT n="[number]"> </ACT> Act division as in the modern edition.
<SCENE n="[number]"> </SCENE> Scene division as in the modern edition.
Note: In plays with a non-diegetic chorus, use <SCENE n="0"> to designate chorus passages that are not prologues or epilogues. SCENEs may be used with or without enclosing ACTs.
<TITLEHEAD> </TITLEHEAD> Initial heading in Folio.
<LD> </LD> Literary Division (e.g. Act, Scene). Note that this tag encloses original literary divisions; modern divisions are indicated by separate tags (see A 1.5 below).
<S> </S> Speech. Includes the speech prefix and any included stage directions (</S> may appear at the end of a hung word).
<SP> </SP> Speaker Prefix.
norm="Name" Normalized form of the name; must be included in every instance of a speech prefix, but not if the name is used in the course of a speech. Example: <SP norm="William">Will.</SP>.
Where there is more than one speaker, separate the normalized names by commas; if there is a speech without a speech prefix, put <SP norm="normalized name"></SP> to indicate its omission.
<SD> </SD> Stage Direction. Each line of a split direction in the right hand margin should be tagged separately; directions different in kind should also be tagged separately.
t="[type]" entrance | exit | setting | sound | delivery | whoto | action | location | other | optional | uncertain
Example: <SD t="exit"><I>Exeunt.</I></SD> <SD t="sound"><I>Alarum</I></SD>. Some single instructions include more than one kind of direction, in which case you should separate the types with a comma: <SD t="entrance, setting"><I>Enter Macbeths Wife alone with a Letter</I>.</SD>.
<VERSEQUOTE> </VERSEQUOTE> Verse quotation (e.g. song).
source="name" When the verse is a quotation from another source, the source should be recorded. For matters concerning display, see the tags <SPACE> and <INDENT>.
<PROSEQUOTE> </PROSEQUOTE> Prose quotation (e.g. quoted letter).
source="name" (As for verse.)
<MODE> </MODE> Indicator of verse or prose.
t="[mode]" prose | verse | uncertain
Example: <MODE t="prose"> . . . </MODE>.
Note that you may choose to use a type of "uncertain" where it is not clear that the section is either verse or prose.
<FOREIGN> </FOREIGN> Language when not English. Used for the content of speeches only, not Latin stage directions, literary divisions etc.
lang="language" Example: <FOREIGN lang="French">Diable!</FOREIGN>.
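
The list of permitted t values for <SD> lends itself to a mechanical check. The following is a minimal Python sketch, not part of the project's tooling: the helper name unknown_sd_types and the pattern SD_OPEN are hypothetical, and the sketch assumes that the t attribute is always quoted and is the only attribute on the tag.

```python
import re

# Stage-direction types as listed for the t attribute of <SD> above.
SD_TYPES = {
    "entrance", "exit", "setting", "sound", "delivery", "whoto",
    "action", "location", "other", "optional", "uncertain",
}

# Hypothetical pattern: an opening <SD> tag whose only attribute is a quoted t.
SD_OPEN = re.compile(r'<SD\s+t="([^"]*)"\s*>')


def unknown_sd_types(text):
    """Return every comma-separated t value on an <SD> tag that is not in SD_TYPES."""
    unknown = []
    for match in SD_OPEN.finditer(text):
        for value in match.group(1).split(","):
            value = value.strip()
            if value not in SD_TYPES:
                unknown.append(value)
    return unknown


# The first two directions come from the examples above; the third is a
# deliberately wrong value added for illustration.
sample = (
    '<SD t="exit"><I>Exeunt.</I></SD> '
    '<SD t="entrance, setting"><I>Enter Macbeths Wife alone with a Letter</I>.</SD> '
    '<SD t="exeunt"><I>Exeunt.</I></SD>'
)
print(unknown_sd_types(sample))  # ['exeunt']
```

An unquoted value such as t=sound is simply not matched by this pattern, so a separate check for missing quotation marks would still be needed.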

A 1.3. Printing elements

a) Page

<PAGE n="[number]"> </PAGE> defines the extent of a printed page. n gives the page number.
<SIG> </SIG> Page Signature (encloses the printed signature, and appears at the end of the page, where it will be displayed). Note that the signature itself may include other tags where necessary (if, for example, an italic letter is used).
n="[signature]" The signature in its normalized and accurate form. This will include all signatures that are implied rather than actually printed, and must include either "r" or "v" for recto and verso.
Examples: <SIG n="A2v"></SIG><BR> <SIG n="aaa1r"><LS>aaa</LS></SIG>
<CW> </CW> Catchword.
<RULE /> Rule. (Note that the final forward slash is required to "close" the tag.)
<RT> </RT> Running title.
<PN> </PN> Page number as printed.
<COL n="0"></COL> Defines pages with no columns in a document that elsewhere has columns; where there are columns, indicates a print element that spans both columns.
<COL n="1"></COL> Column 1 (Folio). Placed at the beginning of the column.
<COL n="2"></COL> Column 2 (Folio).
<CL> </CL> Closing (e.g. Finis).
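
The requirement that the n attribute of <SIG> must end in "r" or "v" can be checked with a short script. This Python sketch rests on an assumption that is not stated in these guidelines: that a normalized signature consists of a run of letters, a leaf number, and a final "r" or "v". Gatherings signed with symbols would need a wider pattern. The name is_normalized_signature is hypothetical.

```python
import re

# Assumption: letters for the gathering, digits for the leaf, then "r" or "v".
SIG_N = re.compile(r"[A-Za-z]+[0-9]+[rv]")


def is_normalized_signature(value):
    """Return True if the whole value has the assumed letters-number-r/v shape."""
    return SIG_N.fullmatch(value) is not None


for candidate in ("A2v", "aaa1r", "A2"):
    print(candidate, is_normalized_signature(candidate))
```

Both signatures from the examples above pass; "A2" fails because it lacks the recto or verso marker.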

b) Typesetting

<I> </I> Italic text. Note: intermediate spaces should be italicized.
<BLL> </BLL> Black Letter.
Note: for texts basically in black letter, use the metadata tag <META name="ISE.DefaultFont" content="BLL"/>.
<R> </R> Roman text. Note: this tag will only be used in texts for which black letter is the default.
<LS> </LS> Letter-Spaced (e.g., "G O D").
<SUP> </SUP> Superscript characters (this follows the HTML 3.2 convention).
<SUB> </SUB> Subscript characters.
<J> </J> Justified line(s). Only fully justified lines are tagged. Note that verse lines that reach to the end of the column should not be tagged as justified (though many draft texts do so), since these are not justified in the way that prose lines are.
<HW> </HW> Hung Word(s). Note that the hung word should be restored to the line it continues; the "type" indicates whether it was originally displaced to the previous or next line.
t="prev | next" The type of hung word indicates whether it appears on the previous or next line from the line it continues.
<C> </C> Centered text. As in HTML this tag applies to a whole line. Each centered line should be tagged.
<RA> </RA> Right Aligned text. This tag can be applied to a separate part of a line, so is the equivalent of a tab rather than right alignment for the whole line.
<ORNAMENT /> Ornament (will be shown by a graphic in the HTML version).
<ORNAMENT> </ORNAMENT> Ornamental drop-letter.
drop="n" The number of lines of type taken up by the drop-letter. 
<L></L> Blank line. (Previously tagged <BL>.)

c) Abbreviations

There are three kinds of abbreviations that involve early type-forms: 

i) Brevigraphs or single characters representing a longer word, as in "y" with a small "e", "t" or "u" above it to represent "the", "that" or "thou" respectively. This single character should be tagged {ye}, {yt} or {yu}, depending on the case. A similar case is the abbreviation for "which" with a small "c" over the "w", tagged as {wc}.

ii) Superscripts or small characters printed above the line to indicate a longer word, as in "Mr" for "Master." This should be tagged M<SUP>r</SUP>.

iii) Diacritics or accents used to indicate additional characters, such as a macron accent over a vowel to indicate that an "m" or "n" should follow. 

Editors should always expand abbreviations, while also tagging their abbreviated form.

Abbreviations in the old-spelling texts are tagged in a form that indicates the full, or expanded, version of the word as part of the tagging. The basic syntax is this: <ABBR expan="[full word]">[abbreviation]</ABBR>

Examples of the three kinds of abbreviations outlined above:

<ABBR expan="the">y<SUP>e</SUP></ABBR>
<ABBR expan="that">y<SUP>t</SUP></ABBR>
<ABBR expan="which">w<SUP>c</SUP></ABBR>

<ABBR expan="Master">M<SUP>r</SUP></ABBR>

<ABBR expan="man">m{_a}</ABBR>
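
Because each <ABBR> tag carries both forms of the word, either reading can be recovered from the same transcription. The Python sketch below illustrates the idea; it is not the project's conversion software. The function names are hypothetical, and it assumes that expan is the only attribute and that <ABBR> tags are never nested.

```python
import re

# Assumption: expan is the only attribute and ABBR tags do not nest.
ABBR = re.compile(r'<ABBR expan="([^"]*)">(.*?)</ABBR>', re.DOTALL)


def expanded_reading(text):
    """Replace each <ABBR> element with the value of its expan attribute."""
    return ABBR.sub(lambda m: m.group(1), text)


def abbreviated_reading(text):
    """Drop the <ABBR> wrapper but keep the abbreviation as transcribed."""
    return ABBR.sub(lambda m: m.group(2), text)


# Two of the examples above, joined by a space for illustration.
sample = '<ABBR expan="the">y<SUP>e</SUP></ABBR> <ABBR expan="man">m{_a}</ABBR>'
print(expanded_reading(sample))     # the man
print(abbreviated_reading(sample))  # y<SUP>e</SUP> m{_a}
```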

d) Characters and ligatures

{-} Soft hyphen (e.g. where a single word is split across two lines).
{s} Long s.
{r} and {R} Rotunda r (ꝛ) or R (Ꝛ) 
{P} Paragraphus, capitulum, or pilcrow (¶).
{sm} Section mark (§).
{th} and {TH} Thorn (þ) and (Þ)
{ye} {yt} {yu} {wc}  Brevigraphs for "the", "that", "thou", and "which". See subsection (c) for usage.
{qp} Quoth or quod brevigraph (ȹ)
{^e} letter with circumflex (e.g. ê).
{"e} letter with dieresis (e.g. ë).
{'e} letter with acute accent (e.g. é).
{`e} letter with grave accent (e.g. è).
{_e} letter with macron accent (e.g. ē).
{~n} letter with tilde accent (e.g. ñ).
{ae} and {AE} æ and Æ digraphs.
{oe} and {OE} œ and Œ digraphs.
ligatures {as} {ct} {ee} {ffi} {ffl} {ff} {fi} {fl} {fr} {ij} {is} {oe} {oo} {pp} {us} {st}
ligatures with long s {s} {{s}h} {{s}i} {{s}l} {{s}p} {{s}t} {{s}{s}i} {{s}{s}l} {{s}{s}}
Black Letter ligatures {ee} {oo}
vv/VV used for w/W {w} and {W}
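
The brace codes listed above can be mapped to Unicode characters for display. The following Python sketch shows one possible approach and makes several assumptions of its own: accents are built from combining marks, ligature codes fall back to their plain letters, and the brevigraph codes and {-} are left untouched because their rendering depends on context. The name decode and the tables are hypothetical.

```python
import re
import unicodedata

# Codes with a direct Unicode equivalent, taken from the list above.
SPECIAL = {
    "s": "\u017f",   # long s
    "r": "\ua75b",   # rotunda r
    "R": "\ua75a",   # rotunda R
    "P": "\u00b6",   # pilcrow
    "sm": "\u00a7",  # section mark
    "th": "\u00fe", "TH": "\u00de",  # thorn
    "ae": "\u00e6", "AE": "\u00c6",
    "oe": "\u0153", "OE": "\u0152",
}

# Accent prefix mapped to the matching Unicode combining mark.
COMBINING = {
    "^": "\u0302", '"': "\u0308", "'": "\u0301",
    "`": "\u0300", "_": "\u0304", "~": "\u0303",
}

# Assumption: brevigraphs are left as codes, since no single character is given for most.
KEEP = {"ye", "yt", "yu", "wc", "qp"}

CODE = re.compile(r"\{([^{}]+)\}")


def _replace(match):
    body = match.group(1)
    if body in KEEP:
        return match.group(0)
    if body in SPECIAL:
        return SPECIAL[body]
    if len(body) == 2 and body[0] in COMBINING:
        return unicodedata.normalize("NFC", body[1] + COMBINING[body[0]])
    if body.isalpha():
        return body  # ligature: fall back to the plain letters
    return match.group(0)  # anything else, such as {-}, is left as it is


def decode(text):
    """Resolve brace codes from the innermost outwards until nothing changes."""
    previous = None
    while previous != text:
        previous = text
        text = CODE.sub(_replace, text)
    return text


print(decode("A{ct}us Secundus, Sc{oe}na Prima"))
```

Repeating the substitution until the text stops changing is what allows nested codes such as {{s}h} to resolve: the inner {s} becomes a long s on the first pass, and the outer ligature is unwrapped on the second.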

e) Word spacing

Word spacing will be normalized throughout, with a single space separating all words. 

f) Indents and significant spaces

<SPACE n="[number]" /> Indicates significant space to be left in the text. The most common instance of this will be in formatting the lines of verse in a song or sonnet, where some lines will be indented further than others. The number of m-spaces should be indicated. There is no </SPACE> tag (the forward slash at the end of the tag is the equivalent of a closing tag). In the modern text, prose and verse will automatically be indented when the <PROSEQUOTE> or <VERSEQUOTE> tag is used.
<INDENT n="[number]"></INDENT> In the old-spelling transcription, indicates indentation for a whole block of text (prose or verse).
n="[number]" The number of m-spaces indented. Further indentation in verse should be shown by the use of the <SPACE n="[number]" /> where again n is the number of m-spaces.

A 1.4. Manuscript elements

<HAND> </HAND> Hand or scribe.
id="[name]" The standard name for the hand, e.g. "Hand A". Where desirable, names may be further distinguished by number or letter (e.g. "Munday A" and "Munday B", or "A1" and "A2") to indicate the same hand used at a later time.
ink="[colour]" Ink colour (where important to note). Colour is black by default; other options include "brown", "green", "blue", and "red".
<SCRIPT> </SCRIPT> Script.
t="[type]" The type of script used. Examples include "secretary", "roman", "italic", and "round".
<INS> </INS> Insertion.
loc="[location]" above | below | inline | marginLeft | marginRight
Unless specified, insertions will be treated as in-line by default.
hand="[name]" Standard name of the hand responsible for the insertion if different from the original text.
<DEL> </DEL> Deletion.
hand="[name]" Standard name of the hand responsible for the deletion if different from the hand of the deleted text.
<OBLIT> </OBLIT> Obliterated, damaged, or lost letters.
Indicate by periods the number of lost letters, e.g. POE<OBLIT>...</OBLIT>AL

A 1.5. References and modern act, scene divisions

<TLN n="[number]" /> Through Line Number. The basic method of internal reference for the editions will be the TLN number. Where a quarto or modern edition omits material the numbers will be omitted; where they add material the numbers will be added decimally (<TLN n="1033.1" /> etc.); where the line division varies from the copy-text the TLN number will be that of the first word of the line.
<ACT> </ACT> Act division as in the modern edition.
n="[number]" The number of the act. If the original includes an act division which has been retained, the notation would be thus: </ACT><ACT n="2"> <LD>Actus Secundus</LD>.
<SCENE> </SCENE> Scene division as in the modern edition.
n="[number]" Example: </SCENE>
</ACT><ACT n="2">
<LD>A{ct}us Secundus, Sc{oe}na Prima</LD>
<SCENE n="1">
<SD><I>Enter Sandman</I>.</SD> ...
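
The TLN numbering rule described at the start of this section (gaps where material is omitted, decimal additions where material is added) implies that the numbers should always increase through a text. The Python sketch below checks this. It assumes that the part after the decimal point is a simple counter, so that 1033.10 follows 1033.9; the guidelines do not say how such cases are numbered. The names tln_key and out_of_order are hypothetical.

```python
import re

# Matches <TLN n="1033" /> and decimal additions such as <TLN n="1033.1" />.
TLN = re.compile(r'<TLN n="([0-9]+(?:\.[0-9]+)?)"\s*/>')


def tln_key(value):
    """Turn "1033.1" into (1033, 1) so that added lines sort after their base line."""
    whole, _, added = value.partition(".")
    return (int(whole), int(added) if added else 0)


def out_of_order(text):
    """Return each TLN value that does not come after the one before it."""
    numbers = TLN.findall(text)
    return [
        current
        for previous, current in zip(numbers, numbers[1:])
        if tln_key(current) <= tln_key(previous)
    ]


# Placeholder text "a", "b", "c" stands in for the transcribed lines.
sample = '<TLN n="1033" />a <TLN n="1033.1" />b <TLN n="1035" />c'
print(out_of_order(sample))  # []
```

Gaps such as the jump from 1033.1 to 1035 are accepted, since omitted material leaves its numbers unused.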

A 1.6. Multiple tags and hierarchical structures

The tags in the Renaissance texts of the Internet Shakespeare Editions are on the whole not hierarchical, since they are representational rather than logical. Thus one tag can cross the boundary of another where necessary. Nonetheless, it is good manners to keep the boundaries logical and consistent wherever possible. In this example the overall <S> </S> tag encloses both the speech prefix (which is logically necessary) and the tag that indicates a justified line. In turn the <J> </J> tag encloses the speech prefix tag.

<TLN n="452" /><S><J><SP><I>Nurse</I>.</SP>
Goe Gyrle, seeke happie nights to happy daies.</J></S>

In the next passage, the hung word complicates the process because it also ends a speech:

<TLN n="19" /><S><J><SP><I>Samp</I>.</SP> True, and therefore women being the weaker</J>
<TLN n="20" /><J>Vessels, are euer thrust to the wall: therefore I will push</J>
<TLN n="21" /><J><I>Mountagues</I> men from the wall, and thrust his Maides to</J>
<TLN n="22" />the wall</S>
<TLN n="23" /><S><J><SP><I>Greg</I>.</SP> The Quarrell is betweene our Masters, and vs</J> <HW t="prev"><RA>(their men.</RA></HW></S>
<TLN n="24" /><S><J><SP><I>Samp</I>.</SP> 'Tis all one, I will shew my selfe a tyrant: when</J>
<TLN n="25" /><J>I haue fought with the men, I will bee ciuill with the</J>
<TLN n="26" />Maids, and cut off their heads.</S>

This example includes a stage direction divided between two lines, right aligned.

<TLN n="721" />And euery Greeke of mettell let him know,
<TLN n="722" />What Troy meanes fairely, shall be spoke alowd. <SD><RA><I>Sound</I> <HW t="next"><I>trumpet</I>.</HW></RA></SD>
<TLN n="724" />We haue great <I>Agamemnon</I> heere in Troy,
<TLN n="725" />A Prince calld <I>Hector</I>, <I>Priam</I> is his father,

Where there is a combination of tags indicating logical structures (speech, stage direction etc.) and physical characteristics (justification, etc.), the logical tags should wherever possible surround the physical tags, as in the examples above.
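
Since tags are allowed to cross one another, a transcription cannot be assumed to be well-formed XML, and an ordinary XML parser may reject it. The following Python sketch is one rough way to list the places where a tag is closed while a tag opened after it is still open, so that an editor can decide whether each crossing is intended. The name crossings is hypothetical, and the sketch assumes that tag names are written in capitals, as they are throughout this appendix.

```python
import re

# Opening, closing, and self-closing tags with capitalised names.
# Comments (<!-- ... -->) do not match because "!" is not a capital letter.
TAG = re.compile(r"<(/?)([A-Z]+)\b[^<>]*?(/?)>")


def crossings(text):
    """Return (closed_tag, still_open_tag) pairs where tag boundaries cross."""
    open_tags = []
    found = []
    for match in TAG.finditer(text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # e.g. <TLN n="452" /> has no content to cross
        if not closing:
            open_tags.append(name)
        elif name in open_tags:
            # Find the most recent opening of this tag.
            index = len(open_tags) - 1 - open_tags[::-1].index(name)
            for inner in open_tags[index + 1:]:
                found.append((name, inner))
            del open_tags[index]
    return found


nested = (
    '<TLN n="452" /><S><J><SP><I>Nurse</I>.</SP>\n'
    "Goe Gyrle, seeke happie nights to happy daies.</J></S>"
)
print(crossings(nested))                 # []
print(crossings("<J><S>text</J></S>"))   # [('J', 'S')]
```

The first call uses the Nurse example above, which is fully nested; the second is an artificial case in which the physical tag <J> closes before the logical tag <S> that was opened inside it.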
