4.6 ABBREVIATIONS

An abbreviation is a symbolic character, or string of characters, that stands for a (normally longer) string of letter-number symbols. Any expansion of an abbreviation involves interpretation, a choice among alternatives. For this reason, it is important to record the brevigraph character itself, independent of whatever expansion that character is given. In RET coding, vertical bars surround any abbreviation. Its expansion appears within a tag or ignore-delimiters.

Anthony Petti distinguishes five kinds of abbreviations. These are

contractions (words written without certain letters and sometimes lacking any mark of abbreviation, e.g., Mr for Master),
curtailments or suspensions (a subset of contractions: words whose last letters are dropped and with which some abbreviating mark often appears, e.g., Lo: for Lord),
brevigraphs (words with special characters standing for one or more letters in a word, e.g., _a (i.e., an) in `man'),
superior or superscript letters (sometimes with omitted characters, e.g., w+t+ for with, and other times without, e.g., they+r+), and finally
elisions, which are either curtailments or brevigraphs employed, generally with an apostrophe, to indicate that part of a word was not sounded (e.g., 'tis).

RET treats contractions and curtailments as the same kind of abbreviation but classes separately any form in which a brevigraph may be found, that is, a character that never stands on its own in a common word but only signals abbreviated characters. (The interpretation of the brevigraph itself appears in a separate subsection.) Thus RET classifies the form Lo: as a contraction-curtailment but the form Lord, (where d, perhaps abbreviates de) as a word with a brevigraph, whereas Petti treats both as curtailments.

TEI represents abbreviations by enclosing them in the <abbrev expan=""> ... </abbrev> tag (TEI P3, pp. 847-49) or in the <expan abbr="">...</expan> tag (pp. 967-69). In the first form the abbreviated characters appear within the tag, often as entity references, and their expansion is given as the value of the expan attribute. The result is hard to read.

For that reason RET texts employ vertical-bar delimiters instead of the TEI <abbrev> tag.
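
The vertical-bar convention is easy to process mechanically. The Python sketch below is illustrative only and is not part of the RET guidelines: the function name find_abbreviations and the sample line are invented, and it assumes that an abbreviation never contains a space or a vertical bar.

```python
import re

# Assumption: an abbreviation is a run of characters, free of spaces and
# bars, enclosed in a pair of vertical bars, e.g. |w+t+| or |Lp:|.
ABBREV = re.compile(r"\|([^|\s]+)\|")


def find_abbreviations(line):
    """Return the verbatim abbreviated forms found in one line of text."""
    return ABBREV.findall(line)


# Invented sample line using forms listed in section 4.6.1.
sample = "the |K+t+| rode |w+t+| |o+r+| |Lp:|"
print(find_abbreviations(sample))
```

Because the verbatim form is captured unchanged, any expansion can be recorded separately and revised later without touching the transcription.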

There are very few ISO entities for abbreviations. One exception is the ampersand, for which the entity &amp; exists.

4.6.1 Contractions and curtailments

Although relatively few English words are contracted in the Renaissance, they vary considerably according to which letters are cut, whether a mark of contraction appears, and which such mark occurs (e.g., period, apostrophe, tittle, or colon). The RET convention is to surround the verbatim contraction with vertical bars. Font changes and special characters appear as elsewhere in the text. Here are some examples. They include some common Latin contractions, but recommendations on how to handle the great variety of forms found in Cappelli are beyond the scope of these guidelines.

Note that these forms have no ISO entity references, except for the ampersand (and thus for et cetera).

Abbrev. Code   Description

|{_bn}|        shortened form of bene or benedicite
|Esq+re+|      shortened form of Esquire
|{_hu}c|       shortened form of Latin hunc
|Ier{_lm}|     shortened form of Latin Ierusalem
|Iero{9}|      shortened form of Latin Ieronimus
|Ihc|          shortened form of Ihesu
|Ihs|          ditto
|Ihu|          ditto
|Ih{-u}|       ditto
|I{_hu}|       ditto
|Ihus|         ditto
|I{_hu}s|      ditto
|Io{_hn}|      shortened form of Iohan
|K+t+|         shortened form of Knight
|l+i+|         shortened form of libri (`pounds')
|Lp:|          shortened from Lordship
|l+re+|        shortened from letter
|lres|         shortened from letters
|Ma:|          shortened from Maiestie
|M+r+|         shortened from Master
|M+rs+|        shortened from Mistress
|n{Ao}|        shortened form of nota
|o{_mi}|       shortened form of omnium
|o+r+|         shortened from our
|Poli{4}|      shortened form of Policratici (Hengwrt)
|Ro_bt|        shortened form of Robert
|S+r+|         shortened from Sir
|{the}|        single-letter abbreviation for the using thorn
|{th}e|        two-letter abbreviation for the using thorn
|{th}+e+|      two-letter abbreviation for the using thorn 
               and superscript e
|{ye}|         single-letter abbreviation for the using y 
|ye|           two-letter abbreviation for the using y 
|y+e+|         two-letter abbreviation for the using y and 
               superscript e
|{th}t|        single-letter abbreviation for that using thorn 
|yt|           single-letter abbreviation for that using y 
|{th}y|        single-letter abbreviation for they using thorn
|yy|           single-letter abbreviation for they using y 
|wh|           single-letter abbreviation for which  
|w+h+|         double-letter abbreviation for which using w and 
               superscript h
|wt|           single-letter abbreviation for with
|w+t+|         two-letter abbreviation for with using w and superscript t
|Wm|           shortened from William
|yr|           single-letter abbreviation for your
|y+r+|         double-letter abbreviation for your using y and 
               superscript r
|&c'|          abbreviation for et cetera
|{9A}|         abbreviation for contro
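
A few rows of the table above can be turned into a lookup that pairs each verbatim contraction with one expansion. This is a sketch under stated assumptions, not an RET tool: the names EXPANSIONS and gloss and the sample line are invented, and only six rows are copied from the table.

```python
import re

# Six rows copied from the table above; any other form is left unresolved.
EXPANSIONS = {
    "Esq+re+": "Esquire",
    "K+t+": "Knight",
    "o+r+": "our",
    "S+r+": "Sir",
    "w+t+": "with",
    "Wm": "William",
}

# Assumption: codes contain no spaces and no vertical bars.
ABBREV = re.compile(r"\|([^|\s]+)\|")


def gloss(line):
    """Pair each verbatim contraction with its expansion, or with None."""
    return [(form, EXPANSIONS.get(form)) for form in ABBREV.findall(line)]


# Invented sample line; |lres| is deliberately absent from EXPANSIONS.
print(gloss("|S+r+| Iohn, |K+t+|, |w+t+| his |lres|"))
```

Returning None for an unlisted form keeps the editor's decision explicit instead of guessing at a reading.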

4.6.2 Brevigraphs

Brevigraphs consist either of combinations of abbreviating mark and base-line character, or of otherwise unfamiliar shapes themselves on the base-line.

Often editors distinguish the mark of abbreviation from the character or characters to which it is somehow attached, either by prefixing, suffixing, superscripting, subscripting, or writing through. RET codes represent the mark of abbreviation, and the character or characters under which it appears, as a single thing whenever possible. This is done because marks such as the tittle or macron (taken by Petti, pp. 23-4, to abbreviate i, m, or n), the superscript hook (taken by Petti, p. 23, to abbreviate er, ir, ier, or ire), and the tail cannot be expanded in themselves; their function is dictated by the character to which they are attached. Thus RET does not always encode the brevigraph with the letters it may abbreviate, because these vary according to the context and are subject to interpretation.

Brevigraph codes have the following sequence of three elements:

vertical-bar delimiters, to distinguish brevigraphs from character-strings that are not abbreviations (e.g., |/ll| and ll),
a code for the mark, and
the base-line letter or letters to which the mark applies.

To facilitate the reading of these codes, the mark is placed either before or after the base-line letters, whichever reads more easily, and the two are distinguished by case. Where marks are letter-forms, they are upper-case; base-line letters are always lower-case.

      +   mark   +   BASE-LINE LETTER(S)   +

      +   BASE-LINE LETTER(S)   +   mark   +

Sometimes a brevigraph consists of only the mark, which is itself the base-line letter (e.g., swash e, ampersand, and other Tironian marks for con and rum). The absence of a capital letter-form in a brevigraph code normally signals such a mark.
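
The case rule can be applied mechanically. The following Python sketch is illustrative only; the function name split_by_case is invented, and it looks at letters alone, so non-letter marks (such as _ or ') and digits are ignored, and a code with an upper-case base-line letter would be misread.

```python
def split_by_case(code):
    """Split a brevigraph code into (letter-form marks, base-line letters).

    Upper-case letters are taken as marks and lower-case letters as
    base-line letters; every other character is ignored.
    """
    marks = "".join(ch for ch in code if ch.isupper())
    base = "".join(ch for ch in code if ch.islower())
    return marks, base


# Codes taken from the brevigraph table in this section.
for code in ("gA", "Aun", "oO", "es"):
    print(code, split_by_case(code))
```

For the code es the first element is empty: no capital letter-form is present, which is the normal signal that the mark is itself the base-line character.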

Codes for the marks are arbitrary but selected to imitate the shape of the original character and to facilitate reading. In general, they are

_ (underline)               macron above letter
~                           tilde or wavy line above letter
` (opening quot. mark)      like raised i without the dot
' (apostrophe)              ascending stroke hooking backwards over 
                            the letter
, (comma)                   tail descending from letter
A                           curled or flourished form of a, 
                            sometimes like a u, 
                            sometimes a serrated line
U                           curled or flourished form of u, sometimes
                            a serrated line

For example, a raised hook at the end of a letter is represented by an apostrophe, and where shapes suggest numbers, letter-number combinations are employed (e.g., 9, r2, etc.). The variant forms of p and r are least satisfactorily coded. Here a second letter is added to suggest the shape of the descender of the letter.
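
The general mark codes listed above can be kept in a small table so that a code can be glossed automatically. This is a minimal sketch, assuming the seven marks listed are the only ones of interest; the names MARKS and describe are invented, and number-shaped codes such as 9 are deliberately left out.

```python
# The seven general mark codes, with descriptions shortened from the list above.
MARKS = {
    "_": "macron above letter",
    "~": "tilde or wavy line above letter",
    "`": "like raised i without the dot",
    "'": "ascending stroke hooking backwards over the letter",
    ",": "tail descending from letter",
    "A": "curled or flourished form of a",
    "U": "curled or flourished form of u",
}


def describe(code):
    """List (mark, description) pairs for each known mark in a code."""
    return [(ch, MARKS[ch]) for ch in code if ch in MARKS]


print(describe("_hu"))
print(describe("c'"))
print(describe("p2"))  # no general mark: the digit only suggests a shape
```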

Abbrev. Code      Description      

|_a|               a-tittle:  line or double curve of  abbreviation, 
                     tittle or tilde over a, often indicating omitted n 
                     or m  
|_an|              an-tittle:  line or double curve of abbreviation, tittle 
                     or tilde over an, often indicating omitted u
|Am|               curled or flourished form of superscript u or a, 
                     sometimes only a serrated line, written above m, 
                     indicating iam (Petti, p. 24, terms this a u-form) 
                     or uam
|An|               curled or flourished form of superscript a, sometimes 
                     only a serrated line, written above n and often 
                     standing for ran
|Aun|              curled or flourished form of superscript a, sometimes 
                     only a serrated line, written above un, often indicating
                     aun but taken as a single unit in these guidelines
                     (Petti, p. 23)
|c|                superscript form for c or ac (Petti, p. 23)
|c'|               c ending with a raised hook, sometimes abbreviating cer
|c`|               c ending with a raised opening quotation mark or i, 
                     often standing for cri
|d,|               d with a descending hook, sometimes abbreviating de but 
                     more often otiose (Petti, p. 23)
|es|               so-called swash e, a form for terminal es, is, 
                     or ys with looped descender (Petti, p. 23)
|_e|               e-tittle:  line or double curve of abbreviation, tittle or
                     tilde over e, often indicating en or em 
|_eu|              eu-tittle: line or double curve of abbreviation, tittle or
                     tilde over eu, often indicating eum 
|g,|               g with a tail or sometimes a superscript hook (Petti, p.
                     23), sometimes taken for ge but generally otiose
|gA|               g with a curled or flourished u or a, sometimes only a
                     serrated line, written above g and standing normally 
                     for gra
|_hu|              hu-tittle: line or double curve of abbreviation, tittle or
                     tilde over hu, often indicating hum  
|_i|               i-tittle:  line or double curve of abbreviation, tittle or
                     tilde over i, often indicating omitted n or m 
|/ll|              double-l (the letter) crossed by a slanted or curved line
                     (Petti, p. 23), sometimes taken for lle but generally
                     otiose
|_m|               m-tittle:  line or double curve of abbreviation, tittle or
                     tilde over m, standing for various things 
|m'|               m with final raised hook often standing for mer 
|_mi|              mi-tittle:  line or double curve of abbreviation, tittle or
                     tilde over mi, often indicating mni 
|n'|               n with final raised hook often standing for omitted er and
                     other times for ne, in which case the mark is generally
                     otiose (Petti, p. 24)
|~n|               n-tilde: n with tittle often standing for an
|9|                letter shaped like 9, Tironian abbreviation for initial
                     con or for terminal us or ous (Petti, pp. 23-4)
|_n|               n-macron:  n with final otiose raised line of abbreviation
                     or tittle 
|_o|               o-tittle:  line or double curve of abbreviation, tittle or
                     tilde over o , sometimes indicating omitted i, y, n, 
                     or m
|_on| or |_ou|     on-tittle or ou-tittle: line or double curve of
                     abbreviation, tittle or tilde over on or ou, often
                     abbreviating ion or ioun (Petti, p. 23)
|oO|               o- or a-like mark written above o to abbreviate our (Petti,
                     p. 24)
|_p|               p with macron accent (Oxford), sometimes taken for pe but
                     generally thought otiose
|p'|               p with a superscript hook, often standing for pre (Petti,
                     p. 24)
|pp'|              pp with a superscript hook, often standing for ppre 
|p`|               p with a superscript stroke like an opening quotation 
                     or i, infrequently abbreviating pri (Petti, p. 24)
|p+| or |P+|       p or P with cross bar through descender, often standing 
                     for par or per (Petti, p. 24)
|pO|               p with a raised o or a, normally standing for pur
|p1|               p with convex curl (frown) through descender, made to 
                     the left and down and around through the shaft 
                     towards the right, often standing for par and per
|p2|               p with a loop made from the base-line to the left and 
                     down to meet the midpoint of the descender, yet 
                     not crossing it to make either a convex or concave 
                     curl; normally standing for pro
|p2p|              p with a loop made from the base-line to the left and 
                     down to meet the midpoint of the descender (yet not
                     crossing it to make either a convex or concave curl),
                     joined with a following p and normally standing for prop
|p3|               p with concave curl (smile) through descender, made from 
                     the bottom up to the left and around through the 
                     shaft towards right, often standing for pro (the 
                     bottom half of 0 is concave, like an o with a tail 
                     given a quarter counterclockwise turn)
|q/|               q with a slanted down-to-up cross through the descender,
                     often contracting quod (Petti, p. 24)
|q;|               q followed by a semi-colon, often abbreviating que 
                     (Petti, p. 24)
|q3|               q with 3 subscript to right, often abbreviating que 
                     (Petti, p. 24)
|~q3|              q with macron accent and 3 subscript to right (Oxford)
|'q3|              q with acute accent and 3 subscript to right (Oxford)
|_q|               q with cross-bar through the descender (Oxford)
|~_q|              q with macron accent and cross-bar through the 
                     descender (Oxford)
|~q|               q with macron accent (Oxford)
|r|                raised r, often abbreviating ur 
|r1'|              long secretary r, shaped like the number 1, with 
                     horizontal bar to left and final ascender with left 
                     curl, normally abbreviating er, re, etc.
|r3'|              secretary r shaped like 3 with a right hooked ascender,
                     raised to act as an abbreviation for er, ur, etc. 
                     (Petti, p. 24)
|4|                character, like a 4 (or a 2 with a 1 drawn through 
                     its base), generally standing for Latin rum (Petti, 
                     p. 24; Moxon)
|{s}8|             long-s with double curl made through descender to form 
                     an 8-shape, or with a / made through its descender, 
                     often an abbreviation of ser, sir, sur, or syr (Petti, 
                     p. 24)
|t'|               t with a superscript hook, standing often for ter (Petti, 
                     p. 24)
|_u|               u-tittle:  line or double curve of abbreviation, tittle 
                     or tilde over u, often indicating omitted n or m  
|u'|               u with a superscript hook, standing often for uer
|v'|               v with a superscript hook, standing often for ver
|y|                letter like OE yogh, 3, or z that often indicates
                     abbreviated final es 
|x'|               x with a superscript hook, possibly standing for xe
|&|                ampersand in all forms, short for and; including 
                     the versions of Tironian nota for et and the 
                     related secretary form (Petti, p. 23, Oxford, Moxon)
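
Because a brevigraph may stand for several different letter-strings, a program can at best propose readings for an editor to choose among. The Python sketch below is hypothetical: the names CANDIDATES and readings are invented, only a handful of rows are copied from the table above, and the sample words are made up for illustration.

```python
# Candidate expansions copied from a few rows of the table above.
CANDIDATES = {
    "c'": ("cer",),
    "c`": ("cri",),
    "gA": ("gra",),
    "m'": ("mer",),
    "p'": ("pre",),
    "t'": ("ter",),
    "9": ("con", "us", "ous"),
    "4": ("rum",),
    "&": ("and",),
}


def readings(before, code, after=""):
    """Return every candidate spelling of a word containing one brevigraph.

    `before` and `after` are the plain letters on either side of the
    brevigraph; an unknown code yields an empty list.
    """
    return [before + exp + after for exp in CANDIDATES.get(code, ())]


print(readings("", "9", "tent"))   # three candidates; only one is a word
print(readings("", "p'", "sent"))  # a single candidate
```

The function returns all candidates instead of picking one, since the choice depends on position and context and remains an act of interpretation.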

4.6.3 Elisions

Elisions are abbreviations that indicate elided pronunciation. The SGML tag <term type="elision" expan=""> is used.

<term type="elision" expan="could not">couldn't</term>
<term type="elision" expan="I will">I'll</term>
<term type="elision" expan="It is">'tis</term>
<term type="elision" expan="will not">wont</term>
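
Tags of this shape can be generated from pairs of elided form and expansion. The sketch below is illustrative only; the function name elision_tag is invented, and it relies on the standard-library helpers quoteattr and escape to quote the attribute value and the element content.

```python
from xml.sax.saxutils import escape, quoteattr


def elision_tag(form, expansion):
    """Build a <term type="elision"> tag for one elided form."""
    return '<term type="elision" expan=%s>%s</term>' % (
        quoteattr(expansion),  # adds the surrounding quotation marks
        escape(form),          # protects &, < and > in the verbatim form
    )


# The four pairs shown in the examples above.
pairs = [
    ("couldn't", "could not"),
    ("I'll", "I will"),
    ("'tis", "It is"),
    ("wont", "will not"),
]
for form, expansion in pairs:
    print(elision_tag(form, expansion))
```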

In COCOA encoding, the number or hash sign # is used to split the elided words into two. Elision is thereby distinguished from wrongly joined words, for which % is used (as in in%the).

couldn't as could#n't
I'll as I#'ll
'tis as 't#is
wont as wo#nt
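
The two COCOA separators can be told apart with a few lines of code. This is a minimal sketch, assuming a token contains at most one kind of separator; the function name split_cocoa and the labels "elision", "joined" and "plain" are invented for illustration.

```python
def split_cocoa(token):
    """Split a COCOA-coded token at # (elision) or % (wrongly joined words).

    Returns a (kind, parts) pair; a token with neither sign is returned
    whole with the kind "plain".
    """
    if "#" in token:
        return "elision", token.split("#")
    if "%" in token:
        return "joined", token.split("%")
    return "plain", [token]


for token in ("could#n't", "'t#is", "in%the", "the"):
    print(token, split_cocoa(token))
```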