5.4 Word-level Tags
The final group of tags applies to individual words. Although RET editions are seldom tagged at the word level, texts analysed in this way can prove useful for corpus linguistics, the empirical study of language by means of electronic text databases.
5.4.1 SGML Word-level Tags
Variable-Attribute    Attribute Values                     End Tag
*form type=           simple, lemma, variant, compound,    </form>
                      derivative, inflected, phrase
      category=       colour, tool, etc.
      gender=         f, m, n
      inflect=        poss, etc.
      modernsp=
      originsp=
      pos=            noun, det, etc.
      syntact=        subj, pred, etc.
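As an illustration of how such tagging might be read by a program, the following Python sketch extracts each tagged word together with its attributes. It is only a sketch under stated assumptions: the attribute names come from the table above, but the double-quoted attribute values, the sample line, and the names parse_forms, FORM_ATTRS, FORM_RE and ATTR_RE are hypothetical and are not drawn from any RET edition.

```python
import re

# Attribute names from the table above. Quoting values with double
# quotes is an assumption made for this sketch.
FORM_ATTRS = {"type", "category", "gender", "inflect",
              "modernsp", "originsp", "pos", "syntact"}

FORM_RE = re.compile(r"<form\b([^>]*)>(.*?)</form>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def parse_forms(text):
    """Return a list of (word, attributes) pairs, one per <form> element."""
    results = []
    for attr_text, word in FORM_RE.findall(text):
        attrs = {}
        for name, value in ATTR_RE.findall(attr_text):
            # Reject attributes that the table does not list.
            if name not in FORM_ATTRS:
                raise ValueError(f"unknown <form> attribute: {name}")
            attrs[name] = value
        results.append((word, attrs))
    return results

# Hypothetical sample line, invented for illustration only.
sample = '<form type="inflected" pos="noun" modernsp="loves">loues</form>'
print(parse_forms(sample))
```

A regular expression is enough for flat, non-nested tags of this kind; a text in which form elements nest inside one another would need a proper SGML parser instead.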
5.4.2 COCOA Word-level Tags
Some of these are employed by TACT.
Variable    Sample Value    Description
raw         <raw ->         original spelling of the word in the text
lemma       <lemma ->       dictionary word-form or headword
pos         <pos ->         part of speech
modern      <modern ->      normalized or modernized spelling
concept     <concept ->     member of class
gend        <gend ->        gender
inflect     <inflect ->     morphological inflection
syntact     <syntact ->     grammatical role
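To show how tags of this shape might be separated from the running text, here is a short Python sketch. It assumes the usual COCOA pattern of a variable name followed by a value inside angle brackets, and it assumes that the tags sit on the same line as the word they describe; the sample line and the names read_cocoa, COCOA_VARS and TAG_RE are hypothetical. It is not a description of how TACT itself processes these tags.

```python
import re

# Variable names from the table above.
COCOA_VARS = {"raw", "lemma", "pos", "modern", "concept",
              "gend", "inflect", "syntact"}

# Assumed tag shape: <variable value>
TAG_RE = re.compile(r"<(\w+)\s+([^>]*)>")

def read_cocoa(line):
    """Return (tags, text): recognised tag values and the line with all tags removed."""
    tags = {}
    for name, value in TAG_RE.findall(line):
        # Only variables listed in the table are recorded.
        if name in COCOA_VARS:
            tags[name] = value.strip()
    # Every tag-shaped span is stripped from the text, recognised or not.
    text = TAG_RE.sub("", line).strip()
    return tags, text

# Hypothetical sample line, invented for illustration only.
sample = "<lemma love> <pos noun> <modern loves> loues"
tags, text = read_cocoa(sample)
print(tags, text)
```

Keeping the tag values in a dictionary keyed by variable name makes it easy to check which parts of the scheme a given text actually uses.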
Not all texts in this corpus use everything in this tagging scheme, but whenever a text departs from this default, an explanation is given.