The table below lists the different categories used to annotate nodes in order to indicate probable annotation errors caused by parsing errors, together with the frequency of each flag in the SUC part of the Swedish Treebank. Note that the flags are not mutually exclusive and that a single annotation error often triggers more than one flag. In total, 30,588 sentences have at least one flag, while 43,655 sentences have no flag.
During the manual revision of the gold standard section of SUC, we observed that a correctly annotated sentence never gets a flag. Unfortunately, the inverse implication does not hold: a sentence without flags may still contain annotation errors. In practice, however, the absence of flags usually indicates that any remaining annotation errors in the sentence are minor.
Flag | Frequency | Explanation | Examples |
Unary | 2764 | Unary branching nonterminal node with a nonterminal child. Permitted exception: ROOT with XP child. | |
Nonterminal | 2353 | Node with (probably) incorrect phrase label. | Phrase label is ?? (unknown). Phrase label is XP but the phrase has the typical structure of a more specific phrase (e.g., PR+PA for PP). |
Function | 7318 | Node with incorrect function label. | Function label is ??. |
ForbiddenFunction | 7383 | Node with function label that does not occur with this phrase type in Talbanken. | Preposition with function head (HD) instead of prepositional (PR). |
ForbiddenChild | 13535 | Node with child whose function label does not occur under this phrase type in Talbanken. | Subject (SS) under noun phrase (NP). Functions other than MS and punctuation under ROOT. |
ForbiddenSibling | 27796 | Node with function label that is incompatible with the function label of a sibling node. | Multiple occurrences of IV, SS, OO inside the same phrase. Formal subject (FS) together with ordinary subject (SS). |
ObligatoryChild | 17942 | Node whose phrase label requires a child with a specific function label but no such child exists. | A noun phrase (NP) without either a head (HD) or at least one conjunct (CJ). |
ObligatorySibling | 592 | Node whose function label requires a sibling with a specific function label but no such sibling exists. | A logical subject (ES) without a formal subject (FS). |
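
To make the structural flags more concrete, the following is a minimal sketch of how four of them (Unary, ForbiddenSibling, ObligatoryChild, ObligatorySibling) could be checked on a toy tree. It is not the implementation used for the treebank. The `Node` class and the functions `flags_for` and `flag_tree` are hypothetical names introduced here for illustration, and the rules are limited to the examples given in the table. The flags that depend on label co-occurrence in Talbanken (ForbiddenFunction, ForbiddenChild) are omitted because they would require co-occurrence tables that are not reproduced here.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: phrase label (or POS tag) plus function label."""
    label: str
    function: str
    children: list[Node] = field(default_factory=list)

    @property
    def is_terminal(self) -> bool:
        return not self.children


def flags_for(node: Node, siblings: list[Node]) -> set[str]:
    """Return the flags raised on `node`, using only the table's example rules."""
    flags = set()
    sibling_functions = {s.function for s in siblings}
    child_functions = {c.function for c in node.children}

    # Unary: a single nonterminal child, except ROOT with an XP child.
    if len(node.children) == 1 and not node.children[0].is_terminal:
        if not (node.label == "ROOT" and node.children[0].label == "XP"):
            flags.add("Unary")

    # ObligatoryChild: an NP needs a head (HD) or at least one conjunct (CJ).
    if node.label == "NP" and not node.is_terminal:
        if not child_functions & {"HD", "CJ"}:
            flags.add("ObligatoryChild")

    # ForbiddenSibling: repeated IV/SS/OO, or FS together with SS.
    if node.function in {"IV", "SS", "OO"} and node.function in sibling_functions:
        flags.add("ForbiddenSibling")
    if {node.function} | sibling_functions >= {"FS", "SS"} and node.function in {"FS", "SS"}:
        flags.add("ForbiddenSibling")

    # ObligatorySibling: a logical subject (ES) requires a formal subject (FS).
    if node.function == "ES" and "FS" not in sibling_functions:
        flags.add("ObligatorySibling")

    return flags


def flag_tree(root: Node) -> list[tuple[str, str, list[str]]]:
    """Walk the tree and list (label, function, flags) for every flagged node."""
    flagged = []

    def visit(node: Node, siblings: list[Node]) -> None:
        flags = flags_for(node, siblings)
        if flags:
            flagged.append((node.label, node.function, sorted(flags)))
        for i, child in enumerate(node.children):
            visit(child, node.children[:i] + node.children[i + 1:])

    visit(root, [])
    return flagged


# Toy clause with two subjects, the second of which lacks a head.
tree = Node("S", "MS", [
    Node("NP", "SS", [Node("PN", "HD")]),
    Node("VB", "HD"),
    Node("NP", "SS", [Node("DT", "DT")]),
])
for label, function, flags in flag_tree(tree):
    print(label, function, flags)
```

In this toy clause the second noun phrase triggers both ForbiddenSibling and ObligatoryChild, which illustrates how a single annotation error can raise more than one flag.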