From LinguisticAnnotation
Revision as of 12:24, 11 October 2006 by Herrner (Talk | contribs)

NEGRA Corpus (Thorsten Brants)

The NEGRA corpus consists of approximately 10,000 sentences of German newspaper text. It is a treebank, but one with a novel annotation scheme for discontinuous constituents. An example tree, showing the visual format, the annotation format, and the Treebank equivalent, is available here. Annotate is a sophisticated tool that supports human-machine collaboration in the construction of syntactic trees.
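
To make the notion of a discontinuous constituent concrete, here is a minimal Python sketch. It does not use the NEGRA file format: the parent-pointer encoding, the node names (`VP1`, `VP2`, `S`), the example sentence, and the helper functions are all assumptions chosen for illustration. A constituent counts as discontinuous when the words it covers do not form an unbroken stretch of the sentence.

```python
from collections import defaultdict

# Hypothetical toy encoding, not the NEGRA export format.
# Integers are token positions; strings are phrase nodes.
TOKENS = ["Darüber", "muss", "nachgedacht", "werden"]

# child -> parent. "Darüber" and "nachgedacht" share a parent even though
# "muss" stands between them in the sentence.
PARENT = {
    0: "VP1", 2: "VP1",
    "VP1": "VP2", 3: "VP2",
    1: "S", "VP2": "S",
}


def children_index(parent):
    """Invert the child -> parent map into parent -> list of children."""
    kids = defaultdict(list)
    for child, par in parent.items():
        kids[par].append(child)
    return kids


def yield_positions(node, kids):
    """Return the sorted token positions covered by a node."""
    if isinstance(node, int):
        return [node]
    positions = []
    for child in kids[node]:
        positions.extend(yield_positions(child, kids))
    return sorted(positions)


def is_discontinuous(node, kids):
    """True if the node's tokens leave a gap in the sentence."""
    pos = yield_positions(node, kids)
    return pos[-1] - pos[0] + 1 != len(pos)


if __name__ == "__main__":
    kids = children_index(PARENT)
    for node in ("VP1", "VP2", "S"):
        words = [TOKENS[i] for i in yield_positions(node, kids)]
        print(node, words, "discontinuous:", is_discontinuous(node, kids))
```

In this toy tree, `VP1` covers positions 0 and 2 only, so a bracketing scheme that requires contiguous spans could not express it directly. Storing a parent for each node, rather than nested brackets, is one simple way to allow such gaps.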