(70 intermediate revisions by 4 users not shown)
Line 1: Line 1:
Linguistic Annotation
+
==Linguistic Annotation Wiki==
  
This page describes tools and formats for creating and managing ''linguistic annotations''. "Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entity" identification, co-reference annotation, and so on. The focus is on tools that have been widely used for constructing annotated linguistic databases, and on the formats commonly adopted by such tools and databases. This page began as a set of links to systems for speech annotation, and the coverage of textual annotation is still inadequate.
+
This wiki describes tools and formats for creating and managing ''linguistic annotations''. "Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entity" identification, co-reference annotation, and so on. The focus is on tools that have been widely used for constructing annotated linguistic databases, and on the formats commonly adopted by such tools and databases.
  
This page is no longer being actively maintained.  
+
This wiki is based on these webpages:
 +
*[http://www.ldc.upenn.edu/annotation/ Linguistic Annotation] (Steven Bird and Mark Liberman)
 +
*[http://www.ldc.upenn.edu/annotation/gesture/ Gesture Annotation] (Craig Martell)
  
Related pages: [http://www.language-archives.org/ Open Language Archives Community], [http://www.ldc.upenn.edu/exploration/ Linguistic Exploration], [http://www.ldc.upenn.edu/annotation/gesture/ Gesture Annotation], [http://nts.csrf.pd.cnr.it/biblos/annotazione-linguistica.htm Italian version of this page by Piero Cosi], [http://www.ldc.upenn.edu/annotation/specom.html Speech Annotation and Corpus Tools]
+
These are no longer maintained. Used with permission.
  
This page is the home of the [http://www.atr.co.jp/slt/cocosda/ COCOSDA] technical topic domain [http://www.atr.co.jp/slt/cocosda/td_cat.html Corpus Annotation Tools].
+
Related pages:
 +
*[http://www.language-archives.org/ Open Language Archives Community]
 +
*[http://www.ldc.upenn.edu/exploration/ Linguistic Exploration]
  
This page has been prepared in conjunction with our research on the logical structure of linguistic annotation, based on [http://www.ldc.upenn.edu/AG/ annotation graphs].
+
----
 +
For quicker reference, there is a separate page listing only the transcription and annotation [[Tools]].
  
[http://www.ldc.upenn.edu/annotation/database/ IRCS Workshop on Linguistic Databases <nowiki>[</nowiki>Dec 2001<nowiki>]</nowiki>]
+
'''A'''
 +
*[[Alembic Workbench]] (DT/U,W)
 +
*[[Annotation Graph Toolkit (AGTK)]] (TDP)
 +
*[[ANNIS]]
 +
*[[annotate]] (TD)
 +
*[[Anvil]] (TP)
 +
*[[ATLAS]] (FP)
 +
'''C'''
 +
*[[CA]] (P)
 +
*[[Callisto]] (TD/W,U,M)
 +
*[[C-BAS]] (T/W)
 +
*[[CES]] (FC)
 +
*[[CHILDES]] (FTDPRC/W,M)
 +
*[[CLinkA]] (T)
 +
*[[COCOSDA]] (R)
 +
*[[CSAE]] (TDC/W)
 +
*[[CSLU]] (TDPRC/W)
 +
*[[CWB/CQP]] (TP/U)
 +
'''D'''
 +
*[[DAISY]] (FTP/U,W)
 +
*[[DAMSL]] (FTRC/U,W)
 +
*[[Delta]] (TP/U,W)
 +
*[[Dexter]] (T)
 +
*[[DRI]] (R)
 +
'''E'''
 +
*[[EAGLES]] (FR)
 +
*[[ELAN]] (FTD)
 +
*[[E-MELD]] (R)
 +
*[[Emu]] (FTDP/U,W)
 +
*[[EXMARaLDA]] (FTDP/U,W,M)
 +
'''F'''
 +
*[[Festival]] (TD/U)
 +
*[[FLEX (Fieldworks Language Explorer)]]
 +
*[[FORM]] (C)
 +
*[[FSA's]] (TD)
 +
'''G'''
 +
*[[GATE]] (FTDP/U)
 +
*[[Gsearch]] (T/U)
 +
'''H'''
 +
*[[HIAT]] (FTDPRC/W)
 +
*[[HIAT-DOS]] (T) ([[HIAT-DOS (Review)|Review]])
 +
*[[Hyperlex]] (TP/U)
 +
'''I'''
 +
*[[Intex]] (F/U,W,M)
 +
*[[ISIP]] (TDP/U)
 +
*[[ISLE]]
 +
'''L'''
 +
*[[LACITO]] Linguistic Data Archiving Project (Boyd Michailovsky, John B. Lowe, Michel Jacobson) (FTD)
 +
*[[LAF]] Linguistic Annotation Framework
 +
*[[LDC]] (FTDPRC)
 +
*[[LT]] (T/U,W)
 +
'''M'''
 +
*[[MacShapa]] (TP)
 +
*[[MacVissta]] (TD)
 +
*[[MATE]] (FT)
 +
*[[MediaStreams]] (P)
 +
*[[MediaTagger]] (P)
 +
*[[MICASE]] (TDC/W)
 +
*[[MMAX]] (TD)
 +
*[[MPEG]] (FPR)
 +
*[[MPI]] (FT/U,W,M)
 +
*[[Multitext]] (F)
 +
'''N'''
 +
*[[NEGRA]] (FTPC/U)
 +
*[[NITE]]
 +
'''O'''
 +
*[[Observer]] (T/W)
 +
'''P'''
 +
*[[Partitur]] (FT)
 +
*[[PAULA]] (F)
 +
*[[Praat]] (TD/U,W,M)
 +
'''R'''
 +
*[[RSTTool]] (TD)
 +
'''S'''
 +
*[[SABLE]] (FP)
 +
*[[SAMPA]] (C)
 +
*[[SGREP]] (TDP/U,W)
 +
*[[SignStream]] (TDP/M)
 +
*[[SIL]] (TDPF/W,M)
 +
*[[SLAM]] (TDP/W)
 +
*[[SMDL]] (P)
 +
*[[SNACK]] (TDP/U,W,M)
 +
*[[SUSANNE]] (CP)
 +
*[[SyncWriter]] (T) ([[SyncWriter (Review)|Review]])
 +
'''T'''
 +
*[[TalkBank]] (R)
 +
*[[TASX]] (TD/U,W,M)
 +
*[[TEI]] (F)
 +
*[[Tipster]] (F)
 +
*[[Transcriber]] (TDP/U,W,M) ([[Transcriber (Review)|Review]])
 +
*[[Transana]] (T) ([[Transana (Review)|Review]])
 +
*[[Transformer]] (TDP)
 +
*[[TransTool]] (TD/U,W)
 +
*[[Treebank]] (C)
 +
*[[TSNLP]] (FT)
 +
*[[TUSNELDA]]
 +
'''U'''
 +
*[[Unicode]] (RC)
 +
'''V'''
 +
*[[Verbmobil]] (FC)
 +
*[[VisLab]] (TDP)
 +
*[[vPrism]] (T/W)
 +
 
 +
=Key=
  
'''KEY:'''
 
 
F: a systematically documented annotation format
 
  
T: an available tool for creation, display or search
+
T: an available tool for creation, display or search (W=Windows, U=Unix, M=MacOS)
  
 
D: a tool is downloadable
 
Line 25: Line 132:
  
 
C: methods and standards for transcribing content
 
 
 
*[[Alembic Workbench (David Day)]] (DT/U,W)
 
* ATLAS
 
* CA
 
* CES
 
* CHILDES
 
* CLinkA
 
* COCOSDA
 
* CSAE
 
* CSLU
 
* DAISY
 
* DAMSL
 
* Delta
 
* DRI
 
* EAGLES
 
* Emu
 
*[[EXMARaLDA]] (FTDP/U,W,M)
* Festival
 
* FSA's
 
* GATE
 
* Gsearch
 
* HIAT
 
* Hyperlex
 
* Intex
 
* ISIP
 
* ISLE
 
*[[LACITO]] Linguistic Data Archiving Project (Boyd Michailovsky, John B. Lowe, Michel Jacobson) (FTD)
 
* LDC
 
* LT
 
* XML
 
* MATE
 
* MICASE
 
* MPEG
 
* MPI
 
* Multitext
 
* NEGRA
 
* NITE
 
* Observer
 
* Partitur
 
* Praat
 
* SABLE
 
* SAMPA
 
* SGREP
 
* SignStream
 
* SIL
 
* SLAM
 
* SMDL
 
* SNACK
 
* SUSANNE
 
* TalkBank
 
* TASX
 
* TEI
 
* Tipster
 
* Transcriber
 
* TransTool
 
* Treebank
 
* TSNLP
 
* TUSNELDA
 
* Unicode
 

Latest revision as of 11:14, 15 November 2007

Linguistic Annotation Wiki

This wiki describes tools and formats for creating and managing linguistic annotations. "Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entity" identification, co-reference annotation, and so on. The focus is on tools that have been widely used for constructing annotated linguistic databases, and on the formats commonly adopted by such tools and databases.

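This wiki is based on these webpages:

  • Linguistic Annotation (Steven Bird and Mark Liberman)
  • Gesture Annotation (Craig Martell)

These are no longer maintained. Used with permission.

Related pages:

  • Open Language Archives Community
  • Linguistic Exploration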


For quicker reference, there is a separate page listing only the transcription and annotation Tools.

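A

  • Alembic Workbench (DT/U,W)
  • Annotation Graph Toolkit (AGTK) (TDP)
  • ANNIS
  • annotate (TD)
  • Anvil (TP)
  • ATLAS (FP)

C

  • CA (P)
  • Callisto (TD/W,U,M)
  • C-BAS (T/W)
  • CES (FC)
  • CHILDES (FTDPRC/W,M)
  • CLinkA (T)
  • COCOSDA (R)
  • CSAE (TDC/W)
  • CSLU (TDPRC/W)
  • CWB/CQP (TP/U)

D

  • DAISY (FTP/U,W)
  • DAMSL (FTRC/U,W)
  • Delta (TP/U,W)
  • Dexter (T)
  • DRI (R)

E

  • EAGLES (FR)
  • ELAN (FTD)
  • E-MELD (R)
  • Emu (FTDP/U,W)
  • EXMARaLDA (FTDP/U,W,M)

F

  • Festival (TD/U)
  • FLEX (Fieldworks Language Explorer)
  • FORM (C)
  • FSA's (TD)

G

  • GATE (FTDP/U)
  • Gsearch (T/U)

H

  • HIAT (FTDPRC/W)
  • HIAT-DOS (T) (Review)
  • Hyperlex (TP/U)

I

  • Intex (F/U,W,M)
  • ISIP (TDP/U)
  • ISLE

L

  • LACITO Linguistic Data Archiving Project (Boyd Michailovsky, John B. Lowe, Michel Jacobson) (FTD)
  • LAF Linguistic Annotation Framework
  • LDC (FTDPRC)
  • LT (T/U,W)

M

  • MacShapa (TP)
  • MacVissta (TD)
  • MATE (FT)
  • MediaStreams (P)
  • MediaTagger (P)
  • MICASE (TDC/W)
  • MMAX (TD)
  • MPEG (FPR)
  • MPI (FT/U,W,M)
  • Multitext (F)

N

  • NEGRA (FTPC/U)
  • NITE

O

  • Observer (T/W)

P

  • Partitur (FT)
  • PAULA (F)
  • Praat (TD/U,W,M)

R

  • RSTTool (TD)

S

  • SABLE (FP)
  • SAMPA (C)
  • SGREP (TDP/U,W)
  • SignStream (TDP/M)
  • SIL (TDPF/W,M)
  • SLAM (TDP/W)
  • SMDL (P)
  • SNACK (TDP/U,W,M)
  • SUSANNE (CP)
  • SyncWriter (T) (Review)

T

  • TalkBank (R)
  • TASX (TD/U,W,M)
  • TEI (F)
  • Tipster (F)
  • Transcriber (TDP/U,W,M) (Review)
  • Transana (T) (Review)
  • Transformer (TDP)
  • TransTool (TD/U,W)
  • Treebank (C)
  • TSNLP (FT)
  • TUSNELDA

U

  • Unicode (RC)

V

  • Verbmobil (FC)
  • VisLab (TDP)
  • vPrism (T/W)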

Key

F: a systematically documented annotation format

T: an available tool for creation, display or search (W=Windows, U=Unix, M=MacOS)

D: the tool is downloadable

P: there is a citeable paper which documents the format/system

R: other kinds of resource, such as books and associations

C: methods and standards for transcribing content
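
To show how the codes in the Key combine, here is a minimal Python sketch that splits an index entry such as "EXMARaLDA (FTDP/U,W,M)" into its name, its feature letters and its platform letters. The function name parse_entry, the regular expression and the two dictionaries are assumptions made for this illustration only; they are not part of this wiki or of any tool listed on it.

```python
import re

# Letter meanings, paraphrased from the Key above.
FEATURES = {
    "F": "documented annotation format",
    "T": "tool for creation, display or search",
    "D": "downloadable tool",
    "P": "citeable paper documenting the format/system",
    "R": "other resource (books, associations)",
    "C": "methods and standards for transcribing content",
}
PLATFORMS = {"W": "Windows", "U": "Unix", "M": "MacOS"}

# Matches a code such as "(FTDP/U,W,M)" or "(TD)"; commas between
# platform letters are optional.
CODE_RE = re.compile(r"\(([FTDPRC]+)(?:/([WUM](?:,?[WUM])*))?\)")


def parse_entry(entry):
    """Hypothetical helper: return (name, feature letters, platform letters)."""
    match = CODE_RE.search(entry)
    if match is None:
        # Entries such as "ANNIS" carry no code at all.
        return entry.strip(), [], []
    name = entry[:match.start()].strip()
    features = list(match.group(1))
    platforms = [p for p in (match.group(2) or "") if p in PLATFORMS]
    return name, features, platforms


if __name__ == "__main__":
    name, features, platforms = parse_entry("EXMARaLDA (FTDP/U,W,M)")
    print(name)
    for letter in features:
        print(" ", letter, "=", FEATURES[letter])
    for letter in platforms:
        print(" ", letter, "=", PLATFORMS[letter])
```

Entries without a code come back with empty lists, and a trailing "(Review)" link is ignored because it does not consist solely of Key letters.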