EXMARaLDA is an acronym for "Extensible Markup Language for Discourse Annotation". It is a system of concepts, data formats and tools for the computer-assisted transcription and annotation of spoken language. EXMARaLDA is being developed in a project at the Collaborative Research Center "Multilingualism" (Sonderforschungsbereich "Mehrsprachigkeit" - SFB 538) at the University of Hamburg. The system's software tools - an editor for transcriptions in musical score notation, a corpus manager for managing corpus metadata, and a concordancing tool - are freely available to users outside the SFB.

The main features of EXMARaLDA are:

  • XML-based data formats - All EXMARaLDA transcriptions are stored in XML files. Using this W3C standard keeps the data flexible to work with and suitable for long-term archiving.
  • Java-based tools - All software tools for creating and working with EXMARaLDA data are Java applications. This makes them usable on all major operating systems (Windows, Macintosh, Linux, Unix).
  • Interoperability - The EXMARaLDA concept is loosely based on the annotation graph framework (Bird/Liberman 2001) and thus aims at maximal exchangeability and reusability of transcription data. Hence, EXMARaLDA data can be created and edited not only with the system's own tools, but also with other popular software (such as Praat, ELAN, Transcriber, CHILDES, WinPitch or the TASX Annotator). An import and export facility for TEI data is also available, as well as import filters for SyncWriter and HIAT-DOS data.

Furthermore, EXMARaLDA data can be transformed into a number of widely used presentation formats (RTF, HTML, PDF, SVG) for web-based or printed publication. Finally, EXMARaLDA supports several important transcription systems (HIAT, DIDA, GAT, CHILDES) through a number of parameterised functions.
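
The annotation-graph idea mentioned under Interoperability can be illustrated with a small sketch: time points act as nodes, and each transcribed event is a labelled arc between two nodes, grouped into tiers per speaker. The XML layout below is a simplified, hypothetical one invented for this illustration; its element and attribute names (`transcription`, `timeline`, `point`, `tier`, `event`) are assumptions and are not taken from the actual EXMARaLDA schema. The helper functions `load_arcs` and `overlapping` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified transcription file. This is NOT the real
# EXMARaLDA format; names are assumptions made for illustration only.
SAMPLE = """
<transcription>
  <timeline>
    <point id="T0" time="0.0"/>
    <point id="T1" time="1.2"/>
    <point id="T2" time="2.5"/>
  </timeline>
  <tier speaker="A" category="verbal">
    <event start="T0" end="T1">Hello there.</event>
    <event start="T1" end="T2">How are you?</event>
  </tier>
  <tier speaker="B" category="verbal">
    <event start="T1" end="T2">Hi!</event>
  </tier>
</transcription>
"""


def load_arcs(xml_text):
    """Read the sample layout into a flat list of labelled arcs."""
    root = ET.fromstring(xml_text)
    # Nodes of the graph: timeline point id -> time offset in seconds.
    times = {p.get("id"): float(p.get("time")) for p in root.iter("point")}
    arcs = []
    for tier in root.iter("tier"):
        for event in tier.iter("event"):
            # Each event is an arc between two timeline points.
            arcs.append({
                "speaker": tier.get("speaker"),
                "category": tier.get("category"),
                "start": times[event.get("start")],
                "end": times[event.get("end")],
                "label": (event.text or "").strip(),
            })
    return arcs


def overlapping(arcs, speaker_a, speaker_b):
    """Return label pairs where events of the two speakers overlap in time."""
    return [
        (x["label"], y["label"])
        for x in arcs if x["speaker"] == speaker_a
        for y in arcs if y["speaker"] == speaker_b
        if x["start"] < y["end"] and y["start"] < x["end"]
    ]


arcs = load_arcs(SAMPLE)
for arc in arcs:
    print(arc["speaker"], arc["start"], arc["end"], arc["label"])
print(overlapping(arcs, "A", "B"))
```

Because every event refers to shared timeline points rather than carrying its own layout, the same data can in principle be rendered as a musical score, queried for overlapping speech, or converted to another tool's format without changing the transcription itself.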