GALE: Task Specifications and Annotation Guidelines
Task specifications state the needs and assumptions for each task, describe the process for collecting and/or selecting data for that task, define the annotation and quality control procedures associated with the task, and describe the distribution formats for the resulting data. LDC's GALE tasks include:
- Collection
- Broadcast data (news and talk shows) (updated 4/1/2008)
- Broadcast auditing (updated 4/1/2008)
- Web data (blogs and newsgroups)
- Transcription
- XTrans (speech annotation toolkit) manual V3 (updated 10/11/2007)
- Translation
- Word Alignment
- Arabic Word Alignment V4.0 (updated 04/08/2009)
- Chinese Word Alignment V4.0 (updated 4/16/09)
- Chinese Tagging Guidelines V1.0 (updated 4/10/09)
- XBanks
- Arabic Treebank (updated 1/15/2009)
- English-Arabic Treebank (updated 4/9/2009)
- Distillation
- Phase 3 Training Data Annotation Guidelines V1.0 (updated 06/18/2008)
- Evaluation Resources
- Data Selection Guidelines V2.2 (updated 01/02/2007)
- Machine Translation Post Editing
- Post Editing Guidelines V3.0.2 (updated 05/25/2007)
- Resource Distribution
- Resource Distribution (updated 4/1/2008)














