GALE: Task Specifications and Annotation Guidelines
Task specifications state the needs and assumptions for each task, describe the process for collecting and/or selecting data for that task, define the annotation and quality control procedures associated with the task, and describe the distribution formats for the resulting data. LDC's GALE tasks include:
- Collection
- Broadcast data (news and talk shows) (updated 4/1/2008)
- Broadcast auditing (updated 4/1/2008)
- Web data (blogs and newsgroups)
- Transcription
- XTrans (speech annotation toolkit) manual V3 (updated 10/11/2007)
- Translation
- Word Alignment
- Arabic Word Alignment V2.0 (updated 5/20/2008)
- Chinese Word Alignment V2.0 (updated 5/20/2008)
- XBanks
- Arabic Treebank (updated 2/21/2008)
- English-Arabic Treebank
- Distillation
- Phase 2 Training Data Annotation Guidelines V2.3 (updated 03/01/2007)
- Evaluation Resources
- Data Selection Guidelines V2.2 (updated 01/02/2007)
- Machine Translation Post Editing
- Post Editing Guidelines V3.0.2 (updated 05/25/2007)
- Resource Distribution
- Resource Distribution (updated 4/1/2008)