Automatic Content Extraction (ACE): Previous Annotation Efforts

A summary of data and annotation guidelines for each evaluation can be found on the ACE Data Matrix.

ACE07

ACE Evaluation

New tasks for ACE 2007 included a pilot evaluation using Spanish data for the tasks of entity detection and recognition (EDR) and temporal expression recognition and normalization (TERN).

Annotation Tasks
Description of corpora
Data selection process
Data format and DTD
ACE07 Evaluation (NIST Website)
 

Entity Translation Pilot Evaluation

A pilot evaluation of "Entity Translation" was conducted as part of ACE07.  Systems participating in the pilot ET task were evaluated on their ability to take in a text document in one language (either Mandarin Chinese or Arabic) and emit an English-language catalog of the entities mentioned in the document.  LDC created reference translations and ACE annotations to support the ET pilot task, with support from the REFLEX program.

Annotation Tasks
Description of corpora
Data selection process
Data format and DTD
ET Pilot Evaluation (NIST Website)
 

ACE05

In 2005 the ACE Program expanded to include Events annotation for Arabic, English and Chinese.  For the 2005 TIDES Extraction Evaluation, LDC created new training data of approximately 300,000 words per language, plus test sets of approximately 50,000 words per language.

Annotation Tasks
Description of corpora
Data selection process
Data format and DTD
ACE05 Annotation Toolkit
ACE05 Evaluation (NIST Website)

ACE04

In 2004 the ACE Program expanded to include Relation annotation for Arabic.  For the September 2004 TIDES Extraction Evaluation, LDC created new training data of approximately 150,000 words per language, plus test sets of approximately 50,000 words per language.  Annotations include Entities and Relations for English, Chinese and Arabic.
 
Annotation Tasks
Data format and DTD
ACE04 Annotation Toolkit
ACE04 Evaluation (NIST Website)

ACE03

In 2003 the ACE Program expanded to include Chinese and Arabic as well as English.  For the September 2003 TIDES Extraction Evaluation, LDC created new training data of approximately 100,000 words per language, plus test sets of up to 50,000 words for each language.  Annotations included Entities and Relations for Chinese, and Entities only for Arabic.

Annotation Tasks
ACE03 Evaluation (NIST Website)

ACE Phase 2

To support the November 2002 Extraction evaluation, LDC created 180,000 words of English training data (from ACE Phase 1), plus a newly defined 45,000-word development set and 45,000 words of new evaluation data.  This data was annotated for both Entities and Relations, to support EDT and RDC technology evaluations.

Annotation Tasks
ACE Phase 2b Evaluation Summer 2002 (NIST Website)
ACE Phase 2 Evaluation 2001/2002 (NIST Website)


ACE Phase 1

LDC created a 180,000-word English training corpus and 45,000 words of test data to support the February 2002 ACE evaluation.

Annotation Tasks
ACE Phase 1 Evaluation (NIST Website)


ACE Pilot

LDC joined ACE research sites to create an English pilot corpus of 15,000 words tagged for Entities.  This effort supported EDT evaluations in May and November 2000.

Annotation Tasks
ACE Pilot Study (NIST Website)