TIDES Data for 2004
Updated 2/9/2005
| Project | Project Manager | Delivery Description | Delivery Made/Due | eCorpus/Catalog Number | Notes |
| Detection/HARD | Stephanie Strassel | 2003 Relevance Judgments | 5/18/2004 | LDC2004E25 | |
| Detection/HARD | Stephanie Strassel | 2004 Corpus | 5/21/2004 | LDC2004E30 | |
| Detection/HARD | Stephanie Strassel | 2004 Training Topics | 6/11/2004 | LDC2004E32 | |
| Detection/HARD | Stephanie Strassel | 2004 Evaluation Topics | 6/25/2004 | LDC2004E34 | |
| Detection/HARD | Stephanie Strassel | 2004 Evaluation Topics with Metadata | 7/12/2004 | LDC2004E34 | |
| Detection/HARD | Stephanie Strassel | Clarification Forms | 7/16/2004 | | |
| Detection/HARD | Stephanie Strassel | 2004 Relevance Judgments | 10/1/2004 | | |
| Detection/TDT | Stephanie Strassel | TDT-4 Topic Judgments | 5/12/2004 | LDC2004E20 | |
| Detection/TDT | Stephanie Strassel | TDT-5 Corpus | | LDC2004E41 | |
| Detection/TDT | Stephanie Strassel | TDT-5 2004 Annotations | | | |
| Extraction/ACE | Stephanie Strassel | Pilot Corpus: V1.1 | 1/20/2004 | LDC2004E03 | |
| Extraction/ACE | Stephanie Strassel | Pilot Corpus: V1.2 | 2/6/2004 | LDC2004E03 | |
| Extraction/ACE | Stephanie Strassel | Training Data: V1.0 | 4/1/2004 | LDC2004E17 | |
| Extraction/ACE | Stephanie Strassel | Training Data: V1.1 | 5/12/2004 | LDC2004E17 | |
| Extraction/ACE | Stephanie Strassel | Training Data: V1.2 | 7/1/2004 | LDC2004E17 | |
| Extraction/ACE | Stephanie Strassel | DevTest Data | | LDC2004E38 | |
| Extraction/ACE | Stephanie Strassel | Evaluation Data | 8/3/2004 | | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic News Translation Corpus Part 3 | 1/16/2004 | LDC2004E07 | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic English Parallel News Text Part 1 (2M words) | 1/29/2004 | LDC2004E08 | Official publication to be released with additional QC and documentation on 9/15/2004 |
| Machine Translation | Xiaoyi Ma | Arabic: UN Arabic English Parallel Text Version 2 (101M words) | 3/30/2004 | LDC2004E13 | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic News Translation Corpus Part 3 (524K words) | 3/30/2004 | LDC2004E11 | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic News Translation Corpus Part 4 (200K words) | 3/30/2004 | LDC2004E11 | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic News Translation Text | 8/15/2004 | LDC2004T08 | |
| Machine Translation | Xiaoyi Ma | Arabic: Arabic Eval Data | 5/2/2004 | | |
| Machine Translation | Xiaoyi Ma | Arabic: Human Assessment of Arabic to NIST | 8/31/2004 | | |
| Machine Translation | Xiaoyi Ma | Chinese: Hong Kong Hansards Parallel Text (36M en words) | 3/30/2004 | LDC2004E09 | |
| Machine Translation | Xiaoyi Ma | Chinese: UN Chinese-English Parallel Text (147M en words) | 3/30/2004 | LDC2004E12 | |
| Machine Translation | Xiaoyi Ma | Chinese: Multiple-Translation Chinese Part 3 | 7/15/2004 | LDC2004T07 | Ready for publication |
| Machine Translation | Xiaoyi Ma | Chinese: Hong Kong Parallel Text | 8/15/2004 | | |
| Machine Translation | Xiaoyi Ma | Chinese: Chinese-English News Magazine Parallel Text | 12/15/2004 | | |
| Machine Translation | Xiaoyi Ma | Chinese: Chinese Eval Data (80k Chinese Chars) | 5/2/2004 | | |
| Machine Translation | Xiaoyi Ma | Chinese: Human Assessments of Chinese to NIST | 8/31/2004 | | |
| Summarization | Stephanie Strassel | 50 Topic Summaries, 4 Annotators | 12/21/2004 | LDC2004E46 | |
| Tagged Text and X Banks | Mohamed Maamouri | Arabic Treebank: Part 2 V2.0 (144K words - newswire) | 1/30/2004 | LDC2004T02 | |
| Tagged Text and X Banks | Mohamed Maamouri | Arabic Treebank: Part 3 V1.0 (340K words - newswire) | 4/19/2004 | LDC2004T11 | |
| Tagged Text and X Banks | Mohamed Maamouri | Arabic Treebank: Part 3 (a) V1.1 | 7/30/2004 | | |
| Tagged Text and X Banks | Mohamed Maamouri | Arabic Treebank: Part 3 V2.0 | 2/15/2005 | | Full Arabic Treebank |
| Tagged Text and X Banks | Mohamed Maamouri | Arabic Treebank: Part 1 V2.3 | 12/20/2004 | | Will include new annotation passes for morphology, POS, gloss and added vocalization |
| Tagged Text and X Banks | Xiaoyi Ma | Chinese Treebank: Version 4.0 (404K words - newswire, press release) | 3/15/2004 | LDC2004T05 | |
| Lexicons | Mohamed Maamouri | Buckwalter Lexicon and Morphological Analyzer: Version 2.0 | 12/2004 | | |