To: TDT distribution <tdt-distrib@ldc.upenn.edu>
From: George Doddington <doddington@nist.gov>
Subject: Clarification of TDT3 language conditions
Date: Wed, 31 Mar 1999 13:02:00 -0500
For the formal evaluation this Fall, the REQUIRED condition for
each task will be a composite multilingual one. However,
there will be a variety of conditions under which performance
should be evaluated. Here is the complete and definitive
list of language conditions to be evaluated:
For STORY SEGMENTATION, there will be 1 language condition:
[1] Input from all sources for all languages.
For TOPIC TRACKING, there will be 3 language conditions:
[1] Training data taken from all sources
(# of training stories: 1E/1M, 2E/2M, 4E/4M)
Input from all sources for all languages
(evaluation conditioned on input source language)
[2] Training data taken from all English language sources
(# of training stories: 1E, 2E, 4E)
Input from all sources for all languages
(evaluation conditioned on input source language)
[3] Training data taken from all Mandarin language sources
(# of training stories: 1M, 2M, 4M)
Input from all sources for all languages
(evaluation conditioned on input source language)
The REQUIRED condition will be [1], with Nt = 4E/4M.
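As a rough illustration only, the tracking conditions above could be enumerated as follows. This is a minimal sketch, not part of the evaluation plan: the names `TRACKING_CONDITIONS`, `parse_nt`, and `tracking_runs` are hypothetical, and it assumes a label such as "4E/4M" means 4 English plus 4 Mandarin training stories.

```python
# Hypothetical transcription of the three topic tracking conditions.
# Assumption: "4E/4M" means 4 English and 4 Mandarin training stories.
TRACKING_CONDITIONS = {
    1: {"training": "all sources", "nt": ["1E/1M", "2E/2M", "4E/4M"]},
    2: {"training": "English sources", "nt": ["1E", "2E", "4E"]},
    3: {"training": "Mandarin sources", "nt": ["1M", "2M", "4M"]},
}

# The REQUIRED condition named in the message: condition [1] with Nt = 4E/4M.
REQUIRED_TRACKING = (1, "4E/4M")


def parse_nt(label):
    """Split a label like '4E/4M' into per-language story counts."""
    counts = {}
    for part in label.split("/"):
        # The last character is the language code, the rest is the count.
        counts[part[-1]] = int(part[:-1])
    return counts


def tracking_runs():
    """List every (condition number, Nt label) pair to be evaluated."""
    return [
        (number, nt)
        for number, spec in TRACKING_CONDITIONS.items()
        for nt in spec["nt"]
    ]


if __name__ == "__main__":
    for number, nt in tracking_runs():
        flag = "REQUIRED" if (number, nt) == REQUIRED_TRACKING else ""
        print(number, nt, parse_nt(nt), flag)
```

In every case the input is all sources for all languages; only the training data changes between conditions.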
For TOPIC DETECTION, there will be 3 language conditions:
[1] Input from all sources for all languages.
[2] Input from all English language sources.
[3] Input from all Mandarin language sources.
The REQUIRED condition will be [1].
For FIRST-STORY DETECTION, there will be 3 language conditions:
[1] Input from all sources for all languages.
[2] Input from all English language sources.
[3] Input from all Mandarin language sources.
The REQUIRED condition will be [1].
For LINK DETECTION, there will be 1 language condition:
[1] Input from all English language sources.
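For anyone scripting their runs, the full list above can be written down as a small table. This is only a sketch under stated assumptions: the names `LANGUAGE_CONDITIONS` and `required_input` are hypothetical, and for tasks with a single condition (where the message names no REQUIRED condition) the code assumes that the sole condition is the one to run.

```python
# Hypothetical table of input language per condition, keyed by task.
# "required" is filled in only where the message names a REQUIRED condition.
LANGUAGE_CONDITIONS = {
    "story segmentation": {"inputs": {1: "all"}, "required": None},
    # Tracking input is always all languages; conditions differ in training data.
    "topic tracking": {"inputs": {1: "all", 2: "all", 3: "all"}, "required": 1},
    "topic detection": {
        "inputs": {1: "all", 2: "english", 3: "mandarin"},
        "required": 1,
    },
    "first-story detection": {
        "inputs": {1: "all", 2: "english", 3: "mandarin"},
        "required": 1,
    },
    "link detection": {"inputs": {1: "english"}, "required": None},
}


def required_input(task):
    """Return the input language set for the condition a site must run."""
    spec = LANGUAGE_CONDITIONS[task]
    number = spec["required"]
    if number is None and len(spec["inputs"]) == 1:
        # Assumption: with one condition listed, that condition is the one to run.
        number = next(iter(spec["inputs"]))
    return spec["inputs"][number]


if __name__ == "__main__":
    for task in LANGUAGE_CONDITIONS:
        print(task, "->", required_input(task))
```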
- - - - - - - - - - - - - - - - - - -
For the June dry run, the original plan was to make Mandarin
and cross-language processing optional. However, the availability
of machine-translated Mandarin (that is, Mandarin rendered as
English text) makes these tasks easy, and so the required language
conditions for the June dry run will be the same as for the Fall
evaluation.
As a reminder, note that the TDT system is allowed to know both
the language and the source of the input data.
--
George Doddington at NIST: doddington@nist.gov or 301/975-3261
Last updated Thu May 13 09:28:22 1999