To: TDT Distrib <tdt-distrib@ldc.upenn.edu>
From: Jonathan Fiscus <jonathan.fiscus@nist.gov>
Subject: [Fwd: [Fwd: Latest TDT Software and Evaluation Plan]]
Date: Thu, 31 May 2001 13:07:06 -0400

Folks,

I have two new resources available for participants of the TDT project:
an updated evaluation plan and updated evaluation software, which are
available at the URLs
http://www.nist.gov/speech/tests/tdt/tdt2001/evalplan.htm and
ftp://jaguar.ncsl.nist.gov/tdt/tdt2001/software/TDT3eval_v2.2.tgz
respectively.

The remainder of this email provides descriptions of the nature of the
updates.

- Jon

Evaluation Plan
---------------

The evaluation plan was updated so that it codifies how the TDT3 corpus
can be used as development test data for the tracking evaluation. I'd
like to draw your attention to Section 3.1 CORPORA RESOURCES FOR TDT 2001
(excerpted below). The text was modeled on the email I sent around a
while ago. Since there appears to be little support for using BRIEF
stories as testable stories, I've omitted the idea, although there is
still time to revise the evaluation.

> 3.1 CORPORA RESOURCES FOR TDT 2001
>
> The 2001 TDT Evaluation will make use of the TDT-Pilot, TDT2 and TDT3
> corpora. The TDT-Pilot and TDT2 corpora are designated as training
> resources. Systems may make use of these corpora in any way.
>
> The TDT3 corpora will be reserved as the evaluation corpus for all
> tasks except the Topic Tracking task.
>
> For the Topic Tracking task, TDT3 may be used for development test in
> addition to the evaluation test material. As such, the corpus must be
> used according to the following guidelines so as to minimize the effect
> of training to the corpus. The LDC has released 60 TDT3 topics.
> These topics and the TDT3 corpus may be used as development test
> data for the Topic Tracking task. Thus, systems may be tuned to
> optimal performance on the TDT3 corpora and topics, but systems
> should not use the language data to pre-compute any type of model
> to be used during the formal evaluation.
>
> The 2001 TDT Evaluation corpus will consist of an augmented version
> of the TDT3 corpus. Additional language data will be added from the
> period of time between the TDT2 and TDT3 corpora, (July-September 1998)
> and also from the same period of time as the TDT3 data, (October-
> December 1998). The evaluation corpus will be shipped to the
> participating research sites as specified by the TDT schedule
> (which is September 5, 2001 as of May 29, 2001).
>
> In addition to the TDT3 corpora usage restrictions, these rules cover
> the use of these and other corpus resources:
>
> · The LDC-supplied corpora may be supplemented by whatever other
> resources participants may choose to use.
> · All other training must predate 1 October, 1998.

Evaluation Software
-------------------

There was a single modification to the evaluation software this time.
The tracking evaluation script will now also read system output files
that use corpus 'DOCNO' tags as pointers for decisions. The old, and
still valid, form looks like this:

System1 YES 4 30001 RECID
tkn/19981014_0842_0913_APW_ENG.tkn 1 NO 0.079013
tkn/19981014_0842_0913_APW_ENG.tkn 690 NO 0.017507
tkn/19981014_0842_0913_APW_ENG.tkn 803 NO 0.026012
tkn/19981014_0842_0913_APW_ENG.tkn 933 NO 0.030905
tkn/19981014_0842_0913_APW_ENG.tkn 1052 NO 0.021855

and the newly accepted form looks like this:

System1 YES 4 30001 DOCNO
. APW19981014.0505 NO 0.079013
. APW19981014.0508 NO 0.017507
. APW19981014.0510 NO 0.026012
. APW19981014.0512 NO 0.030905
. APW19981014.0513 NO 0.021855
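
For anyone who wants to sanity-check their output files before running
the scorer, here is a minimal Python sketch that reads either form. It
is not part of the NIST software. It assumes only what the two samples
above show: the last header field names the pointer type (RECID or
DOCNO), and each decision line has four whitespace-separated fields
(source file or ".", pointer, YES/NO decision, score). The names
Decision and parse_tracking_output are hypothetical, and the other
header fields are left uninterpreted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Decision:
    source: str    # token file path, or "." in the DOCNO form
    pointer: str   # record id (RECID form) or document number (DOCNO form)
    decision: str  # "YES" or "NO"
    score: float   # system score for this decision


def parse_tracking_output(text: str) -> Tuple[str, List[Decision]]:
    """Parse a tracking output file; return (pointer_type, decisions).

    Assumption: the layout is inferred from the sample files only.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("empty system output")
    header, body = rows[0], rows[1:]

    # Assumed: the final header field selects the pointer type.
    pointer_type = header[-1]
    if pointer_type not in ("RECID", "DOCNO"):
        raise ValueError(f"unknown pointer type: {pointer_type!r}")

    decisions = []
    for fields in body:
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got: {fields!r}")
        source, pointer, decision, score = fields
        if decision not in ("YES", "NO"):
            raise ValueError(f"bad decision value: {decision!r}")
        decisions.append(Decision(source, pointer, decision, float(score)))
    return pointer_type, decisions
```

Calling parse_tracking_output on the text of either sample returns the
pointer type plus one Decision per line, so the same downstream code can
handle both forms.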


The aim of providing this capability is to lower the entry barrier for
people currently involved in TREC. The recent release of the TDT2-3
corpora contains a script that generates the TIPSTER-style text files
that TREC participants currently use. I don't yet have a customer for
this new functionality, so I'm not sure what else is needed or whether
there are additional scripts I could write to help the transition.
Comments are welcome.

There is one caveat to using DOCNO pointers. Automatic story boundaries
are currently excluded as an evaluation condition when using DOCNO
pointers. The software only uses the reference story boundaries to
score the system.

Last updated Wed Aug 22 16:07:33 2001