(117) previous ~ index ~ next
To: Jon_Yamron@dragonsys.com
From: George Doddington <doddington@nist.gov>
Subject: Re: Training data for dry run?
Date: Fri, 11 Jun 1999 18:35:44 -0400
Jon_Yamron@dragonsys.com wrote:
>
> If the test data for the dry run spans the entire 6 months of the TDT2 corpus,
> what should we use for training data for those parts of our TDT software that
> require it? For example, one of our trackers requires multiple background
> models trained from news material similar to the test stories.
There is no EvalSet for the June dry run. Furthermore, the evaluation is
over the entire TDT2 data set, and there is no separation imposed between
training data and DevSet data. Don't be alarmed. The purpose of the June
dry run is to debug the task definitions and the research support tools and
infrastructure. It is not a contest. Please feel free to choose training
data in whatever way makes most sense to you under these conditions, so as
to best support your research. We certainly don't want to limit your
research. Selection of training and DevSet data for the September dry run
(and beyond) is an important item to discuss at this month's meeting. For
now, however, each site is free to make its own choice in this matter.
--
George Doddington in McLean, VA: doddington@nist.gov or 703/556-3434
Last updated Mon Jun 21 11:18:49 1999