To: James Allan <allan@cs.umass.edu>
From: George Doddington <doddington@nist.gov>
Subject: Re: Request for clarification regarding TDT rules -- (non)use of index
Date: Wed, 28 Oct 1998 14:01:15 -0500
> First note that, technically, the allowable training examples--both
> positive and negative--are specified in the index file for that topic
> number. Many of us are kind of side-stepping the index file, though,
> and in that case questions like yours become meaningful. (They also
> help understand how the index file is constructed.)
I'm concerned that you aren't using the index files. Everyone should
be using the index files. There are several reasons for this. Using
the index files will:
1) make processing simpler.
2) avoid the possible use of inadmissible information.
3) provide assurance that the data are being processed correctly.
4) ensure that an evaluation run will be complete and acceptable.
Another reason to run from index files is that the topic relevance
table that you now use to control your run WILL NOT BE AVAILABLE
during the formal evaluation.
If you lack confidence in the integrity of the index files, or if
you have reason to believe that an index file is incorrect, please
let us know and we will attend to your concern and fix any problem
that we find.
--
George Doddington at NIST: doddington@nist.gov or 301/975-3261