To: David Graff <graff@unagi.cis.upenn.edu>
From: James Allan <allan@cs.umass.edu>
Subject: Re: Question about chronological ordering of data
Date: Sun, 24 May 1998 13:45:59 -0400
Dave and the TDTers,
> My recollection is that there is no need to worry about chronological
> order at the level of individual stories -- but since I didn't have a
> definite record of that, I was hoping others could clarify the point.
This is the way that the tasks are defined: it is FILES that are processed
in date order, not stories. As Dave has pointed out, the stories
within the files are in chronological order, the files themselves are
in chronological order, and the files often do not overlap, so this
assumption is not likely to be a big problem.
The DATE_TIME tags are available for any site that wants to explore
how well things would work if the stories were perfectly sorted by
time, independent of source. But that is not an official part of the
task.
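For a site that does want to try the unofficial, fully time-sorted
variant, here is a minimal sketch in Python. It assumes each file's
stories have already been parsed into (timestamp, story_id) pairs taken
from the DATE_TIME tags and are already in chronological order within
the file. The function name, the tuple layout, and the sample stories
are hypothetical, chosen only for illustration.

```python
import heapq
from datetime import datetime

def merge_stories_by_time(*per_file_stories):
    """Merge story lists that are each already time-sorted into one
    list ordered by timestamp, independent of source."""
    return list(heapq.merge(*per_file_stories, key=lambda story: story[0]))

# Hypothetical stories from two sources whose files overlap in time.
source_a = [(datetime(1998, 1, 1, 9, 0), "a1"),
            (datetime(1998, 1, 1, 11, 0), "a2")]
source_b = [(datetime(1998, 1, 1, 10, 0), "b1"),
            (datetime(1998, 1, 1, 12, 0), "b2")]

merged_ids = [story_id for _, story_id in
              merge_stories_by_time(source_a, source_b)]
```

Because the per-file lists are already sorted, a merge is enough and no
full re-sort is needed.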
It is worth noting that the official order of the files is *NOT* the
alphanumeric sorting of the bndXXX files' names. Instead, it is the
order specified in the index files provided by NIST for each task. I
imagine those two possible orderings will be nearly identical, but it
is not safe to assume so.
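To make the point concrete, a minimal sketch in Python of taking the
order from an index file rather than from sorted names. The layout of
the index file is an assumption here (one file name per non-blank
line); the real NIST index files may look different, and the file names
in the example are made up.

```python
def read_index_order(index_text):
    """Return file names in the order the index lists them.
    Assumes one file name per non-blank line (hypothetical layout)."""
    return [line.strip() for line in index_text.splitlines() if line.strip()]

def differs_from_alphanumeric(names):
    """True if the listed order is not the plain alphanumeric sort."""
    return names != sorted(names)

# Hypothetical index contents, deliberately not in alphanumeric order.
index_text = "bnd003\nbnd001\nbnd002\n"

official_order = read_index_order(index_text)
needs_index = differs_from_alphanumeric(official_order)
```

Processing should then loop over official_order as given, never over
sorted(official_order).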
-- james
Last updated Wed Sep 9 09:40:50 1998