To: tdt-distrib@unagi.cis.upenn.edu
From: David Graff <graff@unagi.cis.upenn.edu>
Subject: Re: assessing last-week's scores from NIST
Date: Wed, 18 Oct 2000 15:06:18 EDT

Folks,

I just took a closer look at James Allan's recent request for presentations at
the upcoming TDT workshop, and there is one point I thought should be
clarified. James wrote:

> Unfortunately, because of the apparent confusion in the preliminary
> evaluation, it's difficult to be *certain* that your approach was
> effective, so you'll have to focus more on whether it was
> interesting, or was effective on the training data...

In the results that Jon Fiscus circulated last Friday, the scoring problem
affected ONLY the 55 NEW topics that we created this year -- these are the
ones numbered 31001 - 31060.

The 60 OLDER topics (the ones that were tested last year, which are numbered
30001 - 30060) were not affected by the scoring glitch, and sites can
definitely use those scores to assess system performance.

I've been in touch with Mark Przybocki at NIST; he has amended the scoring
software and recomputed the scores. I expect he will do a few more sanity
checks to convince himself that his new numbers on the new topics make sense
(and that the numbers on the old topics have not changed in the process, since
there was nothing wrong with them in the first place). We'll keep you
posted...

Dave Graff


Last updated Wed Oct 25 15:48:22 2000