
To: "'Rich Schwartz'" <schwartz@bbn.com>
From: George Doddington <doddington@msn.com>
Subject: RE: TDT3, a variety of issues
Date: Mon, 8 Feb 1999 10:57:09 -0800

We always have a huge problem with vast differences between data
sets. We tune to one and then find that the next set is completely
different in that the criteria used have drifted -- even though not
officially. So we see 4000 on-topic stories for Jan-Feb, 500 for Mar-Apr,
and 1800 for May-June. These differences are clearly not due to random
sampling, because there were plenty of topics to keep the differences
smaller than that. It is because the types of topics and the criteria for
inclusion of stories obviously drifted -- even if only because different
people applied them.

Alvin Martin has applied several statistical tests to the distribution
of the number of stories per topic and has found no statistically
significant difference between the TDT2 devset and eval set, even
at the relatively low confidence level of 90%. He is also looking
at topic tracking performance differences and will distribute his
findings to you (all) soon.
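
For readers who want to try this kind of comparison themselves, here is a
minimal sketch of one possible test: a two-sided permutation test on the
mean number of stories per topic. It is an illustration only. The message
does not say which tests were applied, and the function name, the parameters,
and the per-topic counts below are hypothetical placeholders, not TDT2 data.

```python
import random


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of sample means.

    a, b: per-topic story counts for two data sets (hypothetical inputs).
    Returns an estimated p-value for the null hypothesis that both
    samples come from the same distribution.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign topics to the two sets at random and recompute the gap.
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(pa) / len(pa) - sum(pb) / len(pb))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up placeholder counts, NOT real devset/eval figures.
    dev_counts = [12, 40, 7, 95, 23, 5, 61, 18]
    eval_counts = [9, 55, 14, 70, 30, 3, 48, 22]
    p = permutation_test(dev_counts, eval_counts)
    # A 90% confidence level corresponds to a 0.10 significance threshold.
    print("p-value:", p, "significant at 90%:", p < 0.10)
```

A p-value at or above 0.10 would correspond to "no statistically significant
difference at the 90% level". The test compares only the means, so a
difference in the shape of the distribution could go undetected.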
--
George Doddington in Orinda, CA. doddington@nist.gov 925/631-6628



Last updated Thu May 13 09:28:12 1999