To: TDT distribution <tdt-distrib@ldc.upenn.edu>
From: George Doddington <doddington@nist.gov>
Subject: Proposed change to the topic tracking task
Date: Wed, 09 Aug 2000 14:04:27 -0400
Unless there is a coherent argument to the contrary, the topic
tracking task will be changed to demote Nt, the # of on-topic
training stories, from the status of "parameter" to the status
of "variable". In other words, the number of on-topic training
stories will be a variable function of topic, constrained only
to assume a value of 1, 2, 3 or 4. This will simplify the task
definition, conceptually at least, and reduce the number of
different sets of parameters. (Topic tracking performance as a
function of the value of Nt will still be tabulated and analyzed,
of course.) Note that this change will make the topic tracking
task more difficult, because of the need to normalize the score
across different values of Nt. (Performance conditioned on Nt
should NOT change, however.)

As a footnote to the motivation for this change, it has been
observed that the current alternate condition of Nt = 4 is
actually Nt <= 4, because not all topics will have 4 or more
on-topic stories. So Nt is already, in effect, a variable
function of topic. As it stands, this also provides a clue to
how many on-topic stories exist in the test set. It was therefore
thought that a good move would be simply to change the status of
Nt from parameter to variable. This would simplify the task
definition, obscure the extraneous (and illegal) clue to threshold
setting, and facilitate easy evaluation over multiple values of Nt.

Note that under this new task definition, there may be far more than
120 topics to track -- in the limit up to 1500 for the formal TDT2000
evaluation.
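
To make the Nt <= 4 observation and the per-Nt tabulation concrete,
here is a minimal Python sketch. It is an illustration only, not part
of the evaluation software: the function names, the (nt, score) record
layout, and the use of a plain mean are all assumptions, and the mean
is not the normalization mentioned above.

```python
from collections import defaultdict

MAX_NT = 4  # upper bound on Nt stated in the message


def effective_nt(available_on_topic, requested=MAX_NT):
    """Training stories actually usable for one topic (hypothetical helper).

    A topic with fewer on-topic stories than requested ends up with a
    smaller Nt, which is why "Nt = 4" behaves like "Nt <= 4".
    """
    if available_on_topic < 1:
        raise ValueError("a topic needs at least one on-topic story")
    return min(requested, available_on_topic, MAX_NT)


def tabulate_by_nt(results):
    """Average a score separately for each value of Nt.

    `results` is an iterable of (nt, score) pairs. The plain mean is an
    assumption made for illustration only.
    """
    buckets = defaultdict(list)
    for nt, score in results:
        buckets[nt].append(score)
    return {nt: sum(s) / len(s) for nt, s in sorted(buckets.items())}
```

For example, with made-up counts of 1, 3 and 9 available on-topic
stories, effective_nt gives Nt values of 1, 3 and 4 respectively.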
--
George Doddington at NIST: doddington@nist.gov or 301/975-3261
Last updated Tue Sep 19 14:30:45 2000