To: Jaime Carbonell <jgc@nl.cs.cmu.edu>
From: George Doddington <doddington@nist.gov>
Subject: Re: objections to including topic affiliation in FSD output
Date: Mon, 19 Nov 2001 09:18:22 -0800

> 5. We want to be able to track progress, which implies a comparison
> baseline and/or results from earlier years. If we score the same
> as in earlier TDT cycles, we can do a more direct comparison (in
> addition to other forms of scoring). George always tells us that
> evaluations are to measure progress of the technology.

The scoring used in the current evaluation is available with the new
format. (See the general response that I just sent out.)
--------

> 6. If, in addition to standard FSD (or NED as per Charles's renaming)
> we want to score also 2nd or 3rd stories (mistaken for first story)
> with declining value, we certainly can do so.

This is not correct. Without a topic assignment, there is no way of
knowing which stories are first, second or third.
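
To make the dependency concrete, here is a minimal sketch. The function
name and the toy stream are hypothetical and are not part of any TDT
evaluation software. It assumes a story-to-topic mapping is available as
input and shows that a story's ordinal position on its topic is computed
directly from that mapping.

```python
from collections import defaultdict


def rank_within_topic(stories):
    """Return {story_id: 1-based position of the story within its topic}.

    `stories` is a chronologically ordered list of (story_id, topic_id)
    pairs. The topic_id is the assumed input: without it, the counter
    below has nothing to key on.
    """
    seen = defaultdict(int)  # topic_id -> number of stories seen so far
    ranks = {}
    for story_id, topic_id in stories:
        seen[topic_id] += 1
        ranks[story_id] = seen[topic_id]
    return ranks


# Toy stream (illustrative data only).
stream = [("s1", "A"), ("s2", "B"), ("s3", "A"), ("s4", "A"), ("s5", "B")]
ranks = rank_within_topic(stream)
```

If the topic column is dropped from the input, the ranks cannot be
recovered from story order alone.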
--------

> 7. The introduction of a larger number of confusible events (as per
> LDC's objective agreed at the meeting) will make it even more
> challenging for NED/FSD. Given that, we really want to work on
> the problem, rather than worry about retooling for different
> evaluation/optimization metrics.

Adding a topic affiliation to FSD output does not speak to the task
definition. I was not, and I am not, trying to change the task. I
am merely trying to accommodate the desire, expressed by a number of
people at the TDT workshop, to soften the evaluation by spreading the
scoring over the first few stories on a topic.
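
As a rough illustration of what "spreading the scoring" could look like,
the sketch below assigns declining credit to a story flagged as first,
based on its actual position on its topic. The function name and the
weights (1.0, 0.5, 0.25) are assumptions chosen for illustration only.
They are not a proposed or agreed TDT metric.

```python
def soft_fsd_credit(true_rank, weights=(1.0, 0.5, 0.25)):
    """Credit for a story the system flagged as a first story.

    `true_rank` is the story's 1-based position within its topic.
    `weights` are hypothetical: full credit for the true first story,
    declining credit for the next few, and none after that.
    """
    if true_rank < 1:
        raise ValueError("true_rank must be >= 1")
    if true_rank <= len(weights):
        return weights[true_rank - 1]
    return 0.0
```

With the default weights, a system that flags the second story on a topic
receives half credit instead of none.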
--------

> Underscoring point 4. F-skip will let us test more data and therefore
> allows us to measure statistically more meaningful results for
> NED/FSD. We really don't want to preclude that by introducing
> an incompatible format, as James and Victor point out.

The current output format does not accommodate F-skip (or j-skip, for
that matter). In order to accommodate j-skip, the output format must
be changed. In fact, the format that was used in 1997 when j-skip was
being used actually did accommodate the topic ID that I am proposing,
so in that context there is (was) no need to change the format.
--------

> Our message is: Don't take away NED/FSD -- it is a very tough problem
> and we want to try hard to make more (measurable) progress on it.

And my message is: Adding a topic ID to FSD output will add flexibility
to the formulation of evaluation while keeping the conceptual framework
that we have established. Nothing is being taken away, and I'm not
suggesting that the task be changed.
--
George Doddington in Orinda, CA: doddington@nist.gov or 925/250-8346
Last updated Mon Nov 19 13:57:07 2001