To: George Doddington <doddington@nist.gov>
From: Rich Schwartz <schwartz@bbn.com>
Subject: Re: pooled /topic cost
Date: Thu, 5 Nov 1998 19:10:14 -0500 (EST)
George,
OK, I've stayed out up until now, but here goes.....
I know we had a discussion about possible hybrid story/topic
weighting. However, I thought we also agreed that we would not use this
for the upcoming formal evaluation. More to the point, obviously we MUST
know what the weighting is NOW (well, actually a couple of months ago).
It wouldn't be meaningful to change the weighting for the evaluation
measure after the fact.
(To those who might say, "This isn't a competition.")
Yes, it IS a competition. Even if you don't want a competition
BETWEEN sites, think of it as YOU against the PROBLEM. Our broad goal is
to try out different methods and find the one that works "best". If the
definition of "best" changes, then unfortunately the best method will
change also. While we would like this not to be the case, it just IS.
We showed that for the detection problem at the last meeting. In
addition, we've found, for example, that our two metrics for detection
have completely different behavior with respect to story weighted vs topic
weighted. (I.e., on story weighted, one method is TWICE as good as the
other; on topic weighted, the reverse holds.)
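
A toy sketch of how the ranking of two methods can flip with the averaging
scheme. The per-topic counts below are invented purely for illustration (they
are not evaluation results), and the "error rate" is a simplified stand-in for
the actual detection cost. Story weighting pools all stories before dividing;
topic weighting averages the per-topic rates.

```python
# Each topic is (number_of_stories, number_of_errors).
# All numbers are hypothetical, chosen only to show the ranking flip.

def story_weighted(topics):
    """Pool errors and stories over all topics, then divide."""
    total_errors = sum(errors for _, errors in topics)
    total_stories = sum(stories for stories, _ in topics)
    return total_errors / total_stories

def topic_weighted(topics):
    """Average the per-topic error rates, one vote per topic."""
    rates = [errors / stories for stories, errors in topics]
    return sum(rates) / len(rates)

# Method A does well on the large topic, badly on the small one.
method_a = [(100, 5), (10, 4)]
# Method B does badly on the large topic, perfectly on the small one.
method_b = [(100, 20), (10, 0)]

for name, topics in (("A", method_a), ("B", method_b)):
    print(name, round(story_weighted(topics), 3), round(topic_weighted(topics), 3))
```

With these made-up counts, A wins under story weighting (about 0.08 vs 0.18)
and B wins under topic weighting (0.10 vs 0.225), which is why the weighting
has to be fixed before anyone tunes a system.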
Although I personally much prefer story weighting, my recollection
was that we agreed on topic weighting. But if you want to change it to
something else, please make sure that we have the scoring software from
NIST that supports this weighting by October 1st.
--Rich
P.S. I agree with George's last comment on what we're allowed to use.
i.e., I thought we had said that we could not add additional positive
stories. But since it would be interesting, someone who did that should
do a comparison.
======================================================================
On Thu, 5 Nov 1998, George Doddington wrote:
> J michAel schuLtz wrote:
>
> > Can someone verify for me that it is the topic weighted cost that we are
> > trying to minimize for the tracking task? As opposed to minimizing the
> > story weighted cost.
> >
> > Mike
>
> This issue remains unresolved. At the September meeting I believe
> that the general sentiment was a preference for topic weighted, but
> the variable weighting method (footnote 16, pages 13-14 in version
> 3.7 of the evaluation plan) may prevail. This method uses a variable
> weight that tends toward equal story weighting for very small topics
> and toward equal topic weighting for very large topics. The nominal
> topic size parameter, Nw, which represents the crossover point
> between story-weighted and topic-weighted averaging, will likely
> be chosen to be the mean (or possibly median) over all topics of
> the number of stories per topic.
>
> --------
>
> On another totally separate point: Please note that (Nt)max will
> remain equal to 4 for the formal evaluation, just as for the dry run.
> --
> George Doddington at NIST: doddington@nist.gov or 301/975-3261
>
>
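
For readers without the evaluation plan at hand, here is a rough sketch of a
weighting with the behavior George describes: close to story weighting for
topics much smaller than Nw, close to topic weighting for topics much larger
than Nw. The formula w = N * Nw / (N + Nw) is an assumption made for this
illustration only; it is NOT taken from footnote 16, which may define the
weight differently. The topic counts are also invented.

```python
from statistics import mean

def variable_weight(n_stories, nw):
    """Hypothetical per-topic weight (an assumed form, not the plan's).

    For n_stories << nw the weight is about n_stories (story weighting);
    for n_stories >> nw it saturates near nw (equal topic weighting).
    """
    return n_stories * nw / (n_stories + nw)

def hybrid_error(topics, nw):
    """Weighted average of per-topic error rates.

    topics is a list of (number_of_stories, number_of_errors).
    """
    weights = [variable_weight(n, nw) for n, _ in topics]
    rates = [errors / n for n, errors in topics]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)

# Invented example: one large topic and one small topic.
topics = [(100, 5), (10, 4)]

# Nw chosen as the mean number of stories per topic, as suggested above.
nw = mean(n for n, _ in topics)

print("Nw =", nw)
print("hybrid error =", hybrid_error(topics, nw))
```

Pushing nw toward a very large value makes hybrid_error approach the pooled
(story-weighted) rate, and pushing it toward zero makes it approach the plain
mean of per-topic rates, so Nw acts as the crossover point between the two.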
Last updated Fri Nov 6 15:29:24 1998