To: Mark Liberman <myl@unagi.cis.upenn.edu>
From: Jaime Carbonell <jgc@NL.CS.CMU.EDU>
Subject: Re: more missed labels in devset
Date: Wed, 9 Sep 98 13:15:27 EDT

Mark,
Thanks for your analytical message. The Kappa ratio, of course,
measures inter-rater consistency, and not systematic misses (by all
raters). One would hope the two would be correlated, but it is not
necessarily so. Therefore, it would be useful to develop another
measure to check for systematic misses. [Aside: This was the side
discussion that Rich Schwartz and I were having at the last meeting,
and the reason for Rich's suggestion that the trackers be used to
catch misses, which is exactly what Ralf did.]
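
[Illustration: a minimal sketch of why agreement and systematic misses
can come apart. The labels below are invented toy data, not TDT2
annotations, and the helper name cohen_kappa is an assumption made for
this example. Two raters agree on every story, so kappa is at its
maximum, yet both miss the same two on-topic stories.]

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two raters' label sequences of equal length."""
    n = len(a)
    # Observed agreement: fraction of items labeled identically.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    labels = set(a) | set(b)
    p_chance = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_chance == 1.0:
        return 1.0  # degenerate case: both raters used a single label
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 1 = on-topic, 0 = off-topic.
truth   = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
rater_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rater_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

kappa = cohen_kappa(rater_a, rater_b)

# Stories that are on-topic but were labeled off-topic by every rater.
missed_by_all = [
    i for i, (t, x, y) in enumerate(zip(truth, rater_a, rater_b))
    if t == 1 and x == 0 and y == 0
]

print(f"kappa = {kappa:.2f}, missed by all raters: {missed_by_all}")
```

[In practice the true labels are unknown, which is why a separate
check, such as a tracker, is needed to surface the misses.]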

I think that it would be useful to search for misses with another
tracker or two, and then investigate the causes (your hypotheses are
plausible), leading to an improved labeling on the TDT2 corpus.
This is not a high-cost operation, as only a small number of stories
(high-scoring alleged false alarms) need be re-read and perhaps
relabeled manually, if they indeed prove to be systematic misses.
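
[Illustration: one way the selection step could look, as a sketch
only. The story IDs, scores, threshold, and the function name
review_candidates are all hypothetical; no particular tracker or score
scale is implied. It picks stories that a tracker scores highly but
that are currently labeled off-topic, ordered so the strongest
candidates are re-read first.]

```python
def review_candidates(scores, labels, threshold):
    """Return IDs of stories scored >= threshold by the tracker but
    currently labeled off-topic, highest score first."""
    flagged = [
        (story_id, score)
        for story_id, score in scores.items()
        if score >= threshold and not labels[story_id]
    ]
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [story_id for story_id, _ in flagged]


# Hypothetical tracker scores and current manual labels (True = on-topic).
scores = {"s1": 0.95, "s2": 0.10, "s3": 0.80, "s4": 0.90, "s5": 0.40}
labels = {"s1": True, "s2": False, "s3": False, "s4": False, "s5": False}

to_reread = review_candidates(scores, labels, threshold=0.75)
print(to_reread)
```

[Only the stories in to_reread would go back to a human reader; whether
each one is a genuine miss is still decided manually.]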

Best,

--Jaime

Last updated Fri Sep 11 13:52:53 1998