To: "'Strzalkowski, Tomek (CRD)'" <strzalkowski@crd.ge.com>
From: George Doddington <doddington@msn.com>
Subject: RE: TDT3 -- Precision vs False Alarm (postscript attachment)
Date: Tue, 9 Feb 1999 07:53:31 -0800

>A separate issue I have is the False Alarm measure that we use.
>Should we report both FA and precision (and have both as part
>of cost function)? For example (taking GE1 run) topics 75 and 98
>get similar (very small) precision (approx 2-4%), but their FA rates
>differ by a factor of 5 (5% vs. 0.9%). The difference is the number
>of stories tracked (ratio 5 to 1).

While you have found an example in which two selected topics
happen to exhibit similar values of precision but different values
of false alarm, in general false alarm will be a more stable
indicator of performance than precision. For example, if you
compute the mean and variance statistics for precision and
false alarm for ALL 21 topics in the GE1 run, you will find that
the normalized standard deviation (std dev / mean) of precision
is significantly greater than that for false alarm.
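
For anyone who wants to run this check on their own results, here is a minimal
sketch of the comparison. The per-topic counts are made up purely for
illustration (they are NOT the GE1 numbers), and the function names are my
own; substitute the real per-topic counts from a run.

```python
from statistics import mean, pstdev

def normalized_std(values):
    """Standard deviation divided by the mean, as described above."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; normalized std dev is undefined")
    return pstdev(values) / m

def precision(hits, false_alarms):
    """Fraction of the stories flagged by the system that are on topic."""
    return hits / (hits + false_alarms)

def false_alarm_rate(false_alarms, off_topic_total):
    """Fraction of the off-topic stories that the system flagged."""
    return false_alarms / off_topic_total

# Hypothetical topics: (hits, false alarms, off-topic stories scored).
# Illustrative values only, not taken from any TDT run.
topics = [
    (5, 100, 10000),
    (50, 120, 10000),
    (200, 90, 10000),
    (20, 110, 10000),
]

precisions = [precision(h, fa) for h, fa, _ in topics]
fa_rates = [false_alarm_rate(fa, n_off) for _, fa, n_off in topics]

cv_precision = normalized_std(precisions)
cv_false_alarm = normalized_std(fa_rates)

print(f"normalized std dev of precision:   {cv_precision:.3f}")
print(f"normalized std dev of false alarm: {cv_false_alarm:.3f}")
```

In this made-up example the number of on-topic stories varies widely from
topic to topic while the false alarm counts stay in a narrow band, so
precision swings much more than false alarm does.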

Furthermore, precision is clearly a function of topic richness,
while we presume that false alarm is not. For the data taken
from the GE1 run, topic richness accounts for over half of the
total variance of precision. I've prepared a technical note that
explains this in more detail, including a figure that shows the
effect of richness on precision, taken from the TDT2 tracking
task and using GE1 results. I'm attaching a PostScript copy
of this note. I hope this helps explain why we have selected
miss and false alarm rates for evaluation purposes.
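
The dependence on richness can also be sketched directly. This is only an
illustration, not the analysis in the attached note: it assumes richness
means the fraction of scored stories that are on topic, and it holds the miss
and false alarm rates fixed at arbitrary example values. Under those
assumptions, precision = R(1 - Pmiss) / (R(1 - Pmiss) + (1 - R) Pfa).

```python
def precision_from_rates(richness, p_miss, p_fa):
    """Precision implied by fixed miss and false alarm rates.

    richness is assumed here to be the fraction of stories that are on
    topic; p_miss and p_fa are the miss and false alarm probabilities.
    """
    detected = richness * (1.0 - p_miss)      # on-topic stories flagged
    false_alarms = (1.0 - richness) * p_fa    # off-topic stories flagged
    flagged = detected + false_alarms
    if flagged == 0:
        raise ValueError("no stories flagged; precision is undefined")
    return detected / flagged

# Arbitrary example operating point (hypothetical, not from any run).
P_MISS = 0.10
P_FA = 0.01

for richness in (0.0005, 0.001, 0.005, 0.01, 0.05):
    p = precision_from_rates(richness, P_MISS, P_FA)
    print(f"richness {richness:.4f} -> precision {p:.3f}")
```

With the error rates held constant, precision still climbs steadily as the
topic gets richer, which is the effect the figure in the note is meant to
show.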
--
George Doddington in Orinda, CA. doddington@nist.gov 925/631-6628

Attachment: Precision.vs.FalseAlarm.ps (PostScript file)



Last updated Thu May 13 09:28:13 1999