
To: George Doddington <doddington@nist.gov>
From: Rich Schwartz <schwartz@bbn.com>
Subject: Re: Tracking eval software
Date: Tue, 17 Nov 1998 15:00:52 -0500 (EST)

George,

When you say:

> The previous version excluded empty stories from the evaluation.
> The current version INCLUDES empty stories in the evaluation.
> This change was made to prevent excusing ASR systems from gross
> failure to output text.

Do you mean that if we use the Dragon ASR and it has an empty story due to
speech recognition errors, then all of the systems will be scored as
missing this story?

I understand that it is TRUE that the story was, in fact, missed.
But given that we are all using the ASR input, I'm not sure what we're
learning here, unless we separately count how often this happened.
I doubt that this effect dominates the miss rate, though: I'd be very
surprised if the Dragon recognizer omitted as much as 1% of the sentences,
let alone 1% of whole stories!
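
To make the "separately count how often this happened" idea concrete, here
is a minimal sketch. It is not the NIST scoring software; the Trial record
and its field names are hypothetical, and it assumes each on-topic story is
marked as detected or not, with a flag saying whether the ASR output for it
was empty.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    """One story scored against one topic (hypothetical record layout)."""
    on_topic: bool   # reference judgment: story belongs to the topic
    detected: bool   # system decision: story declared on topic
    empty: bool      # ASR produced no text for this story

def miss_rates(trials):
    """Miss rate with and without empty stories, plus the share of misses
    that are attributable to empty ASR output."""
    targets = [t for t in trials if t.on_topic]
    nonempty_targets = [t for t in targets if not t.empty]
    missed = [t for t in targets if not t.detected]
    missed_nonempty = [t for t in nonempty_targets if not t.detected]
    missed_empty = [t for t in missed if t.empty]

    def ratio(num, den):
        # Guard against empty denominators in small test sets.
        return num / den if den else 0.0

    return {
        "p_miss_including_empty": ratio(len(missed), len(targets)),
        "p_miss_excluding_empty": ratio(len(missed_nonempty), len(nonempty_targets)),
        "share_of_misses_from_empty": ratio(len(missed_empty), len(missed)),
    }
```

Reporting the third number alongside the miss rate would show directly
whether empty stories matter at all for a given ASR source.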

--Rich
==========================================

On Tue, 17 Nov 1998, George Doddington wrote:

> > Is the DET curve produced by the tracking eval software a story-weighted
> > curve or a topic-weighted curve, and has it always been the same weighting?
>
> Both the previous and the current software produce story-weighted curves.
> --------
>
>
> > Why do I get a different curve from version 0.5 of the software than I got
> > with version 0.3? Was there some significant bug that was fixed?
>
> The previous version excluded empty stories from the evaluation.
> The current version INCLUDES empty stories in the evaluation.
> This change was made to prevent excusing ASR systems from gross
> failure to output text.
> --
> George Doddington at NIST: doddington@nist.gov or 301/975-3261
>
>
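
For readers unsure of the distinction raised above, the following sketch
contrasts the two weightings for a single operating point. It assumes the
usual reading of the terms: story-weighted pools all trials across topics
before dividing, while topic-weighted computes a rate per topic and then
averages the rates. The function names and the (misses, targets) input
layout are hypothetical and are not taken from the evaluation software.

```python
def story_weighted_miss(per_topic):
    """Pool misses and targets over all topics, then divide once.
    per_topic maps a topic name to a (misses, targets) pair."""
    total_misses = sum(m for m, _ in per_topic.values())
    total_targets = sum(n for _, n in per_topic.values())
    return total_misses / total_targets

def topic_weighted_miss(per_topic):
    """Compute each topic's miss rate, then average so that every topic
    counts equally regardless of how many stories it has."""
    rates = [m / n for m, n in per_topic.values() if n > 0]
    return sum(rates) / len(rates)

# Illustrative counts only: a large topic with few misses and a small
# topic with a high miss rate.
counts = {"topic_A": (1, 10), "topic_B": (1, 2)}
print(story_weighted_miss(counts))  # 2 misses over 12 targets
print(topic_weighted_miss(counts))  # mean of 0.1 and 0.5
```

With story weighting, topics with many on-topic stories dominate the
curve; with topic weighting, a small topic has as much influence as a
large one.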


Last updated Fri Dec 4 12:05:49 1998