To: George Doddington <doddington@nist.gov>
From: Rich Schwartz <schwartz@bbn.com>
Subject: Re: index file choices
Date: Fri, 18 Dec 1998 09:53:15 -0500 (EST)
I'm somewhat amused by the discussion of detailed differences between CCAP
and FDCH transcriptions. As you may remember, we have already shown that
for TRACKING there is no difference between CCAP and ASR. So surely there
is also no difference between a "true" transcription (FDCH) and one in
which all the content words are carefully preserved (CCAP).
From SDR we know also that there is a very minor effect of speech
recognition errors on retrieval of spoken documents. And we're really
just doing various forms of IR, no matter what we call it.
Now it's true that we DO observe a difference for the detection problem
where an incorrect word in a single seed article can cause a divergence.
Even there the difference is moderate -- about 25% relative. And clearly
as speech recognition inevitably improves (remember that the recognition
we're using here has nominal 23% WER, while the state of the art at 10xRT
now is about 16%) this difference will become negligible as well.
Perhaps we should focus more on the substantive issues, like what the
application is and what we're trying to accomplish, than worry about minor
details of the eval.
--Rich
=======================================================================
On Fri, 18 Dec 1998, George Doddington wrote:
> Date: Fri, 18 Dec 1998 00:10:19 -0500
> From: George Doddington <doddington@email.msn.com>
> Reply-To: George Doddington <doddington@nist.gov>
> To: "Strzalkowski, Tomek (CRD)" <strzalkowski@crd.ge.com>
> Cc: tdt-distrib@ldc.upenn.edu
> Subject: Re: index file choices
>
> >As far as I can say, FDCH is a true manual transcript (or as close to one
> >as we can get) while CCAP is a degraded manual transcript. Thus lacking
> >anything better CCAP could do as a poor substitute, but I see no reason
> >why it should be used as manual if FDCH is available. Since we are interested
> >in performance contrasts, both FDCH vs. ASR and CCAP vs. ASR are of
> >interest, although for somewhat different reasons. For me, manual = FDCH.
>
> Yes, you're absolutely right. The reason that CCAP was chosen is simply
> that it exists for all of the data, whereas FDCH covers only part of the
> data. Therefore CCAP may be compared with ASR over all audio sources.
> That is why CCAP was designated as the manual transcription, with the
> belief that, for TDT tasks, CCAP will provide similar results to FDCH,
> even if the CCAP transcription is not as faithful as the FDCH version.
> It will be interesting, of course, to compare TDT performance differences
> between CCAP and FDCH on the subset of stories for which FDCH transcripts
> exist.
> -----------------------
> George Doddington in McLean, VA. doddington@nist.gov or 703/556-3434
Last updated Wed Feb 3 10:44:19 1999