To: tdt-distrib@unagi.cis.upenn.edu
From: George Doddington <doddington@nist.gov>
Subject: Re: evaluation of story segmentation of MT'd Mandarin
Date: Tue, 30 Mar 1999 12:59:13 -0500
> On a related issue, it isn't clear to me how a "non-Mandarin"
> participant in the segmentation task will be able to handle
> Mandarin data, since we aren't planning to include a mapping
> of MT'd English words back into Mandarin characters. (Note
> that segmentation of Mandarin requires outputting story
> boundaries in terms of record ID's in the Mandarin source
> stream.)
Let me retract this statement. While Mandarin is to be scored
using an evaluation interval measured in Mandarin characters, it
is possible to score MT'd English separately using an evaluation
interval measured in terms of English words in the derivative MT'd
version of the source data. This won't give a strictly comparable
score, but it should be reasonably informative.
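
As a rough illustration of the point above, here is a minimal sketch of an interval-based segmentation score that works the same way whether the units are Mandarin characters or MT'd English words. It assumes the usual sliding-window idea: two positions one evaluation interval apart are checked for whether the reference and the hypothesis agree that a story boundary falls between them. The function name, its parameters, and the boundary convention (a boundary at index b means a new story starts at unit b) are assumptions made for this example and are not taken from the NIST scoring software.

```python
from bisect import bisect_right


def segmentation_errors(ref_boundaries, hyp_boundaries, n_units, interval):
    """Return (miss_rate, false_alarm_rate) for one source stream.

    Units may be characters or words; `interval` is the evaluation
    interval expressed in those same units (hypothetical parameter).
    """
    ref = sorted(ref_boundaries)
    hyp = sorted(hyp_boundaries)

    def separated(bounds, i, j):
        # True if at least one boundary b satisfies i < b <= j,
        # i.e. units i and j belong to different stories.
        return bisect_right(bounds, j) > bisect_right(bounds, i)

    misses = false_alarms = ref_diff = ref_same = 0
    for i in range(n_units - interval):
        j = i + interval
        in_ref = separated(ref, i, j)
        in_hyp = separated(hyp, i, j)
        if in_ref:
            ref_diff += 1
            if not in_hyp:
                misses += 1  # reference boundary not found by the system
        else:
            ref_same += 1
            if in_hyp:
                false_alarms += 1  # system boundary with no reference boundary

    miss_rate = misses / ref_diff if ref_diff else 0.0
    fa_rate = false_alarms / ref_same if ref_same else 0.0
    return miss_rate, fa_rate
```

Under this sketch, the Mandarin condition would pass boundaries and an interval counted in characters, and the MT'd English condition would pass boundaries and an interval counted in words. Because the two intervals cover different amounts of text, the resulting rates are not strictly comparable.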
--
George Doddington at NIST: doddington@nist.gov or 301/975-3261
Last updated Thu May 13 09:28:22 1999