(049) previous ~ index ~ next

To: Rich Schwartz <schwartz@bbn.com>
From: Mark Liberman <myl@unagi.cis.upenn.edu>
Subject: Re: TDT3 dry run
Date: Fri, 26 Mar 1999 20:23:26 EST

>
> But here's a question that may indicate I haven't been paying
>attention.

Indeed, you haven't. While we have your attention, Rich, are you willing
to provide a frequency-sorted list of English named-entities from TDT-2,
to be used in extending the bilingual glossary?

> Were there going to be English translations of the Mandarin
>materials produced?

Yes. As discussed earlier, the LDC is buying the Systran Mandarin->English
system, and will be supplying translations of all the TDT-3 Mandarin
material. I posted a sample translation of a VOA transcript earlier.

However, I don't see what this really has to do with the task
definition. As I understood George's description at the workshop,
the tasks will be just as before, except that some of the data is
Mandarin and some is English. If each site had its own copy of the
Mandarin->English translation system, they could run it on either the
unsegmented or segmented input, and use it as they please in training
or testing their algorithms in each of the various task conditions.

We'll help you save (a little) money and (more significant) time by
running the system for everyone. It is worth discussing how we will
present the data so as to signal the cross-language correspondence.
One possibility would be external pointers that enable you to recover the
corresponding story boundaries in the two text streams, for the case
in which you are using veridical story boundaries.

Since you cannot be prevented from buying your own copy of Systran
Mandarin->English and incorporating it into your system, we would aim
to enable you to do anything you could do with that capability. But
none of this would change the task definition in any way.

I think that the Dragon folks were making a more radical suggestion, namely
that the Mandarin materials should be translated into English and then
(from the perspective of the task definition) thrown away. Our local
research effort would find that strategy an uninteresting one, but of
course it is up to George and others to decide if the task definitions
(especially of segmentation) should be changed to accommodate it.

-Mark





Last updated Thu May 13 09:28:20 1999