(043) previous ~ index ~ next

To: tdt-distrib@unagi.cis.upenn.edu
From: Mark Liberman <myl@unagi.cis.upenn.edu>
Subject: TDT-3 Mandarin Resources
Date: Wed, 24 Mar 1999 21:42:22 EST

TDT-ers,

Thanks to Shudong Huang, Xiaoyi Ma and Zhibiao Wu, a variety of
Mandarin information, links and resources are available at
http://www.ldc.upenn.edu/Projects/Chinese

Perhaps the most important resource is a pair of bilingual glossaries,
Mandarin-to-English and English-to-Mandarin. Each consists of a list
of words in the source language, each paired with a list of sets of
words in the target language. They have been compiled from a set of
diverse resources, partly LDC-internal but mainly from the web. Some
of the materials from the web come with copyright restrictions, which
we have retained in the compilation.

For the moment, each of the two lists has been compiled from
completely separate sources. Over the next few days, we will invert
the lists and recombine the results. This is obviously easy to do in a
blind way, but we hope to produce a somewhat cleaner result by taking
some care. We will also add other material as we can find or produce
it.
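
To illustrate what inverting a list "in a blind way" could look like,
here is a minimal Python sketch. It is not the LDC procedure or file
format: the in-memory layout (a dict from each source word to a list of
sets of target words), the function names invert_glossary and
merge_glossaries, and the toy entries are all assumptions made for
illustration.

```python
from collections import defaultdict


def invert_glossary(glossary):
    """Blindly invert {source: [set_of_targets, ...]}.

    Every target word is mapped to one set holding all source words
    that listed it, regardless of which sense set it came from.
    """
    inverted = defaultdict(set)
    for source, target_sets in glossary.items():
        for targets in target_sets:
            for target in targets:
                inverted[target].add(source)
    # Wrap each result in a list so the output has the same
    # "list of sets" shape as the input (assumed layout).
    return {target: [sources] for target, sources in sorted(inverted.items())}


def merge_glossaries(primary, extra):
    """Recombine two glossaries, skipping sets already present."""
    merged = {word: [set(s) for s in sets] for word, sets in primary.items()}
    for word, sets in extra.items():
        existing = merged.setdefault(word, [])
        for s in sets:
            if s not in existing:
                existing.append(set(s))
    return merged


# Toy data (hypothetical entries, not taken from the LDC glossaries).
mandarin_to_english = {
    "银行": [{"bank"}],
    "河岸": [{"bank", "riverbank"}],
}
english_to_mandarin = {"bank": [{"银行"}]}

inverted = invert_glossary(mandarin_to_english)
combined = merge_glossaries(english_to_mandarin, inverted)
```

In this toy case the blind inversion puts both Mandarin words for
"bank" into a single set, losing the distinction between the two
senses. That is the kind of result that taking some care would aim to
clean up.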

The referenced web page also provides some general information about the
Chinese languages, the Chinese writing system, various resources for
entering and displaying Chinese characters, and so on.

As soon as we clarify some permissions, we'll also make a Mandarin
segmenter available on the same web site.

Finally, we have a demo version of the Systran Mandarin-English translation
system. I'll send a sample translation in a separate message. As soon
as we can get the full version purchased and installed, we'll prepare
automatic translations of all the Mandarin TDT-3 material.

-Mark Liberman



Last updated Thu May 13 09:28:20 1999