To: tdt-distrib@unagi.cis.upenn.edu
From: David Graff <graff@unagi.cis.upenn.edu>
Subject: Re: index files for devtest
Date: Thu, 03 Sep 1998 11:45:56 EDT

Folks,

Regarding the broadcast files that have SGML and token stream data but
no ASR data: this problem stems from a mishap at LDC in delivering
audio data to Dragon for ASR processing. During the first week of
July, we realized that the following data files had been missing from
the initial CD-ROMs of audio data sent to Dragon:

19980318_1600_1700_VOA_WRP
19980318_1700_1800_VOA_TDY
19980319_1600_1700_VOA_WRP
19980319_1700_1800_VOA_TDY
19980325_1600_1700_VOA_WRP
19980325_1700_1800_VOA_TDY
19980424_1600_1630_CNN_HDL (now famous from recent email discussions)

Because Dragon did not receive these audio files until early July,
the ASR data for them was not available for the devtest data release.
(The VOA files listed above were entirely absent from the devtest
release, due to other data-prep problems involving VOA files.)

We hope to put together a complete re-release of training and devtest
data as soon as possible after the next PI meeting, which should fill
a number of miscellaneous holes in the file inventory.

Dave Graff

Last updated Wed Sep 9 09:40:55 1998