To: tdt-distrib@unagi.cis.upenn.edu
From: David Graff <graff@unagi.cis.upenn.edu>
Subject: Duplicate file in TDT2 English data
Date: Mon, 02 Aug 1999 19:02:59 EDT

Folks,

It was brought to our attention recently that the following file in
the TDT2 English data set:

19980209_2000_2100_PRI_TWD

is actually a duplicate of this file:

19980216_2000_2100_PRI_TWD

The problem arose because the initial, "broadcast-time" capture of
both these programs had failed, and the LDC had received cassette
tapes for both programs. When the "19980209" audio file was created,
the wrong cassette tape (i.e. "19980216") was used, and the error
went unnoticed throughout the rest of the corpus creation project.

As a result, all files associated with 19980209_2000_2100_PRI_TWD
have text content that was actually broadcast on Feb. 16 and that is
essentially identical to the content of 19980216_2000_2100_PRI_TWD.

We apologize for the confusion this causes. Please eliminate
19980209_2000_2100_PRI_TWD from all consideration -- this broadcast
should be classed as having never been recorded for use in TDT2.

Dave Graff
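
A recipient who keeps per-broadcast file lists could apply the
exclusion requested above with a small filter along the following
lines. This is only a sketch and not part of the LDC distribution:
the helper names, the ".sgm" and ".tkn" extensions, and the "tkn/"
directory are hypothetical, and it assumes each file's base name
(before the first ".") is the broadcast ID quoted in the message.

```python
# Broadcast ID withdrawn by the notice above.
EXCLUDED_BROADCASTS = {"19980209_2000_2100_PRI_TWD"}


def broadcast_id(filename):
    """Return the base name without directory or extension (assumed to be the broadcast ID)."""
    base = filename.replace("\\", "/").rsplit("/", 1)[-1]
    return base.split(".", 1)[0]


def drop_excluded(filenames, excluded=EXCLUDED_BROADCASTS):
    """Return the file names whose broadcast ID is not in the excluded set."""
    return [name for name in filenames if broadcast_id(name) not in excluded]


if __name__ == "__main__":
    # Hypothetical file names, used only to illustrate the filter.
    files = [
        "19980209_2000_2100_PRI_TWD.sgm",
        "tkn/19980216_2000_2100_PRI_TWD.tkn",
    ]
    print(drop_excluded(files))
```

The 19980216 file is kept on purpose: per the message, it is the
genuine recording, and only the mislabeled 19980209 copy is dropped.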

Last updated Thu Aug 19 16:14:47 1999