(194) previous ~ index ~ next
To: tdt-distrib@ldc.upenn.edu
From: Jon_Yamron@Dragonsys.com
Subject: BBN ASR
Date: Mon, 27 Sep 1999 14:06:08 -0400
I have encountered the following anomaly in the corpus. The file
as1/19980107_1130_1200_CNN_HDL.as1
contains the sequence of lines
<X Bsec=1602.48 Dur=0.23 Conf=NA>
<W recid=3189 Bsec=1602.71 Dur=0.00 Clust=NA Conf=NA>
<X Bsec=1602.71 Dur=2.89 Conf=NA>
In other words, the transcript contains a line indicating a word of zero
duration, but no word actually appears at the end of that line (our format
requires one), and the line is sandwiched between two pauses.
I don't know if there are other examples, but it didn't take long for our parser
to find (and break on) this one...
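
For anyone who wants to check for other examples, here is a minimal sketch of a scan, not our actual parser. It assumes each record looks like the three lines quoted above, i.e. a tag with key=value attributes in angle brackets, followed on <W ...> lines by the word itself. The names find_anomalies, RECORD and ATTR are made up for this illustration.

```python
import re

# Assumed record shape, inferred only from the three lines quoted above:
#   <TAG key=value key=value ...> optional-word
RECORD = re.compile(r"^<(?P<tag>\w+)(?P<attrs>[^>]*)>\s*(?P<word>.*)$")
ATTR = re.compile(r"(\w+)=(\S+)")


def find_anomalies(lines):
    """Return (line_number, reason, text) for each suspicious <W ...> record."""
    hits = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        match = RECORD.match(text)
        # Only word records are checked; <X ...> and unrecognized lines are skipped.
        if match is None or match.group("tag") != "W":
            continue
        attrs = dict(ATTR.findall(match.group("attrs")))
        reasons = []
        if not match.group("word"):
            reasons.append("missing word")
        try:
            # A missing Dur attribute becomes NaN, which never equals 0.0.
            if float(attrs.get("Dur", "nan")) == 0.0:
                reasons.append("zero duration")
        except ValueError:
            reasons.append("unparseable Dur")
        if reasons:
            hits.append((number, ", ".join(reasons), text))
    return hits
```

Passing an open file works, since the function only iterates over lines, e.g. find_anomalies(open(path)) for each file under as1/. Reporting the line instead of raising keeps one bad record from stopping the whole scan.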
- Jon
Last updated Tue Sep 28 10:38:17 1999