
To: "'tdt-distrib@unagi.cis.upenn.edu'" <tdt-distrib@unagi.cis.upenn.edu>
From: "Strzalkowski, Tomek (CRD)" <strzalkowski@exc01crdge.crd.ge.com>
Subject: FW: More Problems with TDT2 devset data
Date: Wed, 29 Jul 1998 12:06:08 -0400

> ----------
> From: strzalk@schedar[SMTP:strzalk@schedar]
> Sent: Wednesday, July 29, 1998 12:13 PM
> To: strzalkowski@crdns.crd.ge.com
> Subject: More Problems with TDT2 devset data
>
>
> ----- Begin Included Message -----
>
> From wisegb@crd.ge.com Wed Jul 29 11:27:37 1998
> Sender: wisegb@crd.ge.com
> Date: Wed, 29 Jul 1998 11:28:37 -0400
> From: "G. Bowden Wise" <wisegb@crd.ge.com>
> Organization: GE Corporate Research & Development
> X-Mailer: Mozilla 4.05 [en] (X11; I; SunOS 5.5.1 sun4m)
> Mime-Version: 1.0
> To: Tomek Strzalkowski <strzalk@thuban.crd.ge.com>
> Subject: More Problems with TDT2 devset data
> Content-Transfer-Encoding: 7bit
>
> Tomek
>
> We discovered some other problems with some of the TDT2
> data (for the development set). In particular, some
> records in the bndtkn and bndasr files do not have the
> Brecid or Erecid tags; they only have the Bsec and Esec (TIME)
> tags.
>
> Can you please forward to the TDT folks; thanks!
>
> This means that when we track via RECID we can't
> use those docs that are missing Brecid/Erecid
> (this is for the case where boundaries are given, of course).
>
> Many of these have doctype=UNTRANSCRIBED, which we ignore
> anyway, but there are some with doctype=NEWS. For now
> we just skip docs that don't have corresponding Brecid/Erecid.
>
> Of the affected records, 23 are NEWS and
> 32 are UNTRANSCRIBED.
>
> Here is a grep for the records missing Brecid/Erecid in the
> files tdt_deliv_980708/tables/*.bnd*:
>
> 19980301_1300_1330_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980301.1300.1409
> doctype=UNTRANSCRIBED Bsec=1409.32 Esec=1416.55>
> 19980302_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980302.1600.1379
> doctype=UNTRANSCRIBED Bsec=1379.79 Esec=1389.58>
> 19980304_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980304.1600.1405
> doctype=UNTRANSCRIBED Bsec=1405.36 Esec=1416.70>
> 19980305_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980305.1600.1404
> doctype=UNTRANSCRIBED Bsec=1404.36 Esec=1413.97>
> 19980306_1130_1200_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980306.1130.0319
> doctype=NEWS Bsec=319.03 Esec=341.00>
> 19980308_1300_1330_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980308.1300.1418
> doctype=UNTRANSCRIBED Bsec=1418.13 Esec=1425.44>
> 19980309_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980309.1600.1809
> doctype=UNTRANSCRIBED Bsec=1809.79 Esec=1854.02>
> 19980311_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980311.1600.1152
> doctype=UNTRANSCRIBED Bsec=1152.38 Esec=1168.92>
> 19980311_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980311.1600.1251
> doctype=UNTRANSCRIBED Bsec=1251.14 Esec=1301.27>
> 19980313_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980313.1600.1411
> doctype=UNTRANSCRIBED Bsec=1411.35 Esec=1416.88>
> 19980315_1300_1330_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980315.1300.0975
> doctype=NEWS Bsec=975.00 Esec=981.80>
> 19980315_1300_1330_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980315.1300.0981
> doctype=NEWS Bsec=981.80 Esec=992.94>
> 19980316_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980316.1600.1084
> doctype=UNTRANSCRIBED Bsec=1084.81 Esec=1096.00>
> 19980318_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980318.1130.1093
> doctype=UNTRANSCRIBED Bsec=1093.50 Esec=1106.44>
> 19980321_1000_1030_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980321.1000.1414
> doctype=UNTRANSCRIBED Bsec=1414.23 Esec=1423.64>
> 19980322_1000_1030_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980322.1000.1411
> doctype=UNTRANSCRIBED Bsec=1411.86 Esec=1421.65>
> 19980322_1300_1330_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980322.1300.1413
> doctype=UNTRANSCRIBED Bsec=1413.97 Esec=1420.21>
> 19980322_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980322.1600.1413
> doctype=UNTRANSCRIBED Bsec=1413.65 Esec=1420.51>
> 19980323_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980323.1600.0953
> doctype=NEWS Bsec=953.55 Esec=962.34>
> 19980323_1830_1900_ABC_WNT.bndasr:<BOUNDARY docno=ABC19980323.1830.0492
> doctype=NEWS Bsec=492.48 Esec=502.03>
> 19980326_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980326.1130.1075
> doctype=UNTRANSCRIBED Bsec=1075.83 Esec=1093.41>
> 19980326_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980326.1130.1093
> doctype=UNTRANSCRIBED Bsec=1093.41 Esec=1114.00>
> 19980328_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980328.1600.0987
> doctype=UNTRANSCRIBED Bsec=987.60 Esec=996.17>
> 19980329_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980329.1600.1318
> doctype=UNTRANSCRIBED Bsec=1318.48 Esec=1330.29>
> 19980331_1700_1800_VOA_TDY.bndtkn:<BOUNDARY docno=VOA19980331.1700.2954
> doctype=NEWS Bsec=2954.97 Esec=2961.88>
> 19980331_1830_1900_ABC_WNT.bndasr:<BOUNDARY docno=ABC19980331.1830.0601
> doctype=NEWS Bsec=601.54 Esec=612.30>
> 19980331_1830_1900_ABC_WNT.bndasr:<BOUNDARY docno=ABC19980331.1830.0612
> doctype=NEWS Bsec=612.30 Esec=622.92>
> 19980331_1830_1900_ABC_WNT.bndasr:<BOUNDARY docno=ABC19980331.1830.0622
> doctype=NEWS Bsec=622.92 Esec=635.76>
> 19980331_2130_2200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980331.2130.1018
> doctype=UNTRANSCRIBED Bsec=1018.33 Esec=1032.42>
> 19980402_2000_2100_PRI_TWD.bndtkn:<BOUNDARY docno=PRI19980402.2000.1749
> doctype=UNTRANSCRIBED Bsec=1749.87 Esec=1763.99>
> 19980406_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980406.1600.1333
> doctype=UNTRANSCRIBED Bsec=1333.20 Esec=1341.04>
> 19980406_2130_2200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980406.2130.0981
> doctype=UNTRANSCRIBED Bsec=981.69 Esec=996.05>
> 19980407_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980407.1600.0574
> doctype=NEWS Bsec=574.26 Esec=591.24>
> 19980408_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980408.1130.1025
> doctype=UNTRANSCRIBED Bsec=1025.78 Esec=1050.29>
> 19980409_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980409.1600.1336
> doctype=UNTRANSCRIBED Bsec=1336.44 Esec=1347.26>
> 19980410_1130_1200_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980410.1130.1324
> doctype=NEWS Bsec=1324.91 Esec=1333.98>
> 19980410_1130_1200_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980410.1130.1333
> doctype=NEWS Bsec=1333.98 Esec=1344.60>
> 19980411_1300_1330_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980411.1300.1306
> doctype=NEWS Bsec=1306.91 Esec=1321.10>
> 19980412_1000_1030_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980412.1000.1492
> doctype=NEWS Bsec=1492.24 Esec=1512.22>
> 19980414_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980414.1600.0924
> doctype=NEWS Bsec=924.35 Esec=943.17>
> 19980414_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980414.1600.1016
> doctype=NEWS Bsec=1016.41 Esec=1035.39>
> 19980414_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980414.1600.1042
> doctype=UNTRANSCRIBED Bsec=1042.17 Esec=1052.07>
> 19980414_1800_1900_VOA_TDY.bndasr:<BOUNDARY docno=VOA19980414.1800.1160
> doctype=NEWS Bsec=1160.89 Esec=1193.22>
> 19980422_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980422.1130.1329
> doctype=UNTRANSCRIBED Bsec=1329.72 Esec=1341.01>
> 19980423_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980423.1130.1354
> doctype=UNTRANSCRIBED Bsec=1354.07 Esec=1365.06>
> 19980424_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980424.1130.1351
> doctype=UNTRANSCRIBED Bsec=1351.22 Esec=1359.78>
> 19980427_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980427.1600.0892
> doctype=NEWS Bsec=892.48 Esec=902.82>
> 19980427_1800_1900_VOA_TDY.bndasr:<BOUNDARY docno=VOA19980427.1800.1085
> doctype=NEWS Bsec=1085.29 Esec=1094.84>
> 19980427_2000_2100_PRI_TWD.bndasr:<BOUNDARY docno=PRI19980427.2000.1667
> doctype=NEWS Bsec=1667.74 Esec=1691.17>
> 19980428_0130_0200_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980428.0130.0934
> doctype=NEWS Bsec=934.03 Esec=947.00>
> 19980428_1130_1200_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980428.1130.1081
> doctype=UNTRANSCRIBED Bsec=1081.15 Esec=1092.91>
> 19980428_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980428.1600.0351
> doctype=NEWS Bsec=351.71 Esec=369.99>
> 19980428_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980428.1600.1024
> doctype=UNTRANSCRIBED Bsec=1024.36 Esec=1031.24>
> 19980430_1600_1630_CNN_HDL.bndasr:<BOUNDARY docno=CNN19980430.1600.0571
> doctype=NEWS Bsec=571.42 Esec=596.29>
> 19980430_1600_1630_CNN_HDL.bndtkn:<BOUNDARY docno=CNN19980430.1600.1665
> doctype=UNTRANSCRIBED Bsec=1665.20 Esec=1671.88>
>
> --
> -------------------------------------------------------------------
> G. Bowden Wise           General Electric Company
> wisegb@crd.ge.com        Corporate Research and Development
> Phone: 518 387-5175      Dial Comm: 8*833-5175      FAX: 518-387-6845
>
>
> ----- End Included Message -----
>
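
The check described in the message (find BOUNDARY records that carry only
Bsec/Esec, then count them by doctype) could be reproduced with something
like the sketch below. It is not the script the authors used; they report
using grep. It assumes that each record is a single <BOUNDARY ...> tag with
unquoted name=value attributes, as in the listing above. The function names,
the file encoding, and the third sample record (one that does carry
Brecid/Erecid) are hypothetical and only there for illustration.

```python
import glob
import re
from collections import Counter

# One <BOUNDARY ...> tag; [^>]* also spans a line break inside the tag.
BOUNDARY_RE = re.compile(r"<BOUNDARY\s+([^>]*)>")
# Unquoted name=value pairs, as seen in the quoted grep output.
ATTR_RE = re.compile(r"(\w+)=(\S+)")


def parse_boundaries(text):
    """Yield a dict of attributes for every <BOUNDARY ...> tag in text."""
    for match in BOUNDARY_RE.finditer(text):
        yield dict(ATTR_RE.findall(match.group(1)))


def missing_recid(text):
    """Return the boundary records that lack Brecid or Erecid."""
    return [b for b in parse_boundaries(text)
            if "Brecid" not in b or "Erecid" not in b]


def tally_by_doctype(boundaries):
    """Count boundary records per doctype value."""
    return Counter(b.get("doctype", "UNKNOWN") for b in boundaries)


def scan_files(pattern):
    """Collect (path, record) pairs for all files matching a glob pattern.

    The latin-1 encoding is an assumption; it never fails to decode.
    """
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="latin-1") as handle:
            hits.extend((path, b) for b in missing_recid(handle.read()))
    return hits


# Two records copied from the listing above, plus one hypothetical
# record that does have Brecid/Erecid (its values are made up).
SAMPLE = """\
<BOUNDARY docno=CNN19980301.1300.1409
doctype=UNTRANSCRIBED Bsec=1409.32 Esec=1416.55>
<BOUNDARY docno=CNN19980306.1130.0319
doctype=NEWS Bsec=319.03 Esec=341.00>
<BOUNDARY docno=EXAMPLE.0001 doctype=NEWS Brecid=1 Erecid=42 Bsec=0.00 Esec=10.00>
"""

if __name__ == "__main__":
    incomplete = missing_recid(SAMPLE)
    for record in incomplete:
        print(record["docno"], record["doctype"])
    print(dict(tally_by_doctype(incomplete)))
```

Running scan_files("tdt_deliv_980708/tables/*.bnd*") against the delivery and
passing the records to tally_by_doctype should give per-doctype counts that
can be compared with the 23 NEWS and 32 UNTRANSCRIBED reported above. A
tracker that works by RECID could use the same missing_recid test to decide
which docs to skip.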

Last updated Wed Sep 9 09:40:53 1998