To: Hubert Jin <hjin@bbn.com>
From: Jonathan Fiscus <jonathan.fiscus@nist.gov>
Subject: Re: TDT-2 Evaluation Website
Date: Thu, 04 Jun 1998 08:00:31 -0400
This is a multi-part message in MIME format.
--------------C079B284903976F736FC3332
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Hubert,
Before the last meeting, I created a dry-dry-run test set which included
test indexes as well as the text corpora. You should look at that
distribution for an example.
I would like to note that the format of the index files will change
slightly as a result of the last meeting. I haven't re-worked the index
files yet; in fact, I'm working on them today. I hope to have the new
format released shortly.
The message explaining the location of the dry-dry-run test corpus is
attached.
Jon
Hubert Jin wrote:
>
> Hi John,
>
> In the TDT2 evaluation plan, it is mentioned that there will be
> some index files, each contains a list of source filenames being
>                                           ~~~~~~~~~~~~~~~~
> processed.
>
> Based on discussion, I assume that these source filenames mean the
> *.bndtkn. Please let me know if I am wrong about that. Also, it
> would be helpful if someone can distribute one example of the index
> files. [The distributed evaluation code TDT2trk.pl needs to take such
> an index file as input].
>
> Thanks,
>
> -Hubert
>
> On Thu, 28 May 1998 john.garofolo@nist.gov wrote:
>
> > Folks,
> >
> > We have created a new Website for the TDT-2 evaluation. The new site
> > provides a terse central page with pointers to relevant information at
> > several locations. It is located at
> > http://www.nist.gov/speech/tdt98/tdt98.htm
> >
> > If you think that something has been left out that should be
> > included, please email me (john.garofolo@nist.gov) and
> > I'll add it.
> >
> > Thanks,
> >
> > John G.
> >
--
Jon Fiscus
NIST
Email: jfiscus@nist.gov
Phone: (301) 975-3182
--------------C079B284903976F736FC3332
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Return-Path: <john@jaguar.ncsl.nist.gov>
Received: from email.nist.gov by jaguar (SMI-8.6/SMI-SVR4)
id HAA24487; Thu, 4 Jun 1998 07:57:50 -0400
Received: from jaguar (jaguar.ncsl.nist.gov [129.6.84.212])
by email.nist.gov (8.8.8/8.8.8) with SMTP id HAA29768
for <jonathan.fiscus@nist.gov>; Thu, 4 Jun 1998 07:57:29 -0400 (EDT)
Received: from mjfrog.snlp.gov by jaguar (SMI-8.6/SMI-SVR4)
id HAA24484; Thu, 4 Jun 1998 07:57:49 -0400
Received: by mjfrog.snlp.gov (SMI-8.6/SMI-SVR4)
id HAA04736; Thu, 4 Jun 1998 07:58:37 -0400
Resent-From: john.garofolo@nist.gov
Resent-Message-Id: <9806040758.ZM4734@mjfrog.ncsl.nist.gov>
Resent-Date: Thu, 4 Jun 1998 07:58:36 -0400
X-Mailer: Z-Mail (3.2.1 10apr95)
Resent-To: <jonathan.fiscus@nist.gov>
Received: from email.nist.gov by jaguar (SMI-8.6/SMI-SVR4)
id PAA14302; Thu, 30 Apr 1998 15:45:19 -0400
Received: from linc.cis.upenn.edu (LINC.CIS.UPENN.EDU [158.130.12.3])
by email.nist.gov (8.8.8/8.8.8) with ESMTP id PAA19031;
Thu, 30 Apr 1998 15:44:29 -0400 (EDT)
Received: from unagi.cis.upenn.edu (UNAGI.CIS.UPENN.EDU [158.130.8.153])
by linc.cis.upenn.edu (8.8.5/8.8.5) with ESMTP id PAA25197;
Thu, 30 Apr 1998 15:45:48 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
by unagi.cis.upenn.edu (8.8.5/8.8.5) with SMTP id PAA19214
for <tdt-distrib>; Thu, 30 Apr 1998 15:45:48 -0400 (EDT)
Message-Id: <199804301945.PAA19214@unagi.cis.upenn.edu>
To: tdt-distrib@unagi.cis.upenn.edu
Subject: Getting the TDT Phase 2 '98 Evaluation Scoring Tools
Date: Thu, 30 Apr 1998 15:45:48 EDT
From: David Graff <graff@unagi.cis.upenn.edu>
Folks,
Here is the information on how to get the TDT2 "dry-dry-run" test
corpus that was just announced by Jon Fiscus, along with his comments
about it:
------------------
Folks,
NIST has released the dry-dry-run evaluation set. You can download
the data using the instructions below. Please read the dry-dry-run
information in the file 'Dry-Dry-Run.htm' with your favorite browser.
Please direct any comments or questions to Jon Fiscus, 'jfiscus@nist.gov'.
Jon
------------------
[ftp instructions available on request from graff@ldc.upenn.edu]
The tar file is 66451046 bytes (compressed). You will find that it
contains a complete set of January data (sgml text files, asr and tkn
text files, all tables), which SUPERSEDES the version of this data
released by the LDC on April 7 -- the various problems in that
earlier release that I reported recently have all been repaired.
Dave Graff
--------------C079B284903976F736FC3332--
Last updated Wed Sep 9 09:40:50 1998