To: Paul van Mulbregt <firstname.lastname@example.org>
From: Jonathan Fiscus <email@example.com>
Subject: Re: [Fwd: New Dry Run resources]
Date: Tue, 11 Jul 2000 09:35:29 -0400
Paul et al.,
Here's the URL for the PostScript version of the evaluation plan
comparison, v1.0 versus v1.1.
Paul van Mulbregt wrote:
> The last time you released an update to the eval plan, you also supplied a
> Word file with diffs. I found that very helpful. Do you have such a diffs
> file this time?
> At 04:14 PM 03-07-00 -0400, you wrote:
> >I'm releasing three things today for the TDT2000 Dry Run: dry-run index
> >files, an updated evaluation plan (v1.1), and boundary files generated
> >by automatic story segmentation. The relevant URLs are:
> >Soon (once the FTP server is working again), the resources will be
> >available at their intended locations:
> >Index Files
> >This is an interim release of the index files. The files
> >conform to the TDT spec syntactically, but the certified off-topic
> >stories are placeholders. The LDC is re-annotating the certified "no"
> >stories to bring them in line with the new evaluation specification.
> >As soon as I return from my vacation (on July 11th), I'll issue
> >updated index files.
> >Evaluation Plan
> >The primary changes to the TDT2000 eval plan (from TDT3) are
> >changes in the topic tracking task and the link detection task.
> >These two tasks represent the primary interest of the sponsor
> >and should serve as the primary focus of TDT R&D. The changes are:
> >Topic Tracking:
> > * Single-language training only. This does not indicate a lack
> > of interest in cross-language topic tracking. On the contrary,
> > cross-language tracking is of prime interest. However, dual-
> > language training has been eliminated because dual-language
> > training:
> > - results in essentially single-language topic tracking.
> > - avoids the hard cross-language issues.
> > - clutters the results with data of secondary value.
> > - fragments the research effort and causes unnecessary work.
> > * Negative example training stories. This is done by certifying
> > as off-topic those off-topic training stories that are very similar
> > to the on-topic training stories.
> >Link Detection:
> > * Extension of the task to include cross-language story pairs.
> > This is done without additional annotation effort by deriving
> > the link judgements from topic annotation for the given topics
> > (as was demonstrated successfully in the TDT3 dry run).
> >Automatic Story Segmentation Boundary Files
> >IBM graciously provided NIST with the output of their segmentation
> >system for both the TDT2 and TDT3 corpora. This release contains
> >boundary files (in TDT corpus format) generated from the output of
> >their automatic story segmenter.
> >Jonathan Fiscus
> >National Inst. of Stds. and Tech.
> >100 Bureau Dr. Stop 8940
> >Gaithersburg, MD 20899-8940
> >Phone: (301) 975-3182
> >Email: firstname.lastname@example.org
> Paul van Mulbregt, Dragon Systems Inc., Newton, MA. (617) 965-5200
> email: email@example.com
National Inst. of Stds. and Tech.
100 Bureau Dr. Stop 8940
Gaithersburg, MD 20899-8940
Phone: (301) 975-3182
Last updated Tue Jul 11 10:00:16 2000