
To: Jaime Carbonell <jgc@nl.cs.cmu.edu>
From: Jonathan Fiscus <jonathan.fiscus@nist.gov>
Subject: Re: TDT2 Dev-Test submission instructions
Date: Fri, 04 Sep 1998 16:06:37 -0400

Jaime,

I'm only responding to this paragraph initially to get things rolling.
My local copy of tdt_del_980708 does not have any lower case 'nyt'
strings anywhere in the 'tables' directory, nor are there any in the
tkntext directory. I think you have an 'old/bad' version of the dev-test
corpus. Attached is the email sent out by Dave Graff announcing the
availability of the data. Please reload this version and make sure that
your data matches.

I'll write more later; I have to run.

Jon


Jaime Carbonell wrote:
>
> Jon et al,
>
> We are in a bit of a quandary as to exactly what to submit for
> the "official" evaluation because there are known bugs that have
> not been addressed (e.g. the ABC missing stories, the "nyt" vs "NYT"
> both reported to you earlier on 27-August, etc.)
>
> If we submit runs according to specification, the evaluation software
> bombs out (error message, no results) because of the bugs. This
> is not useful.
>
> An alternative is to introduce hacks into our system in order to
> make it bug-compatible with the current evaluation software. In
> other words we could waste considerable time introducing mirror
> images of the current bugs in order to try to get the eval sw to
> run to completion -- and throw away all this work when the eval software
> is finally working right. This is inefficient not only because
> it is wasted work in the long run, but it also must be done
> by every site. And, it leaves a bad taste.
>
> So, what we'll do at CMU is, barring further developments:
>
> 1. Submit runs done correctly (even if the eval sw can't digest them)
>
> 2. Try to run our own evaluation for the above, using internal
> eval methods + adapting TDT1 eval
>
> 3. If time permits, we may introduce workaround hacks to our code
> to try to get something half-way meaningful from the TDT-2 eval
> software. This may require extending submission by a day or two.
>
> This has been a frustrating time for us. We want the eval software
> to help us develop better algorithms and methods -- i.e. to sharpen
> and focus the research process. TDT-1 worked well in this respect.
>
> Things have been otherwise for TDT-2. And we do realize it's more
> complicated, and that Jon is overworked with multiple projects, and
> trying the best he can. And this is massive work for LDC, and so on.
> However, it's hard to do experimental research without functional
> measuring devices.
>
> --Jaime et CMUers
>
>

--
Jon Fiscus
NIST
Email: jfiscus@nist.gov
Phone: (301) 975-3182

Last updated Wed Sep 9 09:40:56 1998