To: Steve Lowe <firstname.lastname@example.org>
From: Rich Schwartz <email@example.com>
Subject: Re: Default evaluation conditions
Date: Tue, 22 Dec 1998 12:06:35 -0500 (EST)
I agree that the time is pretty tight. We also have only about 2
days this week and 2 days in January. We are basically limited to things
that just rerun the existing systems with different data or with a very
minor change that disables something.
I guess the constraint was that Jon wanted to come out with the
scored results by then. The only other way around this would be to have
them score the baseline systems sooner and make the contrasts independent
of the baseline systems; that way, the contrasts could come later.
The other deadline is January 15th, for abstracts to the workshop. So if
you need these results for a paper, they couldn't really come much later
than the 6th anyway.
This whole fall has been a nightmare, with an evaluation every two
weeks. We have to think about how to get through this next year; this
year was not good. There are too many different domains and evaluations,
and far too many of them are new and untested, so each evaluation carries
the added pressure of not knowing whether the mechanism will work. This
promises to be WORSE next year, since there will be even more domains
(event extraction), and TDT will probably be a new and unrelated problem
as well.
On Tue, 22 Dec 1998, Steve Lowe wrote:
> Date: Tue, 22 Dec 1998 11:19:24 -0500
> From: Steve Lowe <firstname.lastname@example.org>
> To: email@example.com
> Cc: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org,
> email@example.com, firstname.lastname@example.org, email@example.com
> Subject: Re: Default evaluation conditions
> Regarding this request for contrastive results, we at Dragon have
> several contrastive submissions planned, but the deadline of January
> 6th is a little inconvenient. Is anyone else planning to submit
> contrasts, and does anyone else find this date awkward? Some of us
> will be taking holiday vacation, and expect to return to work on
> Monday the 4th---that makes Wednesday the 6th a little tight! We
> believe we will be able to make that deadline, but it is not, as I
> say, convenient.
> Could the deadline be delayed a week to Wednesday, January 13th?
> Date: Mon, 14 Dec 1998 11:11:40 -0500
> From: Jonathan Fiscus <email@example.com>
> Organization: NIST
> I spoke to Charles late last week and he was concerned that no one would
> submit results for other than the default TDT evaluation conditions
> (e.g., segmentation on ASR transcripts, and Tracking and Detection on the
> newswire text plus ASR transcripts).
> I'd like to encourage participants to run their systems on more than the
> default evaluation conditions. I know that time is a precious commodity
> during these dim hours before the submission deadline, so I'd like to
> propose an alternative.
> In our many other evaluations, there are two submission deadlines, a
> deadline for primary evaluation conditions and a deadline for
> contrastive evaluation conditions.
> TDT has only a single deadline, December 21. In the interest of
> obtaining greater breadth of coverage in both source file conditions and
> task parameter variations (i.e., deferral periods), I propose a second,
> contrastive evaluation deadline: January 6th, 12:00 noon EST.
> The default evaluation conditions will still be due December 21st (and
> if you are planning other non-default evaluation conditions, we will
> accept them at that time also).
> The contrastive evaluation deadline will include two types of system
> results:
> 1) results generated for non-default evaluation conditions by a system
> previously submitted on the December 21 deadline.
> 2) results generated by a system that explicitly breaks any of the
> evaluation rules. The intent here is to test the arguments that some
> evaluation rules hurt performance unnecessarily.
> I know that these additional non-default evaluations represent a burden
> to participants, but in the interest of good science, please consider
> running these additional, and important, evaluation conditions.
> Jonathan Fiscus
> NIST
> Phone: (301) 975-3182
> Email: firstname.lastname@example.org
> Snailmail: Nat'l Inst. of Stds. and Tech.
>            100 Bureau Dr. Stop 8940
>            Gaithersburg, MD 20899-8940