To: tdt-distrib@unagi.cis.upenn.edu
From: Jon_Yamron@Dragonsys.com
Subject: Re: Clarification of tracking
Date: Thu, 5 Nov 1998 18:02:38 -0500

Hmm...my understanding of the discussion flying back and forth is that
using prior data in an unsupervised way to beef up the topic model DOES
"conform to one of the conventional baseline runs". This is an important
point, given that we may be under time constraints to finish everything on
time. Are you saying that a site that does this kind of adaptation must
also do a run without it? Or just that it would be nice, so that we can
see the effect?

- Jon





George Doddington <doddington@nist.gov> on 11/05/98 05:29:43 PM

Please respond to doddington@nist.gov

To: Jon Yamron/Dragon Systems USA
cc: tdt-distrib@unagi.cis.upenn.edu
Subject: Re: Clarification of tracking




Jon_Yamron@DRAGONSYS.COM wrote:

> OK, I can use anything I find automatically, in the specified "non-topic"
> stories and in the January-April data, either to train my topic model or my
> background model. I would point out that this kind of messes up the idea
> of demonstrating the effect of having different amounts of training data
> (e.g., for many events, we may see no difference in performance for Nt=1,
> 2, 4, 8, or 16, because we are able to automatically extract many useful
> examples from prior data, while for other events the differences will be
> large). But if you don't care, I don't care.

Oops. Good point, Jon. Well, we certainly care about characterizing
performance in terms of Nt. But, on the other hand, we don't want to
squelch creative research ideas. So I would urge any site that would
like to try this (risky) idea to calibrate their own performance with
an experiment that conforms to one of the conventional baseline runs.
Good luck!
--
George Doddington at NIST: doddington@nist.gov or 301/975-3261


Last updated Fri Nov 6 15:29:23 1998