To: tdt-distrib@unagi.cis.upenn.edu
From: David Graff <graff@unagi.cis.upenn.edu>
Subject: [from James Allan:] TDT, annotation query
Date: Mon, 24 Jun 2002 11:04:56 -0400
------- Forwarded Message
From: James Allan <allan@cs.umass.edu>
To: tdt-distrib@ldc.upenn.edu
Subject: TDT, annotation query
Date: Mon, 24 Jun 2002 10:41:56 -0400
TDTers,
Regarding some of the discussion last week.... The current LDC plans
are to tag stories as YES or NO with respect to any topic. A YES
means that the story is substantially about a topic, a NO means that
it is not. NO stories may contain passing mentions of a topic.
In the past we have had BRIEF tagging, which meant that less than 10%
of the story was on the topic. That 10% could have been substantive
(in which case it would be a YES if it were retagged) or passing (in
which case it would be a NO if it were retagged).
Note that this means you may want to be careful how you handle BRIEF
stories in your training, since their equivalents could be either NO
or YES in the evaluation set.
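
One cautious way to act on this is to keep BRIEF stories out of both the positive and the negative training examples. The following is only a minimal sketch: the function names and the (story_id, tag) input format are hypothetical and are not part of any LDC or TDT tooling.

```python
from typing import Iterable, List, Optional, Tuple


def training_label(tag: str) -> Optional[bool]:
    """Map an annotation tag to a training label.

    YES -> True, NO -> False, BRIEF -> None (ambiguous: a retagged BRIEF
    story could come out as either YES or NO).
    """
    tag = tag.strip().upper()
    if tag == "YES":
        return True
    if tag == "NO":
        return False
    if tag == "BRIEF":
        return None
    raise ValueError(f"unknown tag: {tag!r}")


def build_training_set(
    stories: Iterable[Tuple[str, str]]
) -> List[Tuple[str, bool]]:
    """Keep only stories with an unambiguous label; BRIEF stories are skipped."""
    labelled = []
    for story_id, tag in stories:
        label = training_label(tag)
        if label is not None:
            labelled.append((story_id, label))
    return labelled
```

Dropping BRIEF stories is just one choice; down-weighting them instead would also respect the ambiguity.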
Now on to the question... Stephanie at the LDC has cautiously
indicated to me that they could annotate passing mentions with a tag
like PASSING. This would be under these conditions: (1) documents
would be annotated PASSING if a passing mention is detected during other
processing; (2) there would be *no* quality assurance on PASSING
mentions; and (3) there would be *no* attempt to guarantee that all
passing mentions were found and annotated. Finally, this is a
proposed possibility, not a guarantee that it can be done.
That means that we would have some passing mentions annotated, but
would not know whether they were all found.
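
In practice this means a PASSING tag would be positive evidence only: its absence on a NO story would say nothing. A minimal sketch of that reading follows, assuming (this is not stated by the LDC) that PASSING stories would still count as off-topic, like NO; the function names are hypothetical.

```python
from typing import Optional


def is_on_topic(tag: str) -> bool:
    """Assumption: only YES is on-topic; NO and PASSING are both off-topic."""
    tag = tag.strip().upper()
    if tag not in {"YES", "NO", "PASSING"}:
        raise ValueError(f"unknown tag: {tag!r}")
    return tag == "YES"


def has_passing_mention(tag: str) -> Optional[bool]:
    """Return True, False, or None (unknown) for a story/topic tag.

    PASSING -> True  (a passing mention was noticed and annotated)
    YES     -> False (the story is substantially about the topic)
    NO      -> None  (coverage is not guaranteed, so a passing mention
                      may exist without having been annotated)
    """
    tag = tag.strip().upper()
    if tag == "PASSING":
        return True
    if tag == "YES":
        return False
    if tag == "NO":
        return None
    raise ValueError(f"unknown tag: {tag!r}")
```

Returning None rather than False for NO keeps the incompleteness of the annotation visible to any code that consumes it.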
My thinking is that it would be a shame to throw away the information
that a story had a passing mention if one is found, so I am in favor
of encouraging the LDC to investigate doing this. I am assuming that
it would have at most a very small impact on their getting started;
if it looks like a huge impact, then they should not do it.
Any comments, opinions, support, disagreement, ... ?
-- james
------- End of Forwarded Message
Last updated Mon Jun 24 18:19:30 2002