Segmenting Reports (sr)
Q) How do I determine whether a segment should be considered a story,
especially in the case of introductory statements?
A) The rule of thumb is that a segment with two or more independent
declarative clauses on a single topic is treated as a story. If you find a
situation that is confusing, please send a message describing it to
lac-adm.
Please include the DOCID (Document ID) and a sample of the story with
your questions.
Q) When a section is missing or only partially transcribed in a transcript
but exists in the speech file, how is it indicated?
A) These cases are indicated with an <su for a non-transcribed report and
an <sn..> for a non-report.
NOTE - If two or more independent declarative clauses exist, but the rest
has not been transcribed, the section should be marked with an <sr.
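The marking rules above can be summarized as a small decision function. This is only an illustrative sketch: the function name, its parameters, and the plain strings "sr", "su", and "sn" (standing in for the <sr, <su, and <sn tags) are assumptions made for the example. It also assumes that the two-clause threshold from the rule of thumb at the start of this section decides when a partially transcribed report still counts as a story.

```python
# Hypothetical helper for illustration; not part of any annotation tool.
MIN_STORY_CLAUSES = 2  # assumed threshold, taken from the rule of thumb


def choose_segment_tag(is_report: bool, transcribed_clauses: int) -> str:
    """Return "sr", "su", or "sn" for a segment of the speech file.

    is_report: whether the audio for this segment is a report at all.
    transcribed_clauses: number of independent declarative clauses on a
        single topic that actually appear in the transcript.
    """
    if not is_report:
        # Non-report material is marked as a non-report.
        return "sn"
    if transcribed_clauses >= MIN_STORY_CLAUSES:
        # Enough was transcribed to count as a story, even if the rest
        # of the report is missing from the transcript.
        return "sr"
    # A report whose transcript is missing or too short to be a story.
    return "su"
```

For example, a report with three transcribed clauses followed by untranscribed speech would come out as "sr" under these assumptions, while a report with no transcribed text would come out as "su".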
Q) Is music considered part of a report or not?
A) If a period of music is part of the report, i.e. contextually important
within the body of the report, include it within the report boundaries. If
music comes at the end of a report, however, indicate the start of a
"significant" period of music with an <sn. Significant periods are currently
defined as those over 10 seconds in length. Periods of less than 10 seconds
between speech can be left at the end of the preceding speech segment.
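As a rough sketch, the music rule can be written as a single check against the 10-second threshold. The function name, the parameter names, and the returned action strings are hypothetical. The guideline does not say how to treat music lasting exactly 10 seconds; the sketch assumes it is not significant, since only periods over 10 seconds are described as significant.

```python
# Hypothetical helper for illustration; not part of any annotation tool.
SIGNIFICANT_MUSIC_SECONDS = 10.0  # threshold stated in the guideline


def classify_music(duration_seconds: float, part_of_report: bool) -> str:
    """Decide how a period of music should be handled.

    part_of_report: True when the music is contextually important within
        the body of the report (an annotator judgment, not computed here).
    """
    if part_of_report:
        # Music inside the report stays within the report boundaries.
        return "include_in_report"
    if duration_seconds > SIGNIFICANT_MUSIC_SECONDS:
        # A significant period of music starts a new <sn segment.
        return "start_sn"
    # Short music (assumed to include exactly 10 seconds) stays attached
    # to the end of the preceding speech segment.
    return "attach_to_preceding_segment"
```

Under these assumptions, 25 seconds of music after a report yields "start_sn", and 4 seconds yields "attach_to_preceding_segment".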
Q) What should I do when a reporter gives a lead-in statement that does not
correspond to the actual report delivered by the remote reporter?
A) In situations like this, treat the introductory statements and the
delivered report as two different topics, and therefore separate them into
two different sections.
Q) What should I do if one story seems relevant to more than one of the
defined topics?
A) You should label it as relevant to ALL of the appropriate topics. It is
OK (and in fact sometimes necessary) to mark one story with multiple topics.
Speaker Changes
Q) The transcripts do not indicate the appearance of foreign language
speakers. Do these need to be indicated?
A) No.
Q) If there is a speaker change that is not indicated in the transcript, is
it necessary to insert a <turn> tag?
A) No, that is not necessary.
Q) Should I be concerned if I come across repeated versions of the same
article?
A) No. This may happen frequently. Articles may have slight differences
that you may not initially notice. Label the article in the same fashion as
the earlier version(s) that were encountered, and continue.
Q) If an article is directly related to a stated topic, but has been
separated in such a way that without the earlier section there is no obvious
connection to the topic, is it still considered to be "Yes"? The earlier
section is not missing, which would be a condition for rejecting the
article; it simply precedes it.
A) Reject the article as "missing part one".
Q) What are the conditions under which articles can be rejected?
A) a) When the text is illegible or malformatted (e.g. problem fonts).
b) When the text is a list, e.g. sports scores, weather temperatures, stock
indices, etc.
c) When there are multiple stories within the text provided.
d) When the story provided is incomplete, i.e. is missing the first part.
e) When the article is a restaurant review, a recipe, or similar; these are
not news.
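The rejection conditions above could be recorded in a labeling script roughly as follows. This is a minimal sketch: the enum, its member names, and the boolean flags are hypothetical, and each flag stands for a judgment the annotator makes by reading the article, not something the code detects.

```python
from enum import Enum
from typing import List


class RejectReason(Enum):
    """Hypothetical names for conditions a) through e) above."""
    ILLEGIBLE = "illegible or malformatted text"
    LIST = "list such as scores, temperatures, or stock indices"
    MULTIPLE_STORIES = "multiple stories within the text"
    INCOMPLETE = "incomplete story, e.g. missing part one"
    NOT_NEWS = "not news, e.g. restaurant review or recipe"


def rejection_reasons(
    illegible: bool = False,
    is_list: bool = False,
    multiple_stories: bool = False,
    incomplete: bool = False,
    not_news: bool = False,
) -> List[RejectReason]:
    """Collect every rejection condition the annotator flagged."""
    checks = [
        (illegible, RejectReason.ILLEGIBLE),
        (is_list, RejectReason.LIST),
        (multiple_stories, RejectReason.MULTIPLE_STORIES),
        (incomplete, RejectReason.INCOMPLETE),
        (not_news, RejectReason.NOT_NEWS),
    ]
    return [reason for flagged, reason in checks if flagged]


def should_reject(reasons: List[RejectReason]) -> bool:
    """An article is rejected when at least one condition applies."""
    return len(reasons) > 0
```

Keeping the reasons as a list rather than a single value allows one article to be recorded with more than one rejection condition.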
Q) How are articles that are parodies of news events labeled?
A) This is a tricky situation, but it appears that most articles of this
type will warrant a "brief" notation.
This first situation was labeled "brief" as appropriate to the Lewinsky
topic.
The next situation was determined to be appropriate as "brief" to the Iraq
situation, but the reference to Lewinsky was not seen as significant enough
to be relevant.