Frequently Asked Questions

Segmenting Reports  (sr)
Q) How do I determine whether a segment should be considered a story, especially when it consists of introductory statements?
A) The rule of thumb is that if there are two or more independent declarative clauses on a single topic, the segment is treated as a story. If you find a situation that is confusing, please send a message describing it to lac-adm. Please include the DOCID (Document ID) and a sample of the story along with your questions.
 
Q) When a section is missing or only partially transcribed in a transcript but exists in the speech file, how is it indicated?
A) These cases are marked with an <su for a non-transcribed report and an <sn..> for a non-report. NOTE: If more than two independent declarative clauses exist but the rest has not been transcribed, mark the section with an <sr.

Q) Is music considered part of a report, or not?
A) If a period of music is part of the report, i.e., contextually important within the body of the report, include it within the report boundaries. If the music comes at the end of a report, however, mark the start of a "significant" period of music with an <sn. Significant periods are currently defined as those over 10 seconds in length. Music of less than 10 seconds between speech can be left at the end of the preceding speech segment.
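
The music rule above can be read as a small decision procedure. The Python sketch below is only an illustration: the function name, its parameters, and the returned strings are hypothetical and do not belong to any annotation tool. Only the 10-second threshold and the <sn marker come from the answer above. The answer does not say how to treat music of exactly 10 seconds; the sketch assumes it is not significant.

```python
SIGNIFICANT_MUSIC_SECONDS = 10.0  # threshold stated in the answer above


def music_action(duration_seconds: float, part_of_report: bool) -> str:
    """Describe how to treat a stretch of music (hypothetical helper)."""
    if part_of_report:
        # Contextually important music stays inside the report boundaries.
        return "include within report boundaries"
    if duration_seconds > SIGNIFICANT_MUSIC_SECONDS:
        # "Significant" music after a report gets its own non-report marker.
        return "mark start with <sn"
    # Assumption: exactly 10 seconds is treated like the shorter case.
    return "leave on preceding speech segment"
```

For example, `music_action(15, part_of_report=False)` yields the <sn case, while `music_action(4, part_of_report=False)` leaves the music on the preceding segment.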
 

Q) What should I do when a reporter gives a lead-in statement that does not correspond to the actual report delivered by the remote reporter?
A) In situations like this, separate the introductory statements and the delivered report as two different topics, and therefore two different sections.
 

Q) What should I do if one story seems relevant to more than one of the defined topics?
A)  You should label it as relevant to ALL of the appropriate topics. It is o.k. (and in fact, sometimes necessary) to mark one story with multiple topics.
 

Speaker Changes
Q) The transcripts do not indicate the appearance of foreign language speakers. Do these need to be indicated?
 A) No.

Q) If there is a speaker change that is not indicated in the transcript, is it necessary to insert a <turn> tag?
A) No, that is not necessary.
 
 

 
 
 
 
 


 
 
 
Labeling

 

 

Q) Should I be concerned if I come across repeated versions of the same article?
A) No. This may happen frequently. Articles may have slight differences that you do not notice at first. Label the article in the same way as the earlier version(s) you encountered, and continue.

Q) If an article is directly related to a stated topic, but has been separated in such a way that without the earlier section there is no obvious connection to the topic, is it still considered to be "Yes"? The earlier section is not missing, which would be a condition for rejecting the article; it simply precedes it.
A) Reject the article as "missing part one".

Q) What are the conditions under which articles can be rejected?
A) a) When the text is illegible or malformatted (e.g., problem fonts).
   b) When the text is a list, e.g., sports scores, weather temperatures, stock indices, etc.
   c) When there are multiple stories within the text provided.
   d) When the story provided is incomplete, e.g., is missing the first part.
   e) When the article is a restaurant review, a recipe, or similar; these are not news.
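
The rejection conditions above can also be kept as a simple checklist. The following Python sketch is one possible way to record them, assuming a labeler answers each condition as true or false; the `RejectReason` enum and the `rejection_reasons` function are hypothetical names used for illustration and are not part of any labeling tool.

```python
from enum import Enum


class RejectReason(Enum):
    """Rejection conditions (a) through (e) from the answer above."""
    ILLEGIBLE = "text is illegible or malformatted"
    LIST = "text is a list (scores, temperatures, stock indices)"
    MULTIPLE_STORIES = "text contains multiple stories"
    INCOMPLETE = "story is incomplete (e.g., missing the first part)"
    NOT_NEWS = "restaurant review, recipe, or similar; not news"


def rejection_reasons(illegible=False, is_list=False, multiple_stories=False,
                      incomplete=False, not_news=False):
    """Return every condition that applies; an empty list means do not reject."""
    checks = [
        (RejectReason.ILLEGIBLE, illegible),
        (RejectReason.LIST, is_list),
        (RejectReason.MULTIPLE_STORIES, multiple_stories),
        (RejectReason.INCOMPLETE, incomplete),
        (RejectReason.NOT_NEWS, not_news),
    ]
    return [reason for reason, applies in checks if applies]
```

An article is rejected when the returned list is non-empty; for instance, `rejection_reasons(is_list=True)` returns `[RejectReason.LIST]`.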

Q) How are articles that are parodies of news events labeled?  
A) This is a tricky situation, but it appears that most articles of this type will warrant a "brief" notation. In one case, the parody was labeled "brief" for the Lewinsky topic. In another, the parody was judged "brief" for the Iraq topic, but its reference to Lewinsky was not considered significant enough to make it relevant to the Lewinsky topic.