RT-04 Transcription
The STT DevTest and Evaluation transcription data sets aim to produce verbatim (word-for-word) transcripts of broadcast news and conversational telephone speech. Annotators follow detailed careful transcription guidelines.
See also the RT-02 and RT-03 transcription pages.
NOTE: Although the careful transcription guidelines vary only slightly from year to year, it is important to use only the current year's transcription specification. For example, the RT-03 guidelines call for one overlapping-speech format in broadcast news, whereas the RT-04 guidelines introduce a different convention altogether. For 2004, please refer to the RT-04 Careful Transcription Specification, version 3.0 (noted and linked above).