Segmentation of Prosody Data

The interface for doing prosody transcription

Running the interface:

At the command prompt, run

tel-trans prosody train

Each file is in one of four categories: 'ready', 'marking', 'marked', or 'done'.

Select the file you wish to segment and enter it at the prompt.



Structure of the files

The file structure is rather straightforward. The miscellaneous speech regions (chit-chat, instructions, etc.) are not to be transcribed. In fact, all of the utterances should already be in the files, so all that is needed is the proper alignment of the speech to the utterances. Most of the utterances are spoken two or three times but appear only once in the untouched transcript; a simple cut and paste of the line gives each repetition its own entry.

untouched transcript:
neutral,conversation,November fifth
neutral,conversation,October first
neutral,conversation,September fourth

segmented transcript:
STARTTIME ENDTIME : EMOTION CATEGORY,DISTANCE CONTINUUM,UTTERANCE
88.67 89.63 : neutral,conversation,November fifth
90.97 91.92 : neutral,conversation,November fifth
93.54 94.59 : neutral,conversation,November fifth
95.76 96.89 : neutral,conversation,October first
98.17 99.41 : neutral,conversation,October first
100.81 101.99 : neutral,conversation,September fourth
103.05 104.25 : neutral,conversation,September fourth
(Do not be concerned about empty lines; they are of little consequence.)
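
For anyone who wants to sanity-check a segmented transcript outside the interface, the line format above can be read with a few lines of Python. This is only a sketch and not part of tel-trans: the names Segment, parse_segment and parse_transcript are made up for illustration. It assumes one segment per line, that the STARTTIME/ENDTIME header line is a description rather than part of the file, and that segments are listed in time order without overlapping. Empty lines are skipped, as noted above.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One aligned utterance (hypothetical helper type, not part of tel-trans)."""
    start: float      # start time, as written in the file
    end: float        # end time, as written in the file
    emotion: str      # emotion category, e.g. "neutral"
    distance: str     # distance continuum, e.g. "conversation"
    utterance: str    # the utterance text, e.g. "November fifth"


def parse_segment(line: str) -> Segment:
    """Parse a line of the form 'START END : EMOTION,DISTANCE,UTTERANCE'."""
    times, _, label = line.partition(" : ")
    start_text, end_text = times.split()
    # Split on the first two commas only, in case the utterance contains one.
    emotion, distance, utterance = label.split(",", 2)
    start, end = float(start_text), float(end_text)
    if end <= start:
        raise ValueError(f"end time is not after start time: {line!r}")
    return Segment(start, end, emotion.strip(), distance.strip(), utterance.strip())


def parse_transcript(text: str) -> List[Segment]:
    """Parse a whole segmented transcript, ignoring empty lines."""
    segments = [parse_segment(ln) for ln in text.splitlines() if ln.strip()]
    # Assumption: segments appear in time order and do not overlap.
    for previous, current in zip(segments, segments[1:]):
        if current.start < previous.end:
            raise ValueError(f"segment overlaps the previous one: {current}")
    return segments


if __name__ == "__main__":
    sample = (
        "88.67 89.63 : neutral,conversation,November fifth\n"
        "\n"
        "90.97 91.92 : neutral,conversation,November fifth\n"
    )
    for segment in parse_transcript(sample):
        print(segment)
```

A line that is missing the " : " separator, has the wrong number of fields, or has an end time at or before its start time raises ValueError, which makes misaligned entries easy to spot before a file is marked 'done'.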

Please note: you must be careful to segment the correct region, particularly around a trailing "th".
For instance, examine the waveform below.



The dotted line to the right is necessary despite visual appearances to the contrary. This utterance is "November fifth", and the "th" extends to 94.59. Be aware that this occurs often; try to be as precise as possible without clipping the speaker's utterance.
nmartey@ldc.upenn.edu
Last modified: Fri Aug 24 15:39:30 2001