Due Tuesday, September 26
Each file represents the transcription for one utterance. The following is an example of a portion of one utterance:
** wo3 hai2_shi4 xiao4 , xiao4_de0 zi4_ji3 dou1 you5_die3_r2 bu4_hao3_yi4_si0_le0 , kan4 ta1 hen3 you1_xin1_chong1_chong1_de0 muo2_yang4 , you4 ren3_bu2_zhu4 xiao4
si 1.824
wo3
w 1.937583
3_o 2.0480001
hai2_shi4
h 2.18325
2_I 2.2945831
S 2.3939171
4_% 2.4460831
xiao4
x 2.576833
y 2.6087501
4_W 2.6954169
,
} 3.6730001
The first line is the pinyin transliteration of the utterance.
Subsequent lines give a word-by-word phonetic transcription. Each word's entry begins with a line giving the word in pinyin transliteration. Following this are the word's phonetic segments, one per line (using an ASCII-based phonetic transcription system), each with its end time in seconds.
For example, the first word is wo3 (`I'), consisting of two segments "w" and "o". The "w" ends 1.94 seconds into the utterance, and the "o" ends 2.05 seconds into the utterance. The prefix "3_" on the "o" indicates that it carries the third (low) tone.
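To make the layout concrete, here is a minimal Python sketch of one way to read such a file. The names parse_utterance and _is_float are hypothetical helpers, not part of any provided code. The sketch assumes that a segment line is exactly a label followed by a number, that any other line starts a new word, and that segments appearing before the first word (such as "si" in the example) can be grouped under a placeholder word of None.

```python
def _is_float(token):
    """Return True if token parses as a floating-point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_utterance(text):
    """Split one transcription file into its header and a word list.

    Returns (header, words), where words is a list of
    (word, [(segment_label, end_time), ...]) pairs.
    Assumption: a line of the form "<label> <number>" is a segment;
    every other line after the first starts a new word.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header, body = lines[0], lines[1:]
    words = []
    for ln in body:
        parts = ln.split()
        if len(parts) == 2 and _is_float(parts[1]):
            if not words:
                # Segment before any word line: keep it under None.
                words.append((None, []))
            words[-1][1].append((parts[0], float(parts[1])))
        else:
            words.append((ln, []))
    return header, words


# The example from this handout, used as test input.
SAMPLE = """\
** wo3 hai2_shi4 xiao4 , xiao4_de0 zi4_ji3 dou1 you5_die3_r2 bu4_hao3_yi4_si0_le0 , kan4 ta1 hen3 you1_xin1_chong1_chong1_de0 muo2_yang4 , you4 ren3_bu2_zhu4 xiao4
si 1.824
wo3
w 1.937583
3_o 2.0480001
hai2_shi4
h 2.18325
2_I 2.2945831
S 2.3939171
4_% 2.4460831
xiao4
x 2.576833
y 2.6087501
4_W 2.6954169
,
} 3.6730001
"""

header, words = parse_utterance(SAMPLE)
for word, segments in words:
    print(word, segments)
```

Since only end times are given, a segment's duration would be its end time minus the end time of the preceding segment.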
Here are some of the transcription conventions you'll need to know:
2'_u with a "'" represents a prominent second tone "u", and
2"_u with a '"' represents an even more prominent second tone "u".
The upshot is that you can tell a vowel easily by the fact that it has a tone mark.
] breath group with some perceivable pause or lengthening.
} full pause.
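The conventions above can be checked mechanically. The following Python sketch is one possible way to do it; the names TONE_RE, split_tone, is_vowel, and PAUSE_MARKS are hypothetical. It assumes a tone prefix is always a single digit, optionally followed by ' or ", then an underscore, and it treats any segment with such a prefix as a vowel.

```python
import re

# Assumed shape of a tone-marked label: digit, optional ' or ", underscore, segment.
TONE_RE = re.compile(r"""^(\d)(['"]?)_(.+)$""")

# Pause symbols listed in the conventions.
PAUSE_MARKS = {"]": "breath group", "}": "full pause"}


def split_tone(label):
    """Return (tone, prominence, segment) for a tone-marked label, else None.

    prominence is 0 for a plain tone, 1 for ' and 2 for ".
    """
    m = TONE_RE.match(label)
    if m is None:
        return None
    tone, mark, segment = m.groups()
    prominence = {"": 0, "'": 1, '"': 2}[mark]
    return int(tone), prominence, segment


def is_vowel(label):
    """A segment counts as a vowel exactly when it carries a tone mark."""
    return split_tone(label) is not None


for label in ["w", "3_o", "2'_u", '2"_u', "}"]:
    print(label, split_tone(label), is_vowel(label), PAUSE_MARKS.get(label))
```

Labels without a tone prefix, such as "w" or the pause symbols, come back as None and so are not counted as vowels.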