Looking for an expert to translate the paper "Natural Language Processing", part 3

Source: Baidu Zhidao  Editor: UC Zhidao  Time: 2024/09/23 18:28:12
Another example of the same kind of phenomenon occurs in the plurals of English nouns. Consider the words cat, cats, dog and dogs. A native English speaker explaining how plurals are formed is likely to say something like "you add an s sound to the end." Careful listening will show that cats is indeed pronounced with an s sound, but dogs is not: it ends with a z sound. Yet, as with my previous example, English speakers don't normally notice this difference. Again, it isn't because they can't distinguish between s and z, since Sue and zoo, or hiss and his, differ only in their s and z sounds.
The conclusion is that the sounds that native speakers 'hear' are not the sounds they make. This is important in both speech synthesis and speech recognition. When generating speech from written text, a synthesizer must not turn every n or s into an n or s sound, but must use more complex rules which mimic the native speaker. Similarly a speech recognition system must

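The plural forms of English nouns offer another example of this kind of phenomenon. Consider the words cat, cats, dog and dogs. A native English speaker explaining how plurals are formed is likely to say something like "you just add an s sound to the end of the word." Careful listening will show that the plural of cat is indeed pronounced with an s sound, but the plural of dog is not: it ends with a z sound. Yet, as with my previous example, native English speakers do not normally notice this difference. Again, this is not because they cannot distinguish s from z, since Sue and zoo, or hiss and his, differ only in their s and z sounds (and speakers tell those apart perfectly well).
The conclusion is that the sounds native speakers 'hear' are not the sounds they actually make. This matters for both speech synthesis and speech recognition. When generating speech from written text, a synthesizer must not turn every n or s into an n or s sound, but must use more complex rules that mimic how people speak. Similarly, a speech recognition system must be able to tell rum from run, yet must not treat imput and input as two different words.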
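The plural rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names FINAL_SOUND, VOICELESS and plural_ending are made up for this example, the word table is hand-written rather than taken from a real pronunciation dictionary, and nouns that end in a hissing sound (bus, dish) are left out because the passage does not discuss them.

```python
# Illustrative only: a tiny hand-written table, not a real pronunciation lexicon.
# Each noun maps to a rough label for its final sound (hypothetical labels).
FINAL_SOUND = {"cat": "t", "dog": "g", "cup": "p", "bed": "d", "bee": "ee"}

# Final sounds made without voicing. After these, the written plural -s is
# pronounced "s"; after the other sounds in the table it is pronounced "z".
# Nouns ending in a hissing sound (bus, dish) are deliberately not covered.
VOICELESS = {"p", "t", "k", "f", "th"}


def plural_ending(noun: str) -> str:
    """Return the sound ("s" or "z") that the written plural -s stands for."""
    final = FINAL_SOUND[noun]
    return "s" if final in VOICELESS else "z"


for noun in ("cat", "dog"):
    print(noun + "s", "ends in the sound", plural_ending(noun))
```

A synthesizer would need something along these lines so that the single letter s yields two different sounds. A recognizer would need the reverse mapping, so that both sounds are read as the same plural ending.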
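A more awkward fact is that this behaviour extends even across word boundaries. Try saying the following sentences quickly, in an 'informal' style: When playing football, watch the referee. When talking about other people, watch who's listening. When catching a hard ball, wear gloves.
(At least if you speak the same dialect as I do,) you will find that you say whem, when and wheng respectively. This shows that a speech synthesis system cannot produce natural-sounding speech purely from pre-recorded words, because the pronunciation of a word depends on the sounds of the words before and after it.
A speech recognition system must handle all three pronunciations of when while still distinguishing rum, run and rung.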
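As a rough sketch of why pre-recorded words are not enough, the snippet below picks a casual pronunciation for a word ending in n based on the first sound of the following word. Everything here is an assumption made for illustration: the function name casual_form, the two sound sets, and the respellings with m and ng. The rule itself (n becomes m before a lip sound, and ng before a k or g sound) is the usual description of this effect and matches the rum, run, rung contrast mentioned above.

```python
# Hypothetical sound classes, using rough letter labels for sounds.
LABIAL = {"p", "b", "m"}  # made with the lips
VELAR = {"k", "g"}        # made with the back of the tongue


def casual_form(word: str, next_sound: str) -> str:
    """Respell a word ending in n as it tends to sound in fast, informal speech.

    next_sound is the first sound (not the first letter) of the following word.
    """
    if not word.endswith("n"):
        return word
    if next_sound in LABIAL:
        return word[:-1] + "m"
    if next_sound in VELAR:
        return word[:-1] + "ng"
    return word


# First sounds of playing, talking, catching (catching starts with a k sound).
for following, sound in (("playing", "p"), ("talking", "t"), ("catching", "k")):
    print(casual_form("when", sound), following)
```

The point of the design is that the output depends on a neighbouring word, so no single stored recording of when could cover all three cases.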
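Even this example does not capture the full complexity. Speakers with American accents pronounce the t and d sounds in write, writer, ride and rider differently from speakers with British accents. If we ignore the fine phonetic detail, these words could roughly be written, as they sound, as write, wrider, ride, rider. In other words, ride and write remain clearly distinguishable, but rider and writer end up pronounced identically. Yet speakers of these dialects do not 'hear' this at all. In the sentence I'm a writer and I write books, the actual pronunciation sounds like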