
Chinese mistakes in commercial speech synthesizers

Commercial unit-selection voices may sound pleasant, but they do make mistakes. If you use one for language learning, make sure it is not your only source. For example, Gradint can alternate between different synthesizers on different repeats (it also has a syllable-based voice, which should at least be predictable).
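
The idea of alternating synthesizers across repeats can be sketched in a few lines of Python. This is only an illustration of round-robin alternation, not Gradint's actual code: the function name `assign_voices` and the voice labels are hypothetical, and the calls to real synthesizers are left out.

```python
from itertools import cycle, islice

def assign_voices(voices, repeats):
    """Return the voice to use for each repeat, cycling round-robin
    through the available voices so that no single voice is the
    learner's only source."""
    if not voices:
        raise ValueError("need at least one voice")
    return list(islice(cycle(voices), repeats))

# Hypothetical voice labels; a real setup would map these to synthesizer calls.
voices = ["unit-selection voice A", "unit-selection voice B", "syllable-based voice"]
for n, voice in enumerate(assign_voices(voices, 5), start=1):
    print(f"repeat {n}: {voice}")
```

With three voices and five repeats, the fourth and fifth repeats go back to the first and second voice, so a mistake peculiar to one voice is heard on only some of the repeats.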

To demonstrate the trouble with unit-selection voices for language learning, below are some example Chinese mistakes that I found, usually after just a few minutes of experimenting with each voice.

| Synthesizer | Input | Problem |
| --- | --- | --- |
| Google Translate (2011-05, using SVOX Yun which is also used by Android) | 继续学院 | The 学 is only half-pronounced. It seems like they had a recording of a whole 学 but some program played only half of it. You can't really hear the '-ue' of the 'xue'. |
| | 糖尿病 | 'n' of 尿 unclear |
| | 深省 | Google correctly says this is "shēn xǐng", but its voice incorrectly says "shēn shěng" (the voice must be using a smaller dictionary than the transcriber) |
| | | somewhat unclear when spoken in isolation |
| Beijing Infoquick SinoVoice (2011-05) | 用出来 | The main word 用 could be clearer; at least 来 (and possibly 出) should be neutral tone (轻声) but isn't |
| iFlyTek InterPhonic / Bider SpeechPlus (free trial no longer available) | bao3zheng4, bian4ming2, fou3ren4, jia3ru2, mei3zhou1, mu4du3, many others (via CSSML pinyin markup) | Incorrect syllables spoken (I'd have thought pinyin gives better control but it doesn't) |
| Neospeech Hui (2011-05) | 糖尿病 | 'n' of 尿 unclear |
| | 奉公守法 | first syllable unclear |
| ScanSoft (Nuance) MeiLing (also used by Nokia) | 深省 | 省 spoken as shěng instead of xǐng; no way to add a dictionary entry to override it |
| | 地, 行 and many other ambiguous hanzi | Engine often gets the wrong reading (e.g. dì instead of de in many adverbs, xíng instead of háng in 十四行诗), no way to override (except sometimes by writing wrong hanzi) |
| | 邮编 | 编 pitch too low for the context |
| | 切合实际,对 | 际 in 切合实际 by itself is correctly pronounced jì, but when followed by ",对" the 际 seems to be pronounced more like jiè (although not so when the hanzi after the comma is different, or when there is no pause before the 对) |
| | 絶 (variant of 絕/绝), 説 (variant of 說/说) and others | completely skipped, with no indication that there is a missing character in the text |
| | 用户界面 | 界 sounds too much like 3rd tone instead of 4th tone |
| | 齁声 | Pitch falls from B to E-flat. Some drop in pitch of tone-1 at the end of a phrase is acceptable, but an augmented fifth? |
| | 人文学 | Faults on 文 (but not in 人文 by itself). Sounds better if incorrectly written as 人闻学. |
| | 撞击 | 击 sounds like a truncated neutral tone instead of tone 1 |
| | 电脑及资讯科技 | something like half a 个 is inserted before the 及 |
| | 劫难 | sounds more like jián'àn than jiénàn (it must be a coded exception to 难's usual nán pronunciation but it seems the syllable boundary is wrong) |
| Microsoft Lili (couldn't test but heard a demo) | | spoken as an unclear cǎi instead of cái (the old "MS Simplified Chinese" voice actually gets this one right but gets 央行 wrong) |
| Neospeech Lily (no longer sold separately but used by NextSpeak and ImTranslator 2011-05 without the lexicon access) | 糖尿病 | 'n' of 尿 very unclear |
| | yong4chu5lai5, zhuan3lai2zhuan3qu4 (via pinyin lexicon) | Incorrectly read as yong4chu1lai5, zhuai3lai2... but OK if input as hanzi 用出来, 转来转去 |
| | chan3chu2 or 铲除 | says chu4 instead of chu2 |
| | shan4yong4 or 善用 | shen4 instead of shan4 in pinyin; "n"s clipped in hanzi |
| | li4bi4 or 利弊 | sounds like bi1bi4 |
| | you2bian1 or 邮编 | bian1 pitch too low for the context |
| | jia1de5fu1 | spoken as jiādìfū (maybe it's being treated as 加的夫, which might be right but a pinyin override shouldn't try to guess what the pinyin should have been; what if it came from 家的夫?) |
| Loquendo Lisheng (2011) | mu4du3, mu4du4 | Both words seem to end in du4 (the du3 sounds OK if it's the last thing in the sentence) |
| Apple Ting-ting (in OS 10.7) | mu4du3, mu4du4 | Both "du" sounds seem incomplete |
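
The examples above mix two pinyin notations: tone numbers (mu4du3, with 5 for the neutral tone) and tone marks (shēn xǐng). For readers comparing the two, here is a small Python sketch that converts tone-numbered pinyin to tone marks using the usual placement convention (mark a or e if present, the o of ou, otherwise the last vowel). The function names `mark_syllable` and `mark_word` are my own, `v` is assumed to stand in for ü, and none of this is taken from the synthesizers listed.

```python
import re

# Tone-marked forms of each vowel, in order of tones 1 to 4.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}

def mark_syllable(syllable):
    """Convert one tone-numbered syllable (e.g. 'xing3') to tone marks."""
    syllable = syllable.lower().replace("v", "ü")
    if not syllable or syllable[-1] not in "12345":
        return syllable  # no tone number: return unchanged
    base, tone = syllable[:-1], int(syllable[-1])
    if tone == 5:
        return base  # neutral tone carries no mark
    # Choose the vowel that takes the mark.
    if "a" in base:
        pos = base.index("a")
    elif "e" in base:
        pos = base.index("e")
    elif "ou" in base:
        pos = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base  # nothing to mark
        pos = vowels[-1]
    return base[:pos] + TONE_MARKS[base[pos]][tone - 1] + base[pos + 1:]

def mark_word(word):
    """Convert a run of tone-numbered syllables, e.g. 'mu4du3'."""
    return "".join(mark_syllable(s) for s in re.findall(r"[a-zA-Zü]+[1-5]", word))

print(mark_word("shen1xing3"))   # shēnxǐng
print(mark_word("mu4du3"))       # mùdǔ
print(mark_word("jia1de5fu1"))   # jiādefū
```

The last line shows what the jia1de5fu1 input asks for: the middle syllable is a neutral-tone de, not the dì that Neospeech Lily actually spoke.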


All material © Silas S. Brown unless otherwise stated.