cognitive-services-speech-sdk: Custom Chinese lexicon is not adopted by SpeakSsmlAsync

Describe the bug In Chinese, a word can be pronounced differently depending on the context, so I created a custom lexicon to correct the pronunciation. The lexicon file is stored in Azure Storage and is publicly accessible, and its URL is embedded correctly in the SSML content. However, when I generate audio with the SpeakSsmlAsync method, there is no change to the audio.

To Reproduce Steps to reproduce the behavior:

  1. Create a SpeechSynthesizer instance with the following configuration: VoiceName = “zh-CN-YunzeNeural”, Language = “zh-CN”.
  2. Call SpeakSsmlAsync using var result = await synthesizer.SpeakSsmlAsync(ssml); with the following SSML content:
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="zh-CN" >
<voice name="zh-CN-YunzeNeural">
<lexicon uri="https://matrixreader.blob.core.windows.net/public/lexicon.xml" />
任我行冷笑道, 剑指小腹,这个小姑娘。姊姊还好吗?
任我行大声道:你们这些人,都是我的手下败将,还不束手就擒!
</voice>
</speak>
  3. Save the audio content to a file.
  4. Listen to the audio: the word “任我行” has a different pronunciation than expected.
  5. I also tested my lexicon file in Speech Studio, where the pronunciation is correct, so the problem could be in the SDK.

Expected behavior The lexicon is adopted by the SDK and the pronunciation is correct.

Version of the Cognitive Services Speech SDK Version 1.27.0

Platform, Operating System, and Programming Language

  • OS: Windows 11 Pro 22H2
  • Hardware: x64
  • Programming language: C#

Additional context

  • There is no error message.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 33 (3 by maintainers)

Most upvoted comments

Hi @AndrewLang, I was on vacation for the past few days. Regarding your point that "if there are words seems not supported or well recognized, then it cause the whole lexicon file not adopted": that is correct. If one word is given a wrong pronunciation, the whole lexicon won’t work.
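
Since a single bad entry causes the whole lexicon to be ignored, one way to find it is to test entries one at a time. The following is a minimal sketch in Python (standard library only) and is not part of the Speech SDK. It assumes the lexicon is a standard PLS document whose entries are lexeme elements in the http://www.w3.org/2005/01/pronunciation-lexicon namespace; the function name split_lexicon is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: the lexicon uses the standard W3C PLS namespace.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Serialize PLS elements without an "ns0:" prefix.
ET.register_namespace("", PLS_NS)


def split_lexicon(xml_text):
    """Return one single-entry lexicon document (as a str) per lexeme.

    Each output keeps the root element's attributes (version, alphabet,
    xml:lang, ...) and contains exactly one lexeme from the input.
    """
    root = ET.fromstring(xml_text)
    lexeme_tag = "{%s}lexeme" % PLS_NS
    documents = []
    for lexeme in root.findall(lexeme_tag):
        # New root with the same tag and attributes, but no children.
        single = ET.Element(root.tag, root.attrib)
        single.append(copy.deepcopy(lexeme))
        documents.append(ET.tostring(single, encoding="unicode"))
    return documents
```

Each returned document could be saved as its own file, uploaded, and referenced from the SSML in place of the full lexicon. An entry whose single-entry file has no effect on the audio is a candidate for the wrong pronunciation. The sketch only isolates entries; it does not validate the phoneme strings themselves.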

Region should not be the problem. I will share my code soon.

I am investigating. Thanks