cognitive-services-speech-sdk: Sample bookmark listener from tts Azure documentation not working
Hi,
In a Java application, I am trying to use bookmarks to get audio offsets during text-to-speech synthesis, but even the sample code from the TTS documentation gives wrong results.
Any idea whether the problem is in my code, or whether some limitation applies?
Here is the code:
// Requires: import com.microsoft.cognitiveservices.speech.*;
private void test() {
    String speechSubscriptionKey = "38blabla";
    String serviceRegion = "westeurope";
    SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
    assert (config != null);
    config.setSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Audio24Khz96KBitRateMonoMp3);
    String ssml = "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">\r\n"
            + " <voice name=\"en-US-AriaNeural\">\r\n"
            + " We are selling <bookmark mark='flower_1'/>roses and <bookmark mark='flower_2'/>daisies.\r\n"
            + " </voice>\r\n"
            + "</speak>\r\n";
    assert (ssml != null);
    // A null audio config means the audio is not played on a device.
    SpeechSynthesizer synth = new SpeechSynthesizer(config, null);
    assert (synth != null);
    synth.BookmarkReached.addEventListener((o, e) -> {
        // The unit of e.getAudioOffset() is the tick (1 tick = 100 nanoseconds);
        // divide by 10,000 to convert to milliseconds.
        System.out.println(
                "Bookmark " + e.getText() + " reached. Audio offset: " + e.getAudioOffset() / 10000 + "ms.");
    });
    // Synthesize the SSML; the bookmark listener is called during synthesis.
    SpeechSynthesisResult result = synth.SpeakSsml(ssml);
    assert (result != null);
}
And here is the result:
Bookmark flower_1 reached. Audio offset: 50ms.
Bookmark flower_2 reached. Audio offset: 50ms.
This is not the expected result: both bookmarks report the same offset, although flower_2 comes later in the sentence.
My configuration for this test: Windows 10 with Java JDK 1.8.0_301.
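To make the tick arithmetic from the listener explicit, here is a minimal standalone sketch that does not call the Speech SDK. The class name BookmarkOffsets, its helper methods, and the raw tick values are hypothetical: the values are back-computed from the 50ms printed above (any raw offset from 500,000 to 509,999 ticks prints as 50ms after integer division), so they are an assumption, not what the service actually returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Speech SDK.
class BookmarkOffsets {
    // 1 tick = 100 ns, so 10,000 ticks = 1 ms.
    static final long TICKS_PER_MILLISECOND = 10_000L;

    // Integer division, same as the listener in the post: truncates to whole milliseconds.
    public static long ticksToMillis(long ticks) {
        return ticks / TICKS_PER_MILLISECOND;
    }

    // True when every offset is strictly greater than the one before it,
    // which is what one would expect for bookmarks placed at distinct points in the text.
    public static boolean strictlyIncreasing(List<Long> offsetsInTicks) {
        for (int i = 1; i < offsetsInTicks.size(); i++) {
            if (offsetsInTicks.get(i) <= offsetsInTicks.get(i - 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed raw values (in ticks) matching the 50ms shown in the post.
        Map<String, Long> reported = new LinkedHashMap<>();
        reported.put("flower_1", 500_000L);
        reported.put("flower_2", 500_000L);

        for (Map.Entry<String, Long> entry : reported.entrySet()) {
            System.out.println(entry.getKey() + ": " + ticksToMillis(entry.getValue()) + " ms");
        }
        System.out.println("Offsets strictly increasing: "
                + strictlyIncreasing(new ArrayList<>(reported.values())));
    }
}
```

With these assumed values the check prints false. Collecting the raw getAudioOffset() values in the listener and running a check like this after synthesis would show whether the duplicated offsets come from the service or from the millisecond truncation.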
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (8 by maintainers)
I just synced with the service team: the ETA for fixing this issue is the end of November. Thanks!
I am also experiencing this bug from .NET.