node-sdk: [text-to-speech] new API doesn't work

Hi, we are trying to migrate our text-to-speech implementation to the new authentication method. We were able to obtain an access token via curl from the authentication API, then tried raw WebSockets, but we didn't receive any data.

So we followed the example here (updated 13 days ago), which still authenticates using the old method (username and password), but we received no data with that either.

Here is our code (just a dummy script):

var TextToSpeechV1 = require('watson-developer-cloud/text-to-speech/v1');

var textToSpeech = new TextToSpeechV1({
//   iam_apikey: '<api_key>',
//   url: '<service_url>'
  username: "<username>",
  password: "<password>"
});

var synthesizeParams = {
    text: 'Hello world',
    accept: 'audio/mp3',
    voice: 'en-US_AllisonVoice'
};

async function f() {
    let promise = new Promise((resolve, reject) => {
        // Open the WebSocket synthesize stream.
        var synthesizeStream = textToSpeech.synthesizeUsingWebSocket(synthesizeParams);

        synthesizeStream.on('message', (message, data) => {
            console.log(data);
        });
        synthesizeStream.on('close', (code, reason) => {
            console.log(code);
            resolve("100");
        });
        synthesizeStream.on('error', (err) => {
            console.log(err);
        });
    });

    let result = await promise; // wait until the promise resolves
    console.log(result);
}
  
f().then(() => {
    console.log("done");
});

Notice that the example uses the old authentication method, while the docs here use the api_key and the url. Neither method works; we can't receive data with either of them.

  • Expected behavior: Receive audio data.
  • Actual behavior: Receiving no data.
  • Node version: 8.5.0
  • SDK version: 3.15.0

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 20 (11 by maintainers)

Most upvoted comments

@mrwnmonm Your original issue is caused by two problems: one that you can fix now, and one that I posted a fix for in #823.

  1. Because you are not piping your stream anywhere, the underlying class needs to know when to start sending data. You aren't receiving any data because the connection is never initialized. You need to add the line synthesizeStream.resume(); after you instantiate the stream (see the sketch after this list). See this note in the example.

  2. Once you do that, the request will be made, but the connection will fail because of a bug in handling the query parameters. The line voice: 'en-US_AllisonVoice' would cause the request to fail. Now that the fix is posted, you should be able to use it as is.
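For reference, here is a minimal sketch of the fix for point 1, reusing the client and parameters from the original script; the resume() call is the only change on the caller's side:

var TextToSpeechV1 = require('watson-developer-cloud/text-to-speech/v1');

var textToSpeech = new TextToSpeechV1({
  username: '<username>',
  password: '<password>'
});

var synthesizeStream = textToSpeech.synthesizeUsingWebSocket({
  text: 'Hello world',
  accept: 'audio/mp3',
  voice: 'en-US_AllisonVoice'
});

// Nothing is piped from this stream, so it starts out paused.
// resume() tells the underlying class to open the connection
// and start sending data.
synthesizeStream.resume();

synthesizeStream.on('message', (message, data) => console.log(data));
synthesizeStream.on('error', (err) => console.log(err));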

@jeffpk62 Looking at the example, that is for synthesize over HTTP rather than WebSockets. When the HTTP method is used without a callback, it returns a Stream for CF instances but returns nothing for IAM instances. This is a known technical limitation of handling IAM tokens in the SDK and will not be solved. Instead, we are going to start returning Promises when no callback is specified, so behavior is consistent between CF and IAM.

To summarize: we should change the example to use a callback function instead of assuming that a stream is being returned.
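Until the SDK returns Promises natively, here is a small sketch of one way to get Promise behavior today by wrapping the callback form; the synthesizeAsync name is ours, not part of the SDK, and it relies on synthesize using the standard (err, result) callback signature shown in the update below:

const { promisify } = require('util');
const fs = require('fs');
var TextToSpeechV1 = require('watson-developer-cloud/text-to-speech/v1');

var textToSpeech = new TextToSpeechV1({
  iam_apikey: '<api_key>',
  url: '<service_url>'
});

// Hypothetical helper: promisify converts the (err, audio) callback
// into a Promise that resolves with the audio buffer.
const synthesizeAsync = promisify(textToSpeech.synthesize.bind(textToSpeech));

synthesizeAsync({ text: 'Hello world', accept: 'audio/mp3', voice: 'en-US_AllisonVoice' })
  .then((audio) => fs.writeFileSync('result-audio.mp3', audio))
  .catch((err) => console.log('failure', err));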

Update: the SDK HTTP version works if you do it this way:

var TextToSpeechV1 = require('watson-developer-cloud/text-to-speech/v1');
var fs = require('fs');

var textToSpeech = new TextToSpeechV1({
  iam_apikey: 'apikey',
  url: 'https://gateway-syd.watsonplatform.net/text-to-speech/api/'
});

var synthesizeParams = {
  text: 'Hello world',
  accept: 'audio/mp3',
  voice: 'en-US_AllisonVoice'
};

textToSpeech.synthesize(synthesizeParams, function (err, audio) {
  if (err) {
    console.log('failure');
    return;
  }

  fs.writeFileSync('result-audio.mp3', audio);
  console.log('success');
});

but the code in the docs here produces the error Cannot read property 'on' of undefined.
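For context, the docs snippet chained directly onto the return value of synthesize(), roughly like the sketch below; on an IAM instance the method returns undefined when no callback is passed, so the .on() call throws:

// Approximate reconstruction of the docs-style pattern. It works for
// CF instances, where synthesize() returns a stream, but throws
// "Cannot read property 'on' of undefined" for IAM instances.
textToSpeech.synthesize(synthesizeParams)
  .on('error', (err) => console.log(err))
  .pipe(fs.createWriteStream('result-audio.mp3'));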

and the synthesizeUsingWebSocket method still doesn't work.

Thanks for fixing the problem. But the example code in the official docs here https://console.bluemix.net/apidocs/text-to-speech?language=node#synthesize-audio is wrong. Please update it. This issue cost me several days to track down the right place and the right answer. Thank you.

@dpopp07 Thanks, it works perfectly now. 👍 🙏 We didn't go with piping the data because, when requesting the timings, the returned data can be either the timings object or the audio buffer. So we went with synthesizeStream.resume();.
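For completeness, a sketch of the message handler this describes; the Buffer.isBuffer check is our assumption about how to tell the two payload types apart on the same connection:

const fs = require('fs');
const audioChunks = [];

// synthesizeStream created via synthesizeUsingWebSocket() as above,
// with resume() called so the connection actually opens.
synthesizeStream.on('message', (message, data) => {
  if (Buffer.isBuffer(data)) {
    // Binary frame: a chunk of synthesized audio.
    audioChunks.push(data);
  } else {
    // Non-binary frame: the word-timings object.
    console.log('timings:', data);
  }
});

synthesizeStream.on('close', () => {
  fs.writeFileSync('result-audio.mp3', Buffer.concat(audioChunks));
});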