aws-sdk-js-v3: Having problem with client-transcribe-streaming

Describe the bug I followed the README and built a test script, but I got the following error:

TypeError [ERR_INVALID_ARG_TYPE]: The "chunk" argument must be of type string or an instance of Buffer or Uint8Array. Received an instance of Object

I guess it has something to do with the line:

yield { AudioEvent: { AudioChunk: chunk } };

Could anyone help check what’s wrong with my code?

SDK version number

    "@aws-sdk/client-transcribe-streaming": "^1.0.0-rc.8",
    "aws-sdk": "^2.808.0",

Is the issue in the browser/Node.js/ReactNative? Node.js

Details of the browser/Node.js/ReactNative version v15.4.0

To Reproduce (observed behavior)

const { TranscribeStreamingClient, StartStreamTranscriptionCommand } = require("@aws-sdk/client-transcribe-streaming");
const { Readable } = require("stream");
const fs = require("fs");
const audioSource = fs.createReadStream("./speech.raw", { highWaterMark: 512 });
const writeStream = fs.createWriteStream("./output.txt");
if(!audioSource){
    console.log("fail to open audioSource");
}
const client = new TranscribeStreamingClient();

const audioStream = async function* () {
  for await (const chunk of audioSource) {
     yield { AudioEvent: { AudioChunk: chunk } };
  }
};


const command = new StartStreamTranscriptionCommand({
  // The language code for the input audio. Valid values are en-GB, en-US, es-US, fr-CA, and fr-FR
  LanguageCode: "en-US",
  // The encoding used for the input audio. The only valid value is pcm.
  MediaEncoding: "pcm",
  // The sample rate of the input audio in Hertz. We suggest that you use 8000 Hz for low-quality audio and 16000 Hz for
  // high-quality audio. The sample rate must match the sample rate in the audio file.
  MediaSampleRateHertz: 16000,
  AudioStream: audioStream(),
});

async function go()
{
    try {
      const response = await client.send(command);
      //  await handleResponse(response);
      const transcriptsStream = Readable.from(response.TranscriptResultStream);
      transcriptsStream.pipe(writeStream);

    } catch (e) {
        console.log("e=%j",e);
      if (e.name === "InternalFailureException") {
        /* handle InternalFailureException */
      } else if (e.name === "ConflictException") {
        /* handle ConflictException */
      }
    } finally {
      /* clean resources like input stream */
    }
}

go();
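
A note on the likely cause, offered as a guess rather than a confirmed diagnosis: Readable.from(response.TranscriptResultStream) yields event objects, and a file write stream only accepts strings, Buffers, or Uint8Arrays, which matches the wording of the error. If that is the cause, the yield line is not the problem. The sketch below converts each event to text before writing. transcriptLines, fakeResultStream, and collect are hypothetical helpers that are not part of the SDK, and the fake stream only imitates the event shape so the example runs without AWS credentials.

```javascript
// Hypothetical helper (not part of the SDK): turn TranscriptResultStream
// events into plain text lines that a file write stream can accept.
async function* transcriptLines(resultStream) {
  for await (const event of resultStream) {
    const results = event.TranscriptEvent?.Transcript?.Results ?? [];
    for (const result of results) {
      // Skip partial results so each utterance is emitted only once.
      if (result.IsPartial) continue;
      const text = result.Alternatives?.[0]?.Transcript;
      if (text) yield text + "\n";
    }
  }
}

// Stand-in for response.TranscriptResultStream so the sketch runs offline.
async function* fakeResultStream() {
  yield { TranscriptEvent: { Transcript: { Results: [
    { IsPartial: true, Alternatives: [{ Transcript: "hello" }] },
  ] } } };
  yield { TranscriptEvent: { Transcript: { Results: [
    { IsPartial: false, Alternatives: [{ Transcript: "hello world" }] },
  ] } } };
}

// Small utility to gather everything an async iterable yields.
async function collect(iterable) {
  const out = [];
  for await (const item of iterable) out.push(item);
  return out;
}

collect(transcriptLines(fakeResultStream())).then((lines) => {
  console.log(lines);
});
```

In go(), the same idea would read Readable.from(transcriptLines(response.TranscriptResultStream)).pipe(writeStream); so that only strings reach the file.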

Expected behavior The transcription result is returned.

Screenshots none

Additional context

About this issue

State: closed
Created 4 years ago
Reactions: 1
Comments: 17 (7 by maintainers)

Most upvoted comments

In case it helps, here is my solution, which uses the hints suggested here:

https://github.com/loretoparisi/aws-transcribe-streaming-node-example

loretoparisi on Jun 11, 2021

@VAL-MH97 @Y2KForever I used this code again:

I can also share the sample wav file if it doesn't work.

Tested, thank you for a working example! 😃

Y2KForever on Jun 4, 2021

Changing MediaSampleRateHertz from 16000 to 8000 removes the error for me, but the stream is still empty for some reason.

Y2KForever on Mar 1, 2021
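
Since the comments in the original code say the sample rate must match the audio file, one way to avoid guessing between 8000 and 16000 is to read the rate from the file itself. The sketch below assumes a WAV file with the canonical RIFF layout, where the "fmt " chunk starts at byte 12 and the sample rate is a little-endian 32-bit integer at byte offset 24. wavSampleRate is a hypothetical helper, and it does not apply to headerless files such as speech.raw. The demo header is built in memory only so the example runs without an audio file.

```javascript
// Hypothetical helper: read the sample rate from a canonical RIFF/WAVE header.
function wavSampleRate(header) {
  const tag = (start, end) => header.toString("ascii", start, end);
  if (
    header.length < 28 ||
    tag(0, 4) !== "RIFF" ||
    tag(8, 12) !== "WAVE" ||
    tag(12, 16) !== "fmt "
  ) {
    throw new Error("not a canonical RIFF/WAVE header");
  }
  // Sample rate: little-endian uint32 at byte offset 24.
  return header.readUInt32LE(24);
}

// Build a minimal fake header in memory for demonstration.
const demoHeader = Buffer.alloc(44);
demoHeader.write("RIFF", 0, "ascii");
demoHeader.write("WAVE", 8, "ascii");
demoHeader.write("fmt ", 12, "ascii");
demoHeader.writeUInt32LE(16000, 24);

console.log(wavSampleRate(demoHeader)); // 16000
```

With a real file, the first 44 bytes could be read with fs and passed to the helper, and the result used for MediaSampleRateHertz. Note that when streaming a WAV file as pcm, the header bytes are sent as audio unless they are skipped.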