elevenlabs-python: audio key error in text to stream

>>> def text_stream():
...     yield "Hi there, I'm Eleven "
...     yield "I'm a text to speech API "
...
>>> audio_stream = generate(
...     text=text_stream(),
...     voice="Nicole",
...     model="eleven_monolingual_v1",
...     stream=True
... )
>>>
>>> stream(audio_stream)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bbekdemir/Developer/sandbox/venv/lib/python3.11/site-packages/elevenlabs/utils.py", line 74, in stream
    for chunk in audio_stream:
  File "/Users/bbekdemir/Developer/sandbox/venv/lib/python3.11/site-packages/elevenlabs/api/tts.py", line 99, in generate_stream_input
    if data["audio"]:
       ~~~~^^^^^^^^^
KeyError: 'audio'

About this issue

  • State: closed
  • Created 8 months ago
  • Reactions: 8
  • Comments: 26

Most upvoted comments

@nitishymtpl I just had a close look and found that the error is coming from the server. The output is {'message': 'Unusual activity detected. Free Tier usage disabled. If you are using proxy/VPN you might need to purchase a Paid Plan to not trigger our abuse detectors. Free Tier only works if users do not abuse it, for example by creating multiple free accounts. If we notice that many people try to abuse it, we will need to reconsider Free Tier altogether. Please play fair.\nPlease purchase any Paid Subscription to continue.', 'error': 'quota_exceeded', 'code': 1008}

This worked (adding set_api_key("") just before generate), but only the first time for some reason. When I re-ran the exact same code, it gave me the same error again: KeyError: 'audio'

Changing data["audio"] to data.get("audio") doesn't help (the response contains no 'audio' key).

This is why the audio is streamed the first time but not when rerunning the code.

I also ran into this issue while using the streaming text-to-speech API. Here’s a description of the problem, and some ways around it.

The generate_stream_input function in the tts.py module of the ElevenLabs Python library assumes successful audio data responses from the text-to-speech API. This assumption is unsafe, as the API can respond with error messages under various conditions (e.g., missing API key, unended/timed-out input stream, account usage exceeded, etc.).

Proposed Solution: Rather than assuming success or simply ignoring the error, the maintainers should modify the function to robustly handle error responses from the API. This includes checking for the presence of the 'audio' key and handling cases where it's absent.
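
As a rough illustration of that kind of check, here is a minimal sketch. It is not the library's actual code: extract_audio and TTSStreamError are hypothetical names, and it assumes each message is a JSON string whose 'audio' value, when present, is base64-encoded. The error fields ('message', 'error') are taken from the server response quoted earlier in this thread.

```python
import base64
import json
from typing import Optional


class TTSStreamError(RuntimeError):
    """Hypothetical exception for error payloads received instead of audio."""


def extract_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes, None if the message has no audio,
    or raise TTSStreamError if the server sent an error payload.

    Assumption: 'audio' holds a base64-encoded string when present.
    """
    data = json.loads(raw_message)
    audio = data.get("audio")
    if audio:
        return base64.b64decode(audio)
    # No audio: surface the server's error instead of failing with KeyError.
    if "error" in data or "message" in data:
        error = data.get("error", "unknown_error")
        message = data.get("message", "")
        raise TTSStreamError(f"{error}: {message}")
    # Neither audio nor an error (for example an empty or null 'audio' field).
    return None
```

With something like this, a quota_exceeded response would raise an exception carrying the server's own message, rather than an opaque KeyError: 'audio'.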

Workarounds:

  • Debug the tts.py module to inspect actual API responses.
  • End text input streams with an empty string ('') to avoid timeouts.
  • Upgrade to a paid account or add Two-Factor Authentication to mitigate usage-related errors.
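
The second workaround could look something like the sketch below. with_end_marker is a hypothetical helper, not part of the library, and it assumes that an empty string is what the API treats as the end of the input stream, so empty chunks in the middle are dropped to avoid ending the stream early.

```python
from typing import Iterable, Iterator


def with_end_marker(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the non-empty text chunks, then a final empty string.

    Assumption: the empty string signals end of input to the API.
    """
    for chunk in chunks:
        if chunk:  # skip empty chunks so the stream is not ended early
            yield chunk
    yield ""  # explicit end-of-input marker


def text_stream() -> Iterator[str]:
    yield "Hi there, I'm Eleven "
    yield "I'm a text to speech API "


# Intended use with the call from the original report:
# audio_stream = generate(text=with_end_marker(text_stream()), voice="Nicole",
#                         model="eleven_monolingual_v1", stream=True)
```

This only addresses the timeout case. It does not help when the server is returning a quota or authentication error.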

@RicardoEscobar your suggestion and changing to a paid plan worked for me! thx!

@bbekdemir @nikolas-n @Mascobot: Did anyone fix this bug? It seems that whenever a user sends a request, ElevenLabs does not send the result through the websocket. This looks more like a server issue, or the webhook URL may be incorrect.