core: Whisper Addon - Pipeline timeout
The problem
Speech-to-text has an error resulting in pipeline timeout.
Several attempts to fix the problem:
- Reinstalled Whisper several times
- Tried other languages
- Tried other models
- Wyoming integration reloaded
- Wyoming integration deleted and re-added
- ESPHome (2023.4.2) tried from the web installer (Voice Assistant) & latest version (2023.4.4)
Exactly the same problem every time.
As an input source I use “M5Stack ATOM Echo Development Kit”, according to the instructions here.
What version of Home Assistant Core has the issue?
core-2023.5.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Whisper
Link to integration documentation on our website
https://www.home-assistant.io/integrations/wyoming/
Diagnostics information
home-assistant_wyoming_2023-05-05T02-28-50.373Z.log
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Whisper Addon:
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service whisper: starting
s6-rc: info: service whisper successfully started
s6-rc: info: service discovery: starting
INFO:__main__:Ready
[04:11:01] INFO: Successfully send discovery information to Home Assistant.
s6-rc: info: service discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-20' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:26> exception=ValueError("can't extend empty axis 0 using modes other than 'constant' or 'empty'")>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 32, in run
if not (await self.handle_event(event)):
File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 61, in handle_event
segments, _info = self.model.transcribe(
File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 124, in transcribe
features = self.feature_extractor(audio)
File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/feature_extractor.py", line 152, in __call__
frames = self.fram_wave(waveform)
File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/feature_extractor.py", line 98, in fram_wave
frame = np.pad(frame, pad_width=padd_width, mode="reflect")
File "<__array_function__ internals>", line 200, in pad
File "/usr/local/lib/python3.9/dist-packages/numpy/lib/arraypad.py", line 815, in pad
raise ValueError(
ValueError: can't extend empty axis 0 using modes other than 'constant' or 'empty'
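For anyone reading the traceback above: the final error is what NumPy raises when `np.pad` is asked to reflect-pad an array with no elements, which suggests the audio buffer that reached the feature extractor was empty. The snippet below is only a minimal reproduction of that NumPy behavior. `pad_frame` and `safe_pad_frame` are hypothetical helper names for illustration, not functions from `wyoming_faster_whisper`, and the guard is a sketch of one possible check rather than the actual fix.

```python
import numpy as np


def pad_frame(frame, pad_width):
    # Same call shape as the failing line in the traceback.
    return np.pad(frame, pad_width=pad_width, mode="reflect")


def safe_pad_frame(frame, pad_width):
    """Hypothetical guard: return None when no samples were received."""
    if frame.size == 0:
        return None
    return pad_frame(frame, pad_width)


# An empty waveform, as would result if no audio ever arrived.
empty = np.array([], dtype=np.float32)

try:
    pad_frame(empty, (0, 4))
    message = ""
except ValueError as err:
    message = str(err)

print(message)
print(safe_pad_frame(empty, (0, 4)))
```

If that reading is right, the traceback is a symptom of missing audio rather than of a broken model or language setting.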
Debug Assistant:
stage: stt
run:
  pipeline: 01gzm3e9q5123zc88tmmmzbvwf
  language: de
events:
  - type: run-start
    data:
      pipeline: 01gzm3e9q5123zc88tmmmzbvwf
      language: de
    timestamp: "2023-05-05T02:12:57.766709+00:00"
  - type: stt-start
    data:
      engine: stt.faster_whisper
      metadata:
        language: de
        format: wav
        codec: pcm
        bit_rate: 16
        sample_rate: 16000
        channel: 1
    timestamp: "2023-05-05T02:12:57.766981+00:00"
stt:
  engine: stt.faster_whisper
  metadata:
    language: de
    format: wav
    codec: pcm
    bit_rate: 16
    sample_rate: 16000
    channel: 1
  done: false
Additional information
No response
About this issue
- State: closed
- Created a year ago
- Reactions: 8
- Comments: 65 (15 by maintainers)
Just a heads up for those of you running HA in a VM: double check that you have the AVX instruction sets enabled for your Home Assistant VM. This will have a huge impact on inference times.
I’m using Proxmox and by enabling x86-64_v3 for the HA VM, I got the “small” model (and not even the int8 variety) to run with a 3-4s delay for most prompts, whereas before it would always time out.
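A quick way to check this from inside a Linux guest is to look at the `flags` line in `/proc/cpuinfo`. The following is a minimal sketch that assumes a Linux guest exposing `/proc/cpuinfo`; `cpu_flags` and `avx_support` are illustrative helper names, not part of Home Assistant or Whisper.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def avx_support(cpuinfo_text):
    """Report whether the avx and avx2 flags are visible to this machine."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2")}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(avx_support(handle.read()))
    except OSError:
        print("Could not read /proc/cpuinfo (not a Linux system?)")
```

If both values come back `False` inside the VM while the host CPU supports AVX, the VM's CPU type is probably hiding those instructions from the guest.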
Wow… That’s a terrible design
-------- Original Message -------- On 6 Jul 2023, 21:29, ChopperRob wrote:
It’s unusable.
This helped me a lot. I changed my CPU type to host, as I only have the one Proxmox node. I used this forum post to make that decision: https://forum.proxmox.com/threads/cpu-type-host-vs-kvm64.111165/
Every time audio is sent from the ESPHome device, in my case an Atom Echo, it is sent to a different port on the Home Assistant device.
I did a packet capture while doing 2 voice commands. During this time I see 2 TCP streams: one on the normal ESPHome API port 6053 and one on port 80 to grab the response audio.
I also see 2 UDP streams, both around 60KB in size. The first was sent to port 43280 on the HA device, the second to port 54921. In earlier packet captures the destination ports were 44483, 51865, etc.
The weird thing is that the source port is always 58466; normally the source port is random and the destination is fixed.
I opened all UDP traffic from my ESPHome devices to Home Assistant on my firewall and Whisper is now working every time. (I still have a different issue with Piper: the device can’t play the response audio.)
My guess is that ESPHome and HA agree on a UDP port via the API channel, so both ends know which port will be used and HA can open the correct port.
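To illustrate that guess, here is a small loopback sketch of a receiver that binds to port 0 for each stream, so the operating system assigns a fresh destination port every time, while a single sender socket keeps the same source port. This is only a model of the observed behavior under that assumption, not the actual ESPHome or Home Assistant code; `open_receiver` and `demo_two_streams` are made-up names.

```python
import socket


def open_receiver(host="127.0.0.1"):
    """Bind a UDP socket to port 0 so the OS assigns a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    sock.settimeout(2.0)
    return sock, sock.getsockname()[1]


def demo_two_streams(host="127.0.0.1"):
    """Simulate two voice commands; return (dest_port, src_port, size) per stream."""
    first, first_port = open_receiver(host)
    second, second_port = open_receiver(host)
    # One sender socket reused for both streams, so its source port stays fixed.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.bind((host, 0))
    results = []
    try:
        for sock, port in ((first, first_port), (second, second_port)):
            sender.sendto(b"\x00" * 512, (host, port))
            data, addr = sock.recvfrom(2048)
            results.append((port, addr[1], len(data)))
    finally:
        for sock in (first, second, sender):
            sock.close()
    return results


if __name__ == "__main__":
    for dest_port, src_port, size in demo_two_streams():
        print(f"src {src_port} -> dst {dest_port}: {size} bytes")
```

With a pattern like this, a firewall rule that only allows a fixed destination port cannot match the audio stream, which would be consistent with the fix of allowing all UDP traffic from the device to Home Assistant.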
A little OT but following up on my earlier comment, with the updates from chapter 4 everything is working for me now. Confidence restored and looking forward to eventually replacing alexa. 😄
I suspect this would be because E5620 does not support AVX instruction sets. … and AFAIK the benefit that AVX brings to the table is not about optimization.
I actually experienced this today after upgrading to home assistant 2023.8.0
Starting Assist would show the ‘…’ while waiting for my voice; saying anything or just waiting would get no response and eventually return ‘Timeout running pipeline’. I’m running Home Assistant, Whisper and Piper in Docker, and everything was working fine on 2023.7. I tried updating the versions of Whisper and Piper but that didn’t change anything.
When going to settings > voice assistant > home assistant, I noticed that text-to-speech piper config was using ‘Amy (low)’ instead of ‘Ryan (low)’ that I had set previously. Clicking ‘try voice’ took a few minutes before finally generating the voice. I later realised when looking at the piper logs the delay was because it had to download the amy files. After trying the voice in the settings, assist was working fine again, no more timeouts, even switching back to Ryan works fine.
I guess assist was timing out because it was asking piper for a voice that piper didn’t have downloaded. Not sure at what point it switched but thought I’d share my experience in case it helps anyone else.
yeah, not the best from a network perspective.
I just found the issue in my case: the ESPHome device uses a random UDP port to send the audio to Home Assistant, and my firewall was blocking this.
I have the same issue with my atom-echo and local voice assistant. If I set the voice assistant to cloud I get the following log in the homeassistant core log
Voice error: Error processing nl-NL speech: 400 No audio data received
It looks to me like the atom-echo is not sending the recording correctly, but I don’t know how to debug this.
I immediately did, as I want to split traffic at the network level and have the HA host stable and dedicated 😃 I don’t do it, but you could also run HA OS in a VM and get the best of both worlds.
You are right about the docs, of course. But the voice stuff is still early days and I personally like that we are getting fast updates and functionalities - even if documentation is behind.
Same issue. Only the “base” model works, and only occasionally. VM with 4 GB RAM, 4 CPUs (Intel® Xeon® CPU E5620 @ 2.40GHz). The whisper process takes 300%+ of the CPUs and eventually crashes. Still exists in HASS 2023.8.1 and the latest version of Whisper, 1.0.0. I am using the Firefox browser and the companion Android app.
I was really impressed when HA came with the voice assistant. It understood the Czech language instantly. As I was speaking it immediately wrote what I said with almost no errors. After some time not using it, I realized the voice command (microphone icon) was missing and there was no possibility to select the Czech language. I found out I have to install the Whisper addon, but it just does not work. The addon’s CPU load is very, very high, the response, if any, is very slow (several seconds) and it never understands what I say. Is there any possibility to revert to the original voice-to-text service?
Upgraded the machine to 4 GB RAM, now it works!
Surprisingly enough, my Whisper communication problem got fixed too when I allowed the ‘atom’ device more wiggle room through my firewall… I’m guessing it’s due to my VLAN config, but anyway, just happy it got figured out (at least in my case). I’ll probably spend the next few hours turning stuff on and off via voice commands, and then never use it again until there is a “waking word” to try out…
cheers
Same issue here. But it worked a few weeks ago. Running on Raspi 4B 8GB.
Here’s what the debug assistant gives me:
ETA: since the switch to HA Yellow it seems like I can’t even get text commands parsed by the assistant anymore. No idea what’s going on. If I set language to English it works (in text), but in German it does not match the intent that I’m looking at.
Tried with the M5Stack ATOM Echo, browser & iOS HA app, always the same message.
I also have the same issue. I initiate the chat by pressing the button on the M5 Atom Echo, and the error I see in the whisper container is:
The logs on ESPHome are:
[18:12:40][D][binary_sensor:036]: ‘Button’: Sending state ON
[18:12:40][D][voice_assistant:065]: Requesting start…
[18:12:40][D][voice_assistant:045]: Starting…
[18:12:40][D][voice_assistant:083]: Assist Pipeline running
[18:12:40][D][light:035]: ‘M5Stack Atom Echo d4e650’ Setting:
[18:12:40][D][light:058]: Red: 0%, Green: 0%, Blue: 100%
[18:12:42][D][binary_sensor:036]: ‘Button’: Sending state OFF
[18:12:42][D][voice_assistant:073]: Signaling stop…
My ESPHome config looks like:
I updated 2023-05-11
I run HAOS on Oracle VirtualBox on a Windows machine, so processing capability should not be the root cause of this issue.
Home Assistant runs in a Proxmox VM (2x CPU, 10 GB RAM) on an underutilized Intel NUC (NUC8i3BEK), 32 GB RAM, 4 x Intel® Core™ i3-8109U CPU @ 3.00GHz. I don’t think that the problem is related to the hardware, since the error occurs immediately after speaking and the VM has a total utilization of about 15% CPU / 3 GB RAM.
The problem doesn’t seem to be related to the M5Stack ATOM Echo either; the same thing happens in the web browser & iOS app as well.
What can I try or how can I help to narrow down the problem?