addons: Whisper container not working: Service exited with code 256 (by signal 4) on x86_64

TL;DR:

The issue was that CTranslate2, one of the libraries used by Whisper, requires SSE4.1 on AMD to function. If you have an older CPU that doesn’t support AVX or SSE4.1 instructions: bad luck. It’s theoretically possible (note: possible does not mean certain, nor likely) that you could build the required libraries manually, but the result would be very unlikely to perform satisfactorily.

If you have a CPU that does support those instructions, it’s possible that your VM host is configured to emulate an older CPU rather than pass the host CPU’s capabilities through to the guest. Wherever you configure your VM, look for a setting called “CPU type” or similar, change it from a generic value (it might be “QEMU” or something like it) to “host” or “pass-through”, then restart the VM.

To work out what CPU you have, run cat /proc/cpuinfo from a terminal.

  • If you run it on your Home Assistant VM, it will show you what CPU is being “shown” to the VM.
  • If you run it on your host OS, it should show you the physical CPU’s details.

What you are looking for is avx or sse4_1 listed in the section called “flags”.
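The flag check described above can be sketched in a few lines of Python. This is only an illustration: the function names `cpu_flags` and `report` are made up for this example, and it checks just the two flags named in this issue. Run it inside the VM to see what the guest is offered, or on the host to see what the physical CPU supports.

```python
from pathlib import Path

# Flags named in this issue; at least one of them should be present.
FLAGS_OF_INTEREST = ("avx", "sse4_1")


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text: str) -> dict:
    """Map each flag of interest to whether the CPU advertises it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print(report(path.read_text()))
```

If both values come back False inside the VM but not on the host, the VM's CPU type setting is the thing to look at. If both are False on the host as well, the hardware itself lacks the instructions.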

If you have an ARM cpu (say a raspberry pi) or something else that isn’t x86_64, then open a new ticket - this issue is about x86_64.

Describe the issue you are experiencing

(Note: whisper doesn’t appear in the add-on list in the bug report form, I hope I’m reporting in the right place)

Whisper does not appear via wyoming autodiscovery (piper has shown up OK), and logs show the service inside the whisper container repeatedly exiting with “INFO: Service exited with code 256 (by signal 4)”

After starting the whisper container in add-ons, the logs show:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service whisper: starting
s6-rc: info: service whisper successfully started
s6-rc: info: service discovery: starting
[18:51:16] INFO: Service exited with code 256 (by signal 4)
[18:51:18] INFO: Service exited with code 256 (by signal 4)
[18:51:20] INFO: Service exited with code 256 (by signal 4)
[18:51:22] INFO: Service exited with code 256 (by signal 4)

The Service exited line repeats ad infinitum (for well over 24 hours, anyway).

On the vm running HAOS, the last few lines of journalctl -e show:

May 09 08:55:00 homeassistant addon_core_whisper[463]: [18:55:00] INFO: Service exited with code 256 (by signal 4)
May 09 08:55:02 homeassistant audit[1799253]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1799253 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
May 09 08:55:02 homeassistant audit: BPF prog-id=133583 op=LOAD
May 09 08:55:02 homeassistant audit: BPF prog-id=133584 op=LOAD
May 09 08:55:02 homeassistant audit: BPF prog-id=133585 op=LOAD
May 09 08:55:02 homeassistant systemd[1]: Started Process Core Dump (PID 1799274/UID 0).
May 09 08:55:02 homeassistant systemd-coredump[1799275]: Process 1799253 (python3) of user 0 dumped core.
May 09 08:55:02 homeassistant systemd[1]: systemd-coredump@44474-1799274-0.service: Deactivated successfully.
May 09 08:55:02 homeassistant audit: BPF prog-id=133585 op=UNLOAD
May 09 08:55:02 homeassistant audit: BPF prog-id=133584 op=UNLOAD
May 09 08:55:02 homeassistant audit: BPF prog-id=133583 op=UNLOAD
May 09 08:55:02 homeassistant addon_core_whisper[463]: [18:55:02] INFO: Service exited with code 256 (by signal 4)
May 09 08:55:04 homeassistant audit[1799290]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1799290 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
May 09 08:55:04 homeassistant kernel: show_signal: 14 callbacks suppressed
May 09 08:55:04 homeassistant kernel: traps: python3[1799290] trap invalid opcode ip:7f87d3bd9c8c sp:7ffc61d97dc8 error:0 in libctranslate2-52d7eefc.so.3.13.0[7f87d3ab0000+2cc3000]
May 09 08:55:04 homeassistant kernel: audit: type=1701 audit(1683622504.596:311801): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1799290 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
May 09 08:55:04 homeassistant audit: BPF prog-id=133586 op=LOAD
May 09 08:55:04 homeassistant audit: BPF prog-id=133587 op=LOAD
May 09 08:55:04 homeassistant audit: BPF prog-id=133588 op=LOAD
May 09 08:55:04 homeassistant kernel: audit: type=1334 audit(1683622504.618:311802): prog-id=133586 op=LOAD
May 09 08:55:04 homeassistant kernel: audit: type=1334 audit(1683622504.618:311803): prog-id=133587 op=LOAD
May 09 08:55:04 homeassistant kernel: audit: type=1334 audit(1683622504.618:311804): prog-id=133588 op=LOAD
May 09 08:55:04 homeassistant systemd[1]: Started Process Core Dump (PID 1799311/UID 0).
May 09 08:55:04 homeassistant systemd-coredump[1799313]: Process 1799290 (python3) of user 0 dumped core.
May 09 08:55:04 homeassistant systemd[1]: systemd-coredump@44475-1799311-0.service: Deactivated successfully.
May 09 08:55:05 homeassistant kernel: audit: type=1334 audit(1683622504.990:311805): prog-id=133588 op=UNLOAD
May 09 08:55:05 homeassistant kernel: audit: type=1334 audit(1683622504.990:311806): prog-id=133587 op=UNLOAD
May 09 08:55:05 homeassistant kernel: audit: type=1334 audit(1683622504.990:311807): prog-id=133586 op=UNLOAD
May 09 08:55:04 homeassistant audit: BPF prog-id=133588 op=UNLOAD
May 09 08:55:04 homeassistant audit: BPF prog-id=133587 op=UNLOAD
May 09 08:55:04 homeassistant audit: BPF prog-id=133586 op=UNLOAD
May 09 08:55:05 homeassistant addon_core_whisper[463]: [18:55:05] INFO: Service exited with code 256 (by signal 4)
May 09 08:55:06 homeassistant audit[1799327]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1799327 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
May 09 08:55:06 homeassistant kernel: traps: python3[1799327] trap invalid opcode ip:7fea8649fc8c sp:7ffceda1d7e8 error:0 in libctranslate2-52d7eefc.so.3.13.0[7fea86376000+2cc3000]
May 09 08:55:06 homeassistant kernel: audit: type=1701 audit(1683622506.532:311808): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1799327 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
May 09 08:55:06 homeassistant audit: BPF prog-id=133589 op=LOAD
May 09 08:55:06 homeassistant audit: BPF prog-id=133590 op=LOAD
May 09 08:55:06 homeassistant audit: BPF prog-id=133591 op=LOAD
May 09 08:55:06 homeassistant systemd[1]: Started Process Core Dump (PID 1799348/UID 0).
May 09 08:55:06 homeassistant systemd-coredump[1799349]: Process 1799327 (python3) of user 0 dumped core.
May 09 08:55:06 homeassistant systemd[1]: systemd-coredump@44476-1799348-0.service: Deactivated successfully.
May 09 08:55:06 homeassistant audit: BPF prog-id=133591 op=UNLOAD
May 09 08:55:06 homeassistant audit: BPF prog-id=133590 op=UNLOAD
May 09 08:55:06 homeassistant audit: BPF prog-id=133589 op=UNLOAD
May 09 08:55:06 homeassistant addon_core_whisper[463]: [18:55:06] INFO: Service exited with code 256 (by signal 4)

and dmesg shows:

[91039.220989] show_signal: 14 callbacks suppressed
[91039.220996] traps: python3[1801332] trap invalid opcode ip:7fc5f8147c8c sp:7fffe2215c78 error:0 in libctranslate2-52d7eefc.so.3.13.0[7fc5f801e000+2cc3000]
[91039.221277] audit: type=1701 audit(1683622607.570:312158): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1801332 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
[91039.243698] audit: type=1334 audit(1683622607.593:312159): prog-id=133739 op=LOAD
[91039.244254] audit: type=1334 audit(1683622607.593:312160): prog-id=133740 op=LOAD
[91039.244867] audit: type=1334 audit(1683622607.594:312161): prog-id=133741 op=LOAD
[91039.614449] audit: type=1334 audit(1683622607.963:312162): prog-id=133741 op=UNLOAD
[91039.614469] audit: type=1334 audit(1683622607.964:312163): prog-id=133740 op=UNLOAD
[91039.614477] audit: type=1334 audit(1683622607.964:312164): prog-id=133739 op=UNLOAD
[91041.207015] traps: python3[1801387] trap invalid opcode ip:7f0688604c8c sp:7fffdc873f68 error:0 in libctranslate2-52d7eefc.so.3.13.0[7f06884db000+2cc3000]
[91041.207088] audit: type=1701 audit(1683622609.556:312165): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1801387 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
[91045.174293] show_signal: 14 callbacks suppressed
[91045.174300] traps: python3[1801469] trap invalid opcode ip:7f68db4bcc8c sp:7fffdb6e65e8 error:0 in libctranslate2-52d7eefc.so.3.13.0[7f68db393000+2cc3000]
[91045.174682] audit: type=1701 audit(1683622613.524:312179): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1801469 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
[91045.206187] audit: type=1334 audit(1683622613.555:312180): prog-id=133748 op=LOAD
[91045.206470] audit: type=1334 audit(1683622613.556:312181): prog-id=133749 op=LOAD
[91045.206645] audit: type=1334 audit(1683622613.556:312182): prog-id=133750 op=LOAD
[91045.576399] audit: type=1334 audit(1683622613.926:312183): prog-id=133750 op=UNLOAD
[91045.576414] audit: type=1334 audit(1683622613.926:312184): prog-id=133749 op=UNLOAD
[91045.576420] audit: type=1334 audit(1683622613.926:312185): prog-id=133748 op=UNLOAD
[91047.186082] traps: python3[1801506] trap invalid opcode ip:7fcf20600c8c sp:7ffd5879e698 error:0 in libctranslate2-52d7eefc.so.3.13.0[7fcf204d7000+2cc3000]
[91047.186258] audit: type=1701 audit(1683622615.535:312186): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=docker-default pid=1801506 comm="python3" exe="/usr/bin/python3.9" sig=4 res=1
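One way to read the "trap invalid opcode" lines above is to subtract the mapping's start address (the first number in the square brackets) from the instruction pointer, which gives the position of the faulting instruction inside the mapped library. The sketch below does that for a single log line. The regular expression and the name `fault_offset` are written for this example, and the result is an offset into the mapping, which is not necessarily the same as an offset into the file on disk.

```python
import re

# Matches the kernel "traps:" line format seen in the logs above.
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:\d+ "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line: str):
    """Return (library, hex offset of the faulting ip within its mapping)."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    offset = int(match["ip"], 16) - int(match["base"], 16)
    return match["lib"], hex(offset)


if __name__ == "__main__":
    line = (
        "[91039.220996] traps: python3[1801332] trap invalid opcode "
        "ip:7fc5f8147c8c sp:7fffe2215c78 error:0 in "
        "libctranslate2-52d7eefc.so.3.13.0[7fc5f801e000+2cc3000]"
    )
    print(fault_offset(line))
```

Applied to the trap lines in the journalctl and dmesg output above, the subtraction gives the same offset, 0x129c8c, every time, which is consistent with the process hitting one specific unsupported instruction in libctranslate2 on each restart.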

My stack is:

  • 4 x Opteron 6174 @ 2.2GHz with 132GB RAM
    • Debian bullseye
    • libvirt 7.0.0-3 amd64
      • HA Operating System 10.1 on kvm with 8GB RAM, 2 vCPU, first with the default cpu (qemu64?) then with host-passthrough with same results.
        • Home Assistant 2023.5.2
        • Supervisor 2023.04.1
        • Frontend 20230503.3 - latest

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

Whisper

What is the version of the add-on?

Whisper 0.1.1

Steps to reproduce the issue

  1. Install HAOS vm on above machine
  2. Add “whisper” add-on and click “start”
  3. Observe log entries showing service restarting

System Health information

None of these appear relevant, being outside of the whisper container’s environment:

SQL query does full table scan

The query select string_agg(ss.state, ',') as state, string_agg(date_trunc('second', ss.last_updated_ts - now())::text, ',') as deltatime, max(ss.last_updated_ts) as last_updated from ( select state, to_timestamp(last_updated_ts) as last_updated_ts from states where entity_id = 'sensor.housepower_usage' order by state_id DESC limit 10 ) ss; contains the keyword entity_id but does not reference the states_meta table. This will cause a full table scan and database instability. Please check the documentation and use states_meta.entity_id instead.
Update webhook trigger: 9ab4b310-4e51-4140-9452-405e0e63c748

This stops working in version 2023.7.0. Please address before upgrading.
A choice needs to be made about whether the 9ab4b310-4e51-4140-9452-405e0e63c748 webhook automation trigger is accessible from the internet. Edit the automation "Sleep as Android webhook handler", (automation.sleep_as_android_webhook_handler) and click the gear icon beside the Webhook ID to choose a value for 'Only accessible from the local network'
Update hubcentral with ESPHome 2023.4.0 or later

To improve Bluetooth reliability and performance, we highly recommend updating hubcentral with ESPHome 2023.4.0 or later. When updating the device from ESPHome earlier than 2022.12.0, it is recommended to use a serial cable instead of an over-the-air update to take advantage of the new partition scheme.

Anything in the Supervisor logs that might be useful for us?

23-05-09 18:51:14 INFO (SyncWorker_4) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-whisper with version 0.1.1
23-05-09 19:01:57 INFO (MainThread) [supervisor.host.info] Updating local host information
23-05-09 19:01:58 INFO (MainThread) [supervisor.host.services] Updating service information
23-05-09 19:01:58 INFO (MainThread) [supervisor.host.network] Updating local network information
23-05-09 19:01:58 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
23-05-09 19:01:58 INFO (MainThread) [supervisor.host.manager] Host information reload completed
23-05-09 19:12:10 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token

Anything in the add-on logs that might be useful for us?

(per above snippet):

[19:19:27] INFO: Service exited with code 256 (by signal 4)
[19:19:29] INFO: Service exited with code 256 (by signal 4)
[19:19:31] INFO: Service exited with code 256 (by signal 4)
[19:19:33] INFO: Service exited with code 256 (by signal 4)
[19:19:35] INFO: Service exited with code 256 (by signal 4)
[19:19:37] INFO: Service exited with code 256 (by signal 4)
[19:19:39] INFO: Service exited with code 256 (by signal 4)
[19:19:41] INFO: Service exited with code 256 (by signal 4)
[19:19:43] INFO: Service exited with code 256 (by signal 4)
[19:19:45] INFO: Service exited with code 256 (by signal 4)
[19:19:47] INFO: Service exited with code 256 (by signal 4)
[19:19:49] INFO: Service exited with code 256 (by signal 4)
[19:19:51] INFO: Service exited with code 256 (by signal 4)
[19:19:53] INFO: Service exited with code 256 (by signal 4)
[19:19:55] INFO: Service exited with code 256 (by signal 4)

Additional information

The HAOS VM was initially created (ages ago) from the haos_ova-9.4.qcow2.xz image and was updated via the HA frontend to the current release.

I have done a couple of full restarts (of the whole HAOS VM as well as the containers within) with no change in symptoms. At first the VM was running with the default CPU (which appears to be qemu64 on my setup). After noting the invalid opcode messages in the dmesg output, I also tried booting with the ‘Copy host CPU configuration’ or cpu-passthrough option in libvirt, but the result is the same (all logs etc. above were gathered after booting in cpu-passthrough mode).
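For libvirt users who want to confirm which CPU mode a guest is actually configured with, a small sketch like the following can read it from the domain XML. It assumes the XML text was obtained separately (for example with virsh dumpxml), the function name `cpu_mode` is made up for this example, and the sample XML is a hypothetical, heavily trimmed domain definition rather than the one from this report.

```python
import xml.etree.ElementTree as ET


def cpu_mode(domain_xml: str):
    """Return the 'mode' attribute of the <cpu> element, or None if absent."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    return None if cpu is None else cpu.get("mode")


if __name__ == "__main__":
    # Hypothetical, trimmed-down domain XML for illustration only.
    sample = """
    <domain type='kvm'>
      <name>haos</name>
      <cpu mode='host-passthrough'/>
    </domain>
    """
    print(cpu_mode(sample))
```

A value of host-passthrough only means the guest sees whatever the physical CPU offers. As this report shows, that does not help when the physical CPU itself lacks the required instructions.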

I have tried with the tiny-int8 and medium-int8 models selected. I also tried uninstalling and re-installing the add-on, and leaving the model selection at the default tiny-int8 during that process (since the first time around I might have changed the model before the first container start).

I’m comfortable using ssh, docker etc, so I can easily run commands etc inside the whisper container if that’s helpful.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 22 (8 by maintainers)

Most upvoted comments

Thanks, @synesthesiam. I can confirm that the same issue occurs on my system, which doesn’t have AVX instruction compatibility. It would be great if someone with the know-how could compile this add-on without those instructions (similar to the Frigate add-on fork by pdecat: https://github.com/pdecat/frigate-hass-addons/). However, this is a good excuse to migrate to something a bit more modern (and powerful): I have been running Home Assistant OS on an old Dell Wyse thin client that uses the AMD G-T56N processor.

Thanks all for the diagnostics! I was able to replicate the issue with CTranslate2’s docker container on my host system, so I’ve logged an issue with them upstream to see how we go. https://github.com/OpenNMT/CTranslate2/issues/1224

CTranslate2 is compiled with AVX instructions, which don’t seem to be supported by your processor

Thank you @synesthesiam, I recently started this issue where there is more detailed information from me. If I need to find out the exact CPU model, I’ll probably have to open the PC I’m running HAOS on. Or is there a better method to find out the processor type via HA?

#3355

Run cat /proc/cpuinfo to get the full details of what CPU you are using. If you run that inside the HAOS vm, you’ll see what CPU the virtual machine is “seeing”, and if you run the command on the host OS you should see what the physical CPU is. You are looking for “avx” to be listed in the “flags” section of the output.

@agners Yes, AVX instructions are required for this add-on on x86-64 machines.

I had this same error but was able to fix it by setting CPU type to “host”. I saw this didn’t work for somebody else but might be worth trying. Using an i7-10710U host with newest ProxMox, haos VM.

Signal 4 means SIGILL (illegal instruction), and it seems that is what Whisper trips here:

May 09 08:55:06 homeassistant kernel: traps: python3[1799327] trap invalid opcode ip:7fea8649fc8c sp:7ffceda1d7e8 error:0 in libctranslate2-52d7eefc.so.3.13.0[7fea86376000+2cc3000]

Maybe libctranslate2 has been built with a specific x86-64 CPU profile in mind? 🤔
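The signal number in "by signal 4" and in the audit records (sig=4) can be turned into a name with Python's standard signal module. A minimal sketch, where `describe_signal` is a helper name chosen for this example:

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name for a signal number, e.g. 4 -> SIGILL."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"unknown signal {signum}"


if __name__ == "__main__":
    print(describe_signal(4))
```

On Linux this gives SIGILL for signal 4, which matches the "trap invalid opcode" lines the kernel logs for the same process.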