whisperX: pyannote/speaker-diarization-3.0 runs slower than pyannote/speaker-diarization@2.1
https://github.com/m-bain/whisperX/blob/07fafa37b3ef7ce8628b194da302a5a996bb7d37/setup.py#L22
Currently pyannote.audio is pinned to 3.0.0, but that version has been reported to run slower because the embedding model runs on the CPU. Release 3.0.1 fixed this by replacing onnxruntime with onnxruntime-gpu.
It makes sense for whisperX to update pyannote.audio to 3.0.1. However, there is a conflict with faster_whisper over onnxruntime, as discussed here. Until it is resolved on the faster_whisper side, installing both leaves onnxruntime in CPU mode, so diarization stays slow.
My current workaround is to run the following commands after installation:
pip install pyannote.audio==3.0.1
pip uninstall onnxruntime
pip install --force-reinstall onnxruntime-gpu
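To check whether the reinstall took effect, one option is to ask ONNX Runtime which execution providers it exposes. This is only a minimal sketch: the helper names `has_cuda_provider` and `report_onnxruntime` are made up for illustration, and seeing `CUDAExecutionProvider` in the list only indicates that the GPU build is the one being imported, not that pyannote is actually using it.

```python
def has_cuda_provider(providers):
    """Return True if the CUDA execution provider is in the given list."""
    return "CUDAExecutionProvider" in providers


def report_onnxruntime():
    """Print the providers of the installed onnxruntime, if any.

    Both the onnxruntime and onnxruntime-gpu packages install the same
    `onnxruntime` module, so the provider list is a quick way to see
    which build is actually being imported.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
        return None
    providers = ort.get_available_providers()
    print("available providers:", providers)
    return has_cuda_provider(providers)


if __name__ == "__main__":
    report_onnxruntime()
```

If only `CPUExecutionProvider` shows up, the CPU build is probably still the one installed, which would match the slow diarization described above.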
Alternatively, use the old 2.1 model:
model = whisperx.DiarizationPipeline(model_name='pyannote/speaker-diarization@2.1', use_auth_token=YOUR_AUTH_TOKEN, device='cuda')
About this issue
- State: closed
- Created 9 months ago
- Reactions: 24
- Comments: 26 (13 by maintainers)
Pyannote just released 3.1 without onnx. It should work fine for us. I’ll work on it tomorrow if nobody has done it before then. Release notes: https://github.com/pyannote/pyannote-audio/releases/tag/3.1.0
Brilliant, thank you. I thought I was crazy. Your fix worked for me. Diarization went from around 8 minutes to 30 seconds on a 2-speaker, ~45-minute audio file.
How did you fix it? I ran the commands only after installing whisperx, and this error message popped up during installation and when running the code. It stopped showing the error after reinstalling onnxruntime, but the performance issues continue.
😦 The current setup.py change breaks Mac, which sucks. I’m not sure I understood it, but I tried the solution people pointed out for the slower pyannote, and no matter what I do there is no onnxruntime available for Mac, as @dylorr pointed out. So I don’t think the solution really helps Mac users.
This seems to come from the pyannote dependency and not really from whisperx, so I created a Docker container by cloning whisperx and modifying setup.py, following the advice in the pyannote issue to use 3.0.0 or lower.
For those on Mac, here is the repository / docker image if you want to use it and just have a plug-and-go solution.
I tried to get setup.py to detect the environment, but I didn’t try too hard and ran into a few issues, so I just hard-coded it to 3.0.0 for now. Maybe OS detection can be a future thing.
Link to the modified repo: https://github.com/justinwlin/WhisperXMac
Docker image: https://hub.docker.com/layers/justinwlin/whisperxmac/1.0/images/sha256-3e56473cc25de95269955ef1a9c596ea7e62a9b83da682cf9bc3e91abe5d8798?context=repo
I didn’t test the Docker image too hard, I just made sure the below worked:
I also made sure that import whisperx works in Python, and since the CLI in my Docker container is just a bash shell passing the arguments to the Python function, I assume it works for any Python scripts too.
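For the OS-detection idea mentioned above, one possible approach is to use PEP 508 environment markers instead of branching inside setup.py. This is a sketch, not what the modified repo does: the function names are hypothetical, and the split between 3.0.0 on macOS and 3.0.1 elsewhere is an assumption taken from the versions discussed in this thread.

```python
import sys

# Assumption: pins taken from the versions discussed in this thread.
MAC_PIN = "pyannote.audio==3.0.0"      # avoids the onnxruntime-gpu dependency
DEFAULT_PIN = "pyannote.audio==3.0.1"  # uses onnxruntime-gpu


def pyannote_requirements():
    """Return requirement strings with PEP 508 platform markers.

    pip evaluates the marker after the semicolon at install time and
    skips any requirement whose marker is false on the current machine.
    """
    return [
        f'{MAC_PIN}; sys_platform == "darwin"',
        f'{DEFAULT_PIN}; sys_platform != "darwin"',
    ]


def pin_for(platform=sys.platform):
    """Return the pin that would apply on the given sys.platform value."""
    return MAC_PIN if platform == "darwin" else DEFAULT_PIN


if __name__ == "__main__":
    print(pyannote_requirements())
    print("pin on this machine:", pin_for())
```

The list returned by `pyannote_requirements()` could be appended to `install_requires` in setup.py, so the same file would install on both macOS and other platforms without hard-coding one version.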
It should not. pyannote 3.0 integrates a new model that is supposed to give better results, especially on overlapping speech. You can see their results on public databases in the release notes here. But it is research oriented. It is possible that your data are not like those databases, and you may get worse results. Another cause could be a problem in the whisperX process. To be sure, you should compare manually. (And… it is not the easiest thing to do.)
On the other subject, uninstalling and reinstalling does not work for me either, and that is a big problem.
I think setup is best! https://github.com/m-bain/whisperX#setup-️