whisper.cpp: Stuck on convert_encoder while converting to CoreML
I’ve tried to convert the small model to Core ML format on a Mac M1 by following the Core ML instructions.
However, the process gets stuck after the `Running MIL backend_mlprogram pipeline` step. I can see `ANECompilerService` using 100% CPU in `top`, but the conversion process just never ends.
My environment:
- MacBook Pro (Apple M1 silicon)
- Python 3.9
- whisper==1.1.10
- openai-whisper==20230314
- ane-transformers==0.1.3
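(For anyone trying to reproduce: while the conversion script is stuck, you can check from a second terminal whether `ANECompilerService` is the process that is spinning. This is just a sketch of the check, not output from my machine.)

```shell
# While the Core ML conversion is stuck, check from another terminal
# whether ANECompilerService is the process pegged at 100% CPU in top.
# pgrep -x matches the exact process name; -l also prints it.
pgrep -lx ANECompilerService || echo "ANECompilerService is not running"
```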
About this issue
- State: open
- Created a year ago
- Reactions: 7
- Comments: 25 (2 by maintainers)
Same here. It hung for 68 minutes, then I found the solution of killing `ANECompilerService`, and it finished.
Same. Tried `generate-coreml-model.sh` a few times with both `medium.en` and `large`, even let it run 8 h or so overnight; it never completed. After a `sudo kill -9` on `ANECompilerService` (sending SIGTERM didn’t work), the process finished almost immediately. Afterwards, running the model hangs indefinitely at `whisper_init_state: first run on a device may take a while ...`. If I again send SIGKILL to `ANECompilerService`, it finishes within seconds and correctly transcribes the audio.

I have successfully executed `./models/generate-coreml-model.sh large` on an M1, which took about 50 minutes. It would be better to note an approximate time in the documentation @ggerganov

FWIW I’m not entirely convinced that waiting is required or is the full answer here. I waited for multiple hours converting the `medium` model and it didn’t finish, but if I force-quit `ANECompilerService` after waiting a few minutes, the process appears to complete successfully. That said, on my Mac I end up with the same issue with the converted model in Xcode, both at runtime and when performance-benchmarking the model: it gets stuck compiling the model and never finishes. Sometimes, if I’m lucky, the compilation appears to happen immediately and I can use the model as usual for that run of the program, though mostly only in Debug builds. Seems to be a bug in the compiler service.
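To summarize the workaround several of us hit on, here is a small sketch (run it in a second terminal while the conversion is stuck; `ANECompilerService` is the process name as seen in `top`):

```shell
#!/bin/sh
# Workaround sketch: force-kill the stuck ANECompilerService so the
# Core ML conversion (or first model load) can finish.
pid=$(pgrep -x ANECompilerService || true)
if [ -n "$pid" ]; then
    # SIGTERM reportedly has no effect here, so send SIGKILL (-9).
    sudo kill -9 "$pid"
    echo "Sent SIGKILL to ANECompilerService (pid $pid)"
else
    echo "ANECompilerService is not running"
fi
```

The same kill is apparently needed again on the first run of the converted model, when it hangs at `whisper_init_state: first run on a device may take a while ...`.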
I have the same issue whether I make an `mlprogram` or an `mlmodel`, but the `mlprogram` seems to show the problem more often / worse.
I have torch==2.0, Python 3.10, an M1 MBP, and 16 GB RAM. I’ve successfully converted all the models; the time spent ranged from 1 min to 60 min. No errors, they just need time.
System: MacBook Pro M1 Pro

I am having the exact same issue: if I kill `ANECompilerService`, the Core ML-compiled `main` continues on and begins to work. What is the issue here? It seems to be recompiling the model each time.
@archive-r @arrowcircle
However, even after killing `ANECompilerService`, the same issue occurs when you run it again.
I have successfully used the `base` option and experienced no issues during subsequent runs. However, when I tried using the `medium` option, it got stuck.

Something’s not quite right here. It took my M1 Max with 64 GB RAM exactly 4 hours to convert `base.en` and almost 3 hours to convert `medium.en` (which didn’t load, btw). Could someone share details such as their torch and Python versions? I am using torch 2.1.0-dev20230417 and Python 3.10.10.
I think it would be better to have some sort of progress update if possible. It just looks like it’s hanging.