faster-whisper: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.

I recently tried this wonderful tool on the CPU of my Windows 10 machine and got quite good results. But when I tried it on the GPU via model = WhisperModel(model_path, device="cuda", compute_type="float16") I received the following error: Requested float16 compute type, but the target device or backend do not support efficient float16 computation. I have a GTX 1050 Ti and the main driver is 31.0.15.1694. How can I fix this error and run on my GPU card?

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

Your GPU does not support FP16 execution.

You can set compute_type to “float32” or “int8”.
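A minimal sketch of that fallback logic, assuming you query (or already know) whether the GPU has efficient FP16 support; the helper name and its arguments are hypothetical, not part of the faster-whisper API:

```python
# Hypothetical helper: pick a compute type the target device can handle.
# "float16" only pays off on GPUs with fast FP16 units; older cards like
# the GTX 1050 Ti should fall back to "float32" or "int8".
def pick_compute_type(device: str, supports_fast_fp16: bool) -> str:
    if device == "cuda" and supports_fast_fp16:
        return "float16"
    if device == "cuda":
        return "float32"  # or "int8" to reduce memory use
    return "int8"         # common choice on CPU
```

You would then pass the result to the real constructor, e.g. WhisperModel(model_path, device="cuda", compute_type=pick_compute_type("cuda", False)).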

int8 does not work on the Tesla P40 or the P100; errors are thrown. Any thoughts on flags to set for that, if any?

Also not sure why the P40 is reported as not supporting FP16 when the GPU's datasheet indicates that it definitely does; I needed to set the allow flag for it to use FP16. Will post benchmarks shortly comparing FP32 vs. FP16 (with the forcing flag on).
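For reference, the "allow flag" here is presumably CTranslate2's CT2_CUDA_ALLOW_FP16 environment variable (faster-whisper runs on CTranslate2, which refuses FP16 on devices without efficient FP16 support unless this is set). A sketch of setting it from Python, assuming that variable is what the commenter means:

```python
import os

# Assumption: the "forcing flag" is CTranslate2's CT2_CUDA_ALLOW_FP16.
# It must be set before the model is loaded for CTranslate2 to see it.
os.environ["CT2_CUDA_ALLOW_FP16"] = "1"

# model = WhisperModel(model_path, device="cuda", compute_type="float16")
```

Equivalently, export it in the shell before launching the script.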

FP32 test on a ~45 min file, Tesla P40, batch size 16.

real    8m37.289s 
user    7m35.752s
sys     0m26.459s

FP16 test on the same file, Tesla P40, batch size 16, environment variable set:

real    8m24.999s
user    7m49.251s
sys     0m21.153s

Much lower memory pressure as well. Transcription quality was the same, and speed/performance was about the same.