LocalAI: Problem with TTS in 2.8

We are running LocalAI in Docker but have problems with all of the TTS models described in TTS in LocalAI.

When we call the following curl command:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "backend": "bark", "input":"Hello!" }' | aplay

we get the following error:

stderr OSError: /opt/conda/envs/transformers/lib/python3.11/site-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZN2at4_ops10zeros_like4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEENS6_INS5_12MemoryFormatEEE

This error is thrown with bark, coqui, and Vall-E-X; piper works.
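For contrast, a minimal piper request along these lines works (the voice filename below is only an example; it has to match a piper voice actually present in the models directory):

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "model": "en-us-kathleen-low.onnx", "input": "Hello!" }' | aplay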

LocalAI version: v2.8.0-cublas-cuda12-ffmpeg

Environment, CPU architecture, OS, and Version: Linux aifb-bis-mlpc 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:33:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

To Reproduce Run the v2.8.0-cublas-cuda12-ffmpeg LocalAI image on the server and call the curl command above.

Expected behavior LocalAI shouldn’t return an error but a TTS audio file.
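For instance, saving the response to a file instead of piping it to aplay makes it easy to verify that a valid WAV file comes back (hello.wav is just a placeholder name):

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "backend": "bark", "input": "Hello!" }' --output hello.wav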

Logs I added a log file. _Shared_LocalAI_logs.txt

About this issue

  • Original URL
  • State: closed
  • Created 5 months ago
  • Comments: 17 (9 by maintainers)

Most upvoted comments

We tested it with the master branch (master-cublas-cuda12-ffmpeg). As input we used the standard curl command and got the following error:

curl:

curl http://localhost:8080/tts \
  -H "Content-Type: application/json" \
  -d '{
    "backend": "bark",
    "input": "Hello!"
  }' | aplay
4:32PM DBG Loading model in memory from file: /models/model_configuration
4:32PM DBG Loading Model  with gRPC (file: /models/model_configuration) (backend: hello!): {backendString:Hello! model: threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000230800 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:true parallelRequests:true}
[172.24.0.1]:37026 500 - POST /tts

Thank you so much, (nearly) everything works now. We still have a problem with MusicGen (https://localai.io/features/text-to-audio/#transformers-musicgen), which throws the error no module named 'google', and with selecting different speakers with bark, where we get sendfile: file /tmp/generated/audio/piper_28.wav not found.
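For reference, the MusicGen request follows the pattern from the linked docs page; the model name below is the one used in the documentation example and is only an assumption about what is available locally:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "backend": "transformers-musicgen", "model": "facebook/musicgen-small", "input": "Cello Rave" }' | aplay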

Please open separate tickets for it with full logs and how to reproduce it, thanks!

lol I’ve read that line at least four times before writing it, and it looked legit

I ran some tests again with master images, and can confirm that #1710 is fixed (just closed the issue, thanks a lot!).

As for this issue specifically, I tested with master-cublas-cuda12-ffmpeg (sha256:de26b09328fea0bd57ff2e14ae28ba9a54ca489a1bc96208131f8d4c1d494672), and while the initial error is definitely fixed, there still seems to be something wrong as, when curling:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{"backend": "bark","input":"Hello!"}'
{"error":{"code":500,"message":"grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/hello!. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS","type":""}}

I’m getting:

3:03PM DBG Request for model:
3:03PM INF Loading model with backend Hello!
3:03PM DBG Loading model in memory from file: /build/models
3:03PM DBG Loading Model  with gRPC (file: /build/models) (backend: hello!): {backendString:Hello! model: threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0001f6000 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}

It appears there might be a mixup between the backend and input fields as LocalAI tries to load the Hello! backend, though the TTSEndpoint code looks totally legit.
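A quick sanity check for the suspected mixup (just a sketch, not anything from the codebase) would be to swap the two values and see whether the server then tries to load bark as the backend:

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ "backend": "Hello!", "input": "bark" }'

If the fields really are swapped server-side, the debug log should then show something like "Loading model with backend bark" instead of "backend Hello!".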

ouch, good catch, this is a regression introduced in https://github.com/mudler/LocalAI/pull/1692.

Sorry that it took me forever to realize that the images weren’t pushed, then an equal amount of time to build a docker image from your branch.

I just ran a quick test for vLLM and the model loaded successfully, so I’d say it’s fixed but maybe it’s better to wait a bit more and confirm with the images from the master branch.

@golgeek I’ve tried only with TTS models (vall-e-x specifically), can you confirm that? please open up another issue for vLLM

Sorry, the error seemed too suspiciously similar, and I thought it might be the same origin.

Was coming back to report the same as @Jasonthefirst.

And I opened #1710 for vLLM.
