TTS: [Bug] newer models perform much worse

Describe the bug

The updated version requires re-downloading the checkpoints. However, the new checkpoints give much worse voice cloning/synthesis results than previous versions.

The same issue is also mentioned in the Hugging Face community discussion: https://huggingface.co/coqui/XTTS-v2/discussions/16

Is it possible to use the previous model checkpoints, or to load a locally downloaded model? The way the package loads models seems strange, and I do not see any option to do so.

To Reproduce

    import torch
    from TTS.api import TTS

    # Get device
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # List available 🐸TTS models
    print(TTS().list_models())

    # Init TTS
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

    # Run TTS
    # ❗ Since this model is a multi-lingual voice cloning model, we must set the target speaker_wav and language
    # Text to speech: returns a list of amplitude values as output
    wav = tts.tts(text="Hello world!", speaker_wav="my/cloning/audio.wav", language="en")

Expected behavior

No response

Logs

No response

Environment

TTS 0.21.1
Python 3.10.11
Pytorch 2.1.0+cu121
Ubuntu 22.04

Additional context

No response

About this issue

State: closed
Created 7 months ago
Reactions: 1
Comments: 15 (2 by maintainers)

Most upvoted comments

The best way, actually, is to create a folder named model next to your script and put these files inside it:

https://huggingface.co/coqui/XTTS-v2/raw/v2.0.2/config.json
https://huggingface.co/coqui/XTTS-v2/resolve/v2.0.2/model.pth?download=true
https://huggingface.co/coqui/XTTS-v2/raw/v2.0.2/vocab.json

Then load the model in your code like this:

    model = TTS(model_path="model/", config_path="model/config.json").to(device)

Clad3815 on Nov 25, 2023
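
The download step from the comment above could be scripted roughly as follows. This is only a sketch using the Python standard library: the helper names `pinned_url` and `download_pinned` are hypothetical, and it assumes that the `resolve/<revision>/<file>` URL form works for all three files (the comment uses `raw/` for the two JSON files).

```python
import urllib.request
from pathlib import Path

# Repository and file names are taken from the comment above.
REPO_URL = "https://huggingface.co/coqui/XTTS-v2"
FILES = ("config.json", "model.pth", "vocab.json")


def pinned_url(filename: str, revision: str = "v2.0.2") -> str:
    """Build the download URL for one file at a fixed revision.

    Assumption: the resolve/<revision>/<file> form serves every file.
    """
    return f"{REPO_URL}/resolve/{revision}/{filename}"


def download_pinned(target_dir: str = "model", revision: str = "v2.0.2") -> list:
    """Download the three files into target_dir, skipping files already present."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in FILES:
        destination = target / name
        if not destination.exists():
            # The checkpoint is a large file, so this call can take a while.
            urllib.request.urlretrieve(pinned_url(name, revision), str(destination))
        saved.append(destination)
    return saved
```

Calling `download_pinned("model")` once should leave a `model` folder that matches the layout the comment describes. Nothing is downloaded until that function is called.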

I thought that my local model was corrupted, so I deleted it, and now I can't use it anymore. I hope this problem will be solved soon.

You can try to git clone the HF model repository and check out the old commit.

ttio2tech on Nov 25, 2023
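
The clone-and-checkout suggestion above might look like the sketch below, written in Python to match the reproduction script. The helper names `checkout_commands` and `clone_at_revision` are hypothetical, and the revision is left as a parameter because the comment does not say which commit to use. It assumes `git` is on the PATH and that Git LFS is installed, since the large checkpoint is most likely stored through LFS.

```python
import subprocess

# Repository URL as referenced in the thread.
REPO_URL = "https://huggingface.co/coqui/XTTS-v2"


def checkout_commands(revision: str, clone_dir: str = "XTTS-v2") -> list:
    """Return the git commands that clone the repo and switch to a revision."""
    return [
        ["git", "clone", REPO_URL, clone_dir],
        ["git", "-C", clone_dir, "checkout", revision],
    ]


def clone_at_revision(revision: str, clone_dir: str = "XTTS-v2") -> None:
    """Run the commands in order and stop at the first failure."""
    for command in checkout_commands(revision, clone_dir):
        subprocess.run(command, check=True)
```

For example, `clone_at_revision("v2.0.2")` would use the revision name that appears in the download URLs quoted earlier in this thread, assuming that name is also a valid git ref in the repository.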

Update: I saw that it is possible to load a model from a local path. From TTS/api.py:

Example loading a model from a path:

    >>> tts = TTS(model_path="/path/to/checkpoint_100000.pth", config_path="/path/to/config.json", progress_bar=False, gpu=False)
    >>> tts.tts_to_file(text="Ich bin eine Testnachricht.", file_path="output.wav")

ttio2tech on Nov 25, 2023
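
Putting the comments together, a local load could be guarded by a quick file check so that a missing file fails early with a clear message. This is a sketch under assumptions: `missing_files` and `load_local_tts` are hypothetical helpers, the required file list is the one given in the earlier comment, and passing the folder as `model_path` follows that comment rather than any documented guarantee.

```python
from pathlib import Path

# File names taken from the comment that lists what to put in the folder.
REQUIRED_FILES = ("config.json", "model.pth", "vocab.json")


def missing_files(model_dir: str) -> list:
    """Return the required file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def load_local_tts(model_dir: str, device: str = "cpu"):
    """Load a TTS model from a local folder after checking its contents."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported lazily so the file check works without the TTS package installed.
    from TTS.api import TTS

    config_path = str(Path(model_dir) / "config.json")
    # Assumption: a folder is accepted as model_path, as in the comment above.
    return TTS(model_path=model_dir, config_path=config_path).to(device)
```

With the folder in place, `tts = load_local_tts("model", device)` would stand in for the `TTS("tts_models/multilingual/multi-dataset/xtts_v2")` call in the reproduction script, and the rest of that script stays unchanged.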