TTS: [Bug] Bark examples not working out of the box?
Describe the bug
I have been following this tutorial: https://tts.readthedocs.io/en/dev/models/bark.html#example-use
To Reproduce
But this is the result I got:
(.venv) nemewsys@nemewsys-Legion-5-15ACH6H:~/voice-to-text$ tree bark_voices/
bark_voices/
└── ljspeech
    └── speaker.wav
1 directory, 1 file
(.venv) nemewsys@nemewsys-Legion-5-15ACH6H:~/voice-to-text$ tts --model_name tts_models/multilingual/multi-dataset/bark --text "This is an example." --out_path "output.wav" --voice_dir bark_voices/ --speaker_idx "ljspeech" --progress_bar True
> tts_models/multilingual/multi-dataset/bark is already downloaded.
> Using model: bark
> Text: This is an example.
> Text splitted to sentences.
['This is an example.']
Downloading HuBERT custom tokenizer
Downloading (…)rt_base_ls960_14.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 104M/104M [00:02<00:00, 39.6MB/s]
Traceback (most recent call last):
File "/home/nemewsys/voice-to-text/.venv/bin/tts", line 8, in <module>
sys.exit(main())
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/bin/synthesize.py", line 447, in main
wav = synthesizer.tts(args.text, speaker_name=args.speaker_idx)
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 365, in tts
outputs = self.tts_model.synthesize(
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/tts/models/bark.py", line 218, in synthesize
history_prompt = load_voice(self, speaker_id, voice_dirs)
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/tts/layers/bark/inference_funcs.py", line 81, in load_voice
generate_voice(audio=audio_path, model=model, output_path=output_path)
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/tts/layers/bark/inference_funcs.py", line 134, in generate_voice
hubert_manager.make_sure_tokenizer_installed(model_path=model.config.LOCAL_MODEL_PATHS["hubert_tokenizer"])
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/TTS/tts/layers/bark/hubert/hubert_manager.py", line 31, in make_sure_tokenizer_installed
huggingface_hub.hf_hub_download(repo, model, local_dir=model_dir, local_dir_use_symlinks=False)
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/home/nemewsys/voice-to-text/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1379, in hf_hub_download
os.makedirs(os.path.dirname(local_dir_filepath), exist_ok=True)
File "/usr/lib/python3.10/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/usr/lib/python3.10/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/usr/lib/python3.10/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
[Previous line repeated 1 more time]
File "/usr/lib/python3.10/os.py", line 225, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/root/.local'
Expected behavior
For it to produce the output.wav with the voice in the bark_voices folder
Logs
No response
Environment
{
"CUDA": {
"GPU": [
"NVIDIA GeForce RTX 3070 Laptop GPU"
],
"available": true,
"version": "11.7"
},
"Packages": {
"PyTorch_debug": false,
"PyTorch_version": "2.0.1+cu117",
"TTS": "0.15.6",
"numpy": "1.22.0"
},
"System": {
"OS": "Linux",
"architecture": [
"64bit",
"ELF"
],
"processor": "x86_64",
"python": "3.10.6",
"version": "#83-Ubuntu SMP Thu Jun 15 19:16:32 UTC 2023"
}
}
Additional context
No response
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 20 (3 by maintainers)
@FeatureSpitter
Hi, I ran into the same issue yesterday. Bark generation without voice cloning worked out of the box, but generation with voice cloning failed with the same error.
I found that the download path for the HuBERT custom tokenizer is not set in the current implementation.
This is the model.config.LOCAL_MODEL_PATHS at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/bark/inference_funcs.py#L134
I think the other model paths are set at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/models/bark.py#L270, but the hubert and tokenizer paths are not set, so they point to /root, which is not writable for a regular user.
I think you can fix it either by hard-coding the hubert_tokenizer model path to a location other than /root, or by manually downloading the hubert_tokenizer to /root/.local/share/tts/suno/bark_v0/ (this path may be different in your setup).
I fixed this issue by adding the following line at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/models/bark.py#L270 like this.
I’m unsure if it helps in your situation, but I’m sharing what worked for me.
Same bug encountered as of v0.22.0 with the GitHub version of TTS.
Do you have write access to the folder? Seems like you don’t.
Same bug with 0.22.0
This is your error
PermissionError: [Errno 13] Permission denied: '/root/.local'
I don’t have a different explanation than the one above. Sorry.
Is this something that should be mentioned in the README.md? Until you said this I was not aware of this fact 🙂
Should be fixed by #2894
I encountered this problem too. After digging through the code, I found that the problem comes from the bark config file config.json. In my case it is located at ~/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json:
You can modify this config file as follows to resolve this problem:
I have opened a pull request on the Hugging Face model card erogol/bark to resolve this.
Same problem here. What is the recommended fix?
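The config edit suggested above (changing /root/ to your own home directory inside config.json) could be scripted roughly as follows. This is only a sketch, not an official fix: the helper names rewrite_root_paths and patch_config are made up for illustration, and it assumes the offending entries are plain string values that start with /root/. It keeps a .bak copy of the original file. Note that one commenter in this thread reports hitting a further error after making this change, so it may not be sufficient on its own.

```python
import json
from pathlib import Path


def rewrite_root_paths(node, old_prefix="/root/", new_prefix=None):
    """Recursively replace a leading old_prefix in every string value."""
    if new_prefix is None:
        # Default to the current user's home directory, e.g. /home/myuser/
        new_prefix = str(Path.home()) + "/"
    if isinstance(node, dict):
        return {k: rewrite_root_paths(v, old_prefix, new_prefix) for k, v in node.items()}
    if isinstance(node, list):
        return [rewrite_root_paths(v, old_prefix, new_prefix) for v in node]
    if isinstance(node, str) and node.startswith(old_prefix):
        return new_prefix + node[len(old_prefix):]
    return node


def patch_config(config_path, old_prefix="/root/", new_prefix=None):
    """Rewrite old_prefix paths in a JSON config, keeping a .bak backup."""
    config_path = Path(config_path).expanduser()
    original = json.loads(config_path.read_text(encoding="utf-8"))
    patched = rewrite_root_paths(original, old_prefix, new_prefix)
    if patched != original:
        # Save the untouched content next to the config before overwriting it.
        backup = config_path.with_name(config_path.name + ".bak")
        backup.write_text(json.dumps(original, indent=4), encoding="utf-8")
        config_path.write_text(json.dumps(patched, indent=4), encoding="utf-8")
    return patched


# Example call (the path is the one reported in this thread; yours may differ):
# patch_config("~/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json")
```

The function only touches values that begin with the given prefix, so unrelated settings in the file are left as they are.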
I made the recommended changes (edited ~/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json and changed /root/ to /home/myuser/), and now I get this error:
Any update on this? I just ran into this issue out of the box myself. It seems that it’s trying to download something to /root, which doesn’t work given that /root is only writable by root, not by a non-superuser/non-sudo user.
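To confirm the "do you have write access" diagnosis before running the model, a quick check along these lines may help. It is a minimal sketch using only the Python standard library; the helper names nearest_existing_ancestor and can_create are hypothetical and are not part of TTS or huggingface_hub. It walks up from the target path to the closest directory that already exists and asks whether the current user may create entries there, which is roughly the condition os.makedirs needs.

```python
import os


def nearest_existing_ancestor(path):
    """Return the closest ancestor of path (or path itself) that exists."""
    current = os.path.abspath(os.path.expanduser(path))
    # os.path.exists returns False instead of raising when a parent directory
    # is not accessible, so this loop is safe for paths such as /root/.local.
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def can_create(path):
    """True if the current user could create path, including missing parents."""
    ancestor = nearest_existing_ancestor(path)
    # Creating a directory entry needs write and search permission on the parent.
    return os.path.isdir(ancestor) and os.access(ancestor, os.W_OK | os.X_OK)


# Example calls (paths taken from this thread):
# can_create("/root/.local/share/tts/suno/bark_v0")
# can_create("~/.local/share/tts/suno/bark_v0")
```

If the check fails for the path in your config.json, the download will most likely hit the same PermissionError shown in the traceback.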