NeMo: Cannot run Mixer TTS colab with Mixer-TTS-X model
Hi all. I'm struggling to run a training with the Mixer-TTS-X model, starting from the tutorial that covers training both FastPitch and Mixer-TTS.
Modifications I've made:
```python
pretrained_model = "tts_en_lj_mixerttsx"
```
Adding the `raw_texts` argument when generating a spectrogram:

```python
spectrogram = spec_gen.generate_spectrogram(tokens=tokens, raw_texts=["Hey, this produces speech!"])
```
Correcting this import:

```python
from nemo.collections.tts.torch.data import MixerTTSXDataset
```
Just in case, these imports:

```python
from nemo.collections.tts.torch.tts_data_types import LMTokens
from transformers.models.albert.tokenization_albert import AlbertTokenizer
```
Adding an `lm_tokenizer` parameter here:

```python
def pre_calculate_supplementary_data(sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs):
```
Getting the right config file:

```shell
&& wget https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/tts/conf/mixer-tts-x.yaml
```
Creating the `lm_tokenizer` object:

```python
lm_tokenizer = LMTokens()
```
And after running this command:

```python
mixer_tts_sup_data_path = "mixer_tts_x_sup_data_folder"
sup_data_types = ["align_prior_matrix", "pitch", "lm_tokens"]

pitch_mean, pitch_std, pitch_min, pitch_max = pre_calculate_supplementary_data(
    mixer_tts_sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs
)
```
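For reference, the four pitch statistics this helper returns can be aggregated from the collected pitch values roughly like this (a minimal stand-alone sketch with dummy data; the function name and the zero-means-unvoiced convention are illustrative, not NeMo's exact implementation):

```python
import math

def pitch_stats(pitch_values):
    """Aggregate the pitch statistics the tutorial's helper reports."""
    # Skip unvoiced frames, conventionally stored as 0.0 in the pitch track
    voiced = [p for p in pitch_values if p > 0.0]
    mean = sum(voiced) / len(voiced)
    var = sum((p - mean) ** 2 for p in voiced) / len(voiced)
    return mean, math.sqrt(var), min(voiced), max(voiced)

# Dummy pitch track in Hz: two unvoiced frames and four voiced ones
pitch_mean, pitch_std, pitch_min, pitch_max = pitch_stats(
    [0.0, 110.0, 120.0, 130.0, 140.0, 0.0]
)
print(pitch_mean, pitch_min, pitch_max)  # 125.0 110.0 140.0
```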
I get the following error:

```
[NeMo I 2022-08-24 22:00:27 data:216] Loading dataset from tests/data/asr/an4_train.json.
30it [00:00, 712.41it/s][NeMo I 2022-08-24 22:00:27 data:253] Loaded dataset with 30 files.
[NeMo I 2022-08-24 22:00:27 data:255] Dataset contains 0.02 hours.
[NeMo I 2022-08-24 22:00:27 data:357] Pruned 0 files. Final dataset contains 30 files
[NeMo I 2022-08-24 22:00:27 data:360] Pruned 0.00 hours. Final dataset contains 0.02 hours.
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-21-b87416bfd412> in <module>
      3
      4 pitch_mean, pitch_std, pitch_min, pitch_max = pre_calculate_supplementary_data(
----> 5     mixer_tts_sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs
      6 )

3 frames
<ipython-input-20-7893d3034131> in pre_calculate_supplementary_data(sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs)
     22         text_normalizer=text_normalizer,
     23         lm_tokenizer=lm_tokenizer,
---> 24         text_normalizer_call_kwargs=text_normalizer_call_kwargs
     25     )
     26     stage2dl[stage] = torch.utils.data.DataLoader(ds, batch_size=1, collate_fn=ds._collate_fn, num_workers=1)

/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in __init__(self, **kwargs)
    759 class MixerTTSXDataset(TTSDataset):
    760     def __init__(self, **kwargs):
--> 761         super().__init__(**kwargs)
    762
    763     def _albert(self):

/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in __init__(self, manifest_filepath, sample_rate, text_tokenizer, tokens, text_normalizer, text_normalizer_call_kwargs, text_tokenizer_pad_id, sup_data_types, sup_data_path, max_duration, min_duration, ignore_file, trim, trim_ref, trim_top_db, trim_frame_length, trim_hop_length, n_fft, win_length, hop_length, window, n_mels, lowfreq, highfreq, **kwargs)
    323
    324         for data_type in self.sup_data_types:
--> 325             getattr(self, f"add_{data_type.name}")(**kwargs)
    326
    327     @staticmethod

/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in add_lm_tokens(self, **kwargs)
    785
    786     def add_lm_tokens(self, **kwargs):
--> 787         lm_model = kwargs.pop('lm_model')
    788
    789         if lm_model == "albert":

KeyError: 'lm_model'
```
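The traceback shows where the key comes from: `TTSDataset.__init__` dispatches each supplementary data type to an `add_<name>` method via `getattr`, and `add_lm_tokens` pops `lm_model` out of the shared kwargs. A minimal stand-alone reproduction of that dispatch pattern (plain Python, no NeMo imports; the class is illustrative, only the method names mirror the traceback):

```python
class MiniDataset:
    """Toy version of the dispatch in TTSDataset.__init__ (data.py line 325)."""

    def __init__(self, sup_data_types, **kwargs):
        for data_type in sup_data_types:
            # Same pattern as the traceback: add_align_prior_matrix, add_pitch, add_lm_tokens, ...
            getattr(self, f"add_{data_type}")(**kwargs)

    def add_lm_tokens(self, **kwargs):
        # Raises KeyError if the caller never supplied lm_model in kwargs
        self.lm_model = kwargs.pop("lm_model")

# Works when lm_model is forwarded down to the dataset kwargs:
ds = MiniDataset(["lm_tokens"], lm_model="albert")

# Reproduces the error above when it is not:
try:
    MiniDataset(["lm_tokens"])
except KeyError as e:
    print(e)  # 'lm_model'
```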
Any ideas are welcome. Thanks!
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 27
Oh, you should also be able to load the `r1.11.0` version of the tutorial, which has the correct path for that version of the repository.

Ahh, I see the problem now. #4690 was merged to `main`, while #4811 was merged as a bugfix to the `r1.11.0` branch. My apologies! In this case you will have to wait until the `r1.11.0` branch is merged again with `main`, which should be within the next few days. Alternatively, you could try cherry-picking the fix onto another branch if it is urgent.
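For the cherry-pick route, the generic git workflow looks like the sketch below, demonstrated on a throwaway repository (in the real case you would fetch the NeMo `r1.11.0` branch and cherry-pick the actual fix commit's SHA, which is not given in this thread):

```shell
# Build a toy repo with a "fix" commit on a side branch, then cherry-pick it.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git -c user.email=dev@example.com -c user.name=dev commit -q --allow-empty -m "base"
git checkout -q -b bugfix
echo "fixed" > fix.txt
git add fix.txt
git -c user.email=dev@example.com -c user.name=dev commit -q -m "fix: pop lm_model safely"
fix_sha=$(git rev-parse HEAD)
git checkout -q -                       # back to the original branch
git -c user.email=dev@example.com -c user.name=dev cherry-pick "$fix_sha"
ls fix.txt                              # the fix commit is now on this branch
```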
No problem!
I don’t think anyone on our team has tried it yet, but yes, you should be able to fine-tune as usual.