NeMo: Cannot run Mixer TTS colab with Mixer-TTS-X model
Hi all. I’m struggling to run training with the Mixer-TTS-X model. I’m using the tutorial that covers training both FastPitch and Mixer-TTS.
Modifications I’ve made:
pretrained_model = "tts_en_lj_mixerttsx"
Adding the ‘raw_texts’ argument when generating a spectrogram:
spectrogram = spec_gen.generate_spectrogram(tokens=tokens, raw_texts=["Hey, this produces speech!"])
Correcting this import:
from nemo.collections.tts.torch.data import MixerTTSXDataset
Just in case, adding these imports:
from nemo.collections.tts.torch.tts_data_types import LMTokens
from transformers.models.albert.tokenization_albert import AlbertTokenizer
Adding an lm_tokenizer parameter here:
def pre_calculate_supplementary_data(sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs)
Getting the right config file:
&& wget https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/tts/conf/mixer-tts-x.yaml
Creating the lm_tokenizer object:
lm_tokenizer = LMTokens()
And after running the command:
mixer_tts_sup_data_path = "mixer_tts_x_sup_data_folder"
sup_data_types = ["align_prior_matrix", "pitch", "lm_tokens"]
pitch_mean, pitch_std, pitch_min, pitch_max = pre_calculate_supplementary_data(
mixer_tts_sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs
)
I get the following error:
[NeMo I 2022-08-24 22:00:27 data:216] Loading dataset from tests/data/asr/an4_train.json.
30it [00:00, 712.41it/s][NeMo I 2022-08-24 22:00:27 data:253] Loaded dataset with 30 files.
[NeMo I 2022-08-24 22:00:27 data:255] Dataset contains 0.02 hours.
[NeMo I 2022-08-24 22:00:27 data:357] Pruned 0 files. Final dataset contains 30 files
[NeMo I 2022-08-24 22:00:27 data:360] Pruned 0.00 hours. Final dataset contains 0.02 hours.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-21-b87416bfd412> in <module>
3
4 pitch_mean, pitch_std, pitch_min, pitch_max = pre_calculate_supplementary_data(
----> 5 mixer_tts_sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs
6 )
3 frames
<ipython-input-20-7893d3034131> in pre_calculate_supplementary_data(sup_data_path, sup_data_types, text_tokenizer, text_normalizer, lm_tokenizer, text_normalizer_call_kwargs)
22 text_normalizer=text_normalizer,
23 lm_tokenizer=lm_tokenizer,
---> 24 text_normalizer_call_kwargs=text_normalizer_call_kwargs
25 )
26 stage2dl[stage] = torch.utils.data.DataLoader(ds, batch_size=1, collate_fn=ds._collate_fn, num_workers=1)
/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in __init__(self, **kwargs)
759 class MixerTTSXDataset(TTSDataset):
760 def __init__(self, **kwargs):
--> 761 super().__init__(**kwargs)
762
763 def _albert(self):
/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in __init__(self, manifest_filepath, sample_rate, text_tokenizer, tokens, text_normalizer, text_normalizer_call_kwargs, text_tokenizer_pad_id, sup_data_types, sup_data_path, max_duration, min_duration, ignore_file, trim, trim_ref, trim_top_db, trim_frame_length, trim_hop_length, n_fft, win_length, hop_length, window, n_mels, lowfreq, highfreq, **kwargs)
323
324 for data_type in self.sup_data_types:
--> 325 getattr(self, f"add_{data_type.name}")(**kwargs)
326
327 @staticmethod
/usr/local/lib/python3.7/dist-packages/nemo/collections/tts/torch/data.py in add_lm_tokens(self, **kwargs)
785
786 def add_lm_tokens(self, **kwargs):
--> 787 lm_model = kwargs.pop('lm_model')
788
789 if lm_model == "albert":
KeyError: 'lm_model'
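For what it's worth, the KeyError itself comes from the `kwargs.pop('lm_model')` call in `add_lm_tokens`: `dict.pop` with a single argument raises KeyError when the key is missing, so the dataset apparently expects an `lm_model` entry (e.g. "albert") somewhere in the kwargs it receives. A minimal sketch of the mechanism (the `ToyDataset` class and the `lm_model="albert"` value are illustrative stand-ins, not NeMo's actual API):

```python
# Minimal sketch of why the traceback ends in KeyError: 'lm_model'.
# ToyDataset is a stand-in, not NeMo's MixerTTSXDataset.

class ToyDataset:
    def __init__(self, **kwargs):
        # Mirrors add_lm_tokens: pop() with no default raises
        # KeyError if the caller never supplied 'lm_model'.
        self.lm_model = kwargs.pop("lm_model")

# Without the kwarg, construction fails exactly like the traceback:
try:
    ToyDataset()
except KeyError as e:
    print(f"KeyError: {e}")   # KeyError: 'lm_model'

# Passing it through makes construction succeed:
ds = ToyDataset(lm_model="albert")
print(ds.lm_model)            # albert
```

So one thing worth checking is whether the mixer-tts-x.yaml config defines an lm_model field for the dataset, and whether `pre_calculate_supplementary_data` actually forwards it into the MixerTTSXDataset kwargs alongside lm_tokenizer.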
Any ideas are welcome. Thanks!
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 27
Oh, you should also be able to load the r1.11.0 version of the tutorial, which has the correct path for that version of the repository.
Ahh, I see the problem now. #4690 was merged to main, while #4811 was merged as a bugfix to the r1.11.0 branch. My apologies! In this case you will have to wait until the r1.11.0 branch is merged again with main, which should be within the next few days. Alternatively, you could try cherry-picking the fix into another branch if it is urgent.
No problem!
I don’t think anyone on our team has tried it yet, but yes, you should be able to fine-tune as usual.