FastSpeech2: Tensor size mismatch when training

I encountered this error when trying to train a new model for a different language. The lexicon, TextGrid, and .lab files are from MFA. Can you please take a look at this issue, @ming024?

Traceback (most recent call last):
  File "train.py", line 198, in <module>
    main(args, configs)
  File "train.py", line 82, in main
    output = model(*(batch[2:]))
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/FastSpeech2/model/fastspeech2.py", line 91, in forward
    d_control,
  File "/fastspeech2/fs2-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastspeech2/FastSpeech2/model/modules.py", line 121, in forward
    x = x + pitch_embedding
RuntimeError: The size of tensor a (47) must match the size of tensor b (101) at non-singleton dimension 1
Training:   0%|                                                                 | 1/900000 [00:00<122:50:30,  2.04it/s]
Epoch 1:   0%|                                                                                 | 0/212 [00:00<?, ?it/s]

P.S.: I tried running the same train command a few times without changing anything; each time the sizes of tensor a and tensor b were different.
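
For context, this error usually means the per-utterance features written by preprocess.py (duration, pitch, energy) no longer line up with the phoneme sequences in the metadata, typically because the TextGrids or lexicon changed after preprocessing. A quick way to confirm is to compare the phoneme count in the metadata file with the lengths of the stored duration and pitch arrays for each utterance. The sketch below is only a diagnostic guess and assumes the ming024/FastSpeech2 preprocessed layout (preprocessed_data/<dataset>/train.txt plus duration/ and pitch/ subfolders) with phoneme-level pitch; the dataset name is a placeholder.

# Minimal consistency check, assuming the ming024/FastSpeech2 preprocessed
# layout: preprocessed_data/<dataset>/train.txt with lines of the form
# "basename|speaker|{PH1 PH2 ...}|raw text", and duration/ and pitch/ folders
# holding "<speaker>-duration-<basename>.npy" style files. "MyLanguage" is a
# placeholder; adjust paths and names to your setup.
import os
import numpy as np

preprocessed_dir = "preprocessed_data/MyLanguage"  # hypothetical dataset name

with open(os.path.join(preprocessed_dir, "train.txt"), encoding="utf-8") as f:
    for line in f:
        basename, speaker, phonemes, _ = line.strip().split("|")
        n_phones = len(phonemes.strip("{}").split(" "))
        duration = np.load(os.path.join(
            preprocessed_dir, "duration",
            "{}-duration-{}.npy".format(speaker, basename)))
        pitch = np.load(os.path.join(
            preprocessed_dir, "pitch",
            "{}-pitch-{}.npy".format(speaker, basename)))
        # With phoneme-level pitch (the repo default), all three should match.
        if not (n_phones == len(duration) == len(pitch)):
            print(basename, n_phones, len(duration), len(pitch))

Any utterance printed by this check will pad to a different length than its text in the collated batch, which is exactly the situation that surfaces as the size mismatch at "x = x + pitch_embedding".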


Most upvoted comments

The best solution is to run the "mfa train xxx" command to generate the TextGrid files again and then run preprocess.py (even running the "mfa align xxx" command with an alignment model trained on another dataset with the same lexicon may not work).
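
To act on that suggestion, one workable sequence (not the only one) is to regenerate the TextGrids with MFA, delete the stale feature folders so nothing computed from the old alignments survives, and then re-run the repository's preprocessing step. The sketch below assumes the ming024/FastSpeech2 folder layout and the preprocess.py invocation from its README; the dataset and config names are placeholders.

# Hypothetical clean-up and re-preprocessing step, assuming the
# ming024/FastSpeech2 layout; run it only after the new TextGrids from
# "mfa train xxx" are in place at the path configured in preprocess.yaml.
import shutil
import subprocess
from pathlib import Path

preprocessed_dir = Path("preprocessed_data/MyLanguage")  # placeholder name

# Remove features computed from the old TextGrids so they cannot be reused.
for feature in ("duration", "pitch", "energy", "mel"):
    shutil.rmtree(preprocessed_dir / feature, ignore_errors=True)

# Recompute durations, pitch, energy, and mels from the fresh alignments
# (config path is a placeholder; point it at your language's preprocess.yaml).
subprocess.run(
    ["python3", "preprocess.py", "config/MyLanguage/preprocess.yaml"],
    check=True,
)

After this, the consistency check from the earlier sketch should report no mismatched utterances before restarting train.py.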