mltu: train.py giving error on custom dataset

Hi! I am trying to fine-tune the wav2vec2 model from your “10_wav2vec2_torch” tutorial. As far as I can tell, my dataset follows the same format as the LJ Speech dataset you use as an example: a ‘wavs’ folder containing the audio files, and a ‘metadata.csv’ file with rows of pipe-separated transcriptions (a small sanity check of this layout is sketched after the log below). The train.py script runs successfully on the default LJ Speech dataset, but with my own dataset I get the following terminal output. Am I missing something?

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_v', 'wav2vec2.encoder.pos_conv_embed.conv.weight_g']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized because the shapes did not match:
- lm_head.bias: found shape torch.Size([32]) in the checkpoint and torch.Size([29]) in the model instantiated
- lm_head.weight: found shape torch.Size([32, 768]) in the checkpoint and torch.Size([29, 768]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Cuda Device Available.
INFO:WarmupCosineDecay:Epoch 1 - Learning Rate: 1e-08
  0%|                                                                                                                  | 0/18 [00:00<?, ?it/s]
/home/ee/anaconda3/lib/python3.9/site-packages/mltu/transformers.py:234: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  return padded_audios, np.array(label)
Epoch 1 - loss: 25.1576 - CER: 4.2681 - WER: 1.0000: 100%|████████████████████████████████████████████████████| 18/18 [00:08<00:00,  2.06it/s]
  0%|                                                                                                                   | 0/2 [00:00<?, ?it/s]
Exception in thread Thread-14:
Traceback (most recent call last):
  File "/home/ee/anaconda3/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/ee/anaconda3/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/torch/dataProvider.py", line 245, in worker_function
    result = self.function(data_index)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/dataProvider.py", line 287, in __getitem__
    batch_data, batch_annotations = batch_postprocessor(batch_data, batch_annotations)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/transformers.py", line 222, in __call__
    max_len = max([len(a) for a in audio])
ValueError: max() arg is an empty sequence

[the same traceback is raised, interleaved, by worker threads Thread-14 through Thread-23]
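
For reference, here is roughly how I sanity-check the layout described above (a sketch based on the LJ Speech convention of “file_id|transcription” rows; the dataset path is a placeholder and the exact mltu loading code may differ):

import os

dataset_path = "Datasets/my_dataset"  # placeholder path

with open(os.path.join(dataset_path, "metadata.csv"), encoding="utf-8") as f:
    for row_num, line in enumerate(f, start=1):
        parts = line.strip().split("|")
        if len(parts) < 2:
            print(f"row {row_num}: expected 'file_id|transcription', got {line.strip()!r}")
            continue
        # Each row's first field should name a wav file inside the 'wavs' folder
        wav_path = os.path.join(dataset_path, "wavs", parts[0] + ".wav")
        if not os.path.exists(wav_path):
            print(f"row {row_num}: missing audio file {wav_path}")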

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

I upgraded to mltu==1.1.7 and everything is working perfectly, for both small and large datasets, with the default batch size. Thank you for taking the time to fix the bug.

Thanks, there was another bug in my code: you received this error because of a small validation dataset. If you now pip install mltu==1.1.7, this should be solved. I appreciate that you brought these cases to my attention 😃
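
To illustrate the failure mode (a minimal sketch with illustrative names, not mltu's actual internals): when the validation split is smaller than the batch size, the batching logic can hand the padding step an empty list of audios, and max() over an empty sequence raises exactly the error in the log above.

def pad_audios(audio_batch):
    # Same pattern as the failing line in transformers.py:
    # max() over an empty batch raises ValueError.
    max_len = max([len(a) for a in audio_batch])
    return [a + [0] * (max_len - len(a)) for a in audio_batch]

pad_audios([[1, 2, 3], [4, 5]])  # fine: pads both to length 3
pad_audios([])                   # ValueError: max() arg is an empty sequence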