audio-super-res: ValueError: Shapes (4, 128, 128) and () are incompatible

Hi,

I know that the authors of this project have moved on, but I am still curious whether anyone has run into a similar issue with the shapes for the LSTM network.

I am only attempting to train the single-speaker model.

First, I had to fork the project and fix the Makefile so that it pulls the VCTK-Corpus dataset from a different source (http://www.udialogue.org/download/VCTK-Corpus.tar.gz).

Second, I had to downgrade joblib to 0.11 due to a bug in the most recent Python 2.7-compatible version.

This allowed me to download, make, and downsample the dataset using the following parameters (the default parameters in the Makefile simply did not work for me and produced empty .h5 files):

sca = 4
sr = 16000

tr_dim = 8192
tr_str = 4096

va_dim = 8192
va_str = 4096
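
For reference, here is a rough sketch of how the patch dimension and stride determine how many patches come out of one clip. It assumes patches are cut with a plain sliding window, which I have not confirmed is exactly what the repo's preprocessing does, and `count_patches` is a hypothetical helper name, not something from the project. One thing it shows is that a clip shorter than the patch dimension yields no patches at all.

```python
def count_patches(n_samples, dim, stride):
    """Count full windows of length `dim` taken every `stride` samples.

    Assumption: a plain sliding window with no padding, so any clip
    shorter than `dim` contributes zero patches.
    """
    if n_samples < dim:
        return 0
    return (n_samples - dim) // stride + 1


# A 3-second clip at sr = 16000 with tr_dim = 8192 and tr_str = 4096
print(count_patches(3 * 16000, 8192, 4096))
```

With the values above, the printed count is per clip; the total number of rows in the .h5 file would be the sum over all clips.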

Now, whenever I try to train using the following command (ignore the $output_name variables):

!python run.py train \
  --train ../data/vctk/speaker1/$train_output_name \
  --val ../data/vctk/speaker1/$val_output_name \
  -e 120 \
  --batch-size 64 \
  --lr 3e-4 \
  --logname singlespeaker \
  --model audiotfilm \
  --r 4 \
  --layers 4 \
  --piano false \
  --pool_size 8 \
  --strides 8 \
  --full true

I receive this error:

/content/gdrive/My Drive/audio-super-res/src
Using TensorFlow backend.
List of arrays in input file: [u'data', u'label']
Shape of X: (3328, 8192, 1)
Shape of Y: (3328, 8192, 1)
List of arrays in input file: [u'data', u'label']
Shape of X: (384, 8192, 1)
Shape of Y: (384, 8192, 1)
audiotfilm
building model...
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "run.py", line 172, in <module>
    main()
  File "run.py", line 169, in main
    args.func(args)
  File "run.py", line 119, in train
    model = get_model(args, n_dim, r, from_ckpt=False, train=True)
  File "run.py", line 155, in get_model
    strides=args.strides, opt_params=opt_params, log_prefix=args.logname)  
  File "/content/gdrive/My Drive/audio-super-res/src/models/audiotfilm.py", line 29, in __init__
    opt_params=opt_params, log_prefix=log_prefix)
  File "/content/gdrive/My Drive/audio-super-res/src/models/model.py", line 48, in __init__
    self.predictions = self.create_model(n_dim, r)
  File "/content/gdrive/My Drive/audio-super-res/src/models/audiotfilm.py", line 95, in create_model
    x_norm = _make_normalizer(x, nf, nb)
  File "/content/gdrive/My Drive/audio-super-res/src/models/audiotfilm.py", line 60, in _make_normalizer
    x_rnn = LSTM(output_dim = n_filters, return_sequences = True)(x_in_down)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 546, in __call__
    self.build(input_shapes[0])
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.py", line 784, in build
    self.W = K.concatenate([self.W_i, self.W_f, self.W_c, self.W_o])
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1427, in concatenate
    return tf.concat(axis, [to_dense(x) for x in tensors])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1254, in concat
    tensor_shape.scalar())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 1023, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (4, 128, 128) and () are incompatible

Has anyone seen something like this before? Thanks!
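
One possible reading of the traceback, offered as an assumption and not something confirmed in this thread: the Keras backend calls `tf.concat(axis, [...])` with the axis first, while TensorFlow 1.0 and later expect `tf.concat(values, axis)`. If so, the list of four 128x128 LSTM weight matrices would be treated as the axis, which has to be a scalar, and that would match "Shapes (4, 128, 128) and () are incompatible". A quick way to check for this kind of mismatch is to list the installed versions. The sketch below uses a hypothetical helper, `report_versions`, and the module list is only a guess at what matters for this project.

```python
def report_versions(module_names=("tensorflow", "keras", "joblib", "h5py", "librosa")):
    """Return {module name: version string}, or None for modules that are not installed."""
    versions = {}
    for name in module_names:
        try:
            module = __import__(name)
        except ImportError:
            versions[name] = None  # not installed in this environment
            continue
        # Most packages expose __version__; fall back to "unknown" otherwise.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `report_versions()` in the same notebook that runs `run.py` and comparing the Keras and TensorFlow entries against each other should show whether an old Keras is running on top of a newer TensorFlow.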

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 19 (6 by maintainers)

Most upvoted comments

Awesome, feel free to close the issue

Thanks for the suggestion! I’ve added a requirements.txt. I’ve also upgraded to librosa 0.8.0.

Thanks for your response, guys. Absolutely no worries about no longer maintaining this - I think it’s understood that you are working on new research now. Thank you so much for all of your hard work and a really interesting paper!

No idea. Very frustrating. I ended up using this version (https://github.com/leekh7411/Audio-Super-Resolution-Python3-TF), which actually ended up working for me.

Thanks, I’ll give it a look
