librosa: filters.mel: ValueError: operands could not be broadcast together with shapes (1,1025) (0,)

Description

I get an exception in filters.mel: ValueError: operands could not be broadcast together with shapes (1,201) (0,). It is raised at the line lower = -ramps[i] / fdiff[i], where fdiff is empty along its last axis: fdiff = array([], shape=(130, 0), dtype=float64).

The same problem was also reported here:

Steps/Code to Reproduce

Example:

      import numpy
      import librosa

      audio = numpy.array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                           -4.98423541e-05, -1.79219493e-04,  1.82329724e-04])
      sample_rate = 16000
      num_feature_filters =  40
      step_len = 0.01
      window_len =  0.025

      mfccs = librosa.feature.mfcc(
            audio, sr=sample_rate,
            n_mfcc=num_feature_filters,
            hop_length=int(step_len * sample_rate), n_fft=int(window_len * sample_rate))

Expected Results

The call returns the MFCCs without raising an exception. The same code has worked in the past.

Actual Results

Exception:

  File "/u/zeyer/setups/librispeech/2018-02-26--att/returnn/GeneratingDataset.py", line 814, in _get_audio_features_mfcc
    line: mfccs = librosa.feature.mfcc(
            audio, sr=sample_rate,
            n_mfcc=num_feature_filters,
            hop_length=int(step_len * sample_rate), n_fft=int(window_len * sample_rate))
    locals:
      mfccs = <not found>
      librosa = <local> <module 'librosa' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/__init__.py'>
      librosa.feature = <local> <module 'librosa.feature' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/__init__.py'>
      librosa.feature.mfcc = <local> <function mfcc at 0x7f98700e9488>
      audio = <local> array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                             -4.98423541e-05, -1.79219493e-04,  1.82329724e-04]), len = 22912
      sr = <not found>
      sample_rate = <local> 16000
      n_mfcc = <not found>
      num_feature_filters = <local> 40
      hop_length = <not found>
      int = <builtin> <class 'int'>
      step_len = <local> 0.01
      n_fft = <not found>
      window_len = <local> 0.025
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/spectral.py", line 1299, in mfcc
    line: S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
    locals:
      S = <local> None
      power_to_db = <global> <function power_to_db at 0x7f98700d1048>
      melspectrogram = <global> <function melspectrogram at 0x7f98700e9510>
      y = <local> array([ 2.78090321e-03,  3.32838898e-03,  4.34225125e-03, ...,
                         -4.98423541e-05, -1.79219493e-04,  1.82329724e-04]), len = 22912
      sr = <local> 16000
      kwargs = <local> {'hop_length': 160, 'n_fft': 400}
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/feature/spectral.py", line 1391, in melspectrogram
    line: mel_basis = filters.mel(sr, n_fft, **kwargs)
    locals:
      mel_basis = <not found>
      filters = <global> <module 'librosa.filters' from '/u/zeyer/.local/lib/python3.6/site-packages/librosa/filters.py'>
      filters.mel = <global> <function mel at 0x7f98701338c8>
      sr = <local> 16000
      n_fft = <local> 400
      kwargs = <local> {}
  File "/u/zeyer/.local/lib/python3.6/site-packages/librosa/filters.py", line 247, in mel
    line: lower = -ramps[i] / fdiff[i]
    locals:
      lower = <not found>
      ramps = <local> array([[[ 0.00000000e+00, -4.00000000e+01, -8.00000000e+01, ...,
...                             [[ 4.67655199e+01,  6.76551987e+00..., len = 130, _[0]: {len = 1, _[0]: {len = 201}}
      i = <local> 0
      fdiff = <local> array([], shape=(130, 0), dtype=float64), len = 130, _[0]: {len = 0}
ValueError: operands could not be broadcast together with shapes (1,201) (0,)

Versions

In [6]: librosa.version.show_versions()
INSTALLED VERSIONS
------------------
python: 3.6.3 (default, Oct 25 2017, 11:03:15) 
[GCC 5.4.0 20160609]

librosa: 0.5.1

audioread: 2.1.5
numpy: 1.16.0
scipy: 1.1.0
scikit-learn: None
joblib: 0.11
decorator: 4.3.0
six: 1.11.0
resampy: 0.2.0

numpydoc: None
sphinx: None
sphinx_rtd_theme: None
sphinxcontrib-versioning: None
matplotlib: 2.1.0
numba: 0.35.0

>>> import platform; print(platform.platform())
Linux-4.4.0-53-generic-x86_64-with-debian-stretch-sid
>>> import sys; print("Python", sys.version)
Python 3.6.3 (default, Oct 25 2017, 11:03:15) 
[GCC 5.4.0 20160609]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.16.0
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.1.0
>>> import librosa; print("librosa", librosa.__version__)
librosa 0.5.1

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

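For documentation: NumPy 1.16 changed linspace to take the shape of its start and stop inputs into account. This triggers the bug in the old hz_to_mel, because that function always returned an array with at least one dimension, so linspace now receives array endpoints instead of scalars.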

NumPy 1.16 release notes: https://docs.scipy.org/doc/numpy/release.html

Start and stop arrays for linspace, logspace and geomspace

These functions used to be limited to scalar stop and start values, but can now take arrays, which will be properly broadcast and result in an output which has one axis prepended. This can be used, e.g., to obtain linearly interpolated points between sets of points.
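
The shapes in the traceback can be reproduced with NumPy alone. The following is a minimal sketch, not the librosa source: the endpoint values and the names min_mel, max_mel, mel_pts and fftfreqs are illustrative assumptions, and only the shapes (130 points, 201 FFT bins) are taken from the report. It assumes NumPy 1.16 or newer.

```python
import numpy as np

# Assumption: stand-ins for what the old hz_to_mel returned, i.e. arrays
# with at least one dimension instead of plain scalars. Values are made up.
min_mel = np.atleast_1d(0.0)
max_mel = np.atleast_1d(45.0)

# With array endpoints, NumPy >= 1.16 prepends an axis to the result,
# so this is 2-D rather than a flat vector of 130 points.
mel_pts = np.linspace(min_mel, max_mel, 130)
print(mel_pts.shape)

# np.diff works along the last axis, which has length 1 here,
# so the differences are empty along that axis.
fdiff = np.diff(mel_pts)
print(fdiff.shape)

# 201 frequency bins, matching n_fft=400 in the report.
fftfreqs = np.linspace(0.0, 8000.0, 201)
ramps = np.subtract.outer(mel_pts, fftfreqs)
print(ramps.shape)

# Same expression as the failing line in the traceback.
raised = False
try:
    lower = -ramps[0] / fdiff[0]
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Passing scalar endpoints avoids the extra axis.
flat_pts = np.linspace(float(min_mel[0]), float(max_mel[0]), 130)
print(flat_pts.shape, np.diff(flat_pts).shape)
```

With scalar endpoints the points stay one-dimensional and the differences are non-empty, which is consistent with the error disappearing once hz_to_mel no longer forces its result to be an array.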

Ok, after updating to librosa 0.6.2, this issue seems to be gone. So maybe this can be closed. But I guess it’s useful to have this as a reference for others who stumble upon this exception.

At least I was able to run it with numpy==1.14.3 (see https://github.com/mozilla/TTS/blob/master/requirements.txt).
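
The two workarounds mentioned in the comments (upgrading librosa, or staying on an older NumPy) can be turned into a rough version check. This is only a sketch: the helper names version_tuple and may_hit_linspace_bug are hypothetical, and the thresholds are assumptions drawn from this thread (NumPy 1.16 introduced the linspace change; librosa 0.6.2 was reported to work). The exact librosa release that contains the fix is not stated here.

```python
def version_tuple(version):
    """Turn a string like '1.16.0' into (1, 16, 0).

    Non-numeric suffixes within a component (e.g. '0rc1') are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def may_hit_linspace_bug(numpy_version, librosa_version):
    """Heuristic only: True for combinations that match the failing report.

    Assumption: NumPy >= 1.16 together with librosa older than 0.6.2.
    """
    new_numpy = version_tuple(numpy_version) >= (1, 16)
    old_librosa = version_tuple(librosa_version) < (0, 6, 2)
    return new_numpy and old_librosa


# The combination from the report, then the two workarounds.
print(may_hit_linspace_bug("1.16.0", "0.5.1"))
print(may_hit_linspace_bug("1.14.3", "0.5.1"))
print(may_hit_linspace_bug("1.16.0", "0.6.2"))
```

In a real setup the two strings would come from numpy.__version__ and librosa.__version__.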