librosa: reassigned_spectrogram docstring example is slow and uninformative

Description

This issue comes in the wake of PR #926, which closed #394.

First of all, let me express how happy I am that time-frequency reassignment has finally made its way into a wide-audience library such as librosa. Huge thanks to @scjs for his work.

That being said, the current docstring example for reassigned_spectrogram technically works, but it’s built upon a 2-second excerpt of Vibe Ace at a 22050 Hz sample rate. The result is not very compelling in terms of the capabilities of time-frequency reassignment. In particular, it’s not at all clear that time-frequency reassignment alleviates the Heisenberg time-frequency tradeoff. At first glance, what happens looks more like a threshold-based sparsification of STFT magnitudes. I’m afraid we might convey the wrong message with this example.
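To make the point about going beyond the bin grid concrete, here is a NumPy-only sketch of frequency reassignment on a single frame. It is a sketch under stated assumptions, not librosa’s implementation: I assume a periodic Hann window with its analytic derivative, a pure tone at an arbitrary 460 Hz, and the usual correction term -Im(X_dh / X_h). All variable names are illustrative.

```python
import numpy as np

sr = 4000      # sample rate in Hz
n_fft = 64     # frame length; bin spacing is sr / n_fft = 62.5 Hz
f0 = 460.0     # true tone frequency (assumed value, deliberately off-bin)

n = np.arange(n_fft)
frame = np.cos(2 * np.pi * f0 * n / sr)

# Periodic Hann window h and its analytic derivative dh (per sample)
h = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)
dh = (np.pi / n_fft) * np.sin(2 * np.pi * n / n_fft)

# Spectra of the same frame under the window and under its derivative
X_h = np.fft.rfft(frame * h)
X_dh = np.fft.rfft(frame * dh)

bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
k = int(np.argmax(np.abs(X_h)))  # bin with the largest magnitude

# Frequency reassignment: move the bin centre by -Im(X_dh / X_h),
# converted from radians per sample to Hz
f_bin = bin_freqs[k]
f_reassigned = f_bin - np.imag(X_dh[k] / X_h[k]) * sr / (2 * np.pi)

print(f"nearest bin: {f_bin:.1f} Hz, reassigned: {f_reassigned:.1f} Hz")
```

With a 64-sample frame at 4000 Hz, the nearest bin centre is more than 20 Hz away from the tone, whereas the reassigned estimate should land close to the true frequency. That is the behaviour I would like the docstring figure to make visible.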

Furthermore, this example is quite slow: ~5 seconds on my laptop, including display in a Jupyter notebook.

Please let me know if you want me to make a PR.

Steps/Code to Reproduce

import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Load the bundled example track and keep a 2-second excerpt
y, sr = librosa.load(librosa.util.example_audio_file())
y_zoom = y[27 * sr : 29 * sr]
freqs, times, mags = librosa.reassigned_spectrogram(
    y=y_zoom, sr=sr, hop_length=16, n_fft=64, ref_power=1e-4
)
db = librosa.amplitude_to_db(mags, ref=np.max)

# Top panel: magnitudes on the regular STFT grid
plt.figure()
plt.subplot(2, 1, 1)
librosa.display.specshow(
    db, x_axis="s", y_axis="linear", sr=sr, hop_length=16
)
plt.title("Spectrogram")

# Bottom panel: the same magnitudes at their reassigned coordinates
plt.subplot(2, 1, 2)
plt.scatter(times, freqs, c=db, s=0.1, cmap="magma")
plt.title("Reassigned spectrogram")
plt.xlim([0, 2])
plt.xticks([0, 0.5, 1, 1.5, 2])
plt.ylabel("Hz")
plt.subplots_adjust(
    left=0.1, bottom=0.05, right=0.95, top=0.95, hspace=0.5
)
plt.show()

Actual Results

[Image: reassign_vibe_ace]

Expected Results

I would prefer to see a faster and more pedagogical example, even if it’s synthetic. For example, this:

[Image: reassigned_synth]

Runs in 75 milliseconds on my laptop.

Code:

import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Synthetic test signal: two tone bursts, a linear chirp, and low-level noise.
# Note: 2400 Hz is above Nyquist (2000 Hz) at sr=4000, so the first burst
# aliases to 1600 Hz.
sr = 4000
y = (
    2e-1 * librosa.clicks(times=[0.1], sr=sr, click_duration=1.0, click_freq=2400.0, length=8000)
    + 1e-1 * librosa.clicks(times=[1.2], sr=sr, click_duration=0.5, click_freq=400.0, length=8000)
    + 1e-2 * librosa.chirp(fmin=300, fmax=1600, sr=sr, duration=2.0, linear=True)
    + 1e-3 * np.random.randn(2 * sr)
)

freqs, times, mags = librosa.reassigned_spectrogram(
    y=y, sr=sr, n_fft=64, ref_power=1e-2)
db = librosa.amplitude_to_db(mags, ref=np.max)

# Top panel: magnitudes on the regular STFT grid
plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
librosa.display.specshow(
    db, x_axis="s", y_axis="linear", sr=sr, hop_length=16)
plt.title("Spectrogram")
plt.gca().set_xticklabels([])
plt.gca().set_xlabel(None)

# Bottom panel: the same magnitudes at their reassigned coordinates
plt.subplot(2, 1, 2)
plt.scatter(times, freqs, c=db, s=0.1, cmap="magma")
plt.title("Reassigned spectrogram")
plt.xlabel("Time (s)")
plt.xlim(0, 2)
plt.xticks([0, 0.5, 1, 1.5, 2])
plt.ylabel("Hz")
plt.ylim(0, 2000)
plt.show()
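For anyone who wants to check the timing on their own machine, a helper along these lines could be used. It is only a sketch: `time_call` and `run_example` are hypothetical names, the workload shown is a placeholder to be swapped for the example above, and it measures plain wall-clock time without any Jupyter display overhead.

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def run_example():
    # Placeholder workload; replace the body with the example to be timed
    sum(i * i for i in range(10_000))


elapsed = time_call(run_example)
print(f"best of 5: {elapsed * 1e3:.1f} ms")
```

Taking the best of several runs reduces the influence of one-off costs such as imports and cache warm-up, which matters when comparing a figure in milliseconds against one in seconds.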

Versions

latest master (0.7.1-dev)

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 15 (15 by maintainers)

Most upvoted comments

import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Synthetic test signal: two tone bursts, a chirp, and very low-level noise
sr = 4000
amin = 1e-10
y = (
    1e-3 * librosa.clicks(times=[0.3], sr=sr, click_duration=1.0,
                          click_freq=1200.0, length=8000)
    + 1e-3 * librosa.clicks(times=[1.5], sr=sr, click_duration=0.5,
                            click_freq=400.0, length=8000)
    + 1e-3 * librosa.chirp(fmin=200, fmax=1600, sr=sr, duration=2.0)
    + 1e-6 * np.random.randn(2 * sr)
)

freqs, times, mags = librosa.reassigned_spectrogram(y=y, sr=sr, n_fft=64)
mags_db = librosa.power_to_db(mags, amin=amin)
# Constant-valued image drawn behind the scatter plot as a background
background = np.zeros_like(mags) - 10 * np.log10(amin)

# Top panel: magnitudes on the regular STFT grid
plt.subplot(2, 1, 1)
librosa.display.specshow(mags_db, x_axis="s", y_axis="linear", sr=sr,
                         hop_length=16, cmap="gray_r")
plt.title("Spectrogram")
plt.gca().set_xticklabels([])
plt.gca().set_xlabel(None)

# Bottom panel: the same magnitudes at their reassigned coordinates
plt.subplot(2, 1, 2)
librosa.display.specshow(background, x_axis="s", y_axis="linear",
                         cmap="gray_r", sr=sr, hop_length=16)
plt.scatter(times, freqs, c=mags_db, alpha=0.05, cmap="gray_r")
plt.clim(-100, 0)
plt.title("Reassigned spectrogram")
plt.show()

[Image: reassign_synth_v-6]

With a lower SNR (10 dB instead of 30 dB), we begin to see the limitations of the method, which is a good thing IMO.

[Image: reassign_synth_v-5]

But I also think that the most important thing to show is how spectral leakage disappears. So, on second thought, I think it’s preferable to keep the noise level rather low.
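In case it helps when tuning the noise level, here is a rough sketch of how the noise scale could be derived from a target SNR. It assumes SNR is defined as the ratio of mean signal power to mean noise power; `noise_std_for_snr` is a hypothetical helper, and the single sine is only a stand-in for the synthetic mix, so the numbers will not match the figures above exactly.

```python
import numpy as np


def noise_std_for_snr(signal, snr_db):
    """Standard deviation of white noise giving `snr_db` relative to `signal`."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return np.sqrt(noise_power)


rng = np.random.default_rng(0)
sr = 4000
t = np.arange(2 * sr) / sr
clean = 1e-3 * np.sin(2 * np.pi * 400.0 * t)  # stand-in for the synthetic mix

# Scale white Gaussian noise so that the mix sits at roughly 30 dB SNR
sigma = noise_std_for_snr(clean, snr_db=30.0)
noisy = clean + sigma * rng.standard_normal(clean.shape)

# Sanity check: measure the SNR that was actually achieved
measured_snr_db = 10 * np.log10(
    np.mean(clean ** 2) / np.mean((noisy - clean) ** 2)
)
print(f"measured SNR: {measured_snr_db:.1f} dB")
```

Setting the noise this way would make the SNR an explicit parameter of the docstring example rather than a hand-picked multiplier.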