scipy: scipy.signal.signaltools._fftconv_faster has incorrect estimates

My issue is about the function scipy.signal.signaltools._fftconv_faster. The function contains a logical error that causes most convolutions to use the FFT method even when the direct method would be faster by a large factor.

Code example showing that _fftconv_faster reports the FFT method as faster even when convolving with a scalar (length-1) kernel, provided the signal is large enough:

>>> import numpy as np
>>> from scipy.signal.signaltools import _fftconv_faster
>>> _fftconv_faster(np.empty(1), np.empty(1000000), "full")
True

Here is the timing impact on a Lenovo Yoga 920 (8th-gen Intel Core i7, 16 GB RAM) running Ubuntu 18.04:

>>> import numpy as np
>>> from scipy.signal import convolve
>>> h = np.random.randn(1)
>>> x = np.random.randn(1000000)
>>> %timeit convolve(x, h, method="direct")
714 µs ± 14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit convolve(x, h)
70.2 ms ± 1.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

The automatic run, which chooses the FFT method, is slower by a factor of roughly 100.

The root cause is the incorrect estimate of the number of multiplications in the direct convolution algorithm. This is line 506 in v1.3.1, and line 497 of the latest commit [8a3e9fa] as of writing this issue:

direct_time = (x.size * h.size * _prod(out_shape))

The correct direct computation time is actually:

direct_time = min(x.size, h.size) * _prod(out_shape)
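
For scale, plugging the sizes from the timing example above into both formulas (illustrative arithmetic only, not code from SciPy; the variable names are mine) shows the current estimate overstates the direct cost by a factor of one million, which is what pushes the heuristic onto the FFT path:

>>> x_size, h_size = 10**6, 1        # sizes from the timing example above
>>> out_size = x_size + h_size - 1   # output length for mode="full"
>>> x_size * h_size * out_size       # current (incorrect) estimate
1000000000000
>>> min(x_size, h_size) * out_size   # corrected estimate
1000000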

The big-O constants, which are in the tens of thousands, were probably calibrated as a ratio against the incorrect estimate, so a fix to the direct estimate will need to be accompanied by a recalculation of those constants.
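
To illustrate that dependence, here is a simplified sketch of such a decision heuristic (this is not the actual SciPy code; the function name, argument names, and the N log N FFT cost model are assumptions for illustration). Because the constant rescales the FFT cost model so it is comparable to the direct cost model, changing the direct formula changes what the constant has to be:

import math

def fft_is_faster(x_size, h_size, out_size, big_o_constant):
    # Hypothetical sketch: compare a constant-scaled O(N log N) FFT cost
    # model against the (corrected) direct cost model. The constant only
    # has meaning relative to the direct formula it was fitted against.
    fft_cost = sum(n * math.log(n) for n in (x_size + h_size, out_size))
    direct_cost = min(x_size, h_size) * out_size
    return big_o_constant * fft_cost < direct_cost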

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 18 (10 by maintainers)

Most upvoted comments

The formula I gave is based directly on the C code and is exactly the number of multiplications.

How did you determine that?

Why should it be min(x.size, h.size) instead of x.size * h.size?

For each element in the output, you calculate sum(a * b) where a and b are the appropriately shifted input arrays. That operation has linear time complexity, not quadratic.
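
A minimal pure-Python sketch of direct "full" convolution (my own illustration, not SciPy's C implementation; the function name is hypothetical) makes that explicit: the inner sum only runs over the overlap between the two arrays, so each of the len(x) + len(h) - 1 output elements costs at most min(len(x), len(h)) multiplications.

import numpy as np

def direct_conv_full(x, h):
    # Naive direct convolution, mode="full". Each output element is a
    # dot product over the overlap of x with the shifted, flipped kernel,
    # i.e. at most min(len(x), len(h)) multiplications per element.
    x, h = np.asarray(x, dtype=float), np.asarray(h, dtype=float)
    n, k = len(x), len(h)
    out = np.zeros(n + k - 1)
    for i in range(n + k - 1):
        lo, hi = max(0, i - k + 1), min(n, i + 1)   # overlap bounds
        out[i] = sum(x[j] * h[i - j] for j in range(lo, hi))
    return out

>>> np.allclose(direct_conv_full([1, 2, 3], [0, 1]), np.convolve([1, 2, 3], [0, 1]))
True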

If correct, I presume this will have to be the minimum along each dimension.

In theory, I think that’s right. However, thinking about this again, it might actually be the maximum along each dimension because of https://github.com/scipy/scipy/issues/5967#issuecomment-519861440.

Thanks for this issue – I’ll likely use this over in stsievert/scipy-convolve-timings.

The correct direct computation time is actually: direct_time = min(x.size, h.size) * _prod(out_shape)

Why should it be min(x.size, h.size) instead of x.size * h.size? If correct, I presume this will have to be the minimum along each dimension.