astropy: Knuth’s rule fails with simple and small array (eats up system's memory)
Description
The knuth_bin_width function is not able to handle a small, simple array.
Expected behavior
A histogram should be generated, or an error shown explaining why it was not possible to obtain it.
Actual behavior
The function starts to gobble up the system’s memory.
Steps to Reproduce
import numpy as np
import matplotlib.pyplot as plt
from astropy.visualization import hist
arr = np.array([0.05555556, 0., 0., 0., 0., 1., 0., 0., 0., 0.5])
ax = plt.subplot(111)
hist(arr, bins='knuth', ax=ax)
System Details
Linux-5.5.0-050500-generic-x86_64-with-glibc2.10
>>> Python 3.8.8 (default, Feb 24 2021, 21:46:12)
[GCC 7.3.0]
>>> Numpy 1.19.2
>>> astropy 4.2
>>> Scipy 1.5.2
>>> Matplotlib 3.3.1
About this issue
- Original URL
- State: open
- Created 3 years ago
- Comments: 15 (15 by maintainers)
@Gabriel-p Even if that is the case, I think it’s better than the current code, which tries to optimize a highly nonconvex function that may have no minimum. A grid search with a sensible upper bound on the number of bins seems like a reasonable solution. What do you think?
PS: The max number of bins could even be an input parameter with some sensible default.
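To make the grid-search idea concrete, here is a rough sketch. It is not astropy's implementation: the names knuth_log_posterior, knuth_bins_grid and max_bins are made up for illustration, and it uses only the standard library so it can be run without astropy. It evaluates Knuth's log posterior for every bin count from 1 to max_bins and keeps the best one, so the search always terminates.

```python
import math


def knuth_log_posterior(data, m):
    """Knuth's relative log posterior for m equal-width bins (illustrative helper)."""
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / m
    counts = [0] * m
    for x in data:
        # Clamp the index so the maximum value lands in the last bin.
        k = min(int((x - lo) / width), m - 1)
        counts[k] += 1
    return (n * math.log(m)
            + math.lgamma(0.5 * m)
            - m * math.lgamma(0.5)
            - math.lgamma(n + 0.5 * m)
            + sum(math.lgamma(c + 0.5) for c in counts))


def knuth_bins_grid(data, max_bins=100):
    """Return the bin count in 1..max_bins with the highest log posterior.

    max_bins is a hypothetical parameter, not an existing astropy argument.
    """
    if min(data) == max(data):
        raise ValueError("all values are identical; cannot choose a bin width")
    return max(range(1, max_bins + 1),
               key=lambda m: knuth_log_posterior(data, m))


# The array from the report; the search is bounded by max_bins.
arr = [0.05555556, 0., 0., 0., 0., 1., 0., 0., 0., 0.5]
best = knuth_bins_grid(arr, max_bins=50)
print(best, (max(arr) - min(arr)) / best)  # bin count and matching bin width
```

The cost is linear in max_bins, and a degenerate input raises an error instead of running indefinitely. Whether a fixed default such as 100 is sensible for real data sets is an open question.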
Because the issue happens inside the optimize call. That’s where M grows without bound.
You can get 4.3.dev by installing the development version from master.
It does take a while to compute. Didn’t eat up all my machine’s memory, so the eating does stop at some point. I think this might actually be a bug in: https://github.com/astropy/astropy/blob/6fd98e528ee59d2b7d9b932946a39e199221360d/astropy/stats/histogram.py#L16
cc @larrybradley