bottleneck: Leaks memory when input is not a numpy array
If you run the following program, you will see that nansum leaks all the memory it is given when passed a Pandas object. If it is passed the ndarray underlying the Pandas object instead, there is no leak:
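import gc
import os

import bottleneck
import numpy as np
import pandas as pd
import psutil

def f():
    x = np.zeros(10*1024*1024, dtype='f4')
    # Leaks 40MB/iteration
    bottleneck.nansum(pd.Series(x))
    # No leak:
    #bottleneck.nansum(x)

process = psutil.Process(os.getpid())

def _get_usage():
    # Collect garbage first so only memory that is truly retained is counted
    gc.collect()
    # `private` is a Windows-specific field of psutil's memory_info()
    return process.memory_info().private / (1024*1024)

last_usage = _get_usage()
print(last_usage)
for _ in range(10):
    f()
    usage = _get_usage()
    # Growth in MB since the previous iteration
    print(usage - last_usage)
    last_usage = usage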
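The script above reads memory_info().private, which psutil only exposes on Windows. As a rough cross-platform alternative, here is a minimal sketch that measures retained memory with the standard-library tracemalloc module instead. The helper name traced_growth and the leaky/clean functions are hypothetical and exist only for illustration. Whether tracemalloc would catch this particular leak is an assumption: it depends on NumPy reporting its buffer allocations to tracemalloc, which has not been verified against this bug.

```python
import gc
import tracemalloc


def traced_growth(func, iterations=5):
    """Return the net number of traced bytes retained after calling func repeatedly."""
    was_tracing = tracemalloc.is_tracing()
    if not was_tracing:
        tracemalloc.start()
    try:
        func()  # warm-up call so one-time caches are not counted as a leak
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        if not was_tracing:
            tracemalloc.stop()
    return after - before


# Hypothetical stand-ins for a leaking and a non-leaking call.
_retained = []


def leaky():
    # Keeps every 1 MiB buffer alive, so memory grows on each call
    _retained.append(bytearray(1024 * 1024))


def clean():
    # The buffer is released as soon as the function returns
    bytearray(1024 * 1024)


if __name__ == "__main__":
    print("leaky growth (bytes):", traced_growth(leaky))
    print("clean growth (bytes):", traced_growth(clean))
```

To try it against the report, one could pass a function such as f from the script above to traced_growth and compare the Series and ndarray variants.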
This affects not just nansum, but apparently all the reduction functions (with or without axis specified), and at least some other functions like move_max.
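Since the report says the leak does not occur when the underlying ndarray is passed, one possible stopgap is to do the conversion in Python before calling into Bottleneck. The sketch below is only an illustration: call_with_ndarray is a hypothetical helper, not part of Bottleneck, and it assumes that np.asarray on a Pandas object yields an array that Bottleneck treats like any other ndarray.

```python
import numpy as np


def call_with_ndarray(func, obj, *args, **kwargs):
    """Convert obj to an ndarray in Python, then call func on the result.

    The converted array is owned by an ordinary Python reference, so it is
    released normally once this function returns.
    """
    arr = np.asarray(obj)
    return func(arr, *args, **kwargs)


if __name__ == "__main__":
    # np.nansum stands in for bottleneck.nansum so the sketch runs without
    # Bottleneck installed; the intended use would be, for example:
    #     call_with_ndarray(bottleneck.nansum, pd.Series(x))
    print(call_with_ndarray(np.nansum, [1.0, float("nan"), 2.0]))
```

The extra keyword arguments are forwarded unchanged, so something like axis=0 or a window size for the moving functions can be passed through the same helper.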
I’m not completely sure why this happens, but maybe it’s because PyArray_FROM_O is allocating a new array in this case, and the ref count of that array is never decremented? https://github.com/kwgoodman/bottleneck/blob/master/bottleneck/src/reduce_template.c#L1237
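To illustrate the kind of bug suspected here, the following CPython-only sketch mimics a reference that is taken and never released. It is not Bottleneck's code: the Buffer class is a hypothetical stand-in for the temporary array, and ctypes is used only to reach Py_IncRef and Py_DecRef from Python. An object with one unbalanced incref survives even after every Python name for it is gone.

```python
import ctypes
import gc
import weakref

# CPython-only: expose the C-level reference count helpers to Python.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Buffer:
    """Hypothetical stand-in for a temporary array created during conversion."""


def survives_unbalanced_incref():
    """Return True if an object outlives its names after an unmatched incref."""
    buf = Buffer()
    probe = weakref.ref(buf)
    _incref(buf)  # mimic C code that acquires a new reference...
    del buf       # ...and never releases it
    gc.collect()
    leaked = probe() is not None
    # Release the extra reference so this demo does not leak for real.
    _decref(probe())
    return leaked


def freed_with_balanced_refs():
    """Return True if the object is freed when every incref has a decref."""
    buf = Buffer()
    probe = weakref.ref(buf)
    _incref(buf)
    _decref(buf)  # the matching release that the leaking path would be missing
    del buf
    gc.collect()
    return probe() is None


if __name__ == "__main__":
    print("unbalanced incref keeps object alive:", survives_unbalanced_incref())
    print("balanced refs free the object:", freed_with_balanced_refs())
```

If the hypothesis is right, the fix on the C side would be the analogue of the matching _decref call: releasing the converted array once the reduction has finished with it.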
I’m using Bottleneck 1.2.1 with Pandas 0.23.1. sys.version is 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)].
About this issue
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 15 (3 by maintainers)
Commits related to this issue
- fix memory leak #201 but only for reduce functions — committed to pydata/bottleneck by kwgoodman 5 years ago
- refactor bugfix #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
- plug memory leak when raising exception #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
- refactor bugfix #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
- remove superfluous Py_DECREF #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
- refactor bugfix #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
- fix memory leak #201 — committed to pydata/bottleneck by kwgoodman 5 years ago
OK, I merged the memory leak fix into master.