catboost: Potential memory leak in catboost.Pool
Problem: Memory leak in long-running applications using catboost
catboost version: 0.26.1
Operating System: MacOS Catalina
CPU: 2.6 GHz 6-Core Intel Core i7
The issue of Pool objects not releasing memory was already discussed in https://github.com/catboost/catboost/issues/892 and explained in https://github.com/catboost/catboost/issues/892#issuecomment-583037773.
We are using a slightly modified example from the comment above:
import gc
import os
import sys

import catboost as cb
import numpy as np
import psutil


def memory_footprint():
    """Returns memory (in MB) being used by Python process"""
    mem = psutil.Process(os.getpid()).memory_info().rss
    return mem / 1024 ** 2


def main(batch_size=15, n_iterations=100, print_every=10, cleanup_every=None):
    print("python version=", sys.version)
    print("numpy version=", np.__version__)
    print("catboost version=", cb.__version__)
    features = [[1, 1, 0, 0.5, 0.33]] * batch_size
    cat_indices = [0, 1, 2]
    for i in range(n_iterations):
        if i % print_every == 0:
            print("Memory usage (iter {}): {:.2f} MB".format(i, memory_footprint()))
        features_pool = cb.Pool(features, cat_features=cat_indices)
        if cleanup_every and (i % cleanup_every == 0):
            del features_pool
            gc.collect()
When running a small number of iterations with a large batch size, there seems to be no issue (memory reaches a plateau after a few iterations):
main(batch_size=1_500_000, n_iterations=15, print_every=1, cleanup_every=1)
python version= 3.8.7 (default, Mar 4 2021, 17:04:03)
[Clang 12.0.0 (clang-1200.0.32.29)]
numpy version= 1.21.2
catboost version= 0.26.1
Memory usage (iter 0): 87.73 MB
Memory usage (iter 1): 120.77 MB
Memory usage (iter 2): 120.80 MB
Memory usage (iter 3): 120.72 MB
Memory usage (iter 4): 120.75 MB
Memory usage (iter 5): 120.79 MB
Memory usage (iter 6): 120.82 MB
Memory usage (iter 7): 120.82 MB
Memory usage (iter 8): 120.85 MB
Memory usage (iter 9): 120.85 MB
Memory usage (iter 10): 120.85 MB
Memory usage (iter 11): 120.85 MB
Memory usage (iter 12): 120.85 MB
Memory usage (iter 13): 120.85 MB
Memory usage (iter 14): 120.85 MB
However, if we reduce the batch size and let it run for a long time, memory keeps increasing and has still not plateaued after 1.4 million iterations:
main(batch_size=15, n_iterations=1_500_000, print_every=50000, cleanup_every=1000)
python version= 3.8.7 (default, Mar 4 2021, 17:04:03)
[Clang 12.0.0 (clang-1200.0.32.29)]
numpy version= 1.21.2
catboost version= 0.26.1
Memory usage (iter 0): 76.31 MB
Memory usage (iter 50000): 86.60 MB
Memory usage (iter 100000): 95.76 MB
Memory usage (iter 150000): 106.44 MB
Memory usage (iter 200000): 114.22 MB
Memory usage (iter 250000): 127.86 MB
Memory usage (iter 300000): 135.63 MB
Memory usage (iter 350000): 143.37 MB
Memory usage (iter 400000): 163.04 MB
Memory usage (iter 450000): 170.80 MB
Memory usage (iter 500000): 178.55 MB
Memory usage (iter 550000): 186.30 MB
Memory usage (iter 600000): 194.05 MB
Memory usage (iter 650000): 201.80 MB
Memory usage (iter 700000): 209.55 MB
Memory usage (iter 750000): 217.31 MB
Memory usage (iter 800000): 249.01 MB
Memory usage (iter 850000): 256.75 MB
Memory usage (iter 900000): 201.80 MB
Memory usage (iter 950000): 209.22 MB
Memory usage (iter 1000000): 217.59 MB
Memory usage (iter 1050000): 224.99 MB
Memory usage (iter 1100000): 233.52 MB
Memory usage (iter 1150000): 242.02 MB
Memory usage (iter 1200000): 250.55 MB
Memory usage (iter 1250000): 259.06 MB
Memory usage (iter 1300000): 267.56 MB
Memory usage (iter 1350000): 276.07 MB
Memory usage (iter 1400000): 284.59 MB
Memory usage (iter 1450000): 293.11 MB
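To put a number on the growth, here is a rough sketch that parses the "Memory usage" lines printed by the script above and estimates how many bytes of RSS are gained per Pool creation. The helper names (parse_log, bytes_per_iteration) are made up for this illustration, and the three sample lines are copied from the output above. It only averages between two samples, so it is an estimate, not a precise measurement.

```python
import re

# Matches lines such as "Memory usage (iter 50000): 86.60 MB"
LINE_RE = re.compile(r"Memory usage \(iter (\d+)\): ([\d.]+) MB")


def parse_log(text):
    """Return (iteration, rss_mb) pairs found in the pasted log text."""
    return [(int(m.group(1)), float(m.group(2))) for m in LINE_RE.finditer(text)]


def bytes_per_iteration(samples):
    """Average RSS growth between the first and last sample, in bytes per iteration."""
    (first_iter, first_mb), (last_iter, last_mb) = samples[0], samples[-1]
    return (last_mb - first_mb) * 1024 ** 2 / (last_iter - first_iter)


# Three lines taken from the small-batch run above
log = """
Memory usage (iter 0): 76.31 MB
Memory usage (iter 900000): 201.80 MB
Memory usage (iter 1450000): 293.11 MB
"""

samples = parse_log(log)
overall = bytes_per_iteration(samples)   # whole run
tail = bytes_per_iteration(samples[1:])  # after the drop at iter 900000

print("overall: {:.1f} bytes/iter".format(overall))
print("tail:    {:.1f} bytes/iter".format(tail))
```

With these samples the growth works out to somewhere between 150 and 180 bytes per iteration, both over the whole run and over the last 550000 iterations, which looks like a small, steady per-Pool cost rather than one-off allocator warm-up.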
About this issue
- State: closed
- Created 3 years ago
- Reactions: 2
- Comments: 16 (6 by maintainers)
Commits related to this issue
- Switch to TBB local executor to limit TLS size (github #1835) MLTOOLS-5866 ref:44dd4400b0c8f6c0810cc2490dc11593120408a2 — committed to catboost/catboost by Evgueni-Petrov-aka-espetrov 3 years ago
- Switch to TBB local executor to limit TLS size (github #1835) MLTOOLS-5866 ref:44dd4400b0c8f6c0810cc2490dc11593120408a2 — committed to catboost/catboost by Evgueni-Petrov-aka-espetrov a year ago
Yes, we will release CatBoost with the first fix soon. However, we will keep tcmalloc because it is slightly faster than the default allocator.
@zquintana it was fixed in the catboost 1.0 release (specifically here); we haven’t had any issues since then.
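For anyone who wants a long-running service to check at startup whether it is on a release that includes the fix, a minimal sketch follows. It assumes, based only on the comment above, that 1.0 is the first fixed release. The function names (parse_version, has_pool_leak_fix) are hypothetical and not part of catboost, and the parsing is deliberately simple: it keeps only the leading digits of each dot-separated component.

```python
def parse_version(version):
    """Turn a string like '0.26.1' into a tuple of ints, e.g. (0, 26, 1).

    Only the leading digits of each component are kept; parsing stops at the
    first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_pool_leak_fix(version, fixed_in=(1, 0)):
    """True if `version` is at least `fixed_in` (assumed to be 1.0, per the thread)."""
    parts = parse_version(version)
    # Pad short versions such as "1" with zeros so (1,) compares equal to (1, 0)
    parts = parts + (0,) * (len(fixed_in) - len(parts))
    return parts >= fixed_in


print(has_pool_leak_fix("0.26.1"))  # the version from the report
print(has_pool_leak_fix("1.0.0"))
```

In a real application the argument would be cb.__version__, as printed by the script in the report.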