tensorflow: Memory leak in forward pass (e.g., of ResNet50 model) with TensorFlow 2.12.0 and Python 3.11
The following minimal example reproduces the memory leak I ran into. (No GPU, just CPU.)
memleak.py:
import numpy as np
import psutil
import tensorflow as tf

model = tf.keras.applications.ResNet50()  # VGG19 seems to not leak.
# tf.config.threading.set_inter_op_parallelism_threads(0) and
# tf.config.threading.set_intra_op_parallelism_threads(0) do not help.
inp = (np.random.rand(1, 224, 224, 3) * 255).astype('uint8')
for run in range(1, 9999999):
    model(inp)
    memory_usage_in_MiB = psutil.Process().memory_info().rss / (1024 * 1024)
    print(f'Memory usage after {run} run(s) (in MiB): {memory_usage_in_MiB:.3f}', flush=True)
Dockerfile:
FROM python:3.11.2
RUN pip install --no-cache-dir tensorflow==2.12.0 psutil==5.9.4
# Disable the Docker cache from this stage on, see https://stackoverflow.com/a/58801213/1866775
ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache
ADD ./memleak.py /
RUN python /memleak.py
Output (docker build --rm .):
Memory usage after 1 run(s) (in MiB): 604.324
Memory usage after 2 run(s) (in MiB): 606.906
Memory usage after 3 run(s) (in MiB): 606.906
Memory usage after 4 run(s) (in MiB): 606.906
Memory usage after 5 run(s) (in MiB): 606.906
Memory usage after 6 run(s) (in MiB): 607.164
Memory usage after 7 run(s) (in MiB): 607.164
Memory usage after 8 run(s) (in MiB): 607.164
Memory usage after 9 run(s) (in MiB): 607.164
Memory usage after 10 run(s) (in MiB): 607.164
Memory usage after 11 run(s) (in MiB): 607.422
Memory usage after 12 run(s) (in MiB): 607.422
[...]
Memory usage after 498 run(s) (in MiB): 626.242
Memory usage after 499 run(s) (in MiB): 626.242
Memory usage after 500 run(s) (in MiB): 626.242
Memory usage after 501 run(s) (in MiB): 626.500
Memory usage after 502 run(s) (in MiB): 626.500
[...]
Memory usage after 1996 run(s) (in MiB): 683.477
Memory usage after 1997 run(s) (in MiB): 683.734
Memory usage after 1998 run(s) (in MiB): 683.734
Memory usage after 1999 run(s) (in MiB): 683.734
Memory usage after 2000 run(s) (in MiB): 683.734
Memory usage after 2001 run(s) (in MiB): 683.734
[...]
Memory usage after 9996 run(s) (in MiB): 960.258
Memory usage after 9997 run(s) (in MiB): 960.508
Memory usage after 9998 run(s) (in MiB): 960.508
Memory usage after 9999 run(s) (in MiB): 960.508
Memory usage after 10000 run(s) (in MiB): 960.508
Memory usage after 10001 run(s) (in MiB): 960.508
[...]
Memory usage after 24997 run(s) (in MiB): 1547.840
Memory usage after 24998 run(s) (in MiB): 1547.840
Memory usage after 24999 run(s) (in MiB): 1534.230
Memory usage after 25000 run(s) (in MiB): 1532.348
Memory usage after 25001 run(s) (in MiB): 1533.441
Memory usage after 25002 run(s) (in MiB): 1544.711
[...]
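From the log above, the growth works out to roughly 35 KiB per forward pass. One way to narrow down the source would be to compare the RSS with what tracemalloc reports: tracemalloc only sees allocations made through Python's allocator, so if the RSS keeps climbing while the tracemalloc total stays flat, the leak is presumably on the native (C++) side. A diagnostic sketch along those lines (not part of the reproduction above):

import tracemalloc

import numpy as np
import psutil
import tensorflow as tf

tracemalloc.start()
model = tf.keras.applications.ResNet50()
inp = (np.random.rand(1, 224, 224, 3) * 255).astype('uint8')
for run in range(1, 1001):
    model(inp)
    if run % 100 == 0:
        # tracemalloc counts only Python-level allocations (in bytes).
        python_mib = tracemalloc.get_traced_memory()[0] / (1024 * 1024)
        rss_mib = psutil.Process().memory_info().rss / (1024 * 1024)
        print(f'run {run}: RSS {rss_mib:.1f} MiB, '
              f'Python-level {python_mib:.1f} MiB', flush=True)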
When using the same TensorFlow version (2.12.0) but with Python 3.10.10 (instead of 3.11.2), the memory usage does not grow.
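In case it helps anyone as a stopgap: a commonly suggested workaround for eager-mode leaks is to route inference through a single tf.function, so that each call reuses one traced graph instead of being dispatched eagerly. I have not verified that this avoids the growth on Python 3.11; treat it as a sketch to try:

import numpy as np
import tensorflow as tf

model = tf.keras.applications.ResNet50()
inp = (np.random.rand(1, 224, 224, 3) * 255).astype('uint8')

# Wrap the model once; repeated calls with the same input shape and
# dtype reuse the traced graph instead of running eagerly.
infer = tf.function(model)

for run in range(1, 9999999):
    infer(inp)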
About this issue
- State: open
- Created a year ago
- Reactions: 4
- Comments: 17 (2 by maintainers)
Any news on this? The issue is still present in tf-nightly and is not specific to Keras.
I have replicated the issue on Colab, also with the tf-nightly version and with Python 3.9.16. Please refer to the attached gist for details.