tensorflow: Memory leak with tf.shuffle, doesn't release buffer memory
System information
- OS Platform and Distribution: Linux Ubuntu 16.04
- Python: 2.7.17 |Anaconda, Inc.| (default, Oct 21 2019, 19:04:46) [GCC 7.3.0]
- TensorFlow: 1.12.0
- NumPy: 1.16.5
- GPU: GeForce RTX 2080 Ti
- CUDA: 9.2
Describe the current behavior: CPU memory gradually increases after each epoch until the program is restarted. I suspect that dataset.shuffle() does not release its buffer memory. Tested with TF 1.15 as well; same behavior.
Code to reproduce the issue
import numpy as np
import tensorflow as tf


class ASRDataGenerator(object):
    def __init__(self, num):
        self.num = num

    def __call__(self):
        for i in range(self.num):
            for j in range(106):
                yield 'a', 'b', np.random.randn(100, 120)


class TFASRDataSet(object):
    def __init__(self, num, batch_size):
        self.num = num
        self.batch_size = batch_size
        self.asrDataGenerator = ASRDataGenerator(num)

    def setDataSetIterator(self):
        dataset = tf.data.Dataset.from_generator(
            self.asrDataGenerator, (tf.string, tf.string, tf.float32))
        dataset = dataset.shuffle(30000)
        dataset = dataset.map(lambda s1, s2, feat: [s1, s2, feat])
        dataset = dataset.batch(self.batch_size, drop_remainder=True)
        self.iterator = dataset.make_initializable_iterator()


test_tfASRDataSet = TFASRDataSet(248, 192)
test_tfASRDataSet.setDataSetIterator()
test_iter = test_tfASRDataSet.iterator
test_next = test_iter.get_next()

run_config = tf.ConfigProto()
run_config.gpu_options.allow_growth = True
run_config.allow_soft_placement = True

with tf.Session(config=run_config) as sess:
    for i in range(100):
        sess.run(test_iter.initializer)
        while True:
            try:
                loss_list = sess.run([test_next])
                print(len(loss_list[0]))
            except tf.errors.OutOfRangeError:
                print("train epoch %d finish" % (i + 1))
                break
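One way to observe the growth is to print the process resident set size after every epoch. A minimal sketch using psutil (psutil is an assumption here, not part of the original report), reusing run_config, test_iter and test_next from the code above:

import os
import psutil

process = psutil.Process(os.getpid())

with tf.Session(config=run_config) as sess:
    for i in range(100):
        sess.run(test_iter.initializer)
        while True:
            try:
                sess.run([test_next])
            except tf.errors.OutOfRangeError:
                break
        # RSS after each epoch; with the reported leak this number keeps
        # climbing instead of returning to roughly the same level.
        rss_mb = process.memory_info().rss / (1024.0 * 1024.0)
        print("epoch %d finished, RSS = %.1f MB" % (i + 1, rss_mb))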
About this issue
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 18 (9 by maintainers)
I have recently investigated the memory growth observed for the OSS version of TensorFlow when shuffle is used. The conclusion of my investigation is that the memory growth is caused by poor performance of the memory allocator (TensorFlow OSS uses system malloc by default). In my experiments, switching to TCMalloc (details below) resulted in constant memory usage (and a program speedup).
For the evaluation, I used the following simple input pipeline:
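The exact pipeline from that comment is not reproduced in this copy; a minimal stand-in with the same shape (synthetic elements, a large shuffle buffer, re-initialized each epoch) would look roughly like this:

import numpy as np
import tensorflow as tf

# Illustrative only: synthetic elements pushed through a large shuffle
# buffer, with the iterator re-initialized every "epoch" so the buffer
# is torn down and rebuilt repeatedly.
dataset = tf.data.Dataset.from_tensor_slices(
    np.random.randn(30000, 100).astype(np.float32))
dataset = dataset.shuffle(30000)
dataset = dataset.batch(256)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    for epoch in range(10):
        sess.run(iterator.initializer)
        while True:
            try:
                sess.run(next_element)
            except tf.errors.OutOfRangeError:
                break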
When executed on a workstation with the default allocator, it showed the memory footprint growing steadily across epochs.
I then installed TCMalloc using sudo apt-get install libtcmalloc-minimal4 and used it for the same program. Not only did the gradual memory growth disappear, but the program also ran 2x faster.
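The usual way to do that without rebuilding TensorFlow is to preload the TCMalloc shared library. A sketch, where the .so path is what the Ubuntu package above typically installs and input_pipeline.py is a placeholder for whatever script runs the pipeline (both are assumptions, not taken from the original comment):

# Preload TCMalloc so it replaces system malloc for this process only.
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4 python input_pipeline.py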
@azzeddineCH I am on 2.7 and still have the problem.