tensorflow: Suspected memory leak - when loading multiple models with tf.saved_model.load


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): YES

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04

  • TensorFlow installed from (source or binary): binary wheel via PyPI

  • TensorFlow version (use command below): v2.4.0-rc3-20-g97c3fef64ba 2.4.0-rc4

  • Python version: 3.6

  • Bazel version (if compiling from source): N/A

  • GCC/Compiler version (if compiling from source): N/A

  • CUDA/cuDNN version: CUDA 11.1

  • GPU model and memory: V100 16GB

Describe the current behavior I suspect a CPU memory leak when loading multiple models. When I run an infinite loop that keeps loading the same model into the same variable, the memory (private bytes and working set) of the process keeps increasing. At some points the working set seems to free a little memory, but the overall trend is that memory keeps rising. The repro below uses a simple model; for our real production model, the leak is about 0.32 MB per model load on TF 2.3 and about 0.08 MB per model load on TF 2.4. The smaller leak is still an issue, since the model can change and our server serves different models 24/7.

This trend persists even though I call gc.collect() and tf.keras.backend.clear_session() on every iteration.
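For reference, the per-iteration growth can be quantified from inside the loop with the stdlib resource module (a rough sketch; psutil gives finer-grained private-bytes/working-set numbers, and the measure_per_call helper below is hypothetical, not part of the original repro):

```python
import gc
import resource


def rss_kb():
    # Peak resident set size of this process. ru_maxrss is reported in
    # kilobytes on Linux (bytes on macOS), and never decreases.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_per_call(fn, iterations=10):
    """Call fn repeatedly and report the average peak-RSS growth per call."""
    gc.collect()
    start = rss_kb()
    for _ in range(iterations):
        fn()
        gc.collect()
    end = rss_kb()
    return (end - start) / iterations  # approximate kB leaked per call


# Hypothetical usage against the repro below:
# growth = measure_per_call(lambda: profile_load_model('my_model'), 50)
# print(f"~{growth:.1f} kB per tf.saved_model.load")
```

Because peak RSS is monotonic, this only captures net growth; a proper profiler would be needed to attribute the allocations, but a steadily positive number per call is consistent with a leak.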

Describe the expected behavior The memory shouldn't increase on each iteration.

Standalone code to reproduce the issue

import tensorflow as tf
import gc


def build_and_save_own_model():
    # Build a small untrained Keras model and export it in SavedModel format.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.save('my_model')
    tf.keras.backend.clear_session()
    del model
    gc.collect()


def profile_load_model(path):
    # Load the SavedModel, then drop every reference and force a full
    # garbage collection; process memory still grows on each call.
    model = tf.saved_model.load(path)
    tf.keras.backend.clear_session()
    del model
    gc.collect()


def run_model():
    model_path = 'my_model'
    build_and_save_own_model()
    print("load model in loops:")
    c = 1
    while True:
        print("----------- iter", c)
        profile_load_model(model_path)
        c += 1


if __name__ == '__main__':
    print("*****************************************************")
    print("START LOADING MODEL")
    print(tf.version.GIT_VERSION, tf.version.VERSION)
    print("*****************************************************")
    run_model()
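Until the leak itself is fixed, one common mitigation is to perform each load (or each model's whole lifetime) in a child process, so the OS reclaims any leaked native memory when the child exits. A minimal sketch; the run_isolated helper is hypothetical, not part of the repro, and assumes the fork start method (the Linux default):

```python
import multiprocessing as mp


def run_isolated(fn, *args):
    """Run fn(*args) in a child process and return its result.

    Useful when fn leaks native (C++) memory per call, as suspected with
    tf.saved_model.load: whatever the load leaked is reclaimed by the OS
    when the child process exits.
    """
    parent_conn, child_conn = mp.Pipe()

    def _target(conn, call_args):
        # Runs in the child; with the fork start method this closure
        # does not need to be picklable.
        conn.send(fn(*call_args))
        conn.close()

    p = mp.Process(target=_target, args=(child_conn, args))
    p.start()
    result = parent_conn.recv()
    p.join()
    return result


# Hypothetical usage against the repro above:
# run_isolated(profile_load_model, 'my_model')
```

The per-call fork/exec overhead is non-trivial, so this only makes sense for servers that reload models occasionally rather than per request.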

TF 2.3 memory leak: (attached memory-usage plot, memleak_tf2_3)

TF 2.4 memory leak: (attached memory-usage plot, memleak_tf2_4)

Thanks!

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

@jvishnuvardhan Looks good to me, thanks. But I have not opened this issue, so it’s not up to me to close it 😃