tensorflow: TF-2.0: Keras model save/load memory leak

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.3 LTS
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version (use command below): tensorflow-gpu==2.0.0-beta1
  • Python version: Python 3.6.8
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: CUDA 10.0.130/ cuDNN 7.6
  • GPU model and memory: GeForce GTX 1050 Ti, 4036 MiB

Describe the current behavior Saving and loading a model in a loop leaks memory; the process eventually crashes with OSError: [Errno 12] Cannot allocate memory. During real training this happens quite quickly.

Describe the expected behavior I expect the memory used by each save/load cycle to be released.

Code to reproduce the issue

import tensorflow as tf
import numpy as np
from tqdm import tqdm
from memory_profiler import profile

# Dummy input used to verify the model works before the save/load loop.
data_array = np.random.random_sample((1, 1024))
tf_array = tf.constant(data_array, dtype=tf.float32)

# Small functional model: two Dense layers.
inputs = tf.keras.Input((1, 1024))
hidden_layer = tf.keras.layers.Dense(1024)(inputs)
output = tf.keras.layers.Dense(1)(hidden_layer)
model = tf.keras.Model(inputs=[inputs], outputs=[output])

pred = model([tf_array])
print(pred)


@profile
def func():
    # Each call saves and reloads the same model in SavedModel format;
    # the memory it allocates is never released.
    export_path = "temp_export"
    tf.saved_model.save(model, export_path)
    imported = tf.saved_model.load(export_path)


for i in tqdm(range(1000000)):
    func()

Other info / logs Profiler logs (2 steps): [screenshot from 2019-09-05 15-14-15]

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 22 (9 by maintainers)

Most upvoted comments

I’ve managed to find a workaround for this issue that satisfies me: save the model in HDF5 (.h5) format instead. With this approach there is no memory leak even on tf-2.0, and it brings other benefits besides:

  1. models are saved and loaded much faster (in my case ~4x faster);
  2. loaded models take much less memory (~10x less);
  3. prediction also runs faster (~5x).

Applying this approach didn’t require many modifications; essentially I only had to implement the get_config method for my custom layers.
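For reference, here is a minimal sketch of what that involves; the layer ScaledDense and its arguments are hypothetical, not from my actual model. A custom layer only needs to return its constructor arguments from get_config so that tf.keras.models.load_model can rebuild it from the .h5 file:

import tensorflow as tf

class ScaledDense(tf.keras.layers.Layer):
    """Hypothetical custom layer: a Dense layer followed by a constant scale."""

    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.scale = scale
        self.dense = tf.keras.layers.Dense(units)

    def call(self, inputs):
        return self.dense(inputs) * self.scale

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add every
        # argument __init__ needs, so deserialization can recreate the layer.
        config = super().get_config()
        config.update({"units": self.units, "scale": self.scale})
        return config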

Also, if you need immediate memory deallocation, you can call tf.keras.backend.clear_session() and gc.collect(); they work fine now.

Here is the modified code from my gist:

import tensorflow as tf
import psutil
import gc

# Same style of functional model, now with BatchNormalization and LeakyReLU.
inputs = tf.keras.Input((1, 1024))
dense1 = tf.keras.layers.Dense(1024)(inputs)
dense2 = tf.keras.layers.Dense(1024)(dense1)
dense2 = tf.keras.layers.BatchNormalization()(dense2)
dense2 = tf.keras.layers.LeakyReLU()(dense2)
output = tf.keras.layers.Dense(1)(dense2)
model = tf.keras.Model(inputs=[inputs], outputs=[output])

def func():
    # Save and reload in HDF5 format, then clear the Keras session so the
    # graph state created by load_model is released immediately.
    export_path = "temp_export.h5"
    model.save(export_path)
    tf.keras.models.load_model(export_path)
    tf.keras.backend.clear_session()

for i in range(1000000):
    func()
    if i % 100 == 0:
        print(i, ": free memory", psutil.virtual_memory().available / (1024.0 ** 2), "Mb")
    gc.collect()
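
One caveat: if the saved .h5 file contains custom layers (such as the hypothetical ScaledDense sketched above), tf.keras.models.load_model has to be told about them through its custom_objects argument:

# Hypothetical usage: map the class name to the class so Keras can
# deserialize the custom layer from the .h5 file.
restored = tf.keras.models.load_model(
    "temp_export.h5",
    custom_objects={"ScaledDense": ScaledDense},
)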