openvino: [Bug] Model compilation takes too much memory in cpp

System information
  • OpenVINO => 2022.2.0
  • Operating System / Platform => Linux 64 Bit
  • Compiler => g++
  • Problem classification: Model Conversion
  • Framework: PyTorch
  • OS: Linux f93eccdcf762 3.10.0-1160.76.1.el7.x86_64
  • Model name: HiFiGAN
  • Max memory: 15.3GB
Detailed description

I often follow the procedure below to load an ONNX model and run inference with a fixed-size tensor. But today, when I ran a benchmark (Google Benchmark) to test the performance of our model, I found that model compilation with shape {1, 500, 80} (compiledModel_ = core_.compile_model(model_);) takes so much memory that it leads to OOM and the whole program is killed.

// load and compile the model
    core_.set_property("AUTO", ov::log::level(ov::log::Level::WARNING));
    model_ = core_.read_model(model_path);
    ov::Shape static_shape = {1, static_cast<unsigned long>(mellen), 80};
    model_->reshape(static_shape);
    compiledModel_ = core_.compile_model(model_); // leads to OOM
    inferRequest_ = compiledModel_.create_infer_request();
// infer
    ov::Tensor input_tensor(ov::element::f32, static_shape, input.data());
    inferRequest_.set_input_tensor(input_tensor);
    inferRequest_.infer();
    auto wav_out = inferRequest_.get_output_tensor(0);

Actually, I don't know how to deal with this problem. Maybe I shouldn't load the ONNX model and compile it to IR in C++?
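For reference, the offline route that question alludes to would look roughly like this (a sketch; the file names are placeholders, and note that compile_model still has to run on the IR at load time either way):

    mo --input_model hifigan.onnx --input_shape "[1,500,80]"

and then read the resulting IR in C++ with core_.read_model("hifigan.xml").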

The mel length usually varies from 100 to 1300. OOM happens with mel length >= 500.
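A possible mitigation sketch, assuming the CPU plugin supports dynamic shapes for this model: compile once with a dynamic mel-length dimension instead of one static compilation per length, then pass the concrete length at inference time (mellen and input are from the snippet above).

    // Sketch (assumes dynamic shapes work for this model on CPU):
    // mark the mel-length dimension as dynamic and compile once.
    model_ = core_.read_model(model_path);
    model_->reshape(ov::PartialShape{1, ov::Dimension::dynamic(), 80});
    compiledModel_ = core_.compile_model(model_, "CPU");
    inferRequest_ = compiledModel_.create_infer_request();
    // Per utterance: the tensor carries the concrete mel length.
    ov::Tensor input_tensor(ov::element::f32,
                            ov::Shape{1, static_cast<size_t>(mellen), 80},
                            input.data());
    inferRequest_.set_input_tensor(input_tensor);
    inferRequest_.infer();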

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 57 (27 by maintainers)

Most upvoted comments

@yszhou2019 we did find a memory leak in TBB, but with a different call trace from the one you provided before.

From my understanding, the above call trace will not be hit multiple times when inference() is called in a loop, so it should not contribute much to the leak. Anyhow, we will continue debugging this TBB memory leak and try to find a solution.

To sidestep the TBB memory-leak issue, you can, if possible, rebuild OpenVINO with TBB disabled (-DTHREADING=SEQ) and run the same test sample to check whether memory consumption still increases continuously.
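A minimal rebuild sketch (the source path is a placeholder; THREADING is the OpenVINO CMake option mentioned above):

    cmake -DTHREADING=SEQ /path/to/openvino
    cmake --build . --parallel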

You can also try running Valgrind with the Massif tool; this might help you narrow down the root cause of the memory growth.
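For example (the benchmark binary name is a placeholder):

    valgrind --tool=massif ./your_benchmark
    ms_print massif.out.<pid>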

@yszhou2019 Please share the logs for the two commands below:

  1. compiledModel_ = core_.compile_model(model, "CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT));
  2. compiledModel_ = core_.compile_model(model);

It looks like we have a bug inside the AUTO plugin. @wangleis Can you take a look?

An ov::Exception is thrown if I load the model with "CPU", like:

[OpenVINO] Compiling Mellen 1200
terminate called after throwing an instance of 'ov::Exception'
  what():  Please, check environment due to no supported devices can be used
The Core in that run is set up as:

    ov::Core core;
    core.set_property("CPU", ov::log::level(ov::log::Level::WARNING));

I can send you this model and the C++ test file. Could you leave your email?

I meant, could you use ov::CompiledModel model1 = core.compile_model(model, "CPU");? Does it work?
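For context, that suggestion spelled out (a minimal sketch; model_path comes from the earlier snippet):

    ov::Core core;
    std::shared_ptr<ov::Model> model = core.read_model(model_path);
    // Compile explicitly on the CPU device instead of going through AUTO.
    ov::CompiledModel model1 = core.compile_model(model, "CPU");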