openvino: [Bug] Model compilation takes too much memory in cpp

System information
  • OpenVINO => 2022.2.0
  • Operating System / Platform => Linux 64 Bit
  • Compiler => g++
  • Problem classification: Model Conversion
  • Framework: PyTorch
  • OS: Linux f93eccdcf762 3.10.0-1160.76.1.el7.x86_64
  • Model name: HiFiGAN
  • Max memory: 15.3GB
Detailed description

I often follow the procedure below to load an ONNX model and run inference with a fixed-size tensor. But today, when I ran a benchmark (Google Benchmark) to test the performance of our model, I found that model compilation with shape {1, 500, 80} (compiledModel_ = core_.compile_model(model_);) takes so much memory that it leads to OOM and the whole program is killed.

// load and compile the model
    core_.set_property("AUTO", ov::log::level(ov::log::Level::WARNING));
    model_ = core_.read_model(model_path);
    ov::Shape static_shape = {1, static_cast<unsigned long>(mellen), 80};
    model_->reshape(static_shape);
    compiledModel_ = core_.compile_model(model_); // leads to OOM
    inferRequest_ = compiledModel_.create_infer_request();
// infer
    ov::Tensor input_tensor(ov::element::f32, static_shape, input.data());
    inferRequest_.set_input_tensor(input_tensor);
    inferRequest_.infer();
    auto wav_out = inferRequest_.get_output_tensor(0);

Actually, I don't know how to deal with this problem. Maybe I shouldn't load the ONNX model and compile it to IR in C++?
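For reference, the offline route that question alludes to would look roughly like this (a sketch; the file names are placeholders, and note that compile_model still has to run on the IR at load time either way):

    mo --input_model hifigan.onnx --input_shape "[1,500,80]"

and then read the resulting IR in C++ with core_.read_model("hifigan.xml").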

The mel length usually varies from 100 to 1300. OOM happens with mel length >= 500.
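A possible mitigation sketch, assuming the CPU plugin supports dynamic shapes for this model: compile once with a dynamic mel-length dimension instead of one static compilation per length, then pass the concrete length at inference time (mellen and input are from the snippet above).

    // Sketch (assumes dynamic shapes work for this model on CPU):
    // mark the mel-length dimension as dynamic and compile once.
    model_ = core_.read_model(model_path);
    model_->reshape(ov::PartialShape{1, ov::Dimension::dynamic(), 80});
    compiledModel_ = core_.compile_model(model_, "CPU");
    inferRequest_ = compiledModel_.create_infer_request();
    // Per utterance: the tensor carries the concrete mel length.
    ov::Tensor input_tensor(ov::element::f32,
                            ov::Shape{1, static_cast<size_t>(mellen), 80},
                            input.data());
    inferRequest_.set_input_tensor(input_tensor);
    inferRequest_.infer();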

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 57 (27 by maintainers)

Most upvoted comments

@yszhou2019 we did find a memory leak in TBB, but with a different call trace from the one you provided before.

From my understanding, the above call trace will not be hit multiple times when inference() is called in a loop, so it should not contribute much to the leak. Anyhow, we will continue debugging this TBB memory leak and try to find a solution.

To sidestep the TBB memory-leak issue, you can, if possible, rebuild OpenVINO with TBB disabled (-DTHREADING=SEQ) and run the same test sample to check whether memory consumption still increases continuously.
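A minimal rebuild sketch (the source path is a placeholder; THREADING is the OpenVINO CMake option mentioned above):

    cmake -DTHREADING=SEQ /path/to/openvino
    cmake --build . --parallel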

You can also try running Valgrind with the Massif tool; this might help you narrow down the root cause of the memory growth.
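For example (the benchmark binary name is a placeholder):

    valgrind --tool=massif ./your_benchmark
    ms_print massif.out.<pid>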

@yszhou2019 Please share the logs for the two commands below:

  1. compiledModel_ = core_.compile_model(model, "CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT));
  2. compiledModel_ = core_.compile_model(model);

It looks like we have a bug inside the AUTO plugin. @wangleis Can you take a look?

An ov::Exception is thrown if I load the model with "CPU", like:

[OpenVINO] Compiling Mellen 1200
terminate called after throwing an instance of 'ov::Exception'
  what():  Please, check environment due to no supported devices can be used
The Core in that run is set up as:

    ov::Core core;
    core.set_property("CPU", ov::log::level(ov::log::Level::WARNING));

I can send you this model and the C++ test file. Could you leave your email?

I meant, could you use ov::CompiledModel model1 = core.compile_model(model, "CPU");? Does it work?
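For context, that suggestion spelled out (a minimal sketch; model_path comes from the earlier snippet):

    ov::Core core;
    std::shared_ptr<ov::Model> model = core.read_model(model_path);
    // Compile explicitly on the CPU device instead of going through AUTO.
    ov::CompiledModel model1 = core.compile_model(model, "CPU");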