openvino: [Bug] Model compilation takes too much memory in cpp
System information
- OpenVINO => 2022.2.0
- Operating System / Platform => Linux 64 Bit
- Compiler => g++
- Problem classification: Model Conversion
- Framework: PyTorch
- OS: Linux f93eccdcf762 3.10.0-1160.76.1.el7.x86_64
- Model name: HiFiGAN
- Max memory: 15.3GB
Detailed description
I usually follow the procedure below to load an ONNX model and run inference with a fixed-size tensor. But today, when I ran a benchmark (Google Benchmark) to test the performance of our model, I found that model compilation with shape {1, 500, 80} (compiledModel_ = core_.compile_model(model_);) takes too much memory, which leads to OOM and the whole program being killed.
// load the ONNX model and reshape it to a fixed input shape
core_.set_property("AUTO", ov::log::level(ov::log::Level::WARNING));
model_ = core_.read_model(model_path);
ov::Shape static_shape = {1, static_cast<unsigned long>(mellen), 80};
model_->reshape(static_shape);
compiledModel_ = core_.compile_model(model_); // leads to OOM
inferRequest_ = compiledModel_.create_infer_request();
// run inference on a tensor wrapping the input buffer
ov::Tensor input_tensor(ov::element::f32, static_shape, input.data());
inferRequest_.set_input_tensor(input_tensor);
inferRequest_.infer();
auto wav_out = inferRequest_.get_output_tensor(0);
Actually, I don't know how to deal with this problem. Maybe I shouldn't load the ONNX model and compile it to IR in C++? The mel length usually varies from 100 to 1300, and OOM happens with mel length >= 500.
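For reference, the alternative asked about above would look roughly like the sketch below: convert the ONNX model to IR once offline and read the IR at runtime instead of the ONNX file. This is only a sketch, not a confirmed fix for the OOM; the file paths are placeholders, and it assumes ov::serialize is available in your OpenVINO version (otherwise the ov::pass::Serialize pass or the mo tool can produce the IR offline).
#include <openvino/openvino.hpp>

// One-time, offline step: read the ONNX model, fix one input shape, and save it as IR.
void convert_onnx_to_ir() {
    ov::Core core;
    auto model = core.read_model("hifigan.onnx");   // placeholder path
    model->reshape(ov::Shape{1, 500, 80});
    ov::serialize(model, "hifigan.xml", "hifigan.bin");
}

// At runtime: read the IR directly; compile_model is used the same way as for ONNX.
void load_ir() {
    ov::Core core;
    auto model = core.read_model("hifigan.xml");
    auto compiled = core.compile_model(model, "CPU");
}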
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 57 (27 by maintainers)
@yszhou2019 We did find a memory leak in TBB, but with a different call trace from the one you provided earlier.
From my understanding, the call trace above is not hit repeatedly when inference() is called in a loop, so it should not leak much memory. Anyhow, we will keep debugging this TBB memory leak and try to find a solution.
To rule out the TBB memory leak, you could, if possible, rebuild OpenVINO with TBB disabled (-DTHREADING=SEQ) and run the same test sample to check whether memory consumption still keeps increasing.
You can also try running Valgrind with the Massif tool; this might help you narrow down the root cause of the memory leak.
@yszhou2019 Please share the logs for the two calls below:
compiledModel_ = core_.compile_model(model_, "CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT));
compiledModel_ = core_.compile_model(model_);
It looks like we have a bug inside the AUTO plugin. @wangleis Can you take a look?
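For collecting those logs, a minimal sketch like the one below could be used: it prints VmPeak/VmRSS from /proc/self/status around each compile call. The log_memory helper and the model path are placeholders, not part of the OpenVINO API.
#include <fstream>
#include <iostream>
#include <string>
#include <openvino/openvino.hpp>

// Placeholder helper: print peak and current resident memory from /proc/self/status (Linux only).
static void log_memory(const std::string& tag) {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmPeak", 0) == 0 || line.rfind("VmRSS", 0) == 0)
            std::cout << tag << " " << line << std::endl;
    }
}

int main() {
    ov::Core core;
    auto model = core.read_model("hifigan.onnx");  // placeholder path
    model->reshape(ov::Shape{1, 500, 80});

    log_memory("[before CPU compile]");
    auto cpu_compiled = core.compile_model(
        model, "CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT));
    log_memory("[after CPU compile]");

    auto default_compiled = core.compile_model(model);  // no explicit device; goes through AUTO in this setup
    log_memory("[after default compile]");
    return 0;
}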
I meant, could you use the following? Does it work?
ov::CompiledModel model1 = core.compile_model(model, "CPU");
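In terms of the original snippet from the issue (core_, model_, compiledModel_, mellen and model_path as defined there), that change would look roughly like this; only the compile_model call differs.
// Same flow as the snippet in the issue, but compiling explicitly on the CPU
// plugin instead of going through the default AUTO device.
model_ = core_.read_model(model_path);
ov::Shape static_shape = {1, static_cast<unsigned long>(mellen), 80};
model_->reshape(static_shape);
compiledModel_ = core_.compile_model(model_, "CPU");
inferRequest_ = compiledModel_.create_infer_request();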