onnxruntime: `static inline Ort::Env onnx_env{nullptr}` easily leads to nullptr deref on app exit
Creating a null Ort::Env without later moving in an initialized Env + defining ORT_API_MANUAL_INIT can easily crash on app exit due to not checking for the null Env or null Ort::Global<void>::api_
Setup
- Microsoft Windows [Version 10.0.19044.1889]
- MSVS 2019 community v16.11.18
- Onnx v1.12.1
Repro
Pseudocode
class MyOnnx {
static inline Ort::Env onnx_env{nullptr};
...
};
main() {
// do nothing
}
- build app with pseudocode above and defining
ORT_API_MANUAL_INIT - run app
Result
App crashes on exit due to nullptr derefence. Debugger will report access violation and Ort::Global<void>::api_ was nullptr
Expected
No crash. Silent exit.
Notes
I see the cause in code. struct Env : Base<OrtEnv> and that ~Base() { OrtRelease(p_); } blindly calls OrtRelease() without checking that a release isn’t necessary since _p = nullptr.
OrtRelease for that class is ORT_DEFINE_RELEASE(Env); which expands with OrtRelease(Ort##NAME* ptr) { GetApi().Release##NAME(ptr); } which calls GetApi() and blindly uses its pointer without checking for nullptr which leads to the null dereference crash.
I would prefer Onnx to check for nullptr before using a pointer. It could be done in the Base class or in the macro. I slightly prefer it in Base so it doesn’t do unnecessary work of calling getapi(), etc. Putting it in the macro would also solve the crash but still runs unnecessary code.
Related https://github.com/microsoft/onnxruntime/issues/2297
About this issue
- Original URL
- State: open
- Created 2 years ago
- Comments: 19 (16 by maintainers)
Commits related to this issue
- check for null `GetApi()` before release api calls - fixes microsoft/onnxruntime#12736 — committed to diablodale/onnxruntime by diablodale 2 years ago
- check for null `GetApi()` before release api calls - fixes microsoft/onnxruntime#12736 — committed to diablodale/onnxruntime by diablodale 2 years ago
- add tests for issue microsoft/onnxruntime#12736 — committed to diablodale/onnxruntime by diablodale 2 years ago
- fix microsoft/onnxruntime#12736 - check GetApi pointer before deref — committed to diablodale/onnxruntime by diablodale 2 years ago
- add tests for issue microsoft/onnxruntime#12736 - support minimal no exception builds - use ORT_THROW() and ORT_NO_EXCEPTIONS macros — committed to diablodale/onnxruntime by diablodale 2 years ago
- fix microsoft/onnxruntime#12736 - check GetApi pointer before deref — committed to diablodale/onnxruntime by diablodale 2 years ago
- check for null `GetApi()` before release api calls - fixes microsoft/onnxruntime#12736 — committed to diablodale/onnxruntime by diablodale 2 years ago
I’m facing the same problem. When define
Ort::Envas a global variable beforemain, the crashing happens resulted fromNULLPTRwith address0x0000000000. This is weird. At least, this issue should be pointed out in the document instead of obscurely wrote in the example code. Quite annoying.False. And to repeat again as I wrote, it is undefined by C++ spec and the implementer of the deleting function is responsible for handling. Please google this topic for detailed C++ spec and discussions on this.
STL and default deleters handle nullptr being passed to the deleters. But custom ones like I discussed DO NOT.
onnx c++/c headers need to check for nullptr. Use
if (theptr) release(theptr)instead of today’s blindrelease(theptr). I discuss this in my OP.As we know, the env is singular…only one. I forget but when I isolated this bug, I think deep down the env is a singleton or something similar. When classes/struct are defined in the app, having a single env as a static member of a class, singleton, etc. is a common approach. Naturally, you know about the static initialization fiasco…and its partner the static deinitialization fiasco. The orders of init and deinit are not guaranteed across translation units and in general you CAN NOT base rules no things like
main().Who writes code with
main()? Today we write DLLs and event loops and more complex applications. Making a rule based onmain()is doomed to fail in anything but the most basic applications. 😁 and in the very link you provide me is a//TODO: Fix the C API issueand//If we don't do this, it will crashwhich identifies that…yes…the API is currently broken. The solutions I provide above can address many of them.Test for non-null…then release. Don’t do it blindly.