tvm: [Bug] PyTorch and TVM loading problem due to conflicting LLVM symbols
Apparently, the new PyTorch release clashes with symbols loaded by TVM, so the following trivial code crashes with "invalid pointer" and "Aborted (core dumped)" upon exit:
import tvm
import torch
We can work around this by swapping the import order, but as pointed out in https://github.com/apache/tvm/issues/9349#issuecomment-950685224 this may not always be possible.
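Because the crash only happens when the interpreter exits, each import order is easiest to check in a fresh process. The helper below is a minimal sketch of that idea; `exits_cleanly` is a hypothetical name, not part of TVM or PyTorch. Note that it also returns False when a module is simply not installed, so a failure is only a hint and stderr still needs a look.

```python
import subprocess
import sys


def exits_cleanly(modules):
    """Import `modules` in order in a fresh interpreter.

    Returns True only if that interpreter exits with status 0.
    """
    code = "".join(f"import {name}\n" for name in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    # An abort during interpreter shutdown (or a failed import)
    # shows up as a nonzero return code.
    return result.returncode == 0
```

Calling `exits_cleanly(["tvm", "torch"])` and `exits_cleanly(["torch", "tvm"])` in an environment with both packages installed would show whether the import order matters there.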
Another solution is to remove the use of RTLD_GLOBAL in https://github.com/apache/tvm/blob/dfe4cebbdadab3d4e6e6ba3951276a51a4ffeaf6/python/tvm/_ffi/base.py#L57
See related issues in other repos that moved away from using RTLD_GLOBAL.
https://github.com/dmlc/dgl/issues/2255
https://github.com/pytorch/pytorch/pull/28536
https://github.com/pytorch/pytorch/issues/3059
Is there any particular reason we are using RTLD_GLOBAL? @tqchen @areusch
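For readers unfamiliar with the flag, here is a rough sketch of what switching load modes looks like, assuming the library is loaded through ctypes on a POSIX system. `load_library` is a hypothetical helper, and the C library stands in for libtvm so the snippet runs anywhere; it is not the code in base.py.

```python
import ctypes
import ctypes.util


def load_library(path, global_symbols=False):
    """dlopen `path`; expose its symbols process-wide only if requested."""
    # RTLD_GLOBAL makes the library's symbols visible to libraries loaded
    # later; RTLD_LOCAL keeps them private to this handle.
    mode = ctypes.RTLD_GLOBAL if global_symbols else ctypes.RTLD_LOCAL
    return ctypes.CDLL(path, mode=mode)


# Demo with the C library (an assumption for illustration, not libtvm).
libc = load_library(ctypes.util.find_library("c"))
result = libc.abs(-3)
```

With RTLD_LOCAL, the symbols remain reachable through the returned handle but are not used to resolve references in libraries loaded afterwards, which is why plugins that rely on global visibility would need another way to find them.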
About this issue
- State: closed
- Created 3 years ago
- Reactions: 4
- Comments: 20 (17 by maintainers)
Commits related to this issue
- Set HIDE_PRIVATE_SYMBOLS ON for manylinux packages. * This is to workaround an issue caused by conflicting LLVM versions, first observed by since we updated Pytorch in TVM * Discussion at: https:/... — committed to leandron/tlcpack by leandron 3 years ago
- Set HIDE_PRIVATE_SYMBOLS ON for manylinux/osx/conda packages. * This is to workaround an issue caused by conflicting LLVM versions, first observed by since we updated Pytorch in TVM * Discussion a... — committed to leandron/tlcpack by leandron 3 years ago
- Set HIDE_PRIVATE_SYMBOLS ON for manylinux/osx/conda packages. (#81) * This is to workaround an issue caused by conflicting LLVM versions, first observed by since we updated Pytorch in TVM * Dis... — committed to tlc-pack/tlcpack by leandron 3 years ago
- [TVMC] Re-enable PyTorch test This test was originally disabled due to the issue documented in #7455 affecting CI. I believe this has since been resolved by #9362. Note: This patch should not be mer... — committed to lhutton1/tvm by lhutton1 3 years ago
- [TVMC] Re-enable PyTorch test (#9441) This test was originally disabled due to the issue documented in #7455 affecting CI. I believe this has since been resolved by #9362. Note: This patch should... — committed to apache/tvm by lhutton1 3 years ago
- [TVMC] Add test for quantized pytorch model As a follow up to #9417 and now that #9362 is resolved, this PR adds a test to check quantized pytorch mobilenetv2 is converted correctly. Change-Id: Iaf2... — committed to lhutton1/tvm by lhutton1 3 years ago
- [TVMC] Add test for quantized pytorch model (#9467) As a follow up to #9417 and now that #9362 is resolved, this PR adds a test to check quantized pytorch mobilenetv2 is converted correctly. Chan... — committed to apache/tvm by lhutton1 3 years ago
- [TVMC] Re-enable PyTorch test (#9441) This test was originally disabled due to the issue documented in #7455 affecting CI. I believe this has since been resolved by #9362. Note: This patch should... — committed to mehrdadh/tvm by lhutton1 3 years ago
- [TVMC] Add test for quantized pytorch model (#9467) As a follow up to #9417 and now that #9362 is resolved, this PR adds a test to check quantized pytorch mobilenetv2 is converted correctly. Chan... — committed to mehrdadh/tvm by lhutton1 3 years ago
- [TVMC] Re-enable PyTorch test (#9441) This test was originally disabled due to the issue documented in #7455 affecting CI. I believe this has since been resolved by #9362. Note: This patch should... — committed to ylc/tvm by lhutton1 3 years ago
- [TVMC] Add test for quantized pytorch model (#9467) As a follow up to #9417 and now that #9362 is resolved, this PR adds a test to check quantized pytorch mobilenetv2 is converted correctly. Chan... — committed to ylc/tvm by lhutton1 3 years ago
- [TVMC] Add test for quantized pytorch model (#9467) As a follow up to #9417 and now that #9362 is resolved, this PR adds a test to check quantized pytorch mobilenetv2 is converted correctly. Chan... — committed to yangulei/tvm by lhutton1 3 years ago
- Link llvm statically and hide private symbols This fixes: Set hide private symbols to on to avoid the following error: free(): invalid pointer Aborted (core dumped) Reference: https://github.com/... — committed to ekut-es/hannah-tvm by cgerum 2 years ago
OK, I dug into this a bit and I think I know the possible cause: a conflict between LLVM symbols, due to different versions of LLVM being used. PyTorch has also started to ship with LLVM. To avoid the problem, we need to do two things:
- set(USE_LLVM "/path/to/llvm-config --link-static")
- set(HIDE_PRIVATE_SYMBOLS ON)
This will effectively hide the LLVM-related symbols when we load globally from pytorch.
I did a quick experiment locally: when we turn both options ON, things are good, and there will be a conflict with either option off.
I can confirm that HIDE_PRIVATE_SYMBOLS=ON also fixes it. I think this is a good enough workaround for now. cc @lhutton1
It would be good to find out which symbol is conflicting (perhaps by linking things together) and resolve it (rename the symbol on the TVM side if possible). Note that the same problem will appear in the future if we really make an attempt to link pytorch in a deeper integration. This would serve as a way to resolve the possible issue.
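One possible starting point for finding the conflicting symbol is to list which llvm:: symbols a shared library exports. The sketch below only parses text in the format printed by `nm -D`; `exported_llvm_symbols` is a hypothetical helper, and the sample output is made up for illustration rather than taken from a real libtvm build.

```python
def exported_llvm_symbols(nm_output):
    """Return defined symbols in the llvm:: namespace from `nm -D` output."""
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined ones ("U name") have only two and are skipped.
        if len(parts) != 3:
            continue
        _, kind, name = parts
        # Itanium C++ mangling encodes the namespace llvm as "4llvm".
        if kind != "U" and "4llvm" in name:
            found.append(name)
    return found


# Made-up sample in `nm -D` format (addresses are illustrative only).
SAMPLE = """\
0000000000a1b2c0 T _ZN4llvm11raw_ostream5writeEPKcm
                 U free
0000000000001000 T TVMFuncCall
"""

leaked = exported_llvm_symbols(SAMPLE)
```

Running `nm -D` on the TVM and PyTorch shared libraries and intersecting the two result lists would give candidates for the clash; with HIDE_PRIVATE_SYMBOLS ON, the expectation from this thread is that the TVM list comes back empty.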
RTLD_GLOBAL provides some convenience: it gives plugin modules (which are loaded later) the symbols of libtvm_runtime without explicitly linking to it. We might need to rethink the plugin mechanism (e.g. vta) a bit if we decide to move away from it.