mlc-llm: ROCM HIP Error: shared object initialization failed
🐛 Bug
sample_mlc_chat.py errors out after a while with:
TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: shared object initialization failed
To Reproduce
Steps to reproduce the behavior:
- Install ROCm 5.6
- Follow the getting started instructions
- Download the models and amend the sample code (see the sketch after the traceback below)
- Run
python sample_mlc_chat.py
System automatically detected device: rocm
Using model folder: /home/corv/Downloads/mlc/dist/prebuilt/mlc-chat-Llama-2-13b-chat-hf-q4f16_1
Using mlc chat config: /home/corv/Downloads/mlc/dist/prebuilt/mlc-chat-Llama-2-13b-chat-hf-q4f16_1/mlc-chat-config.json
Using library model: /home/corv/Downloads/mlc/dist/prebuilt/lib/Llama-2-13b-chat-hf-q4f16_1-rocm.so
Traceback (most recent call last):
File "/home/corv/Downloads/mlc/sample_mlc_chat.py", line 8, in <module>
cm = ChatModule(model="Llama-2-13b-chat-hf-q4f16_1")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Traceback (most recent call last):
File "/home/corv/Downloads/mlc/sample_mlc_chat.py", line 12, in <module>
output = cm.generate(
^^^^^^^^^^^^
File "/home/corv/.pyenv/versions/3.11.4/lib/python3.11/site-packages/mlc_chat/chat_module.py", line 641, in generate
self._prefill(prompt)
File "/home/corv/.pyenv/versions/3.11.4/lib/python3.11/site-packages/mlc_chat/chat_module.py", line 810, in _prefill
self._prefill_func(input, decode_next_token, place_in_prompt.value)
File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
File "tvm/_ffi/_cython/./packed_func.pxi", line 262, in tvm._ffi._cy3.core.FuncCall
File "tvm/_ffi/_cython/./packed_func.pxi", line 251, in tvm._ffi._cy3.core.FuncCall3
File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
10: TVMFuncCall
9: mlc::llm::LLMChatModule::GetFunction(tvm::runtime::String const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#5}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
at /workspace/mlc-llm/cpp/llm_chat.cc:1083
8: mlc::llm::LLMChat::PrefillStep(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, mlc::llm::PlaceInPrompt)
at /workspace/mlc-llm/cpp/llm_chat.cc:611
7: mlc::llm::LLMChat::ForwardTokens(std::vector<int, std::allocator<int> >, long)
at /workspace/mlc-llm/cpp/llm_chat.cc:836
6: tvm::runtime::relax_vm::VirtualMachineImpl::InvokeClosurePacked(tvm::runtime::ObjectRef const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
5: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::relax_vm::VirtualMachineImpl::GetClosureInternal(tvm::runtime::String const&, bool)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
4: tvm::runtime::relax_vm::VirtualMachineImpl::InvokeBytecode(long, std::vector<tvm::runtime::TVMRetValue, std::allocator<tvm::runtime::TVMRetValue> > const&)
3: tvm::runtime::relax_vm::VirtualMachineImpl::RunLoop()
2: tvm::runtime::relax_vm::VirtualMachineImpl::RunInstrCall(tvm::runtime::relax_vm::VMFrame*, tvm::runtime::relax_vm::Instruction)
1: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::WrapPackedFunc(int (*)(TVMValue*, int*, int, TVMValue*, int*, void*), tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
0: _ZN3tvm7runtime6deta
4: TVMFuncCall
3: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::detail::PackFuncPackedArg_<0, tvm::runtime::ROCMWrappedFunc>(tvm::runtime::ROCMWrappedFunc, std::vector<tvm::runtime::detail::ArgConvertCode, std::allocator<tvm::runtime::detail::ArgConvertCode> > const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
2: tvm::runtime::ROCMWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void*, unsigned long) const [clone .isra.0]
1: tvm::runtime::ROCMModuleNode::GetFunc(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
0: _ZN3tvm7runtime6deta
File "/workspace/tvm/src/runtime/rocm/rocm_module.cc", line 105
File "/workspace/tvm/src/runtime/library_module.cc", line 87
TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: shared object initialization failed
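For reference, the amended sample_mlc_chat.py boils down to roughly the following. The ChatModule and generate calls match the line numbers in the traceback; the prompt text and the StreamToStdout callback are assumptions carried over from the stock sample in the docs:
```python
from mlc_chat import ChatModule
from mlc_chat.callback import StreamToStdout

# Amended to point at the downloaded 13B prebuilt (paths shown in the log above)
cm = ChatModule(model="Llama-2-13b-chat-hf-q4f16_1")

# This generate() call is where _prefill hits the hipModuleLoadData failure
output = cm.generate(
    prompt="What is the meaning of life?",  # placeholder prompt
    progress_callback=StreamToStdout(callback_interval=2),
)
```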
Expected behavior
Expecting it to work as well as mlc_chat_cli
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): ROCm
- Operating system (e.g. Ubuntu/Windows/MacOS/…): Arch Linux
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, …): Radeon 6800 XT
- How you installed MLC-LLM (conda, source): conda
- How you installed TVM-Unity (pip, source): pip
- Python version (e.g. 3.10): 3.11.4
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
- Any other relevant information: using pyenv / venv
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 21 (8 by maintainers)
I tried to use -mcpu=gfx1030, and it turned out that the compiled lib is not runnable on the 7900 XTX, reporting the same ROCM HIP Error: hipModuleLoadData(...) error.

@MasterJH5574 could you share a quick command line/process you used to re-compile the TVM module with -mcpu=gfx1030? I'm about to do that myself so I can run MLC models on my 6800 XT. Would love a pointer or two if you were able to re-compile the module successfully.
Thanks in advance!
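For context (this is not the exact command used for the prebuilt libs, which isn't shown in this thread): TVM selects the ROCm GPU architecture through the target string's -mcpu attribute, so a rebuild for RDNA 2 would use a target along these lines, assuming a TVM Unity build with ROCm enabled:
```python
import tvm

# Sketch only: the prebuilt ROCm libs appear to target gfx1100 (RDNA 3), while the
# RX 6800 XT is gfx1030 and the RX 6600 XT is gfx1032 (RDNA 2). TVM picks the ROCm
# code-object architecture from the target's -mcpu attribute when compiling a model lib.
target = tvm.target.Target("rocm -mcpu=gfx1030")
print(target)

# Sanity-check that the ROCm runtime is visible to TVM before attempting a rebuild.
print("rocm device present:", tvm.rocm(0).exist)
```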
yes that’s right.
Ok, I was able to reproduce this issue. I think this happens because Llama-2-7b-chat-hf-q4f16_1-rocm.so, which comes from https://github.com/mlc-ai/binary-mlc-llm-libs.git, was built for RDNA 3 (gfx1100). So obviously it only works for cards from that generation.

It does work on the RX 6000 series, although they have never been officially supported. Like others said, it is likely a TVM-specific issue (e.g. we might need a bitcode update).
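As a quick sanity check, the architecture the local ROCm runtime reports can be queried from Python, assuming TVM Unity was built with ROCm support (attribute availability can vary slightly between TVM builds):
```python
import tvm

# Query the device that hipModuleLoadData would load code objects for.
dev = tvm.rocm(0)
if dev.exist:
    print("device name    :", dev.device_name)      # marketing name of the GPU
    print("compute version:", dev.compute_version)  # gfx architecture the runtime reports
else:
    print("No ROCm device visible to TVM")
```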
I can try to reproduce this on my rx 6600xt.
I tried to follow the steps on another machine with a 7900 XTX, and unfortunately was not able to reproduce the issue 😦
I noticed that the devices above are all not from the latest generation, and I am not sure if this is the reason behind it. I don't have an available device to test right now.
On the ROCm installation, I tried both of the installation methods listed in https://docs.amd.com/en/docs-5.6.0/deploy/linux/installer/install.html, and both of them work on my side.
Same error with both sample_mlc_chat.py and mlc_chat_cli; Vulkan works well though.
Debian 13, x86_64, kernel 6.4, ROCm 5.6, GPU: RX 6800 XT, CPU: AMD 5950X