TensorRT-LLM: running tritonserver fails for chatglm2
For the chatglm2 model, I built the engine and ran run.py; everything worked fine.
Then I built the docker image and launched tritonserver following
https://github.com/triton-inference-server/tensorrtllm_backend#launch-triton-server-within-ngc-container
and got this error:
+----------------+---------+--------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+----------------+---------+--------------------------------------------------------------------------------------------------------------------+
| postprocessing | 1 | READY |
| preprocessing | 1 | READY |
| tensorrt_llm | 1 | UNAVAILABLE: Internal: unexpected error when creating modelInstanceState: [TensorRT-LLM][ERROR] Assertion failed: |
| | | mpiSize == tp * pp (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/worldConfig.cpp:80) |
| | | 1 0x7f923a86a645 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x17645) [0x7f923a86a645] |
| | | 2 0x7f923a87748d /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x2448d) [0x7f923a87748d] |
| | | 3 0x7f923a8a9722 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x56722) [0x7f923a8a9722] |
| | | 4 0x7f923a8a4335 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x51335) [0x7f923a8a4335] |
| | | 5 0x7f923a8a221b /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x4f21b) [0x7f923a8a221b] |
| | | 6 0x7f923a885ec2 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x32ec2) [0x7f923a885ec2] |
| | | 7 0x7f923a885f75 TRITONBACKEND_ModelInstanceInitialize + 101 |
| | | 8 0x7f93641a4116 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1a0116) [0x7f93641a4116] |
| | | 9 0x7f93641a5356 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1a1356) [0x7f93641a5356] |
| | | 10 0x7f9364189bd5 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x185bd5) [0x7f9364189bd5] |
| | | 11 0x7f936418a216 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x186216) [0x7f936418a216] |
| | | 12 0x7f936419531d /opt/tritonserver/bin/../lib/libtritonserver.so(+0x19131d) [0x7f936419531d] |
| | | 13 0x7f9363807f68 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x99f68) [0x7f9363807f68] |
| | | 14 0x7f9364181adb /opt/tritonserver/bin/../lib/libtritonserver.so(+0x17dadb) [0x7f9364181adb] |
| | | 15 0x7f936418f865 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x18b865) [0x7f936418f865] |
| | | 16 0x7f9364194682 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x190682) [0x7f9364194682] |
| | | 17 0x7f9364277230 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x273230) [0x7f9364277230] |
| | | 18 0x7f936427a923 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x276923) [0x7f936427a923] |
| | | 19 0x7f93643c3e52 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x3bfe52) [0x7f93643c3e52] |
| | | 20 0x7f9363a72253 /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f9363a72253] |
| | | 21 0x7f9363802b43 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f9363802b43] |
| | | 22 0x7f9363893bb4 clone + 68 |
+----------------+---------+--------------------------------------------------------------------------------------------------------------------+
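The failing assertion (`mpiSize == tp * pp` in worldConfig.cpp) says the number of MPI ranks Triton launches must equal tensor-parallelism times pipeline-parallelism of the built engine. A minimal sketch of that relationship (the tp/pp values below are illustrative assumptions, not read from any engine):

```python
# Mirrors the check in tensorrt_llm/runtime/worldConfig.cpp:
# the MPI world size must equal tp * pp for the engine being loaded.
def expected_world_size(tp: int, pp: int) -> int:
    """Number of MPI ranks the server must start for a TP x PP engine."""
    return tp * pp

# An engine built with the defaults (tp=1, pp=1) needs exactly one rank;
# launching the server with any other world size trips the assertion.
assert expected_world_size(1, 1) == 1
# A 2-way tensor-parallel, 2-way pipeline-parallel engine needs 4 ranks.
assert expected_world_size(2, 2) == 4
```

So if the engine was built single-GPU, the server has to be started with a world size of 1 (e.g. via the backend's launch script with `--world_size=1`).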
About this issue
- Original URL
- State: closed
- Created 8 months ago
- Comments: 24
@byshiue Thank you very much. Building with
python3 ./scripts/build_wheel.py --cuda_architectures "75-real"
works fine on an RTX 8000.
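For context: the RTX 8000 is a Turing GPU with compute capability 7.5, which is why "75-real" is the matching value for `--cuda_architectures`. A tiny hypothetical helper (`cuda_arch_flag` is my own name, not part of TensorRT-LLM) showing how the flag is derived from a compute capability:

```python
# Hypothetical helper: build the --cuda_architectures value for build_wheel.py
# from a device's compute capability (major, minor). "-real" restricts the
# build to SASS for that exact architecture, keeping compile time down.
def cuda_arch_flag(major: int, minor: int) -> str:
    return f"{major}{minor}-real"

# RTX 8000 (Turing) is compute capability 7.5 -> "75-real".
assert cuda_arch_flag(7, 5) == "75-real"
# An A100 (Ampere, 8.0) would be "80-real" under the same scheme.
assert cuda_arch_flag(8, 0) == "80-real"
```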