tensorrtllm_backend: "POST v2/repository/models/${MODEL_NAME}/load" failed on 23.10
Description
Triton Information
What version of Triton are you using?
Triton image: nvcr.io/nvidia/tritonserver:23.10-trtllm-python-py3
tritonserver version: 2.39.0
Are you using the Triton container or did you build it yourself?
I’m using the nvcr.io/nvidia/tritonserver:23.10-trtllm-python-py3 Docker image.
To Reproduce
- Start the tritonserver container. There is only a mymodel folder under /triton_model_repo, and 1/model.py and config.pbtxt under /triton_model_repo/mymodel. The server is launched with:
tritonserver --model-repository=/triton_model_repo --model-control-mode=explicit --disable-auto-complete-config --backend-config=python,shm-region-prefix-name=prefix{}_
- Use Postman to call the REST API to load mymodel (an equivalent request in Python is sketched below).
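For reference, the request Postman sends is a plain POST to the model-repository load endpoint; a minimal sketch in Python, assuming Triton's HTTP endpoint is on the default port 8000:

```python
import requests

# Explicitly load mymodel via Triton's model repository extension
# (the server was started with --model-control-mode=explicit).
resp = requests.post("http://localhost:8000/v2/repository/models/mymodel/load")
print(resp.status_code, resp.text)  # expect 200 with an empty body on success
```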
mymodel uses the Python backend; it is not an ensemble model. The following is the content of /triton_model_repo/mymodel/config.pbtxt:
name: "mymodel"
backend: "python"
max_batch_size: 0
model_transaction_policy {
decoupled: False
}
input [
{
name: "prompt"
data_type: TYPE_STRING
dims: [1]
}
]
output [
{
name: "generated_text"
data_type: TYPE_STRING
dims: [1]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
gpus: [ 0, 1 ]
}
]
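The contents of 1/model.py aren't shown in the issue, but for this config a Python-backend model implements the TritonPythonModel interface. A minimal sketch, with purely illustrative echo logic standing in for the real generation code:

```python
import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Minimal Python-backend model matching the config above (illustrative)."""

    def initialize(self, args):
        # Look up the configured datatype of the output tensor.
        model_config = json.loads(args["model_config"])
        output_config = pb_utils.get_output_config_by_name(
            model_config, "generated_text")
        self.output_dtype = pb_utils.triton_string_to_numpy(
            output_config["data_type"])

    def execute(self, requests):
        responses = []
        for request in requests:
            prompt = pb_utils.get_input_tensor_by_name(request, "prompt")
            text = prompt.as_numpy()[0]
            if isinstance(text, bytes):
                text = text.decode("utf-8")
            # Placeholder for the actual generation logic: echo the prompt.
            out = np.array([f"echo: {text}".encode("utf-8")],
                           dtype=self.output_dtype)
            responses.append(pb_utils.InferenceResponse(
                output_tensors=[pb_utils.Tensor("generated_text", out)]))
        return responses
```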
Expected behavior
mymodel can be successfully loaded.
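For what it's worth, a successful load could be verified like this (again assuming the default HTTP port 8000):

```python
import requests

# The model should report ready once loaded...
ready = requests.get("http://localhost:8000/v2/models/mymodel/ready")
print(ready.status_code)  # 200 when the model is ready

# ...and appear with state READY in the repository index.
index = requests.post("http://localhost:8000/v2/repository/index")
print(index.json())
```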
I think the crash happened inside the TRT-LLM backend. You were seeing traces back to Triton because the TRT-LLM backend uses Triton internally, and the Triton instance that crashed was launched by mpirun, as you can see in your log.
I will transfer your issue to the TRT-LLM team so they can take a look at it.