tensorrtllm_backend: "POST v2/repository/models/${MODEL_NAME}/load" failed on 23.10

Description

(error screenshot attached as an image)

Triton Information

What version of Triton are you using?
Triton image: nvcr.io/nvidia/tritonserver:23.10-trtllm-python-py3; tritonserver version: 2.39.0

Are you using the Triton container or did you build it yourself? I’m using the nvcr.io/nvidia/tritonserver:23.10-trtllm-python-py3 Docker image.

To Reproduce

  1. Start the tritonserver container. There is only a mymodel folder under /triton_model_repo, and /triton_model_repo/mymodel contains 1/model.py and config.pbtxt:
tritonserver --model-repository=/triton_model_repo --model-control-mode=explicit --disable-auto-complete-config --backend-config=python,shm-region-prefix-name=prefix{}_
  2. Use Postman to call the REST API to load mymodel.
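The load call in step 2 can also be issued programmatically. A minimal sketch using only the Python standard library, assuming the server's HTTP endpoint is reachable at localhost:8000 (the default):

```python
import urllib.request

def load_model_url(host, model_name):
    # Triton's model-repository load endpoint (used with --model-control-mode=explicit)
    return f"http://{host}/v2/repository/models/{model_name}/load"

def load_model(host, model_name):
    # POST with an empty JSON body; Triton responds with 200 on a successful load
    req = urllib.request.Request(load_model_url(host, model_name),
                                 data=b"{}", method="POST")
    return urllib.request.urlopen(req)

# e.g. load_model("localhost:8000", "mymodel")
```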

mymodel uses the Python backend and is not an ensemble model. The following is the content of /triton_model_repo/mymodel/config.pbtxt:

name: "mymodel"
backend: "python"
max_batch_size: 0

model_transaction_policy {
  decoupled: false
}

input [
  {
    name: "prompt"
    data_type: TYPE_STRING
    dims: [1]
  }
]
output [
  {
    name: "generated_text"
    data_type: TYPE_STRING
    dims: [1]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0, 1 ]
  }
]
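Note that with count: 1 and gpus: [ 0, 1 ], Triton creates one instance on each listed GPU, i.e. two instances in total. To rule out GPU placement while debugging, the group can be switched to CPU (a debugging variant, not the reporter's config):

```
instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
```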

Expected behavior

mymodel can be loaded successfully.

About this issue

  • Original URL
  • State: open
  • Created 7 months ago
  • Comments: 16

Most upvoted comments

If you change the model.py to the Python string identity model, can you still replicate the crash?
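For reference, a minimal string-identity model.py of the kind suggested here might look like the sketch below. It uses the documented pb_utils API (get_input_tensor_by_name, Tensor, InferenceResponse) and the tensor names from the config above; it is an illustration, not the reporter's actual model:

```python
# model.py — minimal string-identity model for Triton's Python backend

try:
    # Available only inside the Triton Python backend runtime
    import triton_python_backend_utils as pb_utils
except ImportError:
    pb_utils = None

def identity(value):
    # Pure passthrough used by execute(); trivially testable outside Triton
    return value

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Echo the "prompt" input back as the "generated_text" output
            in_tensor = pb_utils.get_input_tensor_by_name(request, "prompt")
            out_tensor = pb_utils.Tensor("generated_text",
                                         identity(in_tensor.as_numpy()))
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses
```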

I tried this model running on CPU, and tritonserver didn’t crash after the same steps I did before.

I think the crash happened inside the TRT-LLM backend. You were seeing stack traces pointing back to Triton because the TRT-LLM backend uses Triton internally, and the Triton process that crashed was launched by mpirun; see

mpirun noticed that process rank 0 with PID 0 on node 275475e266c6 exited on signal 11 (Segmentation fault).

in your log.

I will transfer your issue to the TRT-LLM team for them to take a look at it.