server: free() invalid pointer
Description
When I shut down Triton Inference Server, this one line is printed:
free(): invalid pointer
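For context on what this message means: glibc prints "free(): invalid pointer" and aborts the process when free() is handed a pointer that malloc() never returned. A minimal demonstration, as a sketch assuming Linux with glibc (the abort is run in a child process so the demo itself survives; the misaligned address is deliberate):

```python
# Demonstrates where the message comes from: glibc detects that the
# pointer passed to free() is not a malloc'd block and aborts.
import subprocess
import sys
import textwrap

child = textwrap.dedent("""
    import ctypes
    libc = ctypes.CDLL(None)                 # the process's own libc
    buf = ctypes.create_string_buffer(64)    # valid, readable memory
    bad = ctypes.addressof(buf) + 1          # misaligned, never malloc'd
    libc.free(ctypes.c_void_p(bad))          # glibc aborts here
""")

proc = subprocess.run([sys.executable, "-c", child],
                      capture_output=True, text=True)
print("exit code:", proc.returncode)   # negative on Linux: killed by SIGABRT
print(proc.stderr.strip())
```

The same detection fires whether the bad free comes from a C extension, as in this issue, or from a deliberate misuse as above.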

Triton Information
What version of Triton are you using? 21.12
Are you using the Triton container or did you build it yourself? Here's the Dockerfile:
FROM nvcr.io/nvidia/tritonserver:21.12-py3
LABEL maintainer="NVIDIA"
LABEL repository="tritonserver"
RUN apt-get update && apt-get -y install swig && apt-get -y install python3-dev && apt-get install -y cmake
RUN pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
RUN pip3 install -v kaldifeat
Here's the model.py:

import kaldifeat


class TritonPythonModel:
    def initialize(self, args):
        pass

    def execute(self, requests):
        pass

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is OPTIONAL. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print('Cleaning up...')
config.pbtxt:

name: "model"
backend: "python"
max_batch_size: 64

input [
  {
    name: "wav"
    data_type: TYPE_FP32
    dims: [ -1 ]
  },
  {
    name: "wav_lens"
    data_type: TYPE_INT32
    dims: [ 1 ]
  }
]

output [
  {
    name: "speech"
    data_type: TYPE_FP16
    dims: [ -1, 80 ]
  },
  {
    name: "speech_lengths"
    data_type: TYPE_INT32
    dims: [ 1 ]
  }
]

dynamic_batching {
  preferred_batch_size: [ 16, 32 ]
}

instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
To Reproduce
- Build docker based on the above dockerfile.
- Run the model_repo with model.py in it.
- Shut down Triton with Ctrl-C.
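The repro steps assume a standard Triton model repository layout; a sketch (the directory and version names here are illustrative, not taken from the report):

```
model_repo/
└── model/
    ├── 1/
    │   └── model.py
    └── config.pbtxt
```

With `model_repo` mounted at `/models` inside the container, the server is started with `tritonserver --model-repository=/models`.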
Expected behavior
No such line should be printed.
I tested on two different machines. Both print this message (I am not sure whether it is an error or a warning); one does not generate a core file, while the other does.
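Whether a core file appears usually depends on the per-process core-size limit rather than on Triton itself, which would explain the difference between the two machines. A general Linux shell sketch (not specific to this setup) for checking and enabling core dumps:

```shell
# Show the current core-file size limit (0 means no core file is written).
ulimit -c

# Allow unlimited core files for this shell and its children,
# then confirm the new limit.
ulimit -c unlimited
ulimit -c
```

Inside a container, the limit can also be set at launch, e.g. with docker's `--ulimit core=-1` option.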
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 23 (12 by maintainers)
The kaldifeat module has many leaks and invalid reads/writes on import. This can be verified with valgrind.
However, we do not see the free() invalid pointer error in this case. Running Triton under valgrind with --trace-children=yes gives more details about the invalid free. The trace demonstrates that the free() invalid pointer originates in pybind11::finalize_interpreter() clean-up. The issue comes up when importing kaldifeat with pybind11. A simple reproducer is described below.
main.cpp:
CMakeLists.txt
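The CMakeLists.txt content is likewise not shown; a plausible minimal version for such an embed reproducer (project and target names are assumptions) would be:

```cmake
cmake_minimum_required(VERSION 3.12)
project(embed_repro CXX)

# Requires a pybind11 installation visible to find_package
# (e.g. installed via pip, with pybind11_DIR pointing at its CMake files).
find_package(pybind11 REQUIRED)

add_executable(main main.cpp)
target_link_libraries(main PRIVATE pybind11::embed)
```

Such a project is typically configured and built with `cmake -B build` followed by `cmake --build build`.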
In the directory with these files, run the following commands:
When running the example we see the below issue:
The backtrace for the invalid free, for example:
As you can see, the free() invalid pointer is raised even when running outside the Triton Python backend. It comes from pybind11::finalize_interpreter() both within Triton and outside it. I have tried the latest pybind11 v2.9.0, and it gives the same issue.
Closing the issue, as it is reproducible outside Triton and is shown to manifest when importing kaldifeat within the pybind11 interpreter.
Yes, it is reproducible.
Hello, I would like to ask a question. I am using the Triton 22.04-py3 Docker image. When the backend is python, the free() problem also occurs when unloading the model. Is it because of kaldifeat?
I just created a GitHub repo to reproduce the core dump issue by changing import kaldifeat to import torch. Please see https://github.com/csukuangfj/memory-leak-example
You can see the output from GitHub Actions at https://github.com/csukuangfj/memory-leak-example/runs/5179267107?check_suite_focus=true
A screenshot of the output is given below:
kaldifeat uses the PyTorch C++ API, and it is the responsibility of PyTorch to manage the memory.
[edited]: So memory issues with kaldifeat should be reproducible by replacing kaldifeat with torch.