jetson-containers: AssertionError in TVM CUDA Initialization
Hi,
I’m just trying to test out Live LLaVA using the following command:
./run.sh \
-e SSL_KEY=/data/key.pem -e SSL_CERT=/data/cert.pem \
$(./autotag local_llm) \
python3 -m local_llm.agents.video_query --api=mlc --verbose \
--model liuhaotian/llava-v1.5-7b \
--max-new-tokens 32 \
--video-input /dev/video0 \
--video-output display://0 \
--prompt "How many fingers am I holding up?"
However, it throws the following assertion error from the local_llm.agents.video_query module:
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
09:41:03 | DEBUG | Namespace(api='mlc', chat_template=None, debug=True, do_sample=False, log_level='debug', max_new_tokens=32, min_new_tokens=-1, model='liuhaotian/llava-v1.5-7b', prompt=['How many fingers am I holding up?'], quant=None, repetition_penalty=1.0, save_mermaid=None, system_prompt=None, temperature=0.7, top_p=0.95, video_input='v4l2:///dev/video0', video_input_codec=None, video_input_framerate=None, video_input_height=None, video_input_save=None, video_input_width=None, video_output='display://0', video_output_bitrate=None, video_output_codec=None, video_output_save=None, vision_model=None)
09:41:03 | DEBUG | subprocess 694 started
09:41:03 | DEBUG | RUN_PROCESS GIRDI...
09:41:03 | DEBUG | Starting new HTTPS connection (1): huggingface.co:443
09:41:03 | DEBUG | https://huggingface.co:443 "GET /api/models/liuhaotian/llava-v1.5-7b/revision/main HTTP/1.1" 200 2276
Fetching 11 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 71089.90it/s]
09:41:03 | INFO | loading /data/models/huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965 with MLC
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 66, in run_process
    raise error
  File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 63, in run_process
    plugin = factory(**kwargs)
  File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in <lambda>
    self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
  File "/opt/local_llm/local_llm/plugins/chat_query.py", line 63, in __init__
    self.model = LocalLM.from_pretrained(model, **kwargs)
  File "/opt/local_llm/local_llm/local_llm.py", line 72, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/local_llm/local_llm/models/mlc.py", line 58, in __init__
    assert(self.device.exist)  # this is needed to initialize CUDA?
AssertionError

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/local_llm/local_llm/agents/video_query.py", line 115, in <module>
    agent = VideoQuery(**vars(args)).run()
  File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in __init__
    self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
  File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 34, in __init__
    raise RuntimeError(f"subprocess has an invalid initialization status ({init_msg['status']})")
RuntimeError: subprocess has an invalid initialization status (<class 'AssertionError'>)
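As far as I can tell, the failing assert is TVM's CUDA device check. A quick probe along these lines inside the container (assuming the MLC build's tvm package is importable) should show whether TVM's runtime can see the GPU:

# inside the container: check whether TVM's CUDA runtime can initialize the GPU
# (this queries the same dev.exist flag that assert(self.device.exist) in mlc.py checks)
python3 -c 'import tvm; print("TVM sees CUDA device 0:", tvm.cuda(0).exist)'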
What could be causing this error? Thanks.
@leon-seidel @doruksonmez try changing this line to the following:
https://github.com/dusty-nv/jetson-containers/blob/2d6187b00eaad34a4a51bf1e088baf4a600faa09/packages/llm/local_llm/agents/video_query.py#L22
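(The replacement line itself isn't shown in this excerpt; presumably the idea is to construct ChatQuery directly instead of wrapping it in ProcessProxy, so the model loads in the main process. An assumed sketch of that edit, not the verbatim suggestion:)

# hypothetical edit to video_query.py line 22 -- build ChatQuery in-process, bypassing ProcessProxy
self.llm = ChatQuery(model, drop_inputs=True, **kwargs)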
And when you start the container, mount your local copy of the code into the container like so:
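(The exact mount command isn't captured here; a sketch of the usual pattern, with the host-side checkout path assumed:)

# bind-mount the host copy of local_llm over the packaged copy inside the container
./run.sh \
  -v /path/to/jetson-containers/packages/llm/local_llm:/opt/local_llm/local_llm \
  $(./autotag local_llm)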
(then any code changes you make to local_llm package will be reflected inside the container without needing to rebuild it)
I’m having the same issue (cannot see the video via WebRTC or X). However, I was able to work around it using video-viewer.

Container video output:
--video-output rtsp://@:1234/output

On the host:
video-viewer.py rtsp://localhost:1234/output display://0
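Put together, a sketch of that workaround against the original command (RTSP port, model, and input device assumed to match the post above):

# inside the container: publish the annotated video over RTSP instead of a local display
./run.sh $(./autotag local_llm) \
  python3 -m local_llm.agents.video_query --api=mlc \
    --model liuhaotian/llava-v1.5-7b \
    --video-input /dev/video0 \
    --video-output rtsp://@:1234/output

# on the host: view the stream
video-viewer.py rtsp://localhost:1234/output display://0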