Radiata: Error when building TensorRT Engine.

Describe the bug

When attempting to build a TensorRT engine, the UNet ONNX export fails with an error indicating that cuDNN is not initialized (CUDNN_STATUS_NOT_INITIALIZED).

Error Log
[I]     Total Nodes | Original:  1015, After Folding:   842 |   173 Nodes Folded
[I] Folding Constants | Pass 3
[I]     Total Nodes | Original:   842, After Folding:   842 |     0 Nodes Folded
[INFO] Exporting model: models/accelerate/tensorrt/runwayml/stable-diffusion-v1-5/onnx/unet.onnx

Warning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if any(s % default_overall_up_factor != 0 for s in sample.shape[-2:]):

Traceback (most recent call last):
  File "/env/lib/python3.10/site-packages/gradio/routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
  File "/env/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/env/lib/python3.10/site-packages/gradio/blocks.py", line 1036, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/env/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/env/lib/python3.10/site-packages/gradio/utils.py", line 488, in async_iteration
    return next(iterator)
  File "/Radiata/modules/tabs/tensorrt.py", line 173, in build_engine
    builder.build()
  File "/Radiata/modules/acceleration/tensorrt/engine.py", line 72, in build
    export_onnx(
  File "/Radiata/lib/tensorrt/utilities.py", line 431, in export_onnx
    torch.onnx.export(
  File "/env/lib/python3.10/site-packages/torch/onnx/utils.py", line 504, in export
    _export(
  File "/env/lib/python3.10/site-packages/torch/onnx/utils.py", line 1529, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/env/lib/python3.10/site-packages/torch/onnx/utils.py", line 1111, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/env/lib/python3.10/site-packages/torch/onnx/utils.py", line 987, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/env/lib/python3.10/site-packages/torch/onnx/utils.py", line 891, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/env/lib/python3.10/site-packages/torch/jit/_trace.py", line 1184, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/env/lib/python3.10/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/env/lib/python3.10/site-packages/torch/jit/_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/env/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 718, in forward
    sample = self.conv_in(sample)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/env/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
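
Since the traceback bottoms out in a plain F.conv2d call, the failure does not look TensorRT-specific. As a minimal sanity check (a sketch independent of Radiata; the 4-to-320-channel shape just mirrors the UNet's conv_in layer), a single convolution on the GPU should raise the same error if cuDNN itself cannot initialize:

    # Minimal cuDNN sanity check, independent of Radiata: if cuDNN cannot
    # initialize, this one convolution raises the same RuntimeError.
    import torch

    x = torch.randn(1, 4, 64, 64, device="cuda")   # arbitrary 4-channel input
    conv = torch.nn.Conv2d(4, 320, kernel_size=3, padding=1).to("cuda")
    y = conv(x)        # dispatches to cuDNN, like UNet's conv_in
    print(y.shape)     # torch.Size([1, 320, 64, 64])

If this snippet fails too, the problem lies in the Torch/cuDNN installation rather than in Radiata's engine builder.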

Reproduction

  1. Run launch.sh with the --share and --tensorrt command-line arguments.
  2. Open the web UI and go to the TensorRT tab.
  3. Click the Build button.
  4. The error occurs while exporting the first model, unet.onnx (a standalone repro sketch follows below).
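
The failing step can also be reproduced outside the web UI. Below is a minimal standalone sketch, not Radiata's actual export code; the input shapes are assumptions based on stable-diffusion-v1-5's UNet (4 latent channels, 77-token text embeddings of width 768):

    # Hypothetical standalone repro; Radiata's export_onnx in
    # lib/tensorrt/utilities.py ultimately makes a similar torch.onnx.export call.
    import torch
    from diffusers import UNet2DConditionModel

    unet = UNet2DConditionModel.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="unet"
    ).to("cuda").eval()

    sample = torch.randn(2, 4, 64, 64, device="cuda")               # latent input
    timestep = torch.tensor([981], device="cuda")                   # arbitrary timestep
    encoder_hidden_states = torch.randn(2, 77, 768, device="cuda")  # CLIP text embeddings

    # Tracing runs UNet.forward, whose first op is self.conv_in(sample) --
    # exactly where the cuDNN error surfaces in the log above.
    torch.onnx.export(unet, (sample, timestep, encoder_hidden_states), "unet.onnx")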

Expected behavior

The UNet should be exported to ONNX without throwing any errors.

System Info

  • Ubuntu 20
  • Python 3.10
  • CUDA 1.13/1.14
  • TensorRT 8.6.0 (auto-installed by launch.py)
  • Torch 1.13.1 (auto-installed by launch.py)
  • Running inside a Jupyter Notebook
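
CUDNN_STATUS_NOT_INITIALIZED is commonly either a Torch/CUDA/cuDNN version mismatch or the GPU running out of memory before cuDNN can allocate its handles (plausible inside a Jupyter Notebook that already holds other tensors). Since "CUDA 1.13/1.14" above reads like a Torch version rather than a CUDA release, a quick sketch to print what the running interpreter actually sees:

    import torch

    print("torch:", torch.__version__)                # e.g. 1.13.1+cu117
    print("built against CUDA:", torch.version.cuda)  # must be compatible with the driver
    print("cuDNN:", torch.backends.cudnn.version())
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        free, total = torch.cuda.mem_get_info()       # bytes of free/total VRAM
        print(f"VRAM free: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")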

Additional context

No response

Validations

  • Read the docs.
  • Check that there isn’t already an issue that reports the same bug to avoid creating a duplicate.

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

I seem to have this issue as well; my CUDA version is 12.1. Does anyone have a fix?