server: ONNX Runtime build error for 23.01
Description
- I’m trying to build the Triton container as follows, where RELEASE_TAG=r23.01:
./build.py --enable-logging --enable-stats --enable-tracing --enable-metrics \
    --enable-gpu-metrics --enable-gpu --no-container-interactive \
    --endpoint=http --endpoint=grpc --endpoint=sagemaker \
    --repo-tag=common:$RELEASE_TAG --repo-tag=core:$RELEASE_TAG \
    --repo-tag=backend:$RELEASE_TAG --repo-tag=thirdparty:$RELEASE_TAG \
    --backend=ensemble:$RELEASE_TAG --backend=tensorrt:$RELEASE_TAG \
    --backend=identity:$RELEASE_TAG --backend=repeat:$RELEASE_TAG \
    --backend=square:$RELEASE_TAG --backend=onnxruntime:$RELEASE_TAG \
    --backend=pytorch:$RELEASE_TAG --backend=tensorflow1:$RELEASE_TAG \
    --backend=tensorflow2:$RELEASE_TAG --backend=python:$RELEASE_TAG \
    --backend=dali:$RELEASE_TAG --backend=fil:$RELEASE_TAG \
    --backend=fastertransformer:main --repoagent=checksum:$RELEASE_TAG
Error detected:
[ 70%] Building CXX object CMakeFiles/onnxruntime_providers.dir/workspace/onnxruntime/onnxruntime/core/providers/cpu/math/cumsum.cc.o
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(566): error: "cuda" is ambiguous
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(566): error: too many arguments for alias template "cuda::std::__4::_BoolConstant"
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(568): error: expected a ";"
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(575): error: "cuda" is ambiguous
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(575): error: too many arguments for alias template "cuda::std::__4::_BoolConstant"
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(577): error: expected a ";"
/usr/local/cuda/targets/x86_64-linux/include/cuda/std/detail/libcxx/include/type_traits(1064): error: name followed by "::" must be a class or namespace name
...
100 errors detected in the compilation of "/workspace/onnxruntime/onnxruntime/contrib_ops/cuda/bert/attention_impl.cu".
Compilation terminated.
make[2]: *** [CMakeFiles/onnxruntime_providers_cuda.dir/build.make:2990: CMakeFiles/onnxruntime_providers_cuda.dir/workspace/onnxruntime/onnxruntime/contrib_ops/cuda/bert/attention_impl.cu.o] Error 255
make[2]: *** Waiting for unfinished jobs....
Triton Information
What version of Triton are you using? -> v23.01
Are you using the Triton container or did you build it yourself? -> Building it myself, running into the error above.
To Reproduce
Run the build.py command above with RELEASE_TAG=r23.01.
Expected behavior
The container builds successfully. Can someone help fix this issue? Thank you.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 28 (27 by maintainers)
@tanmayv25 @jbkyang-nvi Thanks for the help with this issue! Some snippet of what worked eventually:
I was able to build the container with FT backend *.so from the 22.12 version into the 23.01 container.
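A minimal sketch of that workaround as a multi-stage Dockerfile. The image tags follow NGC naming, but the backend path is an assumption (Triton backends conventionally live under /opt/tritonserver/backends/<name>/) and is not verified against the actual 22.12 layout:

```dockerfile
# Hypothetical sketch: reuse the 22.12 fastertransformer backend
# shared library inside the 23.01 container.
FROM nvcr.io/nvidia/tritonserver:22.12-py3 AS ft_source

FROM nvcr.io/nvidia/tritonserver:23.01-py3
# Assumed path: Triton backends normally live under
# /opt/tritonserver/backends/<backend_name>/
COPY --from=ft_source /opt/tritonserver/backends/fastertransformer \
     /opt/tritonserver/backends/fastertransformer
```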
Closing this ticket.
I was able to reproduce the error Nikhil is running into. I forgot to include fastertransformer in the build.py command before.
The error is occurring because CUBLASLT_MATMUL_PREF_EPILOGUE_MASK was removed in CUDA 12, but FasterTransformer is still using that API. @nskool Let me get back to you with a working solution.