pbrt-v4: Invalid PTX input on RTX 2080 Ti on Windows

Running the latest version of pbrt fails when creating the OptiX modules from PTX, with the error “Invalid PTX input”; the full error output is included further down.

I am a bit surprised by the following warning:

ptx2llvm-module-001, line 9; warning : Unsupported .version 7.1; current version is '6.4'

given that OptiX 7.2.0 does support CUDA 11.1, and the installed driver (456.80) is newer than the minimum required by OptiX 7.2.0 (456.71).
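
To confirm which PTX ISA version the generated PTX actually declares, one option is to scan the PTX text for its `.version` directive before handing it to OptiX. The following is only a minimal sketch in C++17: `ParsePtxVersion` is a hypothetical helper, not part of pbrt or OptiX, and it assumes the directive appears on its own line as `.version <major>.<minor>`.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper (not part of pbrt or OptiX): returns the {major, minor}
// PTX ISA version declared by the first `.version` directive, or nullopt if
// no well-formed directive is found.
std::optional<std::pair<int, int>> ParsePtxVersion(const std::string &ptx) {
    std::istringstream lines(ptx);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string directive;
        // Skip blank lines, comments, and any other directive.
        if (!(fields >> directive) || directive != ".version")
            continue;
        int major = 0, minor = 0;
        char dot = 0;
        // Expect "<major>.<minor>" right after the directive name.
        if (fields >> major >> dot >> minor && dot == '.')
            return std::make_pair(major, minor);
        return std::nullopt;  // directive present but malformed
    }
    return std::nullopt;
}
```

Calling it on the string that is passed to optixModuleCreateFromPTX (ptxCode in the failing call) would show whether the embedded PTX really declares version 7.1, as the warning suggests.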

  • PBRT version: 703953d1e0a18758a3a982a1d36d79050d16e536 + the commit from #68
  • OS: Windows 10 Pro, version 2004, build 19041.572
  • GPU: NVIDIA GeForce RTX 2080 Ti
  • NVIDIA driver version: 456.80
  • Visual Studio: 2019, 16.7.6
  • CUDA: 11.1; same issue when trying with 11.0
  • OptiX: 7.2.0
  • Using the Visual Studio built-in CMake support

[ 2948.000 20201022.233225 D:/Softwares/pbrt-v4/src/pbrt/gpu/accel.cpp:618 ] FATAL OptiX call optixModuleCreateFromPTX(optixContext, &moduleCompileOptions, &pipelineCompileOptions, ptxCode.c_str(), ptxCode.size(), log, &logSize, &optixModule) failed with code 7200: "Invalid PTX input"
COMPILE ERROR: Invalid PTX input: ptx2llvm-module-001: error: Failed to translate PTX input to LLVM
ptx2llvm-module-001, line 9; warning : Unsupported .version 7.1; current version is '6.4'
Call parameter type does not match function signature!
  %94 = load [8 x i8]* %param5, !dbg !2003
 [0 x i8]  %95 = call i32 %88(i32 %89, i64 %90, i32 %91, i64 %92, i64 %93, [8 x i8] %94), !dbg !2003
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Call parameter type does not match function signature!
  %45 = load [8 x i8]* %param5, !dbg !2001
 [0 x i8]  %46 = call i32 %39(i32 %40, i64 %41, i32 %42, i64 %43, i64 %44, [8 x i8] %45), !dbg !2001
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Call parameter type does not match function signature!
  %45 = load [8 x i8]* %param5, !dbg !2001
 [0 x i8]  %46 = call i32 %39(i32 %40, i64 %41, i32 %42, i64 %43, i64 %44, [8 x i8] %45), !dbg !2001
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, compilation terminated.
Broken module found, co
(D:\Softwares\pbrt-v4\src\pbrt\util\check.cpp)  0x00007FF744389330 - pbrt::PrintStackTrace + line 120
(D:\Softwares\pbrt-v4\src\pbrt\util\check.cpp)  0x00007FF7443896F0 - pbrt::CheckCallbackScope::Fail + line 148
(D:\Softwares\pbrt-v4\src\pbrt\util\log.cpp)    0x00007FF743EDACD0 - pbrt::LogFatal + line 177
(D:\Softwares\pbrt-v4\src\pbrt\util\log.h)      0x00007FF743E91A60 - pbrt::LogFatal<int,char const *,char (&)[4096]> + line 112
(D:\Softwares\pbrt-v4\src\pbrt\gpu\accel.cpp)   0x00007FF74447F750 - pbrt::GPUAccel::GPUAccel + line 616
(D:\Softwares\pbrt-v4\src\pbrt\gpu\pathintegrator.cpp)  0x00007FF743F95D50 - pbrt::GPUPathIntegrator::GPUPathIntegrator + line 159
(D:\Softwares\pbrt-v4\src\pbrt\gpu\pathintegrator.cpp)  0x00007FF743F957A0 - pbrt::GPURender + line 570
(D:\Softwares\pbrt-v4\src\pbrt\cmd\pbrt.cpp)    0x00007FF743E8FA70 - main + line 237
(D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl)      0x00007FF7448701E0 - invoke_main + line 79
(D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl)      0x00007FF74486FF90 - __scrt_common_main_seh + line 288
(D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl)      0x00007FF74486FF70 - __scrt_common_main + line 331
(D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp) 0x00007FF7448702A0 - mainCRTStartup + line 17
(unknown                                 )      0x00007FFD3D377020 - BaseThreadInitThunk
(unknown                                 )      0x00007FFD3E2BCEA0 - RtlUserThreadStart

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

Okay, with the diff I posted above, OptiX now seems to be happy with the PTX in a debug build, but I then run into an illegal memory access in CUDA in GPUPathIntegrator::GenerateCameraRays<pbrt::HaltonSampler>(). I will keep that for a separate issue.

The good news is that removing debugging symbols from the PTX generated for the OptiX kernels seems to fix the problem. The bad news is that I haven’t been able to figure out how to do that for just the PTX build, so the diff below fixes it, but at the cost of not having debugging symbols in anything else built with the CUDA compiler, which is undesirable…

Heads up: there also seems to be a lingering build dependency issue around this. For example, if I run touch gpu/optix.cu, it seems to rebuild only the .ptx file, without also running bin2c, etc., and updating the pbrt executable…

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e7bd6c7..ca59d50 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -174,7 +174,7 @@ if (CMAKE_CUDA_COMPILER)
 
         set (CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --std=c++17")
         if (CMAKE_BUILD_TYPE MATCHES Debug)
-          set (CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --use_fast_math -G -g")
+          set (CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --use_fast_math")
         else()
           set (CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --use_fast_math -lineinfo -maxrregcount 128")
         endif ()
@@ -213,6 +213,8 @@ if (CMAKE_CUDA_COMPILER)
                 # disable "extern declaration... is treated as a static definition" warning
                 -Xcudafe=--display_error_number -Xcudafe=--diag_suppress=3089
                 )
+          set(CMAKE_CUDA_FLAGS_DEBUG "")
+
           # CUDA integration in Visual Studio seems broken as even if "Use
           # Host Preprocessor Definitions" is checked, the host preprocessor
           # definitions are still not used when compiling device code.