pbrt-v4: Windows 10 HEAD build broken if GPU enabled
Background
After checking out the current HEAD (db69a23, as of Jan 17), I get errors in the compilation stage. After looking over the commits and testing the likely candidates, I determined that the failure on my platform was introduced by commit 61933b8, which changed GPUPathIntegrator. Compilation using the preceding commit, 914af26, succeeds.
I’m a CUDA newbie and haven’t used C++ for a long time, so while it looks like this is due to the declaration change to pstd::vector<LightHandle> envLights, I’m not sure of the best way to resolve the error.
Platform
Windows 10, GeForce RTX 2060, NVIDIA driver 460.89, CUDA 11.2, OptiX 7.2 installed, Visual Studio 2019.
Building
When trying a Release build, I get the following errors:
Severity Code Description Project File Line Suppression State
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workqueue.h 61
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workqueue.h 61
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator [subobject]") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workqueue.h 61
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workqueue.h 61
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\build\gpu_workitems_soa.h 127
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\build\gpu_workitems_soa.h 127
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator [subobject]") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\build\gpu_workitems_soa.h 127
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\build\gpu_workitems_soa.h 127
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 435
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 435
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator [subobject]") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 435
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 435
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 411
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 411
Error calling a __host__ function("pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector") from a __device__ function("pbrt::GPUPathIntegrator::~GPUPathIntegrator [subobject]") is not allowed pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 411
Error identifier "pstd::vector< ::pbrt::LightHandle, ::pstd::pmr::polymorphic_allocator< ::pbrt::LightHandle> > ::~vector" is undefined in device code pbrt_lib C:\Users\billk\pbrt-v4\src\pbrt\gpu\workitems.h 411
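For readers hitting the same errors: the log says that the destructor of GPUPathIntegrator is being compiled as device code and has to call the host-only destructor of its pstd::vector member. The following is a minimal sketch of why holding the list through a pointer sidesteps that, assuming the fix takes the form named in the commit list below (envLights as a pointer to a pstd::vector). All names here (LightList, IntegratorByValue, IntegratorByPointer, copySharesList) are hypothetical, std::vector<int> stands in for pstd::vector<LightHandle>, and the code is plain C++, so it shows only the destructor property and not the nvcc diagnostics themselves.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for pstd::vector<LightHandle>. Like that type,
// std::vector has a non-trivial destructor.
using LightList = std::vector<int>;

// By-value member: the implicitly generated destructor of the enclosing
// struct must call ~vector(), so any code that destroys a copy of the
// struct needs ~vector() to be callable there.
struct IntegratorByValue {
    LightList envLights;
};

// Pointer member: destroying the struct (or a copy of it) never runs
// ~vector(); the list is owned and destroyed elsewhere.
struct IntegratorByPointer {
    LightList *envLights;
};

static_assert(!std::is_trivially_destructible<IntegratorByValue>::value,
              "by-value member forces a non-trivial destructor");
static_assert(std::is_trivially_destructible<IntegratorByPointer>::value,
              "pointer member keeps the destructor trivial");

// Mimics a by-value copy of the integrator (for example a lambda capture of
// *this): the copy shares the same list instead of owning one.
inline bool copySharesList(LightList &lights) {
    IntegratorByPointer integrator;
    integrator.envLights = &lights;
    IntegratorByPointer copy = integrator;
    return copy.envLights == &lights && copy.envLights->size() == lights.size();
}
```

The trade-off is that the pointed-to vector must outlive every copy of the integrator, and in a real GPU build it would also need to live in memory the device can read.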
About this issue
- State: closed
- Created 3 years ago
- Comments: 50 (32 by maintainers)
Commits related to this issue
- Make GPUPathIntegrator::envLights a pointer to a pstd::vector. This is @pierremoreau's suggested fix for issue #100. I can confirm everything still works on Linux with this fix, but have not checked... — committed to mmp/pbrt-v4 by mmp 3 years ago
- GPU: don't call Film::PixelBounds() from CPU during rendering This fixes the Windows release build for killeroos-gold. (Most other scenes still crash with Windows+GPU, however.) Issues #48, #72, #8... — committed to mmp/pbrt-v4 by mmp 3 years ago
- Add pstd floor(), ceil(), and copysign(). This fixes many of the Windows+GPU crashes. Issues #96, #100, and more... — committed to mmp/pbrt-v4 by mmp 3 years ago
- Re-enable Windows+GPU builds Relates to issue #100, #96, #89, and #48... — committed to mmp/pbrt-v4 by mmp 3 years ago
- Make GPUPathIntegrator::envLights a pointer to a pstd::vector. This is @pierremoreau's suggested fix for issue #100. I can confirm everything still works on Linux with this fix, but have not checked... — committed to Dolkar/pbrt-v4-myod-integration by mmp 3 years ago
- GPU: don't call Film::PixelBounds() from CPU during rendering This fixes the Windows release build for killeroos-gold. (Most other scenes still crash with Windows+GPU, however.) Issues #48, #72, #8... — committed to Dolkar/pbrt-v4-myod-integration by mmp 3 years ago
- Add pstd floor(), ceil(), and copysign(). This fixes many of the Windows+GPU crashes. Issues #96, #100, and more... — committed to Dolkar/pbrt-v4-myod-integration by mmp 3 years ago
- Re-enable Windows+GPU builds Relates to issue #100, #96, #89, and #48... — committed to Dolkar/pbrt-v4-myod-integration by mmp 3 years ago
Closing this out; whew.
(@DocSavage performance should be much improved from that with the just checked in cb78c6494576059c7ba1e3a48dc73965cc981499.)
Ah, I see: GPURender() explains why *this is captured that way on Windows, makes sense.
Edit: It is all starting to make sense, and why this is an issue on Windows and not Linux!
I can reproduce and I will try to look at this issue over the weekend. It is interesting that it only occurs in Release mode and not Debug.
With the latest commit, everything(*) seems to be working on Windows now.
Embarrassingly, the issue was right there in the compiler warnings: GPU code calling std::copysign (et al.), so 597fc75 was all it took. In my defense, something seems to have changed in that those calls didn’t cause any trouble with earlier versions of CUDA/MSVC. In any case, everything I’ve tried renders successfully now 🎉 . I will leave this issue open for now, pending confirmation from others…
(*) There is a performance issue, filed as a separate issue, #128.
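To illustrate the kind of wrapper the "Add pstd floor(), ceil(), and copysign()" commit describes, here is a minimal sketch, not the actual pbrt code. The namespace sketch and the macro SKETCH_CPU_GPU are hypothetical names; the assumption is that device code should call the CUDA math functions (copysignf, floorf, ceilf) while host code keeps using the std:: versions. When compiled as ordinary C++ the macro expands to nothing and only the std:: branches are used.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical macro: marks functions as callable from both host and device
// when compiled by nvcc, and expands to nothing for a plain C++ compiler.
#if defined(__CUDACC__)
#define SKETCH_CPU_GPU __host__ __device__
#else
#define SKETCH_CPU_GPU
#endif

namespace sketch {

SKETCH_CPU_GPU inline float copysign(float mag, float sign) {
#if defined(__CUDA_ARCH__)
    return ::copysignf(mag, sign);    // device pass: CUDA math function
#else
    return std::copysign(mag, sign);  // host pass: standard library
#endif
}

SKETCH_CPU_GPU inline float floor(float v) {
#if defined(__CUDA_ARCH__)
    return ::floorf(v);
#else
    return std::floor(v);
#endif
}

SKETCH_CPU_GPU inline float ceil(float v) {
#if defined(__CUDA_ARCH__)
    return ::ceilf(v);
#else
    return std::ceil(v);
#endif
}

}  // namespace sketch
```

Call sites shared between CPU and GPU would then use sketch::copysign(x, y) instead of std::copysign(x, y), so no std:: function is referenced from device code.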
@pbrt4bounty I am using this one: https://github.com/mmp/pbrt-v4-scenes/blob/master/killeroos/killeroo-gold.pbrt.
Matt is aware that it does not work for all scenes yet, as noted back in this commit: 5d5a73245dfb49f6da1a30ee2dc3b5f6da1ea770.
This is a shot in the dark, but could you try modifying the bypassSlab() method at line 51 of src/pbrt/util/memory.h to always return true and see if that fixes the release build?
Thanks for digging into this!
As far as I understand it, the whole “width”, “height”, and “depth” stuff in optixLaunch is a convenience for applications. In other words, if you’re tracing one ray for every pixel, then you can launch (width, height, 1) and then it’s easy to associate pixel coordinates with rays in the optix code. pbrt doesn’t have a direct relation between pixels and rays like that (e.g. because some paths get terminated), so it does a 1D launch with the maximum number of rays it may need to trace.
Then, in optix code, pbrt only uses the first dimension for indexing into its buffer of rays. So, launching (640, 480, 1) will end up repeatedly doing the work for the first 640 rays only.
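The indexing behavior described above can be sketched with a small CPU-side simulation. This is not OptiX or pbrt code: LaunchDims, simulateLaunch, and countProcessed are hypothetical names, and the only assumption carried over from the explanation is that the per-thread code indexes the ray buffer by the first launch dimension alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical launch dimensions, mirroring (width, height, depth).
struct LaunchDims {
    unsigned width, height, depth;
};

// Simulates a launch whose per-thread code uses only the x coordinate as the
// index into a ray queue. Returns how many times each queue slot was visited.
inline std::vector<unsigned> simulateLaunch(LaunchDims dims, std::size_t queueSize) {
    std::vector<unsigned> visits(queueSize, 0);
    for (unsigned z = 0; z < dims.depth; ++z)
        for (unsigned y = 0; y < dims.height; ++y)
            for (unsigned x = 0; x < dims.width; ++x) {
                std::size_t index = x;  // y and z are deliberately ignored
                if (index < queueSize)
                    ++visits[index];
            }
    return visits;
}

// Counts the queue slots that were processed at least once.
inline std::size_t countProcessed(const std::vector<unsigned> &visits) {
    std::size_t n = 0;
    for (unsigned v : visits)
        if (v > 0)
            ++n;
    return n;
}
```

With a queue of 640 * 480 entries, simulateLaunch({640 * 480, 1, 1}, queueSize) visits every slot once, whereas simulateLaunch({640, 480, 1}, queueSize) visits only slots 0 through 639, each of them 480 times.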
Now, that said, this is a very interesting clue! There is apparently some issue of insufficient memory allocation / invalid indexing past the end of an allocation. The fact that it runs successfully only accessing the first 640 elements of the various buffers definitely helps narrow down the possibilities. (I remain unsure why it would only happen on windows, but one step at a time…) I will review all of that code with this clue in mind…
Hmm. Those warnings are benign and in fact the build configuration sets a flag to suppress them.
So I wonder why you are seeing them. I wonder if this is a clue that part of the problem here is that on Windows other, more important, build flags also aren’t being passed through.
So I can confirm that this doesn’t happen with CUDA 11.2 on Linux, at least. This is very puzzling though, since it clearly seems to be coming from nvcc, which should behave the same on both…
Unfortunately I’m not going to be able to chase this down myself in the near future, but as always, PRs welcome!