HIPIFY: [HIPIFY] hipify-clang misbehaves in the presence of preprocessor directives
Consider the following CUDA program:
__global__ void axpy_kernel(float a, float* x, float* y) {
    y[threadIdx.x] = a * x[threadIdx.x];
}

void axpy(float a, float* x, float* y) {
    axpy_kernel<<<1, 4>>> (a, x, y);
#ifdef SOME_MACRO
    axpy_kernel<<<1, 4>>> (a, x, y);
#endif
}
The result of hipifying this without passing -DSOME_MACRO is:
#include <hip/hip_runtime.h>

__global__ void axpy_kernel(float a, float* x, float* y) {
    y[hipThreadIdx_x] = a * x[hipThreadIdx_x];
}

void axpy(float a, float* x, float* y) {
    hipLaunchKernelGGL(axpy_kernel, dim3(1), dim3(4), 0, 0, a, x, y);
#ifdef SOME_MACRO
    axpy_kernel<<<1, 4>>> (a, x, y);
#endif
}
Respecting conditional macros isn’t the right thing to do with this sort of mechanised refactoring - what you really want to do is walk the entire tree applying your refactor, regardless of preprocessor conditionals.
This is going to present a relatively nasty obstacle to people with complicated CUDA programs they want to translate…
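To illustrate the behaviour being asked for, here is a toy sketch that rewrites every kernel launch it finds, whether or not the surrounding #ifdef is active. It is only an illustration under loud assumptions: translate_all_launches is a hypothetical helper, it works on raw text with a regular expression, and it only understands the two-argument <<<grid, block>>> form. hipify-clang itself works on Clang's AST, not on regexes.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, for illustration only (not part of hipify-clang).
// Rewrites `kernel<<<grid, block>>>(args);` into the hipLaunchKernelGGL form.
// Because it scans the raw text, launches inside inactive #ifdef blocks
// are rewritten as well, which is the outcome the report asks for.
std::string translate_all_launches(const std::string& source) {
    static const std::regex launch(
        R"re((\w+)\s*<<<\s*([^,>]+?)\s*,\s*([^,>]+?)\s*>>>\s*\(([^;]*)\)\s*;)re");
    // $1 = kernel name, $2 = grid, $3 = block, $4 = kernel arguments.
    // Shared memory size and stream are assumed to be 0, as in the report.
    return std::regex_replace(
        source, launch,
        "hipLaunchKernelGGL($1, dim3($2), dim3($3), 0, 0, $4);");
}
```

Fed the body of axpy from the report, this rewrites both launches, including the one guarded by SOME_MACRO, and leaves the #ifdef/#endif lines untouched.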
About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (9 by maintainers)
Commits related to this issue
- Merge pull request #1372 from emankov/master [HIPIFY][#207][fix] Translate all preprocessor's conditional blocks — committed to ROCm/HIP by emankov 5 years ago
- Don't special-case source locations for calls in macros The source location for a call that's inside a macro body will, by default, point into the macro definition itself. The original logic was caus... — committed to ROCm/HIPIFY by ChrisKitching 7 years ago
Both of the example cases from the original report still fail. As I explained in my pull request and commit messages, my work fixes handling of unconditional macros.
You can see examples of the sorts of situations that ROCm-Developer-Tools/HIP#235 fixed here: https://github.com/ROCm-Developer-Tools/HIP/blob/094b2b9b0503c1e2935863a1d596d1045b71e7e4/tests/hipify-clang/axpy.cu#L6-L12
https://github.com/ROCm-Developer-Tools/HIP/blob/094b2b9b0503c1e2935863a1d596d1045b71e7e4/tests/hipify-clang/axpy.cu#L43-L56
A preprocessor check for the value of __cplusplus being >= 201703L seems like it should work on absolutely every compiler ever.
Preprocessor directives are not included in AST.
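A minimal sketch of why that matters for an AST-based tool: the preprocessor resolves conditionals before the parser runs, so the inactive branch never reaches the AST at all. The example assumes SOME_MACRO is not defined on the command line, and visible_branch is a hypothetical name used only here.

```cpp
#include <string>

// Assumes SOME_MACRO is NOT defined, mirroring the original report.
// The #ifdef branch is discarded by the preprocessor, so the parser
// (and any tool that walks the resulting AST) only sees the #else branch.
std::string visible_branch() {
#ifdef SOME_MACRO
    this text is not valid C++ but it is skipped before parsing
    return "conditional";
#else
    return "unconditional";
#endif
}
```

The skipped branch is not even checked for syntax, which is why a tool that only sees the parsed tree has nothing to translate there unless it handles the conditional blocks separately.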
CUDA_8.0/CUDASamples/common/inc/helper_cuda.h:
CUDA_8.0\include\driver_types.h:
CUDA_8.0/CUDASamples/6_Advanced/concurrentKernels/concurrentKernels.cu: