ChatRWKV: 0.5.0 operators.cu fails to compile on compute 6.x
[...]/rwkv/cuda/operators.cu(123): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (__half *, __half)
atomicAdd(&y[k], __float2half(y_local));
^
This is likely because my GPU (a 1060) only supports compute 6.1, while atomicAdd support for __half requires compute 7.0, per https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#atomicadd
It seems like
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 700
/* magic stuff here */
#endif
would be needed to support lower compute versions. I don’t know enough about this to contribute anything more helpful, unfortunately.
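For anyone landing here with the same error, here is a minimal sketch of what such an architecture guard could look like. It is not the code from ChatRWKV: the helper name `atomic_add_half`, the kernel `add_ones`, and the host function `run_atomic_half_demo` are hypothetical names made up for illustration. The assumption is that on devices below compute 7.0 the 16-bit add can be emulated with a 32-bit `atomicCAS` on the aligned word that contains the value.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical helper (not from ChatRWKV): atomically adds val to *address.
__device__ __forceinline__ void atomic_add_half(__half *address, __half val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 700
    // Native __half overload, available from compute capability 7.0.
    atomicAdd(address, val);
#else
    // Older devices: emulate with a 32-bit compare-and-swap on the aligned
    // word that contains the 16-bit value.
    size_t addr = (size_t)address;
    unsigned int *word = (unsigned int *)(addr & ~(size_t)2);
    bool upper = (addr & 2) != 0;  // little-endian: odd elements use the high 16 bits
    unsigned int old = *word, assumed;
    do {
        assumed = old;
        unsigned short bits = upper ? (unsigned short)(assumed >> 16)
                                    : (unsigned short)(assumed & 0xffffu);
        // Add in fp32 so no half-precision arithmetic instructions are needed.
        float sum = __half2float(__ushort_as_half(bits)) + __half2float(val);
        unsigned int res = __half_as_ushort(__float2half(sum));
        unsigned int updated = upper ? ((assumed & 0x0000ffffu) | (res << 16))
                                     : ((assumed & 0xffff0000u) | res);
        old = atomicCAS(word, assumed, updated);
    } while (assumed != old);  // retry if another thread changed the word
#endif
}

// Hypothetical demo kernel: every thread adds 1.0 to y[1].
__global__ void add_ones(__half *y) {
    atomic_add_half(&y[1], __float2half(1.0f));
}

// Hypothetical host driver: launches 256 threads and returns y[1].
// Error checking is omitted to keep the sketch short.
float run_atomic_half_demo() {
    __half *d_y = nullptr;
    cudaMalloc(&d_y, 2 * sizeof(__half));
    cudaMemset(d_y, 0, 2 * sizeof(__half));  // all-zero bits are +0.0 in fp16
    add_ones<<<1, 256>>>(d_y);
    __half h_y[2];
    cudaMemcpy(h_y, d_y, sizeof(h_y), cudaMemcpyDeviceToHost);
    cudaFree(d_y);
    return __half2float(h_y[1]);
}
```

Calling `run_atomic_half_demo()` from a `main` and building with something like `nvcc -arch=sm_61` should exercise the fallback branch. A CAS loop like this tends to be slow under heavy contention, so it is more of a compatibility fallback than a performance fix.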
About this issue
- State: closed
- Created a year ago
- Comments: 15 (10 by maintainers)
let’s go for 1: “Reduce to fp32 and convert back to 16 only on older architectures”
yeah, I am processing in chunks, so VRAM usage is much smaller for longer inputs
fixed 😃 and it’s 10% faster on A100 too
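A rough sketch of what option 1 from the comments ("reduce to fp32 and convert back to 16") might look like. This is not the actual fix that went into operators.cu: the kernel names `accumulate_fp32` and `convert_to_half`, the host function `run_fp32_reduce_demo`, and the toy reduction pattern (`i % k`) are all assumptions made for illustration. The idea is that `atomicAdd` on `float` works on older architectures such as compute 6.1, so partial sums can go into an fp32 scratch buffer and be converted to `__half` in a second pass. For brevity the sketch uses the fp32 path unconditionally instead of only on older architectures.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread adds one fp16 input into an fp32
// accumulator. The float overload of atomicAdd needs no compute 7.0 support.
__global__ void accumulate_fp32(const __half *x, float *acc, int n, int k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&acc[i % k], __half2float(x[i]));
}

// Hypothetical kernel: convert the fp32 sums back to fp16 once, at the end.
__global__ void convert_to_half(const float *acc, __half *y, int k) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < k) y[j] = __float2half(acc[j]);
}

// Hypothetical host driver: sums 1024 copies of 0.5 into 4 outputs and
// returns the first output. Error checking is omitted for brevity.
float run_fp32_reduce_demo() {
    const int n = 1024, k = 4;
    std::vector<__half> h_x(n, __float2half(0.5f));

    __half *d_x = nullptr, *d_y = nullptr;
    float *d_acc = nullptr;
    cudaMalloc(&d_x, n * sizeof(__half));
    cudaMalloc(&d_y, k * sizeof(__half));
    cudaMalloc(&d_acc, k * sizeof(float));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(__half), cudaMemcpyHostToDevice);
    cudaMemset(d_acc, 0, k * sizeof(float));  // all-zero bits are 0.0f

    accumulate_fp32<<<(n + 255) / 256, 256>>>(d_x, d_acc, n, k);
    convert_to_half<<<1, 32>>>(d_acc, d_y, k);

    __half h_y[k];
    cudaMemcpy(h_y, d_y, sizeof(h_y), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    cudaFree(d_acc);
    return __half2float(h_y[0]);
}
```

Each of the 4 outputs receives 256 inputs of 0.5, so `run_fp32_reduce_demo()` should return 128. Accumulating in fp32 also avoids rounding to fp16 after every single add, at the cost of an extra scratch buffer and a conversion pass.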