cudf: [BUG] Loading `libcudf.so` increases device memory usage by ~300MB
When working with rmm and cudf, we observe lower device memory usage when only rmm is loaded than when the libcudf shared library is also loaded, even though only rmm is used in both cases. Apologies for the Python-only examples:
Example 1, not loading libcudf.so, device memory usage ~335 MiB:
import rmm
buf = rmm.DeviceBuffer(size=5)
del buf
print("Press enter to exit")
input()
Example 2, loading libcudf.so, device memory usage ~651 MiB:
import rmm
import ctypes
libcudf = ctypes.cdll.LoadLibrary("libcudf.so")
buf = rmm.DeviceBuffer(size=5)
del buf
print("Press enter to exit")
input()
Example 3, construct and destruct a buffer, load libcudf.so, then construct and destruct a buffer again; device memory usage is ~335 MiB after the first construct/destruct and ~651 MiB after the second:
import rmm
import ctypes
buf = rmm.DeviceBuffer(size=5)
del buf
print("Press enter to continue")
input()
libcudf = ctypes.cdll.LoadLibrary("libcudf.so")
buf = rmm.DeviceBuffer(size=5)
del buf
print("Press enter to exit")
input()
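The examples above rely on reading device memory usage by hand while the script is paused. A minimal sketch of measuring the same difference programmatically follows. It assumes the `pynvml` package is available; `used_mib` and `memory_delta` are hypothetical helper names, not part of rmm or cudf. Note that NVML reports device-wide usage, so other processes on the same GPU would skew the numbers.

```python
def used_mib(device_index=0):
    """Return device-wide used memory in MiB via NVML.

    Assumes pynvml is installed and an NVIDIA GPU is present.
    """
    import pynvml  # imported lazily so memory_delta can be used without it

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        return pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
    finally:
        pynvml.nvmlShutdown()


def memory_delta(action, read_used=used_mib):
    """Run `action` and return (change in used MiB, action's return value).

    `read_used` is injectable so the helper can be exercised without a GPU.
    """
    before = read_used()
    result = action()
    after = read_used()
    return after - before, result


# Usage on a GPU machine (assumes rmm and libcudf.so as in the examples above):
#
#   import ctypes, rmm
#   rmm.DeviceBuffer(size=5)          # create the CUDA context first
#
#   def load_and_touch():
#       lib = ctypes.cdll.LoadLibrary("libcudf.so")
#       rmm.DeviceBuffer(size=5)      # mirror Example 3: allocate after loading
#       return lib                    # keep a reference so the library stays loaded
#
#   delta, lib = memory_delta(load_and_touch)
#   print(f"device memory grew by {delta:.0f} MiB")
```

Returning the library handle from the action keeps it referenced, so the measurement reflects the state while libcudf is still loaded.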
About this issue
- State: closed
- Created 4 years ago
- Comments: 32 (31 by maintainers)
I think we are prematurely optimizing here. Once we remove legacy APIs and NVStrings/NVCategory, I’d expect the library size to go down significantly.
For anyone wondering why reductions are so expensive, it’s because we have to instantiate N * N * K kernels where N is the number of types we support, and K is the number of reduction operators we support.
Maybe we need to switch reductions over to using Jitify as well.
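To make the N * N * K count from the reductions comment concrete, here is a toy sketch. The type and operator lists are illustrative placeholders, not the sets libcudf actually supports, and it assumes the two factors of N correspond to input and output types. `LazyKernelCache` is a hypothetical stand-in for the idea behind JIT compilation (only combinations that are actually requested get built); it is not Jitify's API.

```python
from itertools import product

# Hypothetical, illustrative lists; not the real sets supported by libcudf.
TYPES = ["int8", "int16", "int32", "int64", "float32", "float64"]
OPS = ["sum", "min", "max", "product"]


def eager_instantiations(types, ops):
    """Every (input type, output type, operator) combination, built ahead of time."""
    return list(product(types, types, ops))


class LazyKernelCache:
    """Toy model of JIT: a combination is recorded only when first requested."""

    def __init__(self):
        self._compiled = set()

    def get(self, in_type, out_type, op):
        key = (in_type, out_type, op)
        self._compiled.add(key)  # a real JIT would compile and cache a kernel here
        return key

    def __len__(self):
        return len(self._compiled)


eager = eager_instantiations(TYPES, OPS)
print(len(eager))  # N * N * K = 6 * 6 * 4 = 144

cache = LazyKernelCache()
cache.get("int32", "int64", "sum")
cache.get("float32", "float64", "sum")
cache.get("int32", "int64", "sum")  # repeated request, nothing new is added
print(len(cache))  # 2
```

With ahead-of-time instantiation the count grows quadratically in the number of types, while the lazy variant grows only with the combinations a workload actually touches.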
Workloads people are pushing these days are saturating available memory (and often OOMing). Every bit counts.