ROCm: 4.1.0 version stopped working on Fedora 33
Hello. After today's update from 4.0.0 -> 4.1.0, OpenCL is not working anymore:
❯ rocminfo
ROCk module is loaded
HSA Error: Incompatible kernel and userspace, Vega 20 [Radeon VII] disabled. Upgrade amdgpu.
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 3 3300X 4-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 3 3300X 4-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3800
BDFID: 0
Internal Node ID: 0
Compute Unit: 8
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 8064576(0x7b0e40) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 8064576(0x7b0e40) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*** Done ***
The previous 4.0.0 version worked without any issues.
- OS: Fedora 33
- Kernel: 5.11.8
- GPU: Radeon VII
- Mesa: 20.3.4
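The failure shown in the rocminfo output above (the HSA error line, and only a CPU agent listed) can be checked for mechanically. The following is a minimal sketch, not an official ROCm tool: the function name rocminfo_gpu_usable is hypothetical, and it assumes that a working setup lists at least one agent with "Device Type: GPU".

```shell
#!/bin/sh
# Hypothetical helper: reads rocminfo output on stdin.
# Succeeds only if the "Incompatible kernel and userspace" error is absent
# and at least one agent is reported with Device Type: GPU.
rocminfo_gpu_usable() {
    out="$(cat)"
    if printf '%s\n' "$out" | grep -q "Incompatible kernel and userspace"; then
        return 1
    fi
    printf '%s\n' "$out" | grep -Eq "Device Type:[[:space:]]*GPU"
}

# Usage on a real system (requires rocminfo to be installed):
#   rocminfo 2>&1 | rocminfo_gpu_usable && echo "GPU visible to ROCm"
```

With the output pasted in this report the check fails, since the error line is present and the only agent is the CPU.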
About this issue
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 25 (3 by maintainers)
With Linux kernel 5.12-rc4, the most recent release at the moment, I still get the “upgrade amdgpu” error. It seems no Linux kernel is recent enough yet, i.e. the required amdgpu changes have not been merged upstream?
Personally I prefer using an official Linux kernel with amdgpu, instead of dkms. As such, if the above is correct (?), I’ll have to postpone checking out ROCm 4.1 until an official kernel version is released that includes the required 4.1 amdgpu updates.
@ROCmSupport thank you for the clarification!
Please do put out an announcement when the ROCm-4.1 required patches make it into upstream, enabling RadeonVII to be used again with a stock Linux kernel as before.
@ROCmSupport should the issue maybe be reopened during the investigation?
Thanks @tim77, good to know that 4.0 works perfectly with the 5.11 kernel. That point prompted me to look into this further. Let me gather more information and share an update if there is any. Thank you.
This seems to be fixed in upstream kernel 5.13: https://github.com/RadeonOpenCompute/ROCm/issues/1478#issuecomment-851189122
I was able to confirm that it works with kernel 5.14. I upgraded the kernel from 5.10.60 to 5.14.21 on the identical system and the error went away.
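Going by the reports in this thread (failing on 5.10 to 5.12, reported fixed in upstream 5.13, confirmed working on 5.14.21), a quick check of the running kernel could look like the sketch below. The 5.13 threshold is taken from these comments only, not from official ROCm documentation, and version_at_least is a hypothetical helper that relies on GNU sort's -V option.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is greater than or equal to $2.
# Uses GNU sort -V (version sort); the smaller version sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 5.13 is the threshold reported in this thread.
required="5.13"
# Strip the distribution suffix, e.g. 5.11.8-200.fc33.x86_64 -> 5.11.8
running="$(uname -r | cut -d- -f1)"

if version_at_least "$running" "$required"; then
    echo "kernel $running: at or above $required"
else
    echo "kernel $running: older than $required"
fi
```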
@ROCmSupport
dkms is not even installed.
rocm-opencl package (version 3.6Beta_17_g875c1f8_rocm_rel_4.0_26) works absolutely fine and stable (except Cycles render in Blender, which doesn’t work, but this is a well-known issue).
Full clinfo output: clinfo.txt.
I’ve downgraded from v4.1 -> v4.0.1 and OpenCL works perfectly. Without dkms. Screenshot with hashcat benchmark for testing purposes:
OpenCL with ROCm 4.1 works fine with my RX 570 on kernel 5.11 (Fedora 33) and no DKMS whatsoever. Here is the proof: https://openbenchmarking.org/result/2103244-AS-ROCM41DAR27
@ROCmSupport I have tried ROCm 3.3 and ROCm 4.0 on Linux kernels 5.10, 5.11 and 5.12 and they all work.
OTOH ROCm 4.1 does not work on any of the Linux kernels 5.10, 5.11, 5.12. The error message “Upgrade amdgpu”, as I understand now, is also misleading – as what is required is a downgrade of amdgpu to Linux kernel 5.8.
And, given this regression from ROCm 4.0 to ROCm 4.1, and the misleading error message, the result is closing the bug report…
Could the documentation please specify which minimal Linux kernel version is required for a ROCm 4.1 install without dkms?
@tim77 Did you reboot after installing rock-dkms-4.1?
You can check whether dkms has installed successfully.
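One way to do that check is to look at the output of dkms status. The sketch below is only an illustration: dkms_module_installed is a hypothetical helper, and the module name amdgpu is an assumption about what rock-dkms registers, so adjust it to whatever dkms status actually lists on your system.

```shell
#!/bin/sh
# Hypothetical helper: reads `dkms status` output on stdin and succeeds if
# the named module has at least one entry marked "installed".
# Accepts both "name, version, ..." and "name/version, ..." line formats.
dkms_module_installed() {
    module="$1"
    grep -E "^${module}[/,]" | grep -q ": installed"
}

# Usage on a real system (requires the dkms package):
#   dkms status | dkms_module_installed amdgpu && echo "DKMS module installed"
# To see which module file the kernel would load:
#   modinfo -n amdgpu
```

A module that is listed only as "added" or "built" has not been installed for the running kernel, and a reboot is still needed before a newly installed module is actually in use.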