harvester: [BUG] Unable to enable passthrough
Describe the bug
I am unable to set the desired PCI devices as allocatable.
To Reproduce
Steps to reproduce the behavior:
- Select the PCI/NVIDIA device, including the other devices in the same IOMMU group.
- Enable passthrough.
- The device appears as enabled, but the logs show otherwise. Kernel driver in use: vfio-pci won't be listed under the specific device either.
Expected behavior
Kernel driver in use: vfio-pci is shown under the device, and the device is listed as an Allocatable Device under the host.
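The expected state can also be checked directly on the host, without going through lspci. The following is a minimal sketch, assuming the standard Linux sysfs layout (`/sys/bus/pci/devices/<address>/driver` and `.../iommu_group/devices`); the function names and the `sysfs_root` parameter are illustrative and not part of Harvester.

```python
import os
from typing import List, Optional


def bound_driver(address: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, "bus/pci/devices", address, "driver")
    if not os.path.islink(link):
        return None
    # The driver entry is a symlink whose last component is the driver name.
    return os.path.basename(os.readlink(link))


def iommu_group_members(address: str, sysfs_root: str = "/sys") -> List[str]:
    """List every PCI address that shares an IOMMU group with the device."""
    group_dir = os.path.join(
        sysfs_root, "bus/pci/devices", address, "iommu_group", "devices"
    )
    if not os.path.isdir(group_dir):
        return []  # no IOMMU group, e.g. the IOMMU is not enabled
    return sorted(os.listdir(group_dir))


if __name__ == "__main__":
    # Addresses taken from the lspci output in this report.
    for addr in ("0000:65:00.0", "0000:65:00.1"):
        print(addr, "driver:", bound_driver(addr), "group:", iommu_group_members(addr))
```

If passthrough worked as expected, both functions of the card should report vfio-pci. An empty group list would suggest that the IOMMU is not active on the host.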
Support bundle
supportbundle-enablepassthrough-p5820.zip
Environment
- Harvester ISO version: 1.1.0-rc3
- Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630): Dell Precision 5820
Additional context
time="2022-10-18T15:34:28Z" level=info msg="Attempting to disable passthrough for p5820-000065000"
time="2022-10-18T15:34:28Z" level=info msg="Attempting to disable passthrough for p5820-000065001"
time="2022-10-18T15:34:29Z" level=info msg="Attempting to disable passthrough for p5820-000065001"
time="2022-10-18T15:34:29Z" level=error msg="Error updating status for p5820-000065001: Operation cannot be fulfilled on pcideviceclaims.devices.harvesterhci.io \"p5820-000065001\": StorageError: invalid object, Code: 4, Key: /registry/devices.harvesterhci.io/pcideviceclaims/p5820-000065001, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: c2bb33ed-a876-4d0a-8d5b-c70e9ba207e8, UID in object meta: "
time="2022-10-18T15:34:29Z" level=error msg="error syncing 'p5820-000065001': handler PCIDeviceClaimOnRemove: Operation cannot be fulfilled on pcideviceclaims.devices.harvesterhci.io \"p5820-000065001\": StorageError: invalid object, Code: 4, Key: /registry/devices.harvesterhci.io/pcideviceclaims/p5820-000065001, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: c2bb33ed-a876-4d0a-8d5b-c70e9ba207e8, UID in object meta: , requeuing"
WARNING: error parsing the pci address "10000:00:02.0"
WARNING: failed to get device information for PCI address 10000:00:02.0
WARNING: error parsing the pci address "10000:00:03.0"
WARNING: failed to get device information for PCI address 10000:00:03.0
time="2022-10-18T15:34:48Z" level=info msg="Attempting to enable passthrough for p5820-000065000"
time="2022-10-18T15:34:48Z" level=info msg="Attempting to enable passthrough for p5820-000065001"
time="2022-10-18T15:34:48Z" level=info msg="Binding device p5820-000065000 [10de 1cb1] to vfio-pci"
time="2022-10-18T15:34:48Z" level=info msg="Adding p5820-000065000 to KubeVirt list of permitted devices"
time="2022-10-18T15:34:49Z" level=info msg="Binding device p5820-000065001 [10de 0fb9] to vfio-pci"
time="2022-10-18T15:34:49Z" level=info msg="Adding p5820-000065001 to KubeVirt list of permitted devices"
time="2022-10-18T15:34:49Z" level=info msg="Attempting to enable passthrough for p5820-000065000"
time="2022-10-18T15:34:49Z" level=info msg="Attempting to enable passthrough for p5820-000065001"
time="2022-10-18T15:34:50Z" level=info msg="Binding device p5820-000065000 [10de 1cb1] to vfio-pci"
time="2022-10-18T15:34:50Z" level=info msg="Adding p5820-000065000 to KubeVirt list of permitted devices"
time="2022-10-18T15:34:50Z" level=error msg="Device at address 0000:65:00.1 is not bound to driver snd_hda_intel"
time="2022-10-18T15:34:50Z" level=info msg="Binding device p5820-000065001 [10de 0fb9] to vfio-pci"
time="2022-10-18T15:34:50Z" level=info msg="Adding p5820-000065001 to KubeVirt list of permitted devices"
time="2022-10-18T15:34:50Z" level=error msg="Error updating status for p5820-000065000: Operation cannot be fulfilled on pcideviceclaims.devices.harvesterhci.io \"p5820-000065000\": the object has been modified; please apply your changes to the latest version and try again"
time="2022-10-18T15:34:51Z" level=error msg="Error updating status for p5820-000065001: Operation cannot be fulfilled on pcideviceclaims.devices.harvesterhci.io \"p5820-000065001\": the object has been modified; please apply your changes to the latest version and try again"
^C
p5820:~ # lspci | grep -i nvid
0000:65:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P1000] (rev a1)
0000:65:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
p5820:~ # lspci -nn -s 0000:65:00.0 -v
0000:65:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P1000] [10de:1cb1] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Dell Device [1028:11bc]
Flags: bus master, fast devsel, latency 0, IRQ 11, NUMA node 0
Memory at d7000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at b000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
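The controller log above repeats the same enable and disable attempts for each claim within a few seconds. A rough way to quantify that is to tally the attempts per device. This is only a sketch: the claim-name-to-address mapping (`<host>-<domain><bus><slot><function>`) is inferred from this excerpt, where p5820-000065001 corresponds to 0000:65:00.1, and is not a documented Harvester rule. All function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Extracts the level and msg fields from a logrus-style line.
LINE_RE = re.compile(r'level=(\w+) msg="((?:[^"\\]|\\.)*)"')
ACTION_RE = re.compile(r"Attempting to (enable|disable) passthrough for (\S+)")


def claim_to_address(claim: str) -> str:
    """Map e.g. 'p5820-000065001' to '0000:65:00.1' (inferred naming, an assumption)."""
    _host, digits = claim.rsplit("-", 1)
    if len(digits) != 9:
        raise ValueError(f"unexpected claim name: {claim!r}")
    return f"{digits[:4]}:{digits[4:6]}:{digits[6:8]}.{digits[8]}"


def tally_attempts(lines: Iterable[str]) -> Counter:
    """Count (pci_address, action) pairs across the given log lines."""
    counts: Counter = Counter()
    for line in lines:
        entry = LINE_RE.search(line)
        if entry is None:
            continue  # e.g. the plain WARNING lines without level=/msg=
        action = ACTION_RE.match(entry.group(2))
        if action:
            counts[(claim_to_address(action.group(2)), action.group(1))] += 1
    return counts


# Lines copied from the excerpt above.
SAMPLE = [
    'time="2022-10-18T15:34:48Z" level=info msg="Attempting to enable passthrough for p5820-000065001"',
    'time="2022-10-18T15:34:49Z" level=info msg="Attempting to enable passthrough for p5820-000065001"',
    'WARNING: error parsing the pci address "10000:00:02.0"',
]

if __name__ == "__main__":
    for (address, action), n in sorted(tally_attempts(SAMPLE).items()):
        print(address, action, n)
```

Feeding the full log through `tally_attempts` makes it easier to see which claims are being re-queued rather than settling after one bind.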
About this issue
- State: closed
- Created 2 years ago
- Comments: 16 (10 by maintainers)
The root cause has been identified: harv-install is not executed during the upgrade, and the sed does not apply because the iommu does not exist in v1.0.3 …
Verified fixed on master-b0d883ce-head (11/22) for the following checkpoint. Closing this issue.
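Given that root cause, a quick check on an upgraded node is whether any IOMMU setting reached the running kernel at all. The sketch below assumes that "iommu" in the comment above refers to a kernel command-line parameter such as intel_iommu=on or iommu=pt; the source does not name the exact parameter, so the code simply reports every parameter whose name contains "iommu". The function name is hypothetical.

```python
from typing import Dict


def iommu_params(cmdline: str) -> Dict[str, str]:
    """Return the kernel parameters whose name contains 'iommu'."""
    found: Dict[str, str] = {}
    for token in cmdline.split():
        # Parameters look like name=value; bare flags have an empty value.
        name, _, value = token.partition("=")
        if "iommu" in name:
            found[name] = value
    return found


# Usage on a node (assumes a Linux host where /proc/cmdline exists):
#
#     with open("/proc/cmdline") as fh:
#         print(iommu_params(fh.read()) or "no iommu parameter set")
```

An empty result on a node upgraded from v1.0.3 would be consistent with the explanation above, although it does not prove it.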