KubeVirt: VM can't restart after addvolume
What happened:
- I start a CentOS VM.
- I add a blank DataVolume to the VM via `/apis/subresources.kubevirt.io/v1/namespaces/{namespace:[a-z0-9]}/virtualmachines/{name:[a-z0-9][a-z0-9-]}/addvolume`.
- The DataVolume hotplug succeeds and the guest OS console detects the new disk.
- I request `/apis/subresources.kubevirt.io/v1/namespaces/{namespace:[a-z0-9]}/virtualmachines/{name:[a-z0-9][a-z0-9-]}/restart`.
- The VM never reaches Running and stays in the Starting phase.
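For reference, the addvolume subresource takes an AddVolumeOptions body. A minimal sketch matching the setup in this report (field names from the KubeVirt v1 API; the volume and disk names are illustrative):

```json
{
  "name": "disk-add-1",
  "disk": {
    "name": "disk-add-1",
    "disk": { "bus": "scsi" }
  },
  "volumeSource": {
    "dataVolume": { "name": "blank" }
  }
}
```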
Additional context:
- Running VM spec:

  spec:
    dataVolumeTemplates:
    - metadata:
        name: dv-disk-1
      spec:
        pvc:
          accessModes:
          - ReadWriteMany
          resources:
            requests:
              storage: 30Gi
          storageClassName: xxxxx
          volumeMode: Block
        source:
          blank: {}
    running: true
    template:
      metadata: {}
      spec:
        domain:
          cpu:
            cores: 1
            model: host-model
            sockets: 1
            threads: 1
          devices:
            disks:
            - bootOrder: 1
              cache: none
              disk:
                bus: virtio
              name: disk-1
            - bootOrder: 2
              cdrom:
                bus: sata
              name: cdrom-1
            interfaces:
            - masquerade: {}
              name: default
          firmware:
            bootloader:
              bios:
                useSerial: true
          machine:
            type: q35
          memory:
            guest: 2Gi
          resources:
            limits:
              cpu: "1"
              memory: 2Gi
            requests:
              cpu: "1"
              memory: 2Gi
        networks:
        - name: default
          pod: {}
        volumes:
        - dataVolume:
            name: dv-disk-1
          name: disk-1
        - dataVolume:
            name: centos
          name: cdrom-1
- Blank DataVolume manifest:

  apiVersion: cdi.kubevirt.io/v1beta1
  kind: DataVolume
  metadata:
    name: blank
  spec:
    source:
      blank: {}
    pvc:
      storageClassName: xxxxxxx
      accessModes:
      - ReadWriteOnce
      volumeMode: Block
      resources:
        requests:
          storage: 50Gi
- VMI spec after addvolume (devices and volumes):

  devices:
    disks:
    - bootOrder: 1
      cache: none
      disk:
        bus: virtio
      name: disk-1
    - bootOrder: 2
      cdrom:
        bus: sata
      name: cdrom-1
    - cache: none
      disk:
        bus: scsi
      name: disk-add-1

  ----------------------

  volumes:
  - dataVolume:
      name: dv-disk-1
    name: disk-1
  - dataVolume:
      name: centos
    name: cdrom-1
  - dataVolume:
      hotpluggable: true
      name: blank
    name: disk-add-1
- After restarting the VM, the virt-launcher pod logs these errors:
{"component":"virt-launcher","level":"info","msg":"Marked as ready","pos":"virt-launcher.go:74","timestamp":"2022-10-29T09:07:03.499371Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Executing PreStartHook on VMI pod environment","name":"centos8","namespace":"default","pos":"manager.go:513","timestamp":"2022-10-29T09:07:17.590819Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Starting PreCloudInitIso hook","name":"centos8","namespace":"default","pos":"manager.go:534","timestamp":"2022-10-29T09:07:17.590870Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","level":"info","msg":"Found nameservers in /etc/resolv.conf: \ufffd\u001e\u0000\n","pos":"network.go:289","timestamp":"2022-10-29T09:07:17.591873Z"}
{"component":"virt-launcher","level":"info","msg":"Found search domains in /etc/resolv.conf: default.svc.cluster.local svc.cluster.local cluster.local","pos":"network.go:290","timestamp":"2022-10-29T09:07:17.591903Z"}
{"component":"virt-launcher","level":"info","msg":"Starting SingleClientDHCPServer","pos":"server.go:65","timestamp":"2022-10-29T09:07:17.591970Z"}
{"component":"virt-launcher","level":"error","msg":"Direct IO check failed for /dev/disk-1","pos":"converter.go:417","reason":"open /dev/disk-1: operation not permitted","timestamp":"2022-10-29T09:07:17.592144Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"pre start setup for VirtualMachineInstance failed.","name":"centos8","namespace":"default","pos":"manager.go:853","reason":"Unable to use 'none' cache mode, file system where /dev/disk-1 is stored does not support direct I/O","timestamp":"2022-10-29T09:07:17.592197Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to sync vmi","name":"centos8","namespace":"default","pos":"server.go:185","reason":"Unable to use 'none' cache mode, file system where /dev/disk-1 is stored does not support direct I/O","timestamp":"2022-10-29T09:07:17.592236Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Executing PreStartHook on VMI pod environment","name":"centos8","namespace":"default","pos":"manager.go:513","timestamp":"2022-10-29T09:07:17.693970Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Starting PreCloudInitIso hook","name":"centos8","namespace":"default","pos":"manager.go:534","timestamp":"2022-10-29T09:07:17.694030Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
- After removing `cache: none` from disk-1 (leaving the cache mode empty), virt-launcher still fails:
{"component":"virt-launcher","kind":"","level":"info","msg":"Domain defined.","name":"centos8","namespace":"default","pos":"manager.go:866","timestamp":"2022-10-31T07:16:07.172539Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 0 with reason 0 received","pos":"client.go:436","timestamp":"2022-10-31T07:16:07.172672Z"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Unknown(0)","pos":"client.go:289","timestamp":"2022-10-31T07:16:07.174244Z"}
{"component":"virt-launcher","level":"info","msg":"Successfully connected to domain notify socket at /var/run/kubevirt/domain-notify-pipe.sock","pos":"client.go:168","timestamp":"2022-10-31T07:16:07.176317Z"}
{"component":"virt-launcher","level":"error","msg":"Cannot access storage file '/dev/disk-1': Operation not permitted","pos":"virStorageSourceReportBrokenChain:1248","subcomponent":"libvirt","thread":"31","timestamp":"2022-10-31T07:16:07.176000Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to start VirtualMachineInstance with flags 0.","name":"centos8","namespace":"default","pos":"manager.go:891","reason":"virError(Code=38, Domain=18, Message='Cannot access storage file '/dev/disk-1': Operation not permitted')","timestamp":"2022-10-31T07:16:07.177438Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to sync vmi","name":"centos8","namespace":"default","pos":"server.go:185","reason":"virError(Code=38, Domain=18, Message='Cannot access storage file '/dev/disk-1': Operation not permitted')","timestamp":"2022-10-31T07:16:07.177504Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
- I don't understand why hot-plugging disk-add-1 can affect the pre-existing disk-1.
Environment:
- KubeVirt version (use `virtctl version`): 0.56
- Kubernetes version (use `kubectl version`): 1.22.1
About this issue
- State: closed
- Created 2 years ago
- Comments: 31 (22 by maintainers)
I have some thoughts. Indeed, the issue can be reproduced when there is a mix of hotpluggable and non-hotpluggable block volumes.

When a block volume is non-hotpluggable (i.e. it is specified explicitly in the VMI spec), the device cgroup permissions are managed purely by Kubernetes and the CRI. For cgroup v2, that means a BPF program is attached to the pod's cgroup. However, when we manage hotplug volumes, we overwrite that BPF program to allow access to the new block device. The problem is that we do not know what the existing BPF program does, so we just follow some assumptions about the 'default' devices that we need to allow (e.g. /dev/kvm and some others). I guess we need to also consider the non-hotpluggable volumes.

Tentative fix: https://github.com/kubevirt/kubevirt/pull/8828
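The gist of that fix can be illustrated with a small sketch (hypothetical types; the major/minor numbers for disk-1 and disk-add-1 are made up, and KubeVirt's real code rewrites the BPF device program rather than a plain list): when regenerating the allow-list for a hotplug, include every block volume already in the VMI spec, not just the assumed defaults plus the new device.

```go
package main

import "fmt"

// deviceRule is a simplified stand-in for one cgroup device allow-list entry.
type deviceRule struct {
	Major, Minor int64
	Path         string // informational only in this sketch
}

// defaultRules mirrors the kind of 'default' devices assumed when the BPF
// program is rewritten (conventional Linux major/minor numbers).
var defaultRules = []deviceRule{
	{10, 232, "/dev/kvm"},
	{10, 200, "/dev/net/tun"},
	{1, 3, "/dev/null"},
}

// buildAllowList combines the default rules with the rules for every block
// volume in the VMI spec -- hotpluggable or not -- so that rewriting the
// device program for a hotplug does not drop access to pre-existing
// block volumes like disk-1.
func buildAllowList(vmiBlockDevices []deviceRule) []deviceRule {
	seen := map[string]bool{}
	var out []deviceRule
	all := append(append([]deviceRule{}, defaultRules...), vmiBlockDevices...)
	for _, r := range all {
		key := fmt.Sprintf("%d:%d", r.Major, r.Minor)
		if !seen[key] { // de-duplicate by device number
			seen[key] = true
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rules := buildAllowList([]deviceRule{
		{252, 0, "/dev/disk-1"},      // non-hotpluggable block volume
		{252, 16, "/dev/disk-add-1"}, // hotplugged block volume
	})
	for _, r := range rules {
		fmt.Printf("allow %d:%d (%s)\n", r.Major, r.Minor, r.Path)
	}
}
```

Without the VMI's own block volumes in the list, the rewritten program permits only the defaults plus the hotplugged device, which matches the EPERM on /dev/disk-1 seen in the logs.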