kubevirt: VM can't restart after addvolume

What happened:

  1. I start a CentOS VM.
  2. I add a blank DV to the VM via /apis/subresources.kubevirt.io/v1/namespaces/{namespace:[a-z0-9]}/virtualmachines/{name:[a-z0-9][a-z0-9-]}/addvolume
  3. The DV hotplug succeeds, and the guest OS console detects the new disk.
  4. I request /apis/subresources.kubevirt.io/v1/namespaces/{namespace:[a-z0-9]}/virtualmachines/{name:[a-z0-9][a-z0-9-]}/restart
  5. The VM fails to run and stays in the Starting phase.
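
For reference, the two subresource calls in steps 2 and 4 can be sketched as below. This is a minimal sketch, not the exact request that was sent: the helper names are hypothetical, and the body field names (name, disk, volumeSource) reflect my understanding of the AddVolumeOptions type and should be checked against the KubeVirt API reference. The values (default, centos8, blank, disk-add-1, scsi) come from the specs in this report.

```python
import json

API_GROUP = "subresources.kubevirt.io"
API_VERSION = "v1"


def subresource_path(namespace: str, vm_name: str, action: str) -> str:
    """Build the VirtualMachine subresource path used in steps 2 and 4."""
    return (
        f"/apis/{API_GROUP}/{API_VERSION}/namespaces/{namespace}"
        f"/virtualmachines/{vm_name}/{action}"
    )


def add_volume_body(volume_name: str, data_volume: str, bus: str = "scsi") -> str:
    """Serialize an addvolume request body (field names are an assumption)."""
    body = {
        "name": volume_name,
        # Disk definition as it later shows up under spec.template.spec.domain.devices.disks
        "disk": {"name": volume_name, "cache": "none", "disk": {"bus": bus}},
        # Volume source pointing at the blank DataVolume
        "volumeSource": {"dataVolume": {"name": data_volume, "hotpluggable": True}},
    }
    return json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    print(subresource_path("default", "centos8", "addvolume"))
    print(add_volume_body("disk-add-1", "blank"))
    print(subresource_path("default", "centos8", "restart"))
```

Sending the request (authentication, HTTP method, TLS) is deliberately left out, since the report does not say how the API was called.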

Additional context:

  1. Spec of the running VM:
spec:
  dataVolumeTemplates:
  - metadata:
      name: dv-disk-1
    spec:
      pvc:
        accessModes:
        - ReadWriteMany
        resources:
          requests:
            storage: 30Gi
        storageClassName: xxxxx
        volumeMode: Block
      source:
        blank: {}
  running: true
  template:
    metadata:
    spec:
      domain:
        cpu:
          cores: 1
          model: host-model
          sockets: 1
          threads: 1
        devices:
          disks:
          - bootOrder: 1
            cache: none
            disk:
              bus: virtio
            name: disk-1
          - bootOrder: 2
            cdrom:
              bus: sata
            name: cdrom-1
          interfaces:
            - masquerade: {}
              name: default
        firmware:
          bootloader:
            bios:
              useSerial: true
        machine:
          type: q35
        memory:
          guest: 2Gi
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: "1"
            memory: 2Gi
      networks:
        - name: default
          pod: {}
      volumes:
      - dataVolume:
          name: dv-disk-1
        name: disk-1
      - dataVolume:
          name: centos
        name: cdrom-1
  1. Create a blank DV:
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: blank
spec:
  source:
    blank: {}
  pvc:
    storageClassName: xxxxxxx
    accessModes:
      - ReadWriteOnce
    volumeMode: Block
    resources:
      requests:
        storage: 50Gi
  1. addvolume (resulting devices and volumes):
        devices:
          disks:
          - bootOrder: 1
            cache: none
            disk:
              bus: virtio
            name: disk-1
          - bootOrder: 2
            cdrom:
              bus: sata
            name: cdrom-1
          - cache: none
            disk:
              bus: scsi
            name: disk-add-1
      ----------------------
      volumes:
      - dataVolume:
          name: dv-disk-1
        name: disk-1
      - dataVolume:
          name: centos
        name: cdrom-1
      - dataVolume:
          hotpluggable: true
          name: blank
        name: disk-add-1
  1. Restart the VM; error messages from the virt-launcher pod:
{"component":"virt-launcher","level":"info","msg":"Marked as ready","pos":"virt-launcher.go:74","timestamp":"2022-10-29T09:07:03.499371Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Executing PreStartHook on VMI pod environment","name":"centos8","namespace":"default","pos":"manager.go:513","timestamp":"2022-10-29T09:07:17.590819Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Starting PreCloudInitIso hook","name":"centos8","namespace":"default","pos":"manager.go:534","timestamp":"2022-10-29T09:07:17.590870Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","level":"info","msg":"Found nameservers in /etc/resolv.conf: \ufffd\u001e\u0000\n","pos":"network.go:289","timestamp":"2022-10-29T09:07:17.591873Z"}
{"component":"virt-launcher","level":"info","msg":"Found search domains in /etc/resolv.conf: default.svc.cluster.local svc.cluster.local cluster.local","pos":"network.go:290","timestamp":"2022-10-29T09:07:17.591903Z"}
{"component":"virt-launcher","level":"info","msg":"Starting SingleClientDHCPServer","pos":"server.go:65","timestamp":"2022-10-29T09:07:17.591970Z"}
{"component":"virt-launcher","level":"error","msg":"Direct IO check failed for /dev/disk-1","pos":"converter.go:417","reason":"open /dev/disk-1: operation not permitted","timestamp":"2022-10-29T09:07:17.592144Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"pre start setup for VirtualMachineInstance failed.","name":"centos8","namespace":"default","pos":"manager.go:853","reason":"Unable to use 'none' cache mode, file system where /dev/disk-1 is stored does not support direct I/O","timestamp":"2022-10-29T09:07:17.592197Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to sync vmi","name":"centos8","namespace":"default","pos":"server.go:185","reason":"Unable to use 'none' cache mode, file system where /dev/disk-1 is stored does not support direct I/O","timestamp":"2022-10-29T09:07:17.592236Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Executing PreStartHook on VMI pod environment","name":"centos8","namespace":"default","pos":"manager.go:513","timestamp":"2022-10-29T09:07:17.693970Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Starting PreCloudInitIso hook","name":"centos8","namespace":"default","pos":"manager.go:534","timestamp":"2022-10-29T09:07:17.694030Z","uid":"fae3730b-e359-4a44-aada-e21248c9f988"}
  1. Remove cache: none from disk-1 (leave it empty); the error changes to:
{"component":"virt-launcher","kind":"","level":"info","msg":"Domain defined.","name":"centos8","namespace":"default","pos":"manager.go:866","timestamp":"2022-10-31T07:16:07.172539Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 0 with reason 0 received","pos":"client.go:436","timestamp":"2022-10-31T07:16:07.172672Z"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Unknown(0)","pos":"client.go:289","timestamp":"2022-10-31T07:16:07.174244Z"}
{"component":"virt-launcher","level":"info","msg":"Successfully connected to domain notify socket at /var/run/kubevirt/domain-notify-pipe.sock","pos":"client.go:168","timestamp":"2022-10-31T07:16:07.176317Z"}
{"component":"virt-launcher","level":"error","msg":"Cannot access storage file '/dev/disk-1': Operation not permitted","pos":"virStorageSourceReportBrokenChain:1248","subcomponent":"libvirt","thread":"31","timestamp":"2022-10-31T07:16:07.176000Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to start VirtualMachineInstance with flags 0.","name":"centos8","namespace":"default","pos":"manager.go:891","reason":"virError(Code=38, Domain=18, Message='Cannot access storage file '/dev/disk-1': Operation not permitted')","timestamp":"2022-10-31T07:16:07.177438Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to sync vmi","name":"centos8","namespace":"default","pos":"server.go:185","reason":"virError(Code=38, Domain=18, Message='Cannot access storage file '/dev/disk-1': Operation not permitted')","timestamp":"2022-10-31T07:16:07.177504Z","uid":"baca3985-45a5-4c83-aed5-bc24d118b780"}
  1. I don't know why hotplugging disk-add-1 affects the pre-existing disk disk-1.
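
To pull only the failing entries out of virt-launcher output like the logs above, a small filter can help. This is a sketch that assumes one JSON object per line with the keys seen in this report (level, pos, msg, reason, timestamp); the function name is hypothetical, and the sample lines are copied from the first log excerpt.

```python
import json


def error_entries(log_text):
    """Yield (timestamp, pos, msg, reason) for each error-level JSON log line."""
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON lines such as list markers
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if entry.get("level") == "error":
            yield (
                entry.get("timestamp"),
                entry.get("pos"),
                entry.get("msg"),
                entry.get("reason"),
            )


SAMPLE = "\n".join([
    '{"component":"virt-launcher","level":"info","msg":"Marked as ready","pos":"virt-launcher.go:74","timestamp":"2022-10-29T09:07:03.499371Z"}',
    '{"component":"virt-launcher","level":"error","msg":"Direct IO check failed for /dev/disk-1","pos":"converter.go:417","reason":"open /dev/disk-1: operation not permitted","timestamp":"2022-10-29T09:07:17.592144Z"}',
])

if __name__ == "__main__":
    for timestamp, pos, msg, reason in error_entries(SAMPLE):
        print(timestamp, pos, msg, reason, sep=" | ")
```

In both log excerpts the first error is a permission failure on /dev/disk-1, which is the non-hotplugged boot disk rather than the newly added volume.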

Environment:

  • KubeVirt version (use virtctl version): 0.56
  • Kubernetes version (use kubectl version): 1.22.1

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 31 (22 by maintainers)

Most upvoted comments

I think I have some thoughts. Indeed, the issue can be reproduced when a VM has a mix of hotpluggable and non-hotpluggable block volumes.

When a block volume is non-hotpluggable (i.e. it is specified explicitly in the VMI spec), the device cgroup permissions are managed purely by Kubernetes and the CRI. For cgroup v2, that means a BPF program is attached to the pod’s cgroup. However, when we manage hotplug volumes, we overwrite that BPF program to allow access to the new block device. The problem is that we do not know what the existing BPF program allows, so we rely on assumptions about the ‘default’ devices that must be allowed (e.g. /dev/kvm and some others). I guess we also need to account for the non-hotpluggable volumes.
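
The overwrite described above can be illustrated with a toy model. This is only a sketch of the reasoning, not KubeVirt's code: the device filter is modelled as a plain set instead of a BPF program, all class and function names are hypothetical, the default device list beyond /dev/kvm is a guess, and the hotplug device path is a placeholder.

```python
from dataclasses import dataclass, field

# Assumed defaults; the comment above only names /dev/kvm explicitly.
DEFAULT_DEVICES = frozenset({"/dev/kvm", "/dev/null", "/dev/urandom"})


@dataclass
class PodDeviceCgroup:
    """Toy stand-in for the single device filter attached to a pod cgroup."""
    allowed: set = field(default_factory=set)

    def can_open(self, device: str) -> bool:
        return device in self.allowed


def runtime_setup(static_block_volumes):
    """Kubernetes/CRI side: allow defaults plus block volumes from the VMI spec."""
    return PodDeviceCgroup(set(DEFAULT_DEVICES) | set(static_block_volumes))


def hotplug_overwrite(cgroup, hotplug_devices):
    """Replace the filter using only the assumed defaults plus hotplug devices."""
    cgroup.allowed = set(DEFAULT_DEVICES) | set(hotplug_devices)


def hotplug_overwrite_fixed(cgroup, hotplug_devices, static_block_volumes):
    """Same replacement, but the non-hotpluggable volumes are re-added too."""
    cgroup.allowed = (
        set(DEFAULT_DEVICES) | set(hotplug_devices) | set(static_block_volumes)
    )


if __name__ == "__main__":
    hotplug = ["/hypothetical/hotplug/disk-add-1"]  # placeholder path
    cg = runtime_setup(["/dev/disk-1"])
    hotplug_overwrite(cg, hotplug)
    print("disk-1 after overwrite:", cg.can_open("/dev/disk-1"))
    hotplug_overwrite_fixed(cg, hotplug, ["/dev/disk-1"])
    print("disk-1 after fixed overwrite:", cg.can_open("/dev/disk-1"))
```

In this model the first overwrite drops /dev/disk-1 from the allow list, which matches the "Operation not permitted" errors in the report; the second variant keeps it by including the volumes from the VMI spec.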