kubevirt: bridge and masquerade networking doesn't work, but slirp does

/kind bug

What happened:

Starting a VM with an interface in either bridge or masquerade mode gives a broken network where the default gateway is 169.254.1.1 and the VM is unable to ping it or connect to anything on the network.

If I use slirp, then networking seems to work correctly.

The VM has worked perfectly for a long time on kubevirt v0.35.0 / kubernetes v1.16.3 / ubuntu 18.04 with network mode bridge.

What you expected to happen:

I would expect both bridge and masquerade to work.

Anything else we need to know?:

Environment:

  • KubeVirt version (use virtctl version):

Client Version: version.Info{GitVersion:"v0.47.1", GitCommit:"c34de42a48f5564f4fd2c21b6cbda7b96664c65b", GitTreeState:"clean", BuildDate:"2021-11-11T16:01:45Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{GitVersion:"v0.47.1-dirty", GitCommit:"c34de42a48f5564f4fd2c21b6cbda7b96664c65b", GitTreeState:"dirty", BuildDate:"2021-11-11T16:20:45Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:04:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

  • VM or VMI specifications:

This VM doesn’t work. The only difference between it and one that works is that bridge is replaced with slirp.

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1alpha3
  generation: 1
  labels:
    app.kubernetes.io/instance: golden-virtual-machines
  managedFields:
  - apiVersion: kubevirt.io/v1alpha3
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/instance: {}
      f:spec:
        .: {}
        f:running: {}
        f:template:
          .: {}
          f:metadata:
            .: {}
            f:annotations: {}
            f:labels: {}
          f:spec:
            .: {}
            f:domain:
              .: {}
              f:cpu:
                .: {}
                f:cores: {}
              f:devices:
                .: {}
                f:disks: {}
                f:interfaces: {}
              f:machine:
                .: {}
                f:type: {}
              f:memory:
                .: {}
                f:guest: {}
              f:resources:
                .: {}
                f:requests:
                  .: {}
                  f:memory: {}
            f:networks: {}
            f:nodeSelector:
              .: {}
              f:kubernetes.io/hostname: {}
            f:volumes: {}
    manager: argocd-application-controller
    operation: Update
    time: "2021-11-19T12:01:10Z"
  - apiVersion: kubevirt.io/v1alpha3
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:kubevirt.io/latest-observed-api-version: {}
          f:kubevirt.io/storage-observed-api-version: {}
      f:status:
        .: {}
        f:conditions: {}
        f:created: {}
        f:printableStatus: {}
        f:ready: {}
        f:volumeSnapshotStatuses: {}
    manager: Go-http-client
    operation: Update
    time: "2021-11-19T12:01:34Z"
  name: win10
  namespace: golden-virtual-machines
  resourceVersion: "21294670"
  uid: 093633db-59dc-4c1a-8055-21e7b8538999
spec:
  running: true
  template:
    metadata:
      annotations:
        debugLogs: "true"
      creationTimestamp: null
      labels:
        debugLogs: "true"
        vmName: win10
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: boot
          - disk:
              bus: virtio
            name: service
          - disk:
              bus: virtio
            name: tools
          - cdrom:
              bus: sata
            name: secret-volume
          interfaces:
          - bridge: {}
            macAddress: 02:05:11:b0:42:42
            name: default
            ports:
            - port: 22
            - port: 80
            - port: 443
            - port: 3389
            - port: 3400
        machine:
          type: q35
        memory:
          guest: 8Gi
        resources:
          requests:
            memory: 8Gi
      networks:
      - name: default
        pod: {}
      nodeSelector:
        kubernetes.io/hostname: glengrant
      volumes:
      - name: boot
        persistentVolumeClaim:
          claimName: win10
      - containerDisk:
          image: registry.stibo.dk/rm/public/kubevirt/build-agent-service:11e66bd66fe017989f4f46bacbfe924fdf2ac6c8
        name: service
      - containerDisk:
          image: registry.stibo.dk/containers/public/windows-vm-tools-container:2a4b5912eec127f9ee910fd4c532bfa7e30feb5b
        name: tools
      - name: secret-volume
        secret:
          secretName: service-mode
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-11-19T12:01:31Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    message: 'cannot migrate VMI: PVC win10 is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)'
    reason: DisksNotLiveMigratable
    status: "False"
    type: LiveMigratable
  created: true
  printableStatus: Running
  ready: true
  volumeSnapshotStatuses:
  - enabled: false
    name: boot
    reason: 'No VolumeSnapshotClass: Volume snapshots are not configured for this StorageClass [cephfs] [boot]'
  - enabled: false
    name: service
    reason: Snapshot is not supported for this volumeSource type [service]
  - enabled: false
    name: tools
    reason: Snapshot is not supported for this volumeSource type [tools]
  - enabled: false
    name: secret-volume
    reason: Snapshot is not supported for this volumeSource type [secret-volume]
```
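
For comparison, the working variant mentioned above differs only in the interface binding. A minimal sketch of that stanza, assuming every other field of the spec stays exactly as posted (the MAC address and ports are copied from the failing spec, not from a separately posted working one):

```yaml
# Sketch only: fragment of spec.template.spec.domain.devices in the VirtualMachine.
interfaces:
- name: default
  macAddress: 02:05:11:b0:42:42
  slirp: {}          # replaces `bridge: {}` from the failing spec
  ports:
  - port: 22
  - port: 80
  - port: 443
  - port: 3389
  - port: 3400
```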

  • Cloud provider or hardware configuration:

Bare metal RKE v1.3.1 cluster with metallb in BGP mode.

The hosts are connected with ipoib (Infiniband).

  • OS (e.g. from /etc/os-release):

Ubuntu 20.04.3 LTS

  • Kernel (e.g. uname -a):

Linux glengrant 5.4.0-89-generic #100-Ubuntu SMP Fri Sep 24 14:50:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 30 (16 by maintainers)

Most upvoted comments

I finally found my issue (between chair and keyboard 😐).

I run my tests in a VirtualBox VM that hosts the Kubernetes node. I forgot that this VM had two network interfaces, one of which uses NAT. I noticed that one of the IP addresses of my VirtualBox VM collides with the IP of the KubeVirt VM (which could break the routing mechanism). I fixed it by setting:

pod:
  vmNetworkCIDR: 10.0.3.0/24
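
For context, here is a minimal sketch of where this setting sits in a VM spec. It assumes the masquerade binding, which to my understanding is the mode that `vmNetworkCIDR` applies to, and it assumes the collision came from VirtualBox's default NAT range of 10.0.2.0/24 overlapping KubeVirt's default internal range. Neither detail is stated in the comment itself.

```yaml
# Sketch only: fragment of spec.template.spec in a VirtualMachine.
domain:
  devices:
    interfaces:
    - name: default
      masquerade: {}   # assumed binding; the comment does not say which one was used
networks:
- name: default
  pod:
    vmNetworkCIDR: 10.0.3.0/24   # value from the comment; moves the VM-internal range off the colliding subnet
```

Any range that does not overlap an address already in use on the node should work the same way; 10.0.3.0/24 is simply the value the commenter picked.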

Thank you for your comment (and yes, I was looking in the virt-launcher pod). I’ll check the documentation you pointed to.

But I noticed this in the referenced documentation, which contradicts your comment (outdated documentation?). [image]