kubevirt: Can't create VMs in openshift CI using GCP

/kind bug

What happened: All HCO GCP tests are failing. These tests run in openshift-ci on a GCP cluster, and they fail to create VMs.

The errors in virt-handler look like this:

{"component":"virt-handler","kind":"","level":"error","msg":"Synchronizing the VirtualMachineInstance failed.","name":"testvmi-svqq5","namespace":"kubevirt-test-default1","pos":"vm.go:1435","reason":"failed to configure vmi network: failed to detect isolation for launcher pod: dial unix //pods/acbcc68e-eb17-4c1d-8f30-0537e7cd0cae/volumes/kubernetes.io~empty-dir/sockets/launcher-sock: connect: connection refused","timestamp":"2021-03-29T08:45:54.782956Z","uid":"9d4a231a-dc50-4d86-9668-9cf1ee189bd9"}

There is no direct access to the cluster, and there are no virt-launcher logs.

There are many examples of these failures because all GCP tests are failing. Here is one: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/kubevirt_hyperconverged-cluster-operator/1242/pull-ci-kubevirt-hyperconverged-cluster-operator-master-hco-e2e-image-index-gcp/1381618709780500480

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 39 (38 by maintainers)

Most upvoted comments

I AM REALLY GLAD TO SEE THIS! CONGRATS!

@erkanerol NetworkManager usually just writes whatever the DHCP server said to /etc/resolv.conf. IMO, the only way this is its fault is if it truncates the search list to 72 characters. To me it looks a lot more like a GCP+OCP cluster misconfiguration. It could be interesting to run a DHCP client in verbose mode to see the IP of the DHCP server. That IP should point you directly to the culprit!
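
To see what actually ended up in /etc/resolv.conf on a node, a small script along these lines might help. This is only a sketch: it parses the file text, lists the search domains with their combined length, and flags entries whose labels start or end with a hyphen. The function names, the sample content, and the example.internal domain are made up for illustration; the truncated entry is the one quoted in this thread.

```python
def parse_search_domains(resolv_conf_text):
    """Return the domains of the last `search` line (the last one wins)."""
    domains = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines starting with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "search":
            domains = fields[1:]
    return domains


def search_list_report(resolv_conf_text):
    """Summarise the search list: entries, joined length, suspicious entries."""
    domains = parse_search_domains(resolv_conf_text)
    suspicious = [
        d for d in domains
        if any(label.startswith("-") or label.endswith("-")
               for label in d.split("."))
    ]
    return {
        "domains": domains,
        "total_chars": len(" ".join(domains)),
        "suspicious": suspicious,
    }


# Hypothetical file content, NOT taken from the failing cluster.
SAMPLE = """\
# illustrative resolv.conf
search example.internal c.openshift-gce-devel-
nameserver 127.0.0.1
"""

if __name__ == "__main__":
    print(search_list_report(SAMPLE))
```

On a real node you would pass in the content of /etc/resolv.conf instead of SAMPLE; a domain that ends in a hyphen is a strong hint that the list was cut off somewhere upstream.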

c.openshift-gce-devel- does not appear to be a valid name; at least dig does not like it:

$ dig c.openshift-gce-devel-
dig: 'c.openshift-gce-devel-' is not a legal IDN name (string start/ends with forbidden hyphen), use +noidnin

$ dig c.openshift-gce-devel- +noidnin
dig: 'c.openshift-gce-devel-.' is not a legal IDNA2008 name (string start/ends with forbidden hyphen), use +noidnout

$ dig c.openshift-gce-devel- +noidnin +noidnout

; <<>> DiG 9.11.20-RedHat-9.11.20-5.el8 <<>> c.openshift-gce-devel- +noidnin +noidnout
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 37996
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;c.openshift-gce-devel-.		IN	A

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Sep 24 17:10:47 IDT 2021
;; MSG SIZE  rcvd: 51

Do we know who is configuring that particular DNS name?
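
For reference, the complaint from dig boils down to the usual hostname label rule: each label is 1 to 63 characters of letters, digits, and hyphens, and must not start or end with a hyphen. A rough Python check of that rule could look like this. It is a sketch with a made-up helper name, it is not what dig runs internally, and it ignores internationalized names.

```python
import re

# One hostname label: 1-63 chars, letters/digits/hyphens,
# with no hyphen at the start or the end.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def invalid_labels(name):
    """Return the labels of `name` that break the hostname label rule."""
    # Drop a single trailing dot (fully qualified form) before splitting.
    if name.endswith("."):
        name = name[:-1]
    return [label for label in name.split(".")
            if not _LABEL_RE.fullmatch(label)]


if __name__ == "__main__":
    # The name from this thread: the second label ends with a hyphen.
    print(invalid_labels("c.openshift-gce-devel-"))
    # A well-formed name for comparison (illustrative only).
    print(invalid_labels("c.openshift-gce-devel.internal"))
```

The first call reports the label openshift-gce-devel- as invalid, which matches the "string start/ends with forbidden hyphen" message above; the second returns an empty list.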