image-builder: build-qemu-ubuntu-2204 stuck in "Waiting for SSH to become available..."

Hi, I installed image-builder based on this doc, then followed this doc to build an image for OpenStack. But the command make build-qemu-ubuntu-2204 gets stuck at the SSH step. My operating system is Ubuntu 22.04. This is the log of the command:

hack/ensure-ansible.sh
fatal: not a git repository (or any of the parent directories): .git
Starting galaxy collection install process
Nothing to do. All requested collections are already installed. If you want to reinstall them, consider using `--force`.
hack/ensure-packer.sh
hack/ensure-goss.sh
Right version of binary present
packer build -var-file="/root/image-builder/images/capi/packer/config/kubernetes.json"  -var-file="/root/image-builder/images/capi/packer/config/cni.json"  -var-file="/root/image-builder/images/capi/packer/config/containerd.json"  -var-file="/root/image-builder/images/capi/packer/config/wasm-shims.json"  -var-file="/root/image-builder/images/capi/packer/config/ansible-args.json"  -var-file="/root/image-builder/images/capi/packer/config/goss-args.json"  -var-file="/root/image-builder/images/capi/packer/config/common.json"  -var-file="/root/image-builder/images/capi/packer/config/additional_components.json"  -color=true -var-file="/root/image-builder/images/capi/packer/qemu/qemu-ubuntu-2204.json"  packer/qemu/packer.json
fatal: not a git repository (or any of the parent directories): .git
qemu: output will be in this color.

==> qemu: Retrieving ISO
==> qemu: Trying https://releases.ubuntu.com/22.04/ubuntu-22.04.1-live-server-amd64.iso
==> qemu: Trying https://releases.ubuntu.com/22.04/ubuntu-22.04.1-live-server-amd64.iso?checksum=sha256%3A10f19c5b2b8d6db711582e0e27f5116296c34fe4b313ba45f9b201a5007056cb
    qemu: ubuntu-22.04.1-live-server-amd64.iso 1.37 GiB / 1.37 GiB [==================================================================================================================] 100.00% 1m15s
==> qemu: https://releases.ubuntu.com/22.04/ubuntu-22.04.1-live-server-amd64.iso?checksum=sha256%3A10f19c5b2b8d6db711582e0e27f5116296c34fe4b313ba45f9b201a5007056cb => /root/.cache/packer/281aa9855752339063385b35198e73db74cd61ba.iso
==> qemu: Starting HTTP server on port 8247
==> qemu: Found port for communicator (SSH, WinRM, etc): 2769.
==> qemu: Looking for available port between 5900 and 6000 on 127.0.0.1
==> qemu: Starting VM, booting from CD-ROM
    qemu: The VM will be run headless, without a GUI. If you want to
    qemu: view the screen of the VM, connect via VNC without a password to
    qemu: vnc://127.0.0.1:5952
==> qemu: Waiting 10s for boot...
==> qemu: Connecting to VM via VNC (127.0.0.1:5952)
==> qemu: Typing the boot commands over VNC...
    qemu: Not using a NetBridge -- skipping StepWaitGuestAddress
==> qemu: Using SSH communicator to connect: 127.0.0.1
==> qemu: Waiting for SSH to become available...

/kind bug

About this issue

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 22 (8 by maintainers)

Most upvoted comments

Maybe it’s time to stop using the legacy Ubuntu live ISO image for the newest releases? I observed that it seems to be the main cause of all these problems. The legacy image is deprecated and is being replaced by the Ubuntu cloudimg: https://cloud-images.ubuntu.com/

So (on my side) I’m currently replacing the Ubuntu image and script used by image-builder with this server cloudimg, and everything works like a charm.

Came across this issue as well. After 22 min I cancelled the first run, as it seemed to be stuck on ==> qemu: Waiting for SSH to become available.... That was an assumption at that point.

After that I made the changes recommended by @mikejoh, which seem to have “solved it”.

I speculate that because the default image is rather old, the package upgrade step takes too long and, depending on the environment, might even exceed the SSH timeout set by Packer, or the patience of the user (like me, who killed the first run after 22 min assuming it was stuck). So using the newer image made the package upgrade faster, and after ~10 min I get into the config phase.
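One way to tell a slow install from a genuinely stuck build is to probe the communicator port that Packer printed in the log (2769 in the run above; it changes on every run). The following is a minimal sketch in Python, not part of image-builder; the function name read_ssh_banner is made up for illustration. It checks for the SSH identification line rather than just a successful TCP connect, on the assumption that QEMU's port forwarding may accept the connection on the host side before sshd in the guest is actually up.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Return the SSH identification line from host:port, or None if none arrives.

    Hypothetical helper for illustration; not part of image-builder or Packer.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # An SSH server sends its identification string first,
            # e.g. b"SSH-2.0-OpenSSH_8.9\r\n".
            data = sock.recv(256)
    except OSError:
        # Connection refused, reset, or timed out: nothing is answering yet.
        return None
    line = data.split(b"\n", 1)[0].strip().decode("ascii", errors="replace")
    # An empty read or a non-SSH reply is treated the same as no banner.
    return line if line.startswith("SSH-") else None
```

Calling read_ssh_banner("127.0.0.1", 2769) from a second terminal while the build waits returns None as long as no SSH server is answering on that port. If it keeps returning None for a long time, the VNC address from the log is the next place to look to see what the installer is doing.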

Not sure what an appropriate fix would be for this. Bump the Packer SSH timeout, document it, and “periodically” update the base images to newer ones?
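As a rough illustration of the "bump the timeout" idea: ssh_timeout is a standard option of Packer's SSH communicator, so one possible approach is to set it on the qemu builder in a copy of the template. The sketch below assumes the template is a Packer JSON file with a top-level "builders" list, like the packer/qemu/packer.json passed on the command line in the log. Whether the stock template already sets this key, and to what value, is not shown in this thread. The helper name with_ssh_timeout and the "4h" default are arbitrary choices for illustration.

```python
import copy


def with_ssh_timeout(template: dict, timeout: str = "4h") -> dict:
    """Return a copy of a Packer JSON template with ssh_timeout set on qemu builders.

    Hypothetical helper for illustration; the "4h" default is an arbitrary value,
    not a recommendation from the image-builder project.
    """
    patched = copy.deepcopy(template)  # leave the caller's template untouched
    for builder in patched.get("builders", []):
        if builder.get("type") == "qemu":
            # ssh_timeout takes a Packer duration string such as "30m" or "4h".
            builder["ssh_timeout"] = timeout
    return patched
```

To try it, load the template with json.load, pass the resulting dict through with_ssh_timeout, and write the result to a separate file with json.dump, so the original template stays unmodified.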

I think this is indeed an issue stemming from the very big apt upgrade that happens.

@mikejoh I ran into some issues as well that I couldn’t troubleshoot. People in the CAPO Slack channel pointed out to me that they are running Ubuntu 22.04 images, but they are built with https://image-builder.sigs.k8s.io/capi/providers/openstack-remote.html and not the qemu builder. The openstack-remote provider worked for me as well. I’ve opened a ticket (#1137) with my findings for the qemu build. I hope this helps.

Will try that out today and let you know 🤞

That’s great to hear @tibeer. I only ever ran into a stuck build once, but it failed much earlier in the process (error while entering the boot command).

I’m not seeing anything obvious in the output you pasted, and it might have just been taking a long time installing the base system. This could very well be due to problems in the build environment (e.g. network issues) that only present themselves intermittently, but aren’t directly caused by a misconfiguration of the build process.

The problem is resolved for us at least. CI/CD now works again. Regarding the reason: I honestly cannot tell you. It seems that it was just a hiccup.