minikube: minikube start fails with driver kvm2 on AMD Ryzen CPU

minikube fails to create a new cluster on Manjaro with the kvm2 driver: the boot2docker-based VM does not boot properly.

The exact command to reproduce the issue:

minikube start --vm-driver=kvm2

The full output of the command that failed:

😄  minikube v1.6.2 on Arch 18.1.4
✨  Selecting 'kvm2' driver from user configuration (alternates: [virtualbox none])
🔥  Creating kvm2 VM (CPUs=2, Memory=2000MB, Disk=20000MB) ...

💣  Unable to start VM. Please investigate and run 'minikube delete' if possible: create:
Error creating machine: Error in driver during machine creation: machine didn't return
an IP after 120 seconds

😿  minikube is exiting due to an error. If the above message is not useful, open an issue:
👉  https://github.com/kubernetes/minikube/issues/new/choose

The output of the minikube logs command:

💣  command runner
❌  Error: [SSH_AUTH_FAILURE] getting ssh client for bootstrapper: Error dialing tcp via ssh client: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
💡  Suggestion: Your host is failing to route packets to the minikube VM. If you have VPN software, try turning it off or configuring it so that it does not re-route traffic to the VM IP. If not, check your VM environment routing options.
📘  Documentation: https://minikube.sigs.k8s.io/docs/reference/networking/vpn/
⁉️   Related issues:
    ▪ https://github.com/kubernetes/minikube/issues/3930

😿  If the above advice does not help, please let us know: 
👉  https://github.com/kubernetes/minikube/issues/new/choose

The operating system version: Manjaro 18.1.4 (Arch), running kernel 5.4.2-1-MANJARO, with QEMU 4.2

More details:

Despite the 120-second timeout while waiting for an IP, the issue doesn’t seem to be related to networking. After a lot of digging around, I narrowed the problem down to this command:

/usr/bin/qemu-system-x86_64
   -machine pc-i440fx-4.2,accel=kvm,usb=off,dump-guest-core=off
   -cpu host
   -m 1908
   -boot menu=on
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
   -device lsi,id=scsi0,bus=pci.0,addr=0x4
   -drive file=$HOME/.minikube/machines/minikube/boot2docker.iso,format=raw,if=none,id=drive-scsi0-0-2,readonly=on
   -device scsi-cd,bus=scsi0.0,scsi-id=2,device_id=drive-scsi0-0-2,drive=drive-scsi0-0-2,id=scsi0-0-2,bootindex=1
   -drive file=$HOME/.minikube/machines/minikube/minikube.rawdisk,format=raw,if=none,id=drive-virtio-disk0,aio=threads
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2

On a Debian system running the same QEMU on kernel 5.3.0-2-amd64, that command prints a lot of [ OK ] lines on the VM serial console, up to the minikube login: prompt. On Manjaro, I get this instead:

Welcome to Buildroot 2019.02.7!

[  OK  ] Created slice User and Session Slice.
[FAILED] Failed to start Slices.
See 'systemctl status slices.target' for details.
[FAILED] Failed to listen on Journal Audit Socket.
See 'systemctl status systemd-journald-audit.socket' for details.
[FAILED] Failed to listen on Network Service Netlink Socket.
See 'systemctl status systemd-networkd.socket' for details.
[FAILED] Failed to listen on Journal Socket.
See 'systemctl status systemd-journald.socket' for details.
[DEPEND] Dependency failed for Journal Service.
[DEPEND] Dependency failed for Flus…Journal to Persistent Storage.
[FAILED] Failed to mount Huge Pages File System.
See 'systemctl status dev-hugepages.mount' for details.
[FAILED] Failed to start Remount Root and Kernel File Systems.
See 'systemctl status systemd-remount-fs.service' for details.
[FAILED] Failed to listen on Journal Socket (/dev/log).
See 'systemctl status systemd-journald-dev-log.socket' for details.
[FAILED] Failed to start system-getty.slice.
See 'systemctl status system-getty.slice' for details.
[DEPEND] Dependency failed for Getty on tty1.
[FAILED] Failed to listen on udev Kernel Socket.
See 'systemctl status systemd-udevd-kernel.socket' for details.
[FAILED] Failed to start NFS client services.
See 'systemctl status nfs-client.target' for details.
[FAILED] Failed to start Swap.
See 'systemctl status swap.target' for details.
[FAILED] Failed to mount Temporary Directory (/tmp).
See 'systemctl status tmp.mount' for details.
[DEPEND] Dependency failed for Network Time Synchronization.
[DEPEND] Dependency failed for Network Name Resolution.
[FAILED] Failed to start Host and Network Name Lookups.
See 'systemctl status nss-lookup.target' for details.
[DEPEND] Dependency failed for NFS … monitor for NFSv2/3 locking..
[FAILED] Failed to start System Time Synchronized.
See 'systemctl status time-sync.target' for details.
[FAILED] Failed to start Create lis… nodes for the current kernel.
See 'systemctl status kmod-static-nodes.service' for details.
[FAILED] Failed to mount POSIX Message Queue File System.
See 'systemctl status dev-mqueue.mount' for details.
[FAILED] Failed to mount FUSE Control File System.
See 'systemctl status sys-fs-fuse-connections.mount' for details.
[FAILED] Failed to start Forward Pa…uests to Wall Directory Watch.
See 'systemctl status systemd-ask-password-wall.path' for details.
[FAILED] Failed to mount Kernel Debug File System.
See 'systemctl status sys-kernel-debug.mount' for details.
[FAILED] Failed to listen on initctl Compatibility Named Pipe.
See 'systemctl status systemd-initctl.socket' for details.
[FAILED] Failed to start Apply Kernel Variables.
See 'systemctl status systemd-sysctl.service' for details.
[FAILED] Failed to mount NFSD configuration filesystem.
See 'systemctl status proc-fs-nfsd.mount' for details.
[DEPEND] Dependency failed for NFS Mount Daemon.
[DEPEND] Dependency failed for NFS server and services.
[FAILED] Failed to start Remote File Systems (Pre).
See 'systemctl status remote-fs-pre.target' for details.
[FAILED] Failed to start Dispatch P…ts to Console Directory Watch.
See 'systemctl status systemd-ask-password-console.path' for details.
[FAILED] Failed to listen on udev Control Socket.
See 'systemctl status systemd-udevd-control.socket' for details.
[FAILED] Failed to start udev Coldplug all Devices.
See 'systemctl status systemd-udev-trigger.service' for details.
[FAILED] Failed to start udev Wait …omplete Device Initialization.
See 'systemctl status systemd-udev-settle.service' for details.
[DEPEND] Dependency failed for minikube automount.
[FAILED] Failed to start Paths.
See 'systemctl status paths.target' for details.
[FAILED] Failed to start system-serial\x2dgetty.slice.
See 'systemctl status "system-serial\\x2dgetty.slice"' for details.
[DEPEND] Dependency failed for Serial Getty on ttyS0.
[FAILED] Failed to start Login Prompts.
See 'systemctl status getty.target' for details.
[FAILED] Failed to start Create Static Device Nodes in /dev.
See 'systemctl status systemd-tmpfiles-setup-dev.service' for details.
[FAILED] Failed to start Local File Systems (Pre).
See 'systemctl status local-fs-pre.target' for details.
[FAILED] Failed to start Local File Systems.
See 'systemctl status local-fs.target' for details.
[FAILED] Failed to start Preprocess NFS configuration.
See 'systemctl status nfs-config.service' for details.
[FAILED] Failed to start udev Kernel Device Manager.
See 'systemctl status systemd-udevd.service' for details.
[FAILED] Failed to start Network Service.
See 'systemctl status systemd-networkd.service' for details.
[FAILED] Failed to start Network.
See 'systemctl status network.target' for details.
[DEPEND] Dependency failed for Notify NFS peers of a restart.
[FAILED] Failed to start Remote File Systems.
See 'systemctl status remote-fs.target' for details.
[FAILED] Failed to start Containers.
See 'systemctl status machines.target' for details.
[FAILED] Failed to start RPC Port Mapper.
See 'systemctl status rpcbind.target' for details.
[FAILED] Failed to start Create Volatile Files and Directories.
See 'systemctl status systemd-tmpfiles-setup.service' for details.
[FAILED] Failed to start Update UTMP about System Boot/Shutdown.
See 'systemctl status systemd-update-utmp.service' for details.
[DEPEND] Dependency failed for Upda…about System Runlevel Changes.
[FAILED] Failed to start Rebuild Journal Catalog.
See 'systemctl status systemd-journal-catalog-update.service' for details.
[FAILED] Failed to start Update is Completed.
See 'systemctl status systemd-update-done.service' for details.
[FAILED] Failed to start System Initialization.
See 'systemctl status sysinit.target' for details.
[DEPEND] Dependency failed for RPCbind Server Activation Socket.
[DEPEND] Dependency failed for RPC bind service.
[DEPEND] Dependency failed for OpenSSH server daemon.
[DEPEND] Dependency failed for Hyper-V FCOPY Daemon.
[DEPEND] Dependency failed for Hyper-V VSS Daemon.
[DEPEND] Dependency failed for Basic System.
[DEPEND] Dependency failed for Multi-User System.
[DEPEND] Dependency failed for Login Service.
[DEPEND] Dependency failed for D-Bus System Message Bus Socket.
[DEPEND] Dependency failed for D-Bus System Message Bus.
[DEPEND] Dependency failed for vmtoolsd for openvmtools.
[DEPEND] Dependency failed for Hyper-V Key Value Pair Daemon.
[DEPEND] Dependency failed for VirtualBox Guest Service.
[DEPEND] Dependency failed for Dail…anup of Temporary Directories.
[FAILED] Failed to start Timers.
See 'systemctl status timers.target' for details.
[FAILED] Failed to start Sockets.
See 'systemctl status sockets.target' for details.

And the VM just stays hung there. Is there any way I could troubleshoot that boot process?
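One low-effort way to start troubleshooting is to summarize the serial console output and look at the earliest failures, since later [DEPEND] lines tend to cascade from them. The following is only a sketch: it assumes the console text has been saved to a file, and the names summarize_console and STATUS_LINE are made up for illustration.

```python
import re
from collections import Counter

# Matches systemd console status lines such as "[FAILED] Failed to start Swap."
STATUS_LINE = re.compile(r"^\[\s*(OK|FAILED|DEPEND)\s*\]\s+(.*)$")


def summarize_console(text):
    """Return (count per status, ordered list of non-OK (status, message) pairs)."""
    counts = Counter()
    problems = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if not match:
            continue  # skip "See 'systemctl status ...'" hints and banners
        status, message = match.groups()
        counts[status] += 1
        if status != "OK":
            problems.append((status, message))
    return counts, problems
```

Feeding it the log above puts "Failed to start Slices." first in the problem list, which hints that the failure happens very early in systemd startup rather than in any one service.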

Both computers are loading the same boot2docker.iso (sha256: a24153a2e49f082d5f4a36ea5d1608cba2482d563e8642a8dffd6560c40f3ed2).
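For anyone who wants to repeat that comparison, a minimal sketch of the checksum check follows. The helper name sha256_of is hypothetical, and the ISO path in the usage note is taken from the QEMU command above, so it may differ on other setups.

```python
import hashlib

# Checksum quoted in this report for boot2docker.iso.
EXPECTED = "a24153a2e49f082d5f4a36ea5d1608cba2482d563e8642a8dffd6560c40f3ed2"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large ISO is never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Usage would be along the lines of sha256_of(os.path.expanduser("~/.minikube/machines/minikube/boot2docker.iso")) == EXPECTED, run on both machines.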

The other difference between the systems is the CPU:

  • Debian: Intel® Core™ i7-8550U
  • Manjaro: AMD Ryzen 7 3800X 8-Core
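To confirm which CPU a given host or guest actually sees, the vendor and model can be read from /proc/cpuinfo. This is a small sketch that assumes the usual Linux "key : value" layout; cpu_identity is a hypothetical helper name.

```python
def cpu_identity(cpuinfo_text):
    """Return (vendor_id, model name) of the first processor entry."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if ":" not in line:
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, _, value = line.partition(":")
        # Keep the first value seen for each key.
        fields.setdefault(key.strip(), value.strip())
    return fields.get("vendor_id"), fields.get("model name")
```

On Linux it can be called as cpu_identity(open("/proc/cpuinfo").read()).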

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 20 (1 by maintainers)

Most upvoted comments

Ok, I think I got it. I checked out branch afbjorklund:systemd-amd (from https://github.com/kubernetes/minikube/pull/6183) and built minikube.iso. I ran it with QEMU (without the -cpu hack) and it worked:

...
[  OK  ] Started Notify NFS peers of a restart.
[  OK  ] Started Login Service.
[  OK  ] Started OpenSSH server daemon.
[  OK  ] Reached target Multi-User System.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Welcome to minikube
minikube login: root
                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

# cat /proc/cpuinfo
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 23
model		: 113
model name	: AMD Ryzen 7 3800X 8-Core Processor
stepping	: 0
microcode	: 0x1000065
cpu MHz		: 3899.998
cache size	: 512 KB
physical id	: 0
...

I tried it with VirtualBox as well, and it also worked fine.

Ok, I managed to make it work. I tried a handful of CPU models manually and found that -cpu kvm64 gave good performance. The other models I tried ran considerably slower.
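The manual trial could be scripted roughly as below. This is a sketch, not what was actually run: with_cpu_model is a hypothetical helper, the base command is shortened, and the candidate list is an assumption (only host and kvm64 are named in this thread; qemu-system-x86_64 -cpu help lists the models available on a given build).

```python
def with_cpu_model(argv, model):
    """Return a copy of argv with the value following "-cpu" replaced by model."""
    result = list(argv)
    if "-cpu" in result:
        i = result.index("-cpu")
        if i + 1 < len(result):
            result[i + 1] = model
    return result


# Hypothetical shortlist of CPU models to try.
CANDIDATES = ["kvm64", "qemu64"]

# Shortened stand-in for the full QEMU command line from the report.
base = ["/usr/bin/qemu-system-x86_64", "-cpu", "host", "-m", "1908"]
for model in CANDIDATES:
    print(" ".join(with_cpu_model(base, model)))
```

Each printed command line can then be run by hand while watching the serial console and timing the boot.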

I couldn’t find any clean/elegant way to have minikube start the VM with a different CPU. So I ended up renaming /usr/bin/qemu-system-x86_64 to /usr/bin/qemu-system-x86_64.orig and putting this script in its place:

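#!/usr/bin/env python
# Wrapper installed in place of /usr/bin/qemu-system-x86_64: it swaps
# "-cpu host" for "-cpu kvm64" and then hands off to the real binary.

import os
import sys

argv = sys.argv[:]

# Rewrite only an explicit "-cpu host"; leave any other CPU model untouched.
if "-cpu" in argv:
    i = argv.index("-cpu")
    # Guard against "-cpu" being the last argument.
    if i + 1 < len(argv) and argv[i + 1] == "host":
        argv[i + 1] = "kvm64"

# Replace this process with the renamed original QEMU (looked up on PATH).
os.execvp("qemu-system-x86_64.orig", argv)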

After this, minikube start created the cluster successfully.

This is clearly a workaround rather than a fix, and the bug remains somewhere in boot2docker or systemd, but at least it unblocks me and I hope it’ll help others.

Probably yet another systemd bug, similar to this one: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1835809