minikube: vmwarefusion: failed to start after stop: Error configuring auth on host: Too many retries waiting for SSH to be available

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Minikube version (use minikube version): 0.18.0

Environment:

  • OS (e.g. from /etc/os-release): macOS 10.12.4
  • VM Driver (e.g. cat ~/.minikube/machines/minikube/config.json | grep DriverName): vmwarefusion
  • ISO version (e.g. cat ~/.minikube/machines/minikube/config.json | grep -i ISO or minikube ssh cat /etc/VERSION): boot2docker.iso
  • Install tools:
  • Others:

What happened: Using VMware Fusion on macOS, the first time minikube is started it works flawlessly. However, after minikube stop, running minikube start --vm-driver=vmwarefusion again fails and minikube never comes up.

Starting local Kubernetes cluster...
Starting VM...
Waiting for SSH to be available...
E0419 23:27:50.099029    1781 start.go:116] Error starting host: Temporary Error: Error configuring auth on host: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded.

What you expected to happen: Be able to start the cluster after stopping it.

How to reproduce it (as minimally and precisely as possible):

minikube start --vm-driver=vmwarefusion
minikube stop
minikube start --vm-driver=vmwarefusion

Anything else we need to know: The only solution I’ve found so far is to minikube delete and start over.

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 18 (1 by maintainers)

Most upvoted comments

Using the latest v0.23.0 and still getting the same issue; is the fix included in that version?

Is there any nightly build to test it?

The easiest way of fixing it is to run ssh-copy-id -i ~/.minikube/machines/minikube/id_rsa.pub docker@$(minikube ip) while minikube is starting. The password can be found with cat ~/.minikube/machines/minikube/config.json | grep -i pass.
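Spelled out as separate commands (the $(minikube ip) substitution assumes the VM has gotten far enough to report an IP; the boot2docker default password is usually tcuser, but confirm it in config.json):

# Look up the SSH password for the docker user:
cat ~/.minikube/machines/minikube/config.json | grep -i pass

# While `minikube start` is still retrying SSH, push the public key into the guest:
ssh-copy-id -i ~/.minikube/machines/minikube/id_rsa.pub docker@$(minikube ip)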

This commit seems to be a fix for the issue (minikube itself has no code dictating when userdata is copied). Can we pull it into minikube?

Thanks. After making a fresh cluster I put the tar file in by hand:

  1. minikube ssh
  2. sudo cp /Users/[mylogin]/.minikube/machines/minikube/userdata.tar /var/lib/boot2docker/

It now starts after a stop.
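To confirm the copy landed where the automount step expects it (a quick check, not part of the original steps):

# From the host, verify the archive is now on the persistent disk:
minikube ssh "ls -l /var/lib/boot2docker/userdata.tar"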

Experiencing same issue here.

Did some digging with vmrun and found that the guest’s /home/docker/.ssh dir is missing.

As a workaround I found I could get the cluster running again by:

minikube start -v 10 (get it to start the VM for you; [ctrl]+[c] once you start to see the 255 errors)

Then running this script on the host to restore the missing SSH keys in the guest:

#!/bin/bash

MINIKUBE="${HOME}/.minikube/machines/minikube"
VMX="$MINIKUBE/minikube.vmx"
DOCKER_PUB_KEY="$MINIKUBE/id_rsa.pub"

# Wrapper around vmrun that authenticates as the guest's docker user
# (default boot2docker password) and targets the minikube VMX.
function vmrun {
	GUESTCMD=$1; shift
	"/Applications/VMware Fusion.app/Contents/Library/vmrun" -gu docker -gp tcuser $GUESTCMD "$VMX" "$@"
}

# Recreate the missing .ssh dir, push the host's public key into it,
# and fix ownership/permissions so sshd will accept the key.
vmrun runScriptInGuest /bin/bash "mkdir -p /home/docker/.ssh"
vmrun CopyFileFromHostToGuest "$DOCKER_PUB_KEY" /home/docker/.ssh/authorized_keys
vmrun runScriptInGuest /bin/bash "chown -R docker /home/docker/.ssh"
vmrun runScriptInGuest /bin/bash "chmod -R 700 /home/docker/.ssh"

Then run start again now that SSH access is restored, to bring it up: minikube start -v 10

Did some quick digging for a cause and found this in the minikube-automount logs: minikube-automount restores userdata.tar to populate the /home/docker/.ssh dir, so without that file we get the 255 error from the SSH client.

May 14 11:50:05 minikube minikube-automount[4977]: + tar xf /var/lib/boot2docker/userdata.tar -C /home/docker/
May 14 11:50:05 minikube minikube-automount[4977]: tar: can't open '/var/lib/boot2docker/userdata.tar': No such file or directory
May 14 11:50:05 minikube minikube-automount[4977]: + chown -R docker:docker /home/docker/.ssh
May 14 11:50:05 minikube minikube-automount[4977]: chown: /home/docker/.ssh: No such file or directory
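Reading those trace lines, the relevant portion of the automount step is roughly the following (a sketch reconstructed from the + trace above, not the actual minikube-automount source):

# Unpack the persisted home-directory archive; this is what recreates
# /home/docker/.ssh (and authorized_keys) on every boot.
tar xf /var/lib/boot2docker/userdata.tar -C /home/docker/
chown -R docker:docker /home/docker/.ssh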

/var/lib/boot2docker points to persistent storage, so that is good:

$ ls -la /var/lib                             
total 0
drwxr-xr-x    7 root     root             0 May 14 11:50 .
drwxr-xr-x    4 root     root             0 May 14 11:50 ..
drwxr-xr-x    2 root     root             0 Feb  8 19:46 arpd
lrwxrwxrwx    1 root     root            29 May 14 11:50 boot2docker -> /mnt/sda1/var/lib/boot2docker
lrwxrwxrwx    1 root     root            21 May 14 11:50 cni -> /mnt/sda1/var/lib/cni
drwxr-xr-x    2 root     root             0 Feb  8 19:43 dbus
lrwxrwxrwx    1 root     root            24 May 14 11:50 docker -> /mnt/sda1/var/lib/docker
lrwxrwxrwx    1 root     root            25 May 14 11:50 kubelet -> /mnt/sda1/var/lib/kubelet
lrwxrwxrwx    1 root     root            27 May 14 11:50 localkube -> /mnt/sda1/var/lib/localkube
drwx------    2 root     root             0 May 14 11:50 machines
lrwxrwxrwx    1 root     root             9 Feb  8 19:23 misc -> ../../tmp
lrwxrwxrwx    1 root     root            21 May 14 11:50 rkt -> /mnt/sda1/var/lib/rkt
drwx--x--x    3 root     root             0 Feb  8 19:52 sudo
drwxr-xr-x    4 root     root             0 May 14 11:50 systemd

But there is no userdata.tar contained within.

$ find /mnt/sda1/var/lib/boot2docker -ls
  1835011      4 drwxr-xr-x   3  root     root         4096 May 12 21:46 /mnt/sda1/var/lib/boot2docker
  1835040      4 drwxr-xr-x   2  root     root         4096 May 12 21:46 /mnt/sda1/var/lib/boot2docker/etc
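Since the automount step unpacks the archive with -C /home/docker/, it should be possible to recreate it by hand once .ssh is restored in the guest (an assumption based on the extraction path above, not verified against the minikube source):

# Inside the guest, re-archive the docker user's .ssh dir onto the persistent disk:
sudo tar cf /var/lib/boot2docker/userdata.tar -C /home/docker .ssh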

Yet to find out why userdata.tar is missing… but it looks to be handled here: https://github.com/kubernetes/minikube/blob/k8s-v1.7/deploy/iso/minikube-iso/package/automount/minikube-automount

So I’m thinking the logs from the guest on first boot (journalctl -t minikube-automount) might show us the problem… will try to grab them when I can.
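If SSH is still down at that point, the vmrun wrapper from the script above should be able to pull the logs out anyway (hypothetical file paths; same guest credentials as before):

# Dump the first-boot automount log inside the guest, then copy it to the host:
vmrun runScriptInGuest /bin/bash "sudo journalctl -t minikube-automount > /tmp/automount.log"
vmrun CopyFileFromGuestToHost /tmp/automount.log ./automount.log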