rancher-desktop: docker socket not available after sleep/wake

⚠️ Workaround for this issue ⚠️

If your docker socket (or file mounts) are broken in this way, please quit “Rancher Desktop” and start it again.


Rancher Desktop Version

1.0.0-beta1

Rancher Desktop K8s Version

1.22.5

What operating system are you using?

macOS

Operating System / Build Version

12.1

What CPU architecture are you using?

arm64 (Apple Silicon)

Linux only: what package format did you use to install Rancher Desktop?

N/A

Windows User Only

No response

Actual Behavior

The docker client on macOS loses its socket connection to the Docker daemon running in the Lima VM. This happens after a sleep/wake cycle.

macOS terminal

:~/ $ docker info  [9:27:08]
Client:
 Context:    default
 Debug Mode: false

Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info

when I connect to the lima VM and run docker info:

docker info
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 11
  Running: 0
  Paused: 0
  Stopped: 11
 Images: 77
 Server Version: 20.10.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e5ef943eb76627a6d3b6de8cd1ef6537f393a71
 runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 init version:
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.88-0-virt
 Operating System: Alpine Linux v3.14
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 5.79GiB
 Name: lima-rancher-desktop
 ID: NRV2:C6GL:E4XF:CK4L:HIZZ:6EM4:3ERM:CCWW:DHGH:7F7F:CTIJ:WUMC
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Steps to Reproduce

  1. Select the dockerd engine option
  2. Put the Mac to sleep for more than about 1 hour
  3. Wake it up and run any docker-related command on macOS

Result

/ $ docker info  [9:27:08]
Client:
 Context:    default
 Debug Mode: false

Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info

Expected Behavior

After waking, the daemon should be reachable via the socket.

Additional Information

I need guidance on which information would help troubleshoot this. The same behavior was reported in issue #1119.
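
One piece of information that might help in a report is what the client actually finds at the socket path. The following is a minimal sketch, not an official diagnostic: `describe_sock` is a hypothetical helper name, and the two paths in the example comment are simply the ones mentioned in this thread. Note that a socket file being present does not prove the forwarder behind it is still alive.

```shell
# Classify a path so a bug report can say exactly what the docker client sees.
# describe_sock is a hypothetical helper, not part of Rancher Desktop.
describe_sock() {
  path=$1
  if [ -L "$path" ] && [ ! -e "$path" ]; then
    # Symlink whose target no longer exists
    echo "$path: dangling symlink -> $(readlink "$path")"
  elif [ -S "$path" ]; then
    # A socket, or a symlink that resolves to one
    echo "$path: socket"
  elif [ -e "$path" ]; then
    echo "$path: exists but is not a socket"
  else
    echo "$path: missing"
  fi
}

# Example:
#   describe_sock /var/run/docker.sock
#   describe_sock "$HOME/.rd/docker.sock"
```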

About this issue

  • State: open
  • Created 2 years ago
  • Reactions: 54
  • Comments: 67 (5 by maintainers)

Most upvoted comments

I tried making a symbolic link from /var/run/docker.sock to the docker.sock file in $HOME/.rd and it works fine. Try this:

sudo ln -s "$HOME/.rd/docker.sock" /var/run/docker.sock
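
A slightly more defensive variant of the same idea, as a sketch only: it refuses to create the link when the source is absent and replaces a stale link if one is already there. `link_sock` is a hypothetical helper name; writing under /var/run still needs root, so the script containing it would have to be run with sudo.

```shell
# Create or refresh a symlink, but only if the source actually exists.
# link_sock is a hypothetical helper, not part of Rancher Desktop.
link_sock() {
  src=$1
  dst=$2
  if [ ! -e "$src" ]; then
    echo "source $src does not exist; is Rancher Desktop running?" >&2
    return 1
  fi
  # -f replaces an existing link, -n avoids following an existing symlink at dst
  ln -sfn "$src" "$dst"
}

# Example (run the enclosing script with sudo):
#   link_sock "$HOME/.rd/docker.sock" /var/run/docker.sock
```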

Would it be better to report this issue in the Lima repository?

It is fine here. I already started looking into it, and even reproduced the error once (after 3 days) on my M1 mini, but unfortunately had to reboot it to finish some work for the upcoming release, which is almost done now.

It will be my top priority to look into this problem once 1.1.0 has shipped.

I have gotten this with 0.7.1 and 1.0.0.beta.1, Mac OS 12.1 on Intel. Quitting and restarting Rancher Desktop fixes it but is an annoyance. Biggest (and really only) gripe with Rancher Desktop.

I have not yet had this happen with 1.0.0.

Wanted to report that I have had this happen after I upgraded to 1.0.0 (MacOS 11.2.2, Intel). Restarting Rancher Desktop works but is a pain.

Intel MacBook Pro on power-supply (shouldn’t sleep), MacOS 12.2.1, RD 1.0.1

I left a docker run unattended over night and it seems I got kicked out of the container:

$ docker run -it --rm registry.suse.com/suse/sle15:15.3
07e5c8c92f87:/ #
07e5c8c92f87:/ # ERRO[16263] error waiting for container: unexpected EOF  
$ docker run -it --rm registry.suse.com/suse/sle15:15.3
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.
$ 

If I look at the logs, it seems to have happened around 2022-02-16T22:24:41.057Z, I wasn’t in front of the Mac

images.log:

2022-02-16T22:24:41.057Z: [object Object]
2022-02-16T22:24:46.145Z: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

2022-02-16T22:24:51.074Z: moby images: same error message #2
2022-02-16T22:24:56.090Z: moby images: same error message #3

lima.ha.stderr.log shows the following (time shift probably due to timezone?)

{"level":"debug","msg":"guest agent event: {Time:2022-02-16 16:44:15.892645334 +0000 UTC LocalPortsAdded:[{IP:127.0.0.1 Port:8081}] LocalPortsRemoved:[] Errors:[]}","time":"2022-02-16T17:44:16+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:8081 to 0.0.0.0:8081","time":"2022-02-16T17:44:16+01:00"}
{"error":"unexpected EOF","level":"warning","msg":"connection to the guest agent was closed unexpectedly","time":"2022-02-16T23:24:41+01:00"}
{"error":"EOF","level":"debug","msg":"sftp server for /Users/gvey exited with EOF (negligible)","time":"2022-02-16T23:24:41+01:00"}
{"error":"EOF","level":"debug","msg":"sftp server for /tmp/rancher-desktop exited with EOF (negligible)","time":"2022-02-16T23:24:41+01:00"}
{"error":"EOF","level":"debug","msg":"sftp server for /Users/gvey/Library/Caches/rancher-desktop/k3s exited with EOF (negligible)","time":"2022-02-16T23:24:41+01:00"}
{"level":"info","msg":"Forwarding \"/run/lima-guestagent.sock\" (guest) to \"/Users/gvey/Library/Application Support/rancher-desktop/lima/0/ga.sock\" (host)","time":"2022-02-16T23:24:51+01:00"}
{"level":"debug","msg":"guest agent info: \u0026{LocalPorts:[{IP:127.0.0.1 Port:10248} {IP:127.0.0.1 Port:10249} {IP:127.0.0.1 Port:6444} {IP:127.0.0.1 Port:45103} {IP:127.0.0.1 Port:10256} {IP:127.0.0.1 Port:10257} {IP:127.0.0.1 Port:10258} {IP:127.0.0.1 Port:10259} {IP:0.0.0.0 Port:22} {IP:0.0.0.0 Port:32695} {IP:0.0.0.0 Port:31287} {IP:0.0.0.0 Port:32186} {IP::: Port:10250} {IP::: Port:10251} {IP::: Port:6443} {IP::: Port:22} {IP:127.0.0.1 Port:80} {IP:127.0.0.1 Port:443} {IP:127.0.0.1 Port:8081}]}","time":"2022-02-16T23:24:51+01:00"}
{"level":"debug","msg":"guest agent event: {Time:2022-02-16 22:24:50.468314334 +0000 UTC LocalPortsAdded:[{IP:127.0.0.1 Port:10248} {IP:127.0.0.1 Port:10249} {IP:127.0.0.1 Port:6444} {IP:127.0.0.1 Port:45103} {IP:127.0.0.1 Port:10256} {IP:127.0.0.1 Port:10257} {IP:127.0.0.1 Port:10258} {IP:127.0.0.1 Port:10259} {IP:0.0.0.0 Port:22} {IP:0.0.0.0 Port:32695} {IP:0.0.0.0 Port:31287} {IP:0.0.0.0 Port:32186} {IP::: Port:10250} {IP::: Port:10251} {IP::: Port:6443} {IP::: Port:22} {IP:127.0.0.1 Port:80} {IP:127.0.0.1 Port:443} {IP:127.0.0.1 Port:8081}] LocalPortsRemoved:[] Errors:[]}","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10248 to 0.0.0.0:10248","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10249 to 0.0.0.0:10249","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:6444 to 0.0.0.0:6444","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:45103 to 0.0.0.0:45103","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10256 to 0.0.0.0:10256","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10257 to 0.0.0.0:10257","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10258 to 0.0.0.0:10258","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:10259 to 0.0.0.0:10259","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Not forwarding TCP 0.0.0.0:22","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 0.0.0.0:32695 to 0.0.0.0:32695","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 0.0.0.0:31287 to 0.0.0.0:31287","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 0.0.0.0:32186 to 0.0.0.0:32186","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from [::]:10250 to 0.0.0.0:10250","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from [::]:10251 to 0.0.0.0:10251","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from [::]:6443 to 0.0.0.0:6443","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Not forwarding TCP [::]:22","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:80 to 0.0.0.0:80","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:443 to 0.0.0.0:443","time":"2022-02-16T23:24:51+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:8081 to 0.0.0.0:8081","time":"2022-02-16T23:24:51+01:00"}
{"level":"debug","msg":"guest agent event: {Time:2022-02-17 06:43:26.469907334 +0000 UTC LocalPortsAdded:[] LocalPortsRemoved:[] Errors:[exit status 4]}","time":"2022-02-17T07:43:27+01:00"}
{"level":"warning","msg":"received error from the guest: \"exit status 4\"","time":"2022-02-17T07:43:27+01:00"}
{"level":"debug","msg":"guest agent event: {Time:2022-02-17 06:43:29.484860334 +0000 UTC LocalPortsAdded:[{IP:127.0.0.1 Port:80} {IP:127.0.0.1 Port:443} {IP:127.0.0.1 Port:8081}] LocalPortsRemoved:[] Errors:[]}","time":"2022-02-17T07:43:30+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:80 to 0.0.0.0:80","time":"2022-02-17T07:43:30+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:443 to 0.0.0.0:443","time":"2022-02-17T07:43:30+01:00"}
{"level":"info","msg":"Forwarding TCP from 127.0.0.1:8081 to 0.0.0.0:8081","time":"2022-02-17T07:43:30+01:00"}
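
With this many forwarding lines, the interesting entries are easy to miss. A small sketch for pulling out only the warnings, assuming the log keeps one JSON object per line with the level spelled exactly as in the excerpt above; `log_warnings` is a hypothetical helper name and the file name in the example is only the one quoted in this comment.

```shell
# Print only warning-level entries from a JSON-lines log file.
# Matches the literal text "level":"warning", so it depends on the
# exact formatting seen in the excerpt (no spaces around the colon).
log_warnings() {
  grep -F '"level":"warning"' "$1"
}

# Example:
#   log_warnings lima.ha.stderr.log
```

On the excerpt above this would leave just the "connection to the guest agent was closed unexpectedly" and "received error from the guest" entries.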

[EDIT: lima.log was classified as “binary” by VS Code, hence I didn’t look at it earlier. Turns out it has something legible in it…]

lima.log is a single line, with a lot of NUL characters in front of it.

2022-02-16T22:24:41.056Z: Log process exited with 255/null, restarting...

as hexdump:

00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
[...]
00004220: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00004230: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00004240: 0000 0000 0032 3032 322d 3032 2d31 3654  .....2022-02-16T
00004250: 3232 3a32 343a 3431 2e30 3536 5a3a 204c  22:24:41.056Z: L
00004260: 6f67 2070 726f 6365 7373 2065 7869 7465  og process exite
00004270: 6420 7769 7468 2032 3535 2f6e 756c 6c2c  d with 255/null,
00004280: 2072 6573 7461 7274 696e 672e 2e2e 0a     restarting....
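
For anyone else whose editor treats the file as binary, the leading NUL bytes can be dropped before reading it. A minimal sketch using only `tr`; `strip_nuls` is a hypothetical name.

```shell
# Remove NUL bytes from standard input so the remaining text is readable.
strip_nuls() {
  tr -d '\000'
}

# Example:
#   strip_nuls < lima.log
```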

Other logs are either empty or don’t have anything for that time.

I hope this helps narrowing it down. If you think this is a different issue, I’m happy to create a new one for this.

Having the same issue with RD 1.0.0 (MacOS 12.1, Intel). After a long while dockerd becomes unavailable; sleep is not required to trigger it. Restarting RD brings dockerd back.

I’ve noticed this does not always happen after sleep (sometimes it continues to function), but also, it sometimes happens when the system has not slept, i.e. it’ll just sorta die. So it may not be caused by sleep/wake. As before, quit and restart fixes on 1.0.0.beta.1 Mac OS 12.1 Intel. Have not found any useful info in logs but I’ve been running in debug mode for a few days now.

There seems to be a mix of responses: for some this issue was fixed in RD 1.1.1 already, but others still experience it with 1.2.1. In the upcoming 1.3.0 release (hopefully later this week), we have a change to use sftp-server instead of the built-in server in the Lima Host Agent. I only see this making a difference for the host mounts, but not for the forwarded sockets, but we’ll see…

It would be helpful if everyone still experiencing this issue could try 1.3.0 (once released) and report whether they continue to have this problem or not! Thank you!

Ever since upgrading to Rancher Desktop v1.1.1 about 4 days ago I have not seen this problem any longer. This on a MacBook Pro Intel. MacOS v11.6.1

My system has been on and off the network, asleep, and joining different networks (traveling), and docker and k8s inside Rancher Desktop have been running without issues.

So, an interesting change in behavior: I am one of those that just needs the docker portion of Rancher Desktop. I found a stop_k3s shell script somewhere in a Rancher Desktop issue and last Friday morning I ran it after starting up Rancher Desktop. This evening I realized I have not had to restart Rancher Desktop on Sat, Sun, or today (Mon), and I unslept my MBPro each day. It worked all day long today, which has never happened before.

I haven’t dug into it, but at this point it seems that stopping k3s has changed something for me, for the better.

BTW, my collection of comments and observations is at https://gist.github.com/mrballcb/9996e94b7bf357dc8e70d1692d57da29

@mrballcb I think you might be onto something. After stopping k3s in a manner similar to rancher_k3s_stop.sh, dockerd has been accessible by the docker cli for over 24 hours without a restart on macOS Catalina, a record for me (but I’ll respond if my luck changes 😃).

For reproducibility, the above gist appears to assume limactl is in one’s PATH—which it wasn’t for me, so expanding on https://github.com/rancher-sandbox/rancher-desktop/issues/985#issuecomment-1026186522 I used:

LIMA_HOME="$HOME/Library/Application Support/rancher-desktop/lima" "/Applications/Rancher Desktop.app/Contents/Resources/resources/darwin/lima/bin/limactl" shell 0 sudo rc-service k3s stop
docker rm -f $(docker ps -aq)

A few days before this, I had also tried stopping kubernetes per the method listed in the Rancher Desktop FAQ

kubectl config use-context rancher-desktop
kubectl delete node lima-rancher-desktop

but that didn’t resolve the issue.

I experience this daily on 1x MacBook Pro M1 2020, 2x Mac Mini M1 and 2x Mac Mini x86. My MacBook Pro is going to sleep so that might be related but the other four Mac Minis are used for CI and always on. I don’t use Docker for any running services, it’s only used when CI needs Docker to perform specific tasks. Sometimes there is no communication with the Docker daemon for over 24 hours.

This happens under macOS version 10.15.6, 11.1, 12.0.1 and 12.2 so it does not seem related to a specific OS version.

About 18 hours ago I restarted Rancher Desktop from the command line on all four Mac Minis. Today, all of them had lost connection to the daemon, easiest shown by just running:

% docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
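
For CI machines, a probe like the following could run before a job starts, so the failure is detected up front rather than in the middle of a build. This is only a sketch: `docker_reachable` is a hypothetical function name, and `DOCKER_BIN` is an assumed override variable added here so the check can be exercised without a real daemon.

```shell
# DOCKER_BIN is a hypothetical override; it defaults to the real docker CLI.
DOCKER_BIN=${DOCKER_BIN:-docker}

# Succeeds if the docker CLI can talk to the daemon, fails otherwise.
# Output is discarded; only the exit status matters.
docker_reachable() {
  "$DOCKER_BIN" ps >/dev/null 2>&1
}

# Example:
#   if ! docker_reachable; then echo "docker daemon unreachable" >&2; fi
```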

Killing the remaining processes and restarting Rancher Desktop solves the issue:

% pkill 'Rancher Desktop' && sleep 30 && open /Applications/Rancher\ Desktop.app

I have not changed any configuration, which means I am also running the default Kubernetes feature.

Maybe the very little actual use of Docker and the daemon is the reason? However, since the Kubernetes feature is enabled, I’m constantly running at least the 11 default containers, so that doesn’t feel very likely.

I can also provide any kind of logs or testing that might help you.

I still have not figured out how to restart it from the commandline.

You could try this:

osascript -e 'tell application "Rancher Desktop" to quit'
while osascript -e 'if application "Rancher Desktop" is not running then error 1' 2>/dev/null; do sleep 2; done
open -a "Rancher Desktop"
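
The wait loop above never gives up if the app refuses to quit. A bounded variant could look like this sketch; `wait_while` and `rd_running` are hypothetical names, and the 60-second limit in the example is an arbitrary choice.

```shell
# wait_while SECONDS CMD...: poll CMD once per second for as long as it
# keeps succeeding. Returns 0 once CMD fails, or 1 if it is still
# succeeding after roughly SECONDS seconds.
wait_while() {
  remaining=$1
  shift
  while "$@"; do
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 0
}

# Example, reusing the osascript check from above:
#   rd_running() {
#     osascript -e 'if application "Rancher Desktop" is not running then error 1' 2>/dev/null
#   }
#   osascript -e 'tell application "Rancher Desktop" to quit'
#   wait_while 60 rd_running && open -a "Rancher Desktop"
```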

I had this happen to me yesterday after a day of working (~8 hours) and my computer not going to sleep. Rancher version 0.7.1.

That work-around worked for me. As a one-liner: ~/.rd/bin/rdctl shell sudo /etc/init.d/docker restart

I also needed export DOCKER_HOST=unix://$HOME/.rd/docker.sock
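
Since the socket shows up under different paths in this thread, a shell profile could pick whichever one currently exists instead of hard-coding it. A sketch under that assumption; `pick_docker_host` is a hypothetical helper, and the candidate paths in the example are the two mentioned here.

```shell
# Print a unix:// URL for the first argument that is a socket.
# Returns 1 (and prints nothing) if none of the candidates is a socket.
pick_docker_host() {
  for candidate in "$@"; do
    if [ -S "$candidate" ]; then
      echo "unix://$candidate"
      return 0
    fi
  done
  return 1
}

# Example:
#   DOCKER_HOST=$(pick_docker_host "$HOME/.rd/docker.sock" /var/run/docker.sock) \
#     && export DOCKER_HOST
```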

For me, I had to start docker daemon in the lima vm after creating the symlink.

$ ~/.rd/bin/rdctl  shell
lima-rancher-desktop:~$ sudo /etc/init.d/docker start
 * /var/log/docker.log: creating file
 * /var/log/docker.log: correcting owner
 * Starting Docker Daemon ...                                                                                                                               [ ok ]
lima-rancher-desktop:~$ /etc/init.d/docker status
 * status: started
lima-rancher-desktop:~$ exit

Once that was done everything was fine. Maybe I missed that step somewhere in the docs.

Same issue’s happening here 😞

error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock

Rancher Desktop Version 1.4.1

I’m having this issue as well. Every morning when I open my laptop, all docker commands fail with:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Clicking the “Reset Kubernetes” button under “Kubernetes Settings” fixes it, but then I have to restart all my containers.

Here’s my system info:

  • Rancher Desktop version: 1.0.0
  • macOS: 11.6.2 (Big Sur)
  • Using the dockerd container runtime
  • Using Kubernetes version v1.19.16
  • Using Rancher Desktop to create symbolic links under /usr/local/bin/docker and /usr/local/bin/nerdctl (these checkboxes are selected under “Supporting Utilities”)

Every time my laptop sleeps I get the same error.

Cannot connect to the Docker daemon at unix:///Users/username/.rd/docker.sock. Is the docker daemon running?

I tried the ~/.rd/bin/rdctl shell sudo /etc/init.d/docker restart suggestion as well and it also didn’t change anything.

 * Stopping Docker Daemon ...                                                                                                                                                                     [ ok ]
 * Starting Docker Daemon ...                                                                                                                                                                     [ ok ]

Rancher Desktop 1.8.1 macOS Ventura 13.4

This just happened but does not usually, at least up until now.

I tried the ~/.rd/bin/rdctl shell sudo /etc/init.d/docker restart suggestion to no avail.

I needed to quit Rancher Desktop and launch it again in order for docker ps to work.

Rancher Desktop 1.8.1 macOS Ventura 13.3.1

Our Mac is running 24/7 and we are still affected; we have to restart Rancher Desktop from time to time. Does anyone know how to restart the Rancher Desktop UI from the command line? 😄

Have now had this happen once with 1.0.1, i.e. put computer to sleep, after wake the socket was not available; but also, I have had it not happen: Put computer to sleep, after wake everything was fine.

I think this is a duplicate of #716, but keeping both issues open, as there is discussion on both of them.

I tried to just restore the docker.sock file, but no success.

Command to create the symlink: ln -sf /var/run/docker.sock $HOME/Library/Application\ Support/rancher-desktop/lima/0/sock/docker