moby: can't create unix socket /var/run/docker.sock: is a directory

Description

Steps to reproduce the issue:

  1. Restart Docker

Describe the results you received:

Docker can’t boot up after a restart. An error from journal:

dockerd[30701]: time="2017-01-22T08:38:55.077780858Z" level=fatal msg="can't create unix socket /var/run/docker.sock: is a directory"

At this point /var/run/docker.sock is indeed a directory. (wut?)

$ ls -lah /var/run/docker.sock
total 0
drwxr-xr-x  2 root root   40 Jan 22 08:37 .
drwxr-xr-x 30 root root 1.2K Jan 22 08:38 ..

Describe the results you expected:

To restart Docker without an error

Additional information you deem important (e.g. issue happens only occasionally):

This happens from time to time when the dockerd daemon is restarted. After I removed the directory manually, Docker booted up fine and created the socket, but after several restarts the issue came back.

Output of docker version:

Client:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   78d1802
 Built:        Wed Jan 11 00:23:16 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   78d1802
 Built:        Wed Jan 11 00:23:16 2017
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 42
 Running: 20
 Paused: 0
 Stopped: 22
Images: 14
Server Version: 1.12.6
Storage Driver: overlay
 Backing Filesystem: xfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge null host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-57-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.795 GiB
Name: <node-name>
ID: <id>
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

AWS EC2 instance

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Reactions: 13
  • Comments: 33 (11 by maintainers)

Most upvoted comments

Hm, what happened here is that when bind-mounting files or directories from the host, the host path is automatically created by docker if it doesn't exist (we tried deprecating that behavior, but many people rely on it; see https://github.com/docker/docker/pull/21666 and the issues linked from it). If the path (/var/run/docker.sock in this case) does not exist, docker assumes it must be a directory, so it creates a directory /var/run/docker.sock and bind-mounts that into the container. As an alternative, you could bind-mount /var/run into the container instead of the socket itself.
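
For illustration, a sketch of the two mount styles; the image name is a placeholder:

docker run -v /var/run/docker.sock:/var/run/docker.sock example/agent   # if the socket is missing on the host, docker creates a directory at that path
docker run -v /var/run:/var/run example/agent                           # mounting the parent directory avoids the auto-created directory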

I’m not sure though how the container / bind-mount could be created before the daemon was “up” (and the socket created).

ping @cpuguy83 any idea?

It's a race condition. The HTTP server is spun up separately from the daemon.

Today I figured out that my system had two odd settings I was not aware of:

  1. /var/run was not (as usual) a symlink to /run
  2. I had not enabled docker.socket but had started docker.service directly

The second point might contribute to the race condition. If docker.socket is enabled, systemd takes care of the socket even before dockerd and any containers are active. Thus the socket is already there, and there is no chance that a container creates a directory in its place.

systemctl enable docker.socket
systemctl start docker.socket
systemctl enable docker.service
systemctl start docker.service

The standard service and socket unit files are set up so that docker.service starts AFTER docker.socket.
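
For reference, a minimal sketch of that wiring; the unit files shipped with your Docker version and distribution may differ:

# docker.socket (sketch)
[Unit]
Description=Docker Socket for the API

[Socket]
ListenStream=/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target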

I noticed the first point because the docker.socket unit points to /run/docker.sock, which is fine if /var/run is just a symlink to /run. If it is not, dockerd and all containers looking for /var/run/docker.sock (the standard path) will not find it.
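
A quick way to check whether that symlink is in place; on a typical systemd-based system the output shows /var/run pointing to /run:

ls -ld /var/run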

I stopped all services, renamed the original folder (mv /var/run/ /var/run_delete_me), and created the symlink:

cd /var/
ln -s ../run

After a reboot, all dynamic content was placed in /run and I could delete the old directory:

rm -rf /var/run_delete_me

Now I no longer have any problem with socket creation at boot time. Hope this helps others as well.

I got hit with this issue while upgrading from 20.10.8 to 20.10.9 - so during a routine update.

The only container that mounted /var/run/docker.sock was the datadog-agent. For now I am going to remove that container again, as it's not worth the trouble.

Not sure how to solve this, but it seems strange that the containers in “restart-mode” would start before the docker-daemon is listening to the socket?

Maybe something could be built into docker so it waits for the socket before launching containers? Or the workaround mentioned by @stuszynski could be added to the upstream packaging 😃

I ran into the same issue this morning.

docker version
Client:
 Version:      17.03.1-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Fri Mar 24 00:40:33 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.1-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Fri Mar 24 00:40:33 2017
 OS/Arch:      linux/amd64
 Experimental: false

We do have containers with /var/run/docker.sock mounted.

All I did was run service docker restart, and then I saw the following error messages in the logs.

tail -f /var/log/upstart/docker.log
can't create unix socket /var/run/docker.sock: is a directory
/var/run/docker.sock is up
can't create unix socket /var/run/docker.sock: is a directory
/var/run/docker.sock is up
can't create unix socket /var/run/docker.sock: is a directory
/var/run/docker.sock is up
can't create unix socket /var/run/docker.sock: is a directory
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.5 LTS
Release:        14.04
Codename:       trusty

Manually removing the directory was the workaround for me.

rm -rf /var/run/docker.sock

it seems strange that the containers in “restart-mode” would start before the docker-daemon is listening to the socket?

The daemon and API are separate bits in the code; the daemon may be up, but the API not yet listening. There are also scenarios (e.g. with the live-restore option) where containers continue running during daemon restarts.
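
For context, live-restore is a daemon option, typically set in /etc/docker/daemon.json; a minimal example:

{
  "live-restore": true
}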

Or add the workaround mentioned by @stuszynski in the upstream packaging 😃

The systemd socket approach (https://github.com/moby/moby/issues/30348#issuecomment-286796717) is already in use in all current versions of docker: https://github.com/docker/docker-ce-packaging/blob/a5db88ae1a64189e79d97f780f91e5c852d0ef3f/systemd/docker.service#L6-L13

The default is for the docker daemon to use -H fd:// (which is the file-descriptor of the socket that’s created by systemd before the docker service starts).
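
Roughly, the relevant fragment of the packaged unit file looks like this (exact flags and ordering vary between releases; see the link above for the authoritative version):

# docker.service (fragment)
[Unit]
After=docker.socket
Requires=docker.socket

[Service]
ExecStart=/usr/bin/dockerd -H fd://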

There may be one fix for the systemd unit file related to this, that hasn’t shipped yet; https://github.com/docker/docker-ce-packaging/pull/575 (not sure if it addresses this particular issue, but might help)

You can't remove those if docker is running, or if containers are running (which could be the case if the daemon has live-restore enabled). I'd also not recommend removing only the /containers subdirectory, because other files inside /var/lib/docker that keep track of state would then no longer match what's actually there.

To fix the issue with /var/run/docker.sock if you ended up in a situation where it was created as a directory: first stop docker (sudo systemctl stop docker) if it's running, then sudo rm -r /var/run/docker.sock. If you want to remove _all_ docker things (containers, images, volumes, networks, etc.), you could sudo systemctl stop docker, then sudo rm -rf /var/lib/docker. But (as said) that removes all your docker data.
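
Summarized as a command sequence for the socket-only fix:

sudo systemctl stop docker         # stop the daemon first
sudo rm -r /var/run/docker.sock    # remove the directory created in place of the socket
sudo systemctl start docker        # the socket is recreated on start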

I am still facing this problem, and I am surprised there seems to be no solution yet (or I could not find it). This always catches me cold after an "almost finished, quick reboot" maintenance session…

Disregard my upstart init mod. After talking more about this with @thaJeztah, it turns out bad things can happen with that approach. Instead, I am going to make sure that any container needing access to the docker socket bind-mounts /var/run instead of /var/run/docker.sock.
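
A sketch of what that looks like in a compose file; the service and image names are placeholders:

services:
  agent:
    image: example/agent
    volumes:
      - /var/run:/var/run   # mount the parent directory rather than the socket file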