harbor: Harbor containers fail to start on docker startup
After restarting the docker daemon (or rebooting the system), not all harbor containers get started successfully. Because of the restart: always directive in docker-compose.yml, I expected harbor to automatically start up after rebooting.
Note about my setup: I’m running harbor behind nginx as a reverse proxy, as explained in #3114 (main changes: port of proxy service, public_url fixup in prepare script, X-Forwarded-Proto changes as detailed in troubleshooting documentation).
Steps to reproduce the problem:
- restart docker
- observe docker ps -a and see that only some containers reach Up status
Manually starting harbor via docker-compose -f ./docker-compose.yml -f docker-compose.clair.yml up -d works fine.
- harbor version: 1.7.1
- docker engine version: 5:18.09.2~3-0~debian-stretch
- docker-compose version: 1.23.2, build 1110ad01
Additional context:
See below for log messages from dockerd. It seems that the ordering is somehow not observed, that is, services that depend on the log service try to start up before syslog is fully up and running, but give up early, when they find that they cannot reach syslog. Docker doesn’t appear to be attempting any more restarts in this case.
Feb 23 16:39:17 code dockerd[28575]: time="2019-02-23T16:39:17.457432463+01:00" level=info msg="Loading containers: start."
Feb 23 16:39:19 code dockerd[28575]: time="2019-02-23T16:39:19.534774219+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Feb 23 16:39:22 code dockerd[28575]: time="2019-02-23T16:39:22.951084811+01:00" level=error msg="5cbeb21200bb05437f9fd599d829babd6ca5d6b946ddbc0986733d85b52ec3ca cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:22 code dockerd[28575]: time="2019-02-23T16:39:22.951114412+01:00" level=error msg="Failed to start container 5cbeb21200bb05437f9fd599d829babd6ca5d6b946ddbc0986733d85b52ec3ca: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:25 code dockerd[28575]: time="2019-02-23T16:39:25.395062463+01:00" level=error msg="942dd05349486b3d1646415572abc126ebce7b31aec54888ea859467b2d0b422 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:25 code dockerd[28575]: time="2019-02-23T16:39:25.395085466+01:00" level=error msg="Failed to start container 942dd05349486b3d1646415572abc126ebce7b31aec54888ea859467b2d0b422: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:25 code dockerd[28575]: time="2019-02-23T16:39:25.491026866+01:00" level=error msg="371c7e506f09e1f6c1c154d6fcacb862432bd7b4cb831a59d6fdc4b3dac92d87 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:25 code dockerd[28575]: time="2019-02-23T16:39:25.491056484+01:00" level=error msg="Failed to start container 371c7e506f09e1f6c1c154d6fcacb862432bd7b4cb831a59d6fdc4b3dac92d87: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:26 code dockerd[28575]: time="2019-02-23T16:39:26.670989556+01:00" level=error msg="0420313e7b29e1eb74776495d3ccd6be383ab6c14dc2a4c9c31f770b3f8ac724 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:26 code dockerd[28575]: time="2019-02-23T16:39:26.671032205+01:00" level=error msg="Failed to start container 0420313e7b29e1eb74776495d3ccd6be383ab6c14dc2a4c9c31f770b3f8ac724: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:27 code dockerd[28575]: time="2019-02-23T16:39:27.771129969+01:00" level=error msg="5dc8e982846d7f339a86ff9e20c937b4ce18b7634d5b28e3b291723dc5403a38 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:27 code dockerd[28575]: time="2019-02-23T16:39:27.771157135+01:00" level=error msg="Failed to start container 5dc8e982846d7f339a86ff9e20c937b4ce18b7634d5b28e3b291723dc5403a38: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:27 code containerd[871]: time="2019-02-23T16:39:27.791326620+01:00" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/71a79f0da980e15c3e425066c0ef83d455d67361140245cfc34d6abdd37913f6/shim.sock" debug=false pid=28994
Feb 23 16:39:28 code containerd[871]: time="2019-02-23T16:39:28.161320952+01:00" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/d462cae2d8bc4614fbfddd692706866e301d5e0650cd8cd232929470a40d4c67/shim.sock" debug=false pid=29062
Feb 23 16:39:28 code containerd[871]: time="2019-02-23T16:39:28.874403639+01:00" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/5004316792bc356ade270d746be0fe886df634f50ae6bf4730d1696f95632fcc/shim.sock" debug=false pid=29165
Feb 23 16:39:29 code dockerd[28575]: time="2019-02-23T16:39:29.278982945+01:00" level=error msg="9708ab1bfbcfce250d7d6876517792f01e17fdbd682e4f6e80445313ff638de6 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:29 code dockerd[28575]: time="2019-02-23T16:39:29.279020093+01:00" level=error msg="Failed to start container 9708ab1bfbcfce250d7d6876517792f01e17fdbd682e4f6e80445313ff638de6: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:29 code dockerd[28575]: time="2019-02-23T16:39:29.678990965+01:00" level=error msg="05e5805d46720f24a854a361159cefdbfe31773ccf52c784c08c14a23b8a57d0 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:29 code dockerd[28575]: time="2019-02-23T16:39:29.679026771+01:00" level=error msg="Failed to start container 05e5805d46720f24a854a361159cefdbfe31773ccf52c784c08c14a23b8a57d0: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.078974849+01:00" level=error msg="ae98ae73d7b252321a86f2bef8c27da5f974f98e74435b9fd35775bf98c9bf84 cleanup: failed to delete container from containerd: no such container"
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.079012597+01:00" level=error msg="Failed to start container ae98ae73d7b252321a86f2bef8c27da5f974f98e74435b9fd35775bf98c9bf84: failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused"
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.079048260+01:00" level=info msg="Loading containers: done."
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.115257599+01:00" level=info msg="Docker daemon" commit=6247962 graphdriver(s)=overlay2 version=18.09.2
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.115336823+01:00" level=info msg="Daemon has completed initialization"
Feb 23 16:39:30 code dockerd[28575]: time="2019-02-23T16:39:30.139043317+01:00" level=info msg="API listen on /var/run/docker.sock"
About this issue
- State: closed
- Created 5 years ago
- Reactions: 5
- Comments: 38 (1 by maintainers)
You can also create a systemd service (see config below). This will start all Harbor Docker containers using systemd.
I believe this problem still exists.
docker-compose uses “depends_on” to make sure the logging container is started before the others. The “restart: always” setting ensures the docker daemon starts the containers again after a restart, but it has no notion of any dependencies. So all containers are started at the same time, but since the syslog service is not reachable, some of the containers fail to start up, and the daemon doesn’t try to start them again afterwards.
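A minimal sketch of the kind of compose fragment this comment describes, for readers who do not have the file at hand. The service name registry, the tag value, and the 10514 container port are assumptions for illustration. Only the restart: always policy, the depends_on entry for the log service, and the syslog address 127.0.0.1:1514 come from this thread.

```yaml
# Fragment only: image, volumes, and networks are omitted.
services:
  log:
    restart: always
    ports:
      # Assumed mapping; the thread only shows that clients dial 127.0.0.1:1514.
      - "127.0.0.1:1514:10514"
  registry:
    restart: always
    # Honored by docker-compose up, but not by the daemon's restart policy.
    depends_on:
      - log
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://127.0.0.1:1514"
        tag: "registry"
```

When the daemon restarts, it brings both containers back without ordering them, so registry can try to reach the syslog address before log is listening on it.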
Using “depends_on” seems to be a poor way to ensure a correct startup procedure, since the docker daemon doesn’t support it.
So the fix is to either get rid of the syslog logging (which is quite confusing anyway, tbh) and just use “docker logs” to view log entries, or add a systemd unit as a workaround (as mentioned above).
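The first option could look roughly like the fragment below. This is a sketch under assumptions, not a tested Harbor configuration: the service name registry and the rotation values are illustrative. json-file is Docker's default logging driver and is the one that docker logs reads from.

```yaml
# Fragment only: one service shown; the same change would apply to each service.
services:
  registry:
    restart: always
    # No depends_on entry for the log service and no syslog address,
    # so starting this container no longer requires the log container.
    logging:
      driver: "json-file"
      options:
        max-size: "10m"   # assumed rotation size
        max-file: "3"     # assumed number of rotated files
```

With this in place, docker logs on the container shows its output directly, at the cost of no longer collecting logs centrally through the log container.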
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This problem still exists in the latest release
I have a fix for the logging error:
failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused.
Configured the logging driver option mode: "non-blocking" inside the docker-compose.yml file (for each container).
Please add this mode option to the official docker-compose.yml file.
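As a sketch, the option described in this comment would sit in each service's logging block as shown below. The service name registry and the tag value are assumptions. mode is a standard Docker logging option, and whether it avoids the startup failure is as reported by the commenter, not verified here.

```yaml
# Fragment only: repeat the mode option for every service that logs to syslog.
services:
  registry:
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://127.0.0.1:1514"
        tag: "registry"
        # Buffer log messages in memory instead of blocking the container.
        mode: "non-blocking"
```

In non-blocking mode, messages can be dropped when the in-memory buffer fills up, so this trades log completeness for startup robustness.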
Great! Works fine 😃
Please add this mode option to official docker-compose.yml file
Solved the problem by removing the “log” entry from every “depends_on” block, and the below section from every service.
@rvanbutselaar's solution still works for me (#7008 (comment))
Without it, I still randomly get the error “failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused”. The systemd unit ensures that all the services start with the correct dependencies. @Gui13, are you sure your unit has started correctly? What is the output of “systemctl status <your unit>”?
Mine is:
My config is: docker v19.03.6, docker-compose v1.24.0, harbor v2.1.1
Even v2.0.0 has this issue. After a reboot of the server only half of the containers are restarted. Running “docker-compose start” manually again is a possible workaround.
I experience the same issue with:
- harbor version: 1.7.5
- docker engine version: 18.09.5
- docker-compose version: 1.24.0, build 0aa5906
- OS: CentOS 7.6
“docker container ls -a” shows 4 exited containers after some reboot:
aa8f4ee26790  goharbor/harbor-jobservice:v1.7.5  "/harbor/start.sh"  About an hour ago  Exited (137) 3 minutes ago  harbor-jobservice
860f4e5cd2d8  goharbor/clair-photon:v2.0.8-v1.7.5  "/docker-entrypoint.…"  About an hour ago  Exited (137) 3 minutes ago  6060-6061/tcp  clair
e014632adcce  goharbor/redis-photon:v1.7.5  "docker-entrypoint.s…"  About an hour ago  Exited (137) 3 minutes ago  6379/tcp  redis
a7e7bb4f42df  goharbor/harbor-adminserver:v1.7.5  "/harbor/start.sh"  About an hour ago  Exited (137) 3 minutes ago  harbor-adminserver
“docker inspect” on harbor-adminserver shows the error:
"State": {
    "Status": "exited",
    "Running": false,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 0,
    "ExitCode": 137,
    "Error": "failed to initialize logging driver: dial tcp 127.0.0.1:1514: connect: connection refused",
It seems to be related to a start-ordering issue. It should not be an issue with the “restart: always” policy.
Actually, the issue seems to be that this restart policy is not applied. It’s more of a docker issue in my opinion.