certificates: Health check timeout (container state: unhealthy)

Subject of the issue

The step-ca container's health state is first shown as Up (health: starting) and later turns to Up (unhealthy). The service itself runs fine and logs that it is listening, so the health check apparently always fails.

Your environment

  • OS: WSL 2
  • Version: 2 (Windows 10 x64)

Steps to reproduce

docker-compose.yml:

version: '3.7'

services:

  # Smallstep Step CA
  step-ca:
    image: smallstep/step-ca:0.15.6
    restart: always

> docker-compose up -d
> docker-compose ps
Up (health: starting)
# after some time (about one minute)
> docker-compose ps
Up (unhealthy)

Expected behaviour

Because the CA service runs correctly, the health check should pass and the container state should become Up (healthy) or similar, not Up (unhealthy). The health check also takes too long: the container stays in (health: starting) for about a minute.

Actual behaviour

The health check takes too long (Up (health: starting)) and then fails after about a minute (Up (unhealthy)).

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 20 (8 by maintainers)

Most upvoted comments

The name in the health check URL has to match one of the names in dnsNames in ca.json. So list ca.diesel.net, 127.0.0.1, and localhost in dnsNames in ca.json, and then change defaults.json to use localhost (or 127.0.0.1).
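As a rough illustration of the matching rule above, the following Python sketch compares the host in the "ca-url" field of defaults.json with the "dnsNames" list in ca.json. It assumes the health check takes its URL from "ca-url"; the function name, the sample values, and port 9000 are made up for illustration, and in practice you would load the two files from your own step configuration directory.

```python
import json
from urllib.parse import urlsplit


def health_url_host_matches(ca_config: dict, defaults: dict) -> bool:
    """Return True if the host in defaults["ca-url"] appears in ca_config["dnsNames"]."""
    host = urlsplit(defaults["ca-url"]).hostname  # urlsplit lowercases the hostname
    names = [name.lower() for name in ca_config.get("dnsNames", [])]
    return host in names


# Illustrative stand-ins for the contents of ca.json and defaults.json (assumed values).
ca_config = json.loads('{"dnsNames": ["ca.diesel.net", "127.0.0.1", "localhost"]}')
defaults = json.loads('{"ca-url": "https://localhost:9000"}')

print(health_url_host_matches(ca_config, defaults))
```

If the function returns False for your real files, the health check is probably calling a name the CA certificate does not cover.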

When you got connect: connection refused above, it looks like you hadn’t yet started the step-ca server.

I ran into the same DNS resolution problem when using Docker Swarm. Instead of modifying ca.json and defaults.json, I used the extra_hosts service option so that ca.diesel.net resolves to 127.0.0.1 inside the container. This is my stack compose file. Note that it takes around 30 seconds for the service to finish coming up.

version: '3.4'

networks:
  step:
    external: true

volumes:
  step_home_step:
    external: true

services:
  step:
    extra_hosts:
      - 'ca.diesel.net:127.0.0.1'
    image: smallstep/step-ca:0.20.0
    networks:
      - step
    volumes:
      - source: step_home_step
        target: /home/step
        type: volume
        volume:
          nocopy: true
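
To confirm that a mapping like the extra_hosts entry above takes effect, one could check that the name resolves to a loopback address. A minimal Python sketch follows; the helper name is hypothetical, and it assumes Python is available in an environment that shares the container's hosts mapping, which the step-ca image itself may not provide.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if hostname resolves (via /etc/hosts or DNS) to a loopback IPv4 address."""
    try:
        addr = socket.gethostbyname(hostname)
    except socket.gaierror:
        # The name could not be resolved at all.
        return False
    return ipaddress.ip_address(addr).is_loopback
```

Calling resolves_to_loopback("ca.diesel.net") in such an environment should return True once the mapping is in place.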

@tomdaley92 OK, I see what’s going on with your last output: step-ca is not running.

Looking at the Ansible configuration on your GitHub, you’re mounting a pre-created configuration, which is good. To imitate this using docker run, you need to pre-create the configuration with step ca init, and make sure that the paths in ca.json and defaults.json point to /home/step/* instead of your local path.

Then start the ca with the volume mounted, using the default command and running the health check:

docker run --mount type=bind,source="/tmp/docker",target=/home/step -it -e STEPDEBUG=1 smallstep/step-ca:0.16.0

And in another terminal, exec into the container and run the health check:

$ docker exec -it 10ce907bea0e sh
~ $ ps
PID   USER     TIME  COMMAND
    1 step      0:00 /usr/local/bin/step-ca --password-file /home/step/secrets/password /home/step/config/ca.json
   47 step      0:00 sh
   55 step      0:00 ps
~ $ step ca health
ok

And if you look at the output of step-ca, you will see that the health check (the one in the Dockerfile) runs every 30s.
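
Because the check only runs periodically, the container sits in "starting" for a while before Docker settles on "healthy" or "unhealthy". A small Python sketch of waiting for that transition is shown below. The docker inspect format string is the standard way to read the health status; the function names, the 120-second timeout, and the 5-second polling interval are arbitrary choices for illustration.

```python
import subprocess
import time
from typing import Callable


def docker_health(container: str) -> str:
    """Read the health status Docker records for a container (starting/healthy/unhealthy)."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_health(
    read_status: Callable[[], str],
    timeout: float = 120.0,   # assumed upper bound, not a Docker default
    interval: float = 5.0,    # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status leaves "starting" or the timeout elapses; return the last status."""
    deadline = time.monotonic() + timeout
    status = read_status()
    while status == "starting" and time.monotonic() < deadline:
        sleep(interval)
        status = read_status()
    return status
```

For example, wait_for_health(lambda: docker_health("step-ca")) would block until the container named step-ca (a placeholder name) reports a final state or the timeout passes.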

Happy to help. The command to start step-ca in the container is /usr/local/bin/step-ca --password-file $PWDPATH $CONFIGPATH (see the Dockerfile’s CMD line).