caddy: obtaining certificate: context canceled, but does not restart
I’m trying to automate a caddy deployment and since I’m using multiple sub-domains, I’d like to get a wildcard TLS certificate.
In order to provision wildcard certificates, I need to use DNS-01 challenge. And I did not want to put the API keys to my domain registrar on a static file that would be on the file system on the server.
Hence, I put in a dummy key and then loaded the actual key manually via the caddy API.
However, the problem is that as soon as the caddy server starts, it starts the certificate maintenance routine which would fail with acme: error presenting token: digitalocean: HTTP 401: unauthorized.
Now, after I put in the correct API key for the registrar in via the caddy API, the caddy server reloads again. And this time when the certificate maintenance routine starts, it notices that there is already another obtaining certificate context and kills it.
log
-- Logs begin at Sat 2020-03-28 22:04:26 UTC, end at Sun 2020-03-29 18:50:54 UTC. --
Mar 28 22:07:28 systemd[1]: Started Caddy Web Server.
Mar 28 22:07:28 caddy[21276]: caddy.HomeDir=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Mar 28 22:07:28 caddy[21276]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Mar 28 22:07:28 caddy[21276]: runtime.GOOS=linux
Mar 28 22:07:28 caddy[21276]: runtime.GOARCH=amd64
Mar 28 22:07:28 caddy[21276]: runtime.Compiler=gc
Mar 28 22:07:28 caddy[21276]: runtime.NumCPU=1
Mar 28 22:07:28 caddy[21276]: runtime.GOMAXPROCS=1
Mar 28 22:07:28 caddy[21276]: runtime.Version=go1.14
Mar 28 22:07:28 caddy[21276]: os.Getwd=/
Mar 28 22:07:28 caddy[21276]: LANG=C.UTF-8
Mar 28 22:07:28 caddy[21276]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Mar 28 22:07:28 caddy[21276]: HOME=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: LOGNAME=caddy
Mar 28 22:07:28 caddy[21276]: USER=caddy
Mar 28 22:07:28 caddy[21276]: INVOCATION_ID=cd51f6d2b270456882c64abf5851abfd
Mar 28 22:07:28 caddy[21276]: JOURNAL_STREAM=9:51261
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8972936,"msg":"no autosave file exists","autosave_file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8984275,"msg":"using provided configuration","config_file":"/var/lib/caddy/.config/caddy/caddy.json","config_adapter":""}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9018965,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9043777,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9058797,"logger":"tls","msg":"cleaned up storage units"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9075916,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.909218,"msg":"serving initial configuration"}
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][cache:0xc000660be0] Started certificate maintenance routine
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [WARN] [*.lola.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.lola.caddy.test.shinenelson.xyz.'
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:33 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:33 caddy[21276]: (challenge=dns-01 remaining=[])
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.mark.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.mark.caddy.test.shinenelson.xyz.'
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.shine.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.shine.caddy.test.shinenelson.xyz.'
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]: (challenge=dns-01 remaining=[])
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]: (challenge=dns-01 remaining=[])
Mar 28 22:07:35 caddy[21276]: 2020/03/28 22:07:35 [ERROR] attempt 1: [*.lola.caddy.test.shinenelson.xyz] Obtain: [*.lola.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:35 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:35 caddy[21276]: - retrying in 1m0s (6.672599298s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.mark.caddy.test.shinenelson.xyz] Obtain: [*.mark.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]: - retrying in 1m0s (7.660701675s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.shine.caddy.test.shinenelson.xyz] Obtain: [*.shine.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]: - retrying in 1m0s (7.729894166s/720h0m0s elapsed)...
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1349359,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/dns-challenge","remote_addr":"127.0.0.1:45044","headers":{"Accept":["*/*"],"Content-Length":["139"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1362393,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1371884,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc000660be0] Stopped certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1380074,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1384685,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc0007fca50] Started certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.shine.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.mark.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.lola.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:08:48 caddy[21276]: {"level":"error","ts":1585433328.1396031,"logger":"http.log.error","msg":"making dial info: upstream localhost: invalid dial address localhost: address localhost: missing port in address","request":{"method":"GET","uri":"/","proto":"HTTP/1.1","remote_addr":"183.82.162.3:55114","host":"a.shine.caddy.test.shinenelson.xyz","headers":{"Accept":["*/*"],"User-Agent":["curl/7.69.0-DEV"]}}}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8840406,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/proxy_shine","remote_addr":"127.0.0.1:45048","headers":{"Accept":["*/*"],"Content-Length":["103"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8849828,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8881805,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc0007fca50] Stopped certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc00072b360] Started certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.9491303,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:09:09 caddy[21276]: {"level":"info","ts":1585433349.3899143,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253163
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253201
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253348
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.shine.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.lola.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO] [*.mark.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock
The problem is that the certificate maintenance routine does not retry obtaining the certificate again. I’ll have to reload the caddy server again before it’ll try again and generate the TLS certificates ( if the API keys are correct ) using the DNS challenge.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 23 (13 by maintainers)
Commits related to this issue
- Allow duplicate 'obtain' jobs See https://github.com/caddyserver/caddy/issues/3202 — committed to caddyserver/certmagic by mholt 4 years ago
Great to hear it!
Here’s what was happening:
When a Caddy config is loaded and started, it initializes all the apps, like the
tlsapp which manages certificates. If a new config fails to load completely successfully, it is stopped and the old one keeps running. If it succeeds, it stops the old one instead. For a brief period, both configs are active. (Explained here)Under the hood, CertMagic works on “obtain” and “renew” jobs asynchronously (in the background). When Caddy asks to obtain certificates, CertMagic creates a series of jobs, one for each name. Each job had a unique name/ID so that jobs wouldn’t be duplicated. In other words, you don’t need 3 jobs that all attempt to renew the same certificate. If a job with the same name was already queued, CertMagic would ignore the new job so as to not duplicate it and clog up the queue.
The problem is that Caddy calls
ManageAsync()before shutting down the previous config. That’s by design, but it was problematic because CertMagic would see the old config’s “obtain” jobs of the same name and discard the new ones, not knowing that the old ones were about to be canceled! Hence, the second config load only canceled the obtain job from the first config, and did not actually start the new ones.The change I made removes job de-duplication for “obtain” jobs – i.e. gives them an empty name. This is because we now assume that whoever is calling
ManageAsync()will not do so excessively, and/or they will cancel the context of any prior calls.In other words, the fix is to allow multiple/overlapping “obtain” jobs, so for a brief time, yes there are two jobs that try to obtain the same certificate, but they are synced by a lock mechanism, and one of them is canceled quickly anyway.
So, I think this is a nice and simple solution to a tricky and obscure problem.
Thanks for your patience and diligence in getting it sorted out!
All good! I too tested it multiple times. I broke the DNS auth a couple of times to make sure it would still generate the new certificates. This is good to go 🚀
It’s fine, this was a weird case that is caused by the interplay between the two. CertMagic would have been a more confusing place to report this bug, but the fix was much much easier in CertMagic.
On it.
Side note : I don’t know why I come up with issues that should be reported at caddyserver/certmagic and report them here. I’ll try to remember that the certmagic repository also exists next time I have an issue with TLS certificates.
sure; it’ll take me a bit to get my test instance ready. Let me go and get that instance up by the time you merge that change in.
Interesting… I think I see what you’re talking about now. I have a hunch, I’ll need more time to mull it over though and try some things.
If I’m right, the good news is that we’re not leaking resources – the bad news is that we’re cleaning up too many resources. Either way, as you’ve noted, a practical workaround in the meantime is to trigger another config reload. You can do this by POSTing your config to the /load endpoint with
Cache-Control: must-revalidatein the request header (which I just realized I forgot to document).Ah, interesting. I think the template is on the v1 branch, we’ll need to add it to the new v2/master branch now that we migrated the default branch. Thanks for pointing that out
I know I messed up the issue template the last time ( unfortunately, it somehow doesn’t exist in the repository or come up while creating a new issue. I had to copy-paste it from an earlier comment ) and went all over the place, so this time, I’m going to be a little bit more diligent with the issue template at least.
I’m going to try and collate everything I said previously into this one comment so that you don’t have to scroll back and forth for context.
1. Environment
1a. Operating system and version
$ lsb_release --all1b. Caddy version (run
caddy versionor paste commit SHA)latest
masterhttps://github.com/caddyserver/caddy/commit/68cebb28d063a7a71705ce022f118b5e1205fa3f + tls.dns.digitalocean1c. Go version (if building Caddy from source; run
go version)go version go1.14 linux/amd642. Description
2a. What happens (briefly explain what is wrong)
If a
certificate maintenance routineis in a wait context and the server is reloaded, the new instance of the server kills the existing context, but does not start a new one.2b. Why it’s a bug (if it’s not obvious)
While the killing of the previous
obtain certificatecontext is desirable, I don’t think the new server not starting a new context is the expected behavior.2c. Log output
I’ll merge the log along with the Steps to Reproduce so that I have the flexibility of breaking the logs by steps making it further comprehensible otherwise the log output might be confusing to understand what I’m talking about.
2d. Initial configuration
PS : This is the configuration that the server is started with. The configuration is updated later via the API.
3. Steps to Reproduce
caddyserver with above configuration file$ caddy run --environ --config /var/lib/caddy/.config/caddy/caddy.jsonLog
This is expected since the
digitalocean_auth_tokenis not an actualauth_token.PATCHexisting configuration with proper actualauth_token$ curl -X PATCH -H 'Content-Type: application/json' -d '{ "@id": "dns-challenge", "provider": "digitalocean", "auth_token": "actualdigitaloceanauthtoken" }' localhost:2019/id/dns-challengeLog
This is where the issue is. Ideally, the new
certificate maintenance routineshould start a newobtaining certificatecontext. Instead, it just stays there. The last time, I waited only for 1 minute. This time, I waited longer - around 5 minutes.PATCHing some other@id$ curl -X PATCH -H 'Content-Type: application/json' -d '{ "@id": "proxy_mark", "handler": "reverse_proxy", "upstreams": [ { "dial": "localhost" } ] }' localhost:2019/id/proxy_markLog
I’m not sure if this is intended behaviour, but I’d expect the above to happen right after the previous
obtain certificatecontext is canceled in Step 2.Sidenotes ( from previous comment )
This was because the
--resumeflag was missing in the service file and I had missed to look at the documentation for all parameters of the command line. I took the service file for granted. The new README.md#L52 explains it clearly, but that file came only 5 days ago 🙈 I haven’t tested with the flag yet but I’m guessing the flag was the culprit after all.I’m sorry I couldn’t get this tested sooner. I’m currently limited by a test environment, so I will not be able to give you more detailed test cases for reproduction.
I will get back to this as soon as I have my test environment back online.
~ shine – Sent from /e/ Mail.
On 30 March 2020 8:04:22 PM IST, Matt Holt notifications@github.com wrote: