kuma: DPP w/invalid or missing CA cert should fail instead of endless error loop

Summary

If a DPP starts with a missing or invalid CA cert, both the DPP and the CP get into an endless error loop. The DPP should instead fail and exit, since a retry is never going to succeed:

DPP:

```
[2021-05-21 17:19:18.477][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:18.477][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 535 ms.
[2021-05-21 17:19:19.015][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.015][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 226 ms.
[2021-05-21 17:19:19.246][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.246][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 427 ms.
[2021-05-21 17:19:19.671][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.671][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 918 ms.
[2021-05-21 17:19:20.594][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:20.594][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 628 ms.
```

CP:

```
2021/05/21 17:19:12 http: TLS handshake error from 172.31.4.238:40740: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40742: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40744: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40746: remote error: tls: unknown certificate authority
2021/05/21 17:19:14 http: TLS handshake error from 172.31.4.238:40748: remote error: tls: unknown certificate authority
2021/05/21 17:19:14 http: TLS handshake error from 172.31.4.238:40750: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40752: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40754: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40756: remote error: tls: unknown certificate authority
2021/05/21 17:19:16 http: TLS handshake error from 172.31.4.238:40758: remote error: tls: unknown certificate authority
2021/05/21 17:19:16 http: TLS handshake error from 172.31.4.238:40760: remote error: tls: unknown certificate authority
2021/05/21 17:19:17 http: TLS handshake error from 172.31.4.238:40762: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40764: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40766: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40768: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40770: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40772: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40774: remote error: tls: unknown certificate authority
2021/05/21 17:19:20 http: TLS handshake error from 172.31.4.238:40776: remote error: tls: unknown certificate authority
```

Steps To Reproduce

1. ```
[root@ip-172-31-2-167 ~]# env | grep KUMA
KUMA_GENERAL_TLS_KEY_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.key
KUMA_DP_SERVER_TLS_KEY_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.key
KUMA_API_SERVER_AUTH_CLIENT_CERTS_DIR=/home/ec2-user
KUMA_GENERAL_TLS_CERT_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.crt
KUMA_DP_SERVER_TLS_CERT_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.crt
[root@ip-172-31-2-167 ~]# KUMA_MODE=remote KUMA_MULTIZONE_REMOTE_ZONE=universal-2 KUMA_MULTIZONE_REMOTE_GLOBAL_ADDRESS=grpcs://ip-172-31-7-0.us-east-2.compute.internal:5685 KUMA_DNS_SERVER_PORT=53 kuma-cp run --license-path=/home/ec2-user/license.json
```

2. ```
[ec2-user@ip-172-31-4-238 ~]$ env | grep KUMA
KUMA_GENERAL_TLS_KEY_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.key
KUMA_DNS_SERVER_PORT=53
KUMA_DNS_SERVER_CIDR=240.0.0.0/4
KUMA_DP_SERVER_TLS_KEY_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.key
KUMA_API_SERVER_AUTH_CLIENT_CERTS_DIR=/home/ec2-user
KUMA_DNS_SERVER_DOMAIN=mesh
KUMA_GENERAL_TLS_CERT_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.crt
KUMA_DP_SERVER_TLS_CERT_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.crt
[ec2-user@ip-172-31-4-238 ~]$ kuma-dp run --cp-address=https://ip-172-31-2-167.us-east-2.compute.internal:5678/ --dataplane-token-file=/home/ec2-user/universal-token --dataplane-file=/home/ec2-user/dataplane-universal.yaml  --dns-enabled
```

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 28 (20 by maintainers)

Most upvoted comments

Hi, we just changed our installation to use self-signed certs, as suggested by @jakubdyszkiewicz, and it seems much more reliable now.

I was actually aware of the “unhealthy” relationship between ArgoCD, Helm, and Kubernetes secrets, but for whatever reason I always believed the Kuma CPs were responsible for generating the certs.

Thanks a lot for the investigation! From my perspective the issue can be marked as resolved 👍

Hi @tibuntu, we have a team offsite, so we’re a bit slow with these things. I’ll bubble up this issue today.