kuma: DPP with invalid or missing CA cert should fail instead of endless error loop
Summary
If a DPP starts with a missing or invalid CA cert, both the DPP and the CP get stuck in an endless error loop. The DPP should instead fail and exit, since retrying will never succeed (a fail-fast sketch follows the logs below):
DPP:
```
[2021-05-21 17:19:18.477][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:18.477][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 535 ms.
[2021-05-21 17:19:19.015][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.015][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 226 ms.
[2021-05-21 17:19:19.246][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.246][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 427 ms.
[2021-05-21 17:19:19.671][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:19.671][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 918 ms.
[2021-05-21 17:19:20.594][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:334] StreamHealthCheck gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2021-05-21 17:19:20.594][1921][warning][upstream] [source/common/upstream/health_discovery_service.cc:71] HdsDelegate stream/connection failure, will retry in 628 ms.
```
CP:
```
2021/05/21 17:19:12 http: TLS handshake error from 172.31.4.238:40740: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40742: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40744: remote error: tls: unknown certificate authority
2021/05/21 17:19:13 http: TLS handshake error from 172.31.4.238:40746: remote error: tls: unknown certificate authority
2021/05/21 17:19:14 http: TLS handshake error from 172.31.4.238:40748: remote error: tls: unknown certificate authority
2021/05/21 17:19:14 http: TLS handshake error from 172.31.4.238:40750: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40752: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40754: remote error: tls: unknown certificate authority
2021/05/21 17:19:15 http: TLS handshake error from 172.31.4.238:40756: remote error: tls: unknown certificate authority
2021/05/21 17:19:16 http: TLS handshake error from 172.31.4.238:40758: remote error: tls: unknown certificate authority
2021/05/21 17:19:16 http: TLS handshake error from 172.31.4.238:40760: remote error: tls: unknown certificate authority
2021/05/21 17:19:17 http: TLS handshake error from 172.31.4.238:40762: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40764: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40766: remote error: tls: unknown certificate authority
2021/05/21 17:19:18 http: TLS handshake error from 172.31.4.238:40768: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40770: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40772: remote error: tls: unknown certificate authority
2021/05/21 17:19:19 http: TLS handshake error from 172.31.4.238:40774: remote error: tls: unknown certificate authority
2021/05/21 17:19:20 http: TLS handshake error from 172.31.4.238:40776: remote error: tls: unknown certificate authority
```
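The behaviour being asked for is a fail-fast check around the retry loop: treat a certificate-verification failure as unrecoverable and exit, instead of backing off and retrying forever. Below is a minimal Go sketch of that idea; it is not Kuma's actual code, and the `connectToCP` and `isUnrecoverable` helpers and the matched error strings are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// isUnrecoverable is a hypothetical classifier: a certificate-verification
// failure cannot be fixed by retrying, while a plain connection failure might.
func isUnrecoverable(err error) bool {
	msg := err.Error()
	return strings.Contains(msg, "CERTIFICATE_VERIFY_FAILED") ||
		strings.Contains(msg, "unknown certificate authority")
}

// connectToCP stands in for the DP's attempt to open its stream to the CP.
func connectToCP() error {
	return fmt.Errorf("TLS error: CERTIFICATE_VERIFY_FAILED")
}

func main() {
	backoff := 500 * time.Millisecond
	for {
		err := connectToCP()
		if err == nil {
			break
		}
		if isUnrecoverable(err) {
			// Fail fast: a bad or missing CA cert will never start working on retry.
			fmt.Fprintf(os.Stderr, "fatal: %v\n", err)
			os.Exit(1)
		}
		fmt.Fprintf(os.Stderr, "transient failure, retrying in %v: %v\n", backoff, err)
		time.Sleep(backoff)
	}
}
```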
Steps To Reproduce
1. ```
[root@ip-172-31-2-167 ~]# env | grep KUMA
KUMA_GENERAL_TLS_KEY_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.key
KUMA_DP_SERVER_TLS_KEY_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.key
KUMA_API_SERVER_AUTH_CLIENT_CERTS_DIR=/home/ec2-user
KUMA_GENERAL_TLS_CERT_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.crt
KUMA_DP_SERVER_TLS_CERT_FILE=/home/ec2-user/ip-172-31-2-167.us-east-2.compute.internal.crt
[root@ip-172-31-2-167 ~]# KUMA_MODE=remote KUMA_MULTIZONE_REMOTE_ZONE=universal-2 KUMA_MULTIZONE_REMOTE_GLOBAL_ADDRESS=grpcs://ip-172-31-7-0.us-east-2.compute.internal:5685 KUMA_DNS_SERVER_PORT=53 kuma-cp run --license-path=/home/ec2-user/license.json
```
2. ```
[ec2-user@ip-172-31-4-238 ~]$ env | grep KUMA
KUMA_GENERAL_TLS_KEY_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.key
KUMA_DNS_SERVER_PORT=53
KUMA_DNS_SERVER_CIDR=240.0.0.0/4
KUMA_DP_SERVER_TLS_KEY_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.key
KUMA_API_SERVER_AUTH_CLIENT_CERTS_DIR=/home/ec2-user
KUMA_DNS_SERVER_DOMAIN=mesh
KUMA_GENERAL_TLS_CERT_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.crt
KUMA_DP_SERVER_TLS_CERT_FILE=/home/ec2-user/ip-172-31-4-238.us-east-2.compute.internal.crt
[ec2-user@ip-172-31-4-238 ~]$ kuma-dp run --cp-address=https://ip-172-31-2-167.us-east-2.compute.internal:5678/ --dataplane-token-file=/home/ec2-user/universal-token --dataplane-file=/home/ec2-user/dataplane-universal.yaml --dns-enabled
```
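A complementary approach would be to validate the configured certificate against the CA at startup, before entering any retry loop, so the misconfiguration surfaces immediately. Here is a minimal sketch using Go's standard `crypto/x509`; the file names `cert.pem` and `ca.pem` are hypothetical placeholders for whatever the DP is configured with, and this is not Kuma's actual validation code.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"
)

// verifyAgainstCA checks that the certificate at certPath chains up to the CA
// at caPath. Both paths are placeholders for the configured TLS material.
func verifyAgainstCA(certPath, caPath string) error {
	caPEM, err := os.ReadFile(caPath)
	if err != nil {
		return fmt.Errorf("reading CA cert: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return fmt.Errorf("no valid CA certificates found in %s", caPath)
	}

	certPEM, err := os.ReadFile(certPath)
	if err != nil {
		return fmt.Errorf("reading cert: %w", err)
	}
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return fmt.Errorf("no PEM block found in %s", certPath)
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return fmt.Errorf("parsing cert: %w", err)
	}

	_, err = cert.Verify(x509.VerifyOptions{Roots: pool})
	return err
}

func main() {
	// Hypothetical paths; in the repro above these would correspond to the
	// files referenced by the KUMA_*_TLS_* variables and the trusted CA.
	if err := verifyAgainstCA("cert.pem", "ca.pem"); err != nil {
		fmt.Fprintf(os.Stderr, "invalid TLS configuration, exiting: %v\n", err)
		os.Exit(1)
	}
}
```

The same check would catch both a missing CA file and a certificate issued by a different CA, which are the two situations that produce the handshake errors shown above.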
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 28 (20 by maintainers)
Hi, we just changed our installation to use self-signed certs as suggested by @jakubdyszkiewicz, and it seems way more reliable now.
I was actually aware of the “unhealthy” relationship between ArgoCD, Helm and Kubernetes secrets, but for whatever reason I always believed that the Kuma CPs were responsible for generating the certs.
Thanks a lot for the investigation! From my perspective the issue can be marked as resolved 👍
Hi @tibuntu, we have a team offsite, so we’re a bit slow with these things. I’ll bubble up this issue today.