istio: 1.6 istio-proxy SDS not updating certificates SSLV3_ALERT_CERTIFICATE_EXPIRED
Bug description
In our 1.6
qualifying environment we have an application that is rejecting requests with upstream connect error or disconnect/reset before headers. reset reason: connection failure
.
We have three (almost) identical test apps:
- istio-test-app-1
- istio-test-app-2
- istio-test-app-3 (this is the broken one)
We continually send load that goes 1 -> 2 -> 3.
We rolling restart istio-test-app-1
every 10 minutes to create config churn

The pod is 25 hours old, however as you can see here something happened at 14:35 which broke istio-test-app-3
(1 and 2 remain fine).
The istio proxy logs contain lots of sds noise:
{"level":"info","time":"2020-10-17T14:39:40.757245Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:39:40.757366Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:39:40.757625Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:39:40.757772Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:39:40.875744Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:39:40.875865Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:39:40.876142Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:39:40.876315Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:44:40.680854Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:44:40.680939Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:44:40.681104Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-44"
2020-10-17T14:44:40.681167Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-45"
{"level":"info","time":"2020-10-17T14:44:40.989664Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:44:40.989763Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:44:40.989996Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:44:40.990130Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:44:41.101797Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:44:41.101917Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:44:41.102142Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:44:41.102384Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:49:40.680826Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:49:40.680922Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:49:40.681068Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-46"
2020-10-17T14:49:40.681109Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-47"
{"level":"info","time":"2020-10-17T14:49:40.853145Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:49:40.853301Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:49:40.853587Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:49:40.853768Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:49:40.917526Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:49:40.917640Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:49:40.917876Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:49:40.918032Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:54:40.680797Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:54:40.680866Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:54:40.680941Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-48"
2020-10-17T14:54:40.681040Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-49"
{"level":"info","time":"2020-10-17T14:54:40.758052Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:54:40.758157Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:54:40.758426Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:54:40.758584Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:54:41.127609Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:54:41.127721Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:54:41.127964Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:54:41.128115Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:59:40.680900Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T14:59:40.680941Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T14:59:40.681091Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-50"
2020-10-17T14:59:40.681147Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-51"
{"level":"info","time":"2020-10-17T14:59:40.795179Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:59:40.795280Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:59:40.795539Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:59:40.795689Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T14:59:41.114830Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T14:59:41.114952Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T14:59:41.115164Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T14:59:41.115318Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:04:40.680881Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T15:04:40.680970Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T15:04:40.681071Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-52"
2020-10-17T15:04:40.681126Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-53"
{"level":"info","time":"2020-10-17T15:04:40.980455Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:04:40.980561Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:04:40.980783Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:04:40.980906Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:04:41.180954Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:04:41.181074Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:04:41.181312Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:04:41.181612Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:09:40.680961Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
2020-10-17T15:09:40.681152Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-54"
2020-10-17T15:09:40.681253Z warning envoy config [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:92] StreamSecrets gRPC config stream closed: 2, resource:default Close connection to proxy "sidecar~10.206.4.33~istio-test-app-3-69d695bf8b-b256m.istio-test-app-3~istio-test-app-3.svc.cluster.local-55"
{"level":"info","time":"2020-10-17T15:09:40.681058Z","scope":"sds","msg":"resource:default connection is terminated: rpc error: code = Canceled desc = context canceled"}
{"level":"info","time":"2020-10-17T15:09:40.842526Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:09:40.842666Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:09:40.842911Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:09:40.843082Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
{"level":"info","time":"2020-10-17T15:09:40.893759Z","scope":"sds","msg":"resource:default new connection"}
{"level":"info","time":"2020-10-17T15:09:40.893894Z","scope":"sds","msg":"Skipping waiting for ingress gateway secret"}
{"level":"info","time":"2020-10-17T15:09:40.894106Z","scope":"cache","msg":"GenerateSecret from file default"}
{"level":"info","time":"2020-10-17T15:09:40.894256Z","scope":"sds","msg":"resource:default pushed key/cert pair to proxy"}
The only difference that istio-test-app-3
has vs 1 and 2 is that it uses:
- name: OUTPUT_CERTS
value: /etc/certs
As per this documentation.
To write certs to this volumeMount
:
- emptyDir:
medium: Memory
name: istio-certs
In order to write the certs to disk (as I was experimenting with for prometheus)
Looking at the logs I see:
2020-10-17T15:15:40.611175Z debug envoy conn_handler [external/envoy/source/server/connection_handler_impl.cc:411] [C59921] new connection
2020-10-17T15:15:40.612284Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59921] handshake expecting read
2020-10-17T15:15:40.612317Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59921] handshake expecting read
2020-10-17T15:15:40.612704Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:198] [C59921] handshake error: 1
2020-10-17T15:15:40.612717Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:226] [C59921] TLS error: 268436501:SSL routines:OPENSSL_internal:SSLV3_ALERT_CERTIFICATE_EXPIRED
2020-10-17T15:15:40.612723Z debug envoy connection [external/envoy/source/common/network/connection_impl.cc:200] [C59921] closing socket: 0
2020-10-17T15:15:40.612752Z debug envoy conn_handler [external/envoy/source/server/connection_handler_impl.cc:111] [C59921] adding to cleanup list
2020-10-17T15:15:40.679479Z debug envoy filter [external/envoy/source/extensions/filters/listener/original_dst/original_dst.cc:18] original_dst: New connection accepted
2020-10-17T15:15:40.679512Z debug envoy filter [external/envoy/source/extensions/filters/listener/tls_inspector/tls_inspector.cc:78] tls inspector: new connection accepted
2020-10-17T15:15:40.679541Z debug envoy filter [external/envoy/source/extensions/filters/listener/tls_inspector/tls_inspector.cc:148] tls:onServerName(), requestedServerName: outbound_.80_._.app.istio-test-app-3.svc.cluster.local
2020-10-17T15:15:40.679607Z debug envoy conn_handler [external/envoy/source/server/connection_handler_impl.cc:411] [C59922] new connection
2020-10-17T15:15:40.680782Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59922] handshake expecting read
2020-10-17T15:15:40.680811Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:191] [C59922] handshake expecting read
2020-10-17T15:15:40.681162Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:198] [C59922] handshake error: 1
2020-10-17T15:15:40.681181Z debug envoy connection [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:226] [C59922] TLS error: 268436501:SSL routines:OPENSSL_internal:SSLV3_ALERT_CERTIFICATE_EXPIRED
Which makes me check:
❯ ./src/1.6.12/bin/istioctl pc secret istio-test-app-3-69d695bf8b-b256m.istio-test-app-3
RESOURCE NAME TYPE STATUS VALID CERT SERIAL NUMBER NOT AFTER NOT BEFORE
default Cert Chain ACTIVE true 331844299171782252960640586806792969930 2020-10-17T13:34:41Z 2020-10-16T13:34:41Z
default Cert Chain ACTIVE true 331844299171782252960640586806792969930 2020-10-17T13:34:41Z 2020-10-16T13:34:41Z
ROOTCA CA ACTIVE true 2623683368870835277565802163895395946 2030-10-07T07:55:53Z 2020-10-09T07:55:53Z
Which shows the certificates are indeed out of date - and expired exactly when we started seeing issues.
This made me check another app where we mount certificates in this way (prometheus) and sure enough I see its certs are out of date too:
❯ ./src/1.6.12/bin/istioctl pc secret prometheus-0.app-metrics
RESOURCE NAME TYPE STATUS VALID CERT SERIAL NUMBER NOT AFTER NOT BEFORE
default Cert Chain ACTIVE true 88593072017091452019065727082748990393 2020-10-16T18:43:59Z 2020-10-15T18:43:59Z
default Cert Chain ACTIVE true 88593072017091452019065727082748990393 2020-10-16T18:43:59Z 2020-10-15T18:43:59Z
ROOTCA CA ACTIVE true 2623683368870835277565802163895395946 2030-10-07T07:55:53Z 2020-10-09T07:55:53Z
Restarting the pod renewed the certificates:
❯ ./src/1.6.12/bin/istioctl pc secret istio-test-app-3-69d695bf8b-87x6j.istio-test-app-3
RESOURCE NAME TYPE STATUS VALID CERT SERIAL NUMBER NOT AFTER NOT BEFORE
default Cert Chain ACTIVE true 168785524709398884220511848353340100457 2020-10-18T15:21:26Z 2020-10-17T15:21:26Z
default Cert Chain ACTIVE true 168785524709398884220511848353340100457 2020-10-18T15:21:26Z 2020-10-17T15:21:26Z
ROOTCA CA ACTIVE true 2623683368870835277565802163895395946 2030-10-07T07:55:53Z 2020-10-09T07:55:53Z
[ ] Docs [ ] Installation [ ] Networking [ ] Performance and Scalability [ ] Extensions and Telemetry [x] Security [ ] Test and Release [x] User Experience [ ] Developer Infrastructure
Expected behavior
SDS certs to be updated correctly when using OUTPUT_CERTS
Steps to reproduce the bug No idea
Version (include the output of istioctl version --remote
and kubectl version --short
and helm version
if you used Helm)
1.6.12
How was Istio installed? Helm
Environment where bug was observed (cloud vendor, OS, etc) GKE
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 15 (15 by maintainers)
Definitely not! If we can’t do sequential version updates (1.5 -> 1.6 -> 1.7) which is the approved upgrade route maintaining backwards compatibility, I have no confidence in an untrodden and untested path of 1.5 -> 1.7 working.
We decided to stay at least one minor version behind because of the painful track record of issues and breaking changes with new releases. After the nightmare experience of getting onto 1.5 (telemetry v2 etc), we decided to leave it a little longer for 1.6 to stabilise.
I really believe Istio are going to alienate any serious users with this overly rapid release process and T-1 support policy. There needs to be a focus on stability where you are right now and ensuring people are actually able to upgrade, not releasing more enhancements. 12 patch versions into 1.6 and I wouldn’t describe it as anything other than broken and for us, actually unusable:
Two weeks of testing it and we we’ve not been able to get it past our most basic qualifying environments:
I’ve always been a massive advocate of Istio. But I’m finding it harder and harder to continue to recommend it.
1.5 -> 1.6 has easily been the worst experience yet.
@Stono you can do it with:
I think we need to update some of the mesh config docs for this annotation though, it was new