istio: Pilot ADS not coping and failing to push

Describe the bug

When many pods restart at once (think of a development cluster), Istio’s Pilot seems unable to keep up with all the changes, and it sometimes fails to push the new configuration to new pods.

Expected behavior

Istio’s Pilot should always push the current configuration to any new pod that starts up.

Version

  • Kubernetes: 1.11.4
  • Istio: 1.0.4

Some logs:

2018-12-07T08:55:06.300944Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.15~test-webapp-bcd997dfb-m6g75.performance~performance.svc.cluster.local-269
2018-12-07T08:55:06.300945Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.187~mpc-wipo-ms-incoming-ftp-578fc5476b-8rzkn.intcorrective-cb~intcorrective-cb.svc.cluster.local-470
2018-12-07T08:55:11.301203Z     warn    ads     Repeated failure to push 41 sidecar~174.16.61.58~kibanatools-payment-data-pipeline-sink-elasticsearch-5b7b4hvm6p.preprod-cb~preprod-cb.svc.cluster.local-66
2018-12-07T08:55:11.301208Z     warn    ads     Repeated failure to push 41 sidecar~174.16.212.245~absolutegrounds-helper-processors-5c4d96f4f6-9ghcw.intcorrective-cb~intcorrective-cb.svc.cluster.local-272
2018-12-07T08:55:16.301500Z     warn    ads     Repeated failure to push 41 sidecar~174.16.61.48~ms-litigation-74b4bb8995-z556q.intcorrective-cb~intcorrective-cb.svc.cluster.local-151
2018-12-07T08:55:16.301643Z     warn    ads     Repeated failure to push 41 sidecar~174.16.74.71~parallelmarks-server-96f79f9-lm4h2.test-ds~test-ds.svc.cluster.local-137
2018-12-07T08:55:21.301748Z     warn    ads     Repeated failure to push 41 sidecar~174.16.61.60~classification-helper-sinks-79f95c5c5c-l4l6p.intcorrective-cb~intcorrective-cb.svc.cluster.local-202
2018-12-07T08:55:21.301801Z     warn    ads     Repeated failure to push 41 sidecar~174.16.15.140~kibanatools-webreports-frontend-64f8957886-zsjc9.preprod-cb~preprod-cb.svc.cluster.local-232
2018-12-07T08:55:26.301984Z     warn    ads     Repeated failure to push 41 sidecar~174.16.142.215~ms-appeal-85d649db59-228p5.preprod-cb~preprod-cb.svc.cluster.local-585
2018-12-07T08:55:26.301985Z     warn    ads     Repeated failure to push 41 sidecar~174.16.60.225~ms-register-6d8c87d86c-sszcn.intcorrective-cb~intcorrective-cb.svc.cluster.local-213
2018-12-07T08:55:31.302289Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.30~opposition-helper-ui-6649487dd8-jbhbn.intadaptive-cb~intadaptive-cb.svc.cluster.local-260
2018-12-07T08:55:31.302349Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.42~ms-design-77b78b8764-5b79n.test-cb~test-cb.svc.cluster.local-440
2018-12-07T08:55:36.302505Z     warn    ads     Repeated failure to push 41 sidecar~174.16.138.151~ms-register-pipeline-67b5489849-zgbfd.intadaptive-cb~intadaptive-cb.svc.cluster.local-542
2018-12-07T08:55:36.302505Z     warn    ads     Repeated failure to push 41 sidecar~174.16.142.215~ms-appeal-85d649db59-228p5.preprod-cb~preprod-cb.svc.cluster.local-585
2018-12-07T08:55:41.302759Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.42~ms-design-77b78b8764-5b79n.test-cb~test-cb.svc.cluster.local-440
2018-12-07T08:55:41.302878Z     warn    ads     Repeated failure to push 41 sidecar~174.16.61.48~ms-litigation-74b4bb8995-z556q.intcorrective-cb~intcorrective-cb.svc.cluster.local-151
2018-12-07T08:55:46.303016Z     warn    ads     Repeated failure to push 41 sidecar~174.16.138.151~ms-register-pipeline-67b5489849-zgbfd.intadaptive-cb~intadaptive-cb.svc.cluster.local-542
2018-12-07T08:55:46.305304Z     warn    ads     Repeated failure to push 41 sidecar~174.16.61.60~classification-helper-sinks-79f95c5c5c-l4l6p.intcorrective-cb~intcorrective-cb.svc.cluster.local-202
2018-12-07T08:55:51.303260Z     warn    ads     Repeated failure to push 41 sidecar~174.16.93.180~ms-register-pipeline-67b5489849-74w9d.intadaptive-cb~intadaptive-cb.svc.cluster.local-596
2018-12-07T08:55:51.305462Z     warn    ads     Repeated failure to push 41 sidecar~174.16.60.225~ms-register-6d8c87d86c-sszcn.intcorrective-cb~intcorrective-cb.svc.cluster.local-213
2018-12-07T08:55:56.303468Z     warn    ads     Repeated failure to push 41 sidecar~174.16.35.55~ms-invalidity-6dcc99fd5d-dgztr.intadaptive-cb~intadaptive-cb.svc.cluster.local-192
2018-12-07T08:55:56.305622Z     warn    ads     Repeated failure to push 41 sidecar~174.16.213.30~opposition-helper-ui-6649487dd8-jbhbn.intadaptive-cb~intadaptive-cb.svc.cluster.local-260
2018-12-07T08:56:01.303734Z     warn    ads     Repeated failure to push 41 sidecar~174.16.81.38~cancellation-helper-processors-fd977dc5-wddkc.test-cb~test-cb.svc.cluster.local-592
2018-12-07T08:56:01.305871Z     warn    ads     Repeated failure to push 41 sidecar~174.16.35.55~ms-invalidity-6dcc99fd5d-dgztr.intadaptive-cb~intadaptive-cb.svc.cluster.local-192
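To see which proxies and namespaces are affected most, warnings like the ones above can be tallied with a short script. This is only a rough sketch and not part of Istio: it assumes the log line shape shown in this excerpt, and the names `tally_push_failures` and `namespace_of` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   <timestamp> warn ads Repeated failure to push <n> <proxy-id>
PUSH_FAILURE = re.compile(
    r"^(?P<ts>\S+)\s+warn\s+ads\s+Repeated failure to push\s+"
    r"(?P<count>\d+)\s+(?P<proxy>\S+)$"
)


def tally_push_failures(lines):
    """Count 'Repeated failure to push' warnings per proxy ID."""
    failures = Counter()
    for line in lines:
        match = PUSH_FAILURE.match(line.strip())
        if match:
            failures[match.group("proxy")] += 1
    return failures


def namespace_of(proxy_id):
    """Return the namespace from an ID shaped like type~ip~pod.namespace~domain."""
    parts = proxy_id.split("~")
    if len(parts) < 3 or "." not in parts[2]:
        return None
    # The namespace is whatever follows the last dot of the pod.namespace field.
    return parts[2].rsplit(".", 1)[1]


if __name__ == "__main__":
    sample = [
        "2018-12-07T08:55:26.301984Z     warn    ads     Repeated failure to push 41 "
        "sidecar~174.16.142.215~ms-appeal-85d649db59-228p5.preprod-cb~preprod-cb.svc.cluster.local-585",
        "2018-12-07T08:55:36.302505Z     warn    ads     Repeated failure to push 41 "
        "sidecar~174.16.142.215~ms-appeal-85d649db59-228p5.preprod-cb~preprod-cb.svc.cluster.local-585",
    ]
    for proxy, hits in tally_push_failures(sample).most_common():
        print(hits, namespace_of(proxy), proxy)
```

Feeding the full Pilot log through `tally_push_failures` shows whether the failures are concentrated on a few pods (for example ones that have already been deleted) or spread across the whole mesh.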

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 27 (13 by maintainers)

Most upvoted comments

We hit a similar issue in 1.0.5 after enabling Istio in a namespace with 140 pods; the cluster has 180 services in total. The difference from the previous comments is that right after the first “ads Repeated failure to push” log message, we started to see Pilot’s memory utilization increase.

We’re also running a single Pilot at the moment. I’m planning to set resource limits on Pilot and increase the number of replicas, but the comment above concerns me. I know there were many improvements in newer releases, but I wonder whether this issue was addressed.