istio: istio-validation initContainer reports an error when istio-cni is not installed correctly
Bug description When enabling istio-cni-repair on Istio 1.4.5, the istio-validation init container fails. Relevant logs:
in new validator: <pod_ip>
Listening on 127.0.0.1:15001
Listening on 127.0.0.1:15006
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
A brief examination of the code suggests that the validator never starts a listener on the IptablesProbePort, unless I am mistaken (https://github.com/istio/istio/blob/1.4.5/tools/istio-iptables/pkg/validation/validator.go#L117 and https://github.com/istio/istio/blob/1.4.5/tools/istio-iptables/pkg/validation/validator.go#L166-L178).
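If it helps, one way to double-check this from the node is to look at which ports actually have listeners inside the failing pod's network namespace. This is only a sketch and assumes a Docker runtime; the pod-name lookup is a placeholder.

```sh
# As root on the node hosting the pod: find the pod's pause container PID,
# then list listening TCP sockets inside its network namespace.
PID=$(docker inspect --format '{{.State.Pid}}' \
      "$(docker ps -qf "name=k8s_POD_<pod-name>")")   # <pod-name> is a placeholder
nsenter -t "$PID" -n ss -lntp
# From the logs above I would expect listeners on 15001 and 15006, but none on 15002.
```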
Expected behavior istio-validation validates the pod's CNI setup and exits successfully.
Steps to reproduce the bug Install Istio 1.4.5 with CNI and repair enabled. Attempt to launch a pod.
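For reference, the install was roughly along these lines (chart paths are placeholders and the value keys are from memory for the 1.4 Helm charts, so treat this as a sketch rather than exact commands):

```sh
# Render the istio-cni chart with the repair controller enabled, then Istio
# itself with CNI enabled. Chart paths are placeholders; value keys may differ.
helm template <path-to-istio-cni-chart> --name istio-cni --namespace kube-system \
  --set repair.enabled=true | kubectl apply -f -

helm template <path-to-istio-chart> --name istio --namespace istio-system \
  --set istio_cni.enabled=true | kubectl apply -f -
```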
Version (include the output of istioctl version --remote, kubectl version, and helm version if you used Helm)
istio: 1.4.5
kube: 1.15.7
How was Istio installed? helm template
Environment where bug was observed (cloud vendor, OS, etc) AWS - kops
About this issue
- State: closed
- Created 4 years ago
- Comments: 38 (23 by maintainers)
@tmshort Thank you for the comment. The repair and validation are heuristic: they don't understand why istio-cni did not inject the iptables rules. The next step is to confirm the iptables rules themselves.
Anyhow, you can access the k8s node and run iptables-save in the network namespace of the failing pod.
On the k8s node, run the commands below as root.
Let's start with whether the iptables-save output contains any rule with the name "ISTIO".
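Something along these lines should do it (a sketch assuming a Docker runtime; the pod-name lookup is a placeholder):

```sh
# As root on the node hosting the failing pod: find the pause container's PID,
# then dump iptables from inside the pod's network namespace and grep for ISTIO chains.
PID=$(docker inspect --format '{{.State.Pid}}' \
      "$(docker ps -qf "name=k8s_POD_<pod-name>")")   # <pod-name> is a placeholder
nsenter -t "$PID" -n iptables-save | grep ISTIO
```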
@zs-ddl interesting. Then it loops back to #14977
@towens Are there any helpful hints in the describe output for the DaemonSet? For us, we're seeing the following:
Per the K8s discussion on system-cluster-critical, it looks like we'll have to wait until GKE supports K8s 1.17 before the priority class can be used in a namespace other than kube-system.
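For anyone checking the same thing, a couple of commands that show which priority class the CNI DaemonSet requests and its recent events (the DaemonSet name and namespace are assumptions based on the default chart; adjust for your install):

```sh
# Which priority class does the DaemonSet ask for?
kubectl -n kube-system get daemonset istio-cni-node \
  -o jsonpath='{.spec.template.spec.priorityClassName}{"\n"}'
# The Events section at the bottom of describe usually shows scheduling/admission complaints.
kubectl -n kube-system describe daemonset istio-cni-node
```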