https://jenkins.cilium.io/job/Ginkgo-CI-Tests-Pipeline/2299/testReport/junit/k8s-1/10/K8sValidatedChaosTest_Endpoint_can_still_connect_while_Cilium_is_not_running/
Stacktrace
/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:312
Cannot curl from "testclient-vrs9w" to testds-service
Expected
<bool>: false
to be true
/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/k8sT/Chaos.go:126
Standard Output
time="2018-05-02T06:39:28Z" level=info msg=Starting testName=K8sChaosTest
time="2018-05-02T06:39:28Z" level=info msg="Vagrant: running command 'vagrant ssh-config k8s1-1.10'"
time="2018-05-02T06:39:29Z" level=debug msg="generated SSHConfig for node k8s1-1.10"
time="2018-05-02T06:39:29Z" level=debug msg="done importing ssh config"
time="2018-05-02T06:39:29Z" level=debug msg="running command: ls /tmp/"
time="2018-05-02T06:39:30Z" level=debug msg="running command: cat ${HOME}/.kube/config"
time="2018-05-02T06:39:30Z" level=debug msg="applying /home/vagrant/go/src/github.com/cilium/cilium/test/k8sT/manifests/cilium_ds.yaml" testName=K8sChaosTest
time="2018-05-02T06:39:30Z" level=debug msg="running command: kubectl apply -f /home/vagrant/go/src/github.com/cilium/cilium/test/k8sT/manifests/cilium_ds.yaml"
time="2018-05-02T06:39:35Z" level=debug msg="running command: kubectl -n kube-system get pods -l k8s-app=cilium -o json"
time="2018-05-02T06:39:46Z" level=debug msg="running command: kubectl -n kube-system get pods -l k8s-app=kube-dns -o json"
time="2018-05-02T06:39:46Z" level=debug msg="applying /home/vagrant/go/src/github.com/cilium/cilium/test/k8sT/manifests/demo_ds.yaml" testName=K8sChaosTest
time="2018-05-02T06:39:46Z" level=debug msg="running command: kubectl apply -f /home/vagrant/go/src/github.com/cilium/cilium/test/k8sT/manifests/demo_ds.yaml"
time="2018-05-02T06:39:51Z" level=debug msg="running command: kubectl -n default get pods -l zgroup=testDS -o json"
time="2018-05-02T06:39:52Z" level=info msg="WaitforPods: pods are not ready" data="false false" filter="-l zgroup=testDS" namespace=default testName=K8sChaosTest
time="2018-05-02T06:39:56Z" level=debug msg="running command: kubectl -n default get pods -l zgroup=testDS -o json"
time="2018-05-02T06:40:02Z" level=debug msg="running command: kubectl -n default get pods -l zgroup=testDSClient -o json"
time="2018-05-02T06:40:02Z" level=debug msg="running command: kubectl -n default get pods -l zgroup=testDS -o jsonpath='{range .items[*]}{@.metadata.name}{\"=\"}{@.status.podIP}{\"\\n\"}{end}'"
time="2018-05-02T06:40:02Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- ping -W 2 -c 5 10.10.1.95"
time="2018-05-02T06:40:07Z" level=debug msg="Pod testclient-qdkdl ping 10.10.1.95"
time="2018-05-02T06:40:07Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:40:12Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:40:12Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
time="2018-05-02T06:40:12Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- ping -W 2 -c 5 10.10.0.158"
time="2018-05-02T06:40:17Z" level=debug msg="Pod testclient-qdkdl ping 10.10.0.158"
time="2018-05-02T06:40:17Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:40:22Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:40:22Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
time="2018-05-02T06:40:23Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- ping -W 2 -c 5 10.10.1.95"
time="2018-05-02T06:40:27Z" level=debug msg="Pod testclient-vrs9w ping 10.10.1.95"
time="2018-05-02T06:40:27Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:40:33Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:40:33Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
time="2018-05-02T06:40:33Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- ping -W 2 -c 5 10.10.0.158"
time="2018-05-02T06:40:38Z" level=debug msg="Pod testclient-vrs9w ping 10.10.0.158"
time="2018-05-02T06:40:38Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:40:43Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:40:43Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
�[1mSTEP�[0m: Deleting cilium pods
time="2018-05-02T06:40:43Z" level=debug msg="running command: kubectl -n kube-system delete pods -l k8s-app=cilium"
time="2018-05-02T06:40:49Z" level=debug msg="running command: kubectl -n kube-system get pods -l k8s-app=cilium -o json"
time="2018-05-02T06:40:49Z" level=info msg="WaitforPods: pods are not ready" data="false true" filter="-l k8s-app=cilium" namespace=kube-system testName=K8sChaosTest
time="2018-05-02T06:40:54Z" level=debug msg="running command: kubectl -n kube-system get pods -l k8s-app=cilium -o json"
time="2018-05-02T06:40:54Z" level=info msg="WaitforPods: pods are not ready" data="false true" filter="-l k8s-app=cilium" namespace=kube-system testName=K8sChaosTest
time="2018-05-02T06:40:59Z" level=debug msg="running command: kubectl -n kube-system get pods -l k8s-app=cilium -o json"
time="2018-05-02T06:40:59Z" level=debug msg="running command: kubectl -n default get pods -l zgroup=testDS -o jsonpath='{range .items[*]}{@.metadata.name}{\"=\"}{@.status.podIP}{\"\\n\"}{end}'"
time="2018-05-02T06:40:59Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- ping -W 2 -c 5 10.10.1.95"
time="2018-05-02T06:41:04Z" level=debug msg="Pod testclient-qdkdl ping 10.10.1.95"
time="2018-05-02T06:41:04Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:41:10Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:41:10Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
time="2018-05-02T06:41:10Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- ping -W 2 -c 5 10.10.0.158"
time="2018-05-02T06:41:15Z" level=debug msg="Pod testclient-qdkdl ping 10.10.0.158"
time="2018-05-02T06:41:15Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:41:20Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:41:20Z" level=debug msg="running command: kubectl exec -n default testclient-qdkdl -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
time="2018-05-02T06:41:20Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- ping -W 2 -c 5 10.10.1.95"
time="2018-05-02T06:41:25Z" level=debug msg="Pod testclient-vrs9w ping 10.10.1.95"
time="2018-05-02T06:41:25Z" level=debug msg="running command: kubectl -n kube-system get service kube-dns -o json"
time="2018-05-02T06:41:30Z" level=debug msg="running command: dig +short testds-service.default.svc.cluster.local @10.96.0.10 | grep -v -e '^$'"
time="2018-05-02T06:41:30Z" level=debug msg="running command: kubectl exec -n default testclient-vrs9w -- curl -s --fail --connect-timeout 5 --max-time 5 http://testds-service.default.svc.cluster.local:80/"
I searched the logs and ignored the copious amounts of policy computation logs, as I view them as extraneous here:
The endpoint starts with the reserved identity init, as expected:
Policy enforcement is enabled for the endpoint because it is in init mode still (per the code block in the previous comment above):
The endpoint then gets a new identity because we received its labels from K8s:
We then skip regeneration of policy for some reason - I’m not sure where these “waiting for proxy updates to complete…” logs come into play here:
The key thing here is that we are ‘skipping policy recalculation’ after getting this new identity, it seems:
We skip policy recalculation even though we have changed identities, and policy enforcement should be disabled for ingress / egress:
https://github.com/cilium/cilium/blob/18ef7f2b8fa5a7017f5027e88a5274b96e6786cf/pkg/endpoint/policy.go#L492-L504
And ingress / egress policy enforcement are updated later in the same function than the above block of code (indicating that the code is never hit, and as such the endpoint is not regenerated correctly):
https://github.com/cilium/cilium/blob/18ef7f2b8fa5a7017f5027e88a5274b96e6786cf/pkg/endpoint/policy.go#L549-L570
This is a release-blocker, IMO.