cilium: CI failure: KafkaPolicies failed due to kafka-broker going into CrashLoopBackOff because service names aren't plumbed down to kube-dns for 5 minutes in K8s 1.7 environments

Build link: https://jenkins.cilium.io/job/Ginkgo-CI-Tests-Pipeline/3446/testReport/junit/k8s-1/7/K8sKafkaPolicyTest_Kafka_Policy_Tests_KafkaPolicies/

Output:

Stacktrace

/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:333
Kafka Pods are not ready after timeout
Expected
    <*errors.errorString | 0xc425a8b630>: {
        s: "Timeout reached: timed out waiting for pods with filter -l zgroup=kafkaTestApp to be ready",
    }
to be nil
/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/k8sT/KafkaPolicies.go:133
Standard Output

Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Cilium pods: [cilium-cx4bm cilium-sc5qb]
Netpols loaded: 
CiliumNetworkPolicies loaded: 
Endpoint Policy Enforcement:
     empire-outpost-9999-4143713519-q2sv1   =>   none
     kafka-broker-2873197211-qh6kt          =>   none
     prometheus-core-3506541099-jwhst       =>   none
     empire-backup-2986355077-7lh6d         =>   none
     empire-hq-2917507631-n4d1c             =>   none
     empire-outpost-8888-3627733903-8xdrk   =>   none
     zookeeper-1032194181-mt667             =>   none
     cilium-health-k8s2                     =>   none
     kube-dns-2552250787-1xc0z              =>   none
Cilium agent "cilium-cx4bm": Status: Ok  Health: Ok Nodes "k8s1 k8s2" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 50 Failed 0
Cilium agent "cilium-sc5qb": Status: Ok  Health: Ok Nodes "k8s1 k8s2" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 26 Failed 0

Standard Error

===================== TEST FAILED =====================
cmd: kubectl get pods -o wide --all-namespaces
NAMESPACE     NAME                                   READY     STATUS             RESTARTS   AGE       IP              NODE
default       empire-backup-2986355077-7lh6d         1/1       Running            0          5m        10.10.1.190     k8s2
default       empire-hq-2917507631-n4d1c             1/1       Running            0          5m        10.10.0.251     k8s1
default       empire-outpost-8888-3627733903-8xdrk   1/1       Running            0          5m        10.10.1.192     k8s2
default       empire-outpost-9999-4143713519-q2sv1   1/1       Running            0          5m        10.10.1.222     k8s2
default       kafka-broker-2873197211-qh6kt          0/1       CrashLoopBackOff   5          5m        10.10.1.98      k8s2
default       zookeeper-1032194181-mt667             1/1       Running            0          5m        10.10.1.165     k8s2
kube-system   cilium-cx4bm                           1/1       Running            0          5m        192.168.36.12   k8s2
kube-system   cilium-sc5qb                           1/1       Running            0          5m        192.168.36.11   k8s1
kube-system   etcd-k8s1                              1/1       Running            0          46m       192.168.36.11   k8s1
kube-system   kube-apiserver-k8s1                    1/1       Running            0          46m       192.168.36.11   k8s1
kube-system   kube-controller-manager-k8s1           1/1       Running            0          46m       192.168.36.11   k8s1
kube-system   kube-dns-2552250787-1xc0z              3/3       Running            0          6m        10.10.1.243     k8s2
kube-system   kube-proxy-vps9k                       1/1       Running            0          47m       192.168.36.11   k8s1
kube-system   kube-proxy-vxchz                       1/1       Running            0          40m       192.168.36.12   k8s2
kube-system   kube-scheduler-k8s1                    1/1       Running            0          46m       192.168.36.11   k8s1
prometheus    prometheus-core-3506541099-jwhst       1/1       Running            0          40m       10.10.0.75      k8s1

cmd: kubectl exec -n kube-system cilium-cx4bm -- cilium service list
ID   Frontend             Backend                   
1    10.96.0.1:443        1 => 192.168.36.11:6443   
2    10.96.0.10:53        1 => 10.10.1.243:53       
3    10.101.3.243:9090    1 => 10.10.0.75:9090      
4    10.105.192.11:2181   1 => 10.10.1.165:2181     

cmd: kubectl exec -n kube-system cilium-cx4bm -- cilium endpoint list
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                        IPv6                   IPv4          STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                          
3650       Disabled           Disabled          22035      k8s:app=empire-outpost                             f00d::a0a:100:0:e42    10.10.1.222   ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                 
                                                           k8s:io.kubernetes.pod.namespace=default                                                         
                                                           k8s:outpostid=9999                                                                              
                                                           k8s:zgroup=kafkaTestApp                                                                         
6117       Disabled           Disabled          35842      k8s:app=empire-outpost                             f00d::a0a:100:0:17e5   10.10.1.192   ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                 
                                                           k8s:io.kubernetes.pod.namespace=default                                                         
                                                           k8s:outpostid=8888                                                                              
                                                           k8s:zgroup=kafkaTestApp                                                                         
6523       Disabled           Disabled          4          reserved:health                                    f00d::a0a:100:0:197b   10.10.1.247   ready   
8208       Disabled           Disabled          50075      k8s:app=kafka                                      f00d::a0a:100:0:2010   10.10.1.98    ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                 
                                                           k8s:io.kubernetes.pod.namespace=default                                                         
                                                           k8s:zgroup=kafkaTestApp                                                                         
22152      Disabled           Disabled          47341      k8s:io.cilium.k8s.policy.serviceaccount=kube-dns   f00d::a0a:100:0:5688   10.10.1.243   ready   
                                                           k8s:io.kubernetes.pod.namespace=kube-system                                                     
                                                           k8s:k8s-app=kube-dns                                                                            
32564      Disabled           Disabled          59494      k8s:app=zook                                       f00d::a0a:100:0:7f34   10.10.1.165   ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                 
                                                           k8s:io.kubernetes.pod.namespace=default                                                         
                                                           k8s:zgroup=kafkaTestApp                                                                         
45365      Disabled           Disabled          43645      k8s:app=empire-backup                              f00d::a0a:100:0:b135   10.10.1.190   ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                 
                                                           k8s:io.kubernetes.pod.namespace=default                                                         
                                                           k8s:zgroup=kafkaTestApp                                                                         

cmd: kubectl exec -n kube-system cilium-sc5qb -- cilium service list
ID   Frontend             Backend                   
1    10.96.0.1:443        1 => 192.168.36.11:6443   
2    10.96.0.10:53        1 => 10.10.1.243:53       
3    10.101.3.243:9090    1 => 10.10.0.75:9090      
4    10.105.192.11:2181   1 => 10.10.1.165:2181     

cmd: kubectl exec -n kube-system cilium-sc5qb -- cilium endpoint list
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                              IPv6                 IPv4          STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                              
8417       Disabled           Disabled          48563      k8s:app=prometheus                                       f00d::a0a:0:0:20e1   10.10.0.75    ready   
                                                           k8s:component=core                                                                                  
                                                           k8s:io.cilium.k8s.policy.serviceaccount=prometheus-k8s                                              
                                                           k8s:io.kubernetes.pod.namespace=prometheus                                                          
38477      Disabled           Disabled          4          reserved:health                                          f00d::a0a:0:0:964d   10.10.0.190   ready   
41052      Disabled           Disabled          32829      k8s:app=empire-hq                                        f00d::a0a:0:0:a05c   10.10.0.251   ready   
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                     
                                                           k8s:io.kubernetes.pod.namespace=default                                                             
                                                           k8s:zgroup=kafkaTestApp    

Logs: 4bc18599_K8sKafkaPolicyTest_Kafka_Policy_Tests_KafkaPolicies.zip

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 23 (23 by maintainers)

Most upvoted comments

I’ve just noticed that this usually happens when the Cilium Upgrade test runs just before the Kafka test. Maybe some update is being missed?

Build IDs: 1335 and 1336
https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/1335/execution/node/97/log/?consoleFull
https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/1336/execution/node/97/log/?consoleFull

Seed: 1532949306