cilium: CI failure: KafkaPolicies failed due to kafka-broker going into CrashLoopBackOff because service names aren't plumbed down to kube-dns for 5 minutes in K8s 1.7 environments
Output:
Stacktrace
/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:333
Kafka Pods are not ready after timeout
Expected
<*errors.errorString | 0xc425a8b630>: {
s: "Timeout reached: timed out waiting for pods with filter -l zgroup=kafkaTestApp to be ready",
}
to be nil
/home/jenkins/workspace/Ginkgo-CI-Tests-Pipeline/src/github.com/cilium/cilium/test/k8sT/KafkaPolicies.go:133
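For context, the assertion that fails here is a readiness poll on the pods matching `-l zgroup=kafkaTestApp`. A minimal standalone sketch of that kind of wait is below (hypothetical code, shelling out to kubectl rather than using the actual test helpers; the selector and error text come from the failure above, and the 5-minute deadline is an assumption matching the 5m pod ages shown later):

```go
// waitforpods.go: poll "kubectl get pods" until every pod matching a label
// selector reports Ready, or a deadline passes. This is a hypothetical
// reproduction of the readiness wait that times out in KafkaPolicies.go.
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// podsReady returns true when at least one pod matches the selector and all
// matching pods report a Ready condition of "True".
func podsReady(selector string) (bool, error) {
	out, err := exec.Command("kubectl", "get", "pods", "-l", selector,
		"-o", `jsonpath={range .items[*]}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}`).Output()
	if err != nil {
		return false, err
	}
	statuses := strings.Fields(string(out))
	if len(statuses) == 0 {
		return false, nil // nothing scheduled yet
	}
	for _, s := range statuses {
		if s != "True" {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	const selector = "zgroup=kafkaTestApp"
	deadline := time.Now().Add(5 * time.Minute) // assumed timeout
	for time.Now().Before(deadline) {
		if ready, err := podsReady(selector); err == nil && ready {
			fmt.Println("all pods ready")
			return
		}
		time.Sleep(5 * time.Second)
	}
	fmt.Printf("Timeout reached: timed out waiting for pods with filter -l %s to be ready\n", selector)
}
```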
Standard Output
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 0
Number of "level=warning" in logs: 0
Number of "Cilium API handler panicked" in logs: 0
Cilium pods: [cilium-cx4bm cilium-sc5qb]
Netpols loaded:
CiliumNetworkPolicies loaded:
Endpoint Policy Enforcement:
empire-outpost-9999-4143713519-q2sv1 => none
kafka-broker-2873197211-qh6kt => none
prometheus-core-3506541099-jwhst => none
empire-backup-2986355077-7lh6d => none
empire-hq-2917507631-n4d1c => none
empire-outpost-8888-3627733903-8xdrk => none
zookeeper-1032194181-mt667 => none
cilium-health-k8s2 => none
kube-dns-2552250787-1xc0z => none
Cilium agent "cilium-cx4bm": Status: Ok Health: Ok Nodes "k8s1 k8s2" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 50 Failed 0
Cilium agent "cilium-sc5qb": Status: Ok Health: Ok Nodes "k8s1 k8s2" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 26 Failed 0
Standard Error
===================== TEST FAILED =====================
cmd: kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default empire-backup-2986355077-7lh6d 1/1 Running 0 5m 10.10.1.190 k8s2
default empire-hq-2917507631-n4d1c 1/1 Running 0 5m 10.10.0.251 k8s1
default empire-outpost-8888-3627733903-8xdrk 1/1 Running 0 5m 10.10.1.192 k8s2
default empire-outpost-9999-4143713519-q2sv1 1/1 Running 0 5m 10.10.1.222 k8s2
default kafka-broker-2873197211-qh6kt 0/1 CrashLoopBackOff 5 5m 10.10.1.98 k8s2
default zookeeper-1032194181-mt667 1/1 Running 0 5m 10.10.1.165 k8s2
kube-system cilium-cx4bm 1/1 Running 0 5m 192.168.36.12 k8s2
kube-system cilium-sc5qb 1/1 Running 0 5m 192.168.36.11 k8s1
kube-system etcd-k8s1 1/1 Running 0 46m 192.168.36.11 k8s1
kube-system kube-apiserver-k8s1 1/1 Running 0 46m 192.168.36.11 k8s1
kube-system kube-controller-manager-k8s1 1/1 Running 0 46m 192.168.36.11 k8s1
kube-system kube-dns-2552250787-1xc0z 3/3 Running 0 6m 10.10.1.243 k8s2
kube-system kube-proxy-vps9k 1/1 Running 0 47m 192.168.36.11 k8s1
kube-system kube-proxy-vxchz 1/1 Running 0 40m 192.168.36.12 k8s2
kube-system kube-scheduler-k8s1 1/1 Running 0 46m 192.168.36.11 k8s1
prometheus prometheus-core-3506541099-jwhst 1/1 Running 0 40m 10.10.0.75 k8s1
cmd: kubectl exec -n kube-system cilium-cx4bm -- cilium service list
ID Frontend Backend
1 10.96.0.1:443 1 => 192.168.36.11:6443
2 10.96.0.10:53 1 => 10.10.1.243:53
3 10.101.3.243:9090 1 => 10.10.0.75:9090
4 10.105.192.11:2181 1 => 10.10.1.165:2181
cmd: kubectl exec -n kube-system cilium-cx4bm -- cilium endpoint list
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
3650 Disabled Disabled 22035 k8s:app=empire-outpost f00d::a0a:100:0:e42 10.10.1.222 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:outpostid=9999
k8s:zgroup=kafkaTestApp
6117 Disabled Disabled 35842 k8s:app=empire-outpost f00d::a0a:100:0:17e5 10.10.1.192 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:outpostid=8888
k8s:zgroup=kafkaTestApp
6523 Disabled Disabled 4 reserved:health f00d::a0a:100:0:197b 10.10.1.247 ready
8208 Disabled Disabled 50075 k8s:app=kafka f00d::a0a:100:0:2010 10.10.1.98 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=kafkaTestApp
22152 Disabled Disabled 47341 k8s:io.cilium.k8s.policy.serviceaccount=kube-dns f00d::a0a:100:0:5688 10.10.1.243 ready
k8s:io.kubernetes.pod.namespace=kube-system
k8s:k8s-app=kube-dns
32564 Disabled Disabled 59494 k8s:app=zook f00d::a0a:100:0:7f34 10.10.1.165 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=kafkaTestApp
45365 Disabled Disabled 43645 k8s:app=empire-backup f00d::a0a:100:0:b135 10.10.1.190 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=kafkaTestApp
cmd: kubectl exec -n kube-system cilium-sc5qb -- cilium service list
ID Frontend Backend
1 10.96.0.1:443 1 => 192.168.36.11:6443
2 10.96.0.10:53 1 => 10.10.1.243:53
3 10.101.3.243:9090 1 => 10.10.0.75:9090
4 10.105.192.11:2181 1 => 10.10.1.165:2181
cmd: kubectl exec -n kube-system cilium-sc5qb -- cilium endpoint list
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
8417 Disabled Disabled 48563 k8s:app=prometheus f00d::a0a:0:0:20e1 10.10.0.75 ready
k8s:component=core
k8s:io.cilium.k8s.policy.serviceaccount=prometheus-k8s
k8s:io.kubernetes.pod.namespace=prometheus
38477 Disabled Disabled 4 reserved:health f00d::a0a:0:0:964d 10.10.0.190 ready
41052 Disabled Disabled 32829 k8s:app=empire-hq f00d::a0a:0:0:a05c 10.10.0.251 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:zgroup=kafkaTestApp
Logs: 4bc18599_K8sKafkaPolicyTest_Kafka_Policy_Tests_KafkaPolicies.zip
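Regarding the diagnosis in the title (service names not reaching kube-dns for several minutes): this can be spot-checked from inside the cluster by resolving the zookeeper service the broker depends on. A minimal sketch, assuming a service FQDN of `zookeeper.default.svc.cluster.local` (inferred from the pod names and the :2181 frontend above, not confirmed from the manifests):

```go
// dnscheck.go: run inside a pod to see whether a service name resolves via
// kube-dns. The FQDN below is an assumption based on the zookeeper pod and
// the 2181 service frontend visible in the output above.
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	const svc = "zookeeper.default.svc.cluster.local" // assumed service FQDN
	for i := 0; i < 10; i++ {
		addrs, err := net.LookupHost(svc)
		if err != nil {
			fmt.Printf("lookup %s failed: %v\n", svc, err)
		} else {
			fmt.Printf("lookup %s -> %v\n", svc, addrs)
			return
		}
		time.Sleep(10 * time.Second)
	}
}
```

If the lookup keeps failing for roughly the first five minutes and then succeeds, that matches the behaviour described in the title, with the broker exhausting its restarts in the meantime and landing in CrashLoopBackOff.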
I’ve just noticed that this tends to happen when the Cilium upgrade runs just before the Kafka test; maybe an update is being missed?
Build IDs: 1335 and 1336:
https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/1335/execution/node/97/log/?consoleFull
https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/1336/execution/node/97/log/?consoleFull
Seed: 1532949306