cilium: Synchronous endpoint regeneration does not complete quickly enough
Build: https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/1343/testReport/junit/k8s-1/11/K8sDemosTest_Tests_Star_Wars_Demo/
Branch: Master
Issue:
/home/jenkins/workspace/cilium-ginkgo_cilium_master-QVKDPVJ2A6E272IRUX7X6DCB5FBLCQDKRTOMHPZS7ZECX3RM5C3A/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:306
Found a "RunInit: Command execution failed" in Cilium Logs
/home/jenkins/workspace/cilium-ginkgo_cilium_master-QVKDPVJ2A6E272IRUX7X6DCB5FBLCQDKRTOMHPZS7ZECX3RM5C3A/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:467
Issue message
2018-07-31T16:13:47.441326938Z level=warning msg="RunInit: Command execution failed" cmd="/var/lib/cilium/bpf/join_ep.sh /var/lib/cilium/bpf /var/run/cilium/state /var/run/cilium/state/60769_next lxc05587 true 60769" containerID=0558782bb5 endpointID=60769 error="exit status 2" ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10
2018-07-31T16:13:47.441370566Z level=warning msg="Join EP id=/var/run/cilium/state/60769_next ifname=lxc05587" subsys=endpoint
2018-07-31T16:13:47.441374661Z level=warning msg="kernel version: Linux k8s2 4.9.17-040917-generic #201703220831 SMP Wed Mar 22 12:33:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux" subsys=endpoint
2018-07-31T16:13:47.441376866Z level=warning msg="clang version: clang version 3.8.1 (tags/RELEASE_381/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang+llvm/bin" subsys=endpoint
2018-07-31T16:13:47.441378894Z level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)" subsys=endpoint
2018-07-31T16:13:47.441380783Z level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)" subsys=endpoint
2018-07-31T16:13:47.441382549Z level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)" subsys=endpoint
2018-07-31T16:13:47.441384373Z level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)" subsys=endpoint
2018-07-31T16:13:47.441386175Z level=warning msg="RTNETLINK answers: No such device" subsys=endpoint
2018-07-31T16:13:47.441387945Z level=warning msg="We have an error talking to the kernel, -1" subsys=endpoint
2018-07-31T16:13:47.441389752Z level=warning msg="Note: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!" subsys=endpoint
2018-07-31T16:13:47.441399711Z level=debug msg="BPF compilation completed" BPFCompilationTime=4.722723658s containerID=0558782bb5 endpointID=60769 error="error: \"exit status 2\" command output: \"Join EP id=/var/run/cilium/state/60769_next ifname=lxc05587\\nkernel version: Linux k8s2 4.9.17-040917-generic #201703220831 SMP Wed Mar 22 12:33:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux\\nclang version: clang version 3.8.1 (tags/RELEASE_381/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang+llvm/bin\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\nRTNETLINK answers: No such device\\nWe have an error talking to the kernel, -1\\nNote: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!\\n\"" ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10
2018-07-31T16:13:47.44144344Z level=error msg="destroying BPF maps due to errors during regeneration" containerID=0558782bb5 endpointID=60769 error="error: \"exit status 2\" command output: \"Join EP id=/var/run/cilium/state/60769_next ifname=lxc05587\\nkernel version: Linux k8s2 4.9.17-040917-generic #201703220831 SMP Wed Mar 22 12:33:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux\\nclang version: clang version 3.8.1 (tags/RELEASE_381/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang+llvm/bin\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\nRTNETLINK answers: No such device\\nWe have an error talking to the kernel, -1\\nNote: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!\\n\"" ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10
2018-07-31T16:13:47.441463496Z level=info msg="Regeneration of BPF program has completed" buildDuration=4.724872733s containerID=0558782bb5 endpointID=60769 ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10
2018-07-31T16:13:47.441466838Z level=warning msg="Generating BPF for endpoint failed, keeping stale directory." containerID=0558782bb5 endpointID=60769 file-path=60769_next_fail ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10
2018-07-31T16:13:47.44151822Z level=debug msg="Completed endpoint regeneration with no pending regeneration requests" code=OK containerID=0558782bb5 endpointID=60769 endpointState=ready ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10 type=0
2018-07-31T16:13:47.441560429Z level=warning msg="Regeneration of endpoint program failed" containerID=0558782bb5 endpointID=60769 error="error: \"exit status 2\" command output: \"Join EP id=/var/run/cilium/state/60769_next ifname=lxc05587\\nkernel version: Linux k8s2 4.9.17-040917-generic #201703220831 SMP Wed Mar 22 12:33:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux\\nclang version: clang version 3.8.1 (tags/RELEASE_381/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang+llvm/bin\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\nRTNETLINK answers: No such device\\nWe have an error talking to the kernel, -1\\nNote: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!\\n\"" ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=/ policyRevision=0
2018-07-31T16:13:47.441598352Z level=debug msg="Error regenerating endpoint: error: \"exit status 2\" command output: \"Join EP id=/var/run/cilium/state/60769_next ifname=lxc05587\\nkernel version: Linux k8s2 4.9.17-040917-generic #201703220831 SMP Wed Mar 22 12:33:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux\\nclang version: clang version 3.8.1 (tags/RELEASE_381/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang+llvm/bin\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\n'probe' is not a recognized processor for this target (ignoring processor)\\nRTNETLINK answers: No such device\\nWe have an error talking to the kernel, -1\\nNote: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!\\n\"" code=Failure containerID=0558782bb5 endpointID=60769 endpointState=ready ipv4=10.10.1.184 ipv6="f00d::a0a:100:0:ed61" k8sPodName=default/deathstar-8659bbbdbb-nmv89 policyRevision=10 type=200
2018-07-31T16:13:47.443097368Z level=debug msg="BPF compilation completed" BPFCompilationTime=4.700343105s containerID=2bd26a1b8c endpointID=9028 error="<nil>" ipv4=10.10.1.224 ipv6="f00d::a0a:100:0:2344" k8sPodName=default/xwing-6547b96dc9-4zjf9 policyRevision=0
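The agent log lines above are key=value pairs with quoted, backslash-escaped values, so the failing endpoint and its error can be pulled out mechanically. The following is a minimal sketch of such a parser; `parseLogfmt` is a hypothetical helper written for illustration, not part of Cilium or its test suite, and it assumes one log record per line as in the excerpt.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLogfmt splits one key=value log line into a map.
// Quoted values may contain spaces and backslash-escaped characters.
// Tokens without '=' (such as the leading timestamp) are skipped.
// Hypothetical helper for illustration only.
func parseLogfmt(line string) map[string]string {
	fields := make(map[string]string)
	i := 0
	for i < len(line) {
		// Skip separating spaces.
		for i < len(line) && line[i] == ' ' {
			i++
		}
		start := i
		for i < len(line) && line[i] != '=' && line[i] != ' ' {
			i++
		}
		key := line[start:i]
		if i >= len(line) || line[i] != '=' {
			continue // bare token, not a key=value pair
		}
		i++ // skip '='
		var val strings.Builder
		if i < len(line) && line[i] == '"' {
			i++ // skip opening quote
			for i < len(line) && line[i] != '"' {
				if line[i] == '\\' && i+1 < len(line) {
					i++ // keep the escaped character, drop the backslash
				}
				val.WriteByte(line[i])
				i++
			}
			i++ // skip closing quote
		} else {
			vstart := i
			for i < len(line) && line[i] != ' ' {
				i++
			}
			val.WriteString(line[vstart:i])
		}
		fields[key] = val.String()
	}
	return fields
}

func main() {
	// Shortened copy of the first warning in the issue message.
	line := `2018-07-31T16:13:47.441326938Z level=warning msg="RunInit: Command execution failed" endpointID=60769 error="exit status 2" k8sPodName=default/deathstar-8659bbbdbb-nmv89`
	f := parseLogfmt(line)
	if f["level"] == "warning" || f["level"] == "error" {
		fmt.Printf("endpoint %s (%s): %s: %s\n", f["endpointID"], f["k8sPodName"], f["msg"], f["error"])
	}
}
```

Filtering on `level` and grouping by `endpointID` this way makes it easier to see that every warning in the excerpt belongs to endpoint 60769.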
Logs:
47ead759_K8sDemosTest_Tests_Star_Wars_Demo.zip
Status checks:
⚠️ Found a "RunInit: Command execution failed" in logs
Number of "context deadline exceeded" in logs: 0
Number of "level=error" in logs: 5
⚠️ Number of "level=warning" in logs: 30
Number of "Cilium API handler panicked" in logs: 0
Cilium pods: [cilium-b45rc cilium-mhljt]
Netpols loaded:
CiliumNetworkPolicies loaded: default::deathstar-l7
Endpoint Policy Enforcement:
spaceship-589d768cc4-bdr4v => none
deathstar-8659bbbdbb-fmgf5 => ingress
xwing-6547b96dc9-mp5xx => none
cilium-health-k8s1 => none
cilium-health-k8s2 => none
deathstar-8659bbbdbb-lxpmx => ingress
deathstar-8659bbbdbb-nmv89 => ingress
spaceship-589d768cc4-2bdcl => none
prometheus-core-6546477dc8-r9c9k => none
spaceship-589d768cc4-42852 => none
spaceship-589d768cc4-svt7q => none
xwing-6547b96dc9-4zjf9 => none
xwing-6547b96dc9-m89nm => none
coredns-b9476b976-5j2p8 => none
Cilium agent "cilium-b45rc": Status: Ok Health: Ok Nodes "k8s1 k8s2" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 52 Failed 0
Cilium agent "cilium-mhljt": Status: Ok Health: Ok Nodes "k8s2 k8s1" ContinerRuntime: Ok Kubernetes: Ok KVstore: Ok Controllers: Total 52 Failed 0
cmd: kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default deathstar-8659bbbdbb-fmgf5 1/1 Running 0 1m 10.10.1.84 k8s2
default deathstar-8659bbbdbb-lxpmx 1/1 Running 0 1m 10.10.0.141 k8s1
default deathstar-8659bbbdbb-nmv89 1/1 Running 1 1m 10.10.1.194 k8s2
default spaceship-589d768cc4-2bdcl 1/1 Running 0 1m 10.10.1.139 k8s2
default spaceship-589d768cc4-42852 1/1 Running 0 1m 10.10.0.161 k8s1
default spaceship-589d768cc4-bdr4v 1/1 Running 0 1m 10.10.1.70 k8s2
default spaceship-589d768cc4-svt7q 1/1 Running 0 1m 10.10.0.25 k8s1
default xwing-6547b96dc9-4zjf9 1/1 Running 0 1m 10.10.1.224 k8s2
default xwing-6547b96dc9-m89nm 1/1 Running 0 1m 10.10.1.71 k8s2
default xwing-6547b96dc9-mp5xx 1/1 Running 0 1m 10.10.0.246 k8s1
kube-system cilium-b45rc 1/1 Running 0 5m 192.168.36.11 k8s1
kube-system cilium-mhljt 1/1 Running 0 5m 192.168.36.12 k8s2
kube-system coredns-b9476b976-5j2p8 1/1 Running 0 11m 10.10.0.149 k8s1
kube-system etcd-k8s1 1/1 Running 0 10m 192.168.36.11 k8s1
kube-system kube-apiserver-k8s1 1/1 Running 0 11m 192.168.36.11 k8s1
kube-system kube-controller-manager-k8s1 1/1 Running 0 10m 192.168.36.11 k8s1
kube-system kube-proxy-76fwx 1/1 Running 0 5m 192.168.36.12 k8s2
kube-system kube-proxy-vb2vj 1/1 Running 0 11m 192.168.36.11 k8s1
kube-system kube-scheduler-k8s1 1/1 Running 0 10m 192.168.36.11 k8s1
prometheus prometheus-core-6546477dc8-r9c9k 1/1 Running 0 5m 10.10.0.28 k8s1
cmd: kubectl exec -n kube-system cilium-b45rc -- cilium endpoint list
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
20800 Enabled Disabled 7726 k8s:class=deathstar f00d::a0a:0:0:5140 10.10.0.141 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
31054 Disabled Disabled 4 reserved:health f00d::a0a:0:0:794e 10.10.0.166 ready
49046 Disabled Disabled 1466 k8s:class=spaceship f00d::a0a:0:0:bf96 10.10.0.161 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
52000 Disabled Disabled 23109 k8s:io.cilium.k8s.policy.serviceaccount=coredns f00d::a0a:0:0:cb20 10.10.0.149 ready
k8s:io.kubernetes.pod.namespace=kube-system
k8s:k8s-app=kube-dns
55361 Disabled Disabled 1466 k8s:class=spaceship f00d::a0a:0:0:d841 10.10.0.25 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
59704 Disabled Disabled 29255 k8s:class=spaceship f00d::a0a:0:0:e938 10.10.0.246 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=alliance
61807 Disabled Disabled 37382 k8s:app=prometheus f00d::a0a:0:0:f16f 10.10.0.28 ready
k8s:component=core
k8s:io.cilium.k8s.policy.serviceaccount=prometheus-k8s
k8s:io.kubernetes.pod.namespace=prometheus
cmd: kubectl exec -n kube-system cilium-b45rc -- cilium service list
ID Frontend Backend
1 10.96.0.10:53 1 => 10.10.0.149:53
2 10.111.42.131:9090 1 => 10.10.0.28:9090
3 10.96.0.1:443 1 => 192.168.36.11:6443
5 10.103.186.101:80 1 => 10.10.0.141:80
2 => 10.10.1.194:80
3 => 10.10.1.84:80
cmd: kubectl exec -n kube-system cilium-mhljt -- cilium endpoint list
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
1856 Disabled Disabled 4 reserved:health f00d::a0a:100:0:740 10.10.1.156 ready
3377 Enabled Disabled 7726 k8s:class=deathstar f00d::a0a:100:0:d31 10.10.1.84 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
4297 Enabled Disabled 7726 k8s:class=deathstar f00d::a0a:100:0:10c9 10.10.1.194 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
9028 Disabled Disabled 29255 k8s:class=spaceship f00d::a0a:100:0:2344 10.10.1.224 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=alliance
18903 Disabled Disabled 1466 k8s:class=spaceship f00d::a0a:100:0:49d7 10.10.1.139 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
23142 Disabled Disabled 1466 k8s:class=spaceship f00d::a0a:100:0:5a66 10.10.1.70 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=empire
56198 Disabled Disabled 29255 k8s:class=spaceship f00d::a0a:100:0:db86 10.10.1.71 ready
k8s:io.cilium.k8s.policy.serviceaccount=default
k8s:io.kubernetes.pod.namespace=default
k8s:org=alliance
cmd: kubectl exec -n kube-system cilium-mhljt -- cilium service list
ID Frontend Backend
1 10.96.0.10:53 1 => 10.10.0.149:53
2 10.111.42.131:9090 1 => 10.10.0.28:9090
3 10.96.0.1:443 1 => 192.168.36.11:6443
5 10.103.186.101:80 1 => 10.10.0.141:80
2 => 10.10.1.194:80
3 => 10.10.1.84:80
About this issue
- State: closed
- Created 6 years ago
- Comments: 15 (15 by maintainers)
Commits related to this issue
- client: Add API timeout to endpoint requests Commit 66e36c4160e3 ("Add client timeout for Cilium API") attempted to introduce a longer client timeout to better handle longer endpoint regeneration cyc... — committed to joestringer/cilium by joestringer 6 years ago
- client: Add API timeout to endpoint requests Commit 66e36c4160e3 ("Add client timeout for Cilium API") attempted to introduce a longer client timeout to better handle longer endpoint regeneration cyc... — committed to cilium/cilium by joestringer 6 years ago
- client: Add API timeout to endpoint requests [ upstream commit f7f8c2fbed0046fbfaf1e28fa8bec64b0f52dbc6 ] Commit 66e36c4160e3 ("Add client timeout for Cilium API") attempted to introduce a longer cl... — committed to cilium/cilium by joestringer 6 years ago
@eloycoto can this be closed? We've done a bunch of work by now to streamline the building of endpoints.