cilium: Misbehaving L7Proxy makes Cilium API Unable to create new endpoint

Bug report

General Information

  • Cilium version: Client: 1.10.0 952d9d3 2021-05-19T18:42:32+02:00 go version go1.16.4 linux/amd64 Daemon: 1.10.0 952d9d3 2021-05-19T18:42:32+02:00 go version go1.16.4 linux/amd64

  • Kernel version (run uname -a) Linux itsedcll-bb4070 5.10.0-1026-oem #27-Ubuntu SMP Thu May 6 07:19:13 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

  • Orchestration system version in use (e.g. kubectl version, …) Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.6", GitCommit:"8a62859e515889f07e3e3be6a1080413f17cf2c3", GitTreeState:"clean", BuildDate:"2021-04-15T03:28:42Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.6", GitCommit:"8a62859e515889f07e3e3be6a1080413f17cf2c3", GitTreeState:"clean", BuildDate:"2021-04-15T03:19:55Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}

  • Link to relevant artifacts (policies, deployments scripts, …)

helm upgrade -i cilium cilium/cilium \
--version 1.10.0 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=k8s-api.dcll.cnp.ikeadt.com \
--set k8sServicePort=6443 \
--set localRedirectPolicy=true \
--set prometheus.enabled=true \
--set operator.prometheus.enabled=true \
--set hubble.enabled=true \
--set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--set containerRuntime.integration=containerd \
--set tunnel=disabled \
--set ipam.mode=kubernetes \
--set nativeRoutingCIDR=10.62.128.0/19 \
--set devices=client0 \
--set endpointRoutes.enabled=true

  • Generate and upload a system zip: I have run it and tried to upload it to this issue, but without any luck. Maybe it is too large…

How to reproduce the issue

  1. Install cilium with the above helm command
  2. Run cilium connectivity test && kubectl delete ns cilium-test a few times. After a few runs the Cilium API becomes unavailable, and pods fail to be scheduled because of Cilium API timeouts:
  Warning  FailedCreatePodSandBox  46m                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2788700c30f493be1d6c729fc54d79700dbe48e7e1160a046710c9479b248a6a": Unable to create endpoint: Cilium API client timeout exceeded

At this point in time I am able to curl 2 of the 6 cilium-agent endpoints:

curl http://10.62.64.21:9095 -i
HTTP/1.1 404 Not Found
date: Mon, 31 May 2021 11:54:34 GMT
server: envoy
content-length: 0

On the working nodes I cannot curl the API:

curl http://10.62.64.23:9095 -i
curl: (7) Failed to connect to 10.62.64.23 port 9095: Connection refused

If I delete the pods on the broken nodes, they come back online for a little while and then they become healthy again.

The cilium status reports everything as healthy. The same is true if I run cilium status in the individual pods; they also report healthy.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 51 (48 by maintainers)

Most upvoted comments

Had another look at this. My current theory is that we are stuck in: https://github.com/cilium/cilium/blob/952d9d33740fc77e1a94d5a52b38d2a313e5c570/pkg/endpoint/bpf.go#L671

@knfoo one way to verify this would be to enable debugging output (there is a -D option on the agent, a debug.enabled value that can be set via Helm with --set debug.enabled=true, and, for cilium-cli installs, a ConfigMap option called debug: true)
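For concreteness, the Helm route could look roughly like this. This is only a sketch: it assumes the release and agent DaemonSet use the default name cilium in kube-system, and note that the agents only read their configuration at startup, so they need a restart afterwards.

helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
  --set debug.enabled=true
# restart the agents so they pick up the new setting
kubectl -n kube-system rollout restart daemonset/cilium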

With debugging set, we should be able to see the following messages: https://github.com/cilium/cilium/blob/952d9d33740fc77e1a94d5a52b38d2a313e5c570/pkg/endpoint/endpoint.go#L427-L432

details

In the sysdump, if we track the creation of endpoint 2736, it starts with:

2021-06-15T13:40:36.209287058Z level=info msg="Create endpoint request" addressing="&{10.62.129.163 c03ec021-ba83-4e84-9792-1afff94c98f1  }" containerID=04e4bce5b9e77d3658946ff4ab1d880fad7266884a4fef3430d4903ac3deea10 datapathConfiguration="&{false true false true 0xc002a3e27b}" interface=lxc07026740578a k8sPodName=cilium-test/client-8655c47fd7-jv2b5 labels="[]" subsys=daemon sync-build=true
2021-06-15T13:40:36.209683000Z level=info msg="New endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=2736 ipv4= ipv6= k8sPodName=/ subsys=endpoint

And the final message for this endpoint is:

2021-06-15T13:40:36.523565802Z level=info msg="Rewrote endpoint BPF program" containerID= datapathPolicyRevision=0 desiredPolicyRevision=41 endpointID=2736 identity=34191 ipv4= ipv6= k8sPodName=/ subsys=endpoint

Generated from: https://github.com/cilium/cilium/blob/952d9d33740fc77e1a94d5a52b38d2a313e5c570/pkg/endpoint/bpf.go#L729

which means that realizeBPFState() will return and reach waitForProxyCompletions: https://github.com/cilium/cilium/blob/952d9d33740fc77e1a94d5a52b38d2a313e5c570/pkg/endpoint/bpf.go#L641-L671

Indeed, there are a couple of goroutines stuck there in the gops-stack logs:

goroutine 9543 [select, 3 minutes]:
github.com/cilium/cilium/pkg/completion.(*WaitGroup).Wait(0xc002c82d40, 0x0, 0x0)
	/go/src/github.com/cilium/cilium/pkg/completion/completion.go:105 +0x1b3
github.com/cilium/cilium/pkg/endpoint.(*Endpoint).waitForProxyCompletions(0xc001978600, 0xc002c82d40, 0xc000400ee8, 0x0)
	/go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:428 +0x176
github.com/cilium/cilium/pkg/endpoint.(*Endpoint).regenerateBPF(0xc001978600, 0xc000400c00, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/cilium/cilium/pkg/endpoint/bpf.go:671 +0x618
github.com/cilium/cilium/pkg/endpoint.(*Endpoint).regenerate(0xc001978600, 0xc000400c00, 0x0, 0x0)
	/go/src/github.com/cilium/cilium/pkg/endpoint/policy.go:416 +0x84d
github.com/cilium/cilium/pkg/endpoint.(*EndpointRegenerationEvent).Handle(0xc0009e7ea0, 0xc002a83680)
	/go/src/github.com/cilium/cilium/pkg/endpoint/events.go:65 +0x49c
github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).run.func1()
	/go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:256 +0x16b
sync.(*Once).doSlow(0xc002a83098, 0xc002bfffb8)
	/usr/local/go/src/sync/once.go:68 +0xec
sync.(*Once).Do(...)
	/usr/local/go/src/sync/once.go:59
github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).run(0xc002a83080)
	/go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:244 +0x65
created by github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).Run
	/go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:240 +0x65
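For reference, the same kind of goroutine dump can also be pulled from a live agent pod. This is only a sketch; it assumes gops is shipped in the agent image and that cilium-agent runs as PID 1 in its container (both are the case for the default images, as far as I know):

# pick one agent pod and dump its goroutines, filtering for the suspected blocking call
CILIUM_POD=$(kubectl -n kube-system get pod -l k8s-app=cilium -o name | head -n1)
kubectl -n kube-system exec "$CILIUM_POD" -- gops stack 1 | grep -B 2 -A 12 waitForProxyCompletions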

If L7 proxy is causing the CNI ADD call to timeout, that is indeed interesting, thanks for the update, we’ll continue to investigate that.

FWIW, you should be able to disable all to-world tests in the connectivity check as well:

cilium connectivity test --test '!pod-to-world-toFQDNs' --test '!pod-to-cidr'

@jrajahalme I have run some tests during the weekend with --proxy-prometheus-port removed.

It seems that the deadlock problem is resolved; however, there are a bunch of other problems showing up. I get between 1 and 12 tests that fail during my 120 cilium connectivity test runs.

I am not sure if you want me to do anything else to validate #16834

I have attached a sysdump from my nodes. One thing that I noticed is:

ERROR Error: Command 'kubectl logs cilium-operator-7f74d888b5-wrd8j --container=cilium-operator --timestamps=true --since=0 --limit-bytes=1073741824 -n kube-system > ./cilium-sysdump-20210712-091214/cilium-operator-7f74d888b5-wrd8j-cilium-operator-20210712-091253.log' returned non-zero exit status 1.. Could not collect log file: cilium-operator-7f74d888b5-wrd8j-cilium-operator-20210712-091253

cilium-sysdump-20210712-091214.zip cilium-sysdump-20210712-091058.zip cilium-sysdump-20210712-091340.zip

Looking at the sysdumps, it seems that you are indeed not hitting the proxy issue anymore. Do you have particular connectivity tests that are failing? Do the issues persist in Cilium 1.10.2?

The kubectl logs error often occurs if there are node connectivity issues (i.e. apiserver not being able to reach kubelet on the node). Did you observe issues between particular nodes?

Might be worth creating a new issue for these new problems, so we can keep track of them separately, as they seem unrelated to the proxy.

I have run it again with the settings in the cilium-config as asked.

I noticed this as well in one of my test runs. I have attached the sysdump and the output of the test.

cilium-test-06.md cilium-sysdump-20210701-141236.zip

During the test runs to provoke the error, I noticed these problems:

❌ unable to get policy revisions for Cilium pods: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure error dialing backend: dial tcp 10.62.60.56:10250: connect: connection refused  <nil> 500}})

---------------------------------------------------------------------------------------------------------------------
⌛ Applying CiliumNetworkPolicies...
---------------------------------------------------------------------------------------------------------------------
✅ applied CiliumNetworkPolicies: dcll/cilium-test/client-egress-to-echo
❌ policy apply failed
❌ unable to get policy revisions for Cilium pods: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure error dialing backend: dial tcp 10.62.60.44:10250: connect: connection refused  <nil> 500}})
❌ unable to get policy revisions for Cilium pods: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure error dialing backend: dial tcp 10.62.60.44:10250: connect: connection refused  <nil> 500}})

---------------------------------------------------------------------------------------------------------------------
⌛ Applying CiliumNetworkPolicies...
---------------------------------------------------------------------------------------------------------------------
✅ applied CiliumNetworkPolicies: dcll/cilium-test/client-egress-to-fqdns-google
❌ policy apply failed
❌ unable to get policy revisions for Cilium pods: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure error dialing backend: dial tcp 10.62.60.44:10250: connect: connection refused  <nil> 500}})

The IPs 10.62.60.44 and 10.62.60.56 are not IPs on the client0 interface but on our point-to-point links on eno6 and eno7.
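For context, port 10250 is the kubelet API, and for kubectl logs and exec the apiserver dials whatever address each node advertises as its InternalIP, which is probably why the point-to-point addresses show up here. A quick, generic way to check which address each node advertises (plain kubectl, nothing Cilium-specific):

# the INTERNAL-IP column is the address the apiserver uses to reach the kubelet on :10250
kubectl get nodes -o wide
# or just node names and InternalIPs
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'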

@knfoo We’re trying to reproduce the issue in order to be able to debug it further, but currently with no success.

If you still have your test cluster, it would be incredibly helpful if we can gather more information with flow debug logging enabled:

  1. Add debug: "true" and debug-verbose: "flow" to the cilium-config ConfigMap
  2. Restart the cilium pods for the above settings to be applied
  3. Reproduce the issue by re-running the connectivity test until pods are stuck pending
  4. Capture a sysdump (a rough command sketch of these steps follows below)
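As a rough command sketch of the steps above (assuming the default kube-system namespace, the ConfigMap name cilium-config, the agent DaemonSet name cilium, and a cilium-cli that has the sysdump subcommand; otherwise use the standalone sysdump tool used earlier in this thread):

# 1. add the debug settings to the agent ConfigMap
kubectl -n kube-system patch configmap cilium-config --type merge \
  -p '{"data":{"debug":"true","debug-verbose":"flow"}}'
# 2. restart the agents so the settings are applied
kubectl -n kube-system rollout restart daemonset/cilium
kubectl -n kube-system rollout status daemonset/cilium
# 3. re-run the connectivity test until pods are stuck pending
cilium connectivity test
# 4. capture a sysdump once the issue reproduces
cilium sysdump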

Thanks!

The logs seem to verify that https://github.com/cilium/cilium/blob/952d9d33740fc77e1a94d5a52b38d2a313e5c570/pkg/endpoint/bpf.go#L671

never returns.

For example, for 309ac26cc3d7eb1727f97cf96197cc9e9387512234176df950d1bfe8f5460b13:

“Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "309ac26cc3d7eb1727f97cf96197cc9e9387512234176df950d1bfe8f5460b13": Unable to create endpoint: Cilium API client timeout exceeded”,

corresponding to endpoint 40, there is an entry:

2021-06-23T17:29:00.748090381Z level=debug msg="Waiting for proxy updates to complete..." containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint

in the logs, but no matching "Wait time for proxy updates:" message.

2021-06-23T17:29:00.368532207Z level=debug msg="PUT /endpoint/{id} request" endpoint="{Addressing:0xc003da6980 ContainerID:309ac26cc3d7eb1727f97cf96197cc9e9387512234176df950d1bfe8f5460b13 ContainerName: DatapathConfiguration:<nil> DatapathMapID:0 DockerEndpointID: DockerNetworkID: HostMac:66:fa:ed:4a:fa:09 ID:0 InterfaceIndex:168958 InterfaceName:lxc37e60f3e47a9 K8sNamespace:cilium-test K8sPodName:echo-same-node-d7dcc498d-88nnx Labels:[] Mac:42:6e:51:cd:77:9a Pid:0 PolicyEnabled:false State:waiting-for-identity SyncBuildEndpoint:true}" subsys=daemon
2021-06-23T17:29:00.368586686Z level=info msg="Processing API request with rate limiter" maxWaitDuration=15s name=endpoint-create parallelRequests=10 subsys=rate uuid=ebcc43c3-86ed-4532-aa92-e8f0acdc4c83
2021-06-23T17:29:00.368604213Z level=info msg="API request released by rate limiter" burst=11 limit=2.25/s maxWaitDuration=15s maxWaitDurationLimiter=14.999921658s name=endpoint-create parallelRequests=10 subsys=rate uuid=ebcc43c3-86ed-4532-aa92-e8f0acdc4c83 waitDurationLimiter=0s waitDurationTotal="91.146µs"
2021-06-23T17:29:00.368650015Z level=info msg="Create endpoint request" addressing="&{10.62.130.97 d2af1217-e79d-42cf-8cc8-88a09b7f6be7  }" containerID=309ac26cc3d7eb1727f97cf96197cc9e9387512234176df950d1bfe8f5460b13 datapathConfiguration="&{false true false true 0xc001e2583a}" interface=lxc37e60f3e47a9 k8sPodName=cilium-test/echo-same-node-d7dcc498d-88nnx labels="[]" subsys=daemon sync-build=true
2021-06-23T17:29:00.368985205Z level=info msg="New endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.369189321Z level=debug msg="Skipping CiliumEndpoint update because security identity is not yet available" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:29:00.369424260Z level=debug msg="Refreshing labels of endpoint" containerID=309ac26cc3 endpointID=40 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" infoLabels="k8s:pod-template-hash=d7dcc498d" subsys=endpoint
2021-06-23T17:29:00.369476704Z level=debug msg="Assigning information label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:pod-template-hash Value:d7dcc498d Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369510064Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:io.kubernetes.pod.namespace Value:cilium-test Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369542509Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:io.cilium.k8s.policy.serviceaccount Value:default Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369559745Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:io.cilium.k8s.policy.cluster Value:default Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369569676Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:kind Value:echo Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369579007Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:name Value:echo-same-node Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369588085Z level=debug msg="Assigning security relevant label" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 ipv4= ipv6= k8sPodName=/ obj="{Key:other Value:echo Source:k8s}" subsys=endpoint
2021-06-23T17:29:00.369604531Z level=info msg="Resolving identity labels (blocking)" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.369647408Z level=debug msg="Resolving identity for labels" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.369853706Z level=debug msg="Assigned new identity to endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.369928908Z level=debug msg="Set identity for this endpoint" code=OK containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 endpointState=ready ipv4= ipv6= k8sPodName=/ policyRevision=0 subsys=endpoint type=0
2021-06-23T17:29:00.369980264Z level=info msg="Identity of endpoint changed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" ipv4= ipv6= k8sPodName=/ oldIdentity="no identity" subsys=endpoint
2021-06-23T17:29:00.370005791Z level=debug msg="Triggering endpoint regeneration due to updated security labels" code=OK containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 endpointState=waiting-to-regenerate identity=19331 ipv4= ipv6= k8sPodName=/ policyRevision=0 subsys=endpoint type=0
2021-06-23T17:29:00.370127572Z level=info msg="Waiting for endpoint to be generated" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370156887Z level=debug msg="Dequeued endpoint from build queue" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370175465Z level=debug msg="Regenerating endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ reason="updated security labels" startTime="2021-06-23 17:29:00.370071196 +0000 UTC m=+4465.822988948" subsys=endpoint
2021-06-23T17:29:00.370292891Z level=debug msg="Endpoint labels unchanged, skipping resolution of identity" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 identityLabels="k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=echo,k8s:name=echo-same-node,k8s:other=echo" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370338048Z level=debug msg="Regenerating endpoint: updated security labels" code=OK containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 endpointState=regenerating identity=19331 ipv4= ipv6= k8sPodName=/ policyRevision=0 subsys=endpoint type=0
2021-06-23T17:29:00.370397554Z level=debug msg="removing directory" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 directory=40_next endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370786339Z level=debug msg="flushing old PolicyMap" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370819548Z level=debug msg="Starting policy recalculation..." containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370830573Z level=debug msg="Forced policy recalculation" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.370912341Z level=debug msg="Completed endpoint policy recalculation" containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 forcedRegeneration=true identity=19331 ipv4= ipv6= k8sPodName=/ policyCalculation="&{{{{{0 0} 0 0 0 0}}} {0 0 <nil>} 13634 0}" subsys=endpoint waitingForIdentityCache="&{{{{{0 0} 0 0 0 0}}} {0 0 <nil>} 0 0}" waitingForPolicyRepository="&{{{{{0 0} 0 0 0 0}}} {0 0 <nil>} 837 0}"
2021-06-23T17:29:00.371332015Z level=debug msg="BPF header file hashed (was: \"\")" bpfHeaderfileHash=460e5c8b2b6198ead43f25638eedb0e9583ad502f40675a48f392412ea07ef76 containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.371341126Z level=debug msg="writing header file" containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 file-path=40_next/ep_config.h identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.375074449Z level=debug msg="Preparing to compile BPF" containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ regeneration-level=rewrite+load subsys=endpoint
2021-06-23T17:29:00.748044273Z level=info msg="Rewrote endpoint BPF program" containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:00.748090381Z level=debug msg="Waiting for proxy updates to complete..." containerID= datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-23T17:29:02.913215640Z level=debug msg="GET /endpoint/{id} request" endpointID=40 subsys=daemon
2021-06-23T17:29:02.963013248Z level=debug msg="GET /endpoint/{id}/log request" endpointID=40 subsys=daemon
2021-06-23T17:29:03.011112439Z level=debug msg="GET /endpoint/{id}/log request" endpointID=40 subsys=daemon
2021-06-23T17:29:10.370320802Z level=debug msg="Getting CEP during an initialization" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:29:20.414928008Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:29:30.415712926Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:29:40.416697443Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:29:50.417753774Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:30:00.418385541Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:30:10.419156982Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:30:20.419445405Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:30:30.420192516Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchronizer
2021-06-23T17:30:30.502828110Z level=debug msg="DELETE /endpoint/{id} request" params="{HTTPRequest:0xc0003c3b00 ID:container-id:309ac26cc3d7eb1727f97cf96197cc9e9387512234176df950d1bfe8f5460b13}" subsys=daemon
2021-06-23T17:30:40.420735898Z level=debug msg="Skipping CiliumEndpoint update because it has not changed" containerID= controller="sync-to-k8s-ciliumendpoint (40)" datapathPolicyRevision=0 desiredPolicyRevision=281 endpointID=40 identity=19331 ipv4= ipv6= k8sPodName=/ subsys=endpointsynchroniz

Not sure why, but adding --set debug.enabled=true changed the predictability of the failing test, so it took a little while extra to figure out how to provoke the error state.

cilium-sysdump-20210623-193409.zip cilium-sysdump-20210623-193436.zip

Because going over the GH issue can be a bit confusing, I think it would be valuable to post a summary of our current understanding after syncing with @knfoo offline.

While performing connectivity tests, @knfoo discovered that in some cases certain nodes will reach a state where pod creation fails. This seems to happen when the Envoy proxy starts on said nodes [1] (if L7 tests are disabled, the issue does not happen). That is, in some cases starting Envoy seems to obstruct pod creation.

In the latest sysdump, in k8s-events-20210615-154339.json, there are the following failures:

failed to setup network for sandbox \"04e4bce5b9e77d3658946ff4ab1d880fad7266884a4fef3430d4903ac3deea10\"
failed to setup network for sandbox \"fbaeff6a832b232ff5def3a01bd3367417a5b62e98154b62ab05d5b177caa5fb\"
failed to setup network for sandbox \"eaa76446cfab3bb4b8edb7cbdffdb330e0abdf86384004775d18d372be56af71\"

All of these appear to be scheduled on the same agent:

cilium-4dnrn-cilium-agent-20210615-154347.log:850:2021-06-15T13:40:36.209287058Z level=info msg="Create endpoint request" addressing="&{10.62.129.163 c03ec021-ba83-4e84-9792-1afff94c98f1  }" containerID=04e4bce5b9e77d3658946ff4ab1d880fad7266884a4fef3430d4903ac3deea10 datapathConfiguration="&{false true false true 0xc002a3e27b}" interface=lxc07026740578a k8sPodName=cilium-test/client-8655c47fd7-jv2b5 labels="[]" subsys=daemon sync-build=true
cilium-4dnrn-cilium-agent-20210615-154347.log:858:2021-06-15T13:40:36.298953492Z level=info msg="Create endpoint request" addressing="&{10.62.129.46 a87cc702-a608-4df6-a3a3-c6f98099a944  }" containerID=fbaeff6a832b232ff5def3a01bd3367417a5b62e98154b62ab05d5b177caa5fb datapathConfiguration="&{false true false true 0xc00282acab}" interface=lxc77d7199c9937 k8sPodName=cilium-test/client2-657df6649d-vlsln labels="[]" subsys=daemon sync-build=true
cilium-4dnrn-cilium-agent-20210615-154347.log:868:2021-06-15T13:40:38.868091682Z level=info msg="Create endpoint request" addressing="&{10.62.129.24 cccfa8ff-0d76-40c0-9cd8-b60248a8656e  }" containerID=eaa76446cfab3bb4b8edb7cbdffdb330e0abdf86384004775d18d372be56af71 datapathConfiguration="&{false true false true 0xc002e336ab}" interface=lxcd84fc0ffd204 k8sPodName=cilium-test/echo-same-node-d7dcc498d-5l4wg labels="[]" subsys=daemon sync-build=true

Which is also the only agent that starts envoy:

cilium-4dnrn-cilium-agent-20210615-154347.log:547:2021-06-15T13:26:47.703666371Z level=info msg="Envoy: Listening for prometheus metrics" port=9095 subsys=envoy-manager

The issue does not happen every time envoy starts, but it can be consistently reproduced after doing multiple retries (last example was 1 attempt out of 8 but this might vary).

The pod creation requests seem to reach the agent, but it's not clear what happens next.

In the same sysdump, we see the following error:

"message": "failed to create pod sandbox: rpc error: code = unknown desc = failed to setup network for sandbox \"04e4bce5b9e77d3658946ff4ab1d880fad7266884a4fef3430d4903ac3deea10\": unable to create endpoint: cilium api client timeout exceeded"

Respective agent logs show that the above request reached the agent:

2021-06-15T13:40:36.209287058Z level=info msg="Create endpoint request" addressing="&{10.62.129.163 c03ec021-ba83-4e84-9792-1afff94c98f1  }" containerID=04e4bce5b9e77d3658946ff4ab1d880fad7266884a4fef3430d4903ac3deea10 datapathConfiguration="&{false true false true 0xc002a3e27b}" interface=lxc07026740578a k8sPodName=cilium-test/client-8655c47fd7-jv2b5 labels="[]" subsys=daemon sync-build=true

Just to clarify: does the test still fail if there is a sleep 5m before each run?

I reran the test with sleep 5m; that did not make a difference.

I can update the issue title - but I am not really sure what to update it to.

This is what happens when the L7 proxy is misbehaving:

Warning FailedCreatePodSandBox 6s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "eaa76446cfab3bb4b8edb7cbdffdb330e0abdf86384004775d18d372be56af71": Unable to create endpoint: Cilium API client timeout exceeded

So I am a little confused about how you got to the statement that:

I think we also determined that the Cilium API does not become unavailable.

Thank you for verifying this.

Next, I would like to understand the underlying error when the L7 proxy is enabled.

This previous comment, for example, mentions that the pods failed to reach a ready state after the third attempt. Am I correct that the above comment refers to the kube-proxy case? Are pods failing to reach a ready state in the kubeProxyReplacement=strict case as well?

I would also like to understand whether the issue relates to running multiple tests in succession.

For the case(s) where there are pod readiness issues in the connectivity test, does the issue happen if you wait (e.g., using a sleep 5m) after deleting the namespace?

If adding a sufficiently large delay does not cause pod readiness issues, can you also please run the tests in different namespaces, with something like the following, and check whether the issue appears again?

for I in {0..10};
do
     cilium connectivity test --test-namespace cilium-test-$I ...
     kubectl delete ns cilium-test-$I
     echo "Done with test $I"
done

I reran the test: 50 cycles of delete ns and cilium connectivity test. They all passed, so I guess that


❌ policy is not applied in all Cilium nodes in time ℹ️ [dcll] Cilium agent kube-system/cilium-msnb7 logs since 2021-06-16 14:48:15.874727368 +0200 CEST m=+63.778213797: …

was a glitch.

I ran the test with the kube-proxy replacement:

helm upgrade -i cilium cilium/cilium \
  --version 1.10.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=strict \
  --set k8sServiceHost=<k8s-api> \
  --set k8sServicePort=6443 \
  --set localRedirectPolicy=true \
  --set prometheus.enabled=true \
  --set operator.prometheus.enabled=true \
  --set hubble.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set containerRuntime.integration=containerd \
  --set tunnel=disabled \
  --set ipam.mode=kubernetes \
  --set nativeRoutingCIDR=10.62.128.0/19 \
  --set devices=client0 \
  --set endpointRoutes.enabled=true \
  --set installNoConntrackIptablesRules=true \
  --set l7Proxy=false

I ran 50 delete ns and cilium connectivity test cycles and saw no errors. I guess this confirms that the issue has to do with the L7 proxy.

We had a closer look at the logs with @gandro.

One thing we would like to note is that port 9095 is not the Cilium agent API port. It is a port used by Envoy to serve Prometheus metrics. Envoy is started when an L7 policy is imported.

For example, in the cilium-4dnrn-cilium-agent-20210615-154347.log, the following log entry:

2021-06-15T13:26:47.623326662Z level=info msg="Policy Add Request" ciliumNetworkPolicy="[&{EndpointSelector:{\"matchLabels\":{\"any:kind\":\"client\",\"k8s:io.kubernetes.pod.namespace\":\"cilium-test\"}} NodeSelector:{} Ingress:[] IngressDeny:[] Egress:[{EgressCommonRule:{ToEndpoints:[] ToRequires:[] ToCIDR: ToCIDRSet:[] ToEntities:[] ToServices:[] ToGroups:[] aggregatedSelectors:[{LabelSelector:0xc0004eefc0 requirements:0xc00000db30 cachedLabelSelectorString:&LabelSelector{MatchLabels:map[string]string{reserved.none: ,},MatchExpressions:[]LabelSelectorRequirement{},}}]} ToPorts:[{Ports:[{Port:80 Protocol:TCP}] TerminatingTLS:<nil> OriginatingTLS:<nil> Rules:0xc0026fcf50}] ToFQDNs:[{MatchName:google.com MatchPattern:}]} {EgressCommonRule:{ToEndpoints:[{\"matchLabels\":{\"k8s:io.kubernetes.pod.namespace\":\"kube-system\",\"k8s:k8s-app\":\"kube-dns\"}}] ToRequires:[] ToCIDR: ToCIDRSet:[] ToEntities:[] ToServices:[] ToGroups:[] aggregatedSelectors:[]} ToPorts:[{Ports:[{Port:53 Protocol:ANY}] TerminatingTLS:<nil> OriginatingTLS:<nil> Rules:0xc0026fd030}] ToFQDNs:[]}] EgressDeny:[] Labels:[k8s:io.cilium.k8s.policy.derived-from=CiliumNetworkPolicy k8s:io.cilium.k8s.policy.name=client-egress-to-fqdns-google k8s:io.cilium.k8s.policy.namespace=cilium-test k8s:io.cilium.k8s.policy.uid=7ffb8360-4a31-4746-b5a2-a989eed0b82f] Description:}]" policyAddRequest=30e02466-063a-4fc1-a9ce-02467a7456a4 subsys=daemon

is followed by Envoy starting to listen on this port:

2021-06-15T13:26:47.703666371Z level=info msg="Envoy: Listening for prometheus metrics" port=9095 subsys=envoy-manager

In other agents, where there are no relevant L7 policies, the above message about Envoy listening on 9095 does not appear.

To help pin down the issue, would it be possible to install Cilium using --set l7Proxy=false so that Envoy is disabled, and then run non-L7 connectivity tests using cilium connectivity test --test '!pod-to-world-toFQDNs'?

We will continue investigating this, but it would be helpful if we could verify whether our theory that this relates to the L7 proxy is correct.
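For reference, applying that suggestion on top of the existing install could look like this (a sketch only; --reuse-values keeps the options from the original helm command, and the release and namespace are assumed to be cilium and kube-system as above):

helm upgrade -i cilium cilium/cilium -n kube-system \
  --reuse-values \
  --set l7Proxy=false
cilium connectivity test --test '!pod-to-world-toFQDNs'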

It seems to be the same problem. During the 3rd cilium connectivity test the pods failed to get to a ready state:

Warning  Unhealthy  73s (x49 over 2m49s)  kubelet  Readiness probe failed: Get "http://10.62.129.125:8080/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Canceled the test and deleted the namespace. Then we are back to:

Warning  FailedCreatePodSandBox  6s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "eaa76446cfab3bb4b8edb7cbdffdb330e0abdf86384004775d18d372be56af71": Unable to create endpoint: Cilium API client timeout exceeded

Sysdump: cilium-sysdump-20210615-154339.zip

--set k8sServiceHost=<k8s-api> \ --set k8sServicePort=6443 \

Could you also remove those since you’re now using kube-proxy?

I installed cilium with this command.

helm upgrade -i cilium cilium/cilium \
  --version 1.10.0 \
  --namespace kube-system \
  --set k8sServiceHost=<k8s-api> \
  --set k8sServicePort=6443  \
  --set prometheus.enabled=true  \
  --set operator.prometheus.enabled=true \
  --set hubble.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set containerRuntime.integration=containerd \
  --set tunnel=disabled \
  --set ipam.mode=kubernetes \
  --set nativeRoutingCIDR=10.62.128.0/19 \
  --set devices="client0" \
  --set endpointRoutes.enabled=true  \
  --set bandwidthManager=false

Disabling the bandwidth manager did not change the problem. Output from the cilium connectivity test is in test-2.md; the world errors are expected, since we have not opened up to all of the world in our border firewalls.

Sysdump from the broken nodes. cilium-sysdump-20210611-111109.zip cilium-sysdump-20210611-111359.zip

I will recreate the cluster without disabling kube-proxy.

Those should be fine AFAIK.

From dmesg, this is what I see recurring:

[Fri Jun  4 11:00:01 2021] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[Fri Jun  4 11:00:01 2021] IPv6: ADDRCONF(NETDEV_CHANGE): lxc16383abce7f3: link becomes ready
[Fri Jun  4 11:00:01 2021] lxc16383abce7f3: Caught tx_queue_len zero misconfig
[Mon Jun  7 10:16:45 2021] IPv6: ADDRCONF(NETDEV_CHANGE): lxc_health: link becomes ready
[Mon Jun  7 10:16:52 2021] lxc_health: Caught tx_queue_len zero misconfig

I will try to first run cilium with bandwidthManager=false and then disable the kubeProxyReplacement tomorrow.

The cluster and nodes are idle. I have only run cilium connectivity test. The other deployments are static: no restarts, scaling, etc.

cilium-sysdump-20210604-132435.zip cilium-sysdump-20210604-132759.zip

I have run the sysdump command on a working node as well, but even if I specify --size-limit the file is too large to upload to GH; it is 36.3 MB.