cilium: Cilium without kube-proxy and systemd 247 is broken on Debian Bullseye
Bug report
Installing cilium on a fresh kubernetes installation without kube-proxy on debian bullseye with systemd 247 doesn’t work: coredns won’t come up because the API server is not reachable. It keeps saying “still waiting on kubernetes”. Everything comes up except coredns.
It’s related to kube-proxy: it works with kube-proxy enabled.
General Information
- Linux: debian 10 with bullseye repos enabled
- Cilium version (run cilium version): 1.9.1
- Kernel version (run uname -a): 5.10.0-1-amd64
- Orchestration system version in use (e.g. kubectl version, Mesos, …): kubectl: 1.20.2
- Link to relevant artifacts (policies, deployments scripts, …)
- Generate and upload a system zip: cilium-sysdump-20210119-220818.zip
How to reproduce the issue
- obtain a VM with debian bullseye
- update the whole system (apt-get update and apt-get dist-upgrade)
- install kubeadm
- follow the cilium guide
- untaint master
kubectl taint nodes "${master_node}" node-role.kubernetes.io/master:NoSchedule-
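For readers who want to retrace the steps, here is a rough sketch of what they might look like as commands. It assumes the "cilium guide" is the kube-proxy-free guide for Cilium 1.9 and that Cilium is installed with Helm; the report does not confirm either. The function name print_repro_commands and the example address are hypothetical. The function only prints the commands and runs nothing, and it leaves out installing kubeadm and a container runtime.

```shell
# Print (do not run) the commands for a kube-proxy-free kubeadm + Cilium install.
# $1: API server host, $2: API server port (defaults to 6443, the kubeadm default).
# Assumption: Helm-based install as in the Cilium 1.9 kube-proxy-free guide.
print_repro_commands() {
    api_host="$1"
    api_port="${2:-6443}"
    cat <<EOF
apt-get update && apt-get dist-upgrade -y
kubeadm init --skip-phases=addon/kube-proxy
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.9.1 --namespace kube-system --set kubeProxyReplacement=strict --set k8sServiceHost=${api_host} --set k8sServicePort=${api_port}
kubectl taint nodes "\${master_node}" node-role.kubernetes.io/master:NoSchedule-
EOF
}

# Example (address is a placeholder):
# print_repro_commands 192.0.2.10 6443
```

Printing the commands rather than running them keeps the sketch safe to try on any machine; each line can be reviewed and run by hand on the test VM.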
More Details
We updated our system, which also upgraded systemd from 246 to 247. With systemd held at 246, we were able to update the rest of the system and everything worked. Updating systemd alone afterwards breaks it.
“Breaks it”: a service that queries the API server, using the server’s IP directly, repeatedly runs into a timeout. I suspect that’s also what coredns means when it says “still waiting on kubernetes”.
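The timeout can be checked from a node or pod with a plain TCP probe. The following is a minimal sketch, not taken from the report: the helper name probe_tcp is hypothetical, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available. Which address to probe (node IP or the kubernetes service ClusterIP) is also an assumption, since the report only says "the IP".

```shell
# Try to open a TCP connection to host:port, giving up after a timeout.
# Returns 0 if the connection succeeds, non-zero on refusal or timeout.
# Relies on bash's /dev/tcp pseudo-device and coreutils' timeout.
probe_tcp() {
    host="$1"
    port="$2"
    timeout_s="${3:-5}"
    timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (not run here): look up the kubernetes service ClusterIP and probe it.
# api_ip="$(kubectl get service kubernetes -o jsonpath='{.spec.clusterIP}')"
# probe_tcp "$api_ip" 443 5 && echo "reachable" || echo "timed out or refused"
```

A failing probe only shows that the TCP connection could not be opened; it does not say where along the path the packets are dropped.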
Neither the cilium operator logs nor the agent logs show anything suspicious.
10645 is not helping. rp_filter is 0 everywhere (checked with sysctl -a | grep \\.rp_filter), and the mentioned option --enable-endpoint-routes is not helping either. Besides, we ran with systemd 246 before without issues.
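The rp_filter check can also be scripted so that only the offending interfaces are listed. This is a small sketch; the function name nonzero_rp_filter and its optional directory argument are hypothetical conveniences added for illustration. It reads the same values that sysctl reports, from /proc/sys/net/ipv4/conf.

```shell
# List every interface whose rp_filter value is not 0, as "<iface>=<value>".
# The optional first argument (default /proc/sys/net/ipv4/conf) exists only so
# the function can be pointed at another directory; it is an assumption of
# this sketch, not part of any tool.
nonzero_rp_filter() {
    conf_root="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$conf_root"/*/rp_filter; do
        [ -r "$f" ] || continue
        value="$(cat "$f")"
        if [ "$value" != "0" ]; then
            iface="$(basename "$(dirname "$f")")"
            echo "${iface}=${value}"
        fi
    done
}

# Example: no output means rp_filter is 0 on every interface.
# nonzero_rp_filter
```

Because interfaces created later, such as those for new pods, get their own entries, it may be worth re-running the check after workloads have been scheduled.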
About this issue
- State: closed
- Created 3 years ago
- Comments: 30 (12 by maintainers)
This issue still persists. And it’s likely to cause trouble as soon as debian bullseye goes stable. Could we please keep it open?
@pchaigno that’s what I did in the bug report above: create a cloud instance in hetzner cloud with debian 10, add bullseye repos, apt-get update and apt-get dist-upgrade and install a cluster with cilium using kubeadm as described in the documentation for both tools and remove the taint for the master. No further networking involved, e.g. a private network or similar.
Then you might be hitting https://github.com/cilium/cilium/issues/14579.