cilium: Cilium without kube-proxy is broken with systemd 247 on Debian Bullseye

Bug report

Installing Cilium on a fresh Kubernetes installation without kube-proxy on Debian Bullseye with systemd 247 doesn’t work: CoreDNS won’t come up because the API server is not reachable. It keeps logging “still waiting on kubernetes”. Everything else comes up.

The problem only occurs without kube-proxy: with kube-proxy enabled, everything works.

General Information

  • Linux: Debian 10 with Bullseye repos enabled
  • Cilium version (run cilium version): 1.9.1
  • Kernel version (run uname -a): 5.10.0-1-amd64
  • Orchestration system version in use (e.g. kubectl version, Mesos, …): kubectl: 1.20.2
  • Link to relevant artifacts (policies, deployments scripts, …)
  • Generate and upload a system zip: cilium-sysdump-20210119-220818.zip

How to reproduce the issue

  1. Obtain a VM with Debian Bullseye
  2. Update the whole system (apt-get update and apt-get dist-upgrade)
  3. Install kubeadm
  4. Follow the Cilium guide
  5. Untaint the master: kubectl taint nodes "${master_node}" node-role.kubernetes.io/master:NoSchedule-
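
For convenience, the steps above can be sketched as a small script. This is only a rough outline under assumptions: the helper names `run` and `reproduce` are made up for illustration, the Kubernetes apt repository is assumed to be configured already, and `kubeadm init --skip-phases=addon/kube-proxy` is assumed to be how the kube-proxy-free setup is bootstrapped. The Cilium installation itself is left to the guide. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps; DRY_RUN=1 (the default) only prints commands.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print the command, execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# reproduce MASTER_NODE (hypothetical helper): walk through steps 2 to 5.
reproduce() {
  node="$1"
  # Step 2: update the whole system.
  run apt-get update
  run apt-get -y dist-upgrade
  # Step 3: install kubeadm (assumes the Kubernetes apt repository is set up).
  run apt-get install -y kubeadm
  # Assumption: bootstrap the cluster without the kube-proxy addon.
  run kubeadm init --skip-phases=addon/kube-proxy
  # Step 4: install Cilium 1.9.1 as described in the Cilium guide (not shown here).
  # Step 5: untaint the master.
  run kubectl taint nodes "$node" node-role.kubernetes.io/master:NoSchedule-
}

reproduce "${master_node:-master}"
```

Running it with `DRY_RUN=0` as root on a throwaway VM executes the commands for real.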

More Details

We updated our system, which took systemd from 246 to 247. With systemd held at 246, we were able to update the rest of the system and everything still worked. Updating systemd alone afterwards breaks it.
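
To narrow this down on another machine, something like the following might help. It is a sketch, not the exact procedure used here: the function name `systemd_major` is made up, and using `apt-mark hold` to pin systemd is an assumption about how the hold was done.

```shell
# systemd_major (hypothetical helper): read `systemctl --version` output on
# stdin and print the major version from its first line, which looks like
# "systemd 247 (247.2-5)".
systemd_major() {
  awk 'NR == 1 && $1 == "systemd" { print $2 }'
}

# Print the installed systemd major version, if systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
  systemctl --version | systemd_major
fi

# Assumed way to upgrade everything except systemd (run as root):
#   apt-mark hold systemd
#   apt-get update && apt-get dist-upgrade
# and to upgrade systemd alone afterwards:
#   apt-mark unhold systemd
#   apt-get install --only-upgrade systemd
```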

“Breaks it” means: a service that contacts the API server directly by its IP address runs into timeouts repeatedly. I suspect that’s also what CoreDNS means when it says “still waiting on kubernetes”.
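
One way to check for this symptom by hand is to probe the API server with a short timeout. The sketch below is an illustration only: `probe_apiserver` and `explain_curl_exit` are made-up helper names, and the host and port passed in are whatever address the failing service uses. It relies on curl's documented exit codes (7 for a failed connection, 28 for a timeout).

```shell
# explain_curl_exit CODE (hypothetical helper): map a curl exit status
# to a short diagnosis.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    7)  echo "connection failed" ;;
    28) echo "timeout" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# probe_apiserver HOST [PORT] (hypothetical helper): request /healthz with a
# 5 second limit. Any HTTP response, even an error status, counts as reachable.
probe_apiserver() {
  host="$1"
  port="${2:-6443}"
  rc=0
  curl --insecure --silent --output /dev/null --max-time 5 \
    "https://${host}:${port}/healthz" || rc=$?
  explain_curl_exit "$rc"
}
```

For example, `probe_apiserver 10.96.0.1 443` (assuming that is the cluster's kubernetes service IP) would be expected to print `timeout` on an affected node.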

Neither the Cilium operator logs nor the agent logs show anything suspicious.

10645 does not help. rp_filter is 0 everywhere (checked with sysctl -a | grep \.rp_filter), and the mentioned option --enable-endpoint-routes does not help either. Besides, we ran with systemd 246 before without issues.
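
The rp_filter check can be made a bit stricter than eyeballing the grep output. The following is a small sketch with a made-up function name, `nonzero_rp_filter`, that lists only the keys whose value is not 0. No output means rp_filter is 0 everywhere.

```shell
# nonzero_rp_filter (hypothetical helper): read `sysctl -a` style lines
# ("key = value") on stdin and print every rp_filter key whose value is not 0.
# The leading "\." keeps arp_filter keys from matching.
nonzero_rp_filter() {
  awk -F' = ' '/\.rp_filter = / && $2 != "0" { print $1 }'
}

# Check the running system, if sysctl is available.
if command -v sysctl >/dev/null 2>&1; then
  { sysctl -a 2>/dev/null || true; } | nonzero_rp_filter
fi
```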

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 30 (12 by maintainers)

Most upvoted comments

This issue still persists, and it’s likely to cause trouble as soon as Debian Bullseye goes stable. Could we please keep it open?

@pchaigno that’s what I did in the bug report above: create a cloud instance in Hetzner Cloud with Debian 10, add the Bullseye repos, run apt-get update and apt-get dist-upgrade, install a cluster with Cilium using kubeadm as described in the documentation for both tools, and remove the taint from the master. No further networking is involved, e.g. no private network or similar.