calico: Calico-node crashing after switching to the eBPF mode

Expected Behavior

Calico working in the eBPF mode

Current Behavior

calico-node crashing:

2022-10-20 19:42:40.977 [INFO][10591] felix/int_dataplane.go 616: BPF enabled, starting BPF endpoint manager and map manager.
2022-10-20 19:42:40.978 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0xc name="/sys/fs/bpf/tc/globals/cali_v4_ip_sets" 
2022-10-20 19:42:40.979 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0xd name="/sys/fs/bpf/tc/globals/cali_v4_state3" 
2022-10-20 19:42:40.980 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0xe name="/sys/fs/bpf/tc/globals/cali_v4_arp2" 
2022-10-20 19:42:40.981 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0xf name="/sys/fs/bpf/tc/globals/cali_v4_fsafes2" 
2022-10-20 19:42:40.981 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x10 name="/sys/fs/bpf/tc/globals/cali_v4_nat_fe3" 
2022-10-20 19:42:40.982 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x11 name="/sys/fs/bpf/tc/globals/cali_v4_nat_be" 
2022-10-20 19:42:40.983 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x12 name="/sys/fs/bpf/tc/globals/cali_v4_nat_aff" 
2022-10-20 19:42:40.984 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x13 name="/sys/fs/bpf/tc/globals/cali_v4_routes" 
2022-10-20 19:42:40.985 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x14 name="/sys/fs/bpf/tc/globals/cali_v4_ct3" 
2022-10-20 19:42:40.986 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x15 name="/sys/fs/bpf/tc/globals/cali_v4_srmsg" 
2022-10-20 19:42:40.987 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x16 name="/sys/fs/bpf/tc/globals/cali_v4_ct_nats" 
2022-10-20 19:42:40.988 [INFO][10591] felix/maps.go 474: Loaded map file descriptor. fd=0x17 name="/sys/fs/bpf/tc/globals/cali_iface2" 
libbpf: prog 'calico_connect_v4': failed to attach to cgroup: No such file or directory
2022-10-20 19:42:41.003 [PANIC][10591] felix/int_dataplane.go 709: BPFConnTimeLBEnabled but failed to attach connect-time load balancer, bailing out. error=failed to attach program calico_connect_v4: failed to attach calico_connect_v4 to cgroup /run/calico/cgroup (legacy try operation not permitted): no such file or directory
panic: (*logrus.Entry) 0xc00050e540

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc00050e4d0, 0x0, {0xc000624300, 0x52})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:260 +0x47e
github.com/sirupsen/logrus.(*Entry).Log(0xc00050e4d0, 0x0, {0xc0007c6f60?, 0x2b56120?, 0xc00004e660?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:304 +0x4f
github.com/sirupsen/logrus.(*Entry).Panic(...)
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:342
github.com/projectcalico/calico/felix/dataplane/linux.NewIntDataplaneDriver({{0xc000054134, 0xc}, 0x0, {0x0, 0x0}, 0x0, 0x0, 0x0, 0x12b5, 0x100000, ...})
    /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:709 +0x344a
github.com/projectcalico/calico/felix/dataplane.StartDataplaneDriver(0xc000189000, 0xc00020e5d0, 0xc0005d4fd0, 0xc0005d4fe0, 0xc000439800)
    /go/src/github.com/projectcalico/calico/felix/dataplane/driver.go:365 +0x1ff9
github.com/projectcalico/calico/felix/daemon.Run({0x2f5a6a3, 0x15}, {0x34360c0, 0x7}, {0x3443820, 0x18}, {0x346a180, 0x28})
    /go/src/github.com/projectcalico/calico/felix/daemon/daemon.go:414 +0x20a5
main.main()
    /go/src/github.com/projectcalico/calico/node/cmd/calico-node/main.go:133 +0x86e

Log from mount-bpffs container:

2022-10-24 15:41:33.228 [INFO][1] init-best-effort/startup.go 425: Early log level set to info
2022-10-24 15:41:33.237 [INFO][1] init-best-effort/calico-init_linux.go 57: Checking if BPF filesystem is mounted.
2022-10-24 15:41:33.240 [INFO][1] init-best-effort/calico-init_linux.go 69: BPF filesystem is mounted.
2022-10-24 15:41:33.240 [INFO][1] init-best-effort/calico-init_linux.go 92: Checking if cgroup2 filesystem is mounted.
2022-10-24 15:41:33.244 [INFO][1] init-best-effort/calico-init_linux.go 120: Cgroup2 filesystem is not mounted. Trying to mount it...
2022-10-24 15:41:33.245 [INFO][1] init-best-effort/calico-init_linux.go 126: Mount point /run/calico/cgroup is ready for mounting root cgroup2 fs.
2022-10-24 15:41:33.296 [INFO][1] init-best-effort/calico-init_linux.go 135: Mounted root cgroup2 filesystem.

There is no /var/run/cgroup mounted in the container:

$ k -n kube-system exec canal-sbbnn -c calico-node -- /bin/sh -c "ls -la /var/run/cgroup"
ls: cannot access '/var/run/cgroup': No such file or directory
command terminated with exit code 2

On worker node:

$ ls -la /var/run/cgroup
ls: cannot access '/var/run/cgroup': No such file or directory
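
Note that the mount-bpffs log above reports the mount point as /run/calico/cgroup, while the checks above look at /var/run/cgroup. To see where a cgroup2 filesystem is actually mounted, one option is to read /proc/mounts on the node (or inside the container, since each mount namespace has its own view). The following is a minimal sketch, not part of Calico; the helper name cgroup2_mount_points is hypothetical, and it assumes the standard /proc/mounts layout of whitespace-separated fields.

```python
from pathlib import Path
from typing import List


def cgroup2_mount_points(mounts_text: str) -> List[str]:
    """Return the mount points whose filesystem type is cgroup2.

    Hypothetical helper for illustration only; it is not a Calico API.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cgroup2":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    # Reads the mount table of the current mount namespace.
    mounts = Path("/proc/mounts").read_text()
    for point in cgroup2_mount_points(mounts):
        print(point)
```

If /run/calico/cgroup is not listed in the namespace where calico-node runs, that would be consistent with the "no such file or directory" error in the panic above, though it does not by itself explain why the mount is missing.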

Possible Solution

Steps to Reproduce (for bugs)

  1. Follow the official guide: https://projectcalico.docs.tigera.io/maintenance/ebpf/enabling-ebpf

Context

I was trying to switch an existing cluster from the iptables dataplane to eBPF.

Your Environment

  • Canal v3.24.2 provided by this manifest
  • Kubernetes v1.23.9
  • Operating System and version: Debian 11.5, kernel 5.18.0-0.deb11.4-amd64

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

I also encountered this problem on a Rocky 9 system, @mazdakn