cilium: Kube-proxy replacement not working when running without privileges
Is there an existing issue for this?
- I have searched the existing issues
What happened?
I’m running this setup:
- Single node Kubernetes cluster based on Talos Linux (Talos Linux 1.2.3 with Kubernetes 1.25.0, installed without kube-proxy and without CNI)
- Cilium installed using Helm (the install is this involved because SYS_MODULE is not available on Talos Linux):
$ cat <<PR > postrend.sh
#!/bin/sh
set -e
# Helm pipes the rendered manifests to stdin; save them for kustomize.
cat <&0 > base.yaml
kubectl kustomize .
PR
$ chmod +x postrend.sh
$ cat <<PATCH > caps-patch.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cilium
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: cilium-agent
        securityContext:
          capabilities:
            add:
            - CHOWN
            - KILL
            - NET_ADMIN
            - NET_RAW
            - IPC_LOCK
            - SYS_RESOURCE
            - PERFMON
            - BPF
            - DAC_OVERRIDE
            - FOWNER
            - SETGID
            - SETUID
            - SYS_ADMIN
      initContainers:
      - name: clean-cilium-state
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - SYS_RESOURCE
            - PERFMON
            - BPF
PATCH
$ cat <<KUST > kustomization.yaml
resources:
- base.yaml
patchesStrategicMerge:
- caps-patch.yaml
KUST
$ cat <<VALUES > values.yaml
k8sServiceHost: <your loadbalancer IP>
k8sServicePort: 6443
kubeProxyReplacement: strict
operator:
  replicas: 1
securityContext:
  extraCapabilities:
  - PERFMON
  - BPF
  privileged: false
VALUES
$ helm upgrade --install -n kube-system --version 1.12.2 -f values.yaml --post-renderer ./postrend.sh cilium cilium/cilium
In my case, coredns is reliably crashlooping or otherwise unhappy about not being able to reach the kubernetes API. This can further be debugged by launching a pod that does something like curl https://10.96.0.1:443/ (the ClusterIP Service default/kubernetes).
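The curl check described above could be scripted roughly as follows. This is only a sketch: the pod name `api-probe`, the `curlimages/curl` image, and the `probe_api_vip` helper are assumptions for illustration, not part of the original setup. Any HTTP status code in the output means the ClusterIP was translated and the API server answered; a timeout (curl prints `000`) points at the service path.

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# probe_api_vip is a hypothetical helper name.
probe_api_vip() {
  # $1: ClusterIP of the default/kubernetes Service (10.96.0.1 in this report)
  vip="${1:-10.96.0.1}"
  # Run a throwaway pod that requests the ClusterIP and prints the HTTP status.
  # curlimages/curl is an assumed image; any image that ships curl would do.
  $KUBECTL run api-probe --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    curl -sk --max-time 5 -o /dev/null -w '%{http_code}\n' "https://${vip}:443/"
}
```

Calling `probe_api_vip 10.96.0.1` once with `privileged: false` and once with `privileged: true` should make the difference visible without waiting for coredns to crashloop.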
Funny thing is that when I change securityContext.privileged to true and upgrade the chart, the problem vanishes.
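For anyone reproducing this, the toggle can be done without editing values.yaml by overriding the value on the command line. A minimal sketch that reuses the install command from above; the `set_privileged` helper name is made up for this example:

```shell
#!/bin/sh
# HELM can be overridden (e.g. HELM=echo) to preview the command.
HELM="${HELM:-helm}"

# set_privileged is a hypothetical helper name.
set_privileged() {
  # $1: "true" or "false"; --set takes precedence over the value in values.yaml
  $HELM upgrade --install -n kube-system --version 1.12.2 \
    -f values.yaml --set "securityContext.privileged=$1" \
    --post-renderer ./postrend.sh cilium cilium/cilium
}
```

`set_privileged true` corresponds to the working configuration and `set_privileged false` to the broken one.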
Inter-Pod communication works in both situations, even between nodes, as long as you connect to Pod IP and not through a service.
Cilium Version
v1.12.2
Kernel Version
5.15.68-talos
Kubernetes Version
v1.25.0
Sysdump
cilium-sysdump-20221006-140816.zip
Relevant log output
No response
Anything else?
I’ve upgraded to 1.13.0-rc1 without effect.
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- State: closed
- Created 2 years ago
- Reactions: 2
- Comments: 38 (34 by maintainers)
@rio This will be fixed by https://github.com/cilium/cilium/pull/23953
@aanm do the logs provided above give any hint? Is any other information needed?