cilium: Cannot create CiliumIdentity for ServiceAccount names longer than 63 chars

Bug report

I found this bug while trying out EMR on EKS on a Cilium-enabled EKS 1.20 cluster.

When a service account’s name is longer than 63 characters, it is not possible to start a pod using this service account.

When creating the CiliumIdentity, Cilium sets a label io.cilium.k8s.policy.serviceaccount whose value is the name of the pod's service account. However, Kubernetes label values are limited to 63 characters, while ServiceAccount names (like all object names, which must be valid DNS subdomains) can be up to 253 characters long.
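The mismatch can be confirmed independently of Cilium. The commands below are only an illustration (the printf trick, the 100-character name, and the default service account are mine, not from the original report), and the rejection message is paraphrased from the API server's response:

$ kubectl create sa $(printf 'a%.0s' {1..100})    # accepted: SA names are DNS subdomains (max 253 chars)
$ kubectl label sa default example=$(printf 'a%.0s' {1..100})
# rejected with: metadata.labels: Invalid value: "aaa...": must be no more than 63 characters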

General Information

  • Cilium version: v1.9.7
  • Kernel version: 5.4.117-58.216.amzn2.x86_64
  • Orchestration system version in use: EKS 1.20

How to reproduce the issue

  1. Create a service account with a name longer than 63 characters
$ kubectl create sa emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdnn0lu3ldn86aul757y413dgn7tj9zmkq4tujzz4mzp
  2. Create a pod using this service account
$ cat <<EOT > pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: static-web
  labels:
    role: myrole
  namespace: emr
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP
  serviceAccountName: emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdnn0lu3ldn86aul757y413dgn7tj9zmkq4tujzz4mzp
EOT
$ kubectl apply -f pod.yaml

After a while, the pod fails to start:

$ kubectl describe pod static-web
Name:         static-web
Namespace:    emr
Priority:     0
Node:         ip-10-210-158-23.eu-west-1.compute.internal/10.210.158.23
Start Time:   Thu, 17 Jun 2021 14:03:05 +0200
Labels:       role=myrole
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Pending
IP:           
IPs:          <none>
Containers:
  web:
    Container ID:   
    Image:          nginx
    Image ID:       
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdqrkqt (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdqrkqt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdqrkqt
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From               Message
  ----     ------                  ----                 ----               -------
  Normal   Scheduled               5m29s                default-scheduler  Successfully assigned emr/static-web to ip-10-210-158-23.eu-west-1.compute.internal
  Warning  FailedCreatePodSandBox  3m58s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e2b35de5c07d64130ebc14ee80e9c936350ad0f8973357ae1f8641bb3edf2270" network for pod "static-web": networkPlugin cni failed to set up pod "static-web_emr" network: Unable to create endpoint: Cilium API client timeout exceeded
  Warning  FailedCreatePodSandBox  2m27s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8f7a714de44b928728a21066979bd0a5421d29f9215471b6980c67e817da5924" network for pod "static-web": networkPlugin cni failed to set up pod "static-web_emr" network: Unable to create endpoint: Cilium API client timeout exceeded
  Warning  FailedCreatePodSandBox  56s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c60b6d3993dd53224453ec36877cec1ae3bfcd6334049cb4f76d3620419495da" network for pod "static-web": networkPlugin cni failed to set up pod "static-web_emr" network: Unable to create endpoint: Cilium API client timeout exceeded
  Normal   SandboxChanged          55s (x3 over 3m57s)  kubelet            Pod sandbox changed, it will be killed and re-created.

The Cilium agent logs then show the following:

level=warning msg="Key allocation attempt failed" attempt=10 error="unable to allocate ID 2664 for key [k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdnn0lu3ldn86aul757y413dgn7tj9zmkq4tujzz4mzp k8s:io.kubernetes.pod.namespace=emr k8s:role=myrole]: CiliumIdentity.cilium.io \"2664\" is invalid: metadata.labels: Invalid value: \"emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdnn0lu3ldn86aul757y413dgn7tj9zmkq4tujzz4mzp\": must be no more than 63 characters" key="[k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=emr-containers-sa-spark-executor-123456789012-h94a5lkq1wmdnn0lu3ldn86aul757y413dgn7tj9zmkq4tujzz4mzp k8s:io.kubernetes.pod.namespace=emr k8s:role=myrole]" subsys=allocator

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 4
  • Comments: 18 (7 by maintainers)

Most upvoted comments

We’ve also run into this issue (outside of EMR/EKS).

While one could say that this is an unfortunate upstream Kubernetes restriction, I believe it is on Cilium to address: it seems to put service account names (which can be up to 253 characters) into label values (which must not exceed 63 characters). Asking Kubernetes to change the limit would probably be quite an undertaking, and asking every client to adjust does not scale well when you have thousands of Cilium users. Does Cilium actually select on these labels? I assume it does; if not, putting service account names into annotations could be an alternative.

(Side note: this issue is fairly hard to detect just by looking at the pod events, which only surface a Cilium API client timeout. A welcome drive-by improvement would be to surface the underlying error more directly.)

When Cilium creates Identities with labels, the label content can be matched in policy. By including the Kubernetes service account as a label in the Identity, (a) Identities for applications with otherwise similar labels are differentiated in policy by the ServiceAccount, and (b) users can write policies that match on that label in order to allow traffic based on the ServiceAccount, in addition to or as an alternative to other application labels.
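To make (b) concrete, a policy along these lines is possible (this is only a sketch in the style of the Cilium documentation; the policy name and the short service account name are illustrative, and only the io.cilium.k8s.policy.serviceaccount label key comes from this issue):

$ cat <<EOT > policy.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-from-spark-executor   # illustrative name
  namespace: emr
spec:
  endpointSelector:
    matchLabels:
      role: myrole                  # the pod from the reproduction above
  ingress:
    - fromEndpoints:
        # Only endpoints whose identity carries this service-account label may connect.
        - matchLabels:
            io.cilium.k8s.policy.serviceaccount: some-short-sa   # hypothetical service account
EOT
$ kubectl apply -f policy.yaml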

If this is a sticking point for some users, and those users do not wish to write policies based on ServiceAccounts, then I think the Cilium community would be open to PRs proposing a way to disable ServiceAccount population in Identities, for example behind a flag.

Another option to explore might be to see whether the existing flags to limit Identity-relevant labels would provide a way to remove these labels. I’m not sure if it works that way, but it could be worth investigating.
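For anyone who wants to investigate, the knob I have in mind appears to be the agent's label filter (the labels key in the cilium-config ConfigMap, also exposed as a Helm value and as the --labels agent flag). The snippet below is only an unverified sketch: the allow-list is made up, the exact filter syntax should be checked against the documentation for your Cilium version, and dropping the serviceaccount label also removes the ability to write policies that match on it.

$ kubectl -n kube-system edit configmap cilium-config
# Under data:, add an explicit allow-list of identity-relevant label prefixes that
# leaves out io.cilium.k8s.policy.serviceaccount, then restart the Cilium agents,
# for example:
#
#   labels: "k8s:io.kubernetes.pod.namespace k8s:app k8s:role"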

@joestringer A couple of days ago I spoke with AWS engineers about this issue, and it seems they were not aware of it. I raised a support case with AWS; maybe they will consider making the service account names shorter for EMR.