kubernetes: Windows - FailedCreatePodSandBox: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to reserve sandbox name

What happened?

Windows e2e tests flake due to the sandbox name being reserved for another container:

Jan  5 22:35:37.117: INFO: At 2022-01-05 22:27:52 +0000 UTC - event for ss2-1: {kubelet capz-conf-l8wfg} FailedCreatePodSandBox: Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Jan  5 22:35:37.118: INFO: At 2022-01-05 22:27:53 +0000 UTC - event for ss2-1: {kubelet capz-conf-l8wfg} FailedCreatePodSandBox: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to reserve sandbox name "ss2-1_statefulset-8250_f1c4daf1-6df3-453a-

First reported in https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/1932#issuecomment-1006193061, then observed in CI: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-capz-master-containerd-windows/1481857931363225600

The same error is also reported in https://github.com/kubernetes/kubernetes/issues/88153

What did you expect to happen?

Pod sandboxes should be created successfully, so the pods do not fail to start.

How can we reproduce it (as minimally and precisely as possible)?

No reliable reproduction yet. It seems to be related to resource contention when multiple containers are started at the same time.

Anything else we need to know?

This looks similar to the issue reported in https://github.com/containerd/containerd/issues/4604

The analysis in that report suggests that the kubelet may send a duplicate request while the previous request is still being processed: https://github.com/containerd/containerd/issues/4604#issuecomment-1013268187
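
To make the suspected sequence concrete, here is a minimal sketch of how a duplicate request could produce the second error in the log above. It is a toy model and not containerd's or the kubelet's actual code: `NameRegistry`, `simulate_duplicate_request`, the sandbox name, and the IDs are all hypothetical, and it assumes that a name can be held by only one sandbox ID at a time and is released once the first attempt is cleaned up.

```python
class NameReservedError(Exception):
    """Raised when a name is already held by a different owner ID."""


class NameRegistry:
    """Toy registry (hypothetical): each name maps to at most one owner ID."""

    def __init__(self):
        self._name_to_id = {}

    def reserve(self, name, owner_id):
        current = self._name_to_id.get(name)
        if current is not None and current != owner_id:
            raise NameReservedError(
                f"failed to reserve sandbox name {name!r}: "
                f"name is reserved for {current!r}"
            )
        self._name_to_id[name] = owner_id

    def release(self, name):
        # Releasing an unknown name is a no-op.
        self._name_to_id.pop(name, None)


def simulate_duplicate_request():
    """Replay the suspected sequence and return a list of event strings."""
    registry = NameRegistry()
    name = "example-pod_example-ns_example-uid_0"  # hypothetical sandbox name
    events = []

    # 1. The first request reserves the name and starts slow work.
    registry.reserve(name, "sandbox-a")
    events.append("first request: name reserved, work still in progress")

    # 2. The caller's deadline expires, so it retries. The first attempt
    #    still holds the name, and the retry carries a new ID.
    events.append("caller: DeadlineExceeded, retrying")
    try:
        registry.reserve(name, "sandbox-b")
        events.append("retry: name reserved")
    except NameReservedError as err:
        events.append(f"retry: {err}")

    # 3. Assumption: once the first attempt is cleaned up, the name is
    #    released and a later retry can succeed.
    registry.release(name)
    registry.reserve(name, "sandbox-c")
    events.append("later retry: name reserved")
    return events


if __name__ == "__main__":
    for event in simulate_duplicate_request():
        print(event)
```

If this reading is right, the `DeadlineExceeded` event and the `failed to reserve sandbox name` event in the log are two symptoms of the same slow sandbox creation rather than two independent failures.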

Kubernetes version

1.24+

Cloud provider

Azure

OS version

Windows Server 2019

Install tools

Cluster API Provider Azure (CAPZ)

Container runtime (CRI) and version (if applicable)

containerd 1.6.beta.4

Related plugins (CNI, CSI, …) and versions (if applicable)

Calico 3.20.0

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 21 (15 by maintainers)

Most upvoted comments

@jsturtevant applying the latest patch version fixed this.