kubernetes: envFrom not working in ephemeral containers: failed to sync secret cache
What happened?
I have the following deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytarget
spec:
  selector:
    matchLabels:
      mylabel: mytarget
  replicas: 2
  template:
    metadata:
      labels:
        mylabel: mytarget
    spec:
      serviceAccountName: default
      containers:
        - name: mytarget
          image: <custom-image>
          command: ["sleep", "100d"]
```
I am working on an automated script (written in Go) that, among other things, does the following:
- Creates a Secret resource:

```go
s := apiv1.Secret{
	ObjectMeta: metav1.ObjectMeta{
		Name:      name,
		Namespace: metav1.NamespaceDefault,
	},
	StringData: map[string]string{
		"username": "foo",
		"password": "bar",
	},
}
_, err := clientset.CoreV1().Secrets(metav1.NamespaceDefault).Create(ctx, &s, metav1.CreateOptions{})
```
- Lists the pods and adds one ephemeral container to each of them, with the Secret exposed as environment variables via `envFrom`:
```go
pod.Spec.EphemeralContainers = append(pod.Spec.EphemeralContainers, apiv1.EphemeralContainer{
	TargetContainerName: targetContainerName,
	EphemeralContainerCommon: apiv1.EphemeralContainerCommon{
		Image:   imageURL,
		Name:    ephemeralContainerName,
		Command: getCommand(),
		EnvFrom: []apiv1.EnvFromSource{
			{
				SecretRef: &apiv1.SecretEnvSource{
					LocalObjectReference: apiv1.LocalObjectReference{
						Name: name,
					},
					Optional: nil,
				},
			},
		},
		SecurityContext: &apiv1.SecurityContext{
			Privileged: &[]bool{true}[0],
		},
	},
})
clientset.CoreV1().Pods(metav1.NamespaceDefault).UpdateEphemeralContainers(
	ctx, pod.Name, &pod, metav1.UpdateOptions{},
)
```
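Note that `UpdateEphemeralContainers` returns as soon as the API server accepts the update, not once the kubelet actually starts the container, so a script has to poll the pod status to notice failures like this one. A minimal sketch of that check, using stand-in types in place of the real `k8s.io/api/core/v1` structs (`waitingReason` is a hypothetical helper; in the real script the statuses come from `pod.Status.EphemeralContainerStatuses`):

```go
package main

import "fmt"

// Stand-in types mirroring the fields of corev1.ContainerStatus that
// matter here; the real script would use k8s.io/api/core/v1.
type ContainerStateWaiting struct {
	Reason string
}

type ContainerState struct {
	Waiting *ContainerStateWaiting
}

type ContainerStatus struct {
	Name  string
	State ContainerState
}

// waitingReason returns the Waiting reason of the named ephemeral
// container, or "" if the container is not in the Waiting state.
func waitingReason(statuses []ContainerStatus, name string) string {
	for _, s := range statuses {
		if s.Name == name && s.State.Waiting != nil {
			return s.State.Waiting.Reason
		}
	}
	return ""
}

func main() {
	statuses := []ContainerStatus{{
		Name:  "ephemeral1234567",
		State: ContainerState{Waiting: &ContainerStateWaiting{Reason: "CreateContainerConfigError"}},
	}}
	fmt.Println(waitingReason(statuses, "ephemeral1234567"))
}
```

In the real script this check would sit in a poll loop (e.g. with `k8s.io/apimachinery/pkg/util/wait`) after the `UpdateEphemeralContainers` call.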
When I run it, the ephemeral container is created but never started: it is left in the `Waiting` state with reason `CreateContainerConfigError`.
```
Ephemeral Containers:
  ephemeral1234567:
    Container ID:
    Image:          <image>
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:        <command>
    State:          Waiting
      Reason:       CreateContainerConfigError
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      secret-name  Secret  Optional: false
    Environment:   <none>
    Mounts:        <none>
```
The code seems to be correct since I can see the Secret is referenced in the Environment Variables.
Checking the events, I can see the following:

| Type | Reason | Age | From | Message |
|---|---|---|---|---|
| Warning | Failed | 3s | kubelet | Error: failed to sync secret cache: timed out waiting for the condition |
It seems the kubelet's secret cache is not yet synced when an ephemeral container is added to a pod, so the container creation fails because the Secret is not present in the cache.
Based on that assumption, I tried referencing the Secret from the regular container defined in the Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytarget
spec:
  selector:
    matchLabels:
      mylabel: mytarget
  replicas: 2
  template:
    metadata:
      labels:
        mylabel: mytarget
    spec:
      serviceAccountName: default
      containers:
        - name: mytarget
          image: <custom-image>
          command: ["sleep", "100d"]
          envFrom:
            - secretRef:
                name: secret-name
```
With this change, the same code that creates the ephemeral container now runs successfully, and the container starts with the Secret exposed as environment variables.
```
exp123456:
  Container ID:   containerd://2398e96b9add961fd292d127d9e383b98c0c88100f87e0b8c7a8ff46aa1becdc
  Image:          <image>
  Image ID:       <imageId>
  Port:           <none>
  Host Port:      <none>
  Command:        <command>
  State:          Running
    Started:      Wed, 11 Jan 2023 13:28:57 +0000
  Ready:          False
  Restart Count:  0
  Environment Variables from:
    manual-secret  Secret  Optional: false
  Environment:    <none>
  Mounts:         <none>
```
What did you expect to happen?
I would expect the ephemeral container to be created and successfully started with the appropriate environment variable containing the data in the secret.
How can we reproduce it (as minimally and precisely as possible)?
Using the code provided in the explanation of the issue should be enough to reproduce the problem.
Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytarget
spec:
  selector:
    matchLabels:
      mylabel: mytarget
  replicas: 2
  template:
    metadata:
      labels:
        mylabel: mytarget
    spec:
      serviceAccountName: default
      containers:
        - name: mytarget
          image: <custom-image>
          command: ["sleep", "100d"]
```
Secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: manual-secret
type: Opaque
stringData:
  token: test1234
```
Request to update the Pod to create an ephemeral container:

```
PATCH http://localhost:8080/api/v1/namespaces/{namespace}/pods/{podname}/ephemeralcontainers
```

Add the following ephemeral container:
```yaml
ephemeralContainers:
  - name: ephemeral-uid-123456
    image: nginx
    command:
      - sleep
      - 100d
    envFrom:
      - secretRef:
          name: manual-secret
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    imagePullPolicy: IfNotPresent
    securityContext:
      privileged: true
    targetContainerName: mytarget
```
Anything else we need to know?
No response
Kubernetes version
```console
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:58:30Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.7-eks-fb459a0", GitCommit:"c240013134c03a740781ffa1436ba2688b50b494", GitTreeState:"clean", BuildDate:"2022-10-24T20:36:26Z", GoVersion:"go1.18.7", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.26) and server (1.24) exceeds the supported minor version skew of +/-1
```
Cloud provider
OS version
```console
# On Linux:
$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.17.1
PRETTY_NAME="Alpine Linux v3.17"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
$ uname -a
Linux pod-6bb864f684-tl5l7 5.4.219-126.411.amzn2.x86_64 #1 SMP Wed Nov 2 17:44:17 UTC 2022 x86_64 Linux
```
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 3
- Comments: 20 (14 by maintainers)
Hello @enj! 👋🏼 Bug triage for the 1.29 release cycle is here! This issue hasn't been updated in a long time, so I wanted to check on its status. Code freeze starts at 01:00 UTC on Wednesday, 1 November 2023 (18:00 PDT, Tuesday, 31 October 2023), which is this week. We want to make sure every PR has a chance to be merged on time.
As the issue is tagged for 1.29, is it still planned for that release?
Just a wild guess: could this be related to this bug: https://github.com/kubernetes/kubernetes/issues/114167?