kubernetes: /dev/shm is not cleared across pod restarts

What happened: When a pod is restarted (i.e. its container is restarted in place, e.g. after a failing liveness probe or a PID 1 exit), the files created in /dev/shm aren't cleared the way they would be if the pod were deleted and recreated.

What you expected to happen: Regardless of how the container starts, /dev/shm should always start in a consistent state (i.e. empty).

How to reproduce it (as minimally and precisely as possible): This deployment touches a new file in /dev/shm and restarts roughly every minute; you can see files from previous containers still hanging around after each restart:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: shm-test
spec:
  selector:
    matchLabels:
      app: shm-test
  template:
    metadata:
      labels:
        app: shm-test
    spec:

      containers:
      - name: csp-gateway
        image: "centos:7"
        imagePullPolicy: Always

        command: 
        - /bin/sh

        args:
        - -c
        - 'set -x; ls -la /dev/shm; touch "/dev/shm/$(date)"; sleep 60'

        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - "exit 1"
          initialDelaySeconds: 50
          periodSeconds: 3
          failureThreshold: 3
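
Assuming the manifest is saved as shm-test.yaml (the filename is just for illustration), it can be applied and watched with:

kubectl apply -f shm-test.yaml
kubectl get po -w
kubectl logs deployment/shm-test -f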

Example output:

# First run, /dev/shm is clear
PS C:\Users\Seb> kubectl get po
NAME                       READY   STATUS    RESTARTS   AGE
shm-test-676f6bb69-nkxxb   1/1     Running   0          23s
PS C:\Users\Seb> kubectl logs deployment/shm-test -f
+ ls -la /dev/shm
total 0
drwxrwxrwt 2 root root  40 Aug  6 02:28 .
drwxr-xr-x 5 root root 360 Aug  6 02:28 ..
++ date
+ touch '/dev/shm/Tue Aug  6 02:28:59 UTC 2019'
+ sleep 60

# Second run, /dev/shm still has the "shared memory" we put in there in the previous container
PS C:\Users\Seb> kubectl get po
NAME                       READY   STATUS    RESTARTS   AGE
shm-test-676f6bb69-nkxxb   1/1     Running   1          98s
PS C:\Users\Seb> kubectl logs deployment/shm-test -f
+ ls -la /dev/shm
total 0
drwxrwxrwt 2 root root  60 Aug  6 02:28 .
drwxr-xr-x 5 root root 360 Aug  6 02:30 ..
-rw-r--r-- 1 root root   0 Aug  6 02:28 Tue Aug  6 02:28:59 UTC 2019
++ date
+ touch '/dev/shm/Tue Aug  6 02:30:08 UTC 2019'
+ sleep 60

# Third run, the first two "shared memory" files still exist, this will continue..
PS C:\Users\Seb> kubectl get po
NAME                       READY   STATUS    RESTARTS   AGE
shm-test-676f6bb69-nkxxb   1/1     Running   2          3m22s
PS C:\Users\Seb> kubectl logs deployment/shm-test -f
+ ls -la /dev/shm
total 0
drwxrwxrwt 2 root root  80 Aug  6 02:30 .
drwxr-xr-x 5 root root 360 Aug  6 02:31 ..
-rw-r--r-- 1 root root   0 Aug  6 02:28 Tue Aug  6 02:28:59 UTC 2019
-rw-r--r-- 1 root root   0 Aug  6 02:30 Tue Aug  6 02:30:08 UTC 2019
++ date
+ touch '/dev/shm/Tue Aug  6 02:31:18 UTC 2019'
+ sleep 60
...

Anything else we need to know?: We initially found this problem because a piece of software was trying to use shared memory left over from a previous container and was crashing in unexpected ways.

The vendor is addressing that on their side, but the fact that /dev/shm isn't cleared is still surprising, since one of the main selling points of containers is that they always start from the same known clean slate (except for explicitly mounted volumes, of course).

As a workaround, adding rm -rf /dev/shm/* to our entrypoint script works ok.
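
A minimal sketch of that workaround wired into a container spec like the one above, assuming the real application entrypoint lives at /path/to/real-entrypoint (a placeholder, not from the original report):

        command:
        - /bin/sh
        args:
        - -c
        # Clear any leftover shared memory, then exec the real process
        # (/path/to/real-entrypoint is a placeholder for the actual entrypoint).
        - 'rm -rf /dev/shm/*; exec /path/to/real-entrypoint'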

I found this in our production environment on v1.12.8 and also reproduced it on the latest version, v1.15.0, in minikube.

Environment:

  • Kubernetes version (use kubectl version): v1.12.8, v1.15.0
  • Cloud provider or hardware configuration: EC2, Hyper-V
  • OS (e.g: cat /etc/os-release): Debian GNU/Linux 9 (stretch), Buildroot 2018.05.3
  • Kernel (e.g. uname -a): Linux ip-172-20-32-242 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1 (2019-04-12) x86_64 GNU/Linux, Linux minikube 4.15.0 #1 SMP Sun Jun 23 23:02:01 PDT 2019 x86_64 GNU/Linux
  • Install tools: Kops, Minikube
  • Network plugin and version (if this is a network-related bug): n/a
  • Others: n/a

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 29 (17 by maintainers)

Most upvoted comments

I too have been caught out by this. It would be good to see this fixed and have consistency:

What you expected to happen: Regardless of how the container starts, /dev/shm should always start in a consistent state (ie. empty)

Containers in a Pod share the same IPC namespace, which uses /dev/shm. So I think it makes sense that /dev/shm is not cleared across container restarts
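
One quick way to check that sharing on a given cluster is a two-container pod where one container writes into /dev/shm and the other lists it (the pod name, container names, and sleep intervals here are purely illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: shm-share-test
spec:
  containers:
  - name: writer
    image: "centos:7"
    command: ["/bin/sh", "-c"]
    # Drop a file into /dev/shm, then stay alive so the pod keeps running.
    args: ["touch /dev/shm/from-writer; sleep 3600"]
  - name: reader
    image: "centos:7"
    command: ["/bin/sh", "-c"]
    # List /dev/shm periodically; the writer's file shows up here if the mount is shared.
    args: ["while true; do ls -la /dev/shm; sleep 10; done"]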

Need to have a look at it later. /assign @adisky