kubernetes: Stop timeout isn't respected at shutdown/reboot

What happened: On node reboot or shutdown, systemd terminates containers without respecting the terminationGracePeriodSeconds set in the pod YAML.

What you expected to happen: terminationGracePeriodSeconds is respected by systemd when using systemd as the cgroup manager.

How to reproduce it (as minimally and precisely as possible):

  1. Use systemd as the cgroup manager in your container runtime.
  2. Create a pod yaml with terminationGracePeriodSeconds set to 120 seconds.
  3. Reboot the node.
  4. The containers receive SIGTERM, systemd waits only its default stop timeout (typically 90 seconds) rather than the 120 seconds requested, and then sends SIGKILL.
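A minimal pod spec for step 2 might look like the following (the pod name and image are placeholders; any long-running image reproduces the issue):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown-test   # placeholder name
spec:
  # Ask for 120 seconds of grace on termination; on reboot, systemd's
  # default stop timeout (typically 90s) is applied instead.
  terminationGracePeriodSeconds: 120
  containers:
  - name: app                    # placeholder container name
    image: nginx                 # placeholder long-running image
```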

Anything else we need to know?: This can be fixed by passing the stop timeout to the container runtime as part of the CreateContainer CRI API. The runtime could then set the TimeoutStopUSec systemd property on the container's scope, overriding the default stop timeout with the value from terminationGracePeriodSeconds. This requires changes across the stack, since runc currently provides no way to set TimeoutStopUSec on the systemd scope it creates for a container. The behavior with the cgroupfs cgroup manager will need further investigation.
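One detail the proposed fix has to get right is units: terminationGracePeriodSeconds is in seconds, while systemd's TimeoutStopUSec property is in microseconds. A minimal sketch of the conversion a runtime would perform before setting the scope property (the function name and the flow described in the comments are illustrative, not the actual kubelet/runc code):

```go
package main

import (
	"fmt"
	"time"
)

// gracePeriodToUSec converts a pod's terminationGracePeriodSeconds into
// the microsecond value that systemd's TimeoutStopUSec property expects.
func gracePeriodToUSec(gracePeriodSeconds int64) uint64 {
	return uint64(time.Duration(gracePeriodSeconds) * time.Second / time.Microsecond)
}

func main() {
	// Hypothetical flow: the kubelet passes the pod's grace period down
	// through the CRI CreateContainer request, and the runtime sets the
	// converted value as TimeoutStopUSec on the container's systemd scope,
	// so systemd waits the full grace period before escalating to SIGKILL.
	fmt.Printf("TimeoutStopUSec=%d\n", gracePeriodToUSec(120))
}
```

Without this override, systemd falls back to its global DefaultTimeoutStopSec, which is why the containers in the reproduction above are killed after ~90 seconds instead of 120.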

Environment:

  • Kubernetes version (use kubectl version): All versions.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 23 (16 by maintainers)

Most upvoted comments

/reopen

While this impacts all pods, it is particularly an issue for static pods and DaemonSet-backed pods, which typically are not drained before a maintenance action.

/milestone v1.15