kubernetes: Log something about OOMKilled containers
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
What happened:
Container gets killed because it tries to use more memory than allowed.
What you expected to happen:
Have an OOMKilled event tied to the pod, and logs about this.
/sig node
About this issue
- State: open
- Created 6 years ago
- Reactions: 106
- Comments: 72 (25 by maintainers)
This has been discussed in #sig-instrumentation on Slack and was brought up on the sig-node call yesterday to determine a path forward.
There are two requests:
To summarize what’s currently available in kube-state-metrics:
- `kube_pod_container_status_terminated_reason`: a (binary) gauge which has a value of 1 for the current reason, and 0 for all other reasons. As soon as the Pod restarts, all reasons go to 0.
- `kube_pod_container_status_last_terminated_reason`: the same as above but for the prior reason, so it’s available after the Pod restarts.
- `kube_pod_container_status_restarts_total`: a count of the restarts, with no detail on the reason.

The issues are:
For example, given a Pod that is sometimes being OOMKilled and sometimes crashing, it’s desirable to be able to view the historical termination reasons over time.
As a note: it was discussed, and it appears the design of kube-state-metrics prevents aggregating the reason gauge into counters, so it’s preferred that this happen at the source.
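To make the gap concrete, here is an illustrative look at what those three series expose today (the label values below are hypothetical); none of them keep a history of termination reasons across restarts:

```
# Illustrative kube-state-metrics series (hypothetical labels/values).

# 1 only while "OOMKilled" is the current termination reason; back to 0 on restart.
kube_pod_container_status_terminated_reason{namespace="default", pod="app-0", container="app", reason="OOMKilled"}

# 1 when the previous termination was an OOM kill; survives the restart,
# but still only reflects the single most recent termination.
kube_pod_container_status_last_terminated_reason{namespace="default", pod="app-0", container="app", reason="OOMKilled"}

# Monotonic restart counter with no reason label at all.
kube_pod_container_status_restarts_total{namespace="default", pod="app-0", container="app"}
```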
Implementing both of the above requests will significantly improve the ability of cluster-users and monitoring vendors to debug when Pods are failing.
Can @kubernetes/sig-node-feature-requests provide some guidance on the next steps here?
CC: @dchen1107
This query combines container restart and termination reason:
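A sketch of a PromQL expression along those lines (the 10m window and the exact label matching are assumptions, not necessarily the query meant here) could be:

```
# Containers that restarted in the last 10 minutes AND whose last
# termination reason (per kube-state-metrics) was OOMKilled.
increase(kube_pod_container_status_restarts_total[10m]) > 0
and on (namespace, pod, container)
kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1
```

The `and on (...)` keeps a restart series only when a matching OOMKilled last-termination series exists for the same container.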
Our team came up with a custom controller to implement the idea of having an OOMKilled event tied to the Pod. Please find it here: https://github.com/xing/kubernetes-oom-event-generator
From the README: The Controller listens to the Kubernetes API for “Container Started” events and searches for those claiming they were OOMKilled previously. For matching ones, an Event is generated as `Warning` with the reason `PreviousContainerWasOOMKilled`.

We would be very happy to get feedback on it.
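As a purely hypothetical sketch of what that produces (not output copied from the tool), the generated Event attached to the Pod would look roughly like:

```
# Hypothetical sketch of the Event object the controller creates.
apiVersion: v1
kind: Event
metadata:
  name: app-0.prev-oomkill      # illustrative
  namespace: default
type: Warning
reason: PreviousContainerWasOOMKilled
message: "Previous instance of container 'app' appears to have been OOMKilled"   # illustrative wording
involvedObject:
  apiVersion: v1
  kind: Pod
  name: app-0                   # illustrative
  namespace: default
```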
Indeed, it seems to work 😃
@brancz do you know why this happens? Also tried it in 1.3.1.
Now that #87856 is closed, what is the best way to alert on OOMKilled containers?
@lukeschlather #100487 should cover the logging and the OOM event being created for the associated pod, which is what you’re looking for.
/remove-lifecycle stale
/remove-lifecycle stale
Thanks, this seems to work fine for my use case. It throws an alert on container OOM events and resolves the alert directly afterwards.
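A sketch of a Prometheus alerting rule with that behaviour (the group/alert names, the 5m window, and the expression are assumptions rather than the exact rule referred to above):

```
groups:
  - name: oom-kills                      # placeholder group name
    rules:
      - alert: ContainerOOMKilled        # placeholder alert name
        # Fires when a container restarted recently and its last termination
        # reason was OOMKilled.
        expr: |
          changes(kube_pod_container_status_restarts_total[5m]) > 0
          and on (namespace, pod, container)
          kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) was OOMKilled"
```

Because the expression only matches while the restart sits inside the 5m `changes()` window, the alert clears on its own shortly afterwards.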
/remove-lifecycle stale
Is there a good way of probing for OOMKilled? My use case is that I want to detect OOMs and trigger actions based on them. Thanks!
/remove-lifecycle rotten
I still think this should be more properly addressed.
@bjhaid fwiw you can use mtail against `dmesg` to produce metrics about oomkill messages.

The problem here is that a pod can disappear and there’s no record of why. A metric is useful in that it lets you know something is wrong, but it doesn’t actually tell you what is wrong. K8s shouldn’t be killing pods without leaving a record, in an obvious place, of which pod it killed and why.
/remove-lifecycle stale
There’s an in-progress PR about this now: https://github.com/kubernetes/kubernetes/pull/87856
@anderson4u2 I am a bit confused by your last comment. You wrote:
But in the example below you use `kube_pod_container_status_terminated_reason`, not `kube_pod_container_status_last_terminated_reason`.

So as far as I see, the new (very useful) metric `kube_pod_container_status_last_terminated_reason` is still unreleased.

/remove-lifecycle stale
Is this still relevant after https://github.com/kubernetes/kubernetes/pull/108004? It seems to me that it covers the gaps kube-state-metrics has around OOMKilled events.
https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-limits-are-run
@lukeschlather ⬆️
Are the memory requests and limits just cgroups under the hood?
@lukeschlather for the record, the kernel kills pods, not k8s. That’s the whole problem with this issue 😦
Please google for “oom kill kernel”.
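On the cgroup question above: as far as I understand, yes. The memory limit is enforced as a cgroup memory limit on the node, the kernel OOM killer does the actual killing, and the kubelet then surfaces that as an OOMKilled container state. A minimal sketch (names and sizes are illustrative):

```
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo          # illustrative
spec:
  containers:
    - name: app
      image: nginx           # illustrative
      resources:
        requests:
          memory: "64Mi"     # used for scheduling and QoS classification
        limits:
          memory: "128Mi"    # written into the container's memory cgroup by the
                             # runtime (memory.max on cgroup v2, memory.limit_in_bytes
                             # on cgroup v1); exceeding it triggers the kernel OOM
                             # killer, and the container then shows up as OOMKilled
```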
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After further inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After further inactivity once lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
What is the component that actually OOM-kills the container for going over the memory limit? Can that component simply log something? Where would that log go in GKE? The Kubernetes apiserver logs? The node logs?
It seems like a lot of the issues related to this one get bogged down in how to deal with pathological cases (stuff getting killed by the kernel rather than simply getting killed for going over its limit). Also, I want an event, but if it’s going to be another 2 years before someone can figure out how to properly generate an event, I would settle for logging anything anywhere at all.
To the best of my knowledge there is so far no built-in way for GKE.

We are using https://github.com/xing/kubernetes-oom-event-generator in combination with alerting on a metric. Just be aware: this only works if the main process is killed and the Pod gets evicted. If a subprocess (like a gunicorn worker) is nuked, you need to rely on the logging of your running application. See e.g. https://github.com/benoitc/gunicorn/pull/2475

https://github.com/xing/kubernetes-oom-event-generator may be helpful