kubernetes: node is NotReady because kubelet reports a meaningful conflict error

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

The kubelet reports an error, and kubectl get node shows the node as NotReady. The kubelet log contains:

    there is a meaningful conflict (firstResourceVersion: "104201", currentResourceVersion: "4293"):
    diff1={"metadata":{"resourceVersion":"4293"},"status":{"conditions":[{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"Ready"}]}}
    diff2={"status":{"conditions":[{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has no disk pressure","reason":"KubeletHasNoDiskPressure","status":"False","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient memory available","reason":"KubeletHasSufficientMemory","status":"False","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient disk space available","reason":"KubeletHasSufficientDisk","status":"False","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet is posting ready status","reason":"KubeletReady","status":"True","type":"Ready"}],"nodeInfo":{"gpus":[]}}}
    E0104 17:31:09.779522 7223 kubelet_node_status.go:318] Unable to update node status: update node status exceeds retry count

Note that currentResourceVersion (4293) is lower than firstResourceVersion (104201), i.e. the API server appears to be returning an older version of the Node object than the kubelet last saw.
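For anyone debugging the same symptom, the resourceVersion the API server currently holds for the node can be read directly and compared against the values in the log (a minimal sketch; <node-name> is a placeholder):

    # resourceVersion the API server currently holds for the node
    kubectl get node <node-name> -o jsonpath='{.metadata.resourceVersion}'

    # full node object, including status.conditions, for comparison with the diffs above
    kubectl get node <node-name> -o yaml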

What you expected to happen:

I found a PR about this issue, #44788, which has been cherry-picked to 1.6. I want to know why this issue still happens.

How to reproduce it (as minimally and precisely as possible):

This issue is intermittent, but I found another issue that hits the same problem, #52498; the reporter worked around it by using etcd 3.1.10 instead of 3.2.7. When I change the leader of the etcd cluster, this issue goes away (a sketch of how to inspect the etcd leader is shown below). Is this a known incompatibility between Kubernetes and etcd, or an etcd bug?
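To see which etcd member currently leads the cluster before and after moving leadership, something like the following should work (a sketch, assuming the v3 API is available in your etcdctl build and that the endpoints are replaced with your members' client URLs):

    # the 'IS LEADER' column shows which member currently leads the cluster
    ETCDCTL_API=3 etcdctl \
      --endpoints=https://etcd-0:2379,https://etcd-1:2379,https://etcd-2:2379 \
      endpoint status --write-out=table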

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.6.9
  • Etcd version: 3.0.17

Most upvoted comments

I’ve seen several of these bug reports, and am hitting the same problem myself, but they all get ignored.

Here is the work-around to restore the node (concrete commands are sketched after the list):

  1. SSH onto the affected node (somehow)
  2. Stop the kubelet (systemctl stop kubelet).
  3. Delete the node from Kubernetes (kubectl delete nodes <node-name>).
  4. Restart the kubelet (systemctl start kubelet); it will re-register itself and clear the conflict.
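As shell commands (a sketch; <node-name> is a placeholder, and the kubectl steps must run from a machine with cluster credentials):

    # on the affected node
    systemctl stop kubelet

    # from a machine with cluster access
    kubectl delete nodes <node-name>

    # back on the affected node; the kubelet re-registers itself on start
    systemctl start kubelet

    # from a machine with cluster access: the node should reappear and eventually report Ready
    kubectl get nodes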

I still think this is a bug in the kubelet, though; I’m going to investigate that code.