kubernetes: node is not ready because kubelet reports a meaningful conflict error
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
The kubelet reports an error, and kubectl get nodes shows the node as NotReady:
there is a meaningful conflict (firstResourceVersion: "104201", currentResourceVersion: "4293"):
diff1={"metadata":{"resourceVersion":"4293"},"status":{"conditions":[{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"Ready"}]}} ,
diff2={"status":{"conditions":[{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has no disk pressure","reason":"KubeletHasNoDiskPressure","status":"False","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient memory available","reason":"KubeletHasSufficientMemory","status":"False","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient disk space available","reason":"KubeletHasSufficientDisk","status":"False","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet is posting ready status","reason":"KubeletReady","status":"True","type":"Ready"}],"nodeInfo":{"gpus":[]}}}
E0104 17:31:09.779522 7223 kubelet_node_status.go:318] Unable to update node status: update node status exceeds retry count
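For context, the final error comes from the retry loop around the node status update in kubelet_node_status.go. Below is a minimal, self-contained sketch of that pattern, not the actual kubelet source; the constant name/value and the simulated conflict are assumptions for illustration. Each attempt fails with the resourceVersion conflict shown above, and after the last retry the kubelet surfaces "update node status exceeds retry count", after which the node eventually goes NotReady.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// nodeStatusUpdateRetry mirrors the kubelet's retry budget (assumed value for illustration).
const nodeStatusUpdateRetry = 5

// tryUpdateNodeStatus stands in for one PATCH/PUT of the Node status.
// Here it always fails with the conflict reported in the log above.
func tryUpdateNodeStatus(attempt int) error {
	return errors.New(`there is a meaningful conflict (firstResourceVersion: "104201", currentResourceVersion: "4293")`)
}

// updateNodeStatus sketches the retry loop: try a fixed number of times,
// then give up with the "exceeds retry count" error seen in the kubelet log.
func updateNodeStatus() error {
	for i := 0; i < nodeStatusUpdateRetry; i++ {
		if err := tryUpdateNodeStatus(i); err != nil {
			log.Printf("Error updating node status, will retry: %v", err)
			continue
		}
		return nil
	}
	return fmt.Errorf("update node status exceeds retry count")
}

func main() {
	if err := updateNodeStatus(); err != nil {
		// Corresponds to the "E0104 ... Unable to update node status" line above.
		fmt.Println("Unable to update node status:", err)
	}
}
```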
What you expected to happen:
I found a PR about this issue, #44788, which was cherry-picked to 1.6. I want to know why this issue still happens.
How to reproduce it (as minimally and precisely as possible):
This issue is intermittent, but I found another report of the same problem, #52498; that reporter worked around it by downgrading etcd from 3.2.7 to 3.1.10. When I change the leader of the etcd cluster, this issue goes away. Is this a known incompatibility between Kubernetes and etcd, or an etcd bug?
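Since changing the etcd leader makes the problem disappear, one thing worth checking is whether the etcd members disagree about the store revision. Below is a hedged diagnostic sketch using the etcd v3 Go client; the endpoints are placeholders, TLS setup is omitted, and etcd 3.0.x shipped the client under the older github.com/coreos/etcd/clientv3 import path, so treat this as an illustration rather than a ready-made tool. It asks each member directly for its revision, raft term, and current leader; diverging revisions between members would be consistent with the apiserver returning a stale resourceVersion until the leader changes.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Placeholder endpoints: list every etcd member the apiservers talk to.
	endpoints := []string{"https://10.0.0.1:2379", "https://10.0.0.2:2379", "https://10.0.0.3:2379"}

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
		// TLS config omitted; add client certs if the cluster requires them.
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Query each member individually and compare what it reports.
	for _, ep := range endpoints {
		status, err := cli.Status(ctx, ep)
		if err != nil {
			log.Printf("%s: %v", ep, err)
			continue
		}
		fmt.Printf("%s: revision=%d raftTerm=%d leader=%x\n",
			ep, status.Header.Revision, status.RaftTerm, status.Leader)
	}
}
```

(On newer etcdctl, ETCDCTL_API=3 etcdctl endpoint status --write-out=table reports the same fields without any code.)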
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): 1.6.9
- Etcd version: 3.0.17
About this issue
- State: closed
- Created 6 years ago
- Reactions: 1
- Comments: 18 (14 by maintainers)
Commits related to this issue
- Merge pull request #66171 from liggitt/patch-retry-1.9 Automatic merge from submit-queue. Remove patch retry conflict detection Minimal backport of #63146 Fixes #58002 Fixes spurious patch erro... — committed to kubernetes/kubernetes by deleted user 6 years ago
- Merge pull request #66169 from liggitt/patch-retry-1.10 Automatic merge from submit-queue. Remove patch retry conflict detection Minimal backport of #63146 Fixes #58002 Fixes spurious patch err... — committed to kubernetes/kubernetes by deleted user 6 years ago
I’ve seen several of these bug reports, and I’m hitting this myself, but they all get ignored.
Here is the work-around to restore the node (deleting the Node object lets the kubelet re-register it with a fresh resourceVersion, assuming the default --register-node=true):
kubectl delete nodes <node-name>
I still think this is a bug in the kubelet, though; I’m going to investigate that code.