kubernetes: Failing to list pods with resourceVersion=0
Running curl -k -v -XGET -H "Accept: application/json, */*" -H 'User-Agent: cluster-autoscaler/v0.0.0 (linux/amd64) kubernetes/$Format' 'http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%21%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0' on the master node of a 1.9 cluster lists all pods that are assigned to some node.
The same command on a cluster running Kubernetes built from HEAD returns an empty list:
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET /api/v1/pods?fieldSelector=spec.nodeName%21%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0 HTTP/1.1
> Host: 127.0.0.1:8080
> Accept: application/json, */*
> User-Agent: cluster-autoscaler/v0.0.0 (linux/amd64) kubernetes/$Format
>
< HTTP/1.1 200 OK
< Audit-Id: 54b8afbe-dbd1-4608-b6f7-1375d73532ce
< Content-Type: application/json
< Date: Thu, 01 Mar 2018 15:59:31 GMT
< Content-Length: 113
<
{"kind":"PodList","apiVersion":"v1","metadata":{"selfLink":"/api/v1/pods","resourceVersion":"95581"},"items":[]}
* Connection #0 to host 127.0.0.1 left intact
Removing the spec.nodeName part of the selector results in listing all current pods, regardless of whether they're scheduled.
This looks like a regression from 1.9 (and it also breaks Cluster Autoscaler, which relies on selecting scheduled pods by a non-empty spec.nodeName field). Interestingly, the watch part still seems to work: there are updates for scheduled pods that have been modified.
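For readers who want to see what the URL-encoded selector in the curl command stands for, here is a minimal Go sketch that rebuilds the same query string using only the standard library. The function name podListQuery and the constant scheduledActivePods are made up for illustration; this is not how Cluster Autoscaler builds its request.

```go
package main

import "net/url"

// scheduledActivePods is the decoded form of the fieldSelector from the
// curl command: a non-empty spec.nodeName and a phase that is neither
// Failed nor Succeeded. (The constant name is hypothetical.)
const scheduledActivePods = "spec.nodeName!=,status.phase!=Failed,status.phase!=Succeeded"

// podListQuery is a hypothetical helper that encodes the three query
// parameters used in the report. url.Values.Encode sorts keys
// alphabetically and percent-encodes the values.
func podListQuery(fieldSelector, limit, resourceVersion string) string {
	q := url.Values{}
	q.Set("fieldSelector", fieldSelector)
	q.Set("limit", limit)
	q.Set("resourceVersion", resourceVersion)
	return q.Encode()
}
```

Calling podListQuery(scheduledActivePods, "500", "0") should reproduce the query string after /api/v1/pods? in the request above.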
(Edited to fix command)
About this issue
- State: closed
- Created 6 years ago
- Comments: 18 (18 by maintainers)
@wojtek-t Thanks for the tips on how to debug this! After I added logging, it turned out that all objects are in the cache, but for those older than the API server, elem.Fields and elem.Labels are nil, so they can't match any non-empty selector and are filtered out. Since those fields are set when processing events, the result includes only objects for which at least one event was observed. Setting those fields when filling the cache fixes the issue.
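The failure mode described above can be sketched with a toy model in Go. Everything here (fieldSet, cacheElem, matchesScheduled, listScheduled, fillFields) is a hypothetical simplification and not the actual API server cache code; it only assumes that looking up a field on an unpopulated entry yields an empty string, so the spec.nodeName!= term can never match.

```go
package main

// fieldSet is a simplified stand-in for a pod's selectable fields.
type fieldSet map[string]string

// cacheElem is a hypothetical, reduced model of a cache entry.
// Fields stays nil when the entry was stored without computing them.
type cacheElem struct {
	Name   string
	Fields fieldSet
}

// notEqual models a single "key!=value" selector term.
// Indexing a nil map returns "", which is what makes the bug visible.
func notEqual(f fieldSet, key, value string) bool {
	return f[key] != value
}

// matchesScheduled models the selector from the report:
// spec.nodeName!=,status.phase!=Failed,status.phase!=Succeeded
func matchesScheduled(f fieldSet) bool {
	return notEqual(f, "spec.nodeName", "") &&
		notEqual(f, "status.phase", "Failed") &&
		notEqual(f, "status.phase", "Succeeded")
}

// listScheduled returns the names of cached entries that match.
func listScheduled(cache []cacheElem) []string {
	var names []string
	for _, e := range cache {
		if matchesScheduled(e.Fields) {
			names = append(names, e.Name)
		}
	}
	return names
}

// fillFields models the fix: compute the fields when the cache is
// filled instead of only when an event is processed.
func fillFields(nodeName, phase string) fieldSet {
	return fieldSet{"spec.nodeName": nodeName, "status.phase": phase}
}
```

In this model an entry with nil Fields is dropped by listScheduled even if the pod is running on a node, while the same entry built with fillFields("node-1", "Running") is returned. Dropping the spec.nodeName term would let the nil entry through, which is consistent with the observation that removing that part of the selector lists all pods.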