kubernetes: absence of openapi configuration in integration tests makes server-side apply panic (broke apf controller when it switched to SSA)
What happened?
Integration test timed out due to the following error:

```
E0120 14:05:02.398256 118058 runtime.go:77] Observed a panic: FieldManager must be installed to run apply
```
The error appeared 748 times in the log. It looks like a lot of churn happens during the APF configuration bootstrapping phase. Is there a race condition between the type being registered and the API calls being made?
Stack trace:

```
I0120 14:05:02.397911 118058 apf_controller.go:879] Triggered API priority and fairness config reloading because priority level exempt is undesired and idle
I0120 14:05:02.398135 118058 panic.go:1038] "HTTP" verb="APPLY" URI="/apis/flowcontrol.apiserver.k8s.io/v1beta2/flowschemas/system-node-high/status?fieldManager=api-priority-and-fairness-config-consumer-v1&force=true" latency="650.951µs" userAgent="Go-http-client/1.1" audit-ID="bc3c561e-1a80-403e-b611-63ba7bd484b9" srcIP="127.0.0.1:46572" apf_pl="exempt" apf_fs="exempt" apf_fd="" resp=0
E0120 14:05:02.398256 118058 runtime.go:77] Observed a panic: FieldManager must be installed to run apply
goroutine 136765 [running]:
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher.finishRequest.func1.1()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher/finisher.go:105 +0xaf
panic({0x3fc0bc0, 0x542eb00})
/usr/local/go/src/runtime/panic.go:1038 +0x215
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*applyPatcher).applyPatchToCurrentObject(0x551ceb0, {0x553da98, 0xc0887ccc00}, {0x551ceb0, 0xc0887da480})
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/patch.go:482 +0x449
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*patcher).applyPatch(0xc0bf781a40, {0x553da98, 0xc0887ccc00}, {0x4c296a0, 0xc0887da480}, {0x551ceb0, 0xc0887da480})
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/patch.go:566 +0xd3
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/rest.(*defaultUpdatedObjectInfo).UpdatedObject(0x81f4878, {0x553da98, 0xc0887ccc00}, {0x551ceb0, 0xc0887da480})
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/rest/update.go:229 +0xd0
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry.(*Store).Update.func1({0x551ceb0, 0xc0887da480}, {0xc07278e397b5dfde, 0x8b5f27eafb})
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry/store.go:533 +0x1f9
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/etcd3.(*store).updateState(0xc03d204bd0, 0xc06ba15310, 0x42)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/etcd3/store.go:894 +0x3e
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/etcd3.(*store).GuaranteedUpdate(0xc03d204bd0, {0x553da98, 0xc0887ccc00}, {0xc01908f2e0, 0x1d}, {0x551ceb0, 0xc0887da300}, 0x1, 0x40ef94, 0xc0848d34a0, ...)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/etcd3/store.go:365 +0x56e
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/cacher.(*Cacher).GuaranteedUpdate(0xc0e9754c60, {0x553da98, 0xc0887ccc00}, {0xc01908f2e0, 0x1d}, {0x551ceb0, 0xc0887da300}, 0xa8, 0x4b86ae0, 0xc0848d34a0, ...)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/cacher/cacher.go:721 +0x1b5
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry.(*DryRunnableStorage).GuaranteedUpdate(0x4cbc218, {0x553da98, 0xc0887ccc00}, {0xc01908f2e0, 0x4cac0b0}, {0x551ceb0, 0xc0887da300}, 0xf2, 0x1, 0xc0848d34a0, ...)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry/dryrun.go:97 +0x1c7
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry.(*Store).Update(0xc0e5c5b2c0, {0x553da98, 0xc0887ccc00}, {0xc0412d2b3d, 0xb}, {0x5523328, 0xc0887cccf0}, 0xc0105369b0, 0x4ea4ac8, 0x0, ...)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/registry/generic/registry/store.go:521 +0x508
k8s.io/kubernetes/pkg/registry/flowcontrol/flowschema/storage.(*StatusREST).Update(0xc0887ccd50, {0x553da98, 0xc0887ccc00}, {0xc0412d2b3d, 0xc086dd5bf0}, {0x5523328, 0xc0887cccf0}, 0xc0991a2748, 0x40ef94, 0x1, ...)
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/registry/flowcontrol/flowschema/storage/storage.go:93 +0x52
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*patcher).patchResource.func2()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/patch.go:665 +0xa7
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers.(*patcher).patchResource.func3()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/patch.go:671 +0x38
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher.finishRequest.func1()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher/finisher.go:117 +0x8f
created by k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher.finishRequest
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/handlers/finisher/finisher.go:92 +0xe5
```
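For context on where this fires: the `patch.go:482` frame above is the server-side apply path, which refuses to run without a field manager. A rough paraphrase of that guard (approximate, not the verbatim upstream code) is:

```go
// Paraphrased fragment of applyPatcher.applyPatchToCurrentObject in
// k8s.io/apiserver/pkg/endpoints/handlers/patch.go (the patch.go:482 frame above).
// Approximate sketch, not the verbatim upstream code.
force := false
if p.options.Force != nil {
	force = *p.options.Force
}
// p.fieldManager is only installed when the apiserver is constructed with OpenAPI
// models available (i.e. a non-nil OpenAPIConfig). If it is nil, every APPLY
// request that reaches this handler panics with the message seen in the log.
if p.fieldManager == nil {
	panic("FieldManager must be installed to run apply")
}
// ...the apply patch is then decoded and handed to p.fieldManager.Apply...
```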
Integration job link: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/107456/pull-kubernetes-integration/1484156493719670784
PR: https://github.com/kubernetes/kubernetes/pull/107456
I searched for this error in CI and in this PR; so far it only shows up in this PR, so I am hoping it has not introduced any flakes. (We may need to let more time pass before flakes start to appear.)
What did you expect to happen?
We should not see the error `FieldManager must be installed to run apply`.
How can we reproduce it (as minimally and precisely as possible)?
I saw it only in the PR mentioned.
About this issue
- State: closed
- Created 2 years ago
- Comments: 47 (45 by maintainers)
Taking a fresh look at this PR and wondering if it has been mostly fixed already?
From @liggitt's comments, my understanding is that this panic occurred due to two conditions:

- The `kube-apiserver` running as part of the integration test suite is asked to service an SSA request.
- `OpenAPIModels` are nil; the apiserver has logic to panic in this case. From the code path @liggitt commented on, `OpenAPIModels` is nil when the `OpenAPIConfig` the server is created with is nil.

As a brief check to see if this might be fixed, I ran the test `TestUnschedulableNodeDaemonDoesLaunchPod` that @Jefftree mentioned was failing from SSA. I logged the `Node`s that were created and found that they all had managed fields, which I took as an indication that SSA was now functioning properly.

Indeed, to start the test server the `setup` function calls `kubeapiservertesting.StartTestServerOrDie`. This function has a code path which unconditionally leads to the apiserver's `OpenAPIConfig` being set. `StartTestServer` calls `CreateServerChain` before running the returned apiserver:
https://github.com/kubernetes/kubernetes/blob/6f706775bcb0007082ca940527a154e728b4399f/cmd/kube-apiserver/app/testing/testserver.go#L212-L231
`CreateServerChain` calls `CreateKubeAPIServerConfig` before using the returned config to create the returned apiserver:
https://github.com/kubernetes/kubernetes/blob/0527a0dd453c4b76259389ec8e8e6888c5e2a5ab/cmd/kube-apiserver/app/server.go#L176-L195

`CreateKubeAPIServerConfig` uses `buildGenericConfig`, which sets `OpenAPIConfig`:
https://github.com/kubernetes/kubernetes/blob/0527a0dd453c4b76259389ec8e8e6888c5e2a5ab/cmd/kube-apiserver/app/server.go#L237-L248
https://github.com/kubernetes/kubernetes/blob/0527a0dd453c4b76259389ec8e8e6888c5e2a5ab/cmd/kube-apiserver/app/server.go#L388-L396
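For reference, the OpenAPI wiring in `buildGenericConfig` looks roughly like the following (a paraphrase of the linked lines, not the verbatim code, so treat the exact helper names as approximate):

```go
// Paraphrased sketch of the OpenAPI wiring in cmd/kube-apiserver/app/server.go
// (buildGenericConfig). Any server created through this path therefore has a
// non-nil OpenAPIConfig, which in turn lets the SSA field manager be installed.
namer := openapinamer.NewDefinitionNamer(legacyscheme.Scheme, extensionsapiserver.Scheme, aggregatorscheme.Scheme)
genericConfig.OpenAPIConfig = genericapiserver.DefaultOpenAPIConfig(generatedopenapi.GetOpenAPIDefinitions, namer)
genericConfig.OpenAPIConfig.Info.Title = "Kubernetes"
```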
Thus, any test using `kubeapiservertesting.StartTestServerOrDie` should no longer be affected by this panic. The question then becomes: do all integration tests start their servers in this way? It's hard to say. I found a PR (#110529) which refactors many occurrences of the old stanza:
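(The PR's diff is not reproduced here; the following is a hypothetical sketch of the old pattern, assuming the `framework.RunAnAPIServer` helper referenced below.)

```go
// Hypothetical sketch of the pre-refactor pattern; names are illustrative,
// not copied from the PR.
import (
	"testing"

	clientset "k8s.io/client-go/kubernetes"
	restclient "k8s.io/client-go/rest"
	"k8s.io/kubernetes/test/integration/framework"
)

func TestSomething(t *testing.T) {
	// Starts a stripped-down apiserver via the integration framework; this path
	// did not necessarily set OpenAPIConfig, which is what made SSA panic.
	controlPlaneConfig := framework.NewIntegrationTestControlPlaneConfig()
	_, s, closeFn := framework.RunAnAPIServer(controlPlaneConfig)
	defer closeFn()

	client := clientset.NewForConfigOrDie(&restclient.Config{Host: s.URL})
	_ = client // ... test body ...
}
```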
Into something using the new method:
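(Again a hypothetical sketch rather than the PR's actual diff, assuming the `kubeapiservertesting.StartTestServerOrDie` helper discussed above.)

```go
// Hypothetical sketch of the post-refactor pattern; names are illustrative.
import (
	"testing"

	clientset "k8s.io/client-go/kubernetes"
	kubeapiservertesting "k8s.io/kubernetes/cmd/kube-apiserver/app/testing"
	"k8s.io/kubernetes/test/integration/framework"
)

func TestSomething(t *testing.T) {
	// Starts a real kube-apiserver through CreateServerChain, so OpenAPIConfig is
	// always set and server-side apply has a field manager installed.
	server := kubeapiservertesting.StartTestServerOrDie(t, nil, nil, framework.SharedEtcd())
	defer server.TearDownFn()

	client := clientset.NewForConfigOrDie(server.ClientConfig)
	_ = client // ... test body ...
}
```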
But it appears `RunAnAPIServer` is still being used in a few integration tests. Should these occurrences be refactored to use `kubeapiservertesting.StartTestServerOrDie` too?