kubernetes: Job controller keeps logging panics
What happened?
The Job controller, and possibly other controllers, keeps logging panics from this line in FilterActivePods: https://github.com/kubernetes/kubernetes/blob/349b85650530da2b4091dd1977f9344ff4f83201/pkg/controller/controller_utils.go#L955.
This happens during e2e tests, and I think it happens in production as well.
Here is an example from a successful build:
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/121302/pull-kubernetes-e2e-kind/1715287183352401920/artifacts/kind-control-plane/containers/kube-controller-manager-kind-control-plane_kube-system_kube-controller-manager-ec8b5cc2095bc6b1bdbfe61f132b3d493dea09ab0808935b59e10dcc5ffe1082.log
2023-10-20T09:10:28.016158573Z stderr F I1020 09:10:28.016037 1 controller_utils.go:955] "Ignoring inactive pod" pod="ttlafterfinished-3394/rand-non-local-ghcvx" phase="Failed" deletionTime="<panic: runtime error: invalid memory address or nil pointer dereference>"
What did you expect to happen?
No panics during e2e test from this line in FilterActivePods.
How can we reproduce it (as minimally and precisely as possible)?
Run e2e or integration tests for the job controller.
Anything else we need to know?
No response
About this issue
- Original URL
- State: open
- Created 8 months ago
- Comments: 22 (22 by maintainers)
When logging an object’s DeletionTimestamp, “is nil” is the right information to log when the pointer is nil. There is no default that can or should be used instead.
I can work on this 😊
/assign