client: Make error-window when waiting for a service to be ready configurable
The use case described in this issue would be supported by a new option, as described below.
/kind question
We are running with cluster-autoscaler and follow https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-can-i-configure-overprovisioning-with-cluster-autoscaler to place some low-priority pause pods in the cluster. When the worker nodes' CPU/memory is nearly 99% allocated (30% of it occupied by the pause pods), we create a Knative service with kn and get this error:
kn service create hello --image xxx --wait-timeout 300 --env TARGET=revision1
Creating service 'hello' in namespace:
0.380s The Route is still working to reflect the latest desired specification.
1.253s Configuration "hello" is waiting for a Revision to become ready.
4.249s Revision "hello-xxx" failed with message: 0/15 nodes are available: 1 Insufficient memory, 14 Insufficient cpu..
5.077s Configuration "hello" does not have any ready Revision.
Error: RevisionFailed: Revision "hello-xxx" failed with message: 0/15 nodes are available: 1 Insufficient memory, 14 Insufficient cpu..
I checked with the k8s scheduler team: pod scheduling happens in two stages. The first pod placement attempt fails and the scheduler preempts low-priority Pods; then the second pod placement attempt succeeds. As a result, the Knative service reconcile finally succeeds.
But when using the kn client, the end user gets the scary failure message … An end user who doesn't have enough knowledge of k8s reconciliation will be frightened.
Another case comes from a race condition in Knative itself; see https://github.com/knative/serving/issues/8675. When the kn client throws the error, the ksvc has existed for only 4 seconds. Later, after more reconciles, the ksvc finally becomes ready.
So I am wondering whether there is a better way for the watch to reduce these intermittent errors, since reconciliation is designed behaviour in k8s. Maybe just add a short wait time to see whether any condition changes in the next reconcile?
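To make the idea concrete, here is a minimal sketch of such an error window, written in Python for illustration only. It is not kn's actual implementation; the function name wait_for_ready, the observation tuples, and the error_window parameter are all hypothetical. A Ready=False condition is only treated as final if it is still standing after error_window seconds; if the condition changes within that window, the wait continues.

```python
from typing import Iterable, Optional, Tuple

# One observation of the Ready condition (hypothetical shape):
# (seconds since start, status, message), where status is "True",
# "False", or "Unknown", as in Kubernetes conditions.
Observation = Tuple[float, str, str]


def wait_for_ready(observations: Iterable[Observation],
                   timeout: float,
                   error_window: float) -> Tuple[bool, str]:
    """Return (ready, message) for a time-ordered stream of observations.

    A Ready=False observation is not reported right away. It is only
    reported if it has not been replaced by another status within
    error_window seconds, or if the stream ends while it is pending.
    """
    pending: Optional[Tuple[float, str]] = None  # (time first seen, message)
    for t, status, message in observations:
        # The pending error outlived its window before this observation.
        if pending is not None and t - pending[0] > error_window:
            return False, pending[1]
        if t > timeout:
            break
        if status == "True":
            return True, "ready"
        if status == "False":
            if pending is None:
                pending = (t, message)
        else:
            # Status went back to "Unknown": the error was transient.
            pending = None
    if pending is not None:
        return False, pending[1]
    return False, "timeout"


# Hypothetical timeline: a transient scheduling failure at 4s that
# clears at 5s, with the service becoming ready at 20s.
timeline = [
    (1.0, "Unknown", "waiting for a Revision to become ready"),
    (4.0, "False", "Insufficient cpu"),
    (5.0, "Unknown", "waiting for a Revision to become ready"),
    (20.0, "True", ""),
]
print(wait_for_ready(timeline, timeout=300, error_window=0))
print(wait_for_ready(timeline, timeout=300, error_window=2))
```

With error_window=0 the transient failure is reported immediately, which matches the behaviour described in this issue; with a window of a couple of seconds the same timeline ends in success, while a failure that persists beyond the window is still reported.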
About this issue
- State: closed
- Created 4 years ago
- Comments: 17 (8 by maintainers)
Commits related to this issue
- [release-v1.1.0] Update kn-plugin-func to v0.23.1 (#1023) * [release-v1.1.0] Update kn-plugin-func to v0.23.1 * Update vendor dir — committed to dsimansk/client by dsimansk 2 years ago
The concept is hard to explain in one word anyway. It should start with --wait so that it aligns with the other wait option (--wait-timeout), so I would be fine with --wait-window and explaining it in the help message. It is also a balancing act between being precise and being too verbose (which leads to more typing and is harder to memorize). Also, we already have the concept of a “window” in --autoscale-window (which actually should be named --scale-window like the other autoscale parameters), so I would be fine with a --wait-window.