kubernetes: CRD age does not change from <invalid> as soon as it becomes Established
Is this a BUG REPORT or FEATURE REQUEST?: /kind bug
What happened: When deploying https://github.com/istio/istio/blob/master/install/kubernetes/istio.yaml on minikube, I get the following errors:
unable to recognize "install/kubernetes/istio.yaml": no matches for config.istio.io/, Kind=attributemanifest
for each of the custom resources specified in https://github.com/istio/istio/blob/master/install/kubernetes/istio.yaml
Running kubectl get crd returns:
NAME AGE
attributemanifests.config.istio.io <invalid>
After several seconds, kubectl get crd returns a valid age for the CRDs. Then the custom resources are added successfully.
What you expected to happen:
I expect this to be explicitly documented: can custom resources be added immediately after their CRDs are created, or should the user wait until kubectl get crd returns a valid age?
How to reproduce it (as minimally and precisely as possible): Deploy https://github.com/istio/istio/blob/master/install/kubernetes/istio.yaml on a Minikube instance with limited resources, so that the CRDs take some time to become valid.
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): 1.8.0
- Cloud provider or hardware configuration: Minikube 0.23.0
- OS (e.g. from /etc/os-release): Minikube on Virtual Box on Mac OS
- Kernel (e.g. uname -a):
- Install tools:
- Others:
About this issue
- State: closed
- Created 7 years ago
- Comments: 34 (24 by maintainers)
Working on reproducing the issue and implementing tests for this case.
I’ve worked with @sttts on reproducing the issue and finding the root of the problem. I’ll try my best to explain what I have found.
It’s hard to define what limited resources are. In most cases, everything worked flawlessly. To reproduce the issue, I had to put my local machine (which was running the cluster via ./hack/local-up-cluster.sh) under heavy load (100% CPU and a lot of I/O operations). To accomplish this I used the stress command.
Under normal conditions everything works as expected: the age is valid, and CRs are created as well. Under heavy load, CR creation can fail, but it’s very hard to reproduce the <invalid> age.
It is expected for CRDs to take some time to be fully created/provisioned, but under normal conditions that doesn’t cause a problem. If the cluster has limited resources, this can be a problem, but there is a possible fix that we can discuss.
About the <invalid> age: the age is formatted by the ShortHumanDuration function in the duration package, and <invalid> is returned only if there is some deviation between machine times. This function is invoked by the translateTimestamp function with now - creationTimestamp.
Because the translateTimestamp function checks whether the time is zero (and if so returns <unknown> instead of <invalid>), we can be sure that some creation timestamp is returned by the API.
To analyze this further, I checked whether the API really returns the creationTimestamp even under load, and it does. The time is set by the BeforeCreate function, specifically by the FillObjectMetaSystemFields function, which is defined in the same package. The FillObjectMetaSystemFields function sets the creationTimestamp to Now().
This Playground is from the beginning of my research, but it may contain some useful information, along with a test that confirms that creationTimestamp is not zero.
To fix the problem with CR creation failure, @sttts proposed the following solution when I was talking to him about the issue:
I would also like to mention that the <invalid> age is not limited to CRDs. I had the same problem with the local cluster, but with pods instead of CRDs. The following command: kubectl get pods -w --all-namespaces returned:
Just to note, the stress command was running while the cluster was provisioning, so the local machine had limited resources.
And of course there is no connection to Minikube; it is like that by design on every cluster.
https://kubernetes.io/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/ does not document this, but should. PR welcome.
@vadimeisenbergibm
I’m not sure what our guidance is beyond waiting. You could also programmatically poll/watch the CRD and wait for it to be established. cc @kubernetes/sig-api-machinery-misc for increased visibility & guidance
Here is a sample I just ran: