harvester: [BUG] Upgrade: rancher-monitoring charts can't be upgraded
Describe the bug
This was observed after https://github.com/harvester/harvester-installer/pull/229.
rancher-monitoring and rancher-monitoring-crd ManagedCharts fail to upgrade.
The error message is "another operation (install/upgrade/rollback) is in progress":
fleet-local mcc-rancher-monitoring 0/1 ErrApplied(1) [Cluster fleet-local/local: another operation (install/upgrade/rollback) is in progress]; mutatingwebhookconfiguration.admissionregistration.k8s.io rancher-monitoring-admission modified {"webhooks":[{"admissionReviewVersions":["v1","v1beta1"],"clientConfig":{"caBundle":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJkVENDQVJ1Z0F3SUJBZ0lRUDd5MVA0K2cvRlZ5OGdaSW93bElDVEFLQmdncWhrak9QUVFEQWpBUE1RMHcKQ3dZRFZRUUtFd1J1YVd3eE1DQVhEVEl5TURFd05EQXlNRGcwTUZvWUR6SXhNakV4TWpFeE1ESXdPRFF3V2pBUApNUTB3Q3dZRFZRUUtFd1J1YVd3eE1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXEzQWhrdnh0CmVMUHBpOG45NmVkRmlWbGZiK0xRWHRmSVJFOUJDMmRlVmhxWjNvVUx1R0hucldqc3k0OWJMM0JVZEw0QWg4VEkKQUhORG9sbzNRVElGRHFOWE1GVXdEZ1lEVlIwUEFRSC9CQVFEQWdJRU1CTUdBMVVkSlFRTU1Bb0dDQ3NHQVFVRgpCd01CTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRFZSME9CQllFRlB3U0xOd2VaeE02VHZMS1dodnNxd2t5Cjc0RzNNQW9HQ0NxR1NNNDlCQU1DQTBnQU1FVUNJUUR3VmlHZEtSdXI3N2dUTElZVkxkNzZXM3N3eGtYd1I5c2gKdDBXN081azdNUUlnYmpRYWVsWEd6dnhQK1dtaHdGYWVndFdFVGJSSkQ2aXpzZVpJOWJ3c3ZQZz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=","service":{"name":"rancher-monitoring-operator","namespace":"cattle-monitoring-system","path":"/admission-prometheusrules/mutate","port":443}},"failurePolicy":"Ignore","matchPolicy":"Equivalent","name":"prometheusrulemutate.monitoring.coreos.com","namespaceSelector":{},"objectSelector":{},"reinvocationPolicy":"Never","rules":[{"apiGroups":["monitoring.coreos.com"],"apiVersions":["*"],"operations":["CREATE","UPDATE"],"resources":["prometheusrules"],"scope":"*"}],"sideEffects":"None","timeoutSeconds":10}]}; validatingwebhookconfiguration.admissionregistration.k8s.io rancher-monitoring-admission modified 
{"webhooks":[{"admissionReviewVersions":["v1","v1beta1"],"clientConfig":{"caBundle":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJkVENDQVJ1Z0F3SUJBZ0lRUDd5MVA0K2cvRlZ5OGdaSW93bElDVEFLQmdncWhrak9QUVFEQWpBUE1RMHcKQ3dZRFZRUUtFd1J1YVd3eE1DQVhEVEl5TURFd05EQXlNRGcwTUZvWUR6SXhNakV4TWpFeE1ESXdPRFF3V2pBUApNUTB3Q3dZRFZRUUtFd1J1YVd3eE1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXEzQWhrdnh0CmVMUHBpOG45NmVkRmlWbGZiK0xRWHRmSVJFOUJDMmRlVmhxWjNvVUx1R0hucldqc3k0OWJMM0JVZEw0QWg4VEkKQUhORG9sbzNRVElGRHFOWE1GVXdEZ1lEVlIwUEFRSC9CQVFEQWdJRU1CTUdBMVVkSlFRTU1Bb0dDQ3NHQVFVRgpCd01CTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRFZSME9CQllFRlB3U0xOd2VaeE02VHZMS1dodnNxd2t5Cjc0RzNNQW9HQ0NxR1NNNDlCQU1DQTBnQU1FVUNJUUR3VmlHZEtSdXI3N2dUTElZVkxkNzZXM3N3eGtYd1I5c2gKdDBXN081azdNUUlnYmpRYWVsWEd6dnhQK1dtaHdGYWVndFdFVGJSSkQ2aXpzZVpJOWJ3c3ZQZz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=","service":{"name":"rancher-monitoring-operator","namespace":"cattle-monitoring-system","path":"/admission-prometheusrules/validate","port":443}},"failurePolicy":"Ignore","matchPolicy":"Equivalent","name":"prometheusrulemutate.monitoring.coreos.com","namespaceSelector":{},"objectSelector":{},"rules":[{"apiGroups":["monitoring.coreos.com"],"apiVersions":["*"],"operations":["CREATE","UPDATE"],"resources":["prometheusrules"],"scope":"*"}],"sideEffects":"None","timeoutSeconds":10}]}
fleet-local mcc-rancher-monitoring-crd 0/1 ErrApplied(1) [Cluster fleet-local/local: another operation (install/upgrade/rollback) is in progress]
To Reproduce
- Set up a 1.0.0 cluster.
- Upgrade with the master ISO.
Expected behavior
Support bundle
Environment:
- Harvester ISO version:
- Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630):
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 18 (14 by maintainers)
@bk201 I found one possible cause, described in https://github.com/harvester/harvester/issues/1983#issuecomment-1076812969 , but I am not sure whether the fleet-agent side has noticed this and plans to improve it.
@w13915984028 @weihanglo Do you know of any workaround to get out of this state? I have had luck rolling back the chart revision before, but I am not sure if there is a better way.
The fleet-agent log shows that even a normally running Harvester cluster produces many "performing update for … " log entries. (Why it does so is under investigation via #2013 .)
So there is a chance of hitting this bug.
I’m still seeing the harvester chart go into this state occasionally, even though I already pause the managed chart and wait for all the Rancher components to settle down. A trick to get out of this state is to roll back the chart with the helm command. The ManagedChart will then be applied again.
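
The rollback trick can be sketched roughly as follows. This is only an illustration, not the procedure the maintainers used: it assumes the Helm release is named `rancher-monitoring` in the `cattle-monitoring-system` namespace (the namespace appears in the log above, the release name is a guess), and that the history was obtained with `helm history <release> -n <namespace> -o json`. The helper names are hypothetical. The script only builds the `helm rollback` command and does not run it.

```python
import json
import shlex

# Helm release statuses that indicate an unfinished operation.
PENDING = {"pending-install", "pending-upgrade", "pending-rollback"}


def is_stuck(history):
    """Return True if the newest revision is still in a pending-* state."""
    if not history:
        return False
    latest = max(history, key=lambda r: r["revision"])
    return latest["status"] in PENDING


def last_good_revision(history):
    """Return the highest revision that was deployed successfully, or None."""
    good = [r["revision"] for r in history
            if r["status"] in ("deployed", "superseded")]
    return max(good) if good else None


def rollback_command(release, namespace, history):
    """Build a `helm rollback` command, or None if no rollback applies."""
    if not is_stuck(history):
        return None
    revision = last_good_revision(history)
    if revision is None:
        return None
    return ["helm", "rollback", release, str(revision), "-n", namespace]


# Made-up history in the shape of `helm history ... -o json` output.
sample = json.loads("""
[
  {"revision": 1, "status": "superseded"},
  {"revision": 2, "status": "deployed"},
  {"revision": 3, "status": "pending-upgrade"}
]
""")

# Release name and namespace are assumptions, adjust for your cluster.
cmd = rollback_command("rancher-monitoring", "cattle-monitoring-system", sample)
print(shlex.join(cmd))
```

With the sample history above, revision 3 is stuck in `pending-upgrade`, so the sketch picks revision 2 as the rollback target. Review the printed command before running it against a real cluster.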