harvester: [BUG] stuck at "Upgrade system services" from v1.0.3 to v1.1.0 upgrading progress
Describe the bug
In a progress that upgrade from v1.0.3 to v1.1.0, I stuck at “Upgrade system services” for a long time( about 3 hours).

Here is my upgrade status:
apiVersion: v1
items:
- apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
creationTimestamp: "2022-10-26T07:05:45Z"
finalizers:
- wrangler.cattle.io/harvester-upgrade-controller
generateName: hvst-upgrade-
generation: 14
labels:
harvesterhci.io/latestUpgrade: "true"
harvesterhci.io/upgradeState: UpgradingSystemServices
name: hvst-upgrade-fnlkm
namespace: harvester-system
resourceVersion: "32750759"
uid: b96a9405-07b0-4705-b3fc-004155958201
spec:
image: ""
version: v1.1.0
status:
conditions:
- status: Unknown
type: Completed
- lastUpdateTime: "2022-10-26T07:11:43Z"
status: "True"
type: ImageReady
- lastUpdateTime: "2022-10-26T07:15:30Z"
status: "True"
type: RepoReady
- lastUpdateTime: "2022-10-26T07:15:50Z"
status: "True"
type: NodesPrepared
- status: Unknown
type: SystemServicesUpgraded
imageID: harvester-system/harvester-iso-ncqn7
nodeStatuses:
harvester0:
state: Images preloaded
harvester1:
state: Images preloaded
harvester2:
state: Images preloaded
previousVersion: v1.0.3
repoInfo: |
release:
harvester: v1.1.0
harvesterChart: 1.1.0
os: Harvester v1.1.0
kubernetes: v1.24.7+rke2r1
rancher: v2.6.9
monitoringChart: 100.1.0+up19.0.3
kind: List
metadata:
resourceVersion: ""
selfLink: ""
Here is my log of jobs/hvst-upgrade-fnlkm-apply-manifests:
harvester: v1.1.0
harvesterChart: 1.1.0
os: Harvester v1.1.0
kubernetes: v1.24.7+rke2r1
rancher: v2.6.9
monitoringChart: 100.1.0+up19.0.3
loggingChart: 100.1.3+up3.17.7
kubevirt: 0.54.0-150400.3.3.2
minUpgradableVersion: 'v1.0.3'
rancherDependencies:
fleet:
chart: 100.1.0+up0.4.0
app: 0.4.0
fleet-crd:
chart: 100.1.0+up0.4.0
app: 0.4.0
rancher-webhook:
chart: 1.0.6+up0.2.7
app: 0.2.7
Current version: 1.0.3
Minimum upgradable version: 1.0.3
Current version is supported.
Executing v1.0.3 pre-hook...
Remove rke.cattle.io/init-node-machine-id label in fleet-local/local cluster
label "rke.cattle.io/init-node-machine-id" not found.
cluster.provisioning.cattle.io/local not labeled
managedchart.management.cattle.io/harvester patched (no change)
managedchart.management.cattle.io/harvester-crd patched (no change)
managedchart.management.cattle.io/rancher-monitoring patched (no change)
managedchart.management.cattle.io/rancher-monitoring-crd patched (no change)
Upgrading Rancher
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 117 100 117 0 0 16714 0 --:--:-- --:--:-- --:--:-- 16714
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 57.9M 100 57.9M 0 0 29.1M 0 0:00:01 0:00:01 --:--:-- 29.1M
time="2022-10-26T07:15:56Z" level=info msg="Extract mapping / => /tmp/upgrade/rancher"
time="2022-10-26T07:15:56Z" level=info msg="Checking local image archives in /tmp/upgrade/images for index.docker.io/rancher/system-agent-installer-rancher:v2.6.9"
time="2022-10-26T07:15:57Z" level=info msg="Extracting file run.sh to /tmp/upgrade/rancher/run.sh"
time="2022-10-26T07:15:58Z" level=info msg="Extracting file rancher-2.6.9.tgz to /tmp/upgrade/rancher/rancher-2.6.9.tgz"
time="2022-10-26T07:15:58Z" level=info msg="Extracting file helm to /tmp/upgrade/rancher/helm"
Rancher values:
antiAffinity: required
bootstrapPassword: admin
features: multi-cluster-management=false,multi-cluster-management-agent=false
hostPort: 0
ingress:
enabled: false
noDefaultAdmin: false
rancherImage: rancher/rancher
rancherImagePullPolicy: IfNotPresent
rancherImageTag: v2.6.4-harvester3
replicas: -3
systemDefaultRegistry: ""
tls: external
useBundledSystemChart: true
I have even tried some commands from https://docs.harvesterhci.io/v1.1/upgrade/previous-releases/v1-0-0-to-v1-0-1#stuck-in-upgrading-system-service , but it doesn’t work for me.
To Reproduce Steps to reproduce the behavior:
- Click
Upgradeat harvester UI
Support bundle
supportbundle_6200f644-e4a1-40c1-bf7c-070ccbcb556d_2022-10-26T08-00-43Z.zip
Environment
- Harvester ISO version: v1.0.3
- Underlying Infrastructure: Baremetal with PowerEdge R740
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 26 (9 by maintainers)
I have realized that the node harvester2 went into “cordoned” but I can not say way.
I did a manual uncordon but after some time it has been cordoned, again.
Not that I can remember.
DM send…