longhorn: [TASK] Handle the CRD validation error
What’s the task? Please describe
Hit two CRD validation errors on Kubernetes v1.19.7+k3s1 while upgrading from v1.2.2 to the v1beta2 version. The errors do not occur on Kubernetes v1.21.6+k3s1.
upgradeBackupTargets=failed to update for BackupTarget status default: BackupTarget.longhorn.io \"default\" is invalid: status.lastSyncedAt: Invalid value: \"null\": status.lastSyncedAt in body must be of type string: \"null\""
failed to update=Node.longhorn.io \"ku50-master\" is invalid: [spec.disks.default-disk-745ff2be8d312b52.tags: Invalid value: \"null\": spec.disks.default-disk-745ff2be8d312b52.tags in body must be of type array: \"null\", status.diskStatus: Invalid value: \"null\": status.diskStatus in body must be of type object: \"null\"]"
Need to check which Kubernetes versions have the same issue and ensure backward compatibility across K8s versions.
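For context on where the "null" values in the errors above can come from: in Go, a nil slice or map is encoded by encoding/json as JSON null, while an initialized empty one is encoded as [] or {}. The sketch below only illustrates that behavior. DiskSpec and encode are hypothetical, simplified names, not the actual Longhorn types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiskSpec is a hypothetical, simplified stand-in for a disk spec.
// It is not the real Longhorn type.
type DiskSpec struct {
	Tags []string `json:"tags"`
}

// encode returns the JSON form of a DiskSpec, or an empty string on error.
func encode(d DiskSpec) string {
	b, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A nil slice is encoded as null, which a schema with "type: array"
	// may reject when the field is not marked nullable.
	fmt.Println(encode(DiskSpec{})) // {"tags":null}

	// An initialized empty slice is encoded as an empty array.
	fmt.Println(encode(DiskSpec{Tags: []string{}})) // {"tags":[]}
}
```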
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 26 (19 by maintainers)
Validation - PASSED
Upgrading from v1.1.3 with 1 volume, 1 backing image in-use:
longhorn-admission-webhook E0218 14:32:43.770666 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.CSIDriver: the server could not find the requested resource
More complex resource scenarios will be covered in the release test.
Since this is an upgrade case, adding it to the upgrade path should address the issue. It should be backported to v1.2.x.
I see. Then we need to consider assigning a default value to it, implemented either in the upgrade path or by a defaulter webhook. We need to think further about which one is the proper way for all the Longhorn CR resources.
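A minimal sketch of the defaulting idea discussed above, assuming it runs in the upgrade path before the object is written back: nil fields are replaced with empty values so that they are no longer serialized as null. The types and the applyDefaults helper are hypothetical and much simpler than the real Longhorn resources. A defaulter webhook could apply the same logic at admission time instead.

```go
package main

import "fmt"

// Hypothetical, simplified types for illustration only.
type DiskSpec struct {
	Tags []string
}

type DiskStatus struct{}

type Node struct {
	Disks      map[string]DiskSpec
	DiskStatus map[string]DiskStatus
}

// applyDefaults replaces nil slices and maps with empty ones and reports
// whether anything was changed, so the caller can skip a needless update.
func applyDefaults(n *Node) bool {
	changed := false
	for name, disk := range n.Disks {
		if disk.Tags == nil {
			disk.Tags = []string{}
			n.Disks[name] = disk // map values are copies, so write back
			changed = true
		}
	}
	if n.DiskStatus == nil {
		n.DiskStatus = map[string]DiskStatus{}
		changed = true
	}
	return changed
}

func main() {
	node := &Node{Disks: map[string]DiskSpec{"default-disk": {}}}
	fmt.Println(applyDefaults(node)) // true: nil fields were filled in
	fmt.Println(applyDefaults(node)) // false: nothing left to default
}
```

Returning a changed flag keeps the step idempotent, which matters if the upgrade path can run more than once.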
Validation - FAILED
Unable to upgrade to v1.3.0-master-head when there is a volume created in a previous version.
Upgrading with the following path via helm3: v1.1.2 → v1.1.3 → v1.2.3 → master-head
longhorn-manager log:
backing image spec:
Tried to resolve the volume error manually, but was unable to make status.conditions valid. Created an issue at: #3426
A possibly related problem: while trying to upgrade v1.2.3 to the master branch, longhorn-manager is not able to come up:
The following identical logs appear across all longhorn-manager pods:
Steps to reproduce: 0. With a fresh rke1 v1.20 cluster(1+3 nodes)
For supported OSs and K8s distro versions, more testing is always welcome, but we need to consider resources/costs as well.
Right now we have an OS matrix, so we probably need a K8s version matrix as well, only for supported versions. cc @khushboo-rancher
We need to consider the nightly E2E test on different Kubernetes versions as well. (k8s 1.18+) cc @innobead @longhorn/qa
To upgrade to v1beta2 and pass the CRD validation without setting the +nullable marker on fields, the Kubernetes cluster should be at least v1.20. cc @innobead
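For reference, a sketch of how the +nullable marker is typically placed on a Go field so that controller-gen emits nullable: true in the generated CRD schema. NodeStatus here is a hypothetical, simplified type, and whether Longhorn applies the marker to these exact fields is an assumption.

```go
package main

import "fmt"

// NodeStatus is a hypothetical, simplified status type for illustration.
type NodeStatus struct {
	// The marker comments below are read by controller-gen, not by the
	// Go compiler; +nullable lets the schema accept an explicit null.
	// +optional
	// +nullable
	DiskStatus map[string]string `json:"diskStatus"`
}

func main() {
	var status NodeStatus
	// The zero value holds a nil map, which is serialized as JSON null.
	fmt.Println(status.DiskStatus == nil) // true
}
```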