rook: Helm upgrade to 1.5.x fails with "cannot convert int64 to float64"

Attempting to upgrade the Rook Ceph operator via Helm from 1.5.1 to 1.5.5 fails with an error related to the cephclusters CRD: cannot convert int64 to float64

Update: Workaround for anyone encountering the same issue: I’ve noticed that downgrading to Helm 3.4.x or lower allows upgrading rook and other charts encountering a similar issue. This was later confirmed by other people.
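
Before retrying the upgrade, it may help to confirm which side of the 3.4.x line the local Helm client is on. The following is a minimal sketch, not an official check: `helm_newer_than_3_4` is a hypothetical helper name, and it only compares major and minor version numbers. This report only establishes that 3.5.0 fails and that 3.4.x or lower works, so a "newer" result means "outside the range known to work", not "definitely broken".

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given Helm version
# string is 3.5 or newer, i.e. outside the "3.4.x or lower" range that
# the workaround above refers to. Fails (exit 1) otherwise.
helm_newer_than_3_4() {
    ver=${1#v}          # drop a leading "v":        v3.5.0+g32c2223 -> 3.5.0+g32c2223
    ver=${ver%%+*}      # drop the build suffix:     3.5.0+g32c2223  -> 3.5.0
    major=${ver%%.*}    # text before the first dot: 3
    rest=${ver#*.}      # text after the first dot:  5.0
    minor=${rest%%.*}   # text before the next dot:  5
    if [ "$major" -gt 3 ]; then
        return 0
    fi
    if [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; then
        return 0
    fi
    return 1
}
```

With Helm 3 on the PATH, it could be called as `helm_newer_than_3_4 "$(helm version --short)" && echo "Helm is newer than 3.4.x"`.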

Deviation from expected behavior: The upgrade from 1.5.1 to 1.5.5 should work without issues using Helm, but Helm fails partially. It does update the operator, which restarts and now runs the 1.5.5 Docker image, but the CRD update for cephclusters fails:

Error

Error: UPGRADE FAILED: cannot patch "cephclusters.ceph.rook.io" with kind CustomResourceDefinition:  "" is invalid: patch: Invalid value: "
map[
    spec:map[
        versions:[
            map[
                additionalPrinterColumns:[
                    map[description:Directory used on the K8s nodes jsonPath:.spec.dataDirHostPath name:DataDirHostPath type:string]
                    map[description:Number of MONs jsonPath:.spec.mon.count name:MonCount type:string]
                    map[jsonPath:.metadata.creationTimestamp name:Age type:date]
                    map[description:Phase jsonPath:.status.phase name:Phase type:string]
                    map[description:Message jsonPath:.status.message name:Message type:string]
                    map[description:Ceph Health jsonPath:.status.ceph.health name:Health type:string]
                ] 
                name:v1 
                schema:map[
                    openAPIV3Schema:map[properties:map[spec:map[properties:map[annotations:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] cephVersion:map[properties:map[allowUnsupported:map[type:boolean] image:map[type:string]] type:object] cleanupPolicy:map[properties:map[allowUninstallWithVolumes:map[type:boolean] confirmation:map[pattern:^$|^yes-really-destroy-data$ type:string] sanitizeDisks:map[properties:map[dataSource:map[pattern:^(zero|random)$ type:string] iteration:map[format:int32 type:integer] method:map[pattern:^(complete|quick)$ type:string]] type:object]] type:object] continueUpgradeAfterChecksEvenIfNotHealthy:map[type:boolean] crashCollector:map[properties:map[disable:map[type:boolean]] type:object] dashboard:map[properties:map[enabled:map[type:boolean] port:map[maximum:65535 minimum:0 type:integer] ssl:map[type:boolean] urlPrefix:map[type:string]] type:object] dataDirHostPath:map[pattern:^/(\\S+) type:string] disruptionManagement:map[properties:map[machineDisruptionBudgetNamespace:map[type:string] manageMachineDisruptionBudgets:map[type:boolean] managePodBudgets:map[type:boolean] osdMaintenanceTimeout:map[type:integer] pgHealthCheckTimeout:map[type:integer]] type:object] driveGroups:map[items:map[properties:map[name:map[type:string] placement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] spec:map[type:object x-kubernetes-preserve-unknown-fields:true]] required:[name spec] type:object] nullable:true type:array] external:map[properties:map[enable:map[type:boolean]] type:object] healthCheck:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] labels:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] logCollector:map[properties:map[enabled:map[type:boolean] periodicity:map[type:string]] type:object] mgr:map[properties:map[modules:map[items:map[properties:map[enabled:map[type:boolean] name:map[type:string]] type:object] type:array]] type:object] 
mon:map[properties:map[allowMultiplePerNode:map[type:boolean] count:map[maximum:9 minimum:0 type:integer] stretchCluster:map[nullable:true properties:map[failureDomainLabel:map[type:string] subFailureDomain:map[type:string] zones:map[items:map[properties:map[arbiter:map[type:boolean] name:map[type:string] volumeClaimTemplate:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object] type:array]] type:object] volumeClaimTemplate:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object] monitoring:map[properties:map[enabled:map[type:boolean] externalMgrEndpoints:map[items:map[properties:map[ip:map[type:string]] type:object] type:array] externalMgrPrometheusPort:map[maximum:65535 minimum:0 type:integer] rulesNamespace:map[type:string]] type:object] network:map[nullable:true properties:map[hostNetwork:map[type:boolean] ipFamily:map[type:string] provider:map[type:string] selectors:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true]] type:object] placement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] priorityClassNames:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] removeOSDsIfOutAndSafeToRemove:map[type:boolean] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] security:map[properties:map[kms:map[properties:map[connectionDetails:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] tokenSecretName:map[type:string]] type:object]] type:object] skipUpgradeChecks:map[type:boolean] storage:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] deviceFilter:map[nullable:true type:string] devicePathFilter:map[type:string] devices:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] fullPath:map[type:string] name:map[type:string]] type:object] type:array] disruptionManagement:map[nullable:true 
properties:map[machineDisruptionBudgetNamespace:map[type:string] manageMachineDisruptionBudgets:map[type:boolean] managePodBudgets:map[type:boolean] osdMaintenanceTimeout:map[type:integer] pgHealthCheckTimeout:map[type:integer]] type:object] nodes:map[items:map[properties:map[config:map[nullable:true properties:map[databaseSizeMB:map[type:string] encryptedDevice:map[pattern:^(true|false)$ type:string] journalSizeMB:map[type:string] metadataDevice:map[type:string] osdsPerDevice:map[type:string] storeType:map[pattern:^(bluestore)$ type:string] walSizeMB:map[type:string]] type:object] deviceFilter:map[nullable:true type:string] devicePathFilter:map[type:string] devices:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] fullPath:map[type:string] name:map[type:string]] type:object] type:array] name:map[type:string] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] useAllDevices:map[type:boolean]] type:object] nullable:true type:array] storageClassDeviceSets:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] count:map[format:int32 type:integer] encrypted:map[type:boolean] name:map[type:string] placement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] portable:map[type:boolean] preparePlacement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] schedulerName:map[type:string] tuneDeviceClass:map[type:boolean] tuneFastDeviceClass:map[type:boolean] volumeClaimTemplates:map[items:map[type:object x-kubernetes-preserve-unknown-fields:true] type:array]] type:object] nullable:true type:array] useAllDevices:map[type:boolean] useAllNodes:map[type:boolean]] type:object]] type:object] status:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object]
                ] 
                served:true 
                storage:true 
                subresources:map[status:map[]]
            ]
        ]
    ]
]": cannot convert int64 to float64

Error unformatted:

Error: UPGRADE FAILED: cannot patch "cephclusters.ceph.rook.io" with kind CustomResourceDefinition:  "" is invalid: patch: Invalid value: "map[spec:map[versions:[map[additionalPrinterColumns:[map[description:Directory used on the K8s nodes jsonPath:.spec.dataDirHostPath name:DataDirHostPath type:string] map[description:Number of MONs jsonPath:.spec.mon.count name:MonCount type:string] map[jsonPath:.metadata.creationTimestamp name:Age type:date] map[description:Phase jsonPath:.status.phase name:Phase type:string] map[description:Message jsonPath:.status.message name:Message type:string] map[description:Ceph Health jsonPath:.status.ceph.health name:Health type:string]] name:v1 schema:map[openAPIV3Schema:map[properties:map[spec:map[properties:map[annotations:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] cephVersion:map[properties:map[allowUnsupported:map[type:boolean] image:map[type:string]] type:object] cleanupPolicy:map[properties:map[allowUninstallWithVolumes:map[type:boolean] confirmation:map[pattern:^$|^yes-really-destroy-data$ type:string] sanitizeDisks:map[properties:map[dataSource:map[pattern:^(zero|random)$ type:string] iteration:map[format:int32 type:integer] method:map[pattern:^(complete|quick)$ type:string]] type:object]] type:object] continueUpgradeAfterChecksEvenIfNotHealthy:map[type:boolean] crashCollector:map[properties:map[disable:map[type:boolean]] type:object] dashboard:map[properties:map[enabled:map[type:boolean] port:map[maximum:65535 minimum:0 type:integer] ssl:map[type:boolean] urlPrefix:map[type:string]] type:object] dataDirHostPath:map[pattern:^/(\\S+) type:string] disruptionManagement:map[properties:map[machineDisruptionBudgetNamespace:map[type:string] manageMachineDisruptionBudgets:map[type:boolean] managePodBudgets:map[type:boolean] osdMaintenanceTimeout:map[type:integer] pgHealthCheckTimeout:map[type:integer]] type:object] driveGroups:map[items:map[properties:map[name:map[type:string] placement:map[nullable:true 
type:object x-kubernetes-preserve-unknown-fields:true] spec:map[type:object x-kubernetes-preserve-unknown-fields:true]] required:[name spec] type:object] nullable:true type:array] external:map[properties:map[enable:map[type:boolean]] type:object] healthCheck:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] labels:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] logCollector:map[properties:map[enabled:map[type:boolean] periodicity:map[type:string]] type:object] mgr:map[properties:map[modules:map[items:map[properties:map[enabled:map[type:boolean] name:map[type:string]] type:object] type:array]] type:object] mon:map[properties:map[allowMultiplePerNode:map[type:boolean] count:map[maximum:9 minimum:0 type:integer] stretchCluster:map[nullable:true properties:map[failureDomainLabel:map[type:string] subFailureDomain:map[type:string] zones:map[items:map[properties:map[arbiter:map[type:boolean] name:map[type:string] volumeClaimTemplate:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object] type:array]] type:object] volumeClaimTemplate:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object] monitoring:map[properties:map[enabled:map[type:boolean] externalMgrEndpoints:map[items:map[properties:map[ip:map[type:string]] type:object] type:array] externalMgrPrometheusPort:map[maximum:65535 minimum:0 type:integer] rulesNamespace:map[type:string]] type:object] network:map[nullable:true properties:map[hostNetwork:map[type:boolean] ipFamily:map[type:string] provider:map[type:string] selectors:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true]] type:object] placement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] priorityClassNames:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] removeOSDsIfOutAndSafeToRemove:map[type:boolean] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] 
security:map[properties:map[kms:map[properties:map[connectionDetails:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] tokenSecretName:map[type:string]] type:object]] type:object] skipUpgradeChecks:map[type:boolean] storage:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] deviceFilter:map[nullable:true type:string] devicePathFilter:map[type:string] devices:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] fullPath:map[type:string] name:map[type:string]] type:object] type:array] disruptionManagement:map[nullable:true properties:map[machineDisruptionBudgetNamespace:map[type:string] manageMachineDisruptionBudgets:map[type:boolean] managePodBudgets:map[type:boolean] osdMaintenanceTimeout:map[type:integer] pgHealthCheckTimeout:map[type:integer]] type:object] nodes:map[items:map[properties:map[config:map[nullable:true properties:map[databaseSizeMB:map[type:string] encryptedDevice:map[pattern:^(true|false)$ type:string] journalSizeMB:map[type:string] metadataDevice:map[type:string] osdsPerDevice:map[type:string] storeType:map[pattern:^(bluestore)$ type:string] walSizeMB:map[type:string]] type:object] deviceFilter:map[nullable:true type:string] devicePathFilter:map[type:string] devices:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] fullPath:map[type:string] name:map[type:string]] type:object] type:array] name:map[type:string] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] useAllDevices:map[type:boolean]] type:object] nullable:true type:array] storageClassDeviceSets:map[items:map[properties:map[config:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] count:map[format:int32 type:integer] encrypted:map[type:boolean] name:map[type:string] placement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] 
portable:map[type:boolean] preparePlacement:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] resources:map[nullable:true type:object x-kubernetes-preserve-unknown-fields:true] schedulerName:map[type:string] tuneDeviceClass:map[type:boolean] tuneFastDeviceClass:map[type:boolean] volumeClaimTemplates:map[items:map[type:object x-kubernetes-preserve-unknown-fields:true] type:array]] type:object] nullable:true type:array] useAllDevices:map[type:boolean] useAllNodes:map[type:boolean]] type:object]] type:object] status:map[type:object x-kubernetes-preserve-unknown-fields:true]] type:object]] served:true storage:true subresources:map[status:map[]]]]]]": cannot convert int64 to float64

Helm

helm upgrade -i rook-ceph rook-release/rook-ceph \
		--namespace=rook-ceph --create-namespace \
		--version=v1.5.5
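
Since the upgrade fails only partially, it can be useful to see how far it got. The sketch below is one way to look at that, under stated assumptions: the operator Deployment is assumed to be named `rook-ceph-operator`, the release and namespace are `rook-ceph` as in the command above, and `image_tag` and `show_upgrade_state` are hypothetical helper names. `image_tag` simply takes everything after the last colon, so it expects an image reference that carries a tag.

```shell
#!/bin/sh
# Hypothetical helper: print the tag part of an image reference,
# e.g. "rook/ceph:v1.5.5" -> "v1.5.5". Assumes the reference has a tag.
image_tag() {
    printf '%s\n' "${1##*:}"
}

# Hypothetical helper: show the state left behind by the failed upgrade.
# Needs kubectl and helm with access to the cluster.
show_upgrade_state() {
    ns=${1:-rook-ceph}
    # Assumption: the operator Deployment is named rook-ceph-operator.
    image=$(kubectl -n "$ns" get deployment rook-ceph-operator \
        -o jsonpath='{.spec.template.spec.containers[0].image}') || return 1
    echo "operator image tag: $(image_tag "$image")"
    # Release history, including the status of the latest revision.
    helm history rook-ceph -n "$ns"
    # The CRD that Helm could not patch.
    kubectl get crd cephclusters.ceph.rook.io
}
```

Running `show_upgrade_state` (or `show_upgrade_state <namespace>`) against the cluster prints the operator tag first, then the release history and the CRD entry.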

Expected behavior:

How to reproduce it (minimal and precise):

  • Create a Kubernetes 1.19.4 cluster (in my case with RKE)
  • Install Rook 1.5.1 with Helm
  • Upgrade Kubernetes to 1.19.6 (with RKE)
  • Upgrade rook/ceph to 1.5.5 with Helm

File(s) to submit: CephCluster:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph

spec:

  cephVersion:
    image: ceph/ceph:v15.2.5
    allowUnsupported: false

  dataDirHostPath: /var/lib/rook

  mon:
    count: 3
    allowMultiplePerNode: false

  dashboard:
    enabled: true

  network:
    hostNetwork: true

  resources:
    mgr:
      requests:
        cpu: "200m"
        memory: "512Mi"
      limits:
        cpu: "500m"
        memory: "1024Mi"
    mon:
      requests:
        cpu: "150m"
        memory: "1024Mi"
      limits:
        cpu: "500m"
        memory: "1024Mi"
    osd:
      requests:
        cpu: "250m"
        memory: "3000Mi"
      limits:
        cpu: "1000m"
        memory: "4000Mi"
  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter: sda2
    config:
      osdsPerDevice: "1"
    nodes:
    - name: worker-1.<redacted cluster name>.<redacted cluster domain>
    - name: worker-2.<redacted cluster name>.<redacted cluster domain>
    - name: worker-3.<redacted cluster name>.<redacted cluster domain>

Environment:

  • OS (e.g. from /etc/os-release): irrelevant
  • Helm Version: 3.5.0
  • Kernel (e.g. uname -a): irrelevant
  • Cloud provider or hardware configuration: Hetzner Cloud
  • Rook version (use rook version inside of a Rook Pod): v1.5.1/v1.5.5
  • Storage backend version (e.g. for ceph do ceph -v): irrelevant
  • Kubernetes version (use kubectl version): 1.19.4/1.19.6
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): RKE (without Rancher)
  • Storage backend status: All OK, even after partially failed upgrade (for now, only been a couple hours)

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 25 (6 by maintainers)

Most upvoted comments

@travisn I’ll try to reproduce it on 1.20 in our automated clusters & CI pipelines tomorrow, but I’ll first have to get an RKE test version running, as the release version only supports 1.19.