rancher: [v2.2.4] Upgrade fails for setups with clusters having OpenStack CloudProvider with a LoadBalancer config: `cannot unmarshal number into Go value of type string`

What kind of request is this (question/bug/enhancement/feature request): Bug

Steps to reproduce (least amount of steps as possible):

Upgrade from v2.2.3 to v2.2.4

Result:

Rancher Server fails to start with the following error message:

E0606 07:39:20.296926       8 reflector.go:134] github.com/rancher/norman/controller/generic_controller.go:175: Failed to list *v3.Cluster: json: cannot unmarshal number into Go value of type string
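For readers unfamiliar with this class of error, here is a minimal sketch of how it arises in Go. The `lbOpts` struct and `decode` helper are hypothetical stand-ins, not Rancher's actual types; the `monitor-delay` key is taken from the comments in this thread. The sketch assumes the field is declared as a string in the new version while the stored cluster object still holds a bare number. The exact wording of the error depends on the JSON library in use, so it may differ slightly from the log line above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lbOpts is a hypothetical stand-in for the load balancer options.
// The field is a string, as the error message suggests it is in v2.2.4.
type lbOpts struct {
	MonitorDelay string `json:"monitor-delay"`
}

// decode parses a JSON document into lbOpts and returns any decoding error.
func decode(doc string) (lbOpts, error) {
	var o lbOpts
	err := json.Unmarshal([]byte(doc), &o)
	return o, err
}

func main() {
	// A bare number cannot be decoded into a string field.
	_, err := decode(`{"monitor-delay": 30}`)
	fmt.Println("number:", err)

	// A quoted value with a unit decodes without error.
	o, err := decode(`{"monitor-delay": "30s"}`)
	fmt.Println("string:", o.MonitorDelay, err)
}
```

The first call returns a non-nil error and the second succeeds, which matches the behavior reported here: stored numeric values break decoding until they are rewritten as strings.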

Other details that may be helpful: After a Downgrade to v2.2.3 Rancher is running fine again

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): v2.2.4
  • Installation option (single install/HA): Single Installation

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 44 (10 by maintainers)

Most upvoted comments

@alena1108 Thank you very much! You are absolutely right. Changing the values of monitor-delay and monitor-timeout manually by changing the cluster specs (kubectl edit cluster c-xxx) has fixed the problem (at least for me).

I went back and changed both those to "1s" and have the same problem: reflector.go:134] github.com/rancher/norman/controller/generic_controller.go:175: Failed to list *v3.Cluster: json: cannot unmarshal number into Go value of type string

EDIT: I win the “learn to read” prize today, there were multiple loadbalancer: fields in there so there were multiple instances of each monitor line. Changing all of them fixed the issue.

Steps to mitigate:

* Run kubectl against the Rancher management plane and find the affected cluster
* kubectl edit cluster clustername --namespace clusternamespace
* Find each affected param (for example monitor-delay and monitor-timeout) in both the cluster spec and the cluster status, quote it, and add a unit to it. Example: if it was 30, change it to "30s"; if it was 0, change it to ""
* Save and exit the editor
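The conversion rule in the steps above can be sketched as a small Go function. `toDurationString` is a hypothetical helper written for illustration, not part of Rancher or kubectl, and it assumes the old numeric values were whole seconds, as the "30" to "30s" example implies.

```go
package main

import (
	"fmt"
	"strconv"
)

// toDurationString maps an old numeric setting to the string form
// described in the mitigation steps: 0 becomes "", anything else
// gets an "s" (seconds) suffix. The seconds unit is an assumption.
func toDurationString(seconds int) string {
	if seconds == 0 {
		return ""
	}
	return strconv.Itoa(seconds) + "s"
}

func main() {
	for _, v := range []int{30, 0} {
		fmt.Printf("%d -> %q\n", v, toDurationString(v))
	}
	// Prints:
	// 30 -> "30s"
	// 0 -> ""
}
```

As noted in the comments above, every occurrence of the field has to be converted, including those under repeated loadbalancer: sections.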

For anyone else seeing the same issue as @pameladelgado, we fixed it using: https://github.com/rancher/rancher/issues/20909#issuecomment-521352246

@DJAyth can you try the steps mentioned by @kinarashah? She specifically tested them for the HA use case:

In 2.2.3, those values were 0
I upgraded to 2.2.4, upgrade fails
I edited clusters to make them ""
pods start running and I can access the login page

If you change the value while still on 2.2.3, the 2.2.3 UI will fail to load because it doesn’t understand the string value. I’ve updated my initial comment to reflect this info. Please let us know if that works for you.

Thanks @ptrunk, I or someone on my team will investigate further.