rancher: [BUG] rancher-vsphere-cpi generates broken YAML configuration, fails to update

Rancher Server Setup

  • Rancher version: 2.6.11
  • Installation option (Docker install/Helm Chart): Helm Chart

Information about the Cluster

  • Kubernetes version: v1.24.9
  • Cluster Type (Local/Downstream): Downstream, rancher launched RKE1 on vSphere

User Information

  • What is the role of the user logged in? Global Administrator

Describe the bug

After upgrading from Rancher 2.6.10 to 2.6.11, we attempted to upgrade the rancher-vsphere-cpi chart from 100.4.0 to 100.5.0. The rancher-vsphere-cpi-cloud-controller-manager pods started to crash with the following error:

0309 17:49:40.816427       1 config.go:69] ReadCPIConfigYAML failed: yaml: line 13: did not find expected alphabetic or numeric character
E0309 17:49:40.816453       1 config.go:73] ReadConfigINI failed: 3:1: expected section header
F0309 17:49:40.816465       1 main.go:265] Cloud provider could not be initialized: could not init cloud provider "vsphere": 3:1: expected section header
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0x1)
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:860 +0x8a
k8s.io/klog/v2.(*loggingT).output(0x3da4f00, 0x3, 0x0, 0xc00040ce00, 0x1, {0x2f8899e, 0x1}, 0x3da5a60, 0x0)
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:825 +0x686
k8s.io/klog/v2.(*loggingT).printfDepth(0x3da4f00, 0xb08ae5b8, 0x0, {0x0, 0x0}, 0x0, {0x257e076, 0x2b}, {0xc0002567f0, 0x1, ...})
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:630 +0x1f2
k8s.io/klog/v2.(*loggingT).printf(...)
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:612
k8s.io/klog/v2.Fatalf(...)
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:1516
main.initializeCloud(0xc000612528, {0x7ffe955b45a6, 0xc0001764e8})
	/build/cmd/vsphere-cloud-controller-manager/main.go:265 +0xf4
main.main.func6(0xc0002ab390, {0xc0002abf70, 0x1a95a65, 0x161db00})
	/build/cmd/vsphere-cloud-controller-manager/main.go:177 +0x50c
main.main.func7(0xc0002b0500, {0xc0009453b0, 0x0, 0x3})
	/build/cmd/vsphere-cloud-controller-manager/main.go:211 +0x23e
github.com/spf13/cobra.(*Command).execute(0xc0002b0500, {0xc0000dc050, 0x3, 0x3})
	/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:860 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0xc0002b0500)
	/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902
main.main()
	/build/cmd/vsphere-cloud-controller-manager/main.go:214 +0x8c5

goroutine 7 [sleep]:
time.Sleep(0x6fc23ac00)
	/usr/local/go/src/runtime/time.go:193 +0x12e
sigs.k8s.io/controller-runtime/pkg/log.init.0.func1()
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/log/log.go:63 +0x38
created by sigs.k8s.io/controller-runtime/pkg/log.init.0
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.1/pkg/log/log.go:62 +0x25

goroutine 125 [select]:
k8s.io/klog/v2.(*flushDaemon).run.func1()
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:1045 +0x125
created by k8s.io/klog/v2.(*flushDaemon).run
	/go/pkg/mod/k8s.io/klog/v2@v2.60.1/klog.go:1041 +0x17d

goroutine 130 [chan receive]:
k8s.io/client-go/util/workqueue.(*Type).updateUnfinishedWorkLoop(0xc00028ce40)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:271 +0xa7
created by k8s.io/client-go/util/workqueue.newQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:63 +0x1af

goroutine 131 [select]:
k8s.io/client-go/util/workqueue.(*delayingType).waitingLoop(0xc0003a0060)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:233 +0x34e
created by k8s.io/client-go/util/workqueue.newDelayingQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:70 +0x247

goroutine 132 [chan receive]:
k8s.io/client-go/util/workqueue.(*Type).updateUnfinishedWorkLoop(0xc0003a00c0)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:271 +0xa7
created by k8s.io/client-go/util/workqueue.newQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:63 +0x1af

goroutine 133 [select]:
k8s.io/client-go/util/workqueue.(*delayingType).waitingLoop(0xc0003a01e0)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:233 +0x34e
created by k8s.io/client-go/util/workqueue.newDelayingQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:70 +0x247

goroutine 134 [chan receive]:
k8s.io/client-go/util/workqueue.(*Type).updateUnfinishedWorkLoop(0xc0003a03c0)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:271 +0xa7
created by k8s.io/client-go/util/workqueue.newQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/queue.go:63 +0x1af

goroutine 135 [select]:
k8s.io/client-go/util/workqueue.(*delayingType).waitingLoop(0xc0003a05a0)
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:233 +0x34e
created by k8s.io/client-go/util/workqueue.newDelayingQueue
	/go/pkg/mod/k8s.io/client-go@v0.24.1/util/workqueue/delaying_queue.go:70 +0x247

goroutine 141 [chan receive]:
k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop(0xc0000dd7c0)
	/go/pkg/mod/k8s.io/apimachinery@v0.24.1/pkg/watch/mux.go:247 +0x49
created by k8s.io/apimachinery/pkg/watch.NewLongQueueBroadcaster
	/go/pkg/mod/k8s.io/apimachinery@v0.24.1/pkg/watch/mux.go:89 +0x11b

goroutine 139 [IO wait]:
internal/poll.runtime_pollWait(0x7f5e89a99ac8, 0x72)
	/usr/local/go/src/runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc0005a9380, 0xc0008fa000, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0005a9380, {0xc0008fa000, 0x16e1, 0x16e1})
	/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0005a9380, {0xc0008fa000, 0xc0008fa005, 0xa83})
	/usr/local/go/src/net/fd_posix.go:56 +0x29
net.(*conn).Read(0xc0002bc010, {0xc0008fa000, 0x506ade, 0xc00058f7f0})
	/usr/local/go/src/net/net.go:183 +0x45
crypto/tls.(*atLeastReader).Read(0xc0003ccd38, {0xc0008fa000, 0x0, 0x409ccd})
	/usr/local/go/src/crypto/tls/conn.go:777 +0x3d
bytes.(*Buffer).ReadFrom(0xc0000cc278, {0x288a200, 0xc0003ccd38})
	/usr/local/go/src/bytes/buffer.go:204 +0x98
crypto/tls.(*Conn).readFromUntil(0xc0000cc000, {0x2891640, 0xc0002bc010}, 0x16e1)
	/usr/local/go/src/crypto/tls/conn.go:799 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc0000cc000, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:606 +0x112
crypto/tls.(*Conn).readRecord(...)
	/usr/local/go/src/crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0xc0000cc000, {0xc0007ef000, 0x1000, 0x8ac000})
	/usr/local/go/src/crypto/tls/conn.go:1277 +0x16f
bufio.(*Reader).Read(0xc0002ba780, {0xc00054c4a0, 0x9, 0x8c8ec2})
	/usr/local/go/src/bufio/bufio.go:227 +0x1b4
io.ReadAtLeast({0x288a040, 0xc0002ba780}, {0xc00054c4a0, 0x9, 0x9}, 0x9)
	/usr/local/go/src/io/io.go:328 +0x9a
io.ReadFull(...)
	/usr/local/go/src/io/io.go:347
golang.org/x/net/http2.readFrameHeader({0xc00054c4a0, 0x9, 0xc0013f4c60}, {0x288a040, 0xc0002ba780})
	/go/pkg/mod/golang.org/x/net@v0.3.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00054c460)
	/go/pkg/mod/golang.org/x/net@v0.3.0/http2/frame.go:498 +0x95
golang.org/x/net/http2.(*clientConnReadLoop).run(0xc00058ff98)
	/go/pkg/mod/golang.org/x/net@v0.3.0/http2/transport.go:2229 +0x130
golang.org/x/net/http2.(*ClientConn).readLoop(0xc0002b8600)
	/go/pkg/mod/golang.org/x/net@v0.3.0/http2/transport.go:2124 +0x6f
created by golang.org/x/net/http2.(*Transport).newClientConn
	/go/pkg/mod/golang.org/x/net@v0.3.0/http2/transport.go:821 +0xc78

goroutine 142 [chan receive]:
k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1()
	/go/pkg/mod/k8s.io/client-go@v0.24.1/tools/record/event.go:304 +0x98
created by k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher
	/go/pkg/mod/k8s.io/client-go@v0.24.1/tools/record/event.go:302 +0x91

goroutine 143 [chan receive]:
k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1()
	/go/pkg/mod/k8s.io/client-go@v0.24.1/tools/record/event.go:304 +0x98
created by k8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher
	/go/pkg/mod/k8s.io/client-go@v0.24.1/tools/record/event.go:302 +0x91

Here is the ConfigMap generated from the chart; please pay close attention to the comment at the top of the YAML file:

apiVersion: v1
data:
  vsphere.yaml: >-
    # Global properties in this section will be used for all specified vCenters
    unless overriden in VirtualCenter section.


    global:
      secretName: "vsphere-cpi-creds"
      secretNamespace: "kube-system"
      port: 443
      insecureFlag: true

    vcenter:
      "_redacted_":
        server: "_redacted_"
        user: _redacted_
        password: _redacted_
        datacenters:
          - "_redacted_"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: vsphere-cpi
    meta.helm.sh/release-namespace: kube-system
  labels:
    app.kubernetes.io/managed-by: Helm
    component: rancher-vsphere-cpi-cloud-controller-manager
    vsphere-cpi-infra: config
  name: vsphere-cloud-config
  namespace: kube-system

I’m not sure how that word wrapping is happening, but it absolutely appears in the file on disk. Also, our password contains special characters, and neither the username nor the password is quoted in there.
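For illustration only (the vCenter name and credentials below are hypothetical, not our real values), this is roughly what I would expect the relevant part of the rendered vsphere.yaml to look like if the chart quoted the credentials:

vcenter:
  "vc01.example.com":
    server: "vc01.example.com"
    user: "svc-cpi@vsphere.local"
    # Quoting keeps characters such as * and $ from being interpreted as YAML syntax.
    password: "S*cr$t-example"
    datacenters:
      - "DC01"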

To Reproduce

Upgrade the rancher-vsphere-cpi chart to 100.5.0.

Result

The cloud controller pods cannot start and crash loop.

Expected Result

The new containers start successfully.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 23 (12 by maintainers)

Most upvoted comments

@ikogan Sorry, could you clarify what causes them to start?

Fixes the issue: if I replace all instances of * and $ in the password, everything works.
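My guess at why those two characters matter (not confirmed by the maintainers): in YAML, an unquoted value that starts with * is scanned as an alias, and a character like $ inside the alias name produces exactly the "did not find expected alphabetic or numeric character" error from the log. A sketch with hypothetical values:

vcenter:
  "vc01.example.com":
    user: svc-cpi
    # Unquoted, this is parsed as an alias reference and the parser chokes on "$".
    password: *S$ecret
    # Quoted, the value stays a literal string:
    # password: "*S$ecret"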

Debugging this caused me to discover a bug in the Dashboard (which has been filed at https://github.com/rancher/dashboard/issues/8427).

When you view a resource via Edit as YAML, a long comment is soft-wrapped and given a new line number even though there is no line break character present. That can be misleading in this issue, where screenshots show a line break that may not actually be present.

I would suggest using kubectl get configmap ... to view the source and see what is present for the YAML comment.
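For what it's worth, because the value is a >- folded block scalar, the two displayed lines of that comment should fold back into a single line when the ConfigMap is rendered to disk, roughly:

# Global properties in this section will be used for all specified vCenters unless overriden in VirtualCenter section.

global:
  secretName: "vsphere-cpi-creds"

so the line break shown in the Dashboard is probably not what the parser is choking on.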

Well, it sort of is, since we have to change that password now in both our vCenter and all of the clusters that use it. That’s…not ideal.

It’s used by the CSI driver too, which means we need to change it and update it everywhere before something needs to provision or attach a PVC.