vsphere-csi-driver: CnsFault error: VSLM task failed
Is this a BUG REPORT or FEATURE REQUEST?:
Uncomment only one, leave it on its own line:
/kind bug
What happened: I am new to this. I followed the documentation without problems up to this step, and then an error was reported when creating the PV.
{"level":"info","time":"2021-11-12T19:42:02.055011665Z","caller":"volume/manager.go:407","msg":"CreateVolume: VolumeName: \"pvc-bca23169-66d8-4685-b045-1fd47397e619\", opId: \"06ed1681\"","TraceId":"353215b1-a18c-4432-a454-a5994d1b3ee9"}
{"level":"info","time":"2021-11-12T19:42:02.055060708Z","caller":"volume/util.go:364","msg":"Extract vimfault type: +types.CnsFault vimFault: +{<nil> VSLM task failed} Fault: &{DynamicData:{} Fault:{BaseMethodFault:<nil> Reason:VSLM task failed} LocalizedMessage:CnsFault error: VSLM task failed} from resp: +&{{} {{} } 0xc000701e80}","TraceId":"353215b1-a18c-4432-a454-a5994d1b3ee9"}
{"level":"error","time":"2021-11-12T19:42:02.055090708Z","caller":"volume/util.go:291","msg":"failed to create volume with fault: \"(*types.LocalizedMethodFault)(0xc000701e80)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) <nil>,\\n Reason: (string) (len=16) \\\"VSLM task failed\\\"\\n },\\n LocalizedMessage: (string) (len=32) \\\"CnsFault error: VSLM task failed\\\"\\n})\\n\"","TraceId":"353215b1-a18c-4432-a454-a5994d1b3ee9","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.validateCreateVolumeResponseFault\n\t/build/pkg/common/cns-lib/volume/util.go:291\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).createVolumeWithImprovedIdempotency\n\t/build/pkg/common/cns-lib/volume/manager.go:424\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).CreateVolume.func1\n\t/build/pkg/common/cns-lib/volume/manager.go:567\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).CreateVolume\n\t/build/pkg/common/cns-lib/volume/manager.go:572\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/common.CreateBlockVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:242\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).createBlockVolume\n\t/build/pkg/csi/service/vanilla/controller.go:541\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:830\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume\n\t/build/pkg/csi/service/vanilla/controller.go:832\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_CreateVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/spec@v1.4.0/lib/go/csi/csi.pb.go:5589\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1024\ngoogle.golang.org/grpc.(*Server).h
andleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:722"} {"level":"error","time":"2021-11-12T19:42:02.059953685Z","caller":"common/vsphereutil.go:244","msg":"failed to create disk pvc-bca23169-66d8-4685-b045-1fd47397e619 with error failed to create volume with fault: \"(*types.LocalizedMethodFault)(0xc000701e80)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) <nil>,\\n Reason: (string) (len=16) \\\"VSLM task failed\\\"\\n },\\n LocalizedMessage: (string) (len=32) \\\"CnsFault error: VSLM task failed\\\"\\n})\\n\" faultType \"vim.fault.CnsFault\"","TraceId":"353215b1-a18c-4432-a454-a5994d1b3ee9","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/common.CreateBlockVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:244\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).createBlockVolume\n\t/build/pkg/csi/service/vanilla/controller.go:541\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:830\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume\n\t/build/pkg/csi/service/vanilla/controller.go:832\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_CreateVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/spec@v1.4.0/lib/go/csi/csi.pb.go:5589\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:722"} 
{"level":"error","time":"2021-11-12T19:42:02.060009157Z","caller":"vanilla/controller.go:544","msg":"failed to create volume. Error: failed to create volume with fault: \"(*types.LocalizedMethodFault)(0xc000701e80)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) <nil>,\\n Reason: (string) (len=16) \\\"VSLM task failed\\\"\\n },\\n LocalizedMessage: (string) (len=32) \\\"CnsFault error: VSLM task failed\\\"\\n})\\n\"","TraceId":"353215b1-a18c-4432-a454-a5994d1b3ee9","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).createBlockVolume\n\t/build/pkg/csi/service/vanilla/controller.go:544\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:830\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).CreateVolume\n\t/build/pkg/csi/service/vanilla/controller.go:832\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_CreateVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/spec@v1.4.0/lib/go/csi/csi.pb.go:5589\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:722"}
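The error entries above all share one TraceId and repeat the same fault at each level of the call stack. For anyone wading through similar driver logs, here is a minimal Python sketch that groups error-level entries by TraceId. It assumes only the field names visible in the lines above (level, time, caller, msg, TraceId); the function names are mine and are not part of the driver.

```python
import json
from collections import defaultdict


def iter_json_objects(text):
    """Yield every JSON object found in text, even when several share a line."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not a complete JSON object here; keep scanning
            continue
        pos = end
        if isinstance(obj, dict):
            yield obj


def errors_by_trace(log_text, max_len=80):
    """Group (time, caller, truncated msg) of error-level entries by TraceId."""
    grouped = defaultdict(list)
    for entry in iter_json_objects(log_text):
        if entry.get("level") != "error":
            continue
        msg = str(entry.get("msg", ""))[:max_len]
        grouped[entry.get("TraceId")].append(
            (entry.get("time"), entry.get("caller"), msg)
        )
    return dict(grouped)
```

Feeding it the saved controller log text (for example, the contents of a file written by kubectl logs) should give one list per TraceId, which makes it easier to see that the three errors here are one failure reported three times.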
What you expected to happen:
How to reproduce it (as minimally and precisely as possible): This is only a test environment. I built the vSAN on a single node.
Anything else we need to know?:
Environment:
- csi-vsphere version:
- vsphere-cloud-controller-manager version: image: gcr.io/cloud-provider-vsphere/cpi/release/manager:v1.21.1
- Kubernetes version: 1.21.5
- vSphere version: 7.0.3.00100
- OS (e.g. from /etc/os-release): Photon OS 4.0
- Kernel (e.g. uname -a): 5.10
- Install tools:
- Others:
About this issue
- State: closed
- Created 3 years ago
- Comments: 20 (2 by maintainers)
Enabling Changed Block Tracking (CBT) did the trick for us too. Thanks a bunch.
Solution: Adding the ctkEnabled = "TRUE" flag to the Rancher node template fixes it for all new nodes.
Hi all,
VMware support resolved (or worked around) this issue by manually enabling Changed Block Tracking (CBT) on our existing worker nodes: https://kb.vmware.com/s/article/1020128
"In some cases, such as a power failure or hard shutdown while virtual machines are powered on, CBT might reset and lose track of incremental changes."
According to VMware support, CBT is sometimes reset (disabled) after a power failure or hard shutdown of powered-on VMs. In our case, we had a few PSODs on our ESXi hosts caused by issues in ESXi 7.0 Update 3b.
I hope this helps @Moezenka @torbendury @lakxtxue @jwhb
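If you want to check whether CBT is still set on a worker node, one option is to look at the VM's .vmx file. Below is a rough Python sketch that reads the text of a .vmx file and reports the VM-level ctkEnabled flag plus any per-disk flags. The per-disk key form (for example scsi0:0.ctkEnabled) is my reading of the CBT KB article linked above, so treat it as an assumption, and the function names are purely illustrative.

```python
import re

# Matches lines of the form: key = "value"
VMX_LINE = re.compile(r'^\s*([^\s=#]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Parse key = "value" lines into a dict with lower-cased keys."""
    settings = {}
    for line in text.splitlines():
        match = VMX_LINE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings


def cbt_status(vmx_text):
    """Return (vm_level_enabled, {disk: enabled}) based on ctkEnabled keys."""
    settings = parse_vmx(vmx_text)
    vm_level = settings.get("ctkenabled", "").upper() == "TRUE"
    suffix = ".ctkenabled"
    disks = {
        key[: -len(suffix)]: value.upper() == "TRUE"
        for key, value in settings.items()
        if key.endswith(suffix)  # assumed per-disk form, e.g. scsi0:0.ctkEnabled
    }
    return vm_level, disks
```

This only reads the file contents. It does not change any setting, and it says nothing about whether CBT is enabled on the FCD itself.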
@ThoSap this also seems to work for us.
I am experiencing the same problem and I have opened a ticket with VMware.
I was told that the engineering team suspects that this is a vCenter issue and that it should be solved in the coming versions.
I had this issue too, and the second resolution works like a charm in my case: https://kb.vmware.com/s/article/88193
I think people who had success with setting ctkEnabled=TRUE were dealing with the issue described here: https://kb.vmware.com/s/article/88193. In short, attaching an FCD with CBT enabled to a VM with CBT disabled won't work.
In my case, I found this issue after getting the same "VSLM task failed" error in the output of kubectl get events. Unfortunately, setting ctkEnabled=TRUE didn't fix it for me. What did work was a rescan of storage in my ESXi cluster (in the inventory, right-click the ESXi cluster -> Storage -> Rescan Storage).
I strongly suggest that anyone stumbling on this issue because of a "VSLM task failed" error first try a storage rescan. Bear in mind that if you do decide to try setting ctkEnabled=TRUE, disabling it again requires manual work for each attached FCD, as detailed in the KB link above.
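To confirm you are hitting the same error before trying either fix, you can filter the cluster events for the fault string. Here is a small Python sketch that takes the JSON produced by kubectl get events -A -o json and lists the objects whose event message contains "VSLM task failed". The function name is made up for illustration; the fields it reads (items, message, reason, involvedObject) are the standard Event fields.

```python
import json


def find_vslm_failures(events_json, needle="VSLM task failed"):
    """Return (kind, namespace, name, reason) for events mentioning the needle."""
    data = json.loads(events_json)
    hits = []
    for item in data.get("items", []):
        message = item.get("message") or ""
        if needle not in message:
            continue
        obj = item.get("involvedObject") or {}
        hits.append(
            (obj.get("kind"), obj.get("namespace"), obj.get("name"), item.get("reason"))
        )
    return hits
```

For example, save the kubectl output to a file, read it into a string, and pass it to find_vslm_failures. An empty result suggests your provisioning failure has a different cause than the one discussed in this thread.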