k8s-config-connector: ComputeDisk fails to be created/updated successfully due to "immutable field(s): [interface]"
Checklist
- I did not find a related open issue.
- I did not find a solution in the troubleshooting guide: (https://cloud.google.com/config-connector/docs/troubleshooting)
- If this issue is time-sensitive, I have submitted a corresponding issue with GCP support.
Bug Description
ComputeDisk resources fail to be created/updated properly, due to "Update call failed: cannot make changes to immutable field(s): [interface]".
Additional Diagnostic Information
Kubernetes Cluster Version
Client Version: v1.21.2
Server Version: v1.20.8-gke.900
Config Connector Version
1.52.0
Config Connector Mode
namespaced
Log Output
cnrm-controller-manager-xxx-0 manager {"severity":"error","logger":"controller-runtime.manager.controller.computedisk-controller","msg":"Reconciler error","reconciler group":"compute.cnrm.cloud.google.com","reconciler kind":"ComputeDisk","name":"<disk>","namespace":"<namespace>","error":"Update call failed: cannot make changes to immutable field(s): [interface]"}
Steps to Reproduce
Steps to reproduce the issue
- Apply a ComputeDisk template as below (either from scratch or for an existing resource).
- Observe that the resource never becomes ready:
  kubectl wait --for condition=Ready ComputeDisk <disk> -n <namespace> --timeout 60s
- Check the status of the resource:
  # kubectl get computedisk -n <namespace> <disk> -o jsonpath='{.status}'
  {"conditions":[{"lastTransitionTime":"2021-07-30T15:20:37Z","message":"Update call failed: cannot make changes to immutable field(s): [interface]","reason":"UpdateFailed","status":"False","type":"Ready"}],"observedGeneration":1}
YAML snippets
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeDisk
metadata:
  name: <disk>
  namespace: <namespace>
  annotations:
    cnrm.cloud.google.com/deletion-policy: abandon
spec:
  location: <region>
  replicaZones:
    - https://www.googleapis.com/compute/v1/projects/<project>/zones/<zone1>
    - https://www.googleapis.com/compute/v1/projects/<project>/zones/<zone2>
  resourcePolicies:
    - name: <snapshot-policy>
      namespace: <snapshot-policy-namespace>
  size: 200
Explicitly setting the default interface: SCSI field on this resource doesn’t help.
Also of note is that we have another cluster with the same versions of everything, and the same templates don’t fail there.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 34 (13 by maintainers)
@toumorokoshi any update on this? After my fix to set the interface to interface: '', it fixed the issue for about a week and now it’s back (failing to reconcile). Do I just need to revert my fix or something? gcloud compute disks describe still shows no interface field on the disk for me.

As an update here: we’ve modified Config Connector to no longer honor the field (as the API didn’t really use it in the first place).
This issue should be resolved now with version 1.63.0. It’ll take a while for the add-on to catch up, but if you can install manually it’ll resolve the issue.
I’ll note this as closed since there is a working version out, but re-open if you see it’s still happening with 1.63 or beyond.
Got it, thanks for confirming! That isn’t great, but at least we now have confirmation on the behavior.
As you can imagine, Config Connector needs the API to behave in a consistent manner, even if just so that we can work around its behavior consistently. I’m going to connect your internal issue to the compute disk team, and hopefully we can get a good answer which will let us move forward on a fix (whatever that turns out to be).
@williambrode come to think of it, you’re right! I get the same error trying to apply the config now. I’m pretty sure that the initial behavior was on wait and not on the apply though, since it temporarily “resolved” once I’ve removed the “wait” (and we even closed the Google support ticket, because this “workaround” was found). Then, it started failing again after a few days, so something must’ve changed, maybe even on the API side? 🤷 It did seem very mysterious…

@toumorokoshi thanks, I’ll respond with more info on Mon!
Yes, that makes sense. Just for reference - this is the error I got when trying to apply the config to change the interface field. Not sure why you didn’t run into this.

Adding cnrm.cloud.google.com/deletion-policy: abandon and then deleting and recreating the ComputeDisk config worked for me. And it looks like setting interface: '' fixed my Config Connector errors like it did yours.

Hi!
I’ve gotten word back from the Compute Disk API team that “UNSPECIFIED” is a valid enum value. This is unfortunately not documented, so it may be hard to point to an official public source about the existence of this value.
We probably have to do some work on the Config Connector side for this edge case. As @maqiuyujoyce mentioned, we’ll discuss a long-term fix, but glad to hear the short-term fix of using an empty string works.
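
For readers landing here, the short-term workaround described in this thread might look roughly like the manifest below. This is only a sketch assembled from the template in the bug description plus the interface: '' setting that commenters reported; the placeholders are the same ones used above, and the placement of interface directly under spec is an assumption based on how the field is referred to in this issue.

```yaml
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeDisk
metadata:
  name: <disk>
  namespace: <namespace>
  annotations:
    # Keeps the underlying GCP disk when the Kubernetes object is deleted,
    # so the ComputeDisk config can be deleted and recreated safely.
    cnrm.cloud.google.com/deletion-policy: abandon
spec:
  # Workaround reported in this thread: an explicit empty string
  # instead of omitting the field or setting SCSI.
  interface: ''
  location: <region>
  replicaZones:
    - https://www.googleapis.com/compute/v1/projects/<project>/zones/<zone1>
    - https://www.googleapis.com/compute/v1/projects/<project>/zones/<zone2>
  size: 200
```

Note that at least one commenter saw the error return about a week after applying this, so it should be treated as a stopgap; per the maintainers, version 1.63.0 no longer honors the field at all.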
The choice for the string pointer was to differentiate between leaving the value unspecified (at which point Config Connector will leave the field unmanaged, and should use the value from the API) or set to an explicit value (set the value in the GCP API).
In this case I think the issue is that Config Connector doesn’t understand how to handle interfaces that return back “unspecified”, which seems to manifest in the API as not being present in the response body. nil should have also worked (Config Connector shouldn’t modify the value).
Thanks a lot for digging and sharing your findings, @dinvlad! This is an interesting edge case, and sorry that it has cost you a lot of time to work around it.
I’ll bring up the issue to the team for potential fixes/options on our end. Will keep you posted.