terraform-provider-ibm: Intermittently ibm_container_vpc_cluster fails saying: "A cluster with the same name already exists"
Internal Project Golden Eye
We have noticed that intermittently (but not very frequently), when creating a cluster using ibm_container_vpc_cluster, the IKS API responds with the following error: Error: Request failed with status code: 409, ServerErrorResponse: {"incidentID":"1a1240b1-c05e-481d-90f4-eefdb89a03b0,1a1240b1-c05e-481d-90f4-eefdb89a03b0","code":"E0007","description":"A cluster with the same name already exists. Choose another name.","type":"Provisioning"}
Every time this has happened, we have logged in and checked for a cluster with that name. In every case, we found that ibm_container_vpc_cluster did in fact create the cluster successfully, and its creation timestamp matches the timestamp in the logs. So why is the IKS API failing with that error?
We reproduced the issue with trace logs, but to be honest I am struggling to see what the root cause is. Is it possible that somewhere in the provider code the cluster provisioning process was kicked off and then, due to some glitch, kicked off a second time, so that the IKS API responded that a cluster with that name already exists?
Here is a screenshot which shows the creation time of the cluster (in UTC+1) - 12:04 pm:

And below I have attached the logs (including trace log) which show the timestamp matches the cluster creation (these logs are in UTC time):
TestOCPSMBasic/TestOCPSMBasic_0 2021-10-13T11:04:11Z logger.go:66: ibm_container_vpc_cluster.cluster: Creating...
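To double-check that the two timestamps really line up, here is a minimal sketch that converts the UTC log timestamp into the UTC+1 zone used in the screenshot. The fixed one-hour offset is an assumption taken from the description above, not read from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Timestamp copied from the "Creating..." log line (logs are in UTC).
log_time_utc = datetime(2021, 10, 13, 11, 4, 11, tzinfo=timezone.utc)

# Assumption: the screenshot uses a fixed UTC+1 offset, as stated in the issue.
console_tz = timezone(timedelta(hours=1))

# Convert the log timestamp into the zone shown in the screenshot.
console_time = log_time_utc.astimezone(console_tz)
print(console_time.strftime("%H:%M"))  # 12:04
```

The result, 12:04, agrees with the creation time shown for the cluster.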
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform CLI and Terraform IBM Provider Version
Terraform v1.0.8 on darwin_amd64
- provider registry.terraform.io/ibm-cloud/ibm v1.34.0
Affected Resource(s)
- ibm_container_vpc_cluster
Terraform Configuration Files
Internal url: https://github.ibm.com/GoldenEye/ocp-service-mesh-module/tree/master/examples/basic
Debug Output
Panic Output
Expected Behavior
The API should not return an error if cluster provisioning succeeded.
Actual Behavior
The API responds with a 409 error saying that a cluster with the same name already exists, even though the cluster was created.
Steps to Reproduce
terraform apply
Important Factoids
References
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 19
Thanks for confirming this. We will update to a newer provider version so that we get correct logging. I am working with IKS to find out the root cause of the 500. I don't think that any changes are needed on the provider side - IKS needs to ensure that the cluster provisioning process does not proceed if it returns a 500 response.
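For anyone following along, the failure mode described here can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeClusterAPI` and `create_with_retry` are hypothetical names, the in-memory class is not the real IKS API, and the assumption that the create request is retried after the 500 is mine rather than something confirmed in the logs.

```python
class FakeClusterAPI:
    """Hypothetical in-memory stand-in for a cluster API (not the real IKS API)."""

    def __init__(self):
        self.clusters = set()
        self.fail_next_create = True  # simulate one spurious 500 response

    def create(self, name):
        if name in self.clusters:
            return 409  # "A cluster with the same name already exists"
        self.clusters.add(name)  # provisioning is kicked off...
        if self.fail_next_create:
            self.fail_next_create = False
            return 500  # ...but the caller is told that the request failed
        return 201


def create_with_retry(api, name, attempts=2):
    """Retry the create call on 5xx responses; return every status code seen."""
    seen = []
    for _ in range(attempts):
        status = api.create(name)
        seen.append(status)
        if status < 500:
            break  # success or a client error: stop retrying
    return seen


api = FakeClusterAPI()
statuses = create_with_retry(api, "demo-cluster")
print(statuses)                        # [500, 409]
print("demo-cluster" in api.clusters)  # True
```

In this sketch the first call returns 500 even though the cluster was recorded, so the retry sees a 409 while the cluster exists, which matches what we observed. If the API did not proceed with provisioning after returning a 500, the retry would simply succeed.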