terraform-provider-google: Cannot delete instance group because it's being used by a backend service
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
- If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.
Terraform Version
Terraform v0.12.24
- provider.google v3.21.0
- provider.google-beta v3.21.0
Affected Resource(s)
- google_compute_region_backend_service
- google_compute_instance_group
Terraform Configuration Files
locals {
  project         = "<project-id>"
  network         = "<vpc-name>"
  network_project = "<vpc-project>"
  zones           = ["europe-west1-b", "europe-west1-c", "europe-west1-d"]
  s1_count        = 3
}

provider "google" {
  project = local.project
  version = "~> 3.0"
}

data "google_compute_network" "network" {
  name    = local.network
  project = local.network_project
}

resource "google_compute_region_backend_service" "s1" {
  name = "s1"

  dynamic "backend" {
    for_each = google_compute_instance_group.s1
    content {
      group = backend.value.self_link
    }
  }

  health_checks = [
    google_compute_health_check.default.self_link,
  ]
}

resource "google_compute_health_check" "default" {
  name = "s1"

  tcp_health_check {
    port = "80"
  }
}

resource "google_compute_instance_group" "s1" {
  count   = local.s1_count
  name    = format("s1-%02d", count.index + 1)
  zone    = element(local.zones, count.index)
  network = data.google_compute_network.network.self_link
}
I’m not sure whether this is a general Terraform problem or a Google provider problem, but here it goes.
Currently it’s not possible to lower the number of google_compute_instance_group resources that are used in a google_compute_region_backend_service. In the configuration above, if we lower the number of google_compute_instance_group resources and try to apply, Terraform first tries to delete the no-longer-needed instance groups and only then updates the backend service. That order cannot work, because you cannot delete an instance group that is still in use by the backend service; the order needs to be the other way around.
So to sum it up, when I lower the number of instance group resources, Terraform does this:
1. delete surplus google_compute_instance_group resources -> this fails
2. update google_compute_region_backend_service

It should do it the other way around:
1. update google_compute_region_backend_service
2. delete surplus google_compute_instance_group resources
Here is the output it generates:
google_compute_instance_group.s1[2]: Destroying... [id=projects/<project-id>/zones/europe-west1-d/instanceGroups/s1-03]
Error: Error deleting InstanceGroup: googleapi: Error 400: The instance_group resource 'projects/<project-id>/zones/europe-west1-d/instanceGroups/s1-03' is already being used by 'projects/<project-id>/regions/europe-west1/backendServices/s1', resourceInUseByAnotherResource
Expected Behavior
TF should first update the google_compute_region_backend_service, then delete the instance group.
Actual Behavior
TF tried to delete the instance group first, which resulted in an error.
Steps to Reproduce
1. terraform apply
2. Set s1_count = 2
3. terraform apply
Important Factoids
It’s not a simple task to fix this. One “workaround” is to change the dynamic for_each to use the slice() function, like this:
dynamic "backend" {
  for_each = slice(google_compute_instance_group.s1, 0, 2)
  content {
    group = backend.value.self_link
  }
}
So you first set the second number of slice() to the new number of instance groups and run apply, then lower s1_count to that same number and run apply again, but that’s just too complicated for a simple task like this.
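One way to make that two-step slightly less error-prone is to drive the slice from its own local, so each apply changes a single number. This is only a sketch of the same workaround; the local name s1_backend_count is my own, not from the issue:

```hcl
locals {
  s1_count         = 3
  s1_backend_count = 2 # hypothetical: step 1 lowers this, step 2 lowers s1_count
}

# Fragment, placed inside google_compute_region_backend_service.s1:
dynamic "backend" {
  for_each = slice(google_compute_instance_group.s1, 0, local.s1_backend_count)
  content {
    group = backend.value.self_link
  }
}
```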
b/308569276
About this issue
- Original URL
- State: open
- Created 4 years ago
- Reactions: 51
- Comments: 23 (4 by maintainers)
Commits related to this issue
- Add updated go.sum file (#6376) Signed-off-by: Modular Magician <magic-modules@google.com> — committed to modular-magician/terraform-provider-google by modular-magician 2 years ago
- Add updated go.sum file (#6376) (#12260) Signed-off-by: Modular Magician <magic-modules@google.com> — committed to hashicorp/terraform-provider-google by modular-magician 2 years ago
This has been driving me nuts for months. Using Cloud Run behind external GCLB. Backend services for the Serverless NEGs are in use by the URL map.
Once all this config/infra is in place, the service / backend service cannot be deleted, even when removing the URL map in the same change. It becomes a two-step process of removing the URL map, then removing the service and backend service.
In an enterprise setting with ~10 environments each receiving different releases at different schedules, having repeat CI pipelines is not okay and is basically unmanageable.
Can confirm that this is the case with a manual global load balancing setup on the Google provider as well. It’s definitely super annoying that we need to manually run terraform apply multiple times to achieve the desired state. This means that any time we turn down a region, some administrator is going to have to do this instead of simply relying on CI/CD. What’s worse is that it makes proving certain security/compliance certifications harder: our CI/CD + pull request process is audited and logged, but random CLI commands from an administrator’s shell environment are harder to track (i.e. we need to involve GCP Audit Logging in the business justifications).
Looking forward to an elegant solution by the provider here.
@pdecat that should work, and requires implementing a new fine-grained resource google_compute_region_backend_service_backend. Reopening the issue since a solution is possible, and this will be tracked similarly to other feature requests.
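For context, such a fine-grained resource would presumably let each backend be attached and detached independently of the backend service itself, so removing one instance group would only detach it first. A purely speculative sketch of what that schema could look like (this resource does not exist in the provider today, and every attribute here is an assumption):

```hcl
resource "google_compute_region_backend_service_backend" "s1" {
  count = local.s1_count

  # Hypothetical attributes: each instance group is attached to the
  # backend service as its own resource, so deleting one of these
  # detaches the group before the group itself can be destroyed.
  backend_service = google_compute_region_backend_service.s1.id
  group           = google_compute_instance_group.s1[count.index].self_link
}
```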
The lack of pretty essential features, and bugs like this, make me very disappointed with all of Terraform and GCP.
@StephenWithPH ForceNew would have the same effect, but would make every change (addition as well as removal) to the backend set destructive. Providing a new fine-grained resource is the cleaner option here.

I actually just ran into this issue a couple of days ago, and I was able to resolve it by appending a random string to the end of the group manager’s name and using the create_before_destroy lifecycle policy for the instance group manager resource. For whatever reason, doing so leads Terraform to modify the backend service before destroying the original instance group. Still not the prettiest hack in the world, but better than having to issue multiple applies.

This issue is actually quite problematic.
I get these errors trying to destroy the whole module. It requires multiple targeted terraform destroys to complete
Disappointing this exists for 2+ years and still no fix.
How come terraform doesn’t understand it can’t delete a managed instance group without first removing the load balancer (i.e. backend) depending on it? Seems a pretty simple idea, which for some reason isn’t implemented?
hi could you paste an example of what you did with the create_before_destroy ?
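The create_before_destroy workaround described above might look roughly like this sketch. The resource names, zone, and the google_compute_instance_template.s1 reference are assumptions for illustration, not taken from the thread:

```hcl
resource "random_id" "suffix" {
  byte_length = 2

  # Changing keepers regenerates the id, and with it the group manager name.
  keepers = {
    template = google_compute_instance_template.s1.id # assumed template resource
  }
}

resource "google_compute_instance_group_manager" "s1" {
  # Random suffix so the replacement group can coexist with the original.
  name               = "s1-${random_id.suffix.hex}"
  base_instance_name = "s1"
  zone               = "europe-west1-b"

  version {
    instance_template = google_compute_instance_template.s1.id
  }

  lifecycle {
    # Create the new group first; Terraform then updates the backend
    # service to point at it before destroying the old group.
    create_before_destroy = true
  }
}
```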
I can relate to this, GCP doesn’t update the URL map before destroying backend services. Very frustrating.