terraform-provider-google: instance_group_manager marked tainted if healthcheck failing
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
- If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.
Terraform Version
Terraform v1.0.3
Affected Resource(s)
google_compute_region_instance_group_manager
google_compute_instance_group_manager
Terraform Configuration Files
I’m deploying a typical MIG, but with wait_for_instances = true:
resource "google_compute_instance_template" "my_app" {
  project      = google_compute_subnetwork.primary_region.project
  region       = google_compute_subnetwork.primary_region.region
  name_prefix  = "my-app-"
  machine_type = "n1-standard-1"

  disk {
    boot         = true
    source_image = "cos-cloud/cos-stable"
    disk_type    = "pd-ssd"
    disk_size_gb = 40
  }

  network_interface {
    subnetwork = google_compute_subnetwork.primary_region.self_link
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_health_check" "my_app" {
  project             = google_compute_instance_template.my_app.project
  name                = "my-app"
  check_interval_sec  = 10
  timeout_sec         = 5
  unhealthy_threshold = 5

  http_health_check {
    port         = 80
    request_path = "/-/health"
  }
}

resource "google_compute_region_instance_group_manager" "my_app" {
  project            = google_compute_instance_template.my_app.project
  region             = google_compute_instance_template.my_app.region
  name               = "my-app"
  base_instance_name = "my-app"

  version {
    instance_template = google_compute_instance_template.my_app.id
  }

  target_size        = 1
  wait_for_instances = true

  named_port {
    name = "http"
    port = 80
  }

  auto_healing_policies {
    health_check      = google_compute_health_check.my_app.self_link
    initial_delay_sec = 30
  }
}
Debug Output
https://gist.github.com/dv-stephen/610fafba3eddd0de9e941ee6fa7e13bd
Expected Behavior
If there is an issue with the MIG such as a bad health check or faulty instance config that prevents the MIG from reaching a healthy state, terraform should be able to refresh the resource and allow code changes to fix the MIG.
Actual Behavior
Terraform hangs on the refresh phase of the MIG resource, waiting for the MIG to become healthy, which never happens. The only solution is manual intervention, which prevents a GitOps model where changes are made through code.
Steps to Reproduce
- Deploy a MIG with wait_for_instances = true and a health check that will fail
- The terraform run will time out waiting for the MIG to become healthy
- Run terraform apply again, which will time out during the refresh phase
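The first reproduction step can be sketched as a health check that cannot pass. This is an illustrative variant of the health check in the configuration above, not something taken from the issue: the resource name always_failing, port 81, and the request path are assumptions, chosen on the premise that nothing on the instance listens there.

```hcl
# Hypothetical health check intended to fail on every probe.
# Assumption: no process on the instances listens on port 81.
resource "google_compute_health_check" "always_failing" {
  project             = google_compute_instance_template.my_app.project
  name                = "always-failing"
  check_interval_sec  = 10
  timeout_sec         = 5
  unhealthy_threshold = 5

  http_health_check {
    port         = 81
    request_path = "/-/does-not-exist"
  }
}
```

Pointing auto_healing_policies.health_check at this resource, with wait_for_instances = true on the MIG, should leave the group unable to reach a healthy state, which is the condition the remaining steps depend on.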
About this issue
- State: closed
- Created 3 years ago
- Comments: 23 (1 by maintainers)
@dv-stephen: While https://github.com/hashicorp/terraform-provider-google/issues/9657 isn’t directly related (the problem there is that the user specified the wrong format and the API is behaving badly), you’re correct that Terraform is incorrectly sending requests to projects/projects/{{project}} despite the correct value being in use in your config. I’ve spun out https://github.com/hashicorp/terraform-provider-google/issues/9722 to cover investigating that. We hadn’t noticed the issue because the API was behaving correctly despite that; I suspect a change to the client library we use is the root cause. That said, I don’t believe that error has an effect on the instance group manager behaviour here, so we can probably isolate the two discussions / fixes.
@dv-stephen agreed. We shouldn’t be doing this polling during read since it can result in a broken refresh. I’ll make the change to do this polling during create/update with an increased timeout.
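If the wait moves to create/update as described, a user would presumably bound it through the resource's operation timeouts. The following is a minimal sketch, assuming the instance group manager resources expose the usual timeouts block with create and update keys. The resource body is abbreviated from the configuration in the issue, and the 30m values are illustrative and do not come from the thread. How the planned change interacts with this block is not stated in the discussion.

```hcl
# Abbreviated variant of the MIG from the issue; only the arguments
# relevant to waiting are shown alongside the required ones.
resource "google_compute_region_instance_group_manager" "my_app" {
  project            = google_compute_instance_template.my_app.project
  region             = google_compute_instance_template.my_app.region
  name               = "my-app"
  base_instance_name = "my-app"

  version {
    instance_template = google_compute_instance_template.my_app.id
  }

  target_size        = 1
  wait_for_instances = true

  # Assumed values: choose limits that match how long instances
  # normally take to pass the health check.
  timeouts {
    create = "30m"
    update = "30m"
  }
}
```

With the wait confined to create/update, a MIG that never becomes healthy would fail the apply after the timeout, while a later refresh could still read the resource and let a corrected configuration be applied.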