terraform-provider-google: google_compute_backend_service failing to apply multiple backends
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave “+1” or “me too” comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
- If an issue is assigned to the “modular-magician” user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to “hashibot”, a community member has claimed the issue already.
Description of Problem
I’m experiencing issues when trying to build a google_compute_backend_service with multiple backends (instance groups) in order to target all the nodes of my GKE cluster.
I have a cluster module & a cluster-lb module which I execute from an environment Terraform configuration. I output the instance groups at the end of the cluster module, based on a data resource, to ensure I get the URLs to all cluster nodes, e.g.:
output "K8S_INSTANCE_GROUP_URLS" {
value = data.google_container_cluster.information.instance_group_urls
description = "URLs to the instance groups for all nodes"
}
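For reference, the data resource that output reads from looks roughly like this (the variable names below are placeholders rather than my exact code):

data "google_container_cluster" "information" {
  name     = var.cluster_name       # placeholder inputs
  location = var.cluster_location
  project  = var.project
}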
For simplicity's sake I am taking a variable in the cluster-lb module which is that list.
variable "backend_group_list" {
description = "Map backend indices to list of backend maps."
type = list
default = []
}
In my module code I am trying to configure the backend sub-block as described here, which (I think) has a specific format of:
backend = [
  { group = <url> },
  { group = <url> }
]
(which seems to be what it implies), or is the backend block simply specified multiple times?
backend { group = <url> }
backend { group = <url> }
The topic of documentation is covered in #3498 and I initially added some error logs in this comment.
Terraform Version
Terraform v0.12.2
+ provider.google v2.9.1
+ provider.null v2.1.2
+ provider.random v2.1.2
+ provider.template v2.1.2
Affected Resource(s)
- google_compute_backend_service
Terraform Configuration Files
cluster-lb module backends
variable "backend_group_list" {
description = "Map backend indices to list of backend maps."
type = list
default = []
}
variable "backend_public" {
description = "Parameters to the public backend"
type = object({
enabled = bool
health_path = string
port_name = string
port_number = number
timeout_seconds = number
iap_enabled = bool
})
default = {
enabled = true
health_path = "/"
port_name = "http"
port_number = 30100
timeout_seconds = 30
iap_enabled = false
}
}
variable "backend_private" {
description = "Parameters to the private backend"
type = object({
enabled = bool
health_path = string
port_name = string
port_number = number
timeout_seconds = number
iap_enabled = bool
})
default = {
enabled = true
health_path = "/"
port_name = "http"
port_number = 30100
timeout_seconds = 30
iap_enabled = true
}
}
variable "backend_monitor" {
description = "Parameters to the monitoring backend"
type = object({
enabled = bool
health_path = string
port_name = string
port_number = number
timeout_seconds = number
iap_enabled = bool
})
default = {
enabled = true
health_path = "/"
port_name = "monitor"
port_number = 30101
timeout_seconds = 30
iap_enabled = true
}
}
resource "google_compute_backend_service" "public" {
project = var.project
name = "${var.name}-backend-public"
port_name = var.backend_public["port_name"]
protocol = "HTTP"
timeout_sec = var.backend_public["timeout_seconds"]
dynamic "backend" {
for_each = [ for b in var.backend_group_list : b ]
content {
group = backend.value
}
}
health_checks = list(google_compute_health_check.public.self_link)
}
resource "google_compute_backend_service" "private" {
project = var.project
name = "${var.name}-backend-private"
port_name = var.backend_private["port_name"]
protocol = "HTTP"
timeout_sec = var.backend_private["timeout_seconds"]
dynamic "backend" {
for_each = var.backend_group_list
content {
group = backend.value
// adding null values otherwise reapplication fails
balancing_mode = null
capacity_scaler = null
description = null
max_connections = null
max_connections_per_instance = null
max_rate = null
max_rate_per_instance = null
max_utilization = null
}
}
health_checks = list(google_compute_health_check.private.self_link)
iap {
oauth2_client_id = var.iap_oauth_id
oauth2_client_secret = var.iap_oauth_secret
}
}
resource "google_compute_backend_service" "monitor" {
project = var.project
name = "${var.name}-backend-monitor"
port_name = var.backend_monitor["port_name"]
protocol = "HTTP"
timeout_sec = var.backend_monitor["timeout_seconds"]
dynamic "backend" {
for_each = var.backend_group_list
content {
group = backend.value
}
}
health_checks = list(google_compute_health_check.monitor.self_link)
iap {
oauth2_client_id = var.iap_oauth_id
oauth2_client_secret = var.iap_oauth_secret
}
}
Debug Output
I've posted the encrypted version (using the hashicorp key from keybase) in this gist: https://gist.github.com/hawksight/bde83268020c8701fc9ac35c1b6d3fb8
Used the following to encrypt:
keybase pgp encrypt -i ~/Logs/1561630982-terraform.log -o ~/Logs/1561630982-terraform.log.crypt hashicorp
I wasn't confident there weren't any sensitive details in the debug log, hence the encryption. Let me know if I need to share it another way.
Panic Output
None
Expected Behavior
I have three backends which I am manually specifying with different names. They are all backends to the same set of GKE nodes. Our clusters use multi-zone node pools and usually have two node pools. In GKE, this means you have an instance group per zone per node pool. In the example shown here, I have a setup with two node pools in a single zone, so two instance groups, equating to two backends to specify.
In the plan I expect to see two backend blocks, as I am using the dynamic block from 0.12 to generate a block for each group URL / self-link passed in.
On apply I expect the backend service to be created with both instance groups as its targets, not to fail with the error provided.
Actual Behavior
The plan worked, although it only specifies one backend in the output. It only knows the groups after apply, which I find unhelpful. Even when the cluster is prebuilt, the plan still doesn't see that I have more than one instance group to add. This is probably something to do with the way Terraform plans things, but I'm unsure of the specifics.
Here’s an example plan output:
# module.cluster-lb.google_compute_backend_service.monitor will be created
+ resource "google_compute_backend_service" "monitor" {
+ connection_draining_timeout_sec = 300
+ creation_timestamp = (known after apply)
+ fingerprint = (known after apply)
+ health_checks = (known after apply)
+ id = (known after apply)
+ load_balancing_scheme = "EXTERNAL"
+ name = "vpc-du-lb-backend-monitor"
+ port_name = "http"
+ project = "MASKED"
+ protocol = "HTTP"
+ self_link = (known after apply)
+ session_affinity = (known after apply)
+ timeout_sec = 30
+ backend {
+ balancing_mode = "UTILIZATION"
+ capacity_scaler = 1
+ group = (known after apply)
+ max_utilization = 0.8
}
+ cdn_policy {
+ signed_url_cache_max_age_sec = (known after apply)
+ cache_key_policy {
+ include_host = (known after apply)
+ include_protocol = (known after apply)
+ include_query_string = (known after apply)
+ query_string_blacklist = (known after apply)
+ query_string_whitelist = (known after apply)
}
}
+ iap {
+ oauth2_client_id = "MASKED"
+ oauth2_client_secret = (sensitive value)
+ oauth2_client_secret_sha256 = (sensitive value)
}
}
# module.cluster-lb.google_compute_backend_service.private will be created
+ resource "google_compute_backend_service" "private" {
+ connection_draining_timeout_sec = 300
+ creation_timestamp = (known after apply)
+ fingerprint = (known after apply)
+ health_checks = (known after apply)
+ id = (known after apply)
+ load_balancing_scheme = "EXTERNAL"
+ name = "vpc-du-lb-backend-private"
+ port_name = "http"
+ project = "MASKED"
+ protocol = "HTTP"
+ self_link = (known after apply)
+ session_affinity = (known after apply)
+ timeout_sec = 30
+ backend {
+ balancing_mode = (known after apply)
+ capacity_scaler = (known after apply)
+ description = (known after apply)
+ group = (known after apply)
+ max_connections = (known after apply)
+ max_connections_per_instance = (known after apply)
+ max_rate = (known after apply)
+ max_rate_per_instance = (known after apply)
+ max_utilization = (known after apply)
}
+ cdn_policy {
+ signed_url_cache_max_age_sec = (known after apply)
+ cache_key_policy {
+ include_host = (known after apply)
+ include_protocol = (known after apply)
+ include_query_string = (known after apply)
+ query_string_blacklist = (known after apply)
+ query_string_whitelist = (known after apply)
}
}
+ iap {
+ oauth2_client_id = "MASKED"
+ oauth2_client_secret = (sensitive value)
+ oauth2_client_secret_sha256 = (sensitive value)
}
}
# module.cluster-lb.google_compute_backend_service.public will be created
+ resource "google_compute_backend_service" "public" {
+ connection_draining_timeout_sec = 300
+ creation_timestamp = (known after apply)
+ fingerprint = (known after apply)
+ health_checks = (known after apply)
+ id = (known after apply)
+ load_balancing_scheme = "EXTERNAL"
+ name = "vpc-du-lb-backend-public"
+ port_name = "http"
+ project = "MASKED"
+ protocol = "HTTP"
+ self_link = (known after apply)
+ session_affinity = (known after apply)
+ timeout_sec = 30
+ backend {
+ balancing_mode = "UTILIZATION"
+ capacity_scaler = 1
+ group = (known after apply)
+ max_utilization = 0.8
}
+ cdn_policy {
+ signed_url_cache_max_age_sec = (known after apply)
+ cache_key_policy {
+ include_host = (known after apply)
+ include_protocol = (known after apply)
+ include_query_string = (known after apply)
+ query_string_blacklist = (known after apply)
+ query_string_whitelist = (known after apply)
}
}
}
I get the following errors when trying to apply:
Error: Provider produced inconsistent final plan
When expanding the plan for
module.cluster-lb.google_compute_backend_service.public to include new values
learned so far during apply, provider "google" produced an invalid new value
for .backend: block set length changed from 1 to 2.
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.cluster-lb.google_compute_backend_service.monitor to include new values
learned so far during apply, provider "google" produced an invalid new value
for .backend: block set length changed from 1 to 2.
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.cluster-lb.google_compute_backend_service.private to include new values
learned so far during apply, provider "google" produced an invalid new value
for .backend: block set length changed from 1 to 2.
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Steps to Reproduce
- Create a backend_service and try to pass multiple groups to it, generating them dynamically using a dynamic block or other loop method (use my code above as an example).
- Run a plan and see whether multiple backends are specified.
- Apply and see whether you get errors.
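A minimal configuration of this shape should reproduce it (the resource names and the health check below are illustrative, not my exact code); the key ingredient is that the list elements are only known after apply, e.g. a fresh cluster's instance_group_urls:

variable "backend_group_list" {
  description = "Instance group self-links, only known after the cluster is created."
  type        = list(string)
  default     = []
}

resource "google_compute_health_check" "default" {
  name = "repro-health-check"

  http_health_check {
    port = 80
  }
}

resource "google_compute_backend_service" "repro" {
  name        = "repro-backend"
  protocol    = "HTTP"
  timeout_sec = 30

  # One backend block per instance group URL passed in.
  dynamic "backend" {
    for_each = var.backend_group_list
    content {
      group = backend.value
    }
  }

  health_checks = [google_compute_health_check.default.self_link]
}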
Important Factoids
I've recently been upgrading to 0.12, so I really don't know if my dynamic block is the right solution, or if I can use a for_each instead, or some combination. I've found it quite hard to distinguish from the limited examples when each variation/combination of for, for_each and dynamic should be used.
My code works perfectly when there is only one instance group in the list, but I only tried that to prove the code was valid Terraform. My real-world use case always has many instance groups to add.
Note that on my private backend service, I have explicitly set all the other block options to null. This is because when I did successfully build with one instance group, the subsequent apply failed because the attributes were not set. So on re-application those parameters seem not to be optional anymore, hence the null values. Thanks to the author of this comment for the example.
I also tried turning my input list into the format:
[ { group = URL}, {group = URL } ...]
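i.e., building that shape with a for expression, roughly along these lines (a sketch of the shape rather than my exact code):

backend_group_list = [
  for url in data.google_container_cluster.information.instance_group_urls : { group = url }
]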
References
Commits related to this issue
- add sensitive_params to bigquery_data_transfer_config (#3937) * suppress diff for secret_access_key on bigquery data transfer params * add sensitiveParams for secret access key * add customize diff... — committed to modular-magician/terraform-provider-google by modular-magician 4 years ago
- add sensitive_params to bigquery_data_transfer_config (#3937) (#7174) * suppress diff for secret_access_key on bigquery data transfer params * add sensitiveParams for secret access key * add custom... — committed to hashicorp/terraform-provider-google by modular-magician 4 years ago
- add two attempts for run-terraform - dynamic backend issue https://github.com/hashicorp/terraform-provider-google/issues/3937 — committed to pivotal/docs-platform-automation by nhsieh 3 years ago
Similar issue using dynamic over backend block
error after apply:
If I apply it just one more time, it does work.
We're seeing this issue with our integration tests; because idempotency is one of the things we test for, we'd prefer not to simply reapply. Interestingly, I don't think we'd been seeing this with the 2.17 provider, but we are getting it consistently with 2.20.1. Will try to investigate further.
I have the same issue:
I also use a dynamic block; the funny thing is that if I apply again, it works… nevertheless it's quite annoying.
Not data providers technically, but they are references to other blocks in the same module. Glad you have a solution though!
Actually, I think I have resolved my issue after some poking around.
tl;dr
The issue was passing the results of a data lookup (on the k8s cluster) as an output of one module (gcloud-k8s) and trying to use those as the input to another module (gcloud-lb-custom).
longer read
I had a setup as such:
What I was doing
In each environment, I'd call my module (gcloud-k8s) to build a cluster. At the end of said module I had a data lookup on the cluster which depended on all node pool creations. This would become the output K8S_INSTANCE_GROUP_URLS. Then I'd build the load balancer through my next module (gcloud-lb-custom), which would take an input variable backend_group_list. Obviously, when calling that module, I'd fill that input with the other module's output (roughly as sketched below). This has been erroring ever since upgrading to 0.12; it used to work in 0.11, hence raising this issue.
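The wiring in the environment configuration looked roughly like this (the module source paths are placeholders):

module "cluster" {
  source = "./modules/gcloud-k8s"
  # ... cluster inputs ...
}

module "cluster-lb" {
  source             = "./modules/gcloud-lb-custom"
  backend_group_list = module.cluster.K8S_INSTANCE_GROUP_URLS
  # ... other load balancer inputs ...
}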
What I changed to see if my loop was correct
I basically took the output from tf output and set that in variables.tf for the load balancer module (gcloud-lb-custom). When I ran tf plan, everything planned correctly. When I removed an instance group, the plan reconfigured the backends correctly, going from 3 backends to 2 in this instance. This made me think the issue was something to do with passing input to one module from the output of another.
What I’m now doing
I've moved that data lookup into the lb module (gcloud-lb-custom), and that lookup is configured via two other outputs from the cluster module (gcloud-k8s). Inside the cluster module:
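(A rough reconstruction of the new outputs; the output names and resource reference are placeholders, not the exact code.)

output "K8S_CLUSTER_NAME" {
  value       = google_container_cluster.cluster.name
  description = "Cluster name, for downstream data lookups"
}

output "K8S_CLUSTER_LOCATION" {
  value       = google_container_cluster.cluster.location
  description = "Cluster location, for downstream data lookups"
}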
And further down in the module, I use that lookup to pass in the list of instance_group_urls, so my dynamic backend now looks roughly like the sketch below. It seems to work fairly well so far.
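(Again a rough reconstruction; the variable names are placeholders and the rest mirrors the public backend service shown earlier.)

data "google_container_cluster" "cluster" {
  name     = var.cluster_name       # fed by the new cluster-module outputs
  location = var.cluster_location
  project  = var.project
}

resource "google_compute_backend_service" "public" {
  project     = var.project
  name        = "${var.name}-backend-public"
  port_name   = var.backend_public["port_name"]
  protocol    = "HTTP"
  timeout_sec = var.backend_public["timeout_seconds"]

  # The instance group URLs now come from a lookup inside this module,
  # rather than from another module's output.
  dynamic "backend" {
    for_each = data.google_container_cluster.cluster.instance_group_urls
    content {
      group = backend.value
    }
  }

  health_checks = list(google_compute_health_check.public.self_link)
}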
I did also upgrade the google provider to the latest version.
TIL
Probably not the first or last time I'll be bitten by passing things from one module to another. Arguably it's cleaner to fetch the URLs inside the load balancer module, but I would have thought the output would be stored in state and used during the plan (probably a misunderstanding of the internal workings of terraform plan on my part).
As a side effect, I have yet to see that error message again, but I will be doing lots of testing around this. If anyone else has the issue, hopefully the examples above will help you find a solution.
@paddycarver can you take a look? You’ve got the key + probably more context on dynamic than I do.