terraform-provider-google: GKE AutoPilot Failure For Node Count

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Terraform v1.1.0

Affected Resource(s)

google_container_cluster

Terraform Configuration Files

provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_container_cluster" "primary" {
  name             = "${var.project_id}-gke"
  location         = var.region
  enable_autopilot = true
}

Debug Output

Panic Output

Expected Behavior

GKE AutoPilot cluster should spin up correctly

Actual Behavior

Terraform throws the following error:

│ Error: googleapi: Error 400: Max pods constraint on node pools for Autopilot clusters should be 32., badRequest
│ 
│   with module.gke-cluster.google_container_cluster.primary,
│   on gke-cluster/main.tf line 10, in resource "google_container_cluster" "primary":
│   10: resource "google_container_cluster" "primary" {

Steps to Reproduce

  1. terraform apply

Important Factoids

Provider version 4.3.0 works as expected, but I couldn't see anything obvious when glancing at the diff. It seems likely to be related to max_pods_constraint, but to my untrained eye all of that looks like Azure or AWS code, somehow.

  • b/248291029

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 71
  • Comments: 23

Most upvoted comments

An alternative workaround is to set ip_allocation_policy. The block can even be empty, like so:

resource "google_container_cluster" "primary" {
  name             = "${var.project_id}-gke"
  location         = var.region
  enable_autopilot = true

  # Workaround: declare ip_allocation_policy, even as an empty block.
  ip_allocation_policy {}
}

Ran into this problem too as soon as I started testing with Autopilot. Surprised it hasn't been fixed for so long.

The empty ip_allocation_policy workaround works with Pulumi too:

  ipAllocationPolicy: {},

Thanks!

We are aware of the issue, and there is a related pull request in the works:

https://github.com/GoogleCloudPlatform/magic-modules/pull/5540

Hey folks, a fix has just been committed for this issue. Thanks for your patience!!

The change will be included in the 4.72.0 provider release, barring any reverts or speed bumps.
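
For anyone landing here later, a minimal sketch of requiring a provider release that should contain the fix. It assumes the change shipped in 4.72.0 as announced (it was still "pending" at the time of that comment) and that the provider is installed from the hashicorp/google registry source, which the original report does not show:

```hcl
terraform {
  required_providers {
    google = {
      # Assumption: the fix is in 4.72.0 and later; raise the floor if it slipped.
      source  = "hashicorp/google"
      version = ">= 4.72.0"
    }
  }
}
```

If a dependency lock file already records an older provider version, `terraform init -upgrade` is needed before the new constraint takes effect.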

Version 4.63… still needs the workaround. Why is it taking so long?

This is still an issue on v4.60.x. I am hitting it too and reporting it here.

Logs are here: https://gist.github.com/kylekurz/45d872721ed58e2b6d4ff70f76b26e0c

The configuration provided above is all that is needed to trigger this if you're on provider version 4.5.0. If I roll the provider back to 4.3.0, it works as expected.
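
A rough sketch of what rolling the provider back looks like, for reference. The hashicorp/google source address is an assumption (the configuration above has no required_providers block), and 4.3.0 is simply the version reported as working in this thread:

```hcl
terraform {
  required_providers {
    google = {
      # Exact pin to a release reported as working here; 4.5.0 reproduces the error.
      source  = "hashicorp/google"
      version = "= 4.3.0"
    }
  }
}
```

After changing the constraint, rerun `terraform init -upgrade` so the previously selected provider version is replaced. This is only a stopgap, since the pin also holds back every other provider fix.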