terraform-provider-google: Error reading service account after creation - 403 Halts Execution

Terraform Version

Running in a GitHub runner:

provider.google v4.69.1

Affected Resource(s)

google_service_account

Terraform Configuration Files

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}

resource "google_service_account" "service_account" {
  account_id   = var.service_account_id
  display_name = "Constellation service account"
  description  = "Service account used inside Constellation"
}
....

Here is the complete configuration: https://github.com/edgelesssys/constellation/blob/main/cli/internal/terraform/terraform/iam/gcp/main.tf

Debug Output

Not yet able to reproduce with debug output, since it only happens sporadically.

Error: Error reading service account after creation: googleapi: Error 403: Permission 'iam.serviceAccounts.get' denied on resource (or it may not exist).
Details:
[
  ***
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "iam.googleapis.com",
    "metadata": ***
      "permission": "iam.serviceAccounts.get"
    ***,
    "reason": "IAM_PERMISSION_DENIED"
  ***
]
, forbidden

  with google_service_account.service_account,
  on main.tf line 16, in resource "google_service_account" "service_account":
  16: resource "google_service_account" "service_account" ***


Attempting to roll back.
Rollback succeeded.
Error: exit status 1

Panic Output

Expected Behavior

The service account should be created without error.

Actual Behavior

terraform apply halts with a 403 when reading the service account after creation.

Steps to Reproduce

  1. terraform apply

Important Factoids

It only fails intermittently, in the CI pipeline. Authentication with GCP is done via a GitHub Action, and the roles must be correct since it works most of the time. (https://github.com/edgelesssys/constellation/blob/main/.github/actions/login_gcp/action.yml)

Issue is very similar to #10227.

References

b/298050821

b/291928614

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 6
  • Comments: 24

Most upvoted comments

@benhxy I’ve just realized that an older plugin version was running in the CI. So false alarm. Sorry.

@elchead thanks for looking into this and for contacting GCP support. We are experiencing the same issue and agree: the API should return 404 rather than 403 on SA creation, especially when all permissions are properly granted.

In our case, we experience the issue in our certification test CI pipeline, when creating multiple GCP resources in randomly picked regions.

Service account creation is eventually consistent. We don’t have a great solution, but our user guide describes a mitigation: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/google_project_service#newly-activated-service-errors

Although this is not the most common error we’ve seen related to eventual consistency, its seemingly intermittent nature makes me think it has the same cause. Sleeps/delays and more retries are the only things we can offer on the Terraform side.
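
The sleep/delay mitigation mentioned here is commonly written with the `time_sleep` resource from the `hashicorp/time` provider. The following is a minimal sketch, not the configuration from the linked repository: the `wait_for_sa` and `example` resource names, the 30s duration, and the `roles/logging.logWriter` role are illustrative assumptions. Note that it only delays resources that consume the service account. It runs after `google_service_account` has finished creating, so it cannot influence the read that the provider performs during creation itself.

```hcl
terraform {
  required_providers {
    google = { source = "hashicorp/google" }
    time   = { source = "hashicorp/time" }
  }
}

variable "project_id" {
  type = string
}

variable "service_account_id" {
  type = string
}

resource "google_service_account" "service_account" {
  project      = var.project_id
  account_id   = var.service_account_id
  display_name = "Constellation service account"
}

# Assumption: 30s is an arbitrary illustrative delay, not a recommended value.
# The sleep starts only after the service account resource has been created.
resource "time_sleep" "wait_for_sa" {
  create_duration = "30s"
  depends_on      = [google_service_account.service_account]
}

# Hypothetical consumer of the service account. It waits for the sleep to
# finish, giving the new account time to propagate before IAM references it.
resource "google_project_iam_member" "example" {
  project    = var.project_id
  role       = "roles/logging.logWriter"
  member     = "serviceAccount:${google_service_account.service_account.email}"
  depends_on = [time_sleep.wait_for_sa]
}
```

Increasing `create_duration` trades longer applies for a lower chance that downstream resources see a service account that has not yet propagated.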

On this note, I saw that the provider actually polls after creation to work around the eventual consistency problem: HEAD/google/services/resourcemanager/resource_google_service_account.go#L135C17-L135C17. Doesn’t that make the wait/delay suggested in the official docs superfluous? registry.terraform.io/providers/hashicorp/google/4.73.0/docs/resources/google_service_account.html

@elchead The polling retries on 404s, not 403s. We wouldn’t want to add retries on 403s unless we are sure that doing so would help. I see in https://github.com/edgelesssys/constellation/blob/main/cli/internal/terraform/terraform/iam/gcp/main.tf#L25C22-L25C24 that you already have a sleep. I wonder whether increasing its duration would decrease the frequency of the error?

@c2thorn I understand that SA creation is eventually consistent, and the consequences this has for consuming that resource (e.g. for assigning IAM roles via google_project_iam_member). But Terraform fails during the creation of the SA resource itself, not while trying to consume it. So I think the sleep timer is of no help here, but correct me if I’m wrong. GCP support is looking into why the SA GET request returns 403. Until they have fixed that, tolerating 403 in the poller seems like the only fix, as far as I can see.