terraform-provider-aws: Cycle error for replacement of aws_api_gateway_deployment with lifecycle create_before_destroy set to true and API Gateway resources in depends_on section

Terraform Version

Terraform v0.12.18
+ provider.aws v2.42.0

Affected Resource(s)

aws_api_gateway_deployment

Terraform Configuration Files

I’m not copying all API Gateway resources’ configuration as it’s pretty standard but happy to share configuration of whole API Gateway if requested

resource "aws_api_gateway_deployment" "deployment" {
  depends_on = [
    aws_api_gateway_rest_api.api,
    aws_api_gateway_resource.api_email_health,
    aws_api_gateway_method.api_email_health_get,
    aws_api_gateway_integration.api_email_health_get_integration,
    aws_api_gateway_method.api_email_health_options,
    aws_api_gateway_integration.api_email_health_options_integration,
    aws_api_gateway_integration_response.api_email_health_options_integration_response,
    aws_api_gateway_method_response.api_email_health_options_response,
    aws_api_gateway_resource.api_email_templates,
    aws_api_gateway_method.api_email_templates_get,
    aws_api_gateway_integration.api_email_templates_get_integration,
    aws_api_gateway_method.api_email_templates_options,
    aws_api_gateway_integration.api_email_templates_options_integration,
    aws_api_gateway_integration_response.api_email_templates_options_integration_response,
    aws_api_gateway_method_response.api_email_templates_options_response,
    aws_api_gateway_resource.api_email_emails,
    aws_api_gateway_method.api_email_emails_post,
    aws_api_gateway_integration.api_email_emails_post_integration,
    aws_api_gateway_method.api_email_emails_options,
    aws_api_gateway_integration.api_email_emails_options_integration,
    aws_api_gateway_integration_response.api_email_emails_options_integration_response,
    aws_api_gateway_method_response.api_email_emails_options_response,
    aws_api_gateway_resource.api_email
  ]

  rest_api_id = aws_api_gateway_rest_api.api.id

  stage_description = "Deployed at ${timestamp()}"

  stage_name = var.aws_spotlight_environment

  lifecycle {
    create_before_destroy = true
  }
}

Expected Behavior

Because aws_api_gateway_deployment lists all API Gateway resources, methods, integrations, and responses in depends_on, it shouldn’t be created before all of those are provisioned. The outcome should be (and was until recently): old API Gateway resources are destroyed, new ones are created, the new deployment is created, and the old deployment is destroyed. We force replacement of aws_api_gateway_deployment so that the current API Gateway state is always deployed to the main stage.

This was the behaviour in Terraform 0.11.x.

Actual Behavior

Cycle Error

Error: Cycle: aws_api_gateway_integration.api_email_health_get_integration (destroy), aws_api_gateway_integration.api_email_health_options_integration (destroy), aws_api_gateway_integration_response.api_email_health_options_integration_response (destroy),
aws_api_gateway_method_response.api_email_health_options_response (destroy), aws_api_gateway_method.api_email_health_options (destroy), aws_api_gateway_resource.api_email_health (destroy), aws_api_gateway_deployment.deployment, aws_api_gateway_deployment.deployment (destroy deposed 359e79c1),
aws_api_gateway_method.api_email_health_get (destroy)

Removing create_before_destroy = true from the lifecycle block of aws_api_gateway_deployment avoids the cycle, but the apply then fails with a different error:

Error: error deleting API Gateway Deployment (bdq86u): BadRequestException: Active stages pointing to this deployment must be moved or deleted

If I remove the depends_on section instead, the deployment sometimes happens before all API methods are properly configured. Example:

Error: Error creating API Gateway Deployment: BadRequestException: No integration defined for method

I tried adding a separate aws_api_gateway_stage resource for the stage, but the problem persists.

Steps to Reproduce

Create an API Gateway with an aws_api_gateway_deployment that depends on the API Gateway resources and is recreated on every terraform apply
Run terraform apply
Change one or more API Gateway resources in a way that forces them to be destroyed and recreated (e.g. change an API Gateway resource path)
Run terraform apply

About this issue

Original URL
State: open
Created 5 years ago
Reactions: 133
Comments: 55 (11 by maintainers)

Commits related to this issue

docs/service/apigateway: aws_api_gateway_deployment usage overhaul to discourage stage_name and further encourage create_before_destroy Reference: https://github.com/hashicorp/terraform-provider-aws/... — committed to hashicorp/terraform-provider-aws by bflad 3 years ago
docs/service/apigateway: aws_api_gateway_deployment usage overhaul to discourage stage_name and further encourage create_before_destroy (#17230) * docs/service/apigateway: aws_api_gateway_deployment ... — committed to hashicorp/terraform-provider-aws by bflad 3 years ago

Most upvoted comments

Hi all! 👋 Just a quick note to let you know this is on our radar and we will be taking a look in the near future to arrive at a resolution.

+22

breathingdust on Oct 8, 2020

Hi folks 👋 You may have noticed me poking around a few other API Gateway v1 issues and pull requests earlier today to warm up for this one. I wanted to fully context switch into this service and ensure we had a clear runway for any code changes that need to get in so we didn’t break other existing contributions.

Apologies for the long delay here and the very frustrating behavior with the API Gateway v1 functionality with regards to deployment. Those aspects of this AWS service, which are unique compared to others, have consistently challenged Terraform’s abilities to model it successfully and our ability to document recommended configuration patterns in a discoverable manner. At the end of this, beyond just fixing the reported issue(s) here, it seems necessary that the maintainers take some extra steps to add more robust service-level and use-case examples to the examples directory of the repository (with links from the resource-level reference pages) and/or expand the Learn platform content (e.g. Serverless Applications with AWS Lambda and API Gateway). If you all have other ideas in this manner, it would be great to discuss them. That aside, let’s dive into this.

First and foremost, I would like to ensure that I’m understanding and covering expectations for the followers here. At a high level, the problem statement seems to be:

The API Gateway REST API is wholly configured via Terraform. The configuration of the API itself (resources/methods/integrations/etc.) is either using the available aws_api_gateway_* Terraform resources or the OpenAPI specification import ability of the aws_api_gateway_rest_api resource body argument.
Attempting to update the configuration of the API resources/methods/integrations/etc. causes errors (such as BadRequestException: Active stages pointing to this deployment must be moved or deleted) or requires resource recreation that causes potential downtime.

And what is expected out of this effort, which will be a focus of mine until it’s complete:

A recommended configuration pattern should exist where an API configuration can be applied without errors, downtime, and manual steps on initial Terraform execution (all new resources) and subsequent Terraform executions (mixture of resource additions, updates, and deletions). This recommended configuration pattern should be well documented.
If this configuration pattern is not possible today due to shortcomings in the Terraform AWS Provider, that maintainer efforts are made to remediate the situation via resource changes. This may include adding new arguments, updating the underlying logic in one or more resources, or worst case creating a new resource.
If Terraform AWS Provider resource changes somehow cannot completely alleviate the issue (such as the potential need for AWS API changes or Terraform CLI dependency graph or configuration language changes), that feature requests are submitted appropriately, linked here, and warnings are added to the resource documentation about unimplemented or problematic use cases.

If I’m missing anything up until this point, please let me know.

To begin these efforts, I will need to reproduce the issues by having self-contained API Gateway configurations ready that match the problem statement along with reproduction steps. The initial report has some good details and I should be able to assemble an all Terraform resource configuration with some minor effort on my part tomorrow morning. https://github.com/hashicorp/terraform-provider-aws/issues/11344#issuecomment-699612070 has a starting configuration for the OpenAPI case. I will reach out if I am having trouble in this regard. In the meantime, if you also have a self-contained configuration handy that displays these issues and that you would like investigated, please feel free to reach out or post a link to a Gist/repository. I cannot promise I’ll be able to look at or solve every configuration scenario, but the extra context could be valuable.

It is very late for me now (almost 3am) so I’ll pick this up again first thing in the morning. Before I go though, for those attempting to use the resource lifecycle create_before_destroy behavior: please note that more recent versions of Terraform CLI seem more sensitive to needing that configuration applied to every resource in that portion of the dependency graph for the ordering to be applied successfully. This means not just the aws_api_gateway_deployment or aws_api_gateway_stage resources where it seems intuitive, but also the upstream aws_api_gateway_* resources that are being updated. I only mention this because, as an older practitioner of Terraform, it has tripped me up as seeming different than before. I will try to write up more about how to debug issues like that tomorrow.
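As a rough illustration of the note above, the sketch below sets create_before_destroy on the upstream resources as well as on the deployment. All resource names are placeholders that do not come from the reporter's configuration, the triggers argument is the one described later in this thread, and whether the upstream lifecycle blocks are strictly required may depend on the Terraform CLI version in use.

```hcl
resource "aws_api_gateway_rest_api" "example" {
  name = "example"
}

resource "aws_api_gateway_resource" "example" {
  parent_id   = aws_api_gateway_rest_api.example.root_resource_id
  path_part   = "example"
  rest_api_id = aws_api_gateway_rest_api.example.id

  # Upstream resource: same ordering as the deployment that depends on it.
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_method" "example" {
  authorization = "NONE"
  http_method   = "GET"
  resource_id   = aws_api_gateway_resource.example.id
  rest_api_id   = aws_api_gateway_rest_api.example.id

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_integration" "example" {
  http_method = aws_api_gateway_method.example.http_method
  resource_id = aws_api_gateway_resource.example.id
  rest_api_id = aws_api_gateway_rest_api.example.id
  type        = "MOCK"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  # The references create graph edges to the upstream resources.
  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_resource.example.id,
      aws_api_gateway_method.example.id,
      aws_api_gateway_integration.example.id,
    ]))
  }

  lifecycle {
    create_before_destroy = true
  }
}
```

Note that a create-before-destroy replacement of an aws_api_gateway_resource only makes sense when the change alters its path; treat this as a starting point for debugging ordering problems, not as a verified fix.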

+21

bflad on Jan 13, 2021

TL;DR

Use Terraform AWS Provider version 3.25.0 and later when possible
Use aws_api_gateway_stage instead of aws_api_gateway_deployment resource stage_name argument
Use lifecycle block create_before_destroy = true argument inside aws_api_gateway_deployment resource configuration
Use time_static resource instead of timestamp() function, if saving the current time is necessary

Hi again, folks 👋 Here are some updates.

Terraform AWS Provider version 3.25.0, released today, includes some fixes (https://github.com/hashicorp/terraform-provider-aws/pull/17099 / https://github.com/hashicorp/terraform-provider-aws/pull/17209) for the aws_api_gateway_rest_api resource to better respect configuration via OpenAPI if you are working in that model. The resource should no longer show plan differences for “missing” Terraform configuration that was sourced from the OpenAPI specification. It should also now handle any Terraform configuration beyond the body and name arguments as overrides to any OpenAPI specification. Hopefully this should help remove some previously frustrating behavior in that resource.

Now let’s turn the focus towards API Gateway REST API Deployments. After some extensive testing, it seemed like most issues captured here and in other similar issues relate to the aws_api_gateway_deployment resource also attempting to manage a stage. Terraform and resources are typically designed with a 1:1 mapping and this type of “shadow” resource management has historically been the source of confusion and headaches. The maintainers are now very cognizant not to introduce more of these types of resources, but of course we are stuck with any existing ones until they can be fixed or removed. In the future we may deprecate the problematic behavior.

The good news is that these deployment problems lean towards being fixable via configuration and documentation updates. I’ll provide an outline of these below, which should hopefully guide you towards less problematic Terraform environments. You can find proposed API Gateway documentation changes and a new end-to-end example configuration (which I was using to verify my recommendations) here: https://github.com/hashicorp/terraform-provider-aws/pull/17230

I’ll also briefly touch on timestamp() function usage, since that is not a recommended pattern and can make Terraform edge cases even sharper.

As a quick overview of API Gateway’s lifecycle expectations and how they map to the various Terraform resources, REST APIs can be configured via two methods:

Importing an OpenAPI specification: Using the aws_api_gateway_rest_api resource body argument with other arguments serving as overrides
Via other Terraform resources: Using the aws_api_gateway_resource, aws_api_gateway_method, aws_api_gateway_integration, etc. resources

Once the REST API is configured, the aws_api_gateway_deployment resource can be used along with the aws_api_gateway_stage resource to snapshot and publish the REST API. Stages can be optionally managed further with the aws_api_gateway_base_path_mapping, aws_api_gateway_domain_name, and aws_api_gateway_method_settings resources.

Both configuration methods achieve the same end goal and operators can choose which style is preferable for their environment or use cases. However, from a deployment standpoint, it is worth noting up front that it is much simpler in Terraform to set up the OpenAPI deployment properly. This is because a direct 1:1 configuration dependency can be set up. The Terraform resource method for configuring REST APIs is not going anywhere and is no less supported; additional care just needs to be taken to set it up properly for deployments.

The deeper explanation here is that Terraform currently only knows about differences when a state value has changed and only performs a node operation when there is a local state value change. There are configuration methods for creating edges on the graph (e.g. attribute references and depends_on), but there is not a method (configuration, internally, or protocol-wise) to remotely trigger another node to do something. In practice, this means the local node (aws_api_gateway_deployment resource) can only do something when it has local changing state values. Our workaround for this in Terraform Providers is adding a conventional triggers map argument that accepts arbitrary keys and values that can implement local value changes. Collecting and acting on node changes from other nodes has not been a design focus in Terraform before as far as I know, but maybe this can be investigated in the future to improve the user experience in this area.

REST API Deployment with OpenAPI

Here is a recommended starter configuration with this method:

resource "aws_api_gateway_rest_api" "example" {
  body = jsonencode({
    openapi = "3.0.1"
    info = {
      title   = "example"
      version = "1.0"
    }
    paths = {
      "/path1" = {
        get = {
          x-amazon-apigateway-integration = {
            httpMethod           = "GET"
            payloadFormatVersion = "1.0"
            type                 = "HTTP_PROXY"
            uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
          }
        }
      }
    }
  })

  name = "example"
}

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  triggers = {
    redeployment = sha1(jsonencode(aws_api_gateway_rest_api.example.body))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "example" {
  deployment_id = aws_api_gateway_deployment.example.id
  rest_api_id   = aws_api_gateway_rest_api.example.id
  stage_name    = "example"
}

There will soon be an end-to-end example available in the repository, which is based off this snippet and expands to include other downstream API Gateway resources to ensure they work as expected. Below you can see this in action, successfully deploying REST API updates without error:

$ terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_acm_certificate.example will be created
  + resource "aws_acm_certificate" "example" {
      + arn                       = (known after apply)
      + certificate_body          = (known after apply)
      + domain_name               = (known after apply)
      + domain_validation_options = (known after apply)
      + id                        = (known after apply)
      + private_key               = (sensitive value)
      + status                    = (known after apply)
      + subject_alternative_names = (known after apply)
      + validation_emails         = (known after apply)
      + validation_method         = (known after apply)
    }

  # aws_api_gateway_base_path_mapping.example will be created
  + resource "aws_api_gateway_base_path_mapping" "example" {
      + api_id      = (known after apply)
      + domain_name = (known after apply)
      + id          = (known after apply)
      + stage_name  = "example"
    }

  # aws_api_gateway_deployment.example will be created
  + resource "aws_api_gateway_deployment" "example" {
      + created_date  = (known after apply)
      + execution_arn = (known after apply)
      + id            = (known after apply)
      + invoke_url    = (known after apply)
      + rest_api_id   = (known after apply)
      + triggers      = {
          + "redeployment" = "e042aae1faf8de8d7c7c98c063a986025f058c69"
        }
    }

  # aws_api_gateway_domain_name.example will be created
  + resource "aws_api_gateway_domain_name" "example" {
      + arn                      = (known after apply)
      + certificate_upload_date  = (known after apply)
      + cloudfront_domain_name   = (known after apply)
      + cloudfront_zone_id       = (known after apply)
      + domain_name              = (known after apply)
      + id                       = (known after apply)
      + regional_certificate_arn = (known after apply)
      + regional_domain_name     = (known after apply)
      + regional_zone_id         = (known after apply)
      + security_policy          = (known after apply)

      + endpoint_configuration {
          + types = [
              + "REGIONAL",
            ]
        }
    }

  # aws_api_gateway_method_settings.example will be created
  + resource "aws_api_gateway_method_settings" "example" {
      + id          = (known after apply)
      + method_path = "*/*"
      + rest_api_id = (known after apply)
      + stage_name  = "example"

      + settings {
          + cache_data_encrypted                       = (known after apply)
          + cache_ttl_in_seconds                       = (known after apply)
          + caching_enabled                            = (known after apply)
          + data_trace_enabled                         = (known after apply)
          + logging_level                              = (known after apply)
          + metrics_enabled                            = true
          + require_authorization_for_cache_control    = (known after apply)
          + throttling_burst_limit                     = -1
          + throttling_rate_limit                      = -1
          + unauthorized_cache_control_header_strategy = (known after apply)
        }
    }

  # aws_api_gateway_rest_api.example will be created
  + resource "aws_api_gateway_rest_api" "example" {
      + api_key_source               = (known after apply)
      + arn                          = (known after apply)
      + binary_media_types           = (known after apply)
      + body                         = jsonencode(
            {
              + info    = {
                  + title   = "api-gateway-rest-api-openapi-example"
                  + version = "1.0"
                }
              + openapi = "3.0.1"
              + paths   = {
                  + /path1 = {
                      + get = {
                          + x-amazon-apigateway-integration = {
                              + httpMethod           = "GET"
                              + payloadFormatVersion = "1.0"
                              + type                 = "HTTP_PROXY"
                              + uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
                            }
                        }
                    }
                }
            }
        )
      + created_date                 = (known after apply)
      + description                  = (known after apply)
      + disable_execute_api_endpoint = (known after apply)
      + execution_arn                = (known after apply)
      + id                           = (known after apply)
      + minimum_compression_size     = -1
      + name                         = "api-gateway-rest-api-openapi-example"
      + policy                       = (known after apply)
      + root_resource_id             = (known after apply)

      + endpoint_configuration {
          + types            = [
              + "REGIONAL",
            ]
          + vpc_endpoint_ids = (known after apply)
        }
    }

  # aws_api_gateway_stage.example will be created
  + resource "aws_api_gateway_stage" "example" {
      + arn           = (known after apply)
      + deployment_id = (known after apply)
      + execution_arn = (known after apply)
      + id            = (known after apply)
      + invoke_url    = (known after apply)
      + rest_api_id   = (known after apply)
      + stage_name    = "example"
    }

  # tls_private_key.example will be created
  + resource "tls_private_key" "example" {
      + algorithm                  = "RSA"
      + ecdsa_curve                = "P224"
      + id                         = (known after apply)
      + private_key_pem            = (sensitive value)
      + public_key_fingerprint_md5 = (known after apply)
      + public_key_openssh         = (known after apply)
      + public_key_pem             = (known after apply)
      + rsa_bits                   = 2048
    }

  # tls_self_signed_cert.example will be created
  + resource "tls_self_signed_cert" "example" {
      + allowed_uses          = [
          + "key_encipherment",
          + "digital_signature",
          + "server_auth",
        ]
      + cert_pem              = (known after apply)
      + dns_names             = [
          + "example.com",
        ]
      + early_renewal_hours   = 0
      + id                    = (known after apply)
      + key_algorithm         = "RSA"
      + private_key_pem       = (sensitive value)
      + ready_for_renewal     = true
      + validity_end_time     = (known after apply)
      + validity_period_hours = 12
      + validity_start_time   = (known after apply)

      + subject {
          + common_name  = "example.com"
          + organization = "ACME Examples, Inc"
        }
    }

Plan: 9 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + domain_url       = (known after apply)
  + stage_invoke_url = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

tls_private_key.example: Creating...
tls_private_key.example: Creation complete after 0s [id=c1129fc488709c4293493669e43d40b60144999d]
tls_self_signed_cert.example: Creating...
tls_self_signed_cert.example: Creation complete after 0s [id=199729227385231255426302845367097804347]
aws_api_gateway_rest_api.example: Creating...
aws_acm_certificate.example: Creating...
aws_api_gateway_rest_api.example: Creation complete after 2s [id=halquax36h]
aws_api_gateway_deployment.example: Creating...
aws_acm_certificate.example: Creation complete after 3s [id=arn:aws:acm:us-west-2:123456789012:certificate/35cc4fc5-072f-4543-99d1-a1336ac05a41]
aws_api_gateway_domain_name.example: Creating...
aws_api_gateway_deployment.example: Creation complete after 1s [id=tj62g3]
aws_api_gateway_stage.example: Creating...
aws_api_gateway_stage.example: Creation complete after 1s [id=ags-halquax36h-example]
aws_api_gateway_method_settings.example: Creating...
aws_api_gateway_method_settings.example: Creation complete after 1s [id=halquax36h-example-*/*]
aws_api_gateway_domain_name.example: Creation complete after 3s [id=example.com]
aws_api_gateway_base_path_mapping.example: Creating...
aws_api_gateway_base_path_mapping.example: Creation complete after 1s [id=example.com/]

Apply complete! Resources: 9 added, 0 changed, 0 destroyed.

Outputs:

domain_url = "curl -H 'Host: example.com' https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path1 # may take a minute to become available on initial deploy"
stage_invoke_url = "curl https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path1"

$ curl -s https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path1 | jq '.createDate'
"2021-01-21-00-44-18"

$ curl -H 'Host: example.com' -s https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path1 | jq '.createDate'
"2021-01-21-00-44-18"

$ terraform apply -var 'rest_api_path=/path2'
tls_private_key.example: Refreshing state... [id=c1129fc488709c4293493669e43d40b60144999d]
tls_self_signed_cert.example: Refreshing state... [id=199729227385231255426302845367097804347]
aws_api_gateway_rest_api.example: Refreshing state... [id=halquax36h]
aws_acm_certificate.example: Refreshing state... [id=arn:aws:acm:us-west-2:123456789012:certificate/35cc4fc5-072f-4543-99d1-a1336ac05a41]
aws_api_gateway_deployment.example: Refreshing state... [id=tj62g3]
aws_api_gateway_domain_name.example: Refreshing state... [id=example.com]
aws_api_gateway_stage.example: Refreshing state... [id=ags-halquax36h-example]
aws_api_gateway_base_path_mapping.example: Refreshing state... [id=example.com/]
aws_api_gateway_method_settings.example: Refreshing state... [id=halquax36h-example-*/*]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place
+/- create replacement and then destroy

Terraform will perform the following actions:

  # aws_api_gateway_deployment.example must be replaced
+/- resource "aws_api_gateway_deployment" "example" {
      ~ created_date  = "2021-01-22T02:59:46Z" -> (known after apply)
      ~ execution_arn = "arn:aws:execute-api:us-west-2:123456789012:halquax36h/" -> (known after apply)
      ~ id            = "tj62g3" -> (known after apply)
      ~ invoke_url    = "https://halquax36h.execute-api.us-west-2.amazonaws.com/" -> (known after apply)
      ~ triggers      = { # forces replacement
          ~ "redeployment" = "e042aae1faf8de8d7c7c98c063a986025f058c69" -> "e6742b53b5eed7039e6fec056113bb049954d64b"
        }
        # (1 unchanged attribute hidden)
    }

  # aws_api_gateway_rest_api.example will be updated in-place
  ~ resource "aws_api_gateway_rest_api" "example" {
      ~ body                         = jsonencode(
          ~ {
              ~ paths   = {
                  - /path1 = {
                      - get = {
                          - x-amazon-apigateway-integration = {
                              - httpMethod           = "GET"
                              - payloadFormatVersion = "1.0"
                              - type                 = "HTTP_PROXY"
                              - uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
                            }
                        }
                    } -> null
                  + /path2 = {
                      + get = {
                          + x-amazon-apigateway-integration = {
                              + httpMethod           = "GET"
                              + payloadFormatVersion = "1.0"
                              + type                 = "HTTP_PROXY"
                              + uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
                            }
                        }
                    }
                }
                # (2 unchanged elements hidden)
            }
        )
        id                           = "halquax36h"
        name                         = "api-gateway-rest-api-openapi-example"
        tags                         = {}
        # (8 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

  # aws_api_gateway_stage.example will be updated in-place
  ~ resource "aws_api_gateway_stage" "example" {
      ~ deployment_id         = "tj62g3" -> (known after apply)
        id                    = "ags-halquax36h-example"
        tags                  = {}
        # (8 unchanged attributes hidden)
    }

Plan: 1 to add, 2 to change, 1 to destroy.

Changes to Outputs:
  ~ domain_url       = "curl -H 'Host: example.com' https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path1 # may take a minute to become available on initial deploy" -> "curl -H 'Host: example.com' https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path2 # may take a minute to become available on initial deploy"
  ~ stage_invoke_url = "curl https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path1" -> "curl https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path2"

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_api_gateway_rest_api.example: Modifying... [id=halquax36h]
aws_api_gateway_rest_api.example: Modifications complete after 1s [id=halquax36h]
aws_api_gateway_deployment.example: Creating...
aws_api_gateway_deployment.example: Creation complete after 1s [id=9vc6zm]
aws_api_gateway_stage.example: Modifying... [id=ags-halquax36h-example]
aws_api_gateway_stage.example: Modifications complete after 1s [id=ags-halquax36h-example]
aws_api_gateway_deployment.example: Destroying... [id=tj62g3]
aws_api_gateway_deployment.example: Destruction complete after 0s

Apply complete! Resources: 1 added, 2 changed, 1 destroyed.

Outputs:

domain_url = "curl -H 'Host: example.com' https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path2 # may take a minute to become available on initial deploy"
stage_invoke_url = "curl https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path2"

$ curl -s https://halquax36h.execute-api.us-west-2.amazonaws.com/example/path2 | jq '.createDate'
"2021-01-21-00-44-18"

$ curl -H 'Host: example.com' -s https://d-orixhuv0o9.execute-api.us-west-2.amazonaws.com/path2 | jq '.createDate'
"2021-01-21-00-44-18"
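The second apply above passes -var 'rest_api_path=/path2', but the starter configuration does not show how that variable is used. Here is a minimal sketch of one way it could be wired in. The variable declaration, its default, and the output expression are assumptions for illustration and may differ from the end-to-end example; the aws_api_gateway_deployment and aws_api_gateway_stage resources from the starter configuration are assumed to be unchanged.

```hcl
# Assumed declaration; only the name is taken from the -var flag above.
variable "rest_api_path" {
  type        = string
  default     = "/path1"
  description = "Path to create in the REST API"
}

resource "aws_api_gateway_rest_api" "example" {
  body = jsonencode({
    openapi = "3.0.1"
    info = {
      title   = "example"
      version = "1.0"
    }
    paths = {
      # Parentheses make the object key an expression rather than a literal name.
      (var.rest_api_path) = {
        get = {
          x-amazon-apigateway-integration = {
            httpMethod           = "GET"
            payloadFormatVersion = "1.0"
            type                 = "HTTP_PROXY"
            uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
          }
        }
      }
    }
  })

  name = "example"
}

# Assumed output, shaped like the stage_invoke_url value in the transcript.
output "stage_invoke_url" {
  value = "curl ${aws_api_gateway_stage.example.invoke_url}${var.rest_api_path}"
}
```

With this wiring, changing the variable changes the body argument, which changes the sha1 value in the deployment triggers and forces the create-then-destroy replacement seen in the plan.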

REST API Deployment with Terraform Resources

Here is a recommended starter configuration with this method:

resource "aws_api_gateway_rest_api" "example" {
  name = "example"
}

resource "aws_api_gateway_resource" "example" {
  parent_id   = aws_api_gateway_rest_api.example.root_resource_id
  path_part   = "example"
  rest_api_id = aws_api_gateway_rest_api.example.id
}

resource "aws_api_gateway_method" "example" {
  authorization = "NONE"
  http_method   = "GET"
  resource_id   = aws_api_gateway_resource.example.id
  rest_api_id   = aws_api_gateway_rest_api.example.id
}

resource "aws_api_gateway_integration" "example" {
  http_method = aws_api_gateway_method.example.http_method
  resource_id = aws_api_gateway_resource.example.id
  rest_api_id = aws_api_gateway_rest_api.example.id
  type        = "MOCK"
}

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  triggers = {
    # NOTE: The configuration below will satisfy ordering considerations,
    #       but not pick up all future REST API changes. More advanced patterns
    #       are possible, such as using the filesha1() function against the
    #       Terraform configuration file(s) or removing the .id references to
    #       calculate a hash against whole resources. Be aware that using whole
    #       resources will show a difference after the initial implementation.
    #       It will stabilize to only change when resources change afterwards.
    redeployment = sha1(jsonencode([
      aws_api_gateway_resource.example.id,
      aws_api_gateway_method.example.id,
      aws_api_gateway_integration.example.id,
    ]))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "example" {
  deployment_id = aws_api_gateway_deployment.example.id
  rest_api_id   = aws_api_gateway_rest_api.example.id
  stage_name    = "example"
}

As you can see, the triggers argument is much more complicated here, because we need to collect changes from many more sources of configuration to implement it properly. The two additional options mentioned in the comment, using the filesha1() function against the configuration file itself or hashing whole resources, are both widely used in the broader ecosystem, but they add some additional complexity and caveats. The HashiCorp Community Forums are likely a better place to discuss those types of configuration choices, since there are far more people ready to help there than those watching the issues in this code repository.
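The two alternatives mentioned in that comment could be sketched roughly as follows. This is an illustration under assumptions rather than a tested configuration: the file name api_gateway.tf and the local value names are hypothetical, and the block is meant to replace the aws_api_gateway_deployment shown above, not to sit next to it.

```hcl
locals {
  # Option 1: hash whole resources instead of only their .id attributes.
  # Any attribute change feeds the hash; expect a one-time difference
  # when this is first introduced.
  redeployment_whole_resources = sha1(jsonencode([
    aws_api_gateway_resource.example,
    aws_api_gateway_method.example,
    aws_api_gateway_integration.example,
  ]))

  # Option 2: hash the configuration file that defines the API.
  # "api_gateway.tf" is a hypothetical file name; any edit to that file,
  # including comments and whitespace, changes the hash.
  redeployment_config_file = filesha1("${path.module}/api_gateway.tf")
}

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  triggers = {
    # Pick one of the two local values above.
    redeployment = local.redeployment_whole_resources
  }

  lifecycle {
    create_before_destroy = true
  }
}
```

Option 1 follows the resources themselves, while option 2 follows the text of the configuration, so it will also redeploy on edits that do not change the API.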

As an aside about the timestamp() function, please note that it uses a special implementation (overriding the Terraform expectation that plan and apply values must exactly match), which can sometimes introduce strange behavior into Terraform plan differences. If you need a static time value in Terraform configurations (e.g. when an API Gateway was deployed), a preferable solution is the time_static resource. Since it participates in the Terraform operation graph just like other resources and can store time with a stable value, it should be much more predictable.

Here is an illustrative example (note that the aws_api_gateway_deployment resource already has a created_date attribute):

terraform {
  required_version = "0.14.5"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "3.25.0"
    }
    time = {
      source  = "hashicorp/time"
      version = "0.6.0"
    }
  }
}

provider "aws" {
  region = "us-east-2"
}

variable "name" {
  default     = "tf-aws-11344-time"
  description = "Name and OpenAPI title for REST API"
  type        = string
}

variable "path" {
  default     = "/test"
  description = "OpenAPI path to test updates"
  type        = string
}

resource "aws_api_gateway_rest_api" "example" {
  body = jsonencode({
    openapi = "3.0.1"
    info = {
      title   = var.name
      version = "1.0"
    }
    paths = {
      (var.path) = {
        get = {
          x-amazon-apigateway-integration = {
            httpMethod           = "GET"
            payloadFormatVersion = "1.0"
            type                 = "HTTP_PROXY"
            uri                  = "https://ip-ranges.amazonaws.com/ip-ranges.json"
          }
        }
      }
    }
  })

  name = var.name

  endpoint_configuration {
    types = ["REGIONAL"]
  }
}

resource "aws_api_gateway_deployment" "example" {
  rest_api_id = aws_api_gateway_rest_api.example.id

  triggers = {
    redeployment = sha1(jsonencode(aws_api_gateway_rest_api.example.body))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "example" {
  deployment_id = aws_api_gateway_deployment.example.id
  description   = "Deployed at ${time_static.deploy.rfc3339}"
  rest_api_id   = aws_api_gateway_rest_api.example.id
  stage_name    = "example"
}

resource "time_static" "deploy" {
  triggers = {
    redeployment = aws_api_gateway_deployment.example.id
  }
}

You can see updates by running a command similar to terraform apply -var 'path=/new' after the initial terraform apply.

Hopefully all this information helps. If these recommendations are not working as expected on Terraform AWS Provider version 3.25.0 or later, please reach out. We will be looking for reproducing configurations and plan output in those cases. 👍

bflad on Jan 22, 2021

@breathingdust you assigned @bflad to this over 2 months ago. Since then there has been zero visible activity, no updates, no documentation updates warning people away from using Terraform for API Gateway in a Production environment, nothing.

As I indicated in a private message to Hashicorp directly, I am happy to do a Medium article on how Terraform AWS Support for API Gateway is not ready for Production and should be avoided if possible. I think that is now the only responsible course of action since from reading documentation nothing would indicate to the casual reader that the only way to update an OpenAPI AWS API Gateway is to completely destroy and recreate your entire infrastructure on each minor change.

Hashicorp/Terraform AWS Team should do the responsible thing and update the public documentation to indicate this SEVERE fault and warn people away from using their solution in real world environments. The fact that you have STILL not done this is a huge stain.

Clearly this issue is not important to you, but it is VERY VERY important to the teams (like ours) stupid enough to get suckered into using this broken implementation. I think you need to be proactive to immediately ensure that more teams don’t get harmed by this lack of support.

shederman on Dec 23, 2020

Completely wiping out the value of declarative Infrastructure as Code. If we should manually do a whole bunch of extra work to the top level every single time some minor change in a bottom level happens, what’s the point of Terraform?

shederman on Sep 27, 2020

@shederman are you seriously threatening an open source project? Get a hold of yourself. Yes, the deployment resource is problematic. My company uses api gateway with terraform in production very successfully. If I need to remove an integration, I do a manual step of removing the deployment from the state file first. That’s it. Yes, I’d like that to not be the case, but I’m not threatening the developers that are the ones most likely to fix this. If you hate terraforming api gateways, stop doing it.

grimm26 on Dec 23, 2020

Hello everybody, I found a solution. Terraform handles resources in a singleton fashion: a resource with a specific name can exist only once in a Terraform state. An API Gateway deployment can’t be modified; that is a particularity of AWS, and it is quite normal, since a deployment is like a tag. My solution is to remove the resource from the state after each apply with terraform state rm aws_api_gateway_deployment.gw_deploy_dev, and now I can see the history of Terraform deployments on my API.
I hope it will help you. The coronavirus is a mess, but thanks to the time I had I could reverse engineer the API Gateway. In the end, I think Terraform should add a new type of resource based on the Prototype design pattern.

walidmansia on Jun 24, 2020

If like me you’ve come to this issue because you got a cycle error while having implemented the recommended way of doing things in the docs (summarised here), then here follows how I solved things. I was getting this cycle error when running a terraform plan to remove a resource from the body of the API Gateway REST API resource (OpenAPI definition).

I spotted the cause of the cycle fairly easily: a lambda behind a separate API Gateway (let’s call it B) referenced the invoke_url of the current stage of the main API Gateway (let’s call it A) in its environment variables. The deployment resource of both API Gateways had a lifecycle policy of create_before_destroy, which is a must-have to ensure uptime. This caused the cycle, and as such I broke apart the cycle by manually assembling the invoke_url based on the ID of the REST API resource of API Gateway A and the variable that was used as the stage name in API Gateway A.

Great stuff, but unfortunately I simply had a new cycle to contend with, though a shorter one and one where only resources for API Gateway A were present, mentioning some deposed resources. Basically what I had here is that the remote state still had this coupling between the two API Gateway deployments (because of the stage invoke_url reference), whereas locally I didn’t have it. To solve this, what I did was to change the lifecycle policy of the aws_api_gateway_deployment resource of API Gateway A in the same PR as the change to remove the resource from it:

  lifecycle {
    create_before_destroy = true
    ignore_changes = [
      triggers
    ]
  }

What the above does is to simply not trigger a deployment while still removing the resource from the API Gateway in remote state. PR merged, terraform apply executed and in the next PR I simply removed the ignore_changes block to go back to normal 🎉
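For readers wondering what "manually assembling the invoke_url" might look like, here is a minimal sketch. It assumes the usual https://{rest-api-id}.execute-api.{region}.amazonaws.com/{stage} URL shape seen in the outputs earlier in this thread; the resource name api_a, the variables and the local value name are all hypothetical and not taken from the comment above.

```hcl
# Hypothetical inputs; the original comment does not show its variable names.
variable "aws_region" {
  type        = string
  description = "Region in which API Gateway A is deployed"
}

variable "stage_name" {
  type        = string
  description = "Stage name used by API Gateway A"
}

locals {
  # Built from the REST API ID and the stage name variable, so nothing here
  # references the stage or deployment resource of API Gateway A.
  api_a_invoke_url = "https://${aws_api_gateway_rest_api.api_a.id}.execute-api.${var.aws_region}.amazonaws.com/${var.stage_name}"
}
```

The lambda behind API Gateway B would then read local.api_a_invoke_url in its environment variables instead of the stage’s invoke_url attribute, which removes the graph edge between the two deployments.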

ricoli on Apr 25, 2023

I also encountered the same issue. I tried two possible compromise solutions.

Wait for a while until all the dependent resources are created

I tried the following solution, and I could at least change the method and resource. The drawback is that it triggers a deployment on every apply, even if nothing has changed in the dependent resources.

resource "aws_api_gateway_deployment" "deployment" {
- depends_on = [
-   module.method.lambda-integration
- ]

  rest_api_id = aws_api_gateway_rest_api.api.id

  triggers = {
-   redeployment = sha1(join(",", list(
-     jsonencode(module.method.lambda-integration), # I was using lambda integration as a trigger of deployment.
-   )))
+   redeployment = timestamp()
  }

  provisioner "local-exec" {
    command = "sleep 30"
  }

  lifecycle {
    create_before_destroy = true
  }
}

Pass variable for trigger

In this way, we can control when to recreate the deployment, but you need to separate the resource update from the deployment trigger. If you put them in one apply, creating and destroying the deployment will start before the dependent resources have finished updating.
```
resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = aws_api_gateway_rest_api.api.id

  triggers = {
    redeployment = var.release-date
  }

  lifecycle {
    create_before_destroy = true
  }
}
```
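The snippet above references var.release-date without showing its declaration. A minimal sketch of what that declaration might look like follows; the description text and the choice of a plain string type are assumptions.

```hcl
# Assumed declaration for the variable used in the triggers block above.
variable "release-date" {
  type        = string
  description = "Change this value to force a new API Gateway deployment"
}
```

With this in place, a second run along the lines of terraform apply -var 'release-date=2020-06-10' after the resource changes have been applied would create a new deployment, while leaving the value unchanged would not.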

nakamasato on Jun 10, 2020

We also ran into this problem, and solved it by removing create_before_destroy from the deployment, and manually running terraform taint on the stage resource to force it to be recreated, which got rid of the other error you mention.

Isn’t that manual wrangling to solve the problem? We use CD software to deploy our Terraform code, so we would prefer to avoid such workarounds. Plus, our stage is active because it’s attached to a Custom Domain Name, so we can’t have it destroyed or left without an existing deployment.

Currently we use a null resource with a sleep command, and the deployment resource is explicitly set to depend on that null resource as a workaround. The deployment resource itself isn’t set to depend on any API Gateway resources, but the delay gives all of the required resources (methods, integrations and so on) time to be provisioned before the deployment is created. The example below uses PowerShell for the command because that’s what we mostly use in our company:

resource "null_resource" "wait_for_all_resources" {
  triggers = {
    timestamp = timestamp()
  }
  provisioner "local-exec" {
    command     = "Start-Sleep -Seconds 60"
    interpreter = ["PowerShell", "-Command"]
  }
}
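The deployment side of this workaround is not shown in the comment. A rough sketch of how it could be wired up is below; the resource names deployment and api are hypothetical, and any lifecycle or triggers settings are left out because the comment does not specify them.

```hcl
resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = aws_api_gateway_rest_api.api.id

  # Depend only on the sleeping null_resource, not on the individual
  # methods and integrations, so those resources never appear in this
  # resource's dependency list.
  depends_on = [
    null_resource.wait_for_all_resources,
  ]
}
```

Note that this relies on timing rather than on the dependency graph: if provisioning the other resources takes longer than the sleep, the deployment can still be created too early.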

martyna-autumn on Jan 2, 2020

In my testing, if you trigger deployments off changes in id as per the example, that means that the resource will be destroyed and recreated to reflect the ID change (e.g. you change the method from a POST to a PUT). This leads to the following situation:

aws_api_gateway_integration.api_integration: Creation complete after 1s [id=agi-mn2mqickwg-fdd064-PUT]
aws_api_gateway_deployment.deployment: Creating...
aws_api_gateway_deployment.deployment: Creation complete after 1s [id=b5r1ui]
aws_api_gateway_deployment.deployment: Destroying... [id=s0ir4f]
aws_api_gateway_deployment.deployment: Destruction complete after 1s
aws_api_gateway_integration.api_integration: Destroying... [id=agi-mn2mqickwg-fdd064-POST]

Essentially, the API gets deployed before the old integration is destroyed, which means your API deployment will contain both the old integration and the new one at the same time. This might not be desired, so unless I’ve missed something it’s worth taking care when triggering deployments off resources that are getting destroyed and recreated as opposed to just being modified in place.

rbowater on Jan 26, 2021

Since it seems this code has zero value, I will post the code that is not working. We have made numerous changes to try and get this working, and not one has worked. This particular variation builds the API Gateway just fine, but any slight change (e.g. to what parameters we validate) results in “Error: error deleting API Gateway Deployment (ufn1gl): BadRequestException: Active stages pointing to this deployment must be moved or deleted”

The only way to make this work in tooling (fully automated) is to entirely destroy the entire API gateway and recreate it, resulting in a completely new URL. I would not be happy with that solution in a Development environment; in a Production one it’s a joke.

The defects related to our issue are:

https://github.com/hashicorp/terraform/issues/10674
https://github.com/terraform-providers/terraform-provider-aws/issues/10105 - The suggested and documented solution does not work at all, so the bug got reopened later
https://github.com/hashicorp/terraform/issues/22792
https://github.com/terraform-providers/terraform-provider-aws/issues/11344 - Suggested solution is to manually remove items from state on each deploy!
https://stackoverflow.com/questions/38910937/terraform-not-deploying-api-gateway-stage - Suggested solution is to deploy on every run!

locals {
  private_config_map = { type = "PRIVATE", vpc_endpoint_ids = var.vpc_endpoint_ids }
  regional_config_map = { type = "REGIONAL", vpc_endpoint_ids = null }
}

/* ---------------------------
 * API GATEWAY
 * --------------------------- */
resource "aws_api_gateway_rest_api" "main" {
  name            = var.name

  dynamic "endpoint_configuration" {
    for_each = var.private == true ? list(local.private_config_map) : list(local.regional_config_map)

    content {
      types             = [endpoint_configuration.value["type"]]
      vpc_endpoint_ids  = endpoint_configuration.value["vpc_endpoint_ids"]
    }
  }

  api_key_source  = "HEADER"
  body            = var.body
  tags            = var.tags

  lifecycle {
    ignore_changes = [
      policy
    ]
  }
}

/* ---------------------------
 * SETTINGS
 * --------------------------- */
resource "aws_api_gateway_method_settings" "main" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_deployment.deploy.stage_name
  method_path = "*/*"

  settings {
    metrics_enabled    = true
    logging_level      = "INFO"
    data_trace_enabled = true
  }
}

/* ---------------------------
 * MAIN STAGE
 * --------------------------- */
resource "aws_api_gateway_stage" "main" {
  stage_name            = "main"
  description           = "Main Stage for deploying functionality"
  rest_api_id           = aws_api_gateway_rest_api.main.id
  deployment_id         = aws_api_gateway_deployment.deploy.id
  xray_tracing_enabled  = var.xray_tracing_enabled

  variables             = var.variables

  access_log_settings {
    destination_arn = var.cloudwatch_log_arn
    format          = "\"{\"requestId\":\"$context.requestId\",\"ip\":\"$context.identity.sourceIp\",\"caller\":\"$context.identity.caller\",\"user\":\"$context.identity.user\",\"requestTime\":$context.requestTimeEpoch,\"httpMethod\":\"$context.httpMethod\",\"resourcePath\":\"$context.resourcePath\",\"status\":$context.status,\"protocol\":\"$context.protocol\",\"path\":\"$context.path\",\"stage\":\"$context.stage\",\"xrayTraceId\":\"$context.xrayTraceId\",\"userAgent\":\"$context.identity.userAgent\",\"responseLength\":$context.responseLength}\""
  }

  lifecycle {
    ignore_changes = [
      deployment_id
    ]
  }

  tags = var.tags

  depends_on = [aws_api_gateway_deployment.deploy]
}

/* ---------------------------
 * DEPLOYMENT
 * --------------------------- */
resource "aws_api_gateway_deployment" "deploy" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = "deploy"
  stage_description = "Deployed at ${timestamp()}"

  triggers = {
    redeployment = sha1(join(",", list(
      jsonencode(var.body)
    )))
  }

  lifecycle {
    create_before_destroy = true
  }
}

shederman on Sep 27, 2020

Just putting it here, in case that helps somebody: I followed the example of bflad (using the REST API Deployment with OpenAPI version), but still had this cycle error.

I finally found that I had some aws_lambda_permission resources to bind lambdas with API gateway that were being updated at the same time as the deployment resource.

After adding depends_on = [aws_api_gateway_deployment.example] on my permission resources, the deployment went fine (ex below):

resource "aws_lambda_permission" "lambda_permission_example" {
  statement_id  = "AllowExecutionFromAPIGateway"
  action        = "lambda:InvokeFunction"
  function_name = "lambda name example"
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_api_gateway_rest_api.example_api_gateway.execution_arn}/*/POST/whatever/*"

  depends_on = [aws_api_gateway_deployment.example]
}

Pepert on Oct 22, 2021

Nobody is threatening anybody. My concern is that people (like us) are using this assuming it will work in Production and it just won’t. The team do not seem to be telling anybody about this.

I think it’s pretty bad form to know about such a serious issue and not indicate it in their documentation. I think it SHOULD be indicated in their documentation, and I asked them to do that MONTHS ago, and they still haven’t.

So what is the responsible thing to do? Ignore this and wait for however long it takes while more and more users get sucked into the same hole? Ask them to update their documentation? Tell people to avoid it because it’s broken? And I don’t hate terraforming API Gateway, I want to be able to but am blocked by this critical bug.

shederman on Dec 23, 2020

Any traction on this?

teemal on Dec 17, 2020

@Glen-Moonpig your solution sounds interesting. The one piece I would dispute is that Terraform supports API Gateway. Terraform is supposed to be a tool to manage infrastructure as code - this is a production focused tool. If Terraform cannot create and manage components like API Gateway without causing production outages not required in normal operation of the component, then I would argue quite vehemently that it is not in fact supported.

Especially since this has been unresolved in one shape or form for over a year.

I am using Terraform to deploy and maintain API Gateways in numerous projects. I have not had any production outages. There are very simple ways to handle this particular scenario. You can just break your changes down into multiple applys and they will go through fine. Terraform 0.13.3/0.14 might resolve the cycle issue as there are various changes around cycles and plans.

Glen-Moonpig on Sep 29, 2020

Nobody is threatening anybody.

I definitely detected a Medium blog post threat 😂

I’m sure a pull request would be appreciated if you fancy mucking in @shederman …

I’ve been using Terraform for API Gateway in production for a couple of years with daily deployments and it’s working very well for me. I appreciate all the efforts people contribute to this project also. Thanks everyone ❤️ Happy Christmas 🎄 🎅

Glen-Moonpig on Dec 23, 2020

@bflad Is there any progress on this issue? Given it is a major issue blocking all usage of AWS API Gateway via Open API in real world Production environments?

shederman on Dec 19, 2020

@riley-clarkson Do you get any service interruptions like that? We have mission-critical services running on API Gateway and the idea of destroying stages on every deploy is not a popular one I can tell you!

shederman on Oct 1, 2020

@Glen-Moonpig your solution sounds interesting. The one piece I would dispute is that Terraform supports API Gateway. Terraform is supposed to be a tool to manage infrastructure as code - this is a production focused tool. If Terraform cannot create and manage components like API Gateway without causing production outages not required in normal operation of the component, then I would argue quite vehemently that it is not in fact supported.

Especially since this has been unresolved in one shape or form for over a year.

shederman on Sep 29, 2020

I also had this issue, and the following solution worked well for me. I’m using a random_uuid resource to produce a value that is passed to the triggers block in the aws_api_gateway_deployment resource. The random_uuid is regenerated when its keepers values change, and these can be set to anything, e.g. jsonencode(aws_api_gateway_method.method) and jsonencode(aws_api_gateway_integration.integration). It is important to make sure that aws_api_gateway_deployment is created after everything else; I achieved this by extracting it into a module and using a mandatory variable.

variable "required_resources" {
  type        = list(string)
  description = "Change in these values trigger redeployment"
}

resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = var.rest_api_id
  stage_name  = var.stage

  # hack to force redeployment every time this hash changes
  triggers = {
    redeployment = sha1(join(",", var.required_resources))
  }

  # false by default, just for clarity
  lifecycle {
    create_before_destroy = false
  }
}

The above resource is placed in its own module.

resource "random_uuid" "deployment_trigger" {
  depends_on = [aws_api_gateway_integration.integration, aws_api_gateway_method.method]
  keepers = {
    # Generate a new id every time something happens to these resources
    method      = jsonencode(aws_api_gateway_method.method)
    integration = jsonencode(aws_api_gateway_integration.integration)
    path        = var.resource_path
  }
}

# some other gateway stuff...

module "deployment" {
  source      = "../modules/api-gateway-deployment"
  rest_api_id = aws_api_gateway_rest_api.api.id
  stage       = var.stage

  required_resources = [
    random_uuid.deployment_trigger.id,
    random_uuid.deployment_trigger_for_another_method.id,
    # add random uuid for each method/integration
  ]
}

I placed stuff required for adding new method into its own module as well so I don’t have to write "random_uuid" "deployment_trigger" multiple times. This seems to be working fine for consecutive deployments and changes to api gateway integration/method.
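As a hedged illustration of that last point, a per-method module could expose its random_uuid through an output so the root configuration only passes module outputs to the deployment module. The output name deployment_trigger and the module names in the comments are hypothetical and are not taken from the published modules linked below.

```hcl
# Inside a hypothetical per-method module that already contains the
# random_uuid "deployment_trigger" resource shown above.
output "deployment_trigger" {
  description = "Changes whenever the method, integration or path changes"
  value       = random_uuid.deployment_trigger.id
}

# The root module could then build the list for the deployment module as:
#
#   required_resources = [
#     module.method_a.deployment_trigger,
#     module.method_b.deployment_trigger,
#   ]
```

Because the list is built from module outputs, the deployment module still depends on every method module and is created after them.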

I published modules I use, they are very basic and might not work for all projects but code can be adapted for your needs. https://github.com/vladcar/terraform-aws-serverless-common-api-gateway-method https://github.com/vladcar/terraform-aws-serverless-common-api-gateway-deployment

vladcar on Jun 23, 2020