serverless-application-model: Bug: SAM deploy tries to delete and recreate AWS::Serverless::Api domain when switching away from Fn::If to hardcoded value

Description:

I have a lambda function with an API Gateway that is deployed to two environments. In each environment I want to specify a different Domain Name. To do so I used conditional statements within the AWS::Serverless::Api resource type:

ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Domain:
        DomainName: !If [inDev, test.dev.hello.world, test.hello.world]
        CertificateArn: !If [inDev, 123, 321]
        EndpointConfiguration: EDGE
        Route53:
          HostedZoneId: !If [inDev, abcxyz, xyzabc]

This worked fine but then we were asked to set up the template.yaml to handle a third environment. To do this I decided to stop using the conditional and instead use parameters that are passed in via the parameter_overrides option in samconfig.toml. This means that the above resource block now looks like:

ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Domain:
        DomainName: !Ref DomainName
        CertificateArn: !Ref EdgeCertificateArn
        EndpointConfiguration: EDGE
        Route53:
          HostedZoneId: !Ref Route53ZoneId

Note that the domain name is unchanged for the two environments that already existed. I then try to deploy this to our dev environment via:

sam deploy --config-env dev --config-file ./samconfig.toml --tags createdby=awssam team=abc --resolve-image-repos --resolve-s3 --no-confirm-changeset --no-fail-on-empty-changeset

Again, nothing is changed other than how I’m getting the data into the template.

What I expect to happen is that there will be no changes because I’m deploying using the dev config-env which already existed and for which I changed no values. I only moved values out of the conditional and into the parameter_overrides.

What actually happens is the changeset reports the following:

CloudFormation stack changeset
-----------------------------------------------------------------------------------------------------------------------------------------
Operation                          LogicalResourceId                  ResourceType                       Replacement                      
-----------------------------------------------------------------------------------------------------------------------------------------
+ Add                              ApiGatewayApiDeployment567d98957   AWS::ApiGateway::Deployment        N/A                              
                                   0                                                                                                      
+ Add                              ApiGatewayDomainName5a4c9e240d     AWS::ApiGateway::DomainName        N/A                              
* Modify                           ApiGatewayApiBasePathMapping       AWS::ApiGateway::BasePathMapping   True                             
* Modify                           ApiGatewayApiprodStage             AWS::ApiGateway::Stage             False                            
* Modify                           RecordSetGroup0d3ed29639           AWS::Route53::RecordSetGroup       False                            
- Delete                           ApiGatewayApiDeploymentff19363ec   AWS::ApiGateway::Deployment        N/A                              
                                   c                                                                                                      
- Delete                           ApiGatewayDomainName4148406711     AWS::ApiGateway::DomainName        N/A                              
-----------------------------------------------------------------------------------------------------------------------------------------

This is problematic for a couple of reasons.

  1. If this plan were to work, it would involve downtime for our service, since the domain would need to be deleted and recreated.
  2. The plan doesn’t actually work: the deployment first tries to create the new custom domain resource and errors out because that domain name already exists in the stack.
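
A guard along these lines could catch such a changeset before it is executed. This is only a sketch: it assumes the changeset has been fetched with boto3's `describe_change_set`, whose response lists each change under `Changes[].ResourceChange` with `Action`, `ResourceType`, `LogicalResourceId` and `Replacement`. The function name `risky_changes` and the set of protected types are hypothetical. Note that the API reports a deletion as `Remove`, which the SAM CLI table prints as `Delete`.

```python
# Resource types whose removal or replacement would mean downtime here
# (an assumption for this example; adjust to your own stack).
PROTECTED_TYPES = {"AWS::ApiGateway::DomainName"}


def risky_changes(changes):
    """Return (action, logical_id) pairs that remove or replace protected resources.

    `changes` is the "Changes" list of a describe_change_set response.
    """
    risky = []
    for change in changes:
        rc = change.get("ResourceChange", {})
        if rc.get("ResourceType") not in PROTECTED_TYPES:
            continue
        action = rc.get("Action")
        replaced = action == "Modify" and rc.get("Replacement") == "True"
        if action == "Remove" or replaced:
            risky.append((action, rc.get("LogicalResourceId")))
    return risky


# Three rows from the changeset above, written in the API's shape.
sample = [
    {"ResourceChange": {"Action": "Add",
                        "LogicalResourceId": "ApiGatewayDomainName5a4c9e240d",
                        "ResourceType": "AWS::ApiGateway::DomainName"}},
    {"ResourceChange": {"Action": "Remove",
                        "LogicalResourceId": "ApiGatewayDomainName4148406711",
                        "ResourceType": "AWS::ApiGateway::DomainName"}},
    {"ResourceChange": {"Action": "Modify",
                        "LogicalResourceId": "ApiGatewayApiprodStage",
                        "ResourceType": "AWS::ApiGateway::Stage",
                        "Replacement": "False"}},
]
print(risky_changes(sample))
```

A CI step could fail the pipeline whenever the returned list is non-empty, instead of relying on someone reading the changeset table.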

I have also tried this by modifying the AWS::Serverless::Api resource just like so:

ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Domain:
        DomainName: test.dev.hello.world
        CertificateArn: !If [inDev, 123, 321]
        EndpointConfiguration: EDGE
        Route53:
          HostedZoneId: !If [inDev, abcxyz, xyzabc]

Here I simply hardcode the DomainName (again, after having already deployed with the conditional setup). Even this triggers the changeset above, which wants to delete the existing custom domain and create a new one.
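
The changing suffix on `ApiGatewayDomainName...` is consistent with the logical ID being derived from a hash of the `DomainName` property as written in the template, rather than from its resolved value. The following is a rough illustration of that idea only: `suffix_for` is a hypothetical helper, and the hashing and serialization shown are assumptions, not SAM's actual code, so the printed values are not expected to match the IDs in the changeset.

```python
import hashlib
import json


def suffix_for(value, length=10):
    """Hash the template value as written (not its resolved value)."""
    serialized = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()[:length]


# What a transform sees before CloudFormation resolves any intrinsic functions.
conditional = {"Fn::If": ["inDev", "test.dev.hello.world", "test.hello.world"]}
hardcoded = "test.dev.hello.world"
parameter = {"Ref": "DomainName"}

for label, value in [("Fn::If", conditional), ("hardcoded", hardcoded), ("Ref", parameter)]:
    print(label, "ApiGatewayDomainName" + suffix_for(value))
```

All three inputs resolve to the same domain in dev, yet each produces a different suffix. Under this assumption, CloudFormation would see a new logical ID and therefore a new resource, which matches the Add/Delete pair in the changeset.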

Steps to reproduce:

I went ahead and replicated this behavior using the hello world SAM app with modifications. What you’ll need to do is initialize the hello world app and replace the template.yaml and samconfig.toml with the code below. You’ll need to update the Domain properties to actual values for your test case.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  test-sam

  Sample SAM Template for test-sam

Globals:
  Function:
    Timeout: 3
    MemorySize: 128
    Tracing: Active
  Api:
    TracingEnabled: true

Parameters:
  Environment:
    Type: String
    Description: Name of environment
    AllowedValues:
      - dev
      - prod

Conditions:
  inDev:
    !Equals [!Ref Environment, dev]

Resources:
  #############################################################################
  # API Gateway with Custom Domain Name
  #############################################################################
  ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Domain:
        DomainName: !If [inDev, test.dev.hello.world, test.hello.world]
        CertificateArn: !If [inDev, 123, 321]
        EndpointConfiguration: EDGE
        Route53:
          HostedZoneId: !If [inDev, abcxyz, xyzabc]
  
  #############################################################################
  # Lambda for Service
  #############################################################################
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: hello_world/
      Handler: app.lambda_handler
      Runtime: python3.9
      Architectures:
      - x86_64
      Events:
        HelloWorld:
          Type: Api
          Properties:
            Path: /hello
            Method: GET
            RestApiId: !Ref ApiGatewayApi

And here is the samconfig.toml. Note that you don’t need the parameter overrides to replicate the bug.

version = 0.1
[dev]
[dev.deploy]
[dev.deploy.parameters]
stack_name = "sam-test"
s3_prefix = "sam-test"
region = "us-east-1"
confirm_changeset = true
capabilities = ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"]
parameter_overrides = [
    "Environment=dev",
    "DomainName=test.dev.hello.world",
    "EdgeCertificateArn=123",
    "Route53ZoneId=abcxyz",
]

[prod]
[prod.deploy]
[prod.deploy.parameters]
stack_name = "sam-test"
s3_prefix = "sam-test"
region = "us-east-1"
confirm_changeset = true
capabilities = ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"]
parameter_overrides = [
    "Environment=prod",
    "DomainName=test.hello.world",
    "EdgeCertificateArn=321",
    "Route53ZoneId=xyzabc",
]

After updating the samconfig.toml and template.yaml you’ll need to do the following steps:

  1. Deploy the application to the dev environment.
  2. Change DomainName in AWS::Serverless::Api to the hardcoded string test.dev.hello.world (the same name it was deployed with before).
  3. Build and try to deploy again.

Observed result:

Initiating deployment
=====================

2023-03-08 14:09:49,686 | Collected default values for parameters: {}
2023-03-08 14:09:49,700 | Sam customer defined id is more priority than other IDs. Customer defined id for resource ApiGatewayApi is ApiGatewayApi
2023-03-08 14:09:49,700 | Sam customer defined id is more priority than other IDs. Customer defined id for resource HelloWorldFunction is HelloWorldFunction
2023-03-08 14:09:49,700 | 0 stacks found in the template
2023-03-08 14:09:49,700 | Collected default values for parameters: {}
2023-03-08 14:09:49,712 | Sam customer defined id is more priority than other IDs. Customer defined id for resource ApiGatewayApi is ApiGatewayApi
2023-03-08 14:09:49,712 | Sam customer defined id is more priority than other IDs. Customer defined id for resource HelloWorldFunction is HelloWorldFunction
2023-03-08 14:09:49,712 | 2 resources found in the stack 
        Uploading to sam-test/538be5ddbb1414e39664b8ea7dc96ed1.template  1609 / 1609  (100.00%)


Waiting for changeset to be created..

CloudFormation stack changeset
-----------------------------------------------------------------------------------------------------------------------------------------
Operation                          LogicalResourceId                  ResourceType                       Replacement                      
-----------------------------------------------------------------------------------------------------------------------------------------
+ Add                              ApiGatewayApiDeployment567d98957   AWS::ApiGateway::Deployment        N/A                              
                                   0                                                                                                      
+ Add                              ApiGatewayDomainName5a4c9e240d     AWS::ApiGateway::DomainName        N/A                              
* Modify                           ApiGatewayApiBasePathMapping       AWS::ApiGateway::BasePathMapping   True                             
* Modify                           ApiGatewayApiprodStage             AWS::ApiGateway::Stage             False                            
* Modify                           RecordSetGroup0d3ed29639           AWS::Route53::RecordSetGroup       False                            
- Delete                           ApiGatewayApiDeploymentff19363ec   AWS::ApiGateway::Deployment        N/A                              
                                   c                                                                                                      
- Delete                           ApiGatewayDomainName4148406711     AWS::ApiGateway::DomainName        N/A                              
-----------------------------------------------------------------------------------------------------------------------------------------


Changeset created successfully. arn:aws:cloudformation:us-east-1:xxx:changeSet/samcli-deploy1678313390/cb3b2f9e-e8d0-469b-810d-d6c5c5731237


2023-03-08 14:10:03 - Waiting for stack create/update to complete

CloudFormation events from stack operations (refresh every 0.5 seconds)
-----------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus                     ResourceType                       LogicalResourceId                  ResourceStatusReason             
-----------------------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS                 AWS::ApiGateway::DomainName        ApiGatewayDomainName5a4c9e240d     -                                
CREATE_IN_PROGRESS                 AWS::ApiGateway::Deployment        ApiGatewayApiDeployment567d98957   -                                
                                                                      0                                                                   
CREATE_FAILED                      AWS::ApiGateway::DomainName        ApiGatewayDomainName5a4c9e240d     test.dev.xxx.xxx already    
                                                                                                         exists in stack                  
                                                                                                         arn:aws:cloudformation:us-       
                                                                                                         east-1:xxx:stack/sam-te 
                                                                                                         st/16f30000-bdf3-11ed-977a-12beb 
                                                                                                         d4450e9                          
CREATE_FAILED                      AWS::ApiGateway::Deployment        ApiGatewayApiDeployment567d98957   Resource creation cancelled      
                                                                      0                                                                   
UPDATE_ROLLBACK_IN_PROGRESS        AWS::CloudFormation::Stack         sam-test                           The following resource(s) failed 
                                                                                                         to create:                       
                                                                                                         [ApiGatewayDomainName5a4c9e240d, 
                                                                                                         ApiGatewayApiDeployment567d98957 
                                                                                                         0].                              
UPDATE_ROLLBACK_COMPLETE_CLEANUP   AWS::CloudFormation::Stack         sam-test                           -                                
_IN_PROGRESS                                                                                                                              
DELETE_COMPLETE                    AWS::ApiGateway::DomainName        ApiGatewayDomainName5a4c9e240d     -                                
DELETE_COMPLETE                    AWS::ApiGateway::Deployment        ApiGatewayApiDeployment567d98957   -                                
                                                                      0                                                                   
UPDATE_ROLLBACK_COMPLETE           AWS::CloudFormation::Stack         sam-test                           -                                
-----------------------------------------------------------------------------------------------------------------------------------------

2023-03-08 14:13:07,379 | Execute stack waiter exception
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/aws-sam-cli/1.76.0/libexec/lib/python3.8/site-packages/samcli/lib/deploy/deployer.py", line 502, in wait_for_execute
    waiter.wait(StackName=stack_name, WaiterConfig=waiter_config)
  File "/opt/homebrew/Cellar/aws-sam-cli/1.76.0/libexec/lib/python3.8/site-packages/botocore/waiter.py", line 55, in wait
    Waiter.wait(self, **kwargs)
  File "/opt/homebrew/Cellar/aws-sam-cli/1.76.0/libexec/lib/python3.8/site-packages/botocore/waiter.py", line 375, in wait
    raise WaiterError(
botocore.exceptions.WaiterError: Waiter StackUpdateComplete failed: Waiter encountered a terminal failure state: For expression "Stacks[].StackStatus" we matched expected path: "UPDATE_ROLLBACK_COMPLETE" at least once
2023-03-08 14:13:07,384 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics
2023-03-08 14:13:07,475 | Sending Telemetry: {'metrics': [{'commandRun': {'requestId': '9543398a-8c4d-4d4f-befc-1c2ad451d024', 'installationId': 'dda226e3-9b79-4e59-84a4-1c7253bce103', 'sessionId': '6666cf63-41ac-47e3-9766-19f1b6d116df', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.8.16', 'samcliVersion': '1.76.0', 'awsProfileProvided': True, 'debugFlagProvided': True, 'region': 'us-east-1', 'commandName': 'sam deploy', 'metricSpecificAttributes': {'projectType': 'CFN', 'gitOrigin': None, 'projectName': 'c705de491dcb53c849e84aa5634de3748cc3f96f7126d125eef2aa054399d24d', 'initialCommit': None}, 'duration': 200522, 'exitReason': 'DeployFailedError', 'exitCode': 1}}]}
2023-03-08 14:13:08,016 | Telemetry response: 200
Error: Failed to create/update the stack: sam-test, Waiter StackUpdateComplete failed: Waiter encountered a terminal failure state: For expression "Stacks[].StackStatus" we matched expected path: "UPDATE_ROLLBACK_COMPLETE" at least once

Expected result:

I expected no changes in the changeset because I am not changing any values, only the way the values are passed into the template (hardcoded vs. using a conditional statement).

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

{
  "version": "1.76.0",
  "system": {
    "python": "3.8.16",
    "os": "macOS-13.2-arm64-arm-64bit"
  },
  "additional_dependencies": {
    "docker_engine": "20.10.23",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  }
}

Thank you! Happy to answer any clarifying questions.

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 2
  • Comments: 19 (9 by maintainers)

Most upvoted comments

I came up with a flow that worked on my test application. Going to try it later this week on the main service.

The process will be as follows:

  1. Build a new API Gateway, attach it to the Lambda, and build a CloudFront distribution that points to the new API Gateway.
  2. Deploy ^ through Stg -> Dev -> Prod
  3. Manually convert all existing records to weighted records
  4. Build a record set group that creates a weighted DNS record pointing to the CloudFront distribution
  5. Deploy ^ through Stg -> Dev -> Prod

At this point there should now be two DNS records for this service in each environment. Manually migrate traffic from the old DNS record to the new DNS record. Once the migration has completed in all environments do the following:

  1. Delete the old API from the SAM template
  2. Delete the Lambda events pointing to the old API from the SAM template
  3. Deploy ^ through Stg -> Dev -> Prod

This is similar in spirit to deconstructing the AWS::Serverless::Api into its constituent parts. However, I avoid setting up a custom domain name and instead use CloudFront, which allows me to run the new API in parallel with the old one, migrate traffic, and then shut down the “old” one.
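
The migration amounts to keeping two weighted records with the same name and shifting weight from one to the other. Below is a minimal sketch of the payload, assuming the records are changed with boto3's Route 53 `change_resource_record_sets`. The helpers `weighted_alias_change` and `shift_batch`, the `old-api`/`new-api` identifiers and the `cloudfront.net` names are placeholders made up for illustration. `Z2FDTNDATAQYW2` is the alias hosted zone ID Route 53 uses for CloudFront targets.

```python
CLOUDFRONT_ALIAS_ZONE_ID = "Z2FDTNDATAQYW2"  # alias zone ID for CloudFront targets


def weighted_alias_change(name, set_identifier, target_dns_name, weight):
    """Build one UPSERT for a weighted alias A record."""
    if not 0 <= weight <= 255:
        raise ValueError("Route 53 weights must be between 0 and 255")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_identifier,  # distinguishes records sharing a name
            "Weight": weight,
            "AliasTarget": {
                "HostedZoneId": CLOUDFRONT_ALIAS_ZONE_ID,
                "DNSName": target_dns_name,
                "EvaluateTargetHealth": False,
            },
        },
    }


def shift_batch(name, old_target, new_target, new_share):
    """Give `new_share` (0-255) of the weight to the new record, the rest to the old."""
    return {
        "Changes": [
            weighted_alias_change(name, "old-api", old_target, 255 - new_share),
            weighted_alias_change(name, "new-api", new_target, new_share),
        ]
    }


batch = shift_batch(
    "test.dev.hello.world.",
    "dOLD.cloudfront.net.",  # placeholder for the existing target
    "dNEW.cloudfront.net.",  # placeholder for the new distribution
    new_share=25,
)
# With real values this would be sent as:
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<zone-id>", ChangeBatch=batch)
print([c["ResourceRecordSet"]["Weight"] for c in batch["Changes"]])
```

Raising `new_share` in a few steps up to 255 moves all traffic to the new record, after which the old API can be removed.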

I’m going to give it a shot later this week and close this issue if I’m successful.

@aahung this is a fair point and I actually think I can get away with doing that here.

I was deleting resources from the stack by commenting them out of the untransformed template.yaml and then running build and deploy. This is a bit like taking a hammer to the problem. I’ll revisit doing this work to the template directly.

What is the best way to modify the transformed template directly? Should I download it, remove the necessary resources, then reimport it via the GUI?

Thank you.

Cool, thanks guys. I’ll chip away at it and let you know if I have follow-up questions.

How can I get the translated template without doing a full stack deployment?

There are a few ways.

Get transformed template of deployed stack

If your stack is deployed, you can get the transformed template from the CloudFormation console (Template tab, enable View processed template).

Or using the AWS CLI, assuming your stack is named <my-stack>:

aws cloudformation get-template --query TemplateBody --change-set-name "$(aws cloudformation describe-stacks --query 'Stacks[0].ChangeSetId' --output text --stack-name <my-stack>)"
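
The same can be done from Python. This is a sketch assuming boto3; `processed_template` is a hypothetical helper that takes the client as an argument so it can be tried without AWS access. Instead of going through the change set ID, it asks for the stack's template directly with `TemplateStage="Processed"`.

```python
import json


def processed_template(cfn, stack_name):
    """Return the post-transform template of a deployed stack as a JSON string."""
    response = cfn.get_template(StackName=stack_name, TemplateStage="Processed")
    body = response["TemplateBody"]
    # boto3 decodes JSON template bodies into a dict; other bodies stay strings.
    return body if isinstance(body, str) else json.dumps(body, indent=2)


# Usage with real credentials (not run here):
#   import boto3
#   print(processed_template(boto3.client("cloudformation"), "sam-test"))
```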

Transform template locally

If you want to transform a template locally, you can use the script included in our repository:

git clone https://github.com/aws/serverless-application-model.git
cd serverless-application-model
python3 -m venv .venv
source .venv/bin/activate
make init

Then:

bin/sam-translate.py --template-file template.yaml

Note, however, that transforming with that script won’t always work, as it assumes the input template is in the same format as what AWS::Serverless-2016-10-31 receives (e.g. after sam package, when all local paths have been replaced with proper URIs to resources in AWS).

Transform template without full deployment

If you want a more faithful transformation, but without actually creating the resources in the template, you can create a change set (not execute it) and get the transformed template.

If it’s too tedious to do it through the console, you could whip up a script such as the following (untested, for inspiration only, not production-ready):

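import json
import sys
import uuid

import boto3


def transform(template: str) -> str:
    """Return the transformed template via a change set that is never executed."""
    cfn = boto3.client("cloudformation")
    # Throwaway name, used for both the temporary stack and its change set.
    name = f"transform-{uuid.uuid4()}"
    # A CREATE change set runs the transform without creating any resources.
    change_set = cfn.create_change_set(
        TemplateBody=template,
        StackName=name,
        ChangeSetName=name,
        ChangeSetType="CREATE",
        Capabilities=[
            "CAPABILITY_IAM",
            "CAPABILITY_AUTO_EXPAND",
        ],
    )
    change_set_id = change_set["Id"]
    try:
        waiter = cfn.get_waiter("change_set_create_complete")
        waiter.wait(
            ChangeSetName=change_set_id,
            WaiterConfig={
                "Delay": 1,
            },
        )
        # get_template returns the processed (transformed) template by default.
        transformed = cfn.get_template(ChangeSetName=change_set_id)
    finally:
        # Remove the placeholder stack even if waiting for the change set fails.
        cfn.delete_stack(StackName=name)
    return json.dumps(transformed["TemplateBody"])


def main():
    # Read the SAM template from stdin, write the transformed JSON to stdout.
    print(transform(sys.stdin.read()))


if __name__ == "__main__":
    main()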

And then transform with:

python transform.py < sam-template.yaml > cfn-template.json
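
Once both the deployed (processed) template and a freshly transformed one are available as JSON, comparing their resource logical IDs shows up front which resources CloudFormation would add or delete. This is a small sketch; `logical_id_diff` is a hypothetical helper, and the two templates are trimmed-down stand-ins that reuse the logical IDs from the changeset in this issue.

```python
def logical_id_diff(deployed, candidate):
    """Compare the Resources sections of two transformed templates."""
    old = {k: v.get("Type") for k, v in deployed.get("Resources", {}).items()}
    new = {k: v.get("Type") for k, v in candidate.get("Resources", {}).items()}
    return {
        "added": sorted((k, new[k]) for k in new.keys() - old.keys()),
        "removed": sorted((k, old[k]) for k in old.keys() - new.keys()),
    }


# Stand-ins; in practice load both with json.load() from files such as
# cfn-template.json and the template downloaded from the deployed stack.
deployed = {"Resources": {
    "ApiGatewayApi": {"Type": "AWS::ApiGateway::RestApi"},
    "ApiGatewayDomainName4148406711": {"Type": "AWS::ApiGateway::DomainName"},
}}
candidate = {"Resources": {
    "ApiGatewayApi": {"Type": "AWS::ApiGateway::RestApi"},
    "ApiGatewayDomainName5a4c9e240d": {"Type": "AWS::ApiGateway::DomainName"},
}}

diff = logical_id_diff(deployed, candidate)
print(diff)
```

An entry of the same type in both `added` and `removed` is the signature of the problem described in this issue: the resource is unchanged in substance, but its logical ID moved.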