terraform-provider-aws: source_code_hash does not update
This issue was originally opened by @joerggross as hashicorp/terraform#20152. It was migrated here as a result of the provider split. The original body of the issue is below.
Terraform Version
v0.11.11
Terraform Configuration Files
data "aws_s3_bucket_object" "lambda_jar_hash" {
bucket = "${var.lambda_s3_bucket}"
key = "${var.lambda_s3_key}.sha256"
}
resource "aws_lambda_function" "lambda_function_s3" {
s3_bucket = "${var.lambda_s3_bucket}"
s3_key = "${var.lambda_s3_key}"
s3_object_version = "${var.lambda_s3_object_version}"
function_name = "${var.lambda_function_name}"
role = "${var.lambda_execution_role_arn}"
handler = "${var.lambda_function_handler}"
source_code_hash = "${base64encode(data.aws_s3_bucket_object.lambda_jar_hash.body)}"
runtime = "java8"
memory_size = "${var.lambda_function_memory}"
timeout = "${var.lambda_function_timeout}"
description = "${var.description}"
reserved_concurrent_executions = "${var.reserved_concurrent_executions}"
}
Debug Output

```
…
~ module.comp-price-import-data-reader-scheduled-lambda.aws_lambda_function.lambda_function_s3
      last_modified:    "2019-01-30T11:58:32.826+0000" => <computed>
      source_code_hash: "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="

Plan: 0 to add, 1 to change, 0 to destroy.
```
Expected Behavior
We generate an additional file in the s3 bucket along with the lambda jar file to be deployed. The additional file contains a SHA256 hash of the deployed jar file. The hash value from that file is set as the `source_code_hash` property of the lambda function, using the `base64encode` function.
We would expect the hash to be stored in the tfstate and reused when applying the scripts, so that the lambda jar file is not redeployed unless the hash changes.
Actual Behavior
We applied the scripts several times without changing the jar or hash file in s3. Nevertheless, terraform always redeploys the jar. The output (see above) is always the same ("6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="). It seems that the given hash is never stored in the tfstate.
About this issue
- Original URL
- State: open
- Created 5 years ago
- Reactions: 35
- Comments: 25 (5 by maintainers)
Commits related to this issue
- Workaround for https://github.com/terraform-providers/terraform-provider-aws/issues/7385 — committed to michallorens/aws-lambda-python-layer by michallorens 4 years ago
- CDK for Terraform * Created a working version of the Lambda function * Created a zip file of the Lambda code and uploaded to the S3 code bucket * Added a CloudWatch Group and IAM role to be used by... — committed to FormidableLabs/pulumi-terraform-comparison by archetypalsxe 2 years ago
We're seeing the exact same issue: `source_code_hash` is never updated in the tfstate when applying, so the lambda resource always requires updating no matter how many times we apply.

I'm reporting the same concern, too.
The main problem is that the purpose of `source_code_hash` isn't clear. The documentation of `aws_lambda_function` states that `source_code_hash` is an argument that seems to have an impact on deployment, but that doesn't seem to be the case.

Looking at the source code, it is a computed field. After a successful deploy, the value of `source_code_hash` is overwritten by the response from AWS's API (code) by calling `resourceAwsLambdaFunctionRead()`. In short, the value assigned to `source_code_hash` doesn't affect deployment and is always overwritten, unless it already matches the hash returned by the AWS API.

What we need

We need a way to deterministically trigger lambda deployments (e.g. after a code change is detected) without presumptions that everyone uses the same process to package their code.

Is `source_code_hash` the correct attribute to use for this? Yes and no. It'd be nice to keep the hash returned by AWS's API, but we'd probably need another attribute similar to `source_code_hash` that meets our need.

Suggestion

- Make sure `source_code_hash` is clearly defined as an output, removing `Optional: true` from the schema for `source_code_hash`.
- Add a new attribute `change_trigger_hash` that is optional and not computed. Suggestions for a better name are welcome.
- If `change_trigger_hash` is null, then plan and apply would work as they do now.
- If `change_trigger_hash` is not null, then compare the current value to the previous value. If they differ, include the change in the plan; otherwise, ignore the resource change.

@aeschright does this sound like something that we can do? I'll submit a PR if yes.
===========================

Update: Upon looking further, `source_code_hash` indeed triggers a change, which makes my suggestion invalid. I'll try out an idea which I hope will work.

I had a very similar problem where the statefile was not getting an updated `source_code_hash` after an apply. @Miggleness pointed me in the right direction by noting that the value in `source_code_hash` is overwritten by AWS. This means that the hash you use in your lambda resource definition must be computed the same way that AWS computes the hash. Otherwise, you will always have a different value in your `source_code_hash`, and your lambda will always be redeployed.

So when you see something like:
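```
source_code_hash: "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="
```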
The value on the left is the AWS-calculated hash, and the value on the right is the one you are providing Terraform in your lambda definition.
If you calculate the hash yourself with shell, use the following algorithm:
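```sh
# Matches how AWS computes CodeSha256: base64 of the *binary* SHA-256
# digest of the package file. "lambda.zip" is an illustrative filename.
openssl dgst -sha256 -binary lambda.zip | openssl base64
```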
If you calculate it with a Python script, use something like the following:
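```python
# A sketch of the same computation in Python (illustrative, not the
# commenter's original script): base64-encode the binary SHA-256 digest.
import base64
import hashlib

def source_code_hash(path: str) -> str:
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    return base64.b64encode(digest).decode("utf-8")

print(source_code_hash("lambda.zip"))
```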
I’m experiencing this with v0.12.20 and aws provider v2.65.0 with a zip file that’s referenced from an s3 bucket.
I'm using the `etag` from the s3 object as the input for the hash, which shouldn't change unless we upload a new version.

When I run apply twice in a row, the input hash is always the same, but the new hash is not being persisted to the state, and the next run shows the same output.
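For reference, that pattern looks roughly like this (a sketch with illustrative names, not the commenter's actual code):

```hcl
data "aws_s3_bucket_object" "lambda_zip" {
  bucket = var.lambda_s3_bucket
  key    = var.lambda_s3_key
}

resource "aws_lambda_function" "lambda" {
  function_name = var.lambda_function_name
  role          = var.lambda_execution_role_arn
  handler       = var.lambda_function_handler
  runtime       = "java8"
  s3_bucket     = var.lambda_s3_bucket
  s3_key        = var.lambda_s3_key

  # The etag only changes when a new object is uploaded, so this input is
  # stable across runs (it is not, however, the hash AWS itself returns).
  source_code_hash = base64sha256(data.aws_s3_bucket_object.lambda_zip.etag)
}
```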
An easier (alternative) way to update the lambda function on code change, when sourced from S3, would be to enable S3 bucket versioning and set the lambda zip version:
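```hcl
# A sketch of this approach (illustrative names, not an exact recipe).
# Versioning must be enabled on the bucket for version_id to be set.
resource "aws_s3_bucket_object" "lambda_zip" {
  bucket = var.lambda_s3_bucket
  key    = "lambda.zip"
  source = "lambda.zip"
  etag   = filemd5("lambda.zip") # re-uploads whenever the local file changes
}

resource "aws_lambda_function" "lambda" {
  function_name = var.lambda_function_name
  role          = var.lambda_execution_role_arn
  handler       = var.lambda_function_handler
  runtime       = "java8"
  s3_bucket     = var.lambda_s3_bucket
  s3_key        = aws_s3_bucket_object.lambda_zip.key

  # Every upload creates a new object version, so a redeploy is triggered
  # exactly when the code actually changed, with no hash juggling required.
  s3_object_version = aws_s3_bucket_object.lambda_zip.version_id
}
```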
Dear all,
we have the issue described above. The code looks similar. Each time we run `terraform apply`, the Lambda function is redeployed, even if nothing has changed. I have looked at the output of terraform and can confirm that the hash in `source_code_hash` is not updated in the state file.

I am experiencing the same issue, specifically inside a CI/CD pipeline. It does not occur on OSX, and it does not occur in Docker on OSX when the project directory is mounted from OSX.
However, with the same Docker image, TF version, and AWS provider version, the hashes in the CI pipeline never match. The ones generated by `filebase64sha256("../lambda/index.zip")` match between runs; however, the ones stored in state are completely different each time.

I thought this was an issue of something else getting hashed, such as a timestamp or similar, but the generated hash is the same. Somehow, the hash that gets computed doesn't get stored under `source_code_hash`.
This is actually quite a nasty problem, because when the Lambda is used with CloudFront, the latter redeploys each time, since AWS thinks a new version of the Lambda has been created. This adds at least 3, and often 10+, minutes to the CD pipeline.
If someone is still running into this issue, a fix that worked for me was the `etag` in the `aws_s3_object` resource block. This tag triggers an update on the zip file on deployment. If you have the lambda resource block pointing to the right bucket and key, the lambda should get updated. I have tried several steps involving manually zipping up the lambdas and using the `archive` data block, but on deployment it never detected the change, even with the source hash. `etag` on the `aws_s3_bucket_object` did the trick.

EDIT: Thanks to @heldersepu for pointing this out: if you are using KMS encryption, `source_hash` will be a better alternative.
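A minimal sketch of that fix (illustrative names; with KMS encryption, swap `etag` for `source_hash`):

```hcl
resource "aws_s3_object" "lambda_zip" {
  bucket = var.lambda_s3_bucket
  key    = "lambda.zip"
  source = "lambda.zip"

  # Triggers a re-upload whenever the local file changes. With SSE-KMS the
  # stored ETag is no longer the MD5 of the file, so use source_hash there.
  etag = filemd5("lambda.zip")
  # source_hash = filemd5("lambda.zip")
}
```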
I've done a little digging into this issue as I recently encountered it.
In my use case I generate the zip files frequently; even if the underlying contents don't change, the metadata changes in the zip file cause a different hash.

To get around this, I tried to generate the hash of the contents outside of the zip and set it as the source code hash.
From my observations, it appears that the `source_code_hash` field gets set in the state file from the `filename` field regardless of the value supplied to it, i.e. `filebase64sha256(aws_lambda_function.func.filename)`.

My case was slightly different, but with the same effect: I am building an `aws_lambda_layer_version` resource. I was trying to have `source_code_hash = filebase64sha256("poetry.lock")`, because that is the only file present at both build and deployment time. I wanted to be smart and make terraform skip deploying a new layer if the poetry.lock did not change. Then I faced the same issue as described here (and in several forum posts).
I also ended up storing the hash alongside the zip file in s3 during build time. I calculate the hash like @AGiantSquid suggests, using `cat layer.zip | openssl dgst -binary -sha256 | openssl base64`, and then upload it to s3 alongside the zip file. In the deployment terraform code I tried to apply the following, somewhat ugly construct:
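```hcl
# A sketch of the kind of construct described (illustrative names, not the
# commenter's original code). The .sha256 object must be uploaded with
# content_type = "text/plain" so that the data source exposes its body.
data "aws_s3_bucket_object" "layer_hash" {
  bucket = var.layer_s3_bucket
  key    = "${var.layer_s3_key}.sha256"
}

resource "aws_lambda_layer_version" "dependencies" {
  layer_name = "dependencies"
  s3_bucket  = var.layer_s3_bucket
  s3_key     = var.layer_s3_key

  # The stored hash is already base64-encoded by the openssl pipeline, so
  # it is used as-is, minus the trailing newline.
  source_code_hash = chomp(data.aws_s3_bucket_object.layer_hash.body)
}
```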
My terraform plan has finally stopped marking my layer for replacement. The problem is that it stopped for the wrong reasons: now it does not update even if the actual zipped dependencies change!

Update: It really works; whenever the layer zip's hash changes in s3, a new layer version will be produced. Make sure to upload the hash object with `content_type` `text/plain` to avoid much frustration.

I have also looked into the fact that zipping the same content twice produces different hashes (even though the zips' contents are the same), so I am starting to get clueless.

I use the `deterministic_zip` pip package to make sure my zip's effective content doesn't change, while still allowing myself to upload a new zip file every time, like @joerggross, too. (https://github.com/bboe/deterministic_zip)

My 2 cents, and I admit I was facepalming myself as well when I found out, but look at the documentation: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_function#source_code_hash
> `source_code_hash` - (Optional) Used to trigger updates. Must be set to a base64-encoded SHA256 hash of the package file specified with either `filename` or `s3_key`. The usual way to set this is `${filebase64sha256("file.zip")}` (Terraform 0.11.12 or later) or `${base64sha256(file("file.zip"))}` (Terraform 0.11.11 and earlier), where "file.zip" is the local filename of the lambda layer source archive.

This is not a new entry (I went and looked it up in the git history; this piece of information has been there for more than 4 years: https://github.com/hashicorp/terraform-provider-aws/commit/992d6978ce734d50124e3bed00c4022c106b3085). Even though it is old enough that I can shame myself for not noticing it during implementation, I fully align with @Miggleness's opinion about making this a little less prone to mess-ups.
I am not yet sure, though, how to ergonomically eliminate the hashing function there, because I feel the `source_code_hash` parameter in its current form has too many moving parts, causing all of us to naively drop in different calculations, whereas the documentation clearly states that it must be set to a specific hash function of a specific file. An option could be for the resource to calculate the hash itself, lowering the level of abstraction so that we only provide a `source_package_file`, which can be a zip, jar, or whatever we need to deploy.

Using a version id will not work for us, because we want to use snapshot versions during development, without always deploying and referencing a new version number.