terraform-provider-aws: Order is lost for data `aws_iam_policy_document` when applied to S3 buckets, IAM roles, KMS keys, etc.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave “+1” or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

See also #21968

Terraform Version

✗ terraform --version
Terraform v0.12.20
+ provider.aws v2.46.0

Affected Resource(s)

  • aws_ecr_repository_policy
  • aws_kms_key
  • aws_s3_bucket
  • aws_launch_configuration
  • aws_lb_listener
  • aws_codeartifact_domain_permissions_policy

Terraform Configuration Files

If running this locally, please add your own real roles to local.secrets_roles; otherwise the KMS policy will fail to apply.

provider "aws" {
  region = "us-east-1"
}

data "aws_caller_identity" "current" {}

locals {
  id = data.aws_caller_identity.current.account_id
  secrets_roles = [
    "asnip",
    "esnip",
    "lsnip",
    "psnip",
    "wsnip",
    "csnip",
#    "default",
  ]

  secrets_roles_arns = sort([
    for role in local.secrets_roles :
    "arn:aws:iam::${local.id}:role/${role}"
  ])

  secrets_admin = [
    "arn:aws:iam::${local.id}:role/sre"
  ]
}

data "aws_iam_policy_document" "secrets" {
  policy_id = "secrets-policy"

  statement {
    sid       = "Enable IAM User Permissions"
    effect    = "Allow"
    resources = ["*"]
    actions = [
      "kms:Create*",
      "kms:Describe*",
      "kms:Enable*",
      "kms:List*",
      "kms:Put*",
      "kms:Update*",
      "kms:Revoke*",
      "kms:Disable*",
      "kms:Get*",
      "kms:Delete*",
      "kms:ScheduleKeyDeletion",
      "kms:CancelKeyDeletion",
    ]

    principals {
      type        = "AWS"
      identifiers = local.secrets_admin
    }
  }

  statement {
    sid       = "Allow use of the key"
    effect    = "Allow"
    resources = ["*"]

    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey",
      "kms:CreateGrant",
      "kms:ListGrants",
      "kms:RevokeGrant"
    ]

    principals {
      type = "AWS"

      identifiers = local.secrets_roles_arns
    }
  }
}

resource "aws_kms_key" "secrets_test" {
  description         = "test Default key for secrets encryption."
  key_usage           = "ENCRYPT_DECRYPT"
  is_enabled          = true
  enable_key_rotation = false

  policy = data.aws_iam_policy_document.secrets.json
}

resource "aws_kms_alias" "secrets_test" {
  name          = "alias/secrets-test"
  target_key_id = aws_kms_key.secrets_test.key_id
}

output "role_arns" {
  value = local.secrets_roles_arns
}

output "policy_json" {
  value = data.aws_iam_policy_document.secrets.json
}

output "key_policy_json" {
  value = aws_kms_key.secrets_test.policy
}

In terraform console

local.secrets_roles_arns is alphabetical

[
  "arn:aws:iam::1234567890:role/asnip",
  "arn:aws:iam::1234567890:role/csnip",
  "arn:aws:iam::1234567890:role/esnip",
  "arn:aws:iam::1234567890:role/lsnip",
  "arn:aws:iam::1234567890:role/psnip",
  "arn:aws:iam::1234567890:role/wsnip",
]

data.aws_iam_policy_document.secrets.json is in reverse alphabetical order, which is expected (https://github.com/terraform-providers/terraform-provider-aws/issues/6107)

...
      "Principal": {
        "AWS": [
          "arn:aws:iam::1234567890:role/wsnip",
          "arn:aws:iam::1234567890:role/psnip",
          "arn:aws:iam::1234567890:role/lsnip",
          "arn:aws:iam::1234567890:role/esnip",
          "arn:aws:iam::1234567890:role/csnip",
          "arn:aws:iam::1234567890:role/asnip"
        ]
      }
...

aws_kms_key.secrets_test.policy returns the same output, which is perfect

      "Principal": {
        "AWS": [
          "arn:aws:iam::1234567890:role/wsnip",
          "arn:aws:iam::1234567890:role/psnip",
          "arn:aws:iam::1234567890:role/lsnip",
          "arn:aws:iam::1234567890:role/esnip",
          "arn:aws:iam::1234567890:role/csnip",
          "arn:aws:iam::1234567890:role/asnip"
        ]
      }

After applying the above, I added a default role to local.secrets_roles and saw this in the plan

...
                      ~ Principal = {
                          ~ AWS = [
                              - "arn:aws:iam::1234567890:role/asnip",
                              - "arn:aws:iam::1234567890:role/psnip",
                                "arn:aws:iam::1234567890:role/wsnip",
                              - "arn:aws:iam::1234567890:role/csnip",
                              - "arn:aws:iam::1234567890:role/esnip",
                              + "arn:aws:iam::1234567890:role/psnip",
                                "arn:aws:iam::1234567890:role/lsnip",
                              + "arn:aws:iam::1234567890:role/esnip",
                              + "arn:aws:iam::1234567890:role/default",
                              + "arn:aws:iam::1234567890:role/csnip",
                              + "arn:aws:iam::1234567890:role/asnip",
                            ]
                        }
...

Debug Output

Expected Behavior

Order should be retained. At the very least, I wouldn’t mind if the reverse alphabetical order were kept.

Actual Behavior

Random order

Steps to Reproduce

  1. Create a few roles (or reuse them)
  2. Set the local.secrets_roles with actual roles
  3. terraform apply
  4. Change local.secrets_roles by adding another existing role
  5. terraform plan and you will see the principals in a random order

Important Factoids

I edited a policy in the AWS console, changed the order of the AWS principals to alphabetical, saved it, and still saw a random order. Perhaps it is the AWS API that is reordering them?
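
For anyone who wants to test that hypothesis directly, here is a minimal sketch in Go (not from this issue; the key ID is a placeholder to replace with your own) that reads the key policy twice in a row with the AWS SDK. If the two reads differ only in the order of the principal arrays, the reordering is happening on the AWS side rather than in Terraform.

package main

import (
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/kms"
)

func main() {
    sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
    svc := kms.New(sess)

    input := &kms.GetKeyPolicyInput{
        KeyId:      aws.String("1234abcd-12ab-34cd-56ef-1234567890ab"), // placeholder: use your own key ID
        PolicyName: aws.String("default"),                              // KMS keys only have a "default" policy
    }

    for i := 0; i < 2; i++ {
        out, err := svc.GetKeyPolicy(input)
        if err != nil {
            log.Fatal(err)
        }
        // If these two documents differ only in array order, AWS itself is
        // returning the principals in a non-deterministic order.
        fmt.Printf("--- read %d ---\n%s\n", i+1, aws.StringValue(out.Policy))
    }
}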

References

Workaround

After upgrading to Terraform 0.14.x, my most recent workaround is to supply an output

output "key_policy_json" {
  value = data.aws_iam_policy_document.secrets.json
}

Then, in terraform plan or terraform apply, all the unchanged elements are hidden and only the few that changed are shown.

              ~ {
                  ~ Principal = {
                      ~ AWS = [
                            # (250 unchanged elements hidden)
                            "arn:aws:iam::1234567890:role/csnip",
                          + "arn:aws:iam::1234567890:role/default",
                            "arn:aws:iam::1234567890:role/esnip",
                            # (125 unchanged elements hidden)
                        ]
                    }
                    # (4 unchanged elements hidden)
                },

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 256
  • Comments: 45 (13 by maintainers)

Most upvoted comments

Hi all 👋 Just letting you know that this issue is featured on this quarter’s roadmap. We will be looking for a pattern to implement that can help resolve both this and related issues.

If a PR exists to close the issue a maintainer will review and either make changes directly, or work with the original author to get the contribution merged. If you have written a PR to resolve the issue please ensure the “Allow edits from maintainers” box is checked. Thanks for your patience and we are looking forward to getting this merged soon!

This is the best write-up I have seen for what I think is going on here: https://github.com/hashicorp/terraform-plugin-framework/issues/70. The suppressEquivalentAwsPolicyDiffs DiffSuppressFunc correctly reports that the two policy documents are equivalent, but the attribute value set during resource Read has its elements in a different order, so the Terraform CLI reports a change.
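
For context, that DiffSuppressFunc is roughly shaped like the sketch below (a simplified illustration, assuming the terraform-plugin-sdk v2 import path and the jen20 awspolicyequivalence library; the provider’s real implementation may differ in detail), which is why the plan itself shows no change even though the value written to state carries a different element order.

package example

import (
    "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

    awspolicy "github.com/jen20/awspolicyequivalence"
)

// Simplified sketch of a policy DiffSuppressFunc: two documents that differ
// only in array ordering are reported as equivalent, so the planned change
// is suppressed.
func suppressEquivalentPolicyDiffs(k, old, new string, d *schema.ResourceData) bool {
    equivalent, err := awspolicy.PoliciesAreEquivalent(old, new)
    if err != nil {
        return false // if the documents cannot be compared, do not suppress the diff
    }
    return equivalent
}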

Note to (future) self - Related:

A solution is to add a helper like

// SetPolicyIfNotEquivalent writes the new policy into state only when it is
// not semantically equivalent to the value already there, so ordering-only
// differences coming back from the AWS API never reach the diff.
func SetPolicyIfNotEquivalent(d *schema.ResourceData, key, new string) error {
    old := d.Get(key).(string)

    equivalent, err := awspolicy.PoliciesAreEquivalent(old, new)

    if err != nil {
        return err
    }

    if !equivalent {
        return d.Set(key, new)
    }

    return nil
}

so that the new policy value is only set if it’s not equivalent to the prior value.
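
For illustration, a hedged sketch of how such a helper might be wired into a resource’s Read function (the resource name and the hard-coded policy are hypothetical placeholders, not the provider’s actual code; it assumes the helper and imports shown above):

func resourceExampleRead(d *schema.ResourceData, meta interface{}) error {
    // ... call the AWS API here and obtain the remote policy document ...
    remotePolicy := `{"Version":"2012-10-17","Statement":[]}` // placeholder value

    // Only overwrite the value in state when the remote document is not
    // semantically equivalent, so ordering-only differences returned by AWS
    // never surface as a diff in the Terraform CLI.
    if err := SetPolicyIfNotEquivalent(d, "policy", remotePolicy); err != nil {
        return err
    }

    return nil
}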

I’m experiencing a similar problem with Terraform 0.15.4.

Terraform 0.15.4 now reports changes made outside of Terraform as part of the plan result. Some list ordering is not retained, so Terraform always reports that the list order has changed, as shown below.

This problem has no real impact, but the plan results are unnecessarily long, and visibility is degraded.

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply":

  # aws_iam_role.example has been changed
  ~ resource "aws_iam_role" "example" {
      ~ assume_role_policy    = jsonencode(
          ~ {
              ~ Statement = [
                  ~ {
                      ~ Principal = {
                          ~ AWS = [
                              - "arn:aws:iam::XXXXXXXXXXXX:user/aaaaa",
                              + "arn:aws:iam::XXXXXXXXXXXX:user/ccccc",
                                "arn:aws:iam::XXXXXXXXXXXX:user/bbbbb",
                              + "arn:aws:iam::XXXXXXXXXXXX:user/aaaaa",
                                "arn:aws:iam::XXXXXXXXXXXX:user/ddddd",
                              - "arn:aws:iam::XXXXXXXXXXXX:user/ccccc",
                            ]
                        }
                        # (3 unchanged elements hidden)
                    },
                ]
                # (1 unchanged element hidden)
            }
        )
        id                    = "example"
        name                  = "example"
        tags                  = {}
        # (8 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Even though the API returns it in a random order, couldn’t the apply step save the policy in alphabetical order in the tfstate, and then have the plan retrieve it from the AWS API and alphabetize it again before diffing, thereby avoiding the reordering shown in the current diff?

This issue affects 1.0 as well.

This issue does not appear to be limited to aws_kms_key; it may apply to assume_role_policy attributes in general. As of Terraform 0.15.4 this also occurs for the assume_role_policy attribute of aws_iam_role resources if there is more than one federated principal defined. Running terraform plan over and over on an unchanging AWS resource randomly calculates different diffs in the “Objects have changed outside of Terraform” section.

  # aws_iam_role.foo has been changed
  ~ resource "aws_iam_role" "foo" {
      ~ assume_role_policy    = jsonencode(
          ~ {
              ~ Statement = [
                  ~ {
                      ~ Principal = {
                          ~ Federated = [
                              - "arn:aws:iam::ACCOUNTID:saml-provider/saml-provider-a",
                                "arn:aws:iam::ACCOUNTID:saml-provider/saml-provider-b",
                              + "arn:aws:iam::ACCOUNTID:saml-provider/saml-provider-a",
                            ]
                        }
                        # (3 unchanged elements hidden)
                    },
                ]
                # (1 unchanged element hidden)
            }
        )
        id                    = "foo"
        name                  = "foo"
        # (7 unchanged attributes hidden)
    }

I think all that’s needed is a smarter diff function for policy attributes in various AWS resources: first normalise both the local and the remote policy documents by sorting principals and actions and by wrapping singular items in lists, and only then compare the two normalised policies structurally. That would eliminate almost all of the false drifts that we are seeing on a daily basis. I am currently upgrading to Terraform 1.0, and this non-deterministic ordering results in drifts being reported all the time, even after an explicit terraform refresh. This is going to be an even bigger source of noise from now on.
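
As a rough illustration of that normalise-then-compare idea, here is a sketch in Go (my own example, limited to sorting the Principal blocks; a real implementation would also need to cover Action, Resource, Condition and so on, much as the awspolicyequivalence library already does):

package main

import (
    "encoding/json"
    "fmt"
    "sort"
)

type statement struct {
    Sid       string                 `json:"Sid,omitempty"`
    Effect    string                 `json:"Effect"`
    Action    interface{}            `json:"Action,omitempty"`
    Resource  interface{}            `json:"Resource,omitempty"`
    Principal map[string]interface{} `json:"Principal,omitempty"`
}

type policyDoc struct {
    Version   string      `json:"Version"`
    Id        string      `json:"Id,omitempty"`
    Statement []statement `json:"Statement"`
}

// asSortedList wraps a lone string in a list and sorts string lists, so
// "arn:...:role/x" and ["arn:...:role/x"] normalise to the same value.
func asSortedList(v interface{}) []string {
    var out []string
    switch t := v.(type) {
    case string:
        out = []string{t}
    case []interface{}:
        for _, e := range t {
            if s, ok := e.(string); ok {
                out = append(out, s)
            }
        }
    }
    sort.Strings(out)
    return out
}

// normalize sorts every principal list in the document, so two documents that
// differ only in principal order marshal to identical JSON strings.
func normalize(doc string) (string, error) {
    var p policyDoc
    if err := json.Unmarshal([]byte(doc), &p); err != nil {
        return "", err
    }
    for i := range p.Statement {
        for k, v := range p.Statement[i].Principal {
            p.Statement[i].Principal[k] = asSortedList(v)
        }
    }
    b, err := json.Marshal(p)
    return string(b), err
}

func main() {
    a := `{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["arn:aws:iam::1234567890:role/wsnip","arn:aws:iam::1234567890:role/asnip"]},"Action":"kms:Decrypt","Resource":"*"}]}`
    b := `{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["arn:aws:iam::1234567890:role/asnip","arn:aws:iam::1234567890:role/wsnip"]},"Action":"kms:Decrypt","Resource":"*"}]}`
    na, _ := normalize(a)
    nb, _ := normalize(b)
    fmt.Println(na == nb) // prints true: the principal order no longer matters
}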

Just another “me, too”. This is technically AWS’s fault if it’s reordering lists inside JSON; that’s supposed to be a no-no. But since these particular lists are semantically treated as sets, Terraform should probably special-case the comparison so that it doesn’t care about order.

We noticed a similar issue as well.

It still affects 1.0.1 as well. For me, all resources (usually policies) that contain any ARN list like the example below are affected:

...
        Principal = {
            AWS = [
                "arn:aws:iam::123456789012:user/user-0",
                "arn:aws:iam::123456789012:user/user-1",
            ]
        }
...

It still occurs on every plan for a random selection of resources even after applying changes, but I’ll be switching to one of the newly reported defects that covers my specific remaining symptoms, as suggested by YakDriver.

Thanks all for the great work on the larger issue group, it’s clearly been a lot of work!

This functionality has been released in v3.70.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

Ordering is easy but…

ā€œOrderā€ in the context of this issue has two caveats:

  1. We’re talking about the order of a JSON array within JSON data. With aws_iam_policy_document, we have some say in the order because we’re generating the JSON. But, with the plain policy argument set to jsonencode() or a heredoc (🙀), we don’t do anything with the order. That’s not to say we couldn’t, but so far we haven’t crossed that bridge. Hopefully, we won’t have to.
  2. AWS is the data store for the JSON data and does not preserve order. We can preserve order in state but as far as how things show up in the AWS console, we have no control. Even after an import, currently we would simply use the order as provided by AWS.

What are we solving?

I see there being two problems:

  1. avoiding perpetual diffs
  2. providing a minimal diff in the case of an actual out-of-band drift detection

Our approach will be to fix # 1 and then see where we’re at with # 2. Currently the two problems are intertwined in the issues to the point that they are convoluting each other. Once pain points are alleviated (hopefully) with #21968, we’ll need your input to determine what problems still remain.

Wouldn’t automagically sorting the lists in the AWS API response AND the expected principals before comparing solve this? This should be easy 😉

Confirming it works like that for ANY policy. The suggestion about sorting principals before storing and comparing should do the trick here.

@nitrocode can you change the title so it’s not misleading?

May not even be limited to assume_role_policy. I’m also seeing this on the policy attribute of an aws_s3_bucket. (I’m also using the jsonencode approach.)

I’m experiencing a similar problem with Terraform 0.15.4.

Same.

Using Terraform 0.15.4 and applying a directory with 30+ ECR repository resources with policies. Those policies have their Principals section randomly sorted by AWS IAM, but neither lifecycle ignore_changes nor terraform refresh gets rid of the changes on subsequent applies, which makes review somewhat difficult since the output stream is very verbose and the actual changes are harder to spot.

After looking into this issue, the jen20 awspolicyequivalence library, and https://github.com/hashicorp/terraform-provider-aws/blob/main/aws/resource_aws_kms_key.go#L205, I found that nothing in Terraform was causing the policy reordering, and the Terraform outputs looked as I’d expect.

As it turns out, KMS just saves the policy in a random order. This can be replicated with a plain old aws_kms_key (read: not related to the loops/lists as above) by passing in a policy and later appending a new principal.

Can also replicate this in the AWS console:

  • Create a key with a policy
  • Edit the policy
  • Save the policy
  • The ordering has randomly/non-deterministically changed.

@anthonyAdhese We hoped to alleviate the majority of the pain points with this push. I ask that you give 3.70.0 a try when it is available. If you’re still seeing issues, please open a new issue! We will need a complete config to be able to reproduce the issue.

This issue is still present on 1.0.9

Seeing this on version 1.0.5 with aws_iam_role in Principal > Service.

I don’t think this is an issue with the version of Terraform, but with the version of the AWS provider itself, as this is wholly within aws_iam_policy_document. Until this is fixed in the provider, upgrading Terraform will not help, right?

Same issue on aws_ecr_repository_policy.

We skipped 0.15 and went to 1.0, but I can confirm that on 0.14 I did not see this issue, and after upgrading to 1.0 I now see it in a number of places. Sorting the lists in the Terraform code does not help.

Same issue here. This makes the idea of checking the plan nearly impossible without some programmatic filtering on the client side, as this bug generates thousands of lines of redundant plan output (in our case) and everything else gets lost in the noise.

Terraform 0.15.5

Same issue here with 0.14.11 and aws_codeartifact_domain_permissions_policy.

The form of the defect described here https://github.com/hashicorp/terraform-provider-aws/issues/11801#issuecomment-848984552 is still occurring.

$ terraform --version
Terraform v1.1.1
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v3.70.0

Hi, any updates here?

How AWS stores resources/objects in their backend is quite different from what they present to users, and for many years I’ve suspected that, where appropriate, they either store some data in unordered sets or have code that somewhere along the line treats some data as unordered lists. Unfortunately JSON does not support unordered sets, so what is semantically an unordered set is instead returned to us as a reordered list. This phenomenon is not unique to AWS, and can sometimes be seen when diffing JSON-formatted Terraform statefiles.

Instructing Terraform to ignore reordering is one solution; however, IMO it’s potentially worth natively supporting the concept of unordered sets. As for identifying them, perhaps we can utilize the results of existing unit tests?

This issue is still present on 1.0.7

As @chroju points out, the issue is just state change detection and not the execution plan

Unless you have made equivalent changes … the following plan may include actions to undo or respond to these changes.

So you often see

Plan: 0 to add, 0 to change, 0 to destroy.

It feels like the distinction between an ordered list and an unordered set could be a first-class consideration in both, to handle all these cases where dubious APIs from AWS and others return things in a random order.

It might help if a core HashiCorp person could explain the rationale for why this only happens in the plan phase, as it seems intentional.

Mitigations could include

  • a configuration option to hide no-op changes in state change detection
  • a mechanism for providers to declare whether an array is an ordered list or an unordered set, to opt into this behaviour (see the schema sketch below)
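
On that second point, here is a minimal sketch (hypothetical attribute names, assuming the terraform-plugin-sdk v2) of how a provider schema can already distinguish an ordered list from an unordered set today; the catch is that a policy held as a single JSON string attribute is opaque to this mechanism, which is part of why ordering inside the document is so hard to control:

package example

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

var exampleResource = &schema.Resource{
    Schema: map[string]*schema.Schema{
        // Order is significant and preserved in state; API reordering shows up as a diff.
        "ordered_things": {
            Type:     schema.TypeList,
            Optional: true,
            Elem:     &schema.Schema{Type: schema.TypeString},
        },
        // Order is ignored; elements are compared as a set, so API reordering produces no diff.
        "unordered_things": {
            Type:     schema.TypeSet,
            Optional: true,
            Elem:     &schema.Schema{Type: schema.TypeString},
        },
    },
}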

Same problem here with an S3 Bucket Policy. Terraform 0.15.5, hashicorp/aws v3.44.0.

Even the AWS UI does this… If you refresh the KMS key view page a couple of times, you will notice the principals in the policy are returned in random orders.

Can confirm this same issue exists in 1.0.2 as well. Ordering is not respected and subsequent runs will throw warnings for differing “changes” each time, even though no actual changes have been made.

+1 for this, seeing it in both KMS resource-based policies and IAM resources with Terraform 0.15.4.