terraform-provider-aws: aws_elasticsearch_domain fails on initial apply due to aws_cloudwatch_log_resource_policy

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v0.12.24
+ provider.aws v3.0.0
+ provider.external v1.2.0
+ provider.vault v2.12.2

Affected Resource(s)

  • aws_elasticsearch_domain
  • aws_cloudwatch_log_resource_policy

Terraform Configuration Files

data "aws_caller_identity" "current" {}

resource "aws_elasticsearch_domain" "es" {
  domain_name           = var.domain_name
  elasticsearch_version = var.elasticsearch_version

  advanced_options = var.advanced_options

  ebs_options {
    ebs_enabled = var.ebs_volume_size > 0
    volume_size = var.ebs_volume_size
    volume_type = var.ebs_volume_type
    iops        = var.ebs_volume_type == "IOPS" ? var.ebs_iops : null
  }

  encrypt_at_rest {
    enabled    = var.encrypt_at_rest_enabled
    kms_key_id = var.encrypt_at_rest_kms_key_id == "" ? module.kms.arn : var.encrypt_at_rest_kms_key_id
  }

  cluster_config {
    instance_count           = var.instance_count
    instance_type            = var.instance_type
    dedicated_master_enabled = var.dedicated_master_enabled
    dedicated_master_count   = var.dedicated_master_enabled ? var.dedicated_master_count : null
    dedicated_master_type    = var.dedicated_master_enabled ? var.dedicated_master_type : null
    zone_awareness_enabled   = var.zone_awareness_enabled

    zone_awareness_config {
      availability_zone_count = var.zone_awareness_enabled ? var.availability_zone_count : null
    }
  }

  node_to_node_encryption {
    enabled = var.node_to_node_encryption_enabled
  }

  vpc_options {
    security_group_ids = concat(var.security_group_ids, [aws_security_group.elasticsearch_sg.id])
    subnet_ids         = length(var.subnet_ids) > 1 ? slice(var.subnet_ids, 0, var.availability_zone_count) : var.subnet_ids
  }

  snapshot_options {
    automated_snapshot_start_hour = var.automated_snapshot_start_hour
  }

  domain_endpoint_options {
    enforce_https       = var.enforce_https
    tls_security_policy = var.tls_security_policy
  }

  dynamic "cognito_options" {
    for_each = var.cognito_options
    content {
      enabled          = cognito_options.value.enabled
      user_pool_id     = cognito_options.value.user_pool_id
      identity_pool_id = cognito_options.value.identity_pool_id
      role_arn         = cognito_options.value.role_arn
    }
  }

  dynamic "log_publishing_options" {
    for_each = { for k, v in var.log_publishing_options : k => v if lookup(v, "enabled", false) == true }
    content {
      enabled                  = log_publishing_options.value.enabled
      log_type                 = log_publishing_options.value.log_type
      cloudwatch_log_group_arn = aws_cloudwatch_log_group.es_logs[log_publishing_options.key].arn
    }
  }

  tags = merge(
    var.tags,
    {
      Name    = var.domain_name,
      service = var.service,
      team    = var.team,
      phi     = var.phi
    },
  )

  depends_on = [aws_iam_service_linked_role.es]
}

resource "aws_cloudwatch_log_resource_policy" "aes_cloudwatch_log_resource_policy" {
  count           = length({ for k, v in var.log_publishing_options : k => v if lookup(v, "enabled", false) == true }) > 0 ? 1 : 0
  policy_name     = "${title(replace(var.domain_name, "-", ""))}-CloudwatchResourcePolicy"
  policy_document = data.aws_iam_policy_document.cloudwatch.json
}

data "aws_iam_policy_document" "cloudwatch" {
  statement {
    actions = [
      "logs:PutLogEvents",
      "logs:PutLogEventsBatch",
      "logs:CreateLogStream",
    ]
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["es.amazonaws.com"]
    }
    resources = [
      # for k, v in aws_cloudwatch_log_group.es_logs : "${v.arn}:*"  # this never works on the first apply
      for k, v in var.log_publishing_options : "arn:aws:logs:us-east-1:${data.aws_caller_identity.current.account_id}:log-group:/aws/aes/${var.domain_name}/${k}:*" # this almost never works on the first apply, but appears to have worked once
      # "arn:aws:logs:us-east-1:${data.aws_caller_identity.current.account_id}:log-group:*"  # this works 100% of the time in my tests
    ]
  }
}

resource "aws_cloudwatch_log_group" "es_logs" {
  for_each          = { for k, v in var.log_publishing_options : k => v if lookup(v, "enabled", false) == true }
  name              = "/aws/aes/${var.domain_name}/${each.key}"
  retention_in_days = lookup(each.value, "retention_in_days", 14)

  tags = merge(
    var.tags,
    {
      Name    = "/aws/aes/${var.domain_name}/${each.key}"
      service = var.service,
      team    = var.team,
      phi     = var.phi
    },
  )
}

Debug Output

Expected Behavior

The module should run to completion on the first apply and create all of the resources, including the Elasticsearch domain.

Actual Behavior

On first apply, Terraform exits with error:

Error: Error creating ElasticSearch domain: ValidationException: The Resource Access Policy specified for the CloudWatch Logs log group /aws/aes/example-domain/search does not grant sufficient permissions for Amazon Elasticsearch Service to create a log stream. Please check the Resource Access Policy.

However, running terraform apply a second time completes without issue.

In addition, the first apply succeeds if the Resource Access Policy is broadened: pointing it at something like "arn:aws:logs:us-east-1:${data.aws_caller_identity.current.account_id}:log-group:*" makes it work, but it should be possible to lock the policy down more tightly than that.
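The symptom looks like a race: in the configuration above nothing makes the Elasticsearch domain depend on the aws_cloudwatch_log_resource_policy, so Terraform can create both in parallel and the domain's validation may run before the policy exists or has propagated. The following is an untested sketch of a possible workaround against the resource names above, adding an explicit dependency plus an arbitrary propagation delay via the hashicorp/time provider:

```hcl
# Sketch only: assumes the hashicorp/time provider is added to the module.
# Force the resource policy to exist, and give it time to propagate,
# before the domain is created.
resource "time_sleep" "wait_for_log_policy" {
  depends_on      = [aws_cloudwatch_log_resource_policy.aes_cloudwatch_log_resource_policy]
  create_duration = "30s" # arbitrary guess at propagation time, not a documented value
}

resource "aws_elasticsearch_domain" "es" {
  # ... configuration as above ...

  depends_on = [
    aws_iam_service_linked_role.es,
    time_sleep.wait_for_log_policy,
  ]
}
```

Whether the delay is needed at all, or only the ordering edge, would need to be confirmed; the success of the second apply is consistent with either explanation.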

Steps to Reproduce

  1. Run terraform apply.
  2. The ValidationException above occurs.
  3. Run terraform apply again; it runs to completion and creates functioning resources.

This makes it extremely difficult to run in CI.

Important Factoids

References

  • #6606 shows a workaround, but it requires opening the policy to every CloudWatch Logs log group in the account, instead of just the log groups created for this particular resource.
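A possible middle ground between the exact per-log-group ARNs (which the reporter saw fail on the first apply) and an account-wide wildcard would be to scope the policy to this module's log-group name prefix. This is an untested sketch using the naming scheme from the configuration above:

```hcl
# Sketch: scope the policy to the /aws/aes/<domain>/ prefix used by this
# module's log groups, rather than every log group in the account.
data "aws_iam_policy_document" "cloudwatch" {
  statement {
    actions = [
      "logs:PutLogEvents",
      "logs:PutLogEventsBatch",
      "logs:CreateLogStream",
    ]
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["es.amazonaws.com"]
    }
    resources = [
      "arn:aws:logs:us-east-1:${data.aws_caller_identity.current.account_id}:log-group:/aws/aes/${var.domain_name}/*",
    ]
  }
}
```

If the first-apply failure is purely an ordering/propagation race, these scoped ARNs may behave no better than the exact ones unless the domain also waits for the resource policy to be created.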

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Reactions: 13
  • Comments: 17

Most upvoted comments

Has anyone got the fix here? I am facing the same issue, and if I do "arn:aws:logs:*" it works, so I don't know what's happening here.

This is still happening, and using an "arn:aws:logs:*" wildcard seems to work alright, but I can't seem to find out the reason. I've tried different dependencies and local waits; nothing helps.

any updates regarding this issue?