cert-manager: DNS01 Challenge via AWS Route53 on AWS EKS 1.15 not working

Describe the bug: Let’s Encrypt DNS01 validation via Route53 fails with AccessDenied when cert-manager runs on EKS 1.15 with IRSA:

E0708 12:18:28.403757       1 controller.go:143] cert-manager/controller/challenges "msg"="re-queuing item  due to error processing" "error"="error instantiating route53 challenge solver: unable to assume role: AccessDenied: User: arn:aws:sts::XXXXXXXXXXXX:assumed-role/ajetstack-cert-manager/1594210708202645062 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XXXXXXXXXXXX:role/jetstack-cert-manager\n\tstatus code: 403, request id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" "key"="example/example-internal.our-domain.com-4004189744-1292225756-1302369230" 

Expected behaviour: Issued LE cert via DNS01 validation

Steps to reproduce the bug: Create the following ClusterIssuer and Ingress objects. ClusterIssuer object:

---
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: r53-letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    email: DevOpsSupport@our-domain.com
    privateKeySecretRef:
      name: r53-letsencrypt-prod
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - selector:
        dnsZones:
          - "our-domain.com"
      dns01:
        route53:
          region: eu-central-1
          role: arn:aws:iam::XXXXXXXXXXXX:role/jetstack-cert-manager

Ingress object:

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/acme-challenge-type: dns01
    cert-manager.io/cluster-issuer: r53-letsencrypt-prod
    kubernetes.io/ingress.class: nginx-ingress-internal
    nginx.ingress.kubernetes.io/proxy-body-size: 10m
  labels:
    app.kubernetes.io/component: jenkins-master
    app.kubernetes.io/instance: jenkins
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: jenkins
    helm.sh/chart: jenkins-1.3.6
  name: jenkins-internal
  namespace: jenkins
spec:
  rules:
  - host: jenkins.our-domain.com
    http:
      paths:
      - backend:
          serviceName: jenkins
          servicePort: 8080
  tls:
  - hosts:
    - jenkins.our-domain.com
    secretName: jenkins.our-domain.com

Anything else we need to know?: IRSA is working properly (tested). The same role works fine for Route53 from the CLI.

IAM policy (very broad):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "route53:ListResourceRecordSets",
                "route53:ChangeResourceRecordSets"
            ],
            "Resource": "arn:aws:route53:::hostedzone/*"
        },
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": "route53:GetChange",
            "Resource": "arn:aws:route53:::change/*"
        },
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": "route53:ListHostedZones*",
            "Resource": "*"
        }
    ]
}

EDIT: Trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::XXXXXXXXXXXX:oidc-provider/oidc.eks.eu-central-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-central-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:sub": "system:serviceaccount:cert-manager:cert-manager"
        }
      }
    }
  ]
}

The trust relationship also works as expected (tested with a pod).

SA object:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXXXXXXXXX:role/jetstack-cert-manager
  creationTimestamp: "2020-07-08T12:12:55Z"
  labels:
    app: cert-manager
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: cert-manager
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: cert-manager
    helm.sh/chart: cert-manager-v0.15.2
  name: cert-manager
  namespace: cert-manager
  resourceVersion: "46367502"
  selfLink: /api/v1/namespaces/cert-manager/serviceaccounts/cert-manager
  uid: 45d3a11b-feaa-4b7a-a91e-146454ad770a
secrets:
- name: cert-manager-token-nfqpv

Environment details:

  • Kubernetes version: v1.15.11
  • Cloud-provider/provisioner: AWS EKS
  • cert-manager images: deployed from quay.io official ones
  • cert-manager version: v0.15.2 (did not work on 12.x, 13.x, 14.x either)
  • Install method: helm
  • LBL: ELB Internal

/kind bug

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 21 (3 by maintainers)

Most upvoted comments

I was running into the same issue. I fixed it by adding this to my values.yaml:

extraArgs:
  - --issuer-ambient-credentials

It appears that, by default, namespaced Issuers are not allowed to use “ambient” credentials; this flag lets them.

I got it working. I’ve described my installation procedure below. I hope it helps solve your issues too, @johndietz and @JanisOrlovs.

CertManager

This section describes how to install and set up cert-manager on an AWS EKS cluster. It will integrate with Route53 DNS for issuing DNS-based certificate challenges.

IAM Open ID Connect provider for the cluster

This is required for associating k8s Service Accounts with AWS IAM roles and policies. This is considered more secure than allowing the worker nodes to access AWS services (e.g. S3 Buckets, Route53 DNS, etc.).

eksctl utils associate-iam-oidc-provider --cluster name-of-the-cluster

More info about the command above can be found here.

AWS IAM Policy

Create an AWS IAM policy named AllowExternalDNSUpdates and attach the following permissions to it:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "route53:ChangeResourceRecordSets",
            "Resource": "arn:aws:route53:::hostedzone/*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "route53:GetChange",
                "route53:ListHostedZones",
                "route53:ListResourceRecordSets",
                "route53:ListHostedZonesByName"
            ],
            "Resource": "*"
        }
    ]
}

A narrower policy would probably suffice, but the above works well if you want to use the same policy with e.g. external-dns.

Check the cert-manager docs if you wish to limit the policy to specific Route53 resources.
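
As a minimal sketch of such a narrower policy, the function below renders a variant that limits record changes to a single hosted zone. The function name render_route53_policy, the zone ID Z0123456789EXAMPLE and the output filename are placeholders for illustration and do not come from this thread; the split of actions across resources mirrors the policy in the original report.

```shell
#!/bin/sh
# render_route53_policy ZONE_ID
# Prints an IAM policy that limits record changes to one hosted zone.
# ZONE_ID is a placeholder; substitute your own Route53 hosted zone ID.
render_route53_policy() {
  zone_id=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets",
        "route53:ListResourceRecordSets"
      ],
      "Resource": "arn:aws:route53:::hostedzone/${zone_id}"
    },
    {
      "Effect": "Allow",
      "Action": "route53:GetChange",
      "Resource": "arn:aws:route53:::change/*"
    },
    {
      "Effect": "Allow",
      "Action": "route53:ListHostedZonesByName",
      "Resource": "*"
    }
  ]
}
EOF
}

# Hypothetical zone ID, for illustration only.
render_route53_policy "Z0123456789EXAMPLE" > route53-policy.json
```

The resulting file could then be passed to aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://route53-policy.json. Note that this variant is meant for cert-manager only and may be too narrow for external-dns.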

Create the AWS IAM Role

Create an IAM role that is allowed to perform DNS updates (Route53). This is required in order to use DNS-based certificate challenges.

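# trust-relationship.json is a hypothetical filename holding the trust policy shown under "Trust relationship" below
aws iam create-role --role-name cert-manager --assume-role-policy-document file://trust-relationship.json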

Attach the AllowExternalDNSUpdates policy to the role

This assumes that a policy named AllowExternalDNSUpdates already exists at arn:aws:iam::XXXXXXXXXXXX:policy/AllowExternalDNSUpdates (created in an earlier step)

aws iam attach-role-policy --policy-arn arn:aws:iam::XXXXXXXXXXXX:policy/AllowExternalDNSUpdates --role-name cert-manager

Trust relationship

Assign a trust relationship to the IAM role, for example in the AWS web console, so that Kubernetes service accounts can be mapped to AWS IAM roles. It should look something like this, with the values in <value> replaced accordingly:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<aws-account-id>:oidc-provider/oidc.eks.eu-north-1.amazonaws.com/id/<eks-hash>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-north-1.amazonaws.com/id/<eks-hash>:sub": "system:serviceaccount:cert-manager:cert-manager"
        }
      }
    }
  ]
}

You can find the value at Principal.Federated at https://console.aws.amazon.com/iam/home?region=eu-north-1#/providers (replace eu-north-1 with your region). You can extract the <aws-account-id> and <eks-hash> from that string too.
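
The same values can probably be derived on the command line instead of being copied from the console. The sketch below assumes you already have the cluster's OIDC issuer URL (the comment shows one way to query it with the AWS CLI) and renders the trust relationship from it. The function name render_trust_policy, the account ID 111122223333 and the issuer hash EXAMPLEHASH are placeholders, not values from this thread.

```shell
#!/bin/sh
# render_trust_policy ACCOUNT_ID ISSUER_URL [NAMESPACE] [SERVICE_ACCOUNT]
# Prints an IRSA trust policy. ISSUER_URL is the cluster's OIDC issuer, e.g. from:
#   aws eks describe-cluster --name name-of-the-cluster \
#     --query "cluster.identity.oidc.issuer" --output text
render_trust_policy() {
  account_id=$1
  issuer=${2#https://}   # the provider ARN and condition key omit the scheme
  namespace=${3:-cert-manager}
  service_account=${4:-cert-manager}
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${issuer}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${issuer}:sub": "system:serviceaccount:${namespace}:${service_account}"
        }
      }
    }
  ]
}
EOF
}

# Placeholder account ID and issuer URL, for illustration only.
render_trust_policy 111122223333 \
  "https://oidc.eks.eu-north-1.amazonaws.com/id/EXAMPLEHASH"
```

Redirecting the output to a file (say trust-relationship.json, a hypothetical name) would let you apply it with aws iam update-assume-role-policy --role-name cert-manager --policy-document file://trust-relationship.json instead of editing it in the console.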

Cert manager custom values

Create a file called cert-manager.values.yaml with the content below. This will cause the cert-manager ServiceAccount to be annotated with the newly created cert-manager AWS IAM role.

securityContext:
  enabled: "true"
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<aws-account-id>:role/cert-manager

Note the securityContext part which is a fix if you experience the issue described in issue #2147.

Install CertManager from Helm template

Start by checking that you have the correct <aws-account-id> in cert-manager.values.yaml. Add the jetstack/cert-manager helm repo first if it’s not present, then run the command below:

helm template cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.15.2 \
  --set installCRDs=true \
  -f cert-manager.values.yaml | kubectl apply -f -

ClusterIssuer

Create a ClusterIssuer which can be used to issue Let’s Encrypt certificates in all namespaces.

Save the following YAML into a file called letsencrypt.clusterissuer.yaml:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: your@email.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-pk
    # Use DNS as challenge solver
    solvers:
      - dns01:
          route53:
            region: eu-north-1

Apply the manifest to your cluster:

kubectl apply -f letsencrypt.clusterissuer.yaml

What does your ClusterIssuer look like @johndietz? I was seeing some AssumeRole errors until I removed the “role” entry from my dns01 solver.
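
This observation appears consistent with the error messages in this thread: the pod already runs as the role through the service account annotation, and with “role” set the solver additionally calls sts:AssumeRole on that role, which the trust relationships shown here do not allow (they only permit sts:AssumeRoleWithWebIdentity). As a rough diagnostic sketch, the helper below compares the calling role with the target role in such a message. The function name check_self_assume is made up for illustration, and the sample message is the one posted further down in this thread.

```shell
#!/bin/sh
# check_self_assume MESSAGE
# Extracts the calling role and the target role from an sts:AssumeRole
# AccessDenied message and reports whether they are the same role.
check_self_assume() {
  msg=$1
  # Role name between "assumed-role/" and the session name.
  caller=$(printf '%s\n' "$msg" | sed -n 's|.*assumed-role/\([^/]*\)/.*|\1|p')
  # Role name after "role/", up to the next space or backslash.
  target=$(printf '%s\n' "$msg" | sed -n 's|.*on resource: arn:aws:iam::[^:]*:role/\([^ \\]*\).*|\1|p')
  if [ -n "$caller" ] && [ "$caller" = "$target" ]; then
    echo "self-assume: $caller"
  else
    echo "cross-role: ${caller:-unknown} -> ${target:-unknown}"
  fi
}

check_self_assume "User: arn:aws:sts::XXXXXXXXXXX:assumed-role/cert-manager/XXXXXXXXXXXXXXXXXXX is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XXXXXXXXXXX:role/cert-manager"
```

If the helper reports a self-assume, dropping “role” from the route53 solver so that only the annotation supplies the credentials is probably the simplest fix.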

Update: I tried v0.15.2 with a ClusterIssuer and it is working with IRSA, whereas before I was using an Issuer with v1.0.3 and it was not. fsGroup: 1001 and runAsUser: 1001 are also set in the securityContext of the Helm chart.

It seems that something may have broken in between releases or something w.r.t IRSA was not accurately captured in the docs.


Follow up: v1.0.3 of the helm chart is working with a ClusterIssuer and IRSA, but the same setup with an Issuer does not work.

I’ve got the same issue as @metral with v1.0.3 of the chart: it works fine with a ClusterIssuer, but I get an error when using an Issuer.

Hi,

I am seeing something similar.

Error when describing challenge resources

Warning  PresentError  9s (x15 over 17s)  cert-manager  (combined from similar events): Error presenting challenge: error instantiating route53 challenge solver: unable to assume role: AccessDenied: User: arn:aws:sts::XXXXXXXXXXX:assumed-role/cert-manager/XXXXXXXXXXXXXXXXXXX is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XXXXXXXXXXX:role/cert-manager

ClusterIssuer

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: my@email.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-pk
    # Use DNS as challenge solver
    solvers:
      - selector:
          dnsZones:
            - "sub.domain.com"
            - "sub.sub.domain.com"
            - "sub.sub.domain.com"
        dns01:
          route53:
            region: eu-north-1
            role: arn:aws:iam::XXXXXXXXXXX:role/cert-manager

Trust Relationship

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::XXXXXXXXXXXX:oidc-provider/oidc.eks.eu-north-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-north-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:sub": "system:serviceaccount:cert-manager:cert-manager"
        }
      }
    }
  ]
}

Service Account

I’ve annotated the cert-manager service account as described here.

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXXXXXXXXX:role/cert-manager
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{"eks.amazonaws.com/role-arn":"arn:aws:iam::XXXXXXXXXXXX:role/cert-manager"},"labels":{"app":"cert-manager","app.kubernetes.io/component":"controller","app.kubernetes.io/instance":"cert-manager","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"cert-manager","helm.sh/chart":"cert-manager-v0.14.1"},"name":"cert-manager","namespace":"cert-manager"}}
  creationTimestamp: "2020-07-10T15:23:17Z"
  labels:
    app: cert-manager
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: cert-manager
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cert-manager
    helm.sh/chart: cert-manager-v0.14.1
  name: cert-manager
  namespace: cert-manager
  resourceVersion: "582016"
  selfLink: /api/v1/namespaces/cert-manager/serviceaccounts/cert-manager
  uid: c66be7a3-2267-4ca2-8b41-31e78bba856e
secrets:
- name: cert-manager-token-wk9tw

Security Context

I also tried updating the security context as described here.

spec:
    securityContext:
    enabled: true
    fsGroup: 1001
    containers: ...

Environment

  • Kubernetes version: v1.16
  • Istio version: 1.6.2
  • Cloud-provider/provisioner: AWS EKS
  • cert-manager version: v0.14.1
  • Install method: kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.1/cert-manager.yaml followed by annotating cert-manager service account.

Thanks for clarifying, @anarsen. It took a while to figure out how to force it to use the eks.amazonaws.com/role-arn: arn:aws:iam::<aws-account-id>:role/cert-manager from the annotation instead of an assumed role.