aws-cdk: (certificatemanager): DnsValidatedCertificate timeout while waiting for certificate approval
Describe the bug: Creating certificates via Certificate Manager with Route53 DNS validation fails with a timeout. Error message:
Failed to create resource. Resource is not in the state certificateValidated
Expected behavior: The Lambda waiting for the approval should probably wait longer than the currently hardcoded 5 minutes.
Version:
- OS: linux
- Programming Language: typescript
- CDK Version: 0.33.x
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 29
- Comments: 34 (7 by maintainers)
Commits related to this issue
- fix(certificatemanager): Increase wait time for DNS validation Allow the Lambda function to wait up to 9 minutes and 20 seconds before bailing out waiting for the domain to be validated. It used to b... — committed to aws/aws-cdk by RomainMuller 5 years ago
- fix(certificatemanager): Increase wait time for DNS validation (#2961) Allow the Lambda function to wait up to 9 minutes and 20 seconds before bailing out waiting for the domain to be validated. It ... — committed to aws/aws-cdk by RomainMuller 5 years ago
- feat(certificatemanager): deprecate DnsValidatedCertificate (#21982) Now that the official CloudFormation resource `AWS::CertificateManager::Certificate` (CDK's `Certificate` construct) supports DNS ... — committed to aws/aws-cdk by corymhall a year ago
I had this same issue happen, and it turned out that my domain had a different set of name servers than the created hosted zone.
To fix it manually: You can update the name servers for a domain to match the hosted zone in the top right of the domain information on the R53 console (on the left menu click on “registered domains” then click on your domain in the list).
AWS docs for updating name servers here: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/domain-name-servers-glue-records.html
As for the CDK, the HostedZone construct should probably be updated to use the name servers that the domain is configured for so that multiple hosted zones can be created for the same domain.
It is also worth noting that I had transferred the domain from a different AWS account, and had no existing hosted zones. Not sure how the existing implementation determines what name servers to use for a hosted zone, but maybe this is why it is failing to use the correct ones?
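The mismatch described above can be checked before deploying by comparing the two name server lists. The following is a minimal sketch, not part of the CDK: the function name `sameNameServers` and the example host names are hypothetical, and the lists are assumed to have been copied from the Route53 console (or obtained some other way, for example Node's `dns.promises.resolveNs`).

```typescript
// Normalize name server host names so that "NS-1.Example.org." and
// "ns-1.example.org" compare as equal, then sort for order-independent comparison.
function normalizeNameServers(list: string[]): string[] {
  return list
    .map((host) => host.trim().toLowerCase().replace(/\.$/, ""))
    .sort();
}

// True when both lists contain the same name servers, in any order.
function sameNameServers(a: string[], b: string[]): boolean {
  const left = normalizeNameServers(a);
  const right = normalizeNameServers(b);
  return left.length === right.length && left.every((ns, i) => ns === right[i]);
}

// Hypothetical values: what the registered domain points at ...
const registeredDomainNs = ["ns-1.awsdns-01.org.", "ns-2.awsdns-02.com."];
// ... versus the NS record of the newly created hosted zone.
const hostedZoneNs = ["ns-3.awsdns-03.net", "ns-4.awsdns-04.co.uk"];

// false here: DNS validation records written to this zone would not be visible publicly.
const delegationOk = sameNameServers(registeredDomainNs, hostedZoneNs);
console.log(delegationOk ? "name servers match" : "name servers differ");
```

If the lists differ, the validation CNAME is written to a zone that public DNS does not consult, which matches the symptom reported in this comment.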
Still a problem: requested at 2020-01-16T10:33:04 UTC, issued at 2020-01-16T10:46:21 UTC. Can the delay duration be a variable so we can specify a value?
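As a worked check of those timestamps, the sketch below computes how long validation took and compares it with the 9 minute 20 second wait mentioned in the related commit. It assumes that limit was the one in effect for this deployment, which the comment does not state.

```typescript
// Timestamps quoted in the comment above (UTC).
const requestedAt = new Date("2020-01-16T10:33:04Z");
const issuedAt = new Date("2020-01-16T10:46:21Z");

// Elapsed time between two instants, in whole seconds.
function elapsedSeconds(start: Date, end: Date): number {
  return Math.round((end.getTime() - start.getTime()) / 1000);
}

// 13 minutes 17 seconds between request and issuance.
const validationSeconds = elapsedSeconds(requestedAt, issuedAt);

// Assumed limit: the 9 minutes 20 seconds from the related commit.
const waitLimitSeconds = 9 * 60 + 20;

// How far validation overshot the assumed limit.
const exceededBySeconds = validationSeconds - waitLimitSeconds;

console.log(
  `validation took ${validationSeconds}s, limit ${waitLimitSeconds}s, over by ${exceededBySeconds}s`
);
```

Under that assumption, validation finished roughly four minutes after the waiter would have given up.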
I’m hitting this again as well, and interestingly it seems to depend on the region. We use eu-central-1 for everything besides the Cognito certificates (they must be issued in us-east-1 for custom domains). In eu-central-1 the approval goes through in seconds or minutes; in us-east-1 it takes hours. I don’t know what follow-up problems it could cause, but what if we add an option to skip waiting for the certificate to reach the issued status? Is this possible at all?
Lately, certificate validation often takes more than 10 minutes. In the worst case it took about 42 minutes, as far as I tested. It would be better if the waiter params could be specified in `DnsValidatedCertificateProps`.
Here is a potential solution: Make sure that your Hosted Zone (the one you are writing the CNAME record set to) is registered. Meaning: when you look up the “zone name” of the Hosted Zone (e.g. vincent.subdomain.domain.co.za) in NSLookup, it should return the 4 name servers. If it does not, then you cannot validate a certificate with that domain name (hosted zone).
I’m sorry but I believe this can only be properly fixed by Amazon internal team.
The problem is that DnsValidatedCertificate works by creating a custom resource with lambda that adds those records and then waits for validation. But since this is a lambda, there is a max run time of 15 minutes. Yet based on comments above, validating certificates may take hours on us-east-1. I’ve been currently waiting on validation for 49 minutes and it’s still not validated.
As to why we have to use the DnsValidatedCertificate: We are a team in Europe, with our main region being Ireland: eu-west-1. There are many certificates that require certs placed in N. Virginia: us-east-1. That rules out the regular acm.Certificate class because that class will only deploy to the main region.
We also don’t want a separate stack that deploys into us-east-1 because then you cannot export certificate ARN and import it into another stack. Fn::importValue only works within the same region.
Workarounds: The only workaround right now is to deploy it in a separate stack into us-east-1, then have a second stack that exports certificate values which are hard-coded as strings (manual step) and then have a third stack which actually uses those values.
One other workaround is to retry stack deployment early in the morning when it seems to get validated in time - but that is highly unreliable.
Solutions: Well ideally you could internally push for making certificate validations faster in that region and guarantee validations under 15 minutes. Or implement an API to do cross-region certificate creations, so CloudFormation would support this scenario natively (without the lambda). Or don’t force us to deploy certificates to a specific region (us-east-1), then we could all happily use the acm.Certificate class.
I’ve never really used CustomResource, so don’t know much about that. But is there a way to run something else than a lambda that might run for longer?
If you can’t do any of that, you could at least make the stack deployments idempotent. The problem is that the custom resource Lambda fails and triggers a rollback, which orphans the certificate, and a new re-deployment doesn’t use the original cert that might already be validated. There would be no problem if I could: deploy a stack, wait for it to fail due to the Lambda timeout, wait until the certificate is validated, re-deploy, and have it pick up the original certificate and complete successfully.
Does it really need to fail and trigger rollback? How come the main acm.Certificate within one region works?
At the very least this issue should be documented on the cdk page for DnsValidatedCertificate construct.
What is the solution for this issue now? The ACM timeout causes a rollback of the whole CDK stack that I’m deploying, and I need to add a cert to the CloudFrontConfig.
I don’t understand how increasing the wait time to 9 minutes was a valid solution. That does not solve the problem at all.
@BillyBunn, might be a long shot, but I switched to Certificate and my deploy started hanging as well. I never let it time out but I noticed in my gmail spam folder I had a bunch of emails from AWS re: Certificate Approval with a link that I had to click to approve the certificate. I marked them not as spam and tried again; clicking the approve link seemed to do the trick.
I switched back to the DNS validated cert afterward, and that one seems to work if I wait for the hostedZone to get created, then use its name servers to update the `name servers` section under registered domains via the UI. The deploy hangs while I do that but then seems to finish up.
@njlynch Unfortunately I’m experiencing the same timeout issue, even with the `Certificate` construct. I’ve tried using both. `DnsValidatedCertificate` timed out after a few minutes, and `Certificate` timed out after a few hours.
Also, both ways are unable to delete the failed stack because of DNS record sets created in the same deployment that pointed at a CloudFront alias (probably should be a separate issue).
Ran into this trying to deploy a static site (S3 bucket, CloudFront distribution, Route53 hosted zone, ACM certificate) with a domain already registered with Route53. I have also noticed what @acdoussan mentioned: the name servers for the registered domain do not match the hosted zone NS records made by `PublicHostedZone`.
Anything obvious that is causing this? My code:
Edit: Can recreate with simply this
For those experiencing this issue:
Unless you absolutely need cross-region certificate issuance (e.g., requesting a us-east-1 certificate from another region for CloudFront), then converting to use the `Certificate` construct (as @AbendGithub notes above) is your best bet. The `Certificate` construct does not have the same time-out constraints as `DnsValidatedCertificate` and uses CloudFormation’s internal workflow system for provisioning and validating.
If you must use `DnsValidatedCertificate`, give yourself the best possible chance of success by creating and deploying your Route53 HostedZone first, validating the domain with tools like `dig`, `nslookup`, etc., and only then adding the certificate to the deployment. See https://docs.aws.amazon.com/acm/latest/userguide/troubleshooting-DNS-validation.html for a list of common DNS validation troubleshooting tips. In particular, if something like `% dig yourhostname.example.com` does not return the 4 name servers associated with your hosted zone prior to starting the deployment, your certificate will never validate.
For those running into this problem, use the `Certificate` construct instead. It allows you to achieve the very same thing without a time limit. Something like this:
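A minimal sketch of the `Certificate` approach, not the commenter's original code. It assumes the CDK v2 import paths (`aws-cdk-lib`), and that the hosted zone was deployed earlier and is imported by its attributes. The stack name, domain names and hosted zone ID are hypothetical placeholders.

```typescript
import { App, Stack, StackProps } from "aws-cdk-lib";
import * as acm from "aws-cdk-lib/aws-certificatemanager";
import * as route53 from "aws-cdk-lib/aws-route53";
import { Construct } from "constructs";

class SiteCertificateStack extends Stack {
  public readonly certificate: acm.Certificate;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Import a hosted zone that already exists and is correctly delegated.
    // Both values are placeholders; substitute your own zone.
    const zone = route53.HostedZone.fromHostedZoneAttributes(this, "Zone", {
      hostedZoneId: "Z0000000EXAMPLE",
      zoneName: "example.com",
    });

    // CloudFormation creates the validation record in the zone and waits
    // for issuance itself, so no custom-resource Lambda is involved.
    this.certificate = new acm.Certificate(this, "SiteCertificate", {
      domainName: "www.example.com",
      validation: acm.CertificateValidation.fromDns(zone),
    });
  }
}

const app = new App();
new SiteCertificateStack(app, "SiteCertificateStack");
app.synth();
```

The certificate is created in the stack's own region, so this sketch does not cover the cross-region case (for example a us-east-1 certificate for CloudFront) discussed earlier in the thread.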
@papiro This AWS page is how.
The answer you want is in there