aws-load-balancer-controller: [v2] Error when there are multiple certificates for a host
It is perfectly fine for there to be two certificates available for a particular host. I have this right now because I am mid-migration and have both a wildcard cert and a host-specific cert. In these cases, the controller should either add both certificates or add (in my opinion) the least specific certificate.
{"level":"error","ts":1604259893.3354607,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"dex","namespace":"kube-system","error":"multiple certificate found for host: dex.admin.rex.sh, certARNs: [arn:aws:acm:us-west-2:355508092300:certificate/c6d7d07b-0aec-49ae-aff6-3616afb4b0a7 arn:aws:acm:us-west-2:355508092300:certificate/eeba7334-1671-4ad9-bc2f-d458ef972fdb]"}
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 31
- Comments: 34 (1 by maintainers)
I would suggest that in this situation, the controller should pick a certificate in a deterministic way, but log a warning saying that there were other choices. That’s better than it just throwing its hands up and walking away.
ALB supports multiple certificates, so why doesn’t the controller just add them all?
/remove-lifecycle stale
This is an issue especially when you are using wildcard subdomains. You might want *.test.company.com but *.company.com already exists. There’s no reason you can’t add both.
So, for an org using wildcard subdomains, we have to manually specify the ARN for every ingress we provision?
edit: I found a workaround. The wildcard subdomain had a SAN that conflicted with the wildcard domain.
*.test.company.com with a SAN of test.company.com – conflicts with *.company.com
Recreated the ACM certificate of *.test.company.com with no SAN, and the ingress controller added both *.test.company.com and *.company.com. The routes used in this case are *.test.company.com and test.company.com.
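To make the SAN conflict above concrete, here is a minimal sketch of host-to-certificate matching. It assumes a leading `*` covers exactly one DNS label, and the helper names (`matches`, `certs_for_host`) and certificate names are hypothetical; it is not the controller's actual code.

```python
def matches(cert_domain: str, host: str) -> bool:
    """Return True if a certificate domain name covers the given host.

    Assumption: a leading "*" stands in for exactly one DNS label.
    """
    cert_labels = cert_domain.lower().split(".")
    host_labels = host.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def certs_for_host(certs: dict[str, list[str]], host: str) -> list[str]:
    """List (sorted) the names of all certificates with a matching domain."""
    return sorted(name for name, domains in certs.items()
                  if any(matches(d, host) for d in domains))


# Hypothetical certificate inventory mirroring the comment above.
before = {
    "wildcard-company": ["*.company.com"],
    "wildcard-test": ["*.test.company.com", "test.company.com"],  # with SAN
}
after = {
    "wildcard-company": ["*.company.com"],
    "wildcard-test": ["*.test.company.com"],  # SAN removed
}

print(certs_for_host(before, "test.company.com"))   # two matches: ambiguous
print(certs_for_host(after, "test.company.com"))    # one match
print(certs_for_host(after, "*.test.company.com"))  # one match
```

Under that assumption, the host `test.company.com` is covered both by `*.company.com` and by the SAN, which is the ambiguous case; dropping the SAN leaves exactly one candidate per host.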
Still, I think it would be better if there was some deterministic logic and any choice made, instead of throwing the “could not auto-load certificates from ACM: multiple certificate found for host” error and giving up.
AWS has an algorithm for choosing the best cert if multiple are provided. I see no reason for us to do the same logic on our part.
Their algorithm*
It can, it just does not. It has no criteria. Example criteria would be: Length of domain name, expiry date, etc. If it makes a choice based on some criteria, it may not be the exact criteria everyone who uses AWSv2 would choose. However, it would be a choice, that is predictable, that breaks things less than having AWSv2 give up.
I’m hindered by this issue.
I have two wildcards certs in ACM, one more specific than the other, and the controller complains that it found multiple valid certificates for a setting that accepts multiple certificates. Why should I have to hardcode something to exactly what is being autodiscovered?
I’d like to move this on to the next environment without making environment-specific settings, and this issue prevents me from doing that.
Hi @matthewmrichter, thank you for providing your solution. Actually, this is what I currently have: I set my certificates in one `ingress` resource, but my problem occurs once I use an additional `ingress` resource that contains the annotation `alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'`. The `alb-ingress` will detect this annotation and try to auto-discover the right certificate for it.
What I’m looking for is a way to disable certificate auto-discovery so that I have full control over my ALB configuration.
@postmaxin I agree we can do it if we have a deterministic way (e.g. a pure sort based on the lexical order of the certificate ARNs). Would that suit your needs?
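A minimal sketch of that proposal, combined with the earlier suggestion to log a warning: sort the candidate ARNs lexically, take the first, and report the ones that were skipped. The function name `pick_certificate` and the logger name are hypothetical; this only illustrates the idea and is not what the controller does.

```python
import logging

logger = logging.getLogger("cert-discovery-sketch")


def pick_certificate(host: str, cert_arns: list[str]) -> str:
    """Pick one ARN deterministically instead of failing on multiple matches."""
    if not cert_arns:
        raise ValueError(f"no certificate found for host: {host}")
    ordered = sorted(cert_arns)  # plain lexical order of the ARN strings
    chosen, others = ordered[0], ordered[1:]
    if others:
        logger.warning(
            "multiple certificates found for host %s; using %s, ignoring %s",
            host, chosen, others,
        )
    return chosen


# The two ARNs from the error message at the top of this issue.
arn_a = "arn:aws:acm:us-west-2:355508092300:certificate/c6d7d07b-0aec-49ae-aff6-3616afb4b0a7"
arn_b = "arn:aws:acm:us-west-2:355508092300:certificate/eeba7334-1671-4ad9-bc2f-d458ef972fdb"

# The result does not depend on the order in which the candidates arrive.
print(pick_certificate("dex.admin.rex.sh", [arn_b, arn_a]))
```

Lexical ARN order carries no meaning of its own; its only merit is that the same inputs always yield the same certificate.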
Is there any chance anyone will ever look at this? All I want is for `alb.ingress.kubernetes.io/certificate-arn` to actually WORK and restrict the certs it looks at to that list.
We’re dealing with this also. There’s no reason it can’t just add all the certs that match, but we need to completely toss out the cert discovery functionality to use this controller it seems.
@M00nF1sh this doesn’t seem like the same behavior as V1.
We have ingresses which match two certificates - `*.example.com` and `foo.example.com`. These ingresses were created with V1 and the controller added the `*.example.com` cert via the discovery feature. Now, with V2 we get the error regarding the multiple certificates (for the existing ingresses), so this does seem like there were breaking changes somewhere.
As a side note, why can’t the controller just load all of the certificates it finds?
Off topic but @dsaydon90 - Look into the `alb.ingress.kubernetes.io/group.name` annotation. What we have done for this at my organization is to have a helm deployment of dummy “model” ingresses that have the global stuff (certs, SG’s etc) along with the `alb.ingress.kubernetes.io/group.name` annotation. Then simply create ingresses in the other apps that use that annotation. AWS-LBC will merge the new TG’s into the “model” ingresses with the matching `group.name`. This way we don’t have to track all the annotations for every chart using the LB.
@M00nF1sh, is there any way to disable the Certificate Discovery?
I can’t see any way to disable it on Docs: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/cert_discovery/
I really think it would be great if I could manage all my ALB-level configuration (such as `certificates`, `security groups`, `ssl policy`, `scheme`, etc.) in one place, like a global `ingress`, and keep all target-group-level configuration on my other `ingress` resources.
I have the use case of rotating my certificate in a multi step process where I
It would be convenient for me to have the controller pick the latest certificate (I’m taking inspiration from terraform’s `aws_acm_certificate`’s `most_recent` argument on data sources).
Seems like the most sensible and straightforward deterministic way is to select the most recent valid cert arn, or provide an annotation to do the same.
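A rough sketch of the "most recent valid cert" rule suggested above. The `Cert` record, its field names, and the example ARNs are hypothetical stand-ins for whatever certificate metadata the controller would read from ACM. "Most recent" is taken here to mean the latest start of validity, with the ARN as a tie-breaker.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Cert:
    arn: str
    not_before: datetime  # start of the validity window
    not_after: datetime   # end of the validity window


def most_recent_valid(certs: list[Cert],
                      now: Optional[datetime] = None) -> Optional[Cert]:
    """Return the currently valid certificate with the newest start date."""
    now = now or datetime.now(timezone.utc)
    valid = [c for c in certs if c.not_before <= now < c.not_after]
    if not valid:
        return None
    # Newest validity start wins; the ARN breaks ties so the result is stable.
    return max(valid, key=lambda c: (c.not_before, c.arn))


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical rotation: the old cert is still valid while the new one exists.
old = Cert("arn:example:old", utc(2020, 1, 1), utc(2021, 2, 1))
new = Cert("arn:example:new", utc(2020, 11, 1), utc(2021, 12, 1))
expired = Cert("arn:example:expired", utc(2019, 1, 1), utc(2020, 2, 1))

picked = most_recent_valid([old, new, expired], now=utc(2020, 12, 1))
print(picked.arn if picked else "no valid certificate")
```

During a rotation this would move traffic to the new certificate as soon as it becomes valid, without editing any ingress.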
This was a frustrating bug/behavior to find at the last minute when trying to go-live.
At the very least, could the AWS LBC not attempt auto-discovery beyond the list you give with the `certificate-arn` annotation? That would limit the opportunity for confusion. Not that it should matter.
I see the same issue, despite having the `alb.ingress.kubernetes.io/certificate-arn` annotation configured. The “old” certificate was used by the controller; after adding the new certificate and changing the configuration, it gets stuck.
After thinking hard, I noticed this: I have two ingress objects sharing an ingress group. As `alb.ingress.kubernetes.io/certificate-arn` is listed as “merge”, I annotated only one Ingress in the group, assuming it would be used for all ingresses in the group. But the controller logged this message for the other ingress in the same group without the annotation.
So apparently the “determine-the-right-certificate” code runs for each ingress object individually, not for the merged annotations of an ingress group?