istio: httpsRedirect breaks cert-manager acme challenge, which was working in Istio 1.6

Bug description

After upgrading from 1.6.2 to 1.7.2, cert-manager ACME challenges failed to complete. I eventually worked out that if I deleted the Istio Gateway resource for that host, the certificate request completed.

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-istio
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: hostmaster@example.com
    privateKeySecretRef:
      name: letsencrypt-istio
    solvers:
    - selector: {}
      http01:
        ingress:
          class: istio

The order, challenge, and ingress are all created and the ACME pod starts, but the challenge never completes.

[ ] Docs [ ] Installation [X] Networking [ ] Performance and Scalability [ ] Extensions and Telemetry [ ] Security [ ] Test and Release [X] User Experience [ ] Developer Infrastructure

Expected behavior

Certificate requests complete successfully as they did in version 1.6.2

Version (include the output of istioctl version --remote and kubectl version --short and helm version if you used Helm)

$ istioctl version --remote
client version: 1.7.2
control plane version: 1.7.2
data plane version: 1.7.2 (57 proxies)

$ kubectl version --short
Client Version: v1.17.3
Server Version: v1.17.3

How was Istio installed?

istioctl upgrade -f istio-operator.yaml

Environment where bug was observed (cloud vendor, OS, etc)

Kubernetes 1.17.3 on-premise on Centos 7

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 13
  • Comments: 34 (12 by maintainers)

Most upvoted comments

For anyone looking for a solution, here is one that may work. It has also been mentioned in https://discuss.istio.io/t/how-to-redirect-to-https-except-for-well-known-acme-challenge/9011/4.

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: my-cert
  namespace: istio-system
  annotations:
    cert-manager.io/issue-temporary-certificate: "true"

Enabling cert-manager.io/issue-temporary-certificate generates a self-signed certificate until a valid one is available. Let’s Encrypt states that they follow HTTPS redirects without checking the certificate.

Our implementation of the HTTP-01 challenge follows redirects, up to 10 redirects deep. It only accepts redirects to “http:” or “https:”, and only to ports 80 or 443. It does not accept redirects to IP addresses. When redirected to an HTTPS URL, it does not validate certificates (since this challenge is intended to bootstrap valid certificates, it may encounter self-signed or expired certificates along the way).

Source: https://cert-manager.io/docs/usage/certificate/#temporary-certificates-whilst-issuing
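
For reference, a fuller Certificate carrying this annotation might look like the sketch below. The `spec` fields are assumptions added for illustration, since the excerpt above shows only the metadata: the secret name `my-cert-tls` and the host `www.example.com` are hypothetical, the `issuerRef` points at the `letsencrypt-istio` ClusterIssuer from the bug report, and `cert-manager.io/v1` is used instead of the older `v1alpha2`.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-cert
  namespace: istio-system
  annotations:
    # Serve a temporary self-signed certificate until the real one is issued
    cert-manager.io/issue-temporary-certificate: "true"
spec:
  secretName: my-cert-tls        # hypothetical; should match the Gateway's credentialName
  dnsNames:
    - www.example.com            # hypothetical host
  issuerRef:
    name: letsencrypt-istio      # ClusterIssuer from the bug report
    kind: ClusterIssuer
```

With a temporary certificate in place, the HTTPS listener can serve the redirected challenge request even before the real certificate exists.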

@coldguy101

Thank you for your post. It didn’t quite work for me, but it got me to this solution which is working for me. Here is the complete declaration.

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: kiali
  namespace: monitoring
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - kiali.sandbox.acceleratelearning.com
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: kiali-certificate
      hosts:
        - kiali.sandbox.acceleratelearning.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: kiali
  namespace: monitoring
spec:
  hosts:
    - "*"
  gateways:
    - kiali
  http:
    # The redirect for https has to happen in the virtual service,
    # otherwise it will prevent the cert-manager http01 challenge from working
    - match:
        - scheme:
            exact: http
      redirect:
        scheme: https
    - route:
        - destination:
            host: kiali
            port:
              number: 20001

Definitely. This is actually not low priority in my opinion, but it's really hard to fix, unfortunately.

To supplement @sneko's answer above, we can do a better match for http01 requests.

FROM:

  http:
    # The redirect for https has to happen in the virtual service,
    # otherwise it will prevent the cert-manager http01 challenge from working
    - match:
        - scheme:
            exact: http
          uri:
            # This regex is not perfect, but RE2 does not support negation, so we have to hack it a bit.
            # Note: this pattern redirects to HTTPS every URI NOT starting with `/.well-` (acceptable, since your business URIs are unlikely to start with that).
            # This lets certificate-renewal requests such as `/.well-known/acme-challenge/...` pass through.
            regex: "^/(([^\\.].*)|(\\.[^w].*)|(\\.w[^e].*)|(\\.we[^l].*)|(\\.wel[^l].*)|(\\.well[^\\-].*))"

TO:

  http:
    - match:
        - withoutHeaders:
            :path:
              prefix: /.well-known/acme-challenge/
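
Put together with the redirect from the original rule, the full entry might read as follows. This is a sketch assembled from the two excerpts above, not a tested configuration; in particular, whether `:path` is honored under `withoutHeaders` in your Istio version is an assumption worth verifying.

```yaml
  http:
    - match:
        - scheme:
            exact: http
          # Skip the redirect for ACME HTTP-01 challenge requests
          withoutHeaders:
            :path:
              prefix: /.well-known/acme-challenge/
      redirect:
        scheme: https
```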

@coldguy101 what is the full definition you used for the VirtualService? Maybe it’s possible to do a wildcard VirtualService matching ALL 80 ports EXCEPT with the prefix /.well-known/acme-challenge/ (to do the redirection of scheme).

Also, does the Istio team have a recommended best practice to use right now? The issue looks important but seems to have gone stale.

Thank you,

@howardjohn Are there any updates on this issue? I would rather not split the VS and gateway into four resources as you show above if an update is on its way.

Thanks.

This worked for us: we changed tls httpsRedirect to false (or commented it out) in the Gateway and then added this as the first rule under http in the VirtualService:

  - match:
      - scheme:
          exact: http
    redirect:
      scheme: https

We have changed about 50 configs to this over the last 6 months and have not had a problem yet.

The solution from @sazzle2611 is the correct one. I’ve written a bit more in detail about why and how it works: https://gruchalski.com/posts/2023-07-09-istio-cert-manager-lets-encrypt-and-https-redirect/.

— BELOW A POTENTIAL SIMPLE HACKY SOLUTION 😃 —

Every 3 months I struggle hard with Istio and cert-manager, and I’m sad that after years it’s still a real pain even for really simple use cases (using httpsRedirect: true, for example). I hope a proper solution will come out one day!

My solution:

  • As stated above, we have to do the HTTPS redirection at the VirtualService level, since the Gateway cannot specify match rules
  • As I described 3 months ago (https://github.com/istio/istio/issues/27643#issuecomment-1095306328), the idea is to do the HTTPS redirection on all URIs except /.well-known/acme-challenge/. Due to limitations of RE2 regex it’s a bit hacky, but it works!

gateway.yaml

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: main-gateway
  namespace: mynamespace
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http-ingress
        protocol: HTTP2
      hosts:
        - "mynamespace/www.mydomain.com"
      tls:
        httpsRedirect: false # The HTTPS redirections will be handled in the VirtualService layer to not conflict with certificates renewing...
    - port:
        number: 443
        name: https-ingress
        protocol: HTTPS
      tls:
        mode: SIMPLE
        privateKey: sds
        serverCertificate: sds
        credentialName: ingress-cert
      hosts:
        - "mynamespace/www.mydomain.com"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: workaround-to-renew-certs
  namespace: mynamespace
spec:
  hosts:
    - "*"
  gateways:
    - main-gateway
  http:
    # The redirect for https has to happen in the virtual service,
    # otherwise it will prevent the cert-manager http01 challenge from working
    - match:
        - scheme:
            exact: http
          uri:
            # This regex is not perfect, but RE2 does not support negation, so we have to hack it a bit.
            # Note: this pattern redirects to HTTPS every URI NOT starting with `/.well-` (acceptable, since your business URIs are unlikely to start with that).
            # This lets certificate-renewal requests such as `/.well-known/acme-challenge/...` pass through.
            regex: "^/(([^\\.].*)|(\\.[^w].*)|(\\.w[^e].*)|(\\.we[^l].*)|(\\.wel[^l].*)|(\\.well[^\\-].*))"
      redirect:
        scheme: https
        redirectCode: 302 # Use 302 to avoid risky permanent redirects, since this is a workaround

🏎️💨

Instead of negation, can you just match everything? Then, when cert-manager creates a rule, it will be more specific and take precedence. I haven’t tested it, but I think it could work.

Just ran into this problem also. My work-around was to stop using

tls: {
    httpsRedirect: true
},

in my Istio Gateway spec, and instead use a VirtualService matcher to do the redirection.

    {
      // This rule exists because of the Istio issue on the httpsRedirect flag above
      match: [
        {
          port: 80,
          uri: {
            prefix: "/"
          },
        },
      ],
      redirect: {
        scheme: "https"
      },
    },
...

Doing this allowed cert-manager to insert its own HTTP routes, and when those get cleaned up automatically, revert back to the standard redirect.
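
For readers using plain YAML manifests, the same rule would presumably look like the sketch below. It is a direct transliteration of the snippet above and assumes it sits in the `http` list of a VirtualService bound to the gateway.

```yaml
  http:
    # Redirect anything arriving on port 80 to HTTPS
    - match:
        - port: 80
          uri:
            prefix: /
      redirect:
        scheme: https
```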

Your theory sounds correct to me just based on my observation.

Here is my full Gateway config:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: website-mydomain-com
  namespace: mine
spec:
  selector:
    istio: istio-ingressgateway-mine
  servers:
  - hosts:
    - mydomain.com
    - www.mydomain.com
    port:
      name: http
      number: 80
      protocol: HTTP
    tls:
      httpsRedirect: true
  - hosts:
    - mydomain.com
    - www.mydomain.com
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      credentialName: website-mydomain-com-cert
      minProtocolVersion: TLSV1_2
      mode: SIMPLE
      privateKey: sds
      serverCertificate: sds

And in case it’s useful the ingress that was generated by cert-manager:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: istio
    nginx.ingress.kubernetes.io/whitelist-source-range: 0.0.0.0/0,::/0
  creationTimestamp: "2021-01-04T02:08:56Z"
  generateName: cm-acme-http-solver-
  generation: 1
  labels:
    acme.cert-manager.io/http-domain: "2332887661"
    acme.cert-manager.io/http-token: "964497066"
    acme.cert-manager.io/http01-solver: "true"
spec:
  rules:
  - host: mydomain.com
    http:
      paths:
      - backend:
          service:
            name: cm-acme-http-solver-pslzv
            port:
              number: 8089
        path: /.well-known/acme-challenge/pUB07WsWON9TLYcACp-jpLwLnxBKNQZmvIywbfjTC8U
        pathType: ImplementationSpecific
status:
  loadBalancer:
    ingress:
    - ip: 192.168.1.250

Thank you for your attention on this, it is much appreciated!

I think to do it we would need to stop using RequireTLS in envoy, and instead use route_action.https_redirect and a match on :scheme. This is not only a big change, but it’s broken in envoy currently: https://github.com/envoyproxy/envoy/issues/14587. So it would likely be a while until it is fixed.
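
As a rough illustration of that approach, an Envoy route that redirects based on `:scheme` could look like the sketch below. This is hand-written Envoy v3 route configuration, not what Istio generates; the virtual host name is hypothetical, and per the linked Envoy issue the `:scheme` match did not behave as needed at the time.

```yaml
virtual_hosts:
  - name: example-vhost          # hypothetical name
    domains: ["*"]
    # No require_tls here; the redirect is expressed as an ordinary route
    routes:
      - match:
          prefix: "/"
          headers:
            - name: ":scheme"
              string_match:
                exact: "http"
        redirect:
          https_redirect: true
```

Because the redirect is an ordinary route in this form, a more specific route for the challenge path could in principle be listed ahead of it.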