vault-k8s: vault-k8s and istio service mesh don't work together

I did the steps described here and it worked great.

The problem starts when I add istio to the namespace. The vault-agent-init container can’t start correctly because no network is available yet.

Is there a way to use just the vault-agent sidecar and skip the vault-agent-init container? Is there any configuration that runs the vault-agent-init command inside the vault-agent sidecar instead?

I found this comment in the container_init_sidecar.go code, and I’m not sure whether it’s safe to execute everything inside the sidecar container.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 18
  • Comments: 35 (7 by maintainers)

Most upvoted comments

@gabrielanavarro I’m not aware of any security issues caused by disabling the init container. The main issue will be whether your application expects the secrets from vault to be already rendered when your application starts up. The sidecar container will render the secrets, but it’s a race between it rendering them and your application consuming them.

But to your initial problem, vault-k8s 0.3.0 now has support for a vault.hashicorp.com/agent-init-first annotation. Setting that to true should allow the vault init container to run before the istio init container so they both have a better chance of succeeding. We’d love to hear if this works for you!
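As a minimal sketch, the annotation goes on the pod template next to the usual injection annotation. Only the annotations are shown here; the rest of the pod template is assumed to exist already. Per later comments in this thread, this ordering helps with the istio-init init container but not with istio-cni.

```yaml
# Pod template fragment (sketch); only the annotations are shown.
metadata:
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    # Ask the injector to place vault-agent-init ahead of the other init
    # containers (such as istio-init), so it runs before the iptables
    # redirection is set up.
    vault.hashicorp.com/agent-init-first: "true"
```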

It seems the istio-init container patches the pod’s iptables rules before vault-agent-init runs. Since istio-proxy isn’t running yet, vault-agent-init cannot reach the Vault server. Switching the order of the init containers in the pod’s YAML (i.e. putting vault-agent-init before istio-init) does the trick, but I don’t know whether this can be done directly through the mutating webhook.

You can also exclude the outbound port to Vault from Envoy redirection using the following annotation: traffic.sidecar.istio.io/excludeOutboundPorts: "8200"

Of course this only works if you’re happy with connections to Vault from the init container and sidecar not being intercepted. I have only tested this with Istio CNI.

tl;dr: Set vault.hashicorp.com/agent-run-as-user: "1337" if using istio-cni.

There are two ways to use istio:

1. Init container

Istio injects an istio-init init container, which sets up the networking. In this case the ordering of the init containers matters, because init containers that run before istio-init can access the network without restrictions.

2. Istio-CNI

With istio-cni, Istio sets up the network even before the first init container starts. Istio still injects an istio-validation init container, but it only validates that the network setup is correct. With istio-cni the order of the init containers doesn’t matter, because Istio tries to route all traffic through the istio-proxy-sidecar, which hasn’t started yet while init containers are running.
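To tell which of the two modes applies, one option is to look at the init containers of an injected pod, for example in the output of `kubectl get pod <pod> -o yaml`. The excerpts below are abbreviated and illustrative only: every field except the container name is omitted, so they are not manifests to apply.

```yaml
# 1. Init container mode: istio-init sets up the iptables redirection itself.
initContainers:
  - name: istio-init
---
# 2. istio-cni mode: the CNI plugin has already set up the network;
#    istio-validation only checks that the setup is correct.
initContainers:
  - name: istio-validation
```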

There are at least three solutions to avoid this problem (the last one might surprise you):

Disable vault-agent-init

Only init containers are a problem, because the istio-proxy-sidecar isn’t running at that point. It is possible to disable vault-agent-init and use only the vault-agent-sidecar. The drawback is that the application has to wait for the vault-agent-sidecar to populate the secrets. This can be achieved with a sleep in the application entrypoint (or a more sophisticated approach).

metadata:
  annotations:
    vault.hashicorp.com/agent-pre-populate: "false"
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true

holdApplicationUntilProxyStarts helps because all containers, including vault-agent, wait until the istio-proxy-sidecar is ready to process requests.
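One possible form of the "more sophisticated approach" is to poll for the rendered secret file instead of sleeping for a fixed time. The sketch below is an assumption, not something taken from this thread: the container name `my-app`, the image, the binary `/app/server`, and the secret name `config` are hypothetical, and it assumes the image ships `/bin/sh` and that the injector writes the secret to its default location under `/vault/secrets`.

```yaml
# Sketch: application container entry in the pod spec.
# Names, image and paths are hypothetical placeholders.
spec:
  containers:
    - name: my-app
      image: example.com/my-app:1.0
      command: ["/bin/sh", "-c"]
      args:
        - |
          # Poll until the vault-agent sidecar has rendered a non-empty secret file.
          until [ -s /vault/secrets/config ]; do
            echo "waiting for /vault/secrets/config"
            sleep 1
          done
          # Replace the shell with the real application process.
          exec /app/server
```

Using `exec` keeps the application as the container's main process, so signals from Kubernetes reach it directly rather than the wrapper shell.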

Istio-Annotations

It is possible to disable the routing through the istio sidecar for specific ports, for example port 8200.

metadata:
  annotations:
    traffic.sidecar.istio.io/excludeOutboundPorts: "8200"
    proxy.istio.io/config: |
      proxyMetadata:
        ISTIO_META_DNS_CAPTURE: "false"

DNS capture isn’t enabled by default, but if it is activated globally, you need to disable it for DNS resolution in the init container to work. This might be the problem for @razvan-miron, since the connection to “172.18.0.10:53” failed and port 53 is DNS. Note that this disables DNS capture for the whole pod, which isn’t always desirable.

Vault-Injector-Settings

Istio offers another way to avoid routing traffic through the istio-proxy-sidecar: the Istio iptables rules ignore all traffic from uid 1337, because that is the user istio-proxy runs as. This is described here: https://istio.io/latest/docs/setup/additional-setup/cni/#compatibility-with-application-init-containers If other containers, such as vault-agent-init, run as this user, their traffic is ignored as well.

The user of the vault container can be changed with the annotation on the pod:

metadata:
  annotations:
    vault.hashicorp.com/agent-run-as-user: "1337"

or with an environment variable on the injector:

env:
- name: AGENT_INJECT_RUN_AS_USER
  value: "1337"

or in the helm chart:

injector:
  extraEnvironmentVars:
    AGENT_INJECT_RUN_AS_USER: "1337"

See also https://www.vaultproject.io/docs/platform/k8s/injector/annotations#vault-hashicorp-com-agent-run-as-user

A small drawback is that the traffic of the vault-agent-sidecar isn’t routed through istio either. I don’t think this is a problem in general, unless you want istio metrics for the sidecar.

Same issue here. vault-agent-init starts before the istio-validation init container thanks to the vault.hashicorp.com/agent-init-first annotation, but it never completes and is stuck with “connect : connection refused”.

Tested with istio 1.9 with the CNI plugin installed on k8s v1.18. @semihural, did you manage to find a solution?

ps: traffic.sidecar.istio.io/excludeOutboundPorts is not a desirable option.

Thanks @thechristschn, the solution:

metadata:
  annotations:
    vault.hashicorp.com/agent-pre-populate: "false"
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true

This worked for us with RedHat OpenShift ServiceMesh. We did not have to use excludeOutboundPorts or vault.hashicorp.com/agent-init-first; the Vault sidecar container was able to resolve our internal vault DNS and read the secrets. We did not have to add any delays to the app pod (your mileage may vary).

@einret

This worked for me.

  template:
    metadata:
      annotations:
        traffic.sidecar.istio.io/excludeOutboundPorts: "8200"
        vault.hashicorp.com/agent-init-first: "true"
        vault.hashicorp.com/agent-inject: "true"

@TamasNeumer @a8j8i8t8, we are having a similar issue.

When we enable istio in the vault namespace (where the vault injector is deployed, with an external vault), we get errors when an application pod is deployed in the application namespace. If istio is disabled in the vault namespace, everything works fine. (The application namespace always has istio enabled.)

@dippynark - thanks! Adding failurePolicy: Fail to the webhook seems to have done it! Will continue to test scenarios.

It’s working perfectly