vault-k8s: Injector does not inject sidecar container

Hello, I’m trying to deploy Vault with the sidecar injector. I’m using this chart: https://github.com/hashicorp/vault-helm and following this guide: https://www.hashicorp.com/blog/injecting-vault-secrets-into-kubernetes-pods-via-a-sidecar/. The only difference is that I don’t run the server in dev mode.

Everything works fine except the injector. When I deploy an app with the injector annotations, the pod starts as usual with a single container and the mounted app-token secret, but no injector sidecar container is added:

Name:           app-57d4f4c645-9npng
Namespace:      my-namespace
Priority:       0
Node:           node
Start Time:     Mon, 06 Jan 2020 16:19:21 +0100
Labels:         app=vault-agent-demo
                pod-template-hash=57d4f4c645
Annotations:    vault.hashicorp.com/agent-inject: true
                vault.hashicorp.com/agent-inject-secret-test: secret/data/test-secret
                vault.hashicorp.com/role: test
Status:         Running
IP:             xxxxxx
IPs:            <none>
Controlled By:  ReplicaSet/app-57d4f4c645
Containers:
  app:
    Container ID:   docker://7348a9d4a9c0c9a3d831d3f84fa078081dcc3648f469aa2b0195b55242d26613
    Image:          jweissig/app:0.0.1
    Image ID:       docker-pullable://jweissig/app@sha256:54e7159831602dd8ffd8b81e1d4534c664a73e88f3f340df9c637fc16a5cf0b7
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 06 Jan 2020 16:19:22 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from app-token-kmzkr (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  app-token-kmzkr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  app-token-kmzkr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

There are no errors in the logs of the vault-agent-injector pod:

2020-01-06T13:55:55.369Z [INFO]  handler: Starting handler..
Listening on ":8080"...
Updated certificate bundle received. Updating certs...

Here is my deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: my-namespace
  labels:
    app: vault-agent-demo
spec:
  selector:
    matchLabels:
      app: vault-agent-demo
  replicas: 1
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/agent-inject-secret-test: "secret/data/test-secret"
        vault.hashicorp.com/role: "test"
      labels:
        app: vault-agent-demo
    spec:
      serviceAccountName: app
      containers:
      - name: app
        image: jweissig/app:0.0.1
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test
  namespace: my-namespace
  labels:
    app: vault-agent-demo
---
apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
metadata:
  name: vault
  namespace: my-namespace
  annotations:
    flux.weave.works/automated: 'true'
spec:
  chart:
    path: "."
    git: git@github.com:hashicorp/vault-helm.git
    ref: master
  releaseName: vault
  values:
    replicaCount: 1
    server:
      ingress:
        enabled: true
        annotations:
          ....... 
        hosts:
          .......
        tls:
          .......

Is there any way to debug this issue?
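
For reference, the webhook side can be inspected with something like the following (object and namespace names assume the chart defaults and the manifests above, so adjust if yours differ):

# Does the mutating webhook object exist and point at the injector service?
kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg -o yaml

# Injector logs; setting AGENT_INJECT_LOG_LEVEL=debug on the injector gives more detail
kubectl logs -n my-namespace deployment/vault-agent-injector

# Recent events in the app namespace, in case anything admission-related is reported there
kubectl get events -n my-namespace --sort-by=.lastTimestamp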

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 11
  • Comments: 40 (11 by maintainers)

Most upvoted comments

A lot of these issues sound like what we’ve seen happen with private GKE clusters, for example: https://github.com/hashicorp/vault-helm/issues/214#issuecomment-592702596

So if that matches your setup, please try adding a firewall rule to allow the master to access 8080 on the nodes: https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules

If it doesn’t, then it would help to know where your k8s cluster is running and how it’s configured. If the configurations are too varied we might need to break this up into separate issues for clarity. Cheers!
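
For a private GKE cluster, that rule can be created with something along these lines (the rule name is arbitrary; the network, master CIDR, and node tag are placeholders that must come from your cluster’s settings):

gcloud compute firewall-rules create allow-master-to-injector \
    --network=<cluster-network> \
    --source-ranges=<master-ipv4-cidr> \
    --target-tags=<node-tag> \
    --allow=tcp:8080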

This worked like a charm!!! Thanks @tvoran

Hi @mateipopa, these are the correct builds. Release engineering hadn’t completed the official build pipeline yet, so they are being built internally using the dev builds.

One thing you might investigate is the firewall rules on your GKE nodes. We’ve seen similar injection issues caused by port 8080 being blocked: https://github.com/hashicorp/vault-k8s/issues/46

Having this same issue with GKE; opening port 8080 to my apiserver did not do the trick for me.

Alright folks -

I codified the entire process of getting the sidecar injector working, even on external clusters, in Terraform: https://github.com/sethvargo/vault-on-gke/pull/98

^ Fully documented in this PR 👍

I hope this helps someone! It took me quite a bit of diving in to get this fully working out of the box!

Thanks for responding @jasonodonnell. Well, the init logs were certainly helpful and led me to my problem: an incorrect secret path. Thanks so much, it’s working now. I’m going to undo some of the AWS networking changes I made to see if they were even necessary.

vault-agent-init logs

2020-10-10T18:27:17.282Z [INFO]  sink.file: creating file sink
2020-10-10T18:27:17.282Z [INFO]  sink.file: file sink configured: path=/home/vault/.vault-token mode=-rw-r-----
2020-10-10T18:27:17.283Z [INFO]  auth.handler: starting auth handler
2020-10-10T18:27:17.283Z [INFO]  auth.handler: authenticating
2020-10-10T18:27:17.283Z [INFO]  template.server: starting template server
2020/10/10 18:27:17.283255 [INFO] (runner) creating new runner (dry: false, once: false)
2020-10-10T18:27:17.283Z [INFO]  sink.server: starting sink server
2020/10/10 18:27:17.283831 [WARN] (clients) disabling vault SSL verification
2020/10/10 18:27:17.283843 [INFO] (runner) creating watcher
2020-10-10T18:27:17.297Z [INFO]  auth.handler: authentication successful, sending token to sinks
2020-10-10T18:27:17.297Z [INFO]  auth.handler: starting renewal process
2020-10-10T18:27:17.297Z [INFO]  sink.file: token written: path=/home/vault/.vault-token
2020-10-10T18:27:17.297Z [INFO]  sink.server: sink server stopped
2020-10-10T18:27:17.297Z [INFO]  sinks finished, exiting
2020-10-10T18:27:17.297Z [INFO]  template.server: template server received new token
2020/10/10 18:27:17.297652 [INFO] (runner) stopping
2020/10/10 18:27:17.297677 [INFO] (runner) creating new runner (dry: false, once: false)
2020/10/10 18:27:17.297800 [WARN] (clients) disabling vault SSL verification
2020/10/10 18:27:17.297825 [INFO] (runner) creating watcher
2020/10/10 18:27:17.297863 [INFO] (runner) starting
2020-10-10T18:27:17.306Z [INFO]  auth.handler: renewed auth token
2020/10/10 18:27:17.314963 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 1 after "250ms")
2020/10/10 18:27:17.572730 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 2 after "500ms")
2020/10/10 18:27:18.080373 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 3 after "1s")
2020/10/10 18:27:19.088366 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 4 after "2s")
2020/10/10 18:27:21.096020 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 5 after "4s")
2020/10/10 18:27:25.104668 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 6 after "8s")
2020/10/10 18:27:33.112358 [WARN] (view) vault.read(secrets/dev/poc-secret): no secret exists at secrets/dev/poc-secret (retry attempt 7 after "16s")
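
For anyone else who lands in the same “no secret exists at …” retry loop: it is worth double-checking the path with the Vault CLI first. A minimal check, assuming secrets/ is a KV v2 mount and using the path from the logs above (the annotation name -poc below is just an example):

# confirm the mount and its type/version
vault secrets list

# the CLI omits the data/ segment for KV v2 mounts
vault kv get secrets/dev/poc-secret

# the agent-inject-secret annotation, by contrast, needs the full API path for KV v2:
#   vault.hashicorp.com/agent-inject-secret-poc: "secrets/data/dev/poc-secret"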

I found the issue: since I’m running on OpenShift 3.11 (Kubernetes 1.11), the API server config had to be changed so that the webhook admission controllers are enabled.

    MutatingAdmissionWebhook:
      configuration:
        apiVersion: v1
        disable: false
        kind: DefaultAdmissionConfig
    ValidatingAdmissionWebhook:
      configuration:
        apiVersion: v1
        disable: false
        kind: DefaultAdmissionConfig

This block must be present in master-config.yaml, in the admissionConfig.pluginConfig section. After restarting the API server, the webhook started to kick in. The sidecar was still not injected, though, because of permission issues: granting the consumer app’s service account cluster-admin permissions or access to the privileged SCC (the equivalent of a PSP) helped, but that in turn introduces other security issues.
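
For context, the full nesting in master-config.yaml looks roughly like this (standard OpenShift 3.x layout, with only the two webhook plugin entries added):

admissionConfig:
  pluginConfig:
    MutatingAdmissionWebhook:
      configuration:
        apiVersion: v1
        disable: false
        kind: DefaultAdmissionConfig
    ValidatingAdmissionWebhook:
      configuration:
        apiVersion: v1
        disable: false
        kind: DefaultAdmissionConfig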

I have the same issue:

  • I am running Vault chart v0.4.0
  • I verified with “openssl s_client -servername vault-agent-injector-svc.vault.svc -connect vault-agent-injector-svc.vault.svc:443” that the vault-agent-injector exposes the custom certificate and intermediate CA I provided
  • I verified that the caBundle of the MutatingWebhookConfiguration is the CA that issued the intermediate CA
  • I have the same log lines in my vault-agent-injector pod. I followed this guide: https://github.com/hashicorp/vault-guides/tree/bd7eaa007d9124f87549986b070bbe19315895bb/operations/provision-vault/kubernetes/minikube/vault-agent-sidecar
  • I checked the logs of the Kubernetes API server and the Controller while recreating alternately the MutatingWebhookConfiguration, the vault-agent-injector pod, and the consumer app pod, but there is nothing related to the agent injector.
  • I changed AGENT_INJECT_LOG_LEVEL to ‘debug’, with no effect
  • Even changing the MutatingWebhookConfiguration failurePolicy from ‘Ignore’ to ‘Fail’ didn’t prevent the consumer app pod from being started.

I wonder whether the problem is Kubernetes not contacting the webhook, or the webhook not being able to contact Vault.

How can I troubleshoot further?
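
One way to narrow that down is to test each hop separately (the service and webhook object names below assume the chart defaults, with /mutate being the injector’s webhook endpoint):

# Is the webhook object scoped so that it matches the app pod at all (rules, namespaceSelector)?
kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg -o yaml

# Is the injector endpoint reachable from inside the cluster, and does it present the expected cert?
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -vk https://vault-agent-injector-svc.vault.svc:443/mutate

# With AGENT_INJECT_LOG_LEVEL=debug, the injector logs show whether it ever receives a mutate request.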