bank-vaults: Inject secrets in block notation data within k8s secrets/configmap resources

Is your feature request related to a problem? Please describe.

  • the current use case is installing kube-prometheus-stack/alertmanager, whose Alertmanager configuration contains three values we want to store in and retrieve from Vault as secrets
  • kube-prometheus-stack stores this config in a Secret resource as a base64-encoded blob
  • the base64-encoded blob is itself a YAML configuration
  • vault: references inside that base64-encoded blob are not translated/injected by either non-inline or inline injection (see the sketch after this list)
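
For contrast, a minimal Secret of the shape the webhook already appears to handle (as far as I can tell): each data value, once base64-decoded, is a single vault: reference rather than a reference buried inside a larger YAML blob. The name and the <…> placeholder below are illustrative only; the annotations mirror the ones used further down.

apiVersion: v1
kind: Secret
metadata:
  name: plain-example
  namespace: monitoring
  annotations:
    vault.security.banzaicloud.io/vault-addr: vault-tls
    vault.security.banzaicloud.io/vault-path: default
    vault.security.banzaicloud.io/vault-role: kubernetes
type: Opaque
data:
  smtp_password: <base64 of "vault:secret/data/path/to/alertmanager#smtp_password">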

The kube-prometheus-stack Secret manifest, derived from what the Helm chart generates:

apiVersion: v1
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus-stack
    meta.helm.sh/release-namespace: monitoring
    vault.security.banzaicloud.io/inline-mutation: "true"
    vault.security.banzaicloud.io/vault-addr: vault-tls
    vault.security.banzaicloud.io/vault-path: default
    vault.security.banzaicloud.io/vault-role: kubernetes
    vault.security.banzaicloud.io/vault-skip-verify: "false"
  labels:
    app: kube-prometheus-stack-alertmanager
    app.kubernetes.io/managed-by: Helm
    chart: kube-prometheus-stack-9.4.10
    cluster: clever
    heritage: Helm
    release: kube-prometheus-stack
  name: alertmanager-kube-prometheus-stack-alertmanager
  namespace: monitoring
type: Opaque
data:
  alertmanager.yaml: Z2xvYmFsOgogIHJlc29sdmVfdGltZW91dDogNW0KICBzbXRwX2F1dGhfcGFzc3dvcmQ6IHZhdWx0OnNlY3JldC9kYXRhL3BhdGgvdG8vYWxlcnRtYW5hZ2VyI3NtdHBfcGFzc3dvcmQKCiAgc210cF9hdXRoX3VzZXJuYW1lOiB2YXVsdDpzZWNyZXQvZGF0YS9wYXRoL3RvL2FsZXJ0bWFuYWdlciNzbXRwX3VzZXJuYW1lCgogIHNtdHBfZnJvbTogUHJvbWV0aGV1cyA8cHJvbWV0aGV1c0Bjb21wYW55LmRvbT4KICBzbXRwX3NtYXJ0aG9zdDogZW1haWxob3N0Ojk5OQppbmhpYml0X3J1bGVzOgotIGVxdWFsOgogIC0gYWxlcnRuYW1lCiAgLSBzZXZlcml0eQogIHNvdXJjZV9tYXRjaDoKICAgIHNldmVyaXR5OiBjcml0aWNhbAogIHRhcmdldF9tYXRjaDoKICAgIHNldmVyaXR5OiB3YXJuaW5nCnJlY2VpdmVyczoKLSBuYW1lOiBwYWdlcmR1dHktcmVjZWl2ZXIKICBwYWdlcmR1dHlfY29uZmlnczoKICAtIGRldGFpbHM6CiAgICAgIGZpcmluZzogJ3t7IHRlbXBsYXRlICJwYWdlcmR1dHkuZGVmYXVsdC5pbnN0YW5jZXMiIC5BbGVydHMuRmlyaW5nIH19JwogICAgICBuZXh0X2dlbl9jbHVzdGVyOiBjbGV2ZXIKICAgICAgbnVtX2ZpcmluZzogJ3t7IC5BbGVydHMuRmlyaW5nIHwgbGVuIH19JwogICAgICBudW1fcmVzb2x2ZWQ6ICd7eyAuQWxlcnRzLlJlc29sdmVkIHwgbGVuIH19JwogICAgICByZXNvbHZlZDogJ3t7IHRlbXBsYXRlICJwYWdlcmR1dHkuZGVmYXVsdC5pbnN0YW5jZXMiIC5BbGVydHMuUmVzb2x2ZWQgfX0nCiAgICAgIHJvdXRpbmdfa2V5OiB2YXVsdDpzZWNyZXQvZGF0YS9wYXRoL3RvL3BhZ2VyZHV0eSNyawoKLSBlbWFpbF9jb25maWdzOgogIC0gdG86IG9wc0Bjb21wYW55LmRvbQogIG5hbWU6IG9wcwpyb3V0ZToKICBncm91cF9ieToKICAtIGFsZXJ0bmFtZQogIC0gc2V2ZXJpdHkKICAtIGNsdXN0ZXIKICBncm91cF9pbnRlcnZhbDogMW0KICBncm91cF93YWl0OiAxMHMKICByZWNlaXZlcjogcGFnZXJkdXR5LXJlY2VpdmVyCiAgcmVwZWF0X2ludGVydmFsOiA5MHMKICByb3V0ZXM6CiAgLSBjb250aW51ZTogdHJ1ZQogICAgZ3JvdXBfd2FpdDogMTBzCiAgICBtYXRjaDoKICAgICAgcGFnaW5nX3N5c3RlbTogdHJ1ZQogICAgcmVjZWl2ZXI6IHBhZ2VyZHV0eS1yZWNlaXZlcgogIC0gbWF0Y2g6CiAgICAgIGFsZXJ0bmFtZTogV2F0Y2hkb2cKICAgIHJlY2VpdmVyOiAibnVsbCIKdGVtcGxhdGVzOgotIC9ldGMvYWxlcnRtYW5hZ2VyL3RlbXBsYXRlLyoudG1wbAo=

The base64-encoded blob under the alertmanager.yaml key decodes to:

global:
  resolve_timeout: 5m
  smtp_auth_password: vault:secret/data/path/to/alertmanager#smtp_password
  smtp_auth_username: vault:secret/data/path/to/alertmanager#smtp_username
  smtp_from: Prometheus <prometheus@company.dom>
  smtp_smarthost: emailhost:999
inhibit_rules:
- equal:
  - alertname
  - severity
  source_match:
    severity: critical
  target_match:
    severity: warning
receivers:
- name: pagerduty-receiver
  pagerduty_configs:
  - details:
      firing: '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
      next_gen_cluster: clever
      num_firing: '{{ .Alerts.Firing | len }}'
      num_resolved: '{{ .Alerts.Resolved | len }}'
      resolved: '{{ template "pagerduty.default.instances" .Alerts.Resolved }}'
      routing_key: vault:secret/data/path/to/pagerduty#rk
- email_configs:
  - to: ops@company.dom
  name: ops
route:
  group_by:
  - alertname
  - severity
  - cluster
  group_interval: 1m
  group_wait: 10s
  receiver: pagerduty-receiver
  repeat_interval: 90s
  routes:
  - continue: true
    group_wait: 10s
    match:
      paging_system: true
    receiver: pagerduty-receiver
  - match:
      alertname: Watchdog
    receiver: "null"
templates:
- /etc/alertmanager/template/*.tmpl

The current error causing alertmanager to fail is:

level=info ts=2021-02-22T22:09:44.107Z caller=coordinator.go:119 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2021-02-22T22:09:44.107Z caller=coordinator.go:124 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="missing service or routing key in PagerDuty config"

Describe the solution you’d like

The secrets get intercepted and injected by the operator so that Alertmanager can read its config…
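
A sketch of the desired behavior: the webhook/operator would base64-decode alertmanager.yaml, resolve the embedded vault: references, and re-encode the result. In terms of the decoded content (the "s3cr3t" value below is illustrative only):

# Decoded alertmanager.yaml before mutation (as in the blob above):
smtp_auth_password: vault:secret/data/path/to/alertmanager#smtp_password

# Decoded alertmanager.yaml after the desired mutation, assuming the value
# stored in Vault at that path/key is "s3cr3t":
smtp_auth_password: s3cr3t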

Describe alternatives you’ve considered

The only thing I can think of doing now is using a wrapper script to extract secrets from Vault and inject them into the manifest before applying it to the cluster.

Additional context

N/A

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 25 (20 by maintainers)

Most upvoted comments

I was able to get the test instance working, so that was very helpful. Now I just need to find out why this doesn’t work in our currently running clusters, which is one of the main reasons I created this ticket. Is there a way to increase the verbosity of the webhook logging to see what it’s doing?

If you would be so kind as to help determine what’s going on here, I would appreciate it.

When attempting to apply the Secret manifest I need mutated, I continually get the following errors in the vault-secrets-webhook logs:

time="2021-03-27T21:06:13Z" level=debug msg="reviewing request 794b498e-c415-4af3-88b5-eff837348e8d, named: monitoring/mutate-me" app=vault-secrets-webhook                                                                                                                                                                                              
time="2021-03-27T21:06:17Z" level=debug msg="reviewing request 6a324328-63c4-4f4b-8af7-d273f4ff6505, named: loki/loki-basic-auth" app=vault-secrets-webhook
...
...
...
time="2021-03-27T21:57:17Z" level=error msg="failed to request new Vault token" app=vault-secrets-webhook err="Error making API request.\n\nURL: PUT https://vault.vault:8200/v1/auth/kubernetes/login\nCode: 500. Errors:\n\n* namespace not authorized" 
time="2021-03-27T21:57:22Z" level=error msg="failed to request new Vault token" app=vault-secrets-webhook err="Error making API request.\n\nURL: PUT https://vault.vault:8200/v1/auth/kubernetes/login\nCode: 500. Errors:\n\n* namespace not authorized"
time="2021-03-27T21:57:23Z" level=error msg="admission webhook error: failed to create vault client: timeout [10s] during waiting for Vault token" app=vault-secrets-webhook
...
...
time="2021-03-27T21:57:26Z" level=error msg="failed to request new Vault token" app=vault-secrets-webhook err="Error making API request.\n\nURL: PUT https://vault.vault:8200/v1/auth/kubernetes/login\nCode: 500. Errors:\n\n* namespace not authorized"

Unfortunately, the log entries do not call out which mutation produced this error, but the errors are consistent with the number of secrets to be mutated in the resource (3), the timing of my repeated applies of the resource, and the place in the logs where my resource is evaluated. Consequently, this is one of the things I’m asking to be “fixed” in terms of log output and verbosity: it would be nice to know which errors are relevant to which requests.

With respect to the error: what I don’t understand is what in our setup could be causing “namespace not authorized.” The externalConfig section of our cluster’s Vault manifest is as follows:

  externalConfig:
    auth:
    - roles:
      - bound_service_account_names:
        - alertmanager
        - vault
        - vault-secrets-webhook
        bound_service_account_namespaces:
        - vault
        - dex
        - monitoring
        name: default
        policies: allow_secrets
        ttl: 1h
      type: kubernetes
    policies:
    - name: allow_secrets
      rules: |
        path "secret/*" {
          capabilities = ["create", "read", "update", "delete", "list"]
        }
    secrets:
    - description: General secrets.
      options:
        version: 2
      path: secret
      type: kv
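
One thing worth noting (an assumption on my part, not something I have confirmed): when the webhook mutates a Secret resource rather than a Pod, I believe it logs in to Vault with its own service account, so the namespace the webhook itself runs in would also need to appear in bound_service_account_namespaces. A hypothetical fragment of the role above, assuming the webhook runs in a namespace called vault-infra (illustrative name):

      - bound_service_account_names:
        - alertmanager
        - vault
        - vault-secrets-webhook
        bound_service_account_namespaces:
        - vault
        - dex
        - monitoring
        - vault-infra   # namespace the vault-secrets-webhook pod actually runs in (assumed)
        name: default
        policies: allow_secrets
        ttl: 1h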

This is the only time this exception gets thrown (when I apply this mutation); the webhook works for all of our other mutation requests. I’ve tried installing the secret under the dex and monitoring namespaces, plus one other namespace (removed from the list above for security purposes). I get the same error for every namespace I use. The service accounts specified in the config all exist, and they are all correctly associated with their namespaces.

We are using Vault 1.6.0 and the Helm chart 1.8.0.

I think this is the crux of my problem, but I cannot figure out why it would be the problem. Do you have any thoughts, @bonifaido? I found https://www.gitmemory.com/issue/banzaicloud/bank-vaults/542/508459934, which seems like the thing I might be running into, but that link doesn’t have a solution.

OK, going to try these examples now. My main concern/issue is that the secret is not created as a vanilla k8s Secret (like the one above). It’s created by the kube-prometheus-stack Helm chart, which takes what’s stashed in a values.yaml file and generates the Secret from it, so the creation of the Secret is abstracted away from CI/CD and the operator. What I’m doing is waiting for the Secret to be generated and then applying the annotations after the fact. That doesn’t seem like it should be a problem, though, so I’ll test and let you know.
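
For reference, the after-the-fact step amounts to patching the chart-generated Secret with the same annotations shown in the manifest at the top of this issue, along these lines (a sketch only, and it assumes the webhook re-evaluates the Secret on the update that adds the annotations):

metadata:
  annotations:
    vault.security.banzaicloud.io/inline-mutation: "true"
    vault.security.banzaicloud.io/vault-addr: vault-tls
    vault.security.banzaicloud.io/vault-path: default
    vault.security.banzaicloud.io/vault-role: kubernetes
    vault.security.banzaicloud.io/vault-skip-verify: "false"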