vault-helm: HA vault init with TLS - cannot validate certificate

Hello,

I’m trying to set up an HA Vault cluster consisting of 3 Vault pods in EKS.

I followed the TLS cert generation instructions from https://www.vaultproject.io/docs/platform/k8s/helm/examples/standalone-tls/

When I run vault operator init, Vault returns:

Error initializing: Put https://127.0.0.1:8200/v1/sys/init: x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs

In my csr.conf, I have these defined:

[alt_names]
DNS.1 = vault
DNS.2 = vault.vault
DNS.3 = vault.vault.svc
DNS.4 = vault.vault.svc.cluster.local
IP.1 = 127.0.0.1

I also checked the csr generated:

            X509v3 Subject Alternative Name:
                DNS:vault, DNS:vault.vault, DNS:vault.vault.svc, DNS:vault.vault.svc.cluster.local, IP Address:127.0.0.1
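The output above comes from the CSR, while the error is raised against the certificate the server actually presents, so it may be worth running the same check on the issued certificate. A minimal sketch, assuming the output of `openssl x509 -noout -text` (or `openssl req -noout -text`) is piped in; the function name `has_ip_san` is hypothetical:

```shell
# Reads openssl text output on stdin and succeeds only if the
# Subject Alternative Name list contains the given IP address.
has_ip_san() {
  grep -A1 'X509v3 Subject Alternative Name' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "IP Address:$1"
}
```

Possible usage, assuming the issued certificate was saved as `${TMPDIR}/vault.crt`: `openssl x509 -in "${TMPDIR}/vault.crt" -noout -text | has_ip_san 127.0.0.1 && echo "IP SAN present"`.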

I suspect 127.0.0.1 is from the env variable defined in the statefulset template

- name: VAULT_ADDR
  value: "{{ include "vault.scheme" . }}://127.0.0.1:8200"
- name: VAULT_API_ADDR
  value: "{{ include "vault.scheme" . }}://$(POD_IP):8200"

My tcp listener is configured as:

listener "tcp" {
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
  tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
  tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
}

Is there another set of instructions I am missing?

Thanks

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 19 (1 by maintainers)

Most upvoted comments

For anyone running across this who happens to be following the documentation in a cluster running Kubernetes 1.22 or later: because of the changes to the certificate API, you will need to make some changes.

First, for the csr.yaml file you will need a signerName. Reference the documentation here for a TLS cert. You will want to use signerName: kubernetes.io/kubelet-serving for the server cert. That being said, you’ll need to modify the command in the vault documentation to include the organization and common name requirements for that signerName, as mentioned here.

You’ll know you’re dealing with this problem if your certificate immediately goes to the Approved,Failed status when you approve the CSR. I solved this by creating server.csr with the following command, which sets the organization and common name to match the requirements for this signer:

openssl req -new -key ${TMPDIR}/vault.key -subj "/O=system:nodes/CN=system:node:${SERVICE}.${NAMESPACE}.svc" -out ${TMPDIR}/server.csr -config ${TMPDIR}/csr.conf
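For the csr.yaml side of this change, a rough sketch of a manifest with the signerName set might look like the following. The function name `render_csr_manifest` and its two arguments are hypothetical, and the usages listed are an assumption based on what the kubelet-serving signer permits:

```shell
# Prints a certificates.k8s.io/v1 CertificateSigningRequest manifest.
# $1: name for the CSR object, $2: base64-encoded PEM CSR (single line).
render_csr_manifest() {
  csr_name="$1"
  request_b64="$2"
  cat <<EOF
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: ${csr_name}
spec:
  signerName: kubernetes.io/kubelet-serving
  request: ${request_b64}
  usages:
    - digital signature
    - key encipherment
    - server auth
EOF
}
```

It could be used as `render_csr_manifest vault-csr "$(base64 < "${TMPDIR}/server.csr" | tr -d '\n')" | kubectl apply -f -`, where `vault-csr` is a placeholder name.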

@j-sokol Thank you so much for the tip that certificates need to be provided this way, worked for me!!

In case anyone like me stumbles across this, feel free to use my whole config, which works with the fixes described in this thread:

global:
  enabled: true
  tlsDisable: false

injector:
  enabled: false

server:
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault/vault.ca
    VAULT_TLSCERT: /vault/userconfig/vault/vault.crt
    VAULT_TLSKEY: /vault/userconfig/vault/vault.key

  extraVolumes:
    - type: secret
      name: vault

  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: false

      config: |
        ui = true
        api_addr = "https://POD_IP:8200"

        listener "tcp" {
          address = "0.0.0.0:8200"
          cluster_address = "0.0.0.0:8201"

          tls_cert_file = "/vault/userconfig/vault/vault.crt"
          tls_key_file  = "/vault/userconfig/vault/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault/vault.ca"
        }

        storage "raft" {
          path = "/vault/data"

          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault/vault.key"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault/vault.ca"
            leader_client_cert_file = "/vault/userconfig/vault/vault.crt"
            leader_client_key_file = "/vault/userconfig/vault/vault.key"
          }

          autopilot {
            cleanup_dead_servers = "true"
            last_contact_threshold = "200ms"
            last_contact_failure_threshold = "10m"
            max_trailing_logs = 250000
            min_quorum = 3
            server_stabilization_time = "10s"
          }

        }
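The three retry_join stanzas in the config above differ only in the pod ordinal, so they can be generated rather than copied. A small sketch, assuming the pod and headless-service naming used above (`<release>-<n>.<release>-internal`); the function name `render_retry_joins` is hypothetical:

```shell
# Prints one retry_join stanza per replica.
# $1: release name, $2: replica count, $3: directory holding the TLS files.
render_retry_joins() {
  release="$1"
  replicas="$2"
  tls_dir="$3"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    cat <<EOF
retry_join {
  leader_api_addr = "https://${release}-${i}.${release}-internal:8200"
  leader_ca_cert_file = "${tls_dir}/vault.ca"
  leader_client_cert_file = "${tls_dir}/vault.crt"
  leader_client_key_file = "${tls_dir}/vault.key"
}
EOF
    i=$((i + 1))
  done
}
```

For example, `render_retry_joins vault 3 /vault/userconfig/vault` prints the stanzas for vault-0 through vault-2, ready to paste inside the storage "raft" block.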

Using a wildcard certificate, as @ikarlashov noted, worked for me.

One thing to mention: when joining the cluster from the vault-1 and vault-2 pods, the client key, client certificate, and CA certificate have to be provided:

vault operator raft join -leader-ca-cert="@${VAULT_CACERT}" -leader-client-cert="@${VAULT_TLSCERT}" -leader-client-key="@${VAULT_TLSKEY}" https://vault-0.vault-internal:8200

where the environment variables above are set in the Helm chart’s values:

server:
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-tls/vault.ca 
    VAULT_TLSCERT: /vault/userconfig/vault-tls/vault.crt
    VAULT_TLSKEY: /vault/userconfig/vault-tls/vault.key
  ha:
    enabled: true
    raft:
      enabled: true
      config: |
        ui = true
        listener "tcp" {
          address = "[::]:8200"
          cluster_address = "[::]:8201"

          tls_cert_file = "/vault/userconfig/vault-tls/vault.crt" 
          tls_key_file  = "/vault/userconfig/vault-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-tls/vault.ca"
        }

Also, certificates have to be configured in server.ha.raft.config.
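The join step above has to be repeated on every follower pod. As a sketch, the helper below only prints the `kubectl exec` commands so they can be reviewed before running. The function name `print_join_commands` is hypothetical, and it assumes pods are named `<release>-<n>` with `<release>-0` as the leader:

```shell
# Prints (does not run) one join command per follower pod.
# $1: namespace, $2: release name, $3: total replica count.
print_join_commands() {
  namespace="$1"
  release="$2"
  replicas="$3"
  i=1
  while [ "$i" -lt "$replicas" ]; do
    # The VAULT_* variables are left unexpanded so they resolve inside the pod.
    printf '%s\n' "kubectl exec -n ${namespace} ${release}-${i} -- sh -c 'vault operator raft join -leader-ca-cert=\"@\${VAULT_CACERT}\" -leader-client-cert=\"@\${VAULT_TLSCERT}\" -leader-client-key=\"@\${VAULT_TLSKEY}\" https://${release}-0.${release}-internal:8200'"
    i=$((i + 1))
  done
}
```

Running `print_join_commands vault vault 3` prints the commands for vault-1 and vault-2; piping the output to `sh` would execute them.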

Just faced this issue today. You need to generate the CSR with the following config:

[alt_names]
DNS.1 = *.${VAULT_INTERNAL_SVC}
DNS.2 = *.${NAMESPACE}.svc.cluster.local
DNS.3 = *.${VAULT_INTERNAL_SVC}.${NAMESPACE}.svc.cluster.local

where

VAULT_RELEASE_NAME="vault"
VAULT_INTERNAL_SVC="${VAULT_RELEASE_NAME}-internal"
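A minimal sketch of rendering that section from the release name and namespace, so the variables are expanded before the text goes into csr.conf; the function name `render_alt_names` is hypothetical:

```shell
# Prints the [alt_names] section with wildcard entries for the
# headless "<release>-internal" service.
# $1: Helm release name, $2: namespace.
render_alt_names() {
  release="$1"
  namespace="$2"
  internal_svc="${release}-internal"
  cat <<EOF
[alt_names]
DNS.1 = *.${internal_svc}
DNS.2 = *.${namespace}.svc.cluster.local
DNS.3 = *.${internal_svc}.${namespace}.svc.cluster.local
EOF
}
```

For example, `render_alt_names vault vault >> "${TMPDIR}/csr.conf"` would append the section, assuming csr.conf does not already contain an [alt_names] block.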

Then you can do:

vault operator init -address https://vault-0.vault-internal.vault.svc.cluster.local:8200
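This works because the address is now a DNS name covered by one of the wildcard entries, whereas 127.0.0.1 can only be validated against an IP SAN. The sketch below is a simplified illustration of how a single-label wildcard is matched, not the exact logic a TLS library uses: it handles DNS names only and compares case-sensitively. The function name `san_matches` is hypothetical:

```shell
# Succeeds if the host name matches the DNS SAN pattern.
# A leading "*." in the pattern stands for exactly one label.
san_matches() {
  host="$1"
  pattern="$2"
  case "$pattern" in
    '*.'*)
      suffix="${pattern#\*}"      # pattern without the leading "*"
      first="${host%%.*}"         # leftmost label of the host
      rest="${host#"$first"}"     # everything after the leftmost label
      [ -n "$first" ] && [ "$rest" = "$suffix" ]
      ;;
    *)
      [ "$host" = "$pattern" ]
      ;;
  esac
}
```

With this, `san_matches vault-0.vault-internal.vault.svc.cluster.local '*.vault-internal.vault.svc.cluster.local'` succeeds, while a name with two extra labels in front of the suffix does not.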

That’s it 😃

It is also possible to set a specific server name to be used in the TLS handshake:

retry_join {
  leader_tls_servername = "vault"
}

ref. https://www.vaultproject.io/docs/concepts/integrated-storage#autojoin-with-tls-servername