linkerd2: upgrading helm managed template from ~2.8.1 to 2.9.1 failed

Bug Report

What is the issue?

linkerd check --verbose gave an error:

linkerd-webhooks-and-apisvc-tls
-------------------------------
× tap API server has valid cert
    key tls.crt needs to exist in secret linkerd-tap-tls
    see https://linkerd.io/checks/#l5d-tap-cert-valid for hints

How can it be reproduced?

We were running this modification of stable-2.8.1 (it probably wasn’t really 83ae0ccf, but this is the smallest diff I could find along the path to master):

diff -ur github/linkerd/linkerd2/charts/linkerd2/Chart.yaml ./Chart.yaml
--- github/linkerd/linkerd2/charts/linkerd2/Chart.yaml	2020-08-07 14:37:15.000000000 -0400
+++ ./Chart.yaml	2020-12-21 19:27:08.000000000 -0500
@@ -1,18 +1,17 @@
-apiVersion: "v1"
-# this version will be updated by the CI before publishing the Helm tarball
-appVersion: edge-XX.X.X
-description: Linkerd gives you observability, reliability, and security for your microservices — with no code change required.
+apiVersion: v1
+appVersion: stable-2.8.1
+description: Linkerd gives you observability, reliability, and security for your microservices
+  — with no code change required.
 home: https://linkerd.io
+icon: https://linkerd.io/images/logo-only-200h.png
 keywords:
 - service-mesh
-kubeVersion: ">=1.13.0-0"
-name: "linkerd2"
+kubeVersion: '>=1.13.0-0'
+maintainers:
+- email: cncf-linkerd-dev@lists.cncf.io
+  name: Linkerd authors
+  url: https://linkerd.io/
+name: linkerd2
 sources:
 - https://github.com/linkerd/linkerd2/
-# this version will be updated by the CI before publishing the Helm tarball
-version: 0.1.0
-icon: https://linkerd.io/images/logo-only-200h.png
-maintainers:
-  - name: Linkerd authors
-    email: cncf-linkerd-dev@lists.cncf.io
-    url: https://linkerd.io/
+version: 2.8.1
Only in github/linkerd/linkerd2/charts/linkerd2/: OWNERS
diff -ur github/linkerd/linkerd2/charts/linkerd2/README.md ./README.md
--- github/linkerd/linkerd2/charts/linkerd2/README.md	2020-12-21 19:26:08.000000000 -0500
+++ ./README.md	2020-12-21 19:27:08.000000000 -0500
@@ -187,7 +187,7 @@
 | `grafana.name`                | Name of the grafana instance Service                                                                                                                                                 | `linkerd-grafana`                             |
 | `grafana.image.name`                | Docker image name for the grafana instance                                                                                                                                                 | `gcr.io/linkerd-io/grafana`                             |
 | `grafana.resources.cpu.limit`       | Maximum amount of CPU units that the grafana container can use                                                                                                                     ||
-| `grafana.resources.cpu.request`     | Amount of CPU units that the gafana container requests                                                                                                                            ||
+| `grafana.resources.cpu.request`     | Amount of CPU units that the grafana container requests                                                                                                                            ||
 | `grafana.resources.memory.limit`    | Maximum amount of memory that grafana container can use                                                                                                                        ||
 | `grafana.resources.memory.request`  | Amount of memory that the grafana container requests                                                                                                                               ||
 
Only in .: charts
diff -ur github/linkerd/linkerd2/charts/linkerd2/templates/prometheus.yaml ./templates/prometheus.yaml
--- github/linkerd/linkerd2/charts/linkerd2/templates/prometheus.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./templates/prometheus.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -34,7 +34,7 @@
       static_configs:
       - targets: ['localhost:9090']
 
-    {{ if .Values.grafana.enabled -}} 
+    {{ if .Values.grafana.enabled -}}
     - job_name: 'grafana'
       kubernetes_sd_configs:
       - role: pod
diff -ur github/linkerd/linkerd2/charts/linkerd2/templates/proxy-injector.yaml ./templates/proxy-injector.yaml
--- github/linkerd/linkerd2/charts/linkerd2/templates/proxy-injector.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./templates/proxy-injector.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -11,6 +11,7 @@
 metadata:
   annotations:
     {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
+    secret.reloader.stakater.com/reload: "linkerd-proxy-injector-tls"
   labels:
     app.kubernetes.io/name: proxy-injector
     app.kubernetes.io/part-of: Linkerd
@@ -33,7 +34,7 @@
     metadata:
       annotations:
         {{- if empty .Values.global.cliVersion }}
-        linkerd.io/helm-release-version: {{ $.Release.Revision | quote}}
+        checksum/config: {{ include (print $.Template.BasePath "/proxy-injector-rbac.yaml") . | sha256sum }}
         {{- end }}
         {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
         {{- include "partials.proxy.annotations" .Values.global.proxy| nindent 8}}
@@ -79,7 +80,7 @@
         - mountPath: /var/run/linkerd/config
           name: config
         - mountPath: /var/run/linkerd/tls
-          name: tls
+          name: tls-projected
           readOnly: true
       - {{- include "partials.proxy" . | indent 8 | trimPrefix (repeat 7 " ") }}
       {{ if not .Values.global.cniEnabled -}}
@@ -89,11 +90,25 @@
       serviceAccountName: linkerd-proxy-injector
       volumes:
       - configMap:
+          defaultMode: 420
           name: linkerd-config
         name: config
-      - name: tls
-        secret:
-          secretName: linkerd-proxy-injector-tls
+      - name: tls-projected
+        projected:
+          defaultMode: 420
+          sources:
+          - secret:
+              name: linkerd-proxy-injector-tls
+              items:
+                - key: tls.crt
+                  path: crt.pem
+                  mode: 420
+          - secret:
+              name: linkerd-proxy-injector-tls
+              items:
+                - key: tls.key
+                  path: key.pem
+                  mode: 420
       {{ if .Values.global.controlPlaneTracing -}}
       - {{- include "partials.proxy.volumes.labels" . | indent 8 | trimPrefix (repeat 7 " ") }}
       {{ end -}}
diff -ur github/linkerd/linkerd2/charts/linkerd2/templates/smi-metrics.yaml ./templates/smi-metrics.yaml
--- github/linkerd/linkerd2/charts/linkerd2/templates/smi-metrics.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./templates/smi-metrics.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -38,7 +38,7 @@
     metadata:
       annotations:
         {{- if empty .Values.global.cliVersion }}
-        linkerd.io/helm-release-version: {{ $.Release.Revision | quote}}
+        checksum/config: {{ include (print $.Template.BasePath "/smi-metrics-rbac.yaml") . | sha256sum }}
         {{- end }}
         {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
         {{- include "partials.proxy.annotations" .Values.global.proxy| nindent 8}}
diff -ur github/linkerd/linkerd2/charts/linkerd2/templates/sp-validator.yaml ./templates/sp-validator.yaml
--- github/linkerd/linkerd2/charts/linkerd2/templates/sp-validator.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./templates/sp-validator.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -30,6 +30,7 @@
 metadata:
   annotations:
     {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
+    secret.reloader.stakater.com/reload: "linkerd-sp-validator-tls"
   labels:
     app.kubernetes.io/name: sp-validator
     app.kubernetes.io/part-of: Linkerd
@@ -52,7 +53,7 @@
     metadata:
       annotations:
         {{- if empty .Values.global.cliVersion }}
-        linkerd.io/helm-release-version: {{ $.Release.Revision | quote}}
+        checksum/config: {{ include (print $.Template.BasePath "/sp-validator-rbac.yaml") . | sha256sum }}
         {{- end }}
         {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
         {{- include "partials.proxy.annotations" .Values.global.proxy| nindent 8}}
@@ -96,6 +97,9 @@
           runAsUser: {{.Values.controllerUID}}
         volumeMounts:
         - mountPath: /var/run/linkerd/tls
+          name: tls-fixed
+          readOnly: true
+        - mountPath: /var/run/linkerd/tls-tmp
           name: tls
           readOnly: true
       - {{- include "partials.proxy" . | indent 8 | trimPrefix (repeat 7 " ") }}
@@ -103,8 +107,24 @@
       initContainers:
       - {{- include "partials.proxy-init" . | indent 8 | trimPrefix (repeat 7 " ") }}
       {{ end -}}
+      - image: busybox
+        name: fix-cert-paths
+        volumeMounts:
+        - mountPath: /var/run/linkerd/tls-tmp
+          name: tls
+          readOnly: true
+        - mountPath: /var/run/linkerd/tls
+          name: tls-fixed
+        command:
+        - "sh"
+        - "-c"
+        - |
+          ln -s /var/run/linkerd/tls-tmp/tls.crt /var/run/linkerd/tls/crt.pem &&
+          ln -s /var/run/linkerd/tls-tmp/tls.key /var/run/linkerd/tls/key.pem
       serviceAccountName: linkerd-sp-validator
       volumes:
+      - emptyDir: {}
+        name: tls-fixed
       - name: tls
         secret:
           secretName: linkerd-sp-validator-tls
diff -ur github/linkerd/linkerd2/charts/linkerd2/templates/tap.yaml ./templates/tap.yaml
--- github/linkerd/linkerd2/charts/linkerd2/templates/tap.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./templates/tap.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -33,6 +33,7 @@
 metadata:
   annotations:
     {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
+    secret.reloader.stakater.com/reload: "linkerd-tap-tls"
   labels:
     app.kubernetes.io/name: tap
     app.kubernetes.io/part-of: Linkerd
@@ -57,7 +58,7 @@
     metadata:
       annotations:
         {{- if empty .Values.global.cliVersion }}
-        linkerd.io/helm-release-version: {{ $.Release.Revision | quote}}
+        checksum/config: {{ include (print $.Template.BasePath "/tap-rbac.yaml") . | sha256sum }}
         {{- end }}
         {{.Values.global.createdByAnnotation}}: {{default (printf "linkerd/helm %s" .Values.global.linkerdVersion) .Values.global.cliVersion}}
         {{- include "partials.proxy.annotations" .Values.global.proxy| nindent 8}}
@@ -104,18 +105,37 @@
         securityContext:
           runAsUser: {{.Values.controllerUID}}
         volumeMounts:
+        - mountPath: /var/run/linkerd/config
+          name: config
         - mountPath: /var/run/linkerd/tls
+          name: tls-fixed
+          readOnly: true
+        - mountPath: /var/run/linkerd/tls-tmp
           name: tls
           readOnly: true
-        - mountPath: /var/run/linkerd/config
-          name: config
       - {{- include "partials.proxy" . | indent 8 | trimPrefix (repeat 7 " ") }}
       {{ if not .Values.global.cniEnabled -}}
       initContainers:
       - {{- include "partials.proxy-init" . | indent 8 | trimPrefix (repeat 7 " ") }}
       {{ end -}}
+      - image: busybox
+        name: fix-cert-paths
+        volumeMounts:
+        - mountPath: /var/run/linkerd/tls-tmp
+          name: tls
+          readOnly: true
+        - mountPath: /var/run/linkerd/tls
+          name: tls-fixed
+        command:
+        - "sh"
+        - "-c"
+        - |
+          ln -s /var/run/linkerd/tls-tmp/tls.crt /var/run/linkerd/tls/crt.pem &&
+          ln -s /var/run/linkerd/tls-tmp/tls.key /var/run/linkerd/tls/key.pem
       serviceAccountName: linkerd-tap
       volumes:
+      - emptyDir: {}
+        name: tls-fixed
       - configMap:
           name: linkerd-config
         name: config
diff -ur github/linkerd/linkerd2/charts/linkerd2/values.yaml ./values.yaml
--- github/linkerd/linkerd2/charts/linkerd2/values.yaml	2020-12-21 19:26:08.000000000 -0500
+++ ./values.yaml	2020-12-21 19:27:09.000000000 -0500
@@ -11,7 +11,7 @@
   controlPlaneTracing: false
 
   # control plane version. See Proxy section for proxy version
-  linkerdVersion: &linkerd_version linkerdVersionValue
+  linkerdVersion: &linkerd_version stable-2.8.1
 
   namespace: linkerd

Note: we don’t use tap, so the tap changes probably don’t work. The injector changes use projected volumes to expose the certificates under the file names linkerd expects (crt.pem and key.pem), instead of relying on a busybox init container to rename them. The stakater annotations tell the reloader to restart the pods when the certificates are replaced (in the cases where the projected volumes are used).
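
To make the renaming concrete, here is a minimal Python sketch of a volume like the tls-projected one in the proxy-injector.yaml diff. The function name projected_tls_volume is hypothetical; the key names, path names, and mode come from the diff. Unlike the diff, it lists both items under a single secret source, which is an assumption about what is equivalent and not what was deployed.

```python
# Sketch only: builds a pod-spec volume similar in intent to the
# tls-projected volume in the proxy-injector.yaml diff.

def projected_tls_volume(secret_name, volume_name="tls-projected"):
    """Return a volume dict exposing tls.crt/tls.key as crt.pem/key.pem."""
    # Keys present in the secret -> file names used in the diff.
    key_to_path = {"tls.crt": "crt.pem", "tls.key": "key.pem"}
    mode = 0o644  # 420 in decimal, the value written in the diff
    return {
        "name": volume_name,
        "projected": {
            "defaultMode": mode,
            "sources": [
                {
                    # Assumption: one secret source with two items, where
                    # the diff uses two sources with one item each.
                    "secret": {
                        "name": secret_name,
                        "items": [
                            {"key": key, "path": path, "mode": mode}
                            for key, path in key_to_path.items()
                        ],
                    }
                }
            ],
        },
    }


volume = projected_tls_volume("linkerd-proxy-injector-tls")
items = volume["projected"]["sources"][0]["secret"]["items"]
paths = {item["key"]: item["path"] for item in items}
print(paths)
```

The point of the mapping is that the secret keeps its tls.crt and tls.key keys while the container sees crt.pem and key.pem under the mount path.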

We applied this change:

diff --git a/apps/charts/linkerd2/Chart.yaml b/apps/charts/linkerd2/Chart.yaml
--- a/apps/charts/linkerd2/Chart.yaml
+++ b/apps/charts/linkerd2/Chart.yaml
@@ -1,10 +1,9 @@
 name: linkerd2
 apiVersion: v2
-appVersion: 2.8.1-stable
+appVersion: 2.9.1-stable
 description: Linkerd2
-version: 2.8.1
-# TODO: Use upstream chart when https://github.com/linkerd/linkerd2/pull/4645/commits/a48b456192f41879b2234da56d2c952e594d5127 is merged into stable
-# dependencies:
-  # - name: linkerd2
-    # version: 2.8.1
-    # repository: https://helm.linkerd.io/stable
+version: 2.9.1
+dependencies:
+  - name: linkerd2
+    version: 2.9.1
+    repository: https://helm.linkerd.io/stable

(We also deleted the charts directory, since the chart now comes from the Helm repository.)

ArgoCD applied the changes, and the pods were very upset.

So I upgraded linkerd on my Mac to 2.9.1 with brew and ran linkerd check.

Logs, error output, etc

bash-5.0$ kubectl -n linkerd get secret/linkerd-tap-tls -o yaml|perl -pne 's/:.+/: .../'
apiVersion: ...
data:
  ca.crt: ...
  tls.crt: ...
  tls.key: ...
kind: ...
metadata:
  annotations:
    cert-manager.io/alt-names: ...
    cert-manager.io/certificate-name: ...
    cert-manager.io/common-name: ...
    cert-manager.io/ip-sans: ...
    cert-manager.io/issuer-group: ...
    cert-manager.io/issuer-kind: ...
    cert-manager.io/issuer-name: ...
    cert-manager.io/uri-sans: ...
  creationTimestamp: ...
  name: ...
  namespace: ...
  resourceVersion: ...
  selfLink: ...
  uid: ...
type: ...

linkerd check output

This was the portion of the output that was interesting:

linkerd-webhooks-and-apisvc-tls
-------------------------------
× tap API server has valid cert
    key tls.crt needs to exist in secret linkerd-tap-tls
    see https://linkerd.io/checks/#l5d-tap-cert-valid for hints

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

Ok, there’s a bunch of stuff going on here and I admit I don’t totally understand what state you’re upgrading from, but here’s some relevant information:

  • We migrated from secret/linkerd-tap-tls (which uses the crt.pem and key.pem keys) to secret/linkerd-tap-k8s-tls (which uses the tls.crt and tls.key keys)
  • For backwards compatibility, we’ll still read the old secret if it exists
  • There seems to be a bug in the healthcheck error message: it looks for crt.pem in the old-style secret but prints an error about tls.crt
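
The mismatch described in these points can be sketched in a few lines of Python, assuming the secret has been loaded as a dict (for example from the JSON output of kubectl get secret). The function name and the layout labels are made up for illustration; this is not Linkerd’s check code.

```python
# Key names are taken from the points above; everything else is hypothetical.
LEGACY_KEYS = {"crt.pem", "key.pem"}   # old-style secret/linkerd-tap-tls
K8S_TLS_KEYS = {"tls.crt", "tls.key"}  # secret/linkerd-tap-k8s-tls


def secret_key_layout(secret):
    """Classify a Secret manifest (dict) by which TLS key names it carries."""
    keys = set(secret.get("data") or {})
    if LEGACY_KEYS <= keys:
        return "legacy"
    if K8S_TLS_KEYS <= keys:
        return "k8s-tls"
    return "unknown"


# The secret from this report: named like the old secret, keyed like the new one.
reported = {
    "metadata": {"name": "linkerd-tap-tls"},
    "data": {"ca.crt": "...", "tls.crt": "...", "tls.key": "..."},
}
print(secret_key_layout(reported))  # k8s-tls
```

A secret with the old name but the new key names is the combination that would trip a reader looking for crt.pem.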

With all that said, doing a fresh install of Linkerd 2.9.1 might be the easiest way to get back into a good state. If that’s not feasible, you may be able to simply rename the keys in your secret to crt.pem and key.pem to bring it back into the format that Linkerd expects for this old type of secret.
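
As a rough illustration of the key rename, here is a minimal Python sketch that rewrites a Secret manifest held as a dict. The function name rename_tls_keys is hypothetical, and the example manifest only mirrors the redacted output shown in the report. One caveat that is an assumption about the setup, not something confirmed here: the reported secret carries cert-manager annotations, so cert-manager may write the original keys back when it renews the certificate.

```python
import copy

# Mapping suggested above: new-style key -> old-style key.
RENAMES = {"tls.crt": "crt.pem", "tls.key": "key.pem"}


def rename_tls_keys(secret, renames=RENAMES):
    """Return a copy of a Secret manifest with its data keys renamed.

    Values (base64 strings) are carried over untouched; keys not listed
    in `renames`, such as ca.crt, are left as they are.
    """
    renamed = copy.deepcopy(secret)
    data = renamed.get("data") or {}
    renamed["data"] = {renames.get(key, key): value for key, value in data.items()}
    return renamed


before = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "linkerd-tap-tls", "namespace": "linkerd"},
    "data": {"ca.crt": "...", "tls.crt": "...", "tls.key": "..."},
}
after = rename_tls_keys(before)
print(sorted(after["data"]))  # ['ca.crt', 'crt.pem', 'key.pem']
```

The input dict is not modified, so the original manifest can be kept as a backup before the renamed one is written out and applied.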

So the check is complaining that tls.crt doesn’t exist in secret/linkerd-tap-tls but when you look with kubectl, tls.crt does exist in secret/linkerd-tap-tls? Very strange…