k3s: K3s v1.27.7+k3s1 and v1.28.3+k3s1 bundle old Traefik CRDs, causing kubernetes api connection issues

Environmental Info: K3s Version:

k3s version v1.27.7+k3s1 (b6f23014)
go version go1.20.10

Describe the bug:

Sorry for omitting some steps of the issue template, but I am reasonably sure that they don’t apply.

We just upgraded from v1.26.6+k3s1 to v1.27.7+k3s1, and now the bundled Traefik instance can no longer connect to the Kubernetes API, probably since https://github.com/k3s-io/k3s/commit/3abc8b82ed0779ebaa5d0ca00165408ad085cc8f#diff-950b8e60144da8e48c9c65a3e25d0c4cd3264400aca9bbf94d6f30e7dc2f030c

The issue seems to be that, since Traefik 2.10, Traefik expects its CRDs under a new API group (traefik.io instead of traefik.containo.us). The migration from 2.9 to 2.10 is explained here: https://doc.traefik.io/traefik/migration/v2/#kubernetes-crds

K3s bundles the Traefik Helm-Chart at version 21.2.1+up21.2.0 that still includes the old CRDs, which probably is the reason why the bundled RBAC doesn’t apply anymore to Traefik 2.10.

About this issue

  • State: closed
  • Created 8 months ago
  • Reactions: 20
  • Comments: 22 (6 by maintainers)

Most upvoted comments

Another way to fix this is to:

  • Manually add missing CRDs:
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
  • Add traefik.io apiGroup to traefik-kube-system Cluster Role:
...
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - traefik.containo.us
      - traefik.io # <-- add this
 ...
  • Change the apiVersion of all Traefik resources from traefik.containo.us/v1alpha1 to traefik.io/v1alpha1.

Also, don’t forget to restart the Traefik deployment.
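
The last two steps of this workaround could be scripted roughly as follows. This is only a sketch: the function names migrate_api_group and restart_traefik are made up for illustration, and the deployment name traefik in kube-system is an assumption based on the pod names shown in this issue.

```shell
#!/bin/sh

# Hypothetical helper: rewrite legacy Traefik apiVersion lines read from stdin
# to the new API group. Lines with any other apiVersion pass through unchanged.
migrate_api_group() {
  sed 's#^\([[:space:]]*apiVersion:[[:space:]]*\)traefik\.containo\.us/v1alpha1#\1traefik.io/v1alpha1#'
}

# Hypothetical helper: restart the bundled Traefik deployment so it re-reads
# the CRDs and RBAC. Assumes the deployment is named "traefik" in kube-system.
restart_traefik() {
  kubectl -n kube-system rollout restart deployment/traefik
}
```

For example, migrate_api_group < ingressroutes.yaml | kubectl apply -f - (where ingressroutes.yaml is a placeholder for your own manifest), followed by restart_traefik.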

This issue also affects v1.28.3+k3s1

I am able to reproduce the issues on these latest October releases using the following steps:

  1. Create a VM that has a public IP.
  2. Configure k3s to use the public IP:
# /etc/rancher/k3s/config.yaml
node-external-ip: 1.2.3.4 # <-- actually supply the public IP of the VM
  3. Start k3s, for example: curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.27.7+k3s1 sh -
  4. After k3s is up and running, deploy Traefik IngressRoute resources. The following YAML is configured for hardened environments, so it should work regardless of setup. Be sure to change the IPs in the IngressRoutes to your public IP!
apiVersion: v1
kind: Namespace
metadata:
  name: test-ingress
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.25
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: v1.25
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: v1.25
---
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 8000
    - protocol: TCP
      name: admin
      port: 8080
    - protocol: TCP
      name: websecure
      port: 4443
  selector:
    app: traefik
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 80
  selector:
    app: whoami
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: test-ingress
  name: traefik-ingress-controller
---
kind: Deployment
apiVersion: apps/v1
metadata:
  namespace: test-ingress
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: simpleingressroute
  namespace: test-ingress
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`1.2.3.4.nip.io`) && PathPrefix(`/notls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroutetls
  namespace: test-ingress
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`1.2.3.4.nip.io`) && PathPrefix(`/tls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
  tls:
    certResolver: myresolver
  5. Attempt to access the resources. The queries below are shown with their EXPECTED results:
$ curl -k https://1.2.3.4.nip.io/tls
Hostname: whoami-76c79d59c8-qtp8m
IP: 127.0.0.1
IP: ::1
IP: 10.42.2.15
IP: fe80::ec9d:acff:fed7:62db
RemoteAddr: 10.42.0.8:35200
GET /tls HTTP/1.1
Host: 1.2.3.4.nip.io
User-Agent: curl/8.1.2
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.7
X-Forwarded-Host: 1.2.3.4.nip.io
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-64f55bb67d-m579f
X-Real-Ip: 10.42.0.7

$ curl -k http://1.2.3.4.nip.io/notls
Hostname: whoami-76c79d59c8-9xxcd
IP: 127.0.0.1
IP: ::1
IP: 10.42.3.7
IP: fe80::7cbc:4eff:fe7b:ce15
RemoteAddr: 10.42.0.8:47942
GET /notls HTTP/1.1
Host: 1.2.3.4.nip.io
User-Agent: curl/8.1.2
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.7
X-Forwarded-Host: 1.2.3.4.nip.io
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-64f55bb67d-m579f
X-Real-Ip: 10.42.0.7

The actual result in the current state is:

$ curl -k https://1.2.3.4.nip.io/tls
404 page not found

$ curl -k http://1.2.3.4.nip.io/notls
404 page not found

And the traefik pod logs are flooded with similar errors:

$ k logs -n kube-system pod/traefik-7fbbb44c44-wnqp8
...
E1102 18:24:53.049953       1 reflector.go:140] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1alpha1.IngressRouteUDP: failed to list *v1alpha1.IngressRouteUDP: ingressrouteudps.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "ingressrouteudps" in API group "traefik.io" at the cluster scope
W1102 18:25:01.471019       1 reflector.go:424] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1alpha1.TLSStore: tlsstores.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "tlsstores" in API group "traefik.io" at the cluster scope
E1102 18:25:01.471147       1 reflector.go:140] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1alpha1.TLSStore: failed to list *v1alpha1.TLSStore: tlsstores.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "tlsstores" in API group "traefik.io" at the cluster scope
W1102 18:25:07.205565       1 reflector.go:424] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1alpha1.IngressRouteTCP: ingressroutetcps.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "ingressroutetcps" in API group "traefik.io" at the cluster scope
E1102 18:25:07.205611       1 reflector.go:140] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1alpha1.IngressRouteTCP: failed to list *v1alpha1.IngressRouteTCP: ingressroutetcps.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "ingressroutetcps" in API group "traefik.io" at the cluster scope
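
To see at a glance which resources are being denied in logs like the above, something along these lines may help. It is a sketch only: denied_resources is a made-up helper name, and it assumes the log lines keep the "cannot list resource ... in API group ..." wording shown here.

```shell
#!/bin/sh

# Hypothetical helper: read Traefik pod logs on stdin and print each unique
# "resource.group" pair that the service account was forbidden to list.
denied_resources() {
  sed -n 's/.*cannot list resource "\([^"]*\)" in API group "\([^"]*\)".*/\1.\2/p' | sort -u
}
```

Usage might look like kubectl logs -n kube-system deploy/traefik | denied_resources. A single permission can also be checked directly with kubectl auth can-i list tlsstores.traefik.io --as=system:serviceaccount:kube-system:traefik.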

Validated using k3s version v1.28.3-rc1+k3s2 in both fresh-install and upgrade scenarios.

ec2-user@ip-172-31-13-175:~> k3s -v
k3s version v1.28.3+k3s1 (49411e70)
go version go1.20.10
ec2-user@ip-172-31-13-175:~> kubectl get nodes
NAME               STATUS   ROLES                       AGE     VERSION
ip-172-31-13-175   Ready    control-plane,etcd,master   7h44m   v1.28.3+k3s1
ip-172-31-14-82    Ready    control-plane,etcd,master   7h42m   v1.28.3+k3s1
ip-172-31-5-126    Ready    control-plane,etcd,master   7h42m   v1.28.3+k3s1
ip-172-31-8-144    Ready    <none>                      7h41m   v1.28.3+k3s1

ec2-user@ip-172-31-13-175:~> k3s -v
k3s version v1.28.3-rc1+k3s2 (1ae053d9)
go version go1.20.10
ec2-user@ip-172-31-13-175:~> kubectl get nodes
NAME               STATUS   ROLES                       AGE   VERSION
ip-172-31-13-175   Ready    control-plane,etcd,master   8h    v1.28.3-rc1+k3s2
ip-172-31-14-82    Ready    control-plane,etcd,master   8h    v1.28.3-rc1+k3s2
ip-172-31-5-126    Ready    control-plane,etcd,master   8h    v1.28.3-rc1+k3s2
ip-172-31-8-144    Ready    <none>                      8h    v1.28.3-rc1+k3s2
ec2-user@ip-172-31-13-175:~> kubectl get crd |grep traefik
ingressroutes.traefik.containo.us       2023-11-06T08:03:50Z
ingressroutes.traefik.io                2023-11-06T15:50:15Z
ingressroutetcps.traefik.containo.us    2023-11-06T08:03:50Z
ingressroutetcps.traefik.io             2023-11-06T15:50:15Z
ingressrouteudps.traefik.containo.us    2023-11-06T08:03:50Z
ingressrouteudps.traefik.io             2023-11-06T15:50:16Z
middlewares.traefik.containo.us         2023-11-06T08:03:50Z
middlewares.traefik.io                  2023-11-06T15:50:16Z
middlewaretcps.traefik.containo.us      2023-11-06T08:03:50Z
middlewaretcps.traefik.io               2023-11-06T15:50:16Z
serverstransports.traefik.containo.us   2023-11-06T08:03:50Z
serverstransports.traefik.io            2023-11-06T15:50:16Z
serverstransporttcps.traefik.io         2023-11-06T15:50:16Z
tlsoptions.traefik.containo.us          2023-11-06T08:03:50Z
tlsoptions.traefik.io                   2023-11-06T15:50:16Z
tlsstores.traefik.containo.us           2023-11-06T08:03:50Z
tlsstores.traefik.io                    2023-11-06T15:50:16Z
traefikservices.traefik.containo.us     2023-11-06T08:03:50Z
traefikservices.traefik.io              2023-11-06T15:50:16Z
ec2-user@ip-172-31-13-175:~> kubectl get clusterrole -o yaml |grep -i apiGroup -A 2|grep traefik
    - traefik.io
    - traefik.containo.us
ec2-user@ip-172-31-13-175:~> 

➜  ~ curl -k http://<IP>.nip.io/notls
Hostname: whoami-8c9864b56-6jk6c
IP: 127.0.0.1
IP: ::1
IP: 10.42.3.7
IP: fe80::24bd:6ff:fe28:e940
RemoteAddr: 10.42.1.12:43196
GET /notls HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.7
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-67b97c588-mkf6z
X-Real-Ip: 10.42.0.7

➜  ~ curl -k https://<IP>.nip.io/tls 
Hostname: whoami-8c9864b56-6jk6c
IP: 127.0.0.1
IP: ::1
IP: 10.42.3.7
IP: fe80::24bd:6ff:fe28:e940
RemoteAddr: 10.42.1.12:43196
GET /tls HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.7
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-67b97c588-mkf6z
X-Real-Ip: 10.42.0.7

Same issue here. I managed to fix it temporarily by rolling back the installed Helm release, but I expect this will be overwritten by the next automated k3s update. We’ll need a more permanent solution.

First of all, I can confirm that rolling back the image version to 2.9.10 works.

To make it permanent (if I understood correctly), we can customize the chart through a HelmChartConfig:

# /var/lib/rancher/k3s/server/manifests/traefik-config.yaml

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    image:
      tag: "2.9.10"
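
After applying an override like this, it may be worth confirming which image tag the deployment actually runs. A rough sketch, assuming the deployment is named traefik in kube-system and has a single container (traefik_image_tag is a made-up helper name):

```shell
#!/bin/sh

# Hypothetical helper: print the tag of the image used by the first container
# of the bundled Traefik deployment. If the image reference has no tag, the
# whole reference is printed unchanged.
traefik_image_tag() {
  kubectl -n kube-system get deployment traefik \
    -o jsonpath='{.spec.template.spec.containers[0].image}' | sed 's/.*://'
}
```

If the override took effect, traefik_image_tag should print the pinned tag rather than the 2.10.x tag bundled with the affected releases.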

Can someone please confirm exactly what is and is not working, and provide some specific steps showing what resources you’re using that no longer work? Are you saying that the legacy CRDs no longer function, and the new CRDs must be present for traefik to use either version?

This upgrade bumped the Traefik image to docker.io/rancher/mirrored-library-traefik:2.10.5. It works only with the new CRDs in the traefik.io apiGroup and ignores traefik.containo.us resources. The bundled Traefik Helm chart includes only the traefik.containo.us CRDs, and the traefik-kube-system ClusterRole is missing the traefik.io apiGroup. You can see the errors in the Traefik pod logs, something like:

Failed to list *v1alpha1.TLSStore: tlsstores.traefik.io is forbidden: User "system:serviceaccount:kube-system:traefik" cannot list resource "tlsstores" in API group "traefik.io" at the cluster scope

Examples:

Broken:

apiVersion: traefik.containo.us/v1alpha1 # <-- old apiGroup
kind: TLSStore
metadata:
  name: default
  namespace: kube-system
spec:
  defaultCertificate:
    secretName: wildcard-secret

Working (after the fix I mentioned earlier):

apiVersion: traefik.io/v1alpha1
kind: TLSStore
metadata:
  name: default
  namespace: kube-system
spec:
  defaultCertificate:
    secretName: wildcard-secret
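
To check whether a cluster is in this state, the CRD list can be compared across the two API groups. The following is a sketch under the assumption that every legacy CRD has a same-named counterpart in traefik.io; missing_new_crds is a made-up helper name.

```shell
#!/bin/sh

# Hypothetical helper: read `kubectl get crd` output on stdin and print the
# traefik.io CRD names that are absent although the legacy
# traefik.containo.us CRD of the same resource exists.
missing_new_crds() {
  awk '
    $1 ~ /\.traefik\.containo\.us$/ { sub(/\.traefik\.containo\.us$/, "", $1); legacy[$1] = 1 }
    $1 ~ /\.traefik\.io$/           { sub(/\.traefik\.io$/, "", $1); newer[$1] = 1 }
    END { for (r in legacy) if (!(r in newer)) print r ".traefik.io" }
  ' | sort
}
```

Running kubectl get crd | missing_new_crds prints nothing when every legacy CRD has a traefik.io counterpart.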

@lifo9 @brandond According to the official README, the Traefik Helm chart installs both sets of CRDs in versions >= v23: https://github.com/traefik/traefik-helm-chart#crds-support-of-traefik-proxy. The chart version v21 that k3s installs by default should only support the traefik.containo.us API group.

Validated using commit id 1ae053d9447229daf8bbd2cd5adf89234e203bcc: ingress routes using either the old or the new API group work as expected, tested with the above YAML modified to use each API group in turn.

Validated the upgrade from v1.27.6+k3s1 (docker.io/rancher/mirrored-library-traefik 2.9.10) to commit 1ae053d9447229daf8bbd2cd5adf89234e203bcc (docker.io/rancher/mirrored-library-traefik 2.10.5).

$ kubectl get crd |grep traefik
traefikservices.traefik.io              2023-11-02T23:59:26Z
ingressrouteudps.traefik.containo.us    2023-11-02T23:59:26Z
middlewaretcps.traefik.io               2023-11-02T23:59:26Z
ingressroutetcps.traefik.containo.us    2023-11-02T23:59:26Z
ingressroutes.traefik.containo.us       2023-11-02T23:59:26Z
ingressrouteudps.traefik.io             2023-11-02T23:59:26Z
tlsstores.traefik.containo.us           2023-11-02T23:59:26Z
middlewares.traefik.containo.us         2023-11-02T23:59:26Z
traefikservices.traefik.containo.us     2023-11-02T23:59:26Z
tlsoptions.traefik.containo.us          2023-11-02T23:59:26Z
middlewaretcps.traefik.containo.us      2023-11-02T23:59:26Z
serverstransports.traefik.containo.us   2023-11-02T23:59:26Z
tlsstores.traefik.io                    2023-11-02T23:59:26Z
middlewares.traefik.io                  2023-11-02T23:59:26Z
ingressroutes.traefik.io                2023-11-02T23:59:26Z
ingressroutetcps.traefik.io             2023-11-02T23:59:26Z
tlsoptions.traefik.io                   2023-11-02T23:59:26Z
serverstransports.traefik.io            2023-11-02T23:59:26Z
serverstransporttcps.traefik.io         2023-11-02T23:59:26Z
$ kubectl get clusterrole -o yaml |grep -i apiGroup -A 2|grep traefik
    - traefik.io
    - traefik.containo.us

Using old apiGroup

apiVersion: v1
kind: Namespace
metadata:
  name: test-ingress
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.25
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: v1.25
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: v1.25
---
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 8000
    - protocol: TCP
      name: admin
      port: 8080
    - protocol: TCP
      name: websecure
      port: 4443
  selector:
    app: traefik
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 80
  selector:
    app: whoami
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: test-ingress
  name: traefik-ingress-controller
---
kind: Deployment
apiVersion: apps/v1
metadata:
  namespace: test-ingress
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: simpleingressroute
  namespace: test-ingress
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`<IP>.nip.io`) && PathPrefix(`/notls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroutetls
  namespace: test-ingress
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`<IP>.nip.io`) && PathPrefix(`/tls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
  tls:
    certResolver: myresolver
➜  ~ curl -k https://<IP>.nip.io/tls
Hostname: whoami-8c9864b56-s62ct
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.18
IP: fe80::6034:8ff:feb2:d55a
RemoteAddr: 10.42.0.8:53274
GET /tls HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.1
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-67b97c588-jxc5t
X-Real-Ip: 10.42.0.1

➜  ~ curl -k http://<IP>.nip.io/notls   
Hostname: whoami-8c9864b56-s62ct
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.18
IP: fe80::6034:8ff:feb2:d55a
RemoteAddr: 10.42.0.8:53274
GET /notls HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.1
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-67b97c588-jxc5t
X-Real-Ip: 10.42.0.1

Using new apiGroup

apiVersion: v1
kind: Namespace
metadata:
  name: test-ingress
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.25
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: v1.25
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: v1.25
---
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 8000
    - protocol: TCP
      name: admin
      port: 8080
    - protocol: TCP
      name: websecure
      port: 4443
  selector:
    app: traefik
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: test-ingress
spec:
  ports:
    - protocol: TCP
      name: web
      port: 80
  selector:
    app: whoami
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: test-ingress
  name: traefik-ingress-controller
---
kind: Deployment
apiVersion: apps/v1
metadata:
  namespace: test-ingress
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: simpleingressroute
  namespace: test-ingress
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`<IP>.nip.io`) && PathPrefix(`/notls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroutetls
  namespace: test-ingress
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`<IP>.nip.io`) && PathPrefix(`/tls`)
    kind: Rule
    services:
    - name: whoami
      port: 80
  tls:
    certResolver: myresolver

➜  ~ curl -k https://<IP>.nip.io/tlsnew 
Hostname: whoami-8c9864b56-wptp6
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.20
IP: fe80::6801:96ff:fe88:7f0a
RemoteAddr: 10.42.0.8:45114
GET /tlsnew HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.1
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: traefik-67b97c588-jxc5t
X-Real-Ip: 10.42.0.1

➜  ~ curl -k http://<IP>.nip.io/notlsnew
Hostname: whoami-8c9864b56-xxs5r
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.19
IP: fe80::d412:ff:fe12:8525
RemoteAddr: 10.42.0.8:47482
GET /notlsnew HTTP/1.1
Host: <IP>.nip.io
User-Agent: curl/7.64.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.1
X-Forwarded-Host: <IP>.nip.io
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-67b97c588-jxc5t
X-Real-Ip: 10.42.0.1

I encountered the same issue; it can be temporarily fixed by running kubectl edit -n kube-system helmchart traefik and rolling back the image version to 2.9.10.