traefik: v2.8.2 go panic

Welcome!

  • Yes, I’ve searched similar issues on GitHub and didn’t find any.
  • Yes, I’ve searched similar issues on the Traefik community forum and didn’t find any.

What did you do?

Watchtower upgraded to 2.8.2, I’m sourcing latest. Upgrade should have gone smoothly as usual.

What did you see instead?

Go panic. I can post the full stack trace if necessary; it’s very large and hard to bound.

What version of Traefik are you using?

Version:      2.8.2
Codename:     vacherin
Go version:   go1.19
Built:        2022-08-11T14:55:50Z
OS/Arch:      linux/amd64

What is your environment & configuration?

Docker provider, cannot provide config (company/org). 2.8.1 works as expected.

If applicable, please paste the log output in DEBUG level

 time="2022-08-11T17:16:16-03:00" level=error msg="Error in Go routine: runtime error: slice bounds out of range [2:1]"
traefik-traefik-1  | time="2022-08-11T17:16:16-03:00" level=error msg="Stack: goroutine 29 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x65\ngithub.com/traefik/traefik/v2/pkg/safe.defaultRecoverGoroutine({0x36a75c0?, 0xc0007e40c0})\n\tgithub.com/traefik/traefik/v2/pkg/safe/routine.go:66 +0xa5\ngithub.com/traefik/traefik/v2/pkg/safe.GoWithRecover.func1.1()\n\tgithub.com/traefik/traefik/v2/pkg/safe/routine.go:56 +0x36\npanic({0x36a75c0, 0xc0007e40c0})\n\truntime/panic.go:884 +0x212\ngithub.com/traefik/paerser/parser.filler.setSlice({{0x19?, {0x3989c0e?, 0x0?}}}, {0x2ff49c0?, 0xc0004c91f8?, 0x355146f?}, 0xc0004d86c0)\n\tgithub.com/traefik/paerser@v0.1.6/parser/element_fill.go:157 +0xaa5\ngithub.com/traefik/paerser/pa

Lots more, typical Go stack trace. I can’t reproduce this frequently; I need to get this server back to production.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 17
  • Comments: 96 (29 by maintainers)

Most upvoted comments

If this sort of instability is to be expected perhaps there should be a stable tag to accompany tested versions? I’d prefer not to have to manually pin to each minor version to prevent unrecovered panics in production.

Because of the amount of discussion on this topic, I will try to summarize it (again).

In v2.8.2, the parser of dynamic configuration files can detect malformed configuration. The parser will detect when a string is used instead of an array.

example:

http:
  routers:
    example:
      entryPoints: websecure ## <-- INVALID

http:
  routers:
    example:
      entryPoints:
        - websecure  ## <-- VALID

To detect the problems, you can use JSON schema validation https://www.schemastore.org/json/. The schema validation can be integrated into your IDE/text editor.

In v2.8.3, we changed the behavior to allow invalid configurations (this is not a revert or a rollback but an extended behavior). We will remove this (the support of invalid configurations) in a future release (v2.9 or later). So we recommend validating your configuration with the JSON schema of Traefik and fixing the invalid configurations.
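
To make the lenient behavior described above concrete, here is a minimal Python sketch of one way a parser could accept a bare string where a list is expected, by wrapping it in a one-element list. The helper name `as_list` is hypothetical, and this is an assumption about the general idea only, not Traefik's actual Go implementation.

```python
def as_list(value):
    """Return value as a list, wrapping a bare string in a one-element list.

    Hypothetical helper for illustration only; not Traefik's parser.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


router = {"entryPoints": "websecure"}  # the invalid string form
router["entryPoints"] = as_list(router["entryPoints"])
print(router)  # {'entryPoints': ['websecure']}
```

Since this tolerance is announced as temporary, fixing the configuration itself remains the safer route.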

Because of the amount of discussion on this topic, I will try to summarize it.

In v2.8.2, the parser of dynamic configuration files can detect malformed configuration. The parser will detect when a string is used instead of an array.

example:

http:
  routers:
    example:
      entryPoints: websecure ## <-- INVALID

http:
  routers:
    example:
      entryPoints:
        - websecure  ## <-- VALID

To detect the problems, you can use JSON schema validation https://www.schemastore.org/json/.

The schema validation can be integrated into your IDE/text editor.

If you need help, you can come to the forum: https://community.traefik.io/

I will create a patch that will return an error instead of panicking.

But your configuration needs to be changed.

does that mean a patch release caused a breaking change?

The previous behavior (<2.8.2) was a bug, so it’s not a breaking change, now we just detect invalid configuration.

So apparently I have no errors in my traefik or dynamic config (at least nothing gets underlined by the schemastore plugin in VSCode), but it still doesn’t work with 2.8.2 😕

The schemastore extension is related to json files, you need the vscode-yaml extension to validate yaml files using schemastore.

In order to enable a specific JSON schema, the quickest way is through modeline, so in dynamic config file, put this as the first line:

# set yaml schema through modeline (vscode-yaml extension): traefik dynamic schema
# yaml-language-server: $schema=https://json.schemastore.org/traefik-v2-file-provider.json

For the static config you would use this:

# set yaml schema through modeline (vscode-yaml extension): traefik static schema
# yaml-language-server: $schema=https://json.schemastore.org/traefik-v2.json
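
If you have many config files, adding the modeline by hand gets tedious. The following is a small Python sketch that prepends it to a file's text when it is missing. The function name `with_modeline` and the constant names are made up for this example; the two schema URLs are the ones quoted above.

```python
DYNAMIC_SCHEMA = "https://json.schemastore.org/traefik-v2-file-provider.json"
STATIC_SCHEMA = "https://json.schemastore.org/traefik-v2.json"


def with_modeline(yaml_text, schema_url):
    """Return yaml_text with a yaml-language-server modeline as its first line.

    If the text already contains a modeline anywhere, it is returned unchanged.
    """
    if "yaml-language-server:" in yaml_text:
        return yaml_text
    return f"# yaml-language-server: $schema={schema_url}\n{yaml_text}"


print(with_modeline("http:\n  routers: {}\n", DYNAMIC_SCHEMA))
```

Calling it twice on the same text is harmless, because the second call finds the existing modeline and leaves the text alone.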

I put your dynamic config in my VSCode with the vscode-yaml extension, and it highlighted all the errors, including the crowdsec one that wasn’t mentioned (it should be forwardAuth, with an uppercase A).

I documented the problem into the migration guide: https://doc.traefik.io/traefik/migration/v2/#v283

We will add a link to this page inside the release note of v2.8.3

I missed the point of the series of errors, sorry and thank you very much

You can use JSON schema validation.

https://www.schemastore.org/json/

Thanks a lot for the hint. I installed the vscode-yaml extension, which also supports the SchemaStore repo.

It immediately found the issue in my big dynamic config file:

problem (string)

hostsProxyHeaders: "X-Forwarded-Host"

fix (array/list)

hostsProxyHeaders: 
  - "X-Forwarded-Host"

Now v2.8.2 starts correctly.

I think this information is inaccurate. A config which uses the string form of entryPoints on a router (example below) that worked on 2.8.1 does not work on 2.8.2, but works on 2.8.3 without any errors, routing the same as on 2.8.1.

You didn’t read the whole thread: in 2.8.2 the errors have become blocking, in 2.8.3 they reverted this. Read this post and the following replies of the devs.

There are no unidentified issues.

Thanks Ludovic. I’ll wait for the merge then test again.

@ldez @VladyslavVolkov it’s working great: with the updated schemas, I found another error I had in the dynamic config: contentsecuritypolicy vs contentSecurityPolicy.

Great job gentlemen, and again: thanks a lot. 😃

@VladyslavVolkov @ldez I was thinking: would it be possible to create a simple webpage with two sections (static and dynamic) where the users could paste their configs and they would be validated using YAML Language Server, that supports schemastore?

I tried to look for an online YAML validator with schemastore support, but couldn’t find one. I think it would really help traefik users.

I updated the schema to be strict for all properties. https://github.com/SchemaStore/schemastore/pull/2419

I’m currently working on it.

@alexdelprete I see what you mean - found that additionalProperties are not forbidden on http router object.

Could you please check whether my assumption is correct? You should point your YAML config to this schema URL, which is not in schemastore yet: https://raw.githubusercontent.com/VladyslavVolkov/schemastore/traefik-schema-update/src/schemas/json/traefik-v2-file-provider.json
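
To illustrate why the additionalProperties point matters: when a schema does not forbid extra properties, a misspelled key is silently accepted as an "extra" one. The Python sketch below mimics the strict case for a single object. It is not a JSON Schema implementation; `unknown_keys` and `ROUTER_KEYS` are hypothetical names, the key set is only the subset of router properties that appears in this thread, and the host name is a placeholder.

```python
def unknown_keys(obj, allowed):
    """Return the keys of obj that are not in allowed, sorted.

    Mimics what a schema with "additionalProperties": false would reject
    for one object; it does not validate values or nested objects.
    """
    return sorted(k for k in obj if k not in allowed)


# Assumed subset of router properties, taken from the configs in this thread.
ROUTER_KEYS = {"entryPoints", "rule", "service", "middlewares", "tls"}

router = {"entrypoints": ["https"], "rule": "Host(`example.tld`)", "service": "app"}
print(unknown_keys(router, ROUTER_KEYS))  # ['entrypoints']
```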

@ldez @alexdelprete thanks, guys, for the comments. I’ve updated both schemas (static/dynamic) in the PR mentioned above. Let me know if additional changes are required.

I did not think that you are a validation system but every tool I used for YAML-Validation said it was all perfectly valid… I still do not fully understand why this is so different but I did correct the errors you pointed out and now it works again, so thank you very much 👍

I’m not a validation system 😄

homebridge.yml
http:
  routers:
    portainer:
      entryPoints:
        - https  ## <--- THE FIX
      rule: Host(`homebridge.****.tld`)
      tls:
        certResolver: letsencrypt
        options: myTLSOptions
      service: homebridge
      middlewares:
        - chain-no-auth

  services:
    homebridge:
      loadBalancer:
        servers:
          - url: http://192.168.0.250:8581/
        passHostHeader: true

unraid.yml
http:
  routers:
    unraid:
      entryPoints:
        - https
      rule: Host(`unraid.****.tld`)
      tls:
        certResolver: letsencrypt
        options: myTLSOptions
      service: unraid
      middlewares:
        - chain-oauth
  
  services:
    unraid:
      loadBalancer:
        servers:
        - url: http://192.168.0.250:8880/
        passHostHeader: true

Use the JSON schema validation from schemastore, entrypoints must be entryPoints.

To get more help, please go to the forum: https://community.traefik.io/
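
Casing mistakes such as entrypoints vs entryPoints are easy to miss by eye. As a rough aid, the Python sketch below suggests the expected casing for a key that differs from a known name only by case. `KNOWN_KEYS` and `suggest_casing` are hypothetical names, and the list contains only keys mentioned in this thread, so it is far from complete.

```python
# Key names mentioned in this thread; an assumption, not the full Traefik schema.
KNOWN_KEYS = [
    "entryPoints",
    "middlewares",
    "forwardAuth",
    "contentSecurityPolicy",
    "hostsProxyHeaders",
]


def suggest_casing(key, known=KNOWN_KEYS):
    """Return the correctly cased name if key differs only by case, else None."""
    by_lower = {name.lower(): name for name in known}
    match = by_lower.get(key.lower())
    if match is not None and match != key:
        return match
    return None


print(suggest_casing("entrypoints"))  # entryPoints
print(suggest_casing("entryPoints"))  # None (already correct)
```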

The main problem is the indentation of the section http.

You have problems with entryPoints and middlewares.

Use the JSON schema validation from schemastore

To get more help, please go to the forum: https://community.traefik.io/

tcp:
  routers:
    openvpn:
      entryPoints:
       - https
      rule: HostSNI(`*`)
      service: openvpn
  services:
    openvpn:
      loadBalancer:
        servers:
          - address: '192.168.10.14:443'
http:
  routers:
    traefik:
      entryPoints:
      - https
      rule: (Host(`traefik.mydomain.co.uk`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`)))
      service: api@internal
      middlewares:
        - auth
        - secure_https
      tls:
        certResolver: myresolver
        domains:
          - main: '*.mydomain.co.uk'
            sans:
              - mydomain.co.uk
    sab:
      entryPoints:
        - https
      rule: Host(`sab.mydomain.co.uk`)
      service: sab
      middlewares:
        - secure_https
      tls: {}
    radarr:
      entryPoints:
        - https
      rule: Host(`radarr.mydomain.co.uk`)
      service: radarr
      middlewares:
        - secure_https
      tls: {}
    trans:
      entryPoints:
        - https
      rule: Host(`trans.mydomain.co.uk`)
      service: trans
      middlewares:
        - secure_https
      tls: {}
    sonarr:
      entryPoints:
        - https
      rule: Host(`sonarr.mydomain.co.uk`)
      service: sonarr
      middlewares:
        - secure_https
      tls: {}
    jellyfin:
      entryPoints:
        - https
      rule: Host(`jellyfin.mydomain.co.uk`)
      service: jellyfin
      middlewares:
        - secure_https
      tls: {}
    prowlarr:
      tls: {}
      entryPoints:
        - https
      rule: Host(`prowlarr.mydomain.co.uk`)
      service: prowlarr
      middlewares:
        - secure_https
    readarr:
      entryPoints:
      - https
      rule: Host(`readarr.mydomain.co.uk`)
      service: readarr
      middlewares:
        - secure_https
      tls: {}
  middlewares:
    auth:
      basicAuth:
        users:
          - 'me:something'
    secure_https:
      headers:
        frameDeny: true
        browserXssFilter: true
        stsSeconds: 31536000
        stsIncludeSubdomains: true
        stsPreload: true
        forceSTSHeader: true
        contentTypeNosniff: true
        customResponseHeaders:
          X-Robots-Tag: 'noindex,nofollow,nosnippet,noarchive,notranslate,noimageindex'
    ipwhitelist:
      ipWhiteList:
        sourceRange:
        - 127.0.0.1/32
        - 192.168.1.0/24
        - 192.168.5.0/24
        - 192.168.10.0/24
        - 10.8.0.0/24
  services:
    sab:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.6:8080/'
    radarr:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.8:7878/'
    trans:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.12:9091/'
    sonarr:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.9:8989/'
    jellyfin:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.10:8096/'
    nas:
      loadBalancer:
        servers:
          - url: 'http://192.168.1.100:5000/'
    prowlarr:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.5:9696/'
    readarr:
      loadBalancer:
        servers:
          - url: 'http://192.168.10.20:8787/'

  • my fix is the right solution
  • the fix will be moved in another release (maybe v2.9 or later).

Ok, now I get it (I think): it was the right solution but on the timing there were different opinions. 😃

Wow, that fixed it. Thanks! Wasn’t underlined from the schemastore plugin though. Hm. Anyways, thanks again. I wasn’t able to spot that.

invalid configurations:


    user-auth:
      basicAuth:
        users: "user:$passwordhash" # <-- here

    proxmox-secure:
      entryPoints:
        - websecure
      rule: Host(`proxmox.local.mydomain.com`)
      middlewares: safe-ipwhitelist@file # <--- here

valid configurations:


    user-auth:
      basicAuth:
        users:
          - "user:$passwordhash" # <-- the fix

    proxmox-secure:
      entryPoints:
        - websecure
      rule: Host(`proxmox.local.mydomain.com`)
      middlewares:
        - safe-ipwhitelist@file # <--- the fix

And when the tag is changed to 2.7.3, it runs normally, which made me think the problem was in the current version.

If you read @ldez post here, he explained why:

The previous behavior (<2.8.2) was a bug, so it’s not a breaking change, now we just detect invalid configuration.

this is dynamic configuration after all

no, that’s the static. and the error was clear in the log: the websecure definition.

Thanks, I planned to add a port and forgot to change it back, but I don’t know why 2.7.3 can run, sorry

I’ve been back through all the official docs and I can’t seem to find an instance of the config I originally used. Perhaps I made the assumption that if it was a single value, a string instead of a list would be fine. Now I know better!

@ldez I honestly couldn’t tell you. It’s been… years, since I created that config file. And given the time I’m also going to sleep, now that everything is working again. I hope @MattKobayashi will help you out 😅

Thank you for the super quick response though. Didn’t expect that to get resolved so fast 😄

You can use JSON schema validation.

https://www.schemastore.org/json/

now we just detect invalid configuration.

how do we identify where the problem in the config is exactly? my dynamic config is pretty big…and there’s no indication of the lines that create the problem.
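
Besides editor-based schema validation, one rough way to locate such problems in a large file is to walk the parsed config and print the path of every list-valued key that holds a bare string. The Python sketch below assumes the YAML has already been loaded into nested dicts and lists by a YAML library of your choice. `LIST_KEYS` and `find_string_lists` are hypothetical names, and the key set covers only the list-valued keys mentioned in this thread.

```python
# Keys that this thread shows must hold a list; an assumption, not exhaustive.
LIST_KEYS = {"entryPoints", "middlewares", "users", "hostsProxyHeaders", "sourceRange"}


def find_string_lists(node, path=""):
    """Yield dotted paths where a list-valued key holds a bare string."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in LIST_KEYS and isinstance(value, str):
                yield child
            else:
                yield from find_string_lists(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_string_lists(item, f"{path}[{index}]")


# Already-parsed config, mirroring the invalid examples in this thread.
config = {
    "http": {
        "routers": {
            "proxmox-secure": {
                "entryPoints": ["websecure"],
                "middlewares": "safe-ipwhitelist@file",
            }
        },
        "middlewares": {
            "user-auth": {"basicAuth": {"users": "user:$passwordhash"}}
        },
    }
}

for problem in find_string_lists(config):
    print(problem)
# http.routers.proxmox-secure.middlewares
# http.middlewares.user-auth.basicAuth.users
```

Note that the `middlewares` section under `http` is a mapping, so it is descended into rather than flagged; only a string value triggers a report.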