caddy: Frequent hangs when using http/2 push

My team and I use Caddy as a reverse proxy, and we rely heavily on HTTP/2 Push. We started with v2.2.1, and on certain configurations we experienced random hangs.

We thought the problem was fixed by this pull request: https://github.com/caddyserver/caddy/pull/3875 (more details are in the net/http issue: https://github.com/golang/go/issues/42534), because it hadn’t occurred in the couple of weeks since that was merged. But today we found another configuration that makes the problem easy to reproduce again, on any of the recent Caddy versions.

Caddy behavior when hangs happen

  • The web browser hangs indefinitely on some resources while downloading webpage assets
  • When it happens, Caddy doesn’t output any error messages
  • During the hang, if I try to download those “pending” assets with a different browser or with curl from the console, it might work for some assets (usually smaller ones), but for many assets it hangs indefinitely. In other words, when the problem happens for one user, it basically makes the whole website inaccessible to other users too, until that one user hits the “stop” button
  • When I hit the “stop” button in a browser tab with hanging downloads, Caddy outputs lots of errors to console: ERROR http.handlers.reverse_proxy aborting with incomplete response {"error": "http2: stream closed"}

How to reproduce

It depends on the proxied website, the Caddy config, and some random factors, so it occurs with different frequency on different hardware. The steps are:

  1. Start caddy with the configuration provided below
  2. Open developer tools network tab, to visually see the hangs; checking the “disable cache” toggle will help reproduce the problem faster, but is not necessary
  3. Navigate to / page in a browser (i.e., https://terem-pro.localhost)
  4. Wait till it fully loads
  5. If it didn’t hang on step 4, press F5 and again wait till it fully loads; repeat several times if needed

On our test server, it usually hangs after 2-3 reloads. On some devices, it might require 10-15 attempts but still hangs at some point.
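To check from a second client whether the server has stopped responding (the curl check described above), a small probe with a timeout can help. This is only a sketch under assumptions: the `probe` function and its parameters are hypothetical names, and Python's `urllib` speaks HTTP/1.1 only, so it tests whether other clients are blocked rather than reproducing the HTTP/2 Push hang itself. The report does not confirm that HTTP/1.1 clients are affected the same way.

```python
import http.client
import socket
import ssl
import urllib.error
import urllib.request


def probe(url, timeout=10.0, verify_tls=True):
    """Download url completely and return 'ok', 'hang', or 'error'.

    The timeout applies to each socket operation, so 'hang' means that
    no data arrived for `timeout` seconds (during connect, while waiting
    for headers, or while reading the body). A slow but steadily
    progressing download is not reported as a hang.
    """
    context = None
    if url.startswith("https://"):
        context = ssl.create_default_context()
        if not verify_tls:
            # Assumption: only for local testing against a locally
            # issued certificate; never disable verification elsewhere.
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            while resp.read(65536):
                pass  # discard the body; only completion matters
        return "ok"
    except (socket.timeout, TimeoutError):
        return "hang"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts are wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "hang"
        return "error"
    except (OSError, http.client.HTTPException):
        return "error"
```

For example, while a browser tab is stuck, `probe("https://terem-pro.localhost/assets/resources/css/styles.css", timeout=10, verify_tls=False)` returning `"hang"` would suggest that the server is also not answering other clients.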

Caddy version

Reproduces on:

  • v2.2.1
  • v2.3.0-beta.1
  • current head of master (4cff36d731390915649261f0e9c088be0eeafcf1), “caddyauth: Use buffered channel passed to signal.Notify”

Built with: CADDY_RACE_DETECTOR=1 xcaddy build <revision>

Caddy configuration

https://terem-pro.localhost {
    handle {
        reverse_proxy https://www.terem-pro.ru {
            header_up host {http.reverse_proxy.upstream.host}
        }

        push / {
            /local/components/terem/catalog.list/templates/index.best.seller/style.css
            /local/components/terem/new_services.content/templates/home.banner.lots/style.css
            /local/components/terem/slider.blocks/templates/slider.useful/style.css
            /local/components/terem/standard.blocks/templates/call.action.white/style.css
            /local/components/terem/review.list/templates/carousel.home/style.css
            /local/components/terem/standard.blocks/templates/promo.red.home/style.css
            /local/components/terem/promotion.list/templates/home.slider/style.css
            /local/components/terem/form.form/templates/template.pdf/style.css
            /local/templates/terem/components/bitrix/menu/template.header.menu.top/desktop-menu.css
            /local/components/terem/form.form/templates/template.taxi/style.css
            /assets/resources/css/home.css
            /local/templates/terem/components/bitrix/menu/template.header.menu-mobile/style_menu.css
            /local/components/terem/catalog.type.list/templates/.default/style.css
            /assets/resources/css/styles.css
            /bitrix/cache/css/s1/terem/template_ad73b02503569e1113abf0b013fdbb28/template_ad73b02503569e1113abf0b013fdbb28_v1.css?16067202133580
            /bitrix/cache/css/s1/terem/page_074396ca6d41424fe878cb365c109aa1/page_074396ca6d41424fe878cb365c109aa1_v1.css?160672023225970
        }
    }
}

System environment

Both the test server and my PC run Ubuntu 20.04.1 LTS, x86_64, with Caddy installed directly on the host (no Docker).

Highlights

  • It allows an easy Denial of Service attack: a single client makes the whole server non-functional
  • There seems to be no timeout, so a hang might keep the server locked for a while
  • There’s no log message when it hangs. The only message appears when the user hits stop, and it isn’t very useful, because the same message appears when a user hits stop in the middle of a normal transfer. So this or similar situations might be happening on production servers right now, and if the frequency is low enough, the problem may be tough to catch or to distinguish from a random network glitch
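Since a single abort message does not distinguish a hang from a normal cancelled transfer, one possible heuristic is to look for bursts of them, because stopping a hung tab produced lots of these errors at once. The sketch below assumes JSON-encoded Caddy logs with a numeric `ts` field, in the shape of the log line quoted in the comments on this issue; the function name `abort_bursts` and the threshold are hypothetical, and a burst is only a hint, not proof of a hang.

```python
import json
from collections import Counter

ABORT_MSG = "aborting with incomplete response"
STREAM_CLOSED = "http2: stream closed"


def abort_bursts(lines, threshold=5):
    """Return {unix_second: count} for seconds with >= threshold aborts.

    Only reverse_proxy entries with the "http2: stream closed" error are
    counted. Lines that are not JSON objects, or that lack a numeric
    `ts` field, are skipped.
    """
    per_second = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # not a JSON log line
        if not isinstance(entry, dict):
            continue
        if (entry.get("logger") == "http.handlers.reverse_proxy"
                and entry.get("msg") == ABORT_MSG
                and entry.get("error") == STREAM_CLOSED):
            ts = entry.get("ts")
            if isinstance(ts, (int, float)):
                per_second[int(ts)] += 1
    return {sec: n for sec, n in per_second.items() if n >= threshold}
```

It could be fed an open log file, for example `abort_bursts(open("caddy.log"))`, where the file name is a placeholder.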

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 8
  • Comments: 27 (21 by maintainers)

Most upvoted comments

{"level":"error","ts":1614297333.1600995,"logger":"http.handlers.reverse_proxy","msg":"aborting with incomplete response","error":"http2: stream closed"}

This is happening with a local Docker configuration; the reverse proxy points to a Next/React PWA app, and the browser is Chrome (I haven’t tried another one). To see the whole Docker setup, check the code for ApiPlatform: https://github.com/api-platform/api-platform

Will be fixed in Go 1.16.

I confirm that this is still an issue on Go 1.18. I used the build from this link: https://github.com/caddyserver/caddy/actions/runs/1989510139

I reopened the ticket in Go’s repository again.

I reopened the ticket in Go’s repository.