caddy: High GC pressure causing slowdowns (was: gzip awfully slow)

1. What version of Caddy are you running (caddy -version)?

Caddy 0.8.2

2. What are you trying to do?

Simply running a comparison between my own simple Go web server and Caddy in various modes.

My tests were run with https://github.com/rakyll/boom:

boom -n 10000 -c 100 http://mysite/index.html

3. What is your entire Caddyfile?

0.0.0.0:8080
root /somewhere
log /var/log/caddy.log
gzip

4. How did you run Caddy (give the full command and describe the execution environment)?

./caddy -conf Caddyfile

5. What did you expect to see?

A total test execution time of around 12s±5%, which is how long my own Go server takes to complete the test when set up to serve from disk with gzip enabled (SHA-256-hashing and gzipping the content on every request).

6. What did you see instead (give full error messages and/or log)?

A total execution time of 39.6s±5%. That’s a lot. For comparison, the same run against Caddy with gzip disabled takes 6.8s±5%. With gzip, Caddy is more than 3 times slower than my server; while I suspected my server would be faster, I had expected a few percent, not a factor of 3.

I don’t use Caddy myself, but I thought I’d give you a heads-up about my findings. I have not investigated why Caddy is so slow in this case, but it should be possible to reach speeds similar to my server’s, which isn’t doing anything magic when serving data from disk.

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 28 (7 by maintainers)

Most upvoted comments

The difference in behavior between these two servers comes from how each handles files internally. Caddy streams files from disk using FileServer, which produces many small reads and copies as it reads chunks of data and gzips each chunk; the many small allocations increase GC pressure. minihttp, on the other hand, reads the whole file in one go and then gzips the entire buffer.

The second approach indeed performs much better overall for small files, and especially so in a low-performance environment, since it is easier on resources. However, the larger the file, the more this approach suffers in raw performance, memory consumption, and response time. By definition, minihttp’s approach has a much higher time to first byte, while Caddy can start responding as soon as the first chunk has been gzipped.

To conclude my thoughts… minihttp’s approach shows that whole-file buffering vastly outperforms chunk-based streaming for small files, while Caddy uses the better option for larger files. The optimal solution would be a middle way: implement our own “FileServer” with a size threshold, streaming files above it chunk by chunk while reading files smaller than the threshold into memory and gzipping them as a whole.

Now on to some charts…

Please keep in mind that the following graphs and numbers come from a microbenchmark, which is in no way representative of actual application performance; they are here to illustrate the difference in implementations.

The machine used was a dual-core x64 Arch Linux box with 16GB of RAM, with the following configurations for caddy / minihttp:

Caddyfile

0.0.0.0:8080
root root/
log caddy.log
gzip

minihttp.toml

root = "./web"
logFile = "minihttp.log"
logLines = 16384

[http]
        address = ":9090"

For minihttp, the “fancy folder” feature was used to reload the file from disk on every request, to create a nearly even playing field.

1MB File

Increasing the file size to 1MB already shows this: Caddy outpaces minihttp in req/s while also using less memory. Please keep in mind that req/s here means HTTP 200 responses from the server, not fully received files.

Caddy used ~150MB of RAM while doing ~300 req/s:

[chart: caddy_1mb_60s]

⇒  ./wrk -H "Accept-Encoding: gzip" -t 100 -c 100 -d 60s http://192.168.10.12:8080/index.html
Running 1m test @ http://192.168.10.12:8080/index.html
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   349.55ms  240.02ms   1.89s    73.74%
    Req/Sec     4.03      3.47    50.00     83.07%
  18142 requests in 1.00m, 101.62MB read
  Socket errors: connect 0, read 0, write 0, timeout 5
Requests/sec:    301.87
Transfer/sec:      1.69MB

minihttp, meanwhile, goes up to ~200MB while doing ~163 req/s:

[chart: minihttp_1mb_60s]

⇒  ./wrk -H "Accept-Encoding: gzip" -t 100 -c 100 -d 60s http://192.168.10.12:9090/f/index.html
Running 1m test @ http://192.168.10.12:9090/f/index.html
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   617.89ms  359.71ms   1.99s    67.67%
    Req/Sec     2.00      2.27    20.00     87.98%
  9805 requests in 1.00m, 55.45MB read
  Socket errors: connect 0, read 0, write 0, timeout 25
Requests/sec:    163.14
Transfer/sec:      0.92MB

2MB File:

Caddy: [chart: caddy_2mb_60s]

minihttp: [chart: minihttp_2mb_60s]