guetzli: Extremely slow performance

How long does it take for you to compress a couple of images?

I tried compressing a 7.8 MB JPG with --quality 84 and it took nearly 20 minutes.

I also tried a 1.4 MB JPG with --quality 85 and it took nearly 10 minutes.

I must assume that this is not normal - is something wrong with my binary?

I am on Ubuntu 16.04 LTS with an Intel Core i7-4790K CPU @ 4.00GHz. I installed gflags via sudo apt-get install libgflags-dev and got libpng via sudo apt-get install libpng16-dev. After that, make ran with no errors.

convert -quality 85 src.jpg dst.jpg runs in under 1 second, if that is any help.

Anyone else experience this?

About this issue

  • State: open
  • Created 7 years ago
  • Reactions: 10
  • Comments: 43 (7 by maintainers)

Most upvoted comments

I just profiled Guetzli and most of the time is spent in the butteraugli Convolution() and ButteraugliBlockDiff() functions. One of the big issues hurting performance is the use of double-precision floating-point values to calculate pixel errors. In this case, a 64-bit integer would provide the same accuracy for the error and increase the speed quite a bit since the original pixels could be left as-is. In certain cases, using doubles for pixels makes sense (e.g. some filter, scaling or transparency operations), but not for error calculations. The rest of the code has some efficiency problems, but they won’t affect the performance nearly as much.
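
As a rough illustration of the numeric-type point (this is only a sketch, not butteraugli's actual error metric, which is perceptual and considerably more involved): for 8-bit input samples, a squared-error sum fits exactly in a 64-bit integer, so an accumulation loop like the one below needs no floating-point conversions at all.

#include <cstddef>
#include <cstdint>

// Sketch only: sum of squared differences over 8-bit samples, accumulated
// exactly in a 64-bit integer. Each term is at most 255 * 255, so even
// billions of pixels fit without overflow or rounding.
int64_t SumSquaredError(const uint8_t* a, const uint8_t* b, size_t n) {
  int64_t sum = 0;
  for (size_t i = 0; i < n; ++i) {
    const int64_t d = static_cast<int64_t>(a[i]) - static_cast<int64_t>(b[i]);
    sum += d * d;
  }
  return sum;
}

Whether butteraugli's intermediate values really stay integral after its filtering stages is a separate question, but for plain per-pixel differences the integer version is exact.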

@pornel We didn’t try to optimize Guetzli in ways that could make it harder to modify. That means that there’s likely both some speedup available by just optimizing single routines and, more significantly, speedup available by restructuring parts of Guetzli (e.g. attempting to reuse more computation results between iterations).

That said, I believe much more can be done about memory consumption, which we hardly optimized at all.

Would it be an option to invoke the binary multiple times in parallel?

Functionally, you can do this with GNU parallel by invoking it like this:

parallel 'guetzli --quality 84 {} {.}.jpg' ::: *.png

Test it yourself:

wget https://github.com/google/guetzli/releases/download/v0/bees.png
for i in 1 2 3 4 5 6 7; do cp bees.png $i.png; done
time parallel 'guetzli --quality 84 {} {.}.jpg' ::: *.png

Hi folks, if anyone still needs Guetzli Windows binaries with CUDA support, please check this out. This results in 25-40 times faster recompression.

@clouless (img-width * img-height) / 1000000 = X megapixels
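For example, a 4000 × 3000 pixel photo works out to (4000 * 3000) / 1000000 = 12 megapixels.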

@robryk I presume a large part of the slowness and memory use is because it’s the first release and Guetzli hasn’t been optimized yet. How much of the slowness is inherent to the algorithm and unavoidable, and how much can be done to improve the speed?

I’d love to implement this in my app, but the current performance figures are definitely a roadblock. Taking 13 minutes for a reasonably sized JPEG of a couple of megabytes is simply too long to be practical in many applications.

From my perspective, after reading all the current issues, there are three roadblocks to wide adoption, and I would prioritize them like this:

  1. Faster performance.
  2. Lower memory consumption.
  3. Failures on certain “non-standard” JPEGs, like those produced by certain cameras (you said you knew what the problem is here).

I think a good, rough goal would be to get to a point where a JPEG that’s a couple MB in size takes no more than 10-12 seconds to optimize. That would make the algorithm practical in my use case, which is an app that optimizes hundreds of images at once as part of building websites.

Guetzli was a proof-of-concept milestone for us in creating new solutions for JPEG XL.

I’m considering creating a “Guetzli 2.0” that runs only one iteration of butteraugli, using the butteraugli from https://gitlab.com/wg1/jpeg-xl/-/tree/master/jxl/butteraugli and the initialization code from https://gitlab.com/wg1/jpeg-xl/-/tree/master/jxl/enc_adaptive_quantization.cc

I suspect that would make Guetzli around 100x faster.

Not sure if related, but at our company we chose to apply the Guetzli algorithm to all our rendered images. Because it’s relatively slow, we decided to distribute the load in a special way. You can read all about it here: https://techlab.bol.com/from-a-crazy-hackathon-idea-to-an-empty-queue/

Another alternative to what @graysky2 said is https://github.com/fd0/machma

@DanielBiegler If you have 200 pictures, I’d echo @jan-wassenberg’s suggestion: run multiple instances of Guetzli and thus process multiple pictures in parallel. This will be more effective parallelization than anything that can be done inside Guetzli.

When using the --verbose option, it would be great if estimated time and memory consumption could be calculated and presented to the user, perhaps by computing the megapixel count and multiplying it by the current per-megapixel time and memory estimates.
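
As a rough sketch of what such an estimate could look like (this is not an existing Guetzli feature, and PrintEstimate is a hypothetical helper; the 300 MB-per-megapixel figure is the rule of thumb from Guetzli's README, while the one-minute-per-megapixel constant is only a placeholder, since the times reported in this thread are often higher):

#include <cstdio>

// Hypothetical helper, not part of Guetzli: prints a rough resource estimate
// from the image dimensions before the actual optimization starts.
void PrintEstimate(int width, int height) {
  const double megapixels = (static_cast<double>(width) * height) / 1e6;
  const double est_memory_mb = 300.0 * megapixels;  // README rule of thumb
  const double est_minutes = 1.0 * megapixels;      // placeholder assumption
  std::printf("input: %.1f MPix, est. memory: ~%.0f MB, est. time: ~%.0f min\n",
              megapixels, est_memory_mb, est_minutes);
}

int main() {
  PrintEstimate(4000, 3000);  // e.g. a 12 MPix photo
  return 0;
}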

The speed of Guetzli could probably be improved using OpenCL (on the GPU).