magento2: Catalog image resizing performance problems
Preconditions
- Magento develop branch (1856c28 - January 15 2017)
- PHP 7.0.14
Steps to reproduce
- Have a shop with a bunch of products with images assigned to them
- Run
bin/magento catalog:images:resize
Expected result
- The command should only produce images for the themes which are currently active in the frontend
- The command shouldn’t produce multiple files which are byte-for-byte identical
Actual result
- The command produces files for all the installed themes, even if they aren’t being used in the frontend
- The command produces multiple files which are byte-for-byte identical
Discussion
While trying to figure out why it takes over 12 hours to run the command bin/magento catalog:images:resize on a very beefy server with a Magento CE 2.1.2 shop with about 6500 products and 8500 images, I found a couple of performance problems in the code:
- The command generates resized images for all themes, so if you are not using Magento/blank or Magento/luma on the frontend, you still get images resized for those themes, and they take a lot of time to generate.
- After fixing nr 1, I looked at the generated files and compared them using a hashing function to see if there were duplicated files in the result. This turned out to be the case. It was always caused by a difference in the image type (thumbnail, small_image, image, …). All the other parameters (width, height, keep frame, transparency, quality, …) used to generate a unique file are correct, I think. I don’t think the distinction on image type is important, but maybe I’m missing something?
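The hash comparison mentioned above can be sketched roughly as follows. This is a minimal Python sketch and not the script used for this report: the helper name find_duplicate_files and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root):
    """Group files under `root` by the SHA-256 of their contents.

    Returns {hex digest: sorted list of paths} for every digest shared
    by at least two files. Hypothetical helper, not part of Magento.
    """
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in chunks so large images are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        by_digest[digest.hexdigest()].append(str(path))
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Pointing it at the resized image cache (pub/media/catalog/product/cache on a typical install, which is an assumption about your setup) lists every group of files whose contents are identical.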
Possible solution
- I created a PR over here: https://github.com/magento/magento2/pull/8142
- In Magento\Catalog\Model\Product\Image::getMiscParams, remove the line:
'image_type' => $this->getDestinationSubdir(),
Result
I ran a very small benchmark using a test shop with 3 products and 5 images in total. Only Magento/blank and Magento/luma themes are installed and only Magento/luma is active in frontend:
- Originally: 11 seconds, 165 files are generated. After applying PR: 9 seconds, 135 files are generated
- Originally: 11 seconds, 165 files are generated. After removing line mentioned above: 9 seconds, 130 files are generated
Combined result of both optimizations:
- Originally: 11 seconds, 165 files are generated. After both optimizations: 7 seconds, 105 files are generated
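To put the benchmark figures into relative terms, here is a small Python sketch. The helper percent_reduction and the run labels are made up for illustration; the numbers are the ones quoted above.

```python
def percent_reduction(before, after):
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Figures quoted in the benchmark above: (seconds, files generated).
baseline = (11, 165)
runs = {
    "theme filter (PR)": (9, 135),
    "image_type removed": (9, 130),
    "both combined": (7, 105),
}

for label, (seconds, files) in runs.items():
    time_cut = percent_reduction(baseline[0], seconds)
    file_cut = percent_reduction(baseline[1], files)
    print(f"{label}: {time_cut:.1f}% less time, {file_cut:.1f}% fewer files")
```

With only 3 products and 5 images these percentages are indicative at best, but the combined change removes a bit over a third of both the runtime and the generated files.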
I think this is a significant enough improvement for you guys to at least consider optimizing this 😃
Thanks!
About this issue
- State: closed
- Created 7 years ago
- Reactions: 11
- Comments: 42 (20 by maintainers)
Before leaving work yesterday I decided to see how long this operation would truly take. As is tradition, Magento2 failed catastrophically after 45 minutes without further details–providing a shallow message about a fart, when really it should be giving me all the gory details about its crap-filled underwear.
Before I could do that, though, it failed multiple times about a swatch_image.jpg path, which derailed me for a while, until I deleted the files under the path and suddenly it was able to progress. At that point I thought the troubles were over, detached my terminal multiplexer and ventured home. The next morning I awoke, feeling positive that I would see immense progress and possibly a disk that ran out of space (which would still be progress!), but instead I learned that the operation failed a mere 45 minutes after leaving, gracing me with its ever-useful error message, “Unsupported image format,” which I’ve referenced before. Why not, at the very least, just log it and continue? WHY do you need to fail catastrophically and die?
This is so comically bad, it deserves its own Jimmy Fallon Thank You Notes bit:
@magento-engcom-team
For reference to anyone coming by, I’ll explain the current state of this command based on my latest experience.
I’m using Magento 2.2.7.
The output of the catalog:images:resize command is much better than it used to be. Additionally, it’s clearly not resizing the same images over and over again. These are about the only positive things I can say about it, though.
Here are some of my current numbers:
- 428233 product images in my store.
- 2.7 images per second, based on my store. I waited until 323 images hit exactly 2 minutes, so 323 / 120 = 2.7 images per second.
- 428233 / 2.7 = 158604 s; 158604 / 60 = 2643 min; 2643 / 60 = 44 hours
In other words, it will take 44 hours for this tool to resize all the images in my store for the first time. That’s truly devastating performance. At the very least, this is the first time I’ve discovered an approximate amount of time to run this whole thing, given the two pros mentioned earlier, so my expectations are set (extremely low). Of course, I actually have to run this completely through to see how resilient it is, or whether it can even finish.
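The extrapolation above can be reproduced for any store with a short Python sketch. The function name estimate_hours is hypothetical, and it assumes the throughput measured during the sample stays constant for the whole run, which may not hold in practice.

```python
def estimate_hours(total_images, images_done, seconds_elapsed):
    """Extrapolate the total runtime in hours from a timed sample.

    Assumes the sampled rate (images per second) holds for the whole run.
    """
    rate = images_done / seconds_elapsed  # images per second
    return total_images / rate / 3600


# Sample from this comment: 323 images in 120 seconds, 428233 images in total.
print(f"{estimate_hours(428233, 323, 120):.1f} hours")
```

Using the unrounded rate gives a figure slightly above 44 hours, in line with the estimate above.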
Other numbers:
More numbers again:
I am sure there is a way to bring that number of thumbnails per product down, but I’ve not discovered it, yet.
I should mention that these images are being generated on a 512GB SSD, so the IOPS is quite high.
Does anyone have suggestions to make this better? In its current state, it’s clearly anything but an acceptable way of resizing catalog images.
Edit:
I’d like to note that in both developer and production deploy mode, images are still generated on the fly if they don’t exist, which is good. The release notes for 2.1.7, and then 2.2.6 both used to state that on-the-fly image generation was removed in favour of using the command line tool to generate images, but it seems the release notes were edited to remove this information. Anyway, as long as on-the-fly image generation continues to be a thing, I won’t care that much. It is immensely inconvenient that Magento2 decided to change the paths to all resized images, though. That means I will have to run the command at least once no matter what.
Any updates on if this is fixed? Magento 2.3 is out now.
Edit: Nope, still running into this issue of duplicate images in cache.
A quick suggestion (I apologize if it’s redundant): add the theme name as an argument, to resize images for that theme only. Similarly, add the attribute (small, thumbnail…) as an argument, to resize images for that attribute only (globally or per theme). I have not yet tested the resize image command, but I believe this will help performance, as it won’t be resizing for all the themes and all the attributes.
Thanks, RT
Cache folder seems somewhat pointless if you are using Cloudflare or other CDN. Can we just disable this product image cache feature and have requests serve a resized image but cache that response with an etag based on the original source file’s checksum?
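A rough Python sketch of what such an ETag scheme could look like. Nothing here is Magento or CDN API: make_etag and is_not_modified are hypothetical helpers, and folding the requested width and height into the tag is an added assumption so that different sizes of the same source image get different tags.

```python
import hashlib


def make_etag(source_bytes, width, height):
    """Build a strong ETag from the source image checksum plus the requested size."""
    checksum = hashlib.sha256(source_bytes).hexdigest()
    return f'"{checksum[:32]}-{width}x{height}"'


def is_not_modified(if_none_match, etag):
    """Return True when the If-None-Match header value already lists this ETag.

    Simplified: weak validators (the W/ prefix) are not handled.
    """
    if if_none_match is None:
        return False
    candidates = [item.strip() for item in if_none_match.split(",")]
    return "*" in candidates or etag in candidates
```

When is_not_modified returns True the server could answer 304 Not Modified without resizing anything. The tag only changes when the source file or the requested size changes.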
This is still a persistent issue on M2.3.3. I have a multi-store view with 7239 unique images and it’s estimated 19 hours before the command completes. I’m watching duplicate images be generated.
After upgrading from 2.2.7 -> 2.3.3 all images have to be regenerated due to a change in the way the image hash is generated. This isn’t realistic to do when we upgrade a production environment.
EDIT: I redeployed the MCloud environment and the current est is 5.9 hours… still way too long for an image set this small.
Having run into this before: anyone needing to lint images before resizing can use ImageMagick’s identify utility to make sure the process doesn’t abruptly fail due to one bad image.
Find malformed/invalid images recursively under the current working directory with ImageMagick’s identify utility. Each scanned file is printed; only files for which identify exits with a non-zero status are logged.
find -D rates . -type f \( -name '*.gif' -o -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \) -print -exec bash -c 'identify "$1" &> /dev/null || echo "$1" >> invalid-imgs.log' none {} \;
EDIT: to ignore the cache directory, use (note the -not -path):
find -D rates * -type f \( -name '*.gif' -o -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \) -not -path "*cache/*" -print -exec bash -c 'identify "$1" &> /dev/null || echo "$1" >> invalid-imgs.log' none {} \;
In reading through the 2.2.6 release notes, it’s made clear that M2 has moved forward with its image generation tool. My only question is, will on-the-fly image generation still be supported? A 90% performance increase is unimportant when the operation takes days (and has never once completed for me).
Have the concerns raised in this ticket been addressed? To sum it up, M2 moved forward with their image generation tool, realized it worked terribly, and then walked it back and reverted to on-the-fly image generation again, which is superior.
@magento-engcom-team: can you review my comment above, I still think this ticket needs to be re-opened. Thanks!
@magento-engcom-team: I just retested this on a clean 2.2.0 installation, and can’t see any difference in the results, so this issue is definitely not fixed, please reopen 😃
Evidence for issue number 1 (Magento Blank theme is active on frontend, Luma theme not):
Squeeze in a couple of new lines of code on line 63 of the Product\Image\Cache model:
Now change the newly introduced code and uncomment the first line, so only the Magento Blank theme is processed
Fewer images are produced when generating them for only a single theme, which should happen by default, since Magento Luma isn’t in use anywhere.
Evidence for issue number 2:
You can see a bunch of duplicated hashes which isn’t desirable since those images waste a lot of disk space and it takes longer to generate all of those duplicated files.
It looks like the issues brought up in here have been resolved more or less in Magento 2.3.x, so that’s good, but I found another huge problem, for which I’ve created https://github.com/magento/magento2/issues/26796
Unfortunately no joke. The change to the hashes was added in 2.3.0. Regenerating the image caches took 5+ days for one of our customers with high-res raw images… We have resorted to hardcoding the few hashes in our customers’ instances, just in case. 😦
@0x15f If you are on Magento Cloud, you can try using Fastly image optimization
@hostep, thank you for your report. We’ve created internal ticket(s) MAGETWO-80606 to track progress on the issue.
@hostep Thank you for the investigation. Issue reopened for further research.
And yet another update. Manipulating image_type in Magento\Catalog\Model\Product\Image\ParamsBuilder::build, which I mentioned above, isn’t the right solution, as image_type is being used in Magento\Catalog\Model\View\Asset\Image::getPlaceHolder to figure out what placeholder to use.
My new proposition is to change this in Magento\Catalog\Model\View\Asset\Image::getMiscPath as follows:
This is for Magento 2.1.6; I haven’t checked out the latest develop branch code to see if this also applies over there.