tesseract: tesseract process never finishes with specific gif image
Environment
tesseract 4.1.1
Reproduced on macOS and Linux
uname -a
Darwin VL-C02WL1AYHTD6 19.6.0 Darwin Kernel Version 19.6.0: Tue Nov 10 00:10:30 PST 2020; root:xnu-6153.141.10~1/RELEASE_X86_64 x86_64
Linux ocr-5b7bf86f6-f6qsd 5.4.65-wix #1 SMP Thu Nov 19 15:24:12 UTC 2020 x86_64 GNU/Linux
Current Behavior:
Running tesseract from the command line on this image, https://bentkus.eu/ocr_while_true.gif, has still not finished after 1 hour:
tesseract ocr_while_true.gif ocr_while_true --dpi 150
Expected Behavior:
The process should finish within 2 minutes.
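To check the 2-minute expectation without waiting indefinitely, the command can be wrapped in a time limit. The following is only a minimal sketch: `run_with_time_limit` is a hypothetical helper name, not part of Tesseract, and the 120-second limit simply mirrors the expectation stated above.

```python
import subprocess
import time


def run_with_time_limit(cmd, limit_seconds):
    """Run cmd and report whether it finished within limit_seconds.

    Hypothetical helper for reproducing the report.
    Returns (finished, elapsed_seconds, returncode); returncode is None
    when the limit was hit and the child process was killed.
    """
    start = time.monotonic()
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # once the timeout elapses.
        proc = subprocess.run(cmd, capture_output=True, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        return False, time.monotonic() - start, None
    return True, time.monotonic() - start, proc.returncode


# Example call (assumes tesseract is on PATH and the GIF is in the
# current directory; raises FileNotFoundError if tesseract is missing):
# cmd = ["tesseract", "ocr_while_true.gif", "ocr_while_true", "--dpi", "150"]
# finished, elapsed, code = run_with_time_limit(cmd, limit_seconds=120)
# print(f"finished={finished} elapsed={elapsed:.1f}s returncode={code}")
```

A result of `finished=False` would match the behavior described in this report; the same helper can be pointed at other test images to compare run times.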
Suggested Fix:
I’ll try to build and see why it never stops
Update (by @egorpugin): test PNG: https://bentkus.eu/ocr_while_loop.png
About this issue
- State: open
- Created 3 years ago
- Comments: 37 (21 by maintainers)
I have now run the latest Tesseract production code on the original animated GIF image. The image is processed, and Tesseract returns a “result” for the first included image. This takes 4 minutes 26 seconds, so it does finish, but that is rather long for an image which looks empty to me but evidently contains lots of small colour variations (otherwise the PNG file would be much smaller).
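The file-size argument can be illustrated with a small experiment. PNG stores pixel data with DEFLATE compression, so a truly uniform image shrinks to almost nothing, while one with slight per-pixel variation does not. This sketch compresses raw pixel buffers with `zlib` rather than writing real PNG files, and the 200x200 size and the 250..255 value range are made-up assumptions, not measurements of the image from this issue.

```python
import random
import zlib

# Hypothetical image dimensions; the real test image is not inspected here.
WIDTH, HEIGHT, CHANNELS = 200, 200, 3
NUM_BYTES = WIDTH * HEIGHT * CHANNELS


def compressed_size(pixels: bytes) -> int:
    """Size in bytes after DEFLATE compression at the highest level."""
    return len(zlib.compress(pixels, 9))


def flat_image() -> bytes:
    """A truly empty image: every channel of every pixel is pure white."""
    return bytes([255]) * NUM_BYTES


def noisy_image(seed: int = 0) -> bytes:
    """Looks white to the eye, but each value varies slightly (250..255)."""
    rng = random.Random(seed)
    return bytes(rng.randint(250, 255) for _ in range(NUM_BYTES))


flat_size = compressed_size(flat_image())
noisy_size = compressed_size(noisy_image())
print(f"uniform white:  {flat_size} bytes compressed")
print(f"near-white:     {noisy_size} bytes compressed")
```

The near-white buffer compresses far worse than the uniform one even though both would look blank on screen, which is consistent with the observation that a large PNG hints at hidden colour variation.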
My answer is ‘Create OCR for all images’