tesseract: Error during processing of HEIC input files
Environment
- Tesseract Version: 4.1.1
- Platform: macOS Catalina 10.15
Current Behavior:
Whenever I run the command below, I get this error:
$ tesseract images/IMG_3958.HEIC output/grocery_bill
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
Error during processing.
Expected Behavior:
I would expect tesseract to output the text from the grocery bill into the output file output/grocery_bill.
Is there something wrong with processing HEIC images? Also, is there a location where I can tail the logs to see if I can get a richer description of the error?
Here’s more information about the tesseract program that I installed with Homebrew:
$ tesseract -v
tesseract 4.1.1
leptonica-1.79.0
libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found FMA
Found SSE
I have also attached the image I had tesseract process.
IMG_3958.HEIC.zip
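For anyone hitting the same error, here is a rough sketch for checking which image libraries a given build reports. It parses the ` : `-separated line of the `tesseract -v` output pasted above. The helper name `parse_image_libs` is made up for illustration, and treating "no HEIF library listed" as "HEIC cannot be decoded" is an assumption on my part, not something the tool states.

```python
def parse_image_libs(version_output):
    """Map library name -> version from the ' : '-separated line of `tesseract -v`."""
    libs = {}
    for line in version_output.splitlines():
        if " : " not in line:
            continue  # skip the tesseract/leptonica/CPU feature lines
        for entry in line.split(" : "):
            parts = entry.split()
            if len(parts) == 2:  # expect "<name> <version>"
                libs[parts[0]] = parts[1]
    return libs


# Output copied from the report above.
VERSION_OUTPUT = """\
tesseract 4.1.1
leptonica-1.79.0
libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1
Found AVX2
"""

libs = parse_image_libs(VERSION_OUTPUT)
# Assumption: a HEIF-capable build would list a library with "heif" or "heic" in its name.
has_heif = any("heif" in name or "heic" in name for name in libs)

print(sorted(libs))
print("HEIF decoder listed:", has_heif)
```

With the output from this report, none of the listed libraries looks like a HEIF decoder, which would fit the failure on the HEIC file.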
About this issue
- State: open
- Created 4 years ago
- Comments: 25 (6 by maintainers)
Dan, I understand your view.
The thing is, these two formats (HEIC and AVIF) are becoming quite popular. HEIC is the default file format for photos in iOS. Recently, Firefox followed Chrome’s lead, and it now supports AVIF by default (previously users had to enable it manually).
So, since you do not want to support these formats in your software, maybe we, the Tesseract devs, should discuss whether we want to support them ourselves.
If we decide to support them, we will convert them to Leptonica’s pix.
To be honest, this was mainly directed toward @stweil, hoping that he would want this enough to implement it… 😃
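Until something like that exists, a user-side workaround is to detect these containers and convert them before calling tesseract. The following is only a sketch under stated assumptions: it looks at the major brand in the leading `ftyp` box (a simplification, since compatible brands are ignored), the brand lists are not exhaustive, and the function names are hypothetical. The conversion step assumes the macOS `sips` tool is available; nothing here is part of Tesseract or Leptonica.

```python
# Major brands commonly seen in the leading 'ftyp' box (not an exhaustive list).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}
AVIF_BRANDS = {b"avif", b"avis"}


def sniff_container(header):
    """Return 'heif', 'avif', or 'other' based on the first 12 bytes of a file."""
    # Layout assumed: 4-byte box size, then b'ftyp', then the 4-byte major brand.
    if len(header) < 12 or header[4:8] != b"ftyp":
        return "other"
    brand = header[8:12]
    if brand in AVIF_BRANDS:
        return "avif"
    if brand in HEIF_BRANDS:
        return "heif"
    return "other"


def sips_to_png_command(src, dst):
    """Build the argv for converting src to PNG with macOS sips (assumed installed)."""
    return ["sips", "-s", "format", "png", src, "--out", dst]


if __name__ == "__main__":
    import subprocess
    import sys

    if len(sys.argv) == 3:
        src, dst = sys.argv[1], sys.argv[2]
        with open(src, "rb") as fh:
            kind = sniff_container(fh.read(12))
        if kind == "heif":
            # Convert first, then run tesseract on the PNG instead of the HEIC file.
            subprocess.run(sips_to_png_command(src, dst), check=True)
        else:
            print("container:", kind, "(no conversion attempted)")
```

Saved under a name of your choice and run with the HEIC path and a PNG output path as arguments, this would leave a PNG that tesseract can be pointed at instead of the original file.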