tesseract: Warning. Invalid resolution 0 dpi. Using 70 instead.

command tesseract https://image.ibb.co/eibzaT/test.png result

Current Behavior:

Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 161
Estimating resolution as 161

version

tesseract 4.0.0-beta.2-313-g29f2
 leptonica-1.76.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
 Found AVX
 Found SSE

original image https://image.ibb.co/eibzaT/test.png

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 37 (15 by maintainers)

Most upvoted comments

There is an undocumented command line option. Try using --dpi 300 (or the correct value for your image).
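For callers that launch Tesseract from Java, the option could be passed along the lines of the sketch below. The class and method names (`TesseractDpiCommand`, `build`, `run`) are hypothetical and only for illustration. It assumes a Tesseract 4 or later binary named `tesseract` on the PATH, and 300 is just the example value from the comment above, not a recommendation for every image.

```java
import java.io.IOException;
import java.util.List;

public class TesseractDpiCommand {

    /** Builds the argument list: tesseract <input> <outputBase> --dpi <dpi>. */
    static List<String> build(String input, String outputBase, int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return List.of("tesseract", input, outputBase, "--dpi", Integer.toString(dpi));
    }

    /** Runs the command, forwarding Tesseract's output to this process, and returns the exit code. */
    static int run(String input, String outputBase, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(build(input, outputBase, dpi))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Calling `TesseractDpiCommand.run("test.png", "result", 300)` would then be equivalent to typing `tesseract test.png result --dpi 300` in a shell.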

@bhasinnaik : your input image has no information about dpi. If you want to avoid the warning, you should fix that.

It means your image does not contain resolution info in its metadata, so Tesseract warns you about this and tries to estimate the resolution by itself.

To test whether an image has the correct header, you can use magick identify -verbose filename or an equivalent tool

and make sure these two values are set:

Resolution: 118.11x118.11
Units: PixelsPerCentimeter

The values above are for a 300 dpi PNG (300 / 2.54 ≈ 118.11 pixels per centimeter).
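The same check can be sketched in Java with the standard ImageIO PNG metadata tree, which exposes the pHYs chunk in pixels per meter. This is a minimal sketch, not what Tesseract or Leptonica do internally. The class and method names (`PngDpiCheck`, `readDpi`, `pixelsPerMeterToDpi`) are hypothetical, and returning -1 when the resolution is missing is an arbitrary convention chosen for this example.

```java
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class PngDpiCheck {
    static final double INCHES_PER_METER = 39.3701;

    /** Converts the pHYs value (pixels per meter) to dots per inch. */
    static double pixelsPerMeterToDpi(int pixelsPerMeter) {
        return pixelsPerMeter / INCHES_PER_METER;
    }

    /** Returns the horizontal dpi stored in the PNG, or -1 if no usable pHYs chunk is present. */
    static double readDpi(File png) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(png)) {
            if (in == null) {
                throw new IOException("cannot open " + png);
            }
            ImageReader reader = ImageIO.getImageReadersByFormatName("png").next();
            try {
                reader.setInput(in);
                IIOMetadata metadata = reader.getImageMetadata(0);
                Node root = metadata.getAsTree("javax_imageio_png_1.0");
                NodeList children = root.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    Node node = children.item(i);
                    if (!"pHYs".equals(node.getNodeName())) {
                        continue;
                    }
                    Element pHYs = (Element) node;
                    // A unit of "unknown" only describes the aspect ratio, not a real resolution
                    if (!"meter".equals(pHYs.getAttribute("unitSpecifier"))) {
                        return -1;
                    }
                    int pixelsPerMeter = Integer.parseInt(pHYs.getAttribute("pixelsPerUnitXAxis"));
                    return pixelsPerMeterToDpi(pixelsPerMeter);
                }
                return -1;
            } finally {
                reader.dispose();
            }
        }
    }
}
```

A 300 dpi PNG stores 11811 pixels per meter, which matches the 118.11 pixels per centimeter reported by ImageMagick.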

Tesseract uses Leptonica, which uses libpng to read the resolution of the input image. If the input PNG does not have the correct metadata, Tesseract generates the warning referred to in this issue. I have also seen this cause Tesseract to return slightly different text results for certain images. The code below adds the metadata to the PNG.

That’s a bug in Tesseract. Tesseract internally creates a new image for that png file, but does not copy the resolution from the original image. Fixed now in commit a209a6b4b503c6ada4ce6eb257fde2b76c47f771.

import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageOutputStream;

/**
 * Save the buffered image to disk.
 *
 * If a PNG is passed, add the dots per meter to the pHYs chunk of the PNG metadata
 * @see PNG Metadata Format Specification - https://docs.oracle.com/javase/8/docs/api/javax/imageio/metadata/doc-files/png_metadata.html
 *
 * <!-- The pHYS chunk, containing the pixel size and aspect ratio -->
 * <!ATTLIST "pHYS" "pixelsPerUnitXAxis" #CDATA #REQUIRED>
 * <!-- The number of horizontal pixels per unit, multiplied by 1e5 -->
 * <!ATTLIST "pHYS" "pixelsPerUnitYAxis" #CDATA #REQUIRED>
 * <!-- The number of vertical pixels per unit, multiplied by 1e5 -->
 * <!ATTLIST "pHYS" "unitSpecifier" ("unknown" | "meter") #REQUIRED>
 * <!-- The unit specifier for this chunk (i.e., meters) -->
 *
 *
 * @param bufferedImage image
 * @param formatName PNG, TIFF, etc..
 * @param localOutputFile local filename where to save the image
 * @param dpi               dots per inch of the image to save
 * @return true if successful, false otherwise
 * @throws IOException if the image cannot be written
 */
boolean saveImage(RenderedImage bufferedImage,
                  String formatName,
                  File localOutputFile,
                  int dpi) throws IOException {
    boolean success;

    // The original compared against a project-specific enum
    // (BlankPageMapRequest.ImageFormat.PNG); a plain string check is used here.
    if ("png".equalsIgnoreCase(formatName))
    {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("png").next();

        ImageWriteParam writeParam = writer.getDefaultWriteParam();
        // Derive the type from the actual image instead of assuming TYPE_INT_RGB
        ImageTypeSpecifier typeSpecifier = ImageTypeSpecifier.createFromRenderedImage(bufferedImage);

        IIOMetadata metadata = writer.getDefaultImageMetadata(typeSpecifier, writeParam);

        final String pngMetadataFormatName = "javax_imageio_png_1.0";

        // Convert dpi (dots per inch) to dots per meter
        final double inchesPerMeter = 39.3701;
        int dotsPerMeter = (int) Math.round(dpi * inchesPerMeter);

        IIOMetadataNode pHYs_node = new IIOMetadataNode("pHYs");
        pHYs_node.setAttribute("pixelsPerUnitXAxis", Integer.toString(dotsPerMeter));
        pHYs_node.setAttribute("pixelsPerUnitYAxis", Integer.toString(dotsPerMeter));
        pHYs_node.setAttribute("unitSpecifier", "meter");

        IIOMetadataNode root = new IIOMetadataNode(pngMetadataFormatName);
        root.appendChild(pHYs_node);

        metadata.mergeTree(pngMetadataFormatName, root);

        // Close the output stream and release the writer even if writing fails
        try (ImageOutputStream output = ImageIO.createImageOutputStream(localOutputFile)) {
            writer.setOutput(output);
            // The first argument is stream metadata, which PNG does not use;
            // the image metadata travels inside the IIOImage
            writer.write(null, new IIOImage(bufferedImage, null, metadata), writeParam);
        } finally {
            writer.dispose();
        }

        success = true;
    }
    else
    {
        success = ImageIO.write(bufferedImage, formatName, localOutputFile);
    }

    return success;
}

@stweil : Thanks for looking into this. Funny that the problem was with psm 0 only. The other psm values work as expected.

A very quick internet search suggests this modification: mogrify -set units PixelsPerInch -density 300 image.jpg

That is not directly supported by Tesseract, but could be implemented by a wrapper script.

The current Tesseract release 5.0.0 tries to guess the correct resolution if there is no explicit information from the image file.

I’ve been using Tesseract for a while and got the same error. I just want to confirm that it is never about the metadata. I got the error while using image_to_osd for photos captured using the same device and this happened to only 3 of 50 images. I’ve checked the image details and the dpi already exists.

I’ve been trying to see if the error disappears when I crop the background around the objects in the image, and I saw that the error disappeared for 2 of them, not all of them. I still don’t really know the reason, but if it was the metadata, the third one would have worked.

However, I believe that it is related to the number of characters. The images for which the error disappeared after cropping were somewhat rotated, with a text angle of around 30-40 degrees. For the images that worked, Tesseract was giving me rotations of 90, 180, and 270 only. The image that gave the error in both cases has a low number of characters to begin with. This is why it would be interesting if more people tried this, so we can figure out whether that is really the reason.

Try 300x300 with the mogrify command.

I tried it and it works for my image.jpg. If you are using tesseract >=4, you can use the --dpi option of the tesseract executable.