opencv: Nearest neighbor interpolation does not give expected results

System information (version)
  • OpenCV => 2.4.9
  • Operating System / Platform => Ubuntu 16.04
  • Compiler => gcc 5.4.0
Detailed description

Nearest neighbor interpolation using cv2.resize does not give expected results.

Steps to reproduce
import cv2
import numpy as np

a = np.array([[0, 1, 2, 3, 4]], dtype=np.uint8)
print('nearest', cv2.resize(a, dsize=(3, 1), interpolation=cv2.INTER_NEAREST))

gives

nearest [[0 1 3]]

expected

nearest [[0 2 4]]

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 6
  • Comments: 22 (10 by maintainers)

Most upvoted comments

Just copy here from this thread:

Results of reading and resizing can be different in cv2 and Pillow. This creates problems when you want to reuse a model (neural network) trained using cv2 with Pillow.

import cv2
from PIL import Image

im = cv2.imread('_in.png', cv2.IMREAD_COLOR)
im = cv2.resize(im, (3, 3), interpolation=cv2.INTER_NEAREST)  # resize returns a new image
cv2.imwrite('_out.cv2.png', im)

im = Image.open('_in.png')
im = im.resize((3, 3), Image.NEAREST)
im.save('_out.pil.png')

Please, look at the sample:

[image: _in.png, the source gradient image]

This image with the uniform gradient (from 100% white to 100% black) allows us to find out which pixels are used by each library. This is the same image after resizing to (3, 3). Left is CV2, right is Pillow:

[images: _out.cv2.png (left), _out.pil.png (right)]

OpenCV uses the topmost-left white pixel from the source image, but the bottommost-right pixel of the result is too bright, i.e. it is not the darkest source pixel. Here are the maps of which source pixels are used by each library:

[images: _in.cv2.png and _in.pillow.png, source pixel maps for cv2 and Pillow]

It’s clear to me that there are some problems with rounding in OpenCV here.
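
A purely numeric way to build the same kind of pixel map (my own sketch, not from the original comment): resize an index row with each library and see which source columns survive. The commented values assume the floor-style mapping discussed in this thread for OpenCV and a post-2016 Pillow; they may differ with other versions.

import numpy as np
import cv2
from PIL import Image

# encode each source column by its own index
src = np.arange(10, dtype=np.uint8).reshape(1, 10)

out_cv = cv2.resize(src, (3, 1), interpolation=cv2.INTER_NEAREST)
out_pil = np.asarray(Image.fromarray(src).resize((3, 1), Image.NEAREST))

print('cv2 picks columns', out_cv.ravel())   # expected [0 3 6] with floor(dst_x * 10/3)
print('PIL picks columns', out_pil.ravel())  # expected [1 5 8] with a center-aligned mapping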

@sergiud, @alalek I just tried this on my machine (version 3.3.0 on a Mac via the Python bindings) and the behaviour is as described by @gerhardneuhold. He is quite right that your earlier example from July introduces a sub-pixel translation, and if this is the logic that is implemented in the code, then it needs to be fixed.

When resizing the image it’s not helpful to think in terms of pixel centers; you have to think in terms of image edges and pixel spans. Pixel 0 of the destination image spans from the left edge of source pixel 0 into pixel 1, and the middle pixel of the destination image spans from pixel 1 all the way into pixel 3, see the diagram below:

         0       1       2       3       4
  5: |---+---|---+---|---+---|---+---|---+---|
phy: |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
  3: |------+-----|------+------|------+-----|
            0'           1'             2'

So if we assume the first pixel center is at coordinate 0, then the expected source coordinates should be 1/3, 2 and 3 + 2/3. Remember that the source and destination images should span the same “physical region”.
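
In code form, that edge-aligned mapping is src_x = (dst_x + 0.5) * (src_w / dst_w) - 0.5; a minimal sketch (my own illustration) for the 5 -> 3 case:

import numpy as np

src_w, dst_w = 5, 3
dst_x = np.arange(dst_w)
src_x = (dst_x + 0.5) * (src_w / dst_w) - 0.5

print(src_x)                        # [0.333... 2.0 3.666...]
print(np.round(src_x).astype(int))  # [0 2 4] -> the expected nearest-neighbor picks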

What the current code seems to be doing instead is this:

            0       1       2       3       4
     5: |---+---|---+---|---+---|---+---|---+---|
   phy: |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
  3: |------+-----|------+------|------+-----|
            0'           1'             2'

So you are sampling from a slightly different “physical region”, introducing sub-pixel translation.

I don’t think those differences come from differences in rounding behaviour or anything like that; they come from the sub-pixel translation introduced by the resize method. The error is in your coordinate-computation math. And yes, resize should return 0, 2, 4 in the example above, not 0, 1, 3.
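
For contrast, here is a sketch (my assumption about what the current INTER_NEAREST path effectively computes) of the mapping that aligns the two grids at their left pixel centers and reproduces the reported output:

import numpy as np

src = np.array([0, 1, 2, 3, 4])
src_w, dst_w = 5, 3
src_x = np.floor(np.arange(dst_w) * (src_w / dst_w)).astype(int)

print(src_x)       # [0 1 3]
print(src[src_x])  # [0 1 3] -> the output reported in this issue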

User error. You aren’t passing the INTER value in the correct position. Please direct your usage questions to the forum or Stack Overflow.

@ppwwyyxx, there is a statement that

INTER_NEAREST_EXACT will produce same results as the nearest neighbor method in PIL, scikit-image or Matlab.

So the goal of https://github.com/opencv/opencv/pull/23634 is to resolve https://github.com/opencv/opencv/issues/22204. However, the results are not the same for all scales. For example:

Input                   Scale  OpenCV 4.x     PR #23634      Scikit-Image 0.20.0  PIL 9.4.0
[0 1 2 3 4 5 6 7 8 9]   0.3    [1 4 8]        [1 4 8]        [1 5 8]              [1 5 8]
[0 1 2 3 4 5]           5/6    [0 1 2 4 5]    [0 1 2 4 5]    [0 1 3 4 5]          [0 1 3 4 5]
[0 1 2 3 4 5 6 7]       0.75   [0 1 3 4 5 7]  [0 1 3 4 5 7]  [0 2 3 4 6 7]        [0 2 3 4 5 7]
[0 1 2 3 4 5]           0.5    [0 2 4]        [1 3 5]        [1 3 5]              [1 3 5]

But I believe that at least the case of 2x nearest-neighbor downsampling from even sizes should be deterministic. So it’s not about correctness but about portability.
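
A quick way to check one row of the table above (a sketch; the commented outputs are the values reported in the table and depend on the installed OpenCV and Pillow versions):

import numpy as np
import cv2
from PIL import Image

a = np.array([[0, 1, 2, 3, 4, 5]], dtype=np.uint8)

# the table reports [[0 2 4]] for OpenCV 4.x and [[1 3 5]] after PR #23634
print(cv2.resize(a, (3, 1), interpolation=cv2.INTER_NEAREST_EXACT))

# the table reports [[1 3 5]] for PIL 9.4.0
print(np.asarray(Image.fromarray(a).resize((3, 1), Image.NEAREST)))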

Possibly fixed by cv::InterpolationFlags::INTER_LINEAR_EXACT in #18053 .

Great - just for completeness, the related interpolation mode for this issue here is cv2.INTER_NEAREST_EXACT (instead of INTER_LINEAR_EXACT).
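
For reference, applying that mode to the original example from this issue should, as far as I understand its center-aligned mapping, give the expected answer (a sketch, not taken from the thread):

import numpy as np
import cv2

a = np.array([[0, 1, 2, 3, 4]], dtype=np.uint8)
# destination centers map to source coordinates 1/3, 2, 3 + 2/3 -> source pixels 0, 2, 4
print(cv2.resize(a, dsize=(3, 1), interpolation=cv2.INTER_NEAREST_EXACT))  # expected [[0 2 4]]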

Possibly fixed by cv::InterpolationFlags::INTER_NEAREST_EXACT (not INTER_LINEAR_EXACT) in #18053 .

Using warpAffine with a custom matrix that respects the pixel-coordinate convention would be a workaround. Notice that I use INTER_LINEAR below so you can see roughly where the sampling is done; INTER_LINEAR appears to quantize a little, since the values should be exact multiples of thirds.

The convention whether pixels are points or whether pixels are areas is a common debate in computer graphics. It’s the same issue in OpenGL, Direct3D, …

import numpy as np
import cv2 as cv

src = np.float32([[0, 1, 2, 3, 4]])
(srows, scols) = src.shape

(dcols, drows) = dsize = (3,1)

# some composition functions

def translate(tvec):
	H = np.eye(3)
	H[0,2] = tvec[0]
	H[1,2] = tvec[1]
	return H

def scale(svec):
	H = np.eye(3)
	H[0,0] = svec[0]
	H[1,1] = svec[1]
	return H

# don't need rotation, diy
# or use getRotationMatrix2D(center, angle, scale)

def project(H):
	# it's a projective space,
	# where (x,y) is represented by all (x,y,1)*w coordinates, i.e. a ray through (x,y,1)
	# these are homogeneous coordinates
	# project onto w=1
	M = H / H[2,2]
	
	# check that this isn't a perspective transform, affine only
	assert np.allclose(M[2,0:2], 0)
	# if you want perspective transforms, return the full 3x3 matrix and use warpPerspective

	return M[0:2, 0:3]

#                 0.0         1.0         2.0         3.0         4.0
#     5:     |-----+-----|-----+-----|-----+-----|-----+-----|-----+-----|
#   phy:     |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
#     3: |---------+---------|---------+---------|---------+---------|
#                  0'                  1'                  2'

# define dst->src transformation
# for sampling dest point value in src space
M = project(
	scale([scols/dcols, srows/drows])
)

dst1 = cv.warpAffine(src, M, dsize=dsize, flags=cv.WARP_INVERSE_MAP | cv.INTER_LINEAR)
print(dst1)
# => [[0.      1.65625 3.34375]]
dst1 = cv.warpAffine(src, M, dsize=dsize, flags=cv.WARP_INVERSE_MAP | cv.INTER_NEAREST)
print(dst1)
# => [[0. 2. 3.]]


#             0.0         1.0         2.0         3.0         4.0
#     5: |-----+-----|-----+-----|-----+-----|-----+-----|-----+-----|
#   phy: |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
#     3: |---------+---------|---------+---------|---------+---------|
#                  0'                  1'                  2'

# define dst->src transformation
# for sampling dest point value in src space
M = project(
	translate([-0.5, -0.5]) @
	scale([scols/dcols, srows/drows]) @
	translate([+0.5, +0.5])
)

dst2 = cv.warpAffine(src, M, dsize=dsize, flags=cv.WARP_INVERSE_MAP | cv.INTER_LINEAR)
print(dst2)
# => [[0.34375 2.      3.65625]]
dst2 = cv.warpAffine(src, M, dsize=dsize, flags=cv.WARP_INVERSE_MAP | cv.INTER_NEAREST)
print(dst2)
# => [[0. 2. 4.]]

The story about resize compatibility across the libraries is pretty old.

Basically this is not a story about compatibility, this is a story about wrong behavior.

Bit exact nearest neighbor interpolation.

What does “bit exact” mean here, if nearest-neighbor interpolation isn’t actually interpolation and doesn’t operate on pixel values? It should be “Correct nearest neighbor interpolation”.

This will produce same results as the nearest neighbor method in PIL, scikit-image or Matlab.

It will not produce the same results as PIL, since PIL originally had the same issue and it was fixed in Pillow (https://github.com/python-pillow/Pillow/pull/2022), which is a different library.

There is no exact formula in the documentation (and probably never will be for the general case, for performance reasons and because of rounding issues). The existing resize tests pass. So this issue looks like a question about the resize formula used in OpenCV.

How did you come up with the expected result?

The downsampling factor is 3/5 (in horizontal direction) meaning destination columns 0, 1, 2 are mapped to source columns with indices 0, 1 * 5/3 and 2 * 5/3 (i.e., 0, 1.67 and 3.33). After rounding half up (probably what you expect), the values of the corresponding columns are 0, 2, 3.

Either way, it seems OpenCV rounds towards zero instead of rounding half away from zero (not verified).
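
To make the two roundings concrete, a small sketch (my own illustration) for the 5 -> 3 case:

import numpy as np

src = np.array([0, 1, 2, 3, 4])
coords = np.arange(3) * (5 / 3)                 # [0.0, 1.667, 3.333]

print(src[np.floor(coords).astype(int)])        # [0 1 3] -> truncation (round toward zero), matches OpenCV's output
print(src[np.floor(coords + 0.5).astype(int)])  # [0 2 3] -> rounding half up, as described above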