dlib: Facial landmark detection failing when RGBA is converted to RGB
Hi, I understand that the facial landmark detection fails if I give it an RGBA image, as it expects either a greyscale or an RGB image. But when I try removing the alpha channel, it still fails.
import os
from shutil import copyfileobj

# urlopen lives in urllib2 on Python 2 and in urllib.request on Python 3
try:
    from urllib2 import urlopen
except ImportError:
    from urllib.request import urlopen

import dlib
from skimage.io import imread


def download(url, filename, overwrite=False):
    # Fetch the file only if it is missing or an overwrite is requested
    if not os.path.exists(filename) or overwrite:
        response = urlopen(url)
        with open(filename, 'wb') as out_file:
            copyfileobj(response, out_file)


download('https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Beatrix_Podolska_pedagog_muzykolog_Krakow_2008.xcf/720px-Beatrix_Podolska_pedagog_muzykolog_Krakow_2008.xcf.png',
         'Beatrix.png')

im = imread('Beatrix.png')
detector = dlib.get_frontal_face_detector()
print('Shape of image data', im.shape)
# Drop the alpha channel, keeping only R, G and B
im = im[:, :, :3]
print('Shape of image data after processing', im.shape)
print(detector.run(im))
Here is an example with a downloaded PNG: it initially has the shape (1024, 720, 4), and after removing the alpha channel it has the shape (1024, 720, 3).
But it still gives:
Shape of image data (1024, 720, 4)
Shape of image data after processing (1024, 720, 3)
Traceback (most recent call last):
  File "beatrix.py", line 32, in <module>
    print(detector.run(im))
RuntimeError: Unsupported image type, must be 8bit gray or RGB image.
EDIT: Note that this works perfectly fine if I give it a PNG with RGB only, or even a JPG.
About this issue
- State: closed
- Created 8 years ago
- Comments: 18 (18 by maintainers)
Commits related to this issue
- Fixes #128. Added support to discontiguous Numpy arrays — committed to mljli/dlib by mljli 8 years ago
Got bored of preparing for the exam. LOL
I doubt that. It seems that Dlib simply passes a numpy_rgb_image, which holds a pointer to the underlying buffer of a Py_buffer object, to an object_detector, and I didn't see any copying until the FHOG features are computed. From what I understand, if an array is non-contiguous and shares the buffer with its 'base' array, Dlib can't handle it. The image_view type requires C-contiguous buffers to correctly find the offset of a pixel.
The solution is obvious: copy the array to a contiguous buffer if necessary. We can use either PyBuffer_ToContiguous or PyBuffer_GetPointer.
Things should become more complex if advanced indexing is used. But thanks to Numpy's optimization, if a view has discontiguous columns (like array B below), it becomes Fortran-style contiguous. And if a view has both discontiguous rows and columns (like array C below), a new contiguous buffer will be allocated for it. However, I have not verified that this conclusion always holds.
To conclude, the array passed in is either contiguous (C or Fortran) or has the same strides as its 'base' array. So theoretically we can handle any array with the help of strides. The remaining problem is the way Dlib views an image buffer. For example, rgb_pixel assumes the RGB channels of a pixel to be contiguous, which obviously may not be true. So for multi-channel images with a discontiguous third dimension, do we have no choice but to copy the buffer? Correct me if I am wrong.
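To make the stride argument concrete, here is a minimal NumPy-only sketch. It does not call dlib. The array is synthetic and only mirrors the shape from the report, and pixel_offset is a hypothetical helper written for illustration, not anything from dlib. It shows that slicing off the alpha channel yields a view that shares the RGBA buffer and keeps its strides, so an offset computed under a packed-RGB assumption points at the wrong byte, and that np.ascontiguousarray produces a packed copy.

```python
import numpy as np

# Synthetic stand-in for the RGBA image in the report (same shape and dtype).
rng = np.random.default_rng(0)
rgba = rng.integers(0, 256, size=(1024, 720, 4), dtype=np.uint8)

# Basic slicing returns a view onto the same buffer, not a copy.
rgb_view = rgba[:, :, :3]
print(np.shares_memory(rgb_view, rgba))    # True
print(rgb_view.flags['C_CONTIGUOUS'])      # False
print(rgba.strides, rgb_view.strides)      # (2880, 4, 1) (2880, 4, 1)


def pixel_offset(arr, row, col, channel):
    """Byte offset of one sample, computed from the array's real strides.

    Hypothetical helper for illustration only.
    """
    s0, s1, s2 = arr.strides
    return row * s0 + col * s1 + channel * s2


# Offset a consumer would compute if it assumed tightly packed 3-byte pixels.
rows, cols, channels = rgb_view.shape
packed_offset = (1 * cols + 0) * channels + 0          # pixel (1, 0), channel 0
print(packed_offset, pixel_offset(rgb_view, 1, 0, 0))  # 2160 2880

# Copying into a C-contiguous buffer makes the packed assumption valid.
rgb = np.ascontiguousarray(rgb_view)
print(rgb.flags['C_CONTIGUOUS'], rgb.strides)          # True (2160, 3, 1)
print(np.array_equal(rgb, rgb_view))                   # True
```

On a dlib build without the fix, passing a copy like rgb instead of the view to detector.run(...) should sidestep the error, assuming contiguity is indeed the cause as argued above. I have not run that against dlib here.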