tensorflow: tf.image.decode_png doesn't work for palette-based images
System information
- Have I written custom code
- Linux Ubuntu 18.04
- TensorFlow installed from source
- TensorFlow version: v1.12.0-0-ga6d8ffae09 1.12.0
- Python version: 3.6.7
- CUDA/cuDNN version: V10.0.130
- GPU model and memory: RTX2080 Ti
Describe the current behavior
Pixel values differ depending on whether the image is loaded by PIL or by TF.
Describe the expected behavior
Pixel values should be the same regardless of whether the image is loaded by PIL or by TF.
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
```python
from PIL import Image
import numpy as np
import tensorflow as tf

tf.enable_eager_execution()

PATH = '/tmp/42313738-65c10f7c-807e-11e8-8f11-9db821e3c3cc.png'

# PIL keeps the raw palette indices of a 'P'-mode PNG.
im = Image.open(PATH)
ar = np.asarray(im)
pil_max = np.max(ar)
print(pil_max)

# TF decodes the PNG into color channels instead.
im = tf.gfile.FastGFile(PATH, 'rb').read()
ar = tf.image.decode_png(im, channels=1)
tf_max = tf.reduce_max(ar)
print(tf_max)

assert tf_max == pil_max
```
image: here
Other info / logs
I suspect the problem is that TensorFlow loads the first RGB channel (the red channel) instead of the palette indices for palette-based PNG images like the given example.
related to #20028
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 17 (8 by maintainers)
You can do something similar using:
This is not a Build/Installation or Bug/Performance issue. Please post this kind of support question on Stack Overflow, where there is a big community to support and learn from your questions. GitHub is mainly for addressing bugs in installation and performance. Thanks!
Sorry for the confusion. The main reason for this discrepancy is that PIL opens such images in palette mode: the colors are mapped to a color palette, and each position in the image holds a palette index. TensorFlow, however, decodes the image into channels and has no concept of a palette. So when the Pillow Image object is converted to a numpy array, the value at each position is a palette index, not the pixel intensity of a color channel, which is what TensorFlow produces.
The correct way to verify this is:
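A like-for-like check is to have PIL expand the palette to actual colors before comparing with what TF decodes. This is a sketch under an assumption: the tiny in-memory palette image below stands in for the original file, which is not available here.

```python
from PIL import Image
import numpy as np

# Build a small palette-based image (stand-in for the original PNG).
im = Image.new('P', (4, 4))
# Palette: index 0 -> red, 1 -> green, 2 -> blue, rest black.
im.putpalette([255, 0, 0, 0, 255, 0, 0, 0, 255] + [0] * (256 * 3 - 9))
im.putdata([0, 1, 2, 0] * 4)

palette_indices = np.asarray(im)         # raw palette indices (what PIL returns)
rgb = np.asarray(im.convert('RGB'))      # expanded colors (comparable to TF's output)

print(palette_indices.max())  # → 2   (index values)
print(rgb.max())              # → 255 (color intensities)
```

The two maxima disagree by design: one array holds indices into the palette, the other holds the colors those indices point to.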
As for the weighted-average point: when you load a color image in channel mode and request grayscale (`channels=1` does that), a weighted average over the channels (R, G, B) is computed to produce the single-channel value. (That is why a grayscale image has 1 channel while an RGB image has 3.)

A clean solution would be to re-implement a custom op that decodes a PNG without palette conversion.
Currently, the conversion is done at the core level.
If you are on TF 1.x, you can wrap the PIL call with `py_func` to get the desired behavior, and then build your pipeline on top of it. Note: in TF 1.x this works only in graph mode. In TF 2.x a similar trick should be possible with `tf.numpy_function` or `tf.py_function`.
Thanks for the response, capitan-pool. So the whole point is that I actually only care about those palette indices, not about the RGB values. The palette index values are the target class IDs for semantic segmentation.