imageio: volread on tifffile produces the wrong shape
This is a regression from 2.9.0 first shared by @mkcor in https://github.com/scikit-image/scikit-image/pull/5262.
Reproducing example:
import imageio as iio
img = iio.volread('http://cmci.embl.de/sampleimages/NPCsingleNucleus.tif')
img.shape
# in v2.10.3 (30, 180, 183)
# in v2.9.0 (15, 2, 180, 183)
It traces back to something that, at least to me, is unexpected behavior in v2.9.0. Calling imageio.imread on the above TIFF (in either version of ImageIO) returns a single page of the file, not a single image. I.e.,
import imageio as iio
img = iio.imread('http://cmci.embl.de/sampleimages/NPCsingleNucleus.tif')
img.shape # (180, 183) not the expected (2, 180, 183)
Consequently, iterating over single images and stacking them doesn’t yield a stack of channel-first images, but a stack of pages:
import numpy as np
reader = iio.get_reader('http://cmci.embl.de/sampleimages/NPCsingleNucleus.tif', mode="i")
img = np.stack([reader.get_data(index=x) for x in range(reader.get_length())])
img.shape # (30, 180, 183)
Reading a volume, on the other hand, does the expected thing in v2.9.0 and produces a stack of channel-first images:
img = iio.volread('http://cmci.embl.de/sampleimages/NPCsingleNucleus.tif')
img.shape # (15, 2, 180, 183)
In v2.10.3, however, volread stacks pages instead:
img = iio.volread('http://cmci.embl.de/sampleimages/NPCsingleNucleus.tif')
img.shape # (30, 180, 183)
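To make the relationship between the two shapes concrete, here is a minimal NumPy sketch. It uses synthetic data rather than the real file, and it assumes that pages are stored channel-fastest (page k holds channel k % 2 of image k // 2). That ordering is an assumption for illustration, not something confirmed in this thread.

```python
import numpy as np

# Synthetic stand-in for the 30 pages of the file (made-up data, not the real image).
n_images, n_channels, height, width = 15, 2, 180, 183
pages = np.arange(
    n_images * n_channels * height * width, dtype=np.uint32
).reshape(n_images * n_channels, height, width)

# Assumption: channel-fastest page order, so a plain reshape regroups
# consecutive pages into channel-first images.
volume = pages.reshape(n_images, n_channels, height, width)

print(pages.shape)   # (30, 180, 183), the page stack
print(volume.shape)  # (15, 2, 180, 183), the stack of channel-first images
```

Under that assumption, volume[i, c] is the same data as pages[i * n_channels + c], so no pixel data is reordered; only the grouping changes.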
@cgohlke @almarklein @mkcor (and others who use TIFF more than me) What is expected behavior here? Would you expect imread to return a single image (shape: (2, 180, 183)) or a single page (shape: (180, 183))? I am leaning towards a single image, but I am open to comments here.
Depending on the answer, I would look into a bugfix for either get_data(index=...) (any version) or volread(...) (v2.10.3).
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 21 (14 by maintainers)
@GenevieveBuckley The fix should roll out to PyPI tonight.
Indeed. I also just found the relevant section of code in our vendored tifffile: it literally just parses the metadata string and sets the shape according to the channel and frame entries... https://github.com/imageio/imageio/blob/master/imageio/plugins/_tifffile.py#L2172-L2252
On the bright side though, it makes the fix efficient, because we don’t have to parse the full file to figure out the image’s shape. All that is needed is to read the first page, and we can learn how many pages to read for an imread(index=N) call. 🤣 lol.
Makes you wonder how @cgohlke figured out how to correctly read ImageJ hyperstacks in the first place. Big props.
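As a rough illustration of the idea described above (read the metadata string once, then work out which pages belong to image N), here is a small sketch. It is not the vendored tifffile code. The helper names, the key names channels and frames, and the example string are all assumptions made up for this example.

```python
def parse_description(description: str) -> dict:
    """Parse 'key=value' lines into a dict (hypothetical helper)."""
    entries = {}
    for line in description.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that carry no '='
            entries[key.strip()] = value.strip()
    return entries


def page_range(entries: dict, index: int) -> range:
    """Pages that make up image `index`.

    Assumption: only the channel count splits one image across pages,
    and a missing entry means one page per image.
    """
    pages_per_image = int(entries.get("channels", 1))
    start = index * pages_per_image
    return range(start, start + pages_per_image)


# Made-up metadata string chosen to match the shapes in this thread.
example = "ImageJ=1.47\nimages=30\nchannels=2\nframes=15\nhyperstack=true"
entries = parse_description(example)
print(list(page_range(entries, 3)))  # [6, 7]
```

With two channels, an imread(index=3) call would only need pages 6 and 7, which is why reading the first page's metadata is enough to avoid scanning the whole file.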
I just ran into this as well, thanks for fixing it
Hi @FirefoxMetzger,
Yes, absolutely! Thank you, @kmilos. This sentence I had to read a couple times:
Visible hyphenation helps understanding as you read (rather than afterwards):
🙏
Didn’t you mean “on 2 pages instead of 3 pages?” We want to access either channel when manipulating the image. Ok, I guess you meant (30, 180, 183) or (15, 2, 180, 183) as opposed to (30, 180, 183, 3) or (15, 2, 180, 183, 3).
Huh, my bad. I somehow thought it was the actual TIFF spec rather than a tech note on subIFDs. That’s a pretty cool feature actually that I should start using. The more you know 💯
On that note, @almarklein I think it is time to deprecate ImageIO’s vendored tifffile library. It doesn’t support subIFDs, but the actual library has since added support for them. We prefer a pip-installed tifffile over our vendored one, but I think we would make our life easier if we simply installed the latest tifffile from PyPI instead, e.g., via something like pip install imageio[tifffile].
I agree, there is definitely a better way to do this, assuming it actually does rely on metadata. It is the way ImageJ does it though, which makes it standard in medical image analysis … by virtue of popularity >.< … and important enough for us to support since medical imaging is one of the areas relying on ImageIO.
On a more general note:
I removed my installation of tifffile and switched to our vendored version (no SubIFD support). I then tested the reading again and iio.volread still produces the desired shape of (15, 2, 180, 183). From this, I conclude that the image, unfortunately, isn’t using SubIFDs to organize the data. More evidence towards it relying on the custom metadata string to format the data…
@kmilos Do you have any other ideas on where the file might store relevant information for stacking, or happen to know how ImageJ solves this? (please don’t say “well, they use the metadata string you already found” xD)
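A more direct way to check the SubIFD question, independent of which tifffile is installed, is to look at the tags of the first IFD. The sketch below is a standard-library-only illustration, not how ImageIO or tifffile do it. It handles only classic TIFF (magic number 42, not BigTIFF), and the helper name first_ifd_tags is made up. The tag numbers are from the TIFF documents: 270 is ImageDescription and 330 is SubIFDs.

```python
import struct

IMAGE_DESCRIPTION_TAG = 270  # where the metadata string lives
SUBIFDS_TAG = 330            # present only if the page uses SubIFDs


def first_ifd_tags(data: bytes) -> set:
    """Return the tag numbers found in the first IFD of a classic TIFF."""
    byte_order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if byte_order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(byte_order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled here")
    (n_entries,) = struct.unpack_from(byte_order + "H", data, ifd_offset)
    tags = set()
    for i in range(n_entries):
        # Each IFD entry is 12 bytes; the tag number is its first 2 bytes.
        entry_offset = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack_from(byte_order + "H", data, entry_offset)
        tags.add(tag)
    return tags


# Tiny hand-built header with two entries (ImageWidth and ImageDescription),
# used only to exercise the function; it is not a complete, readable image.
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 2)
    + struct.pack("<HHII", 256, 3, 1, 183)
    + struct.pack("<HHII", 270, 2, 4, 0)
    + struct.pack("<I", 0)
)
tags = first_ifd_tags(sample)
print(SUBIFDS_TAG in tags, IMAGE_DESCRIPTION_TAG in tags)  # False True
```

For a real file, passing the bytes read from disk to first_ifd_tags would show whether tag 330 is present on the first page. If it is absent and tag 270 is present, that would be consistent with the conclusion above.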