rio-tiler: Bug in new `calculate_default_transform` from terracotta

While the new calculate_default_transform from Terracotta seemed to help with performance, I ran into a bug when dealing with a scene that crosses the dateline.

The problem happens here: https://github.com/cogeotiff/rio-tiler/blob/06504ec8321d185602a5a91bc6b5415bdb94915c/rio_tiler/utils.py#L232-L241

which returns a negative width and height:

Tile
x=17
y=21
z=6

vrt_transform
|-386.76, 0.00,-9392582.04|
| 0.00, 6.31, 6887893.49|
| 0.00, 0.00, 1.00|

vrt_width=-1619
vrt_height=-99186
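
For illustration only (this is not rio-tiler's actual code path), a minimal sketch of how a bounds-to-size computation can flip sign for a dateline-crossing scene: once the reprojected east edge wraps to a coordinate smaller than the west edge, any width or resolution derived from that span comes out negative.

# Hypothetical values, not taken from the dataset above; they only show the sign flip.
west, east = 170.0, -175.0        # scene wraps across the dateline
south, north = -40.0, -35.0
xres, yres = 0.01, 0.01

width = round((east - west) / xres)     # (-175 - 170) / 0.01 -> -34500
height = round((north - south) / yres)  # 5 / 0.01 -> 500

print(width, height)  # negative width, analogous to vrt_width=-1619 above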

dataset link: https://www.dropbox.com/s/jyskcxqnbq3240c/ask_cog.tif?dl=0

Not sure if it means anything, but here are the results from both rasterio's and terracotta's calculate_default_transform for a random tile.

# rasterio
| 238.34, 0.00,-20037508.08|
| 0.00,-238.34, 13596067.67|
| 0.00, 0.00, 1.00|

# terracotta
|-386.54, 0.00,-20028306.76|
| 0.00, 6.31, 13596067.67|
| 0.00, 0.00, 1.00|
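
For anyone who wants to reproduce rasterio's suggested transform for the whole dataset (not the per-tile numbers above), a minimal sketch, assuming the shared file is saved locally as ask_cog.tif:

import rasterio
from rasterio.crs import CRS
from rasterio.warp import calculate_default_transform

# The local filename is an assumption; point it at the downloaded COG.
with rasterio.open("ask_cog.tif") as src:
    dst_transform, width, height = calculate_default_transform(
        src.crs, CRS.from_epsg(3857), src.width, src.height, *src.bounds
    )
    print(dst_transform)   # affine matrix in the same format as above
    print(width, height)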

@dionhaefner, could you test terracotta with the dataset I shared, so I can confirm this is only happening in rio-tiler?


Most upvoted comments

I think I understand the problem now.

It is relatively easy to construct a transform that yields higher quality reprojections than GDAL’s default transform. Just oversample the image a bit to make sure you don’t lose any data.

This changes, though, when overviews are involved. If you oversample the image, overviews are triggered earlier, leading to an overall degraded quality. Consider a situation where we duplicate every pixel at the VRT level:

112233
112233
445566
445566

If we then zoom out by a factor of 2, we would ideally see

123
456

But by then, the first overview is triggered already! So we actually see something like

13
46

As you noticed, trying to force GDAL to use a specific overview is a can of worms that I can’t open at this point.

I suggest the following workaround:

import rasterio.warp
from rasterio import transform

# GDAL's default transform for the whole dataset (src, target_crs from context)
dst_transform, _, _ = rasterio.warp.calculate_default_transform(
    src.crs, target_crs, src.width, src.height, *src.bounds
)

# Transform of the requested tile (tile_bounds, tile_size from context)
tile_transform = transform.from_bounds(*tile_bounds, *tile_size)

# Pick the finer of the two resolutions for each axis
dst_res = (
    min(abs(dst_transform.a), abs(tile_transform.a)),
    min(abs(dst_transform.e), abs(tile_transform.e)),
)

At low zoom levels, this uses the default transform, where it performs reasonably well. At high zoom levels, where we’re at the pixel and sub-pixel level, it makes sure that the VRT always oversamples the tile, so we always get down to the true data density if we zoom far enough.
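
For context, a minimal sketch of how a resolution like dst_res can be turned into a VRT transform and size (assumed variable names, not rio-tiler's actual implementation, and it does not handle the dateline edge case discussed here):

from affine import Affine
from rasterio.vrt import WarpedVRT
from rasterio.warp import transform_bounds

# Reproject the dataset bounds into the target CRS (src, target_crs, dst_res
# carry over from the snippet above).
west, south, east, north = transform_bounds(src.crs, target_crs, *src.bounds)

# North-up VRT transform at the chosen resolution.
vrt_transform = Affine(dst_res[0], 0.0, west, 0.0, -dst_res[1], north)
vrt_width = max(1, round((east - west) / dst_res[0]))
vrt_height = max(1, round((north - south) / dst_res[1]))

with WarpedVRT(
    src,
    crs=target_crs,
    transform=vrt_transform,
    width=vrt_width,
    height=vrt_height,
) as vrt:
    data = vrt.read()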

Comparison

[image: ask-comparison]

The performance difference is not above the noise level.

Yikes, this image is pretty much the perfect storm for my method:

[image: Screen Shot 2020-03-20 at 14 50 08]
  • It is rotated so much that the top right corner is further west than the bottom left corner
  • Only one corner of the raster crosses the date line
  • Top left and bottom right corner of the image lie almost on the same latitude
  • Bottom left and top right corner lie almost on the same longitude

In cases like these, we cannot possibly do better than GDAL’s default transform. So even after I fix the bugs and catch these edge-cases, the result will not be better than the default.

I’ll have to come up with a better way to solve this; in the meantime, I suggest you revert to the default transform.

@dionhaefner, to be honest I’m not sure I follow everything, but I think you are telling me to switch back to the original way of calculating the vrt_transform and size (https://github.com/cogeotiff/rio-tiler/blob/6b0d4df0b6aa1454c50312e8d352ed57f0a4e3cb/rio_tiler/utils.py#L269-L310)?

Yes, it seems like GDAL’s default transform does more than I thought. But you can also see in the figures above that it leaves quite a bit of resolution on the table, which becomes evident at higher zoom levels:

New

[image: new-tile-corr-hizoom]

GDAL

[image: old-tile-hizoom]

What would you think about the following workaround:

  • Detect whether dataset crosses the dateline
  • Compute the default transform with fake bounds that are shifted so the dataset doesn’t cross the dateline anymore

This only works for target CRSs whose resolution doesn’t depend on longitude, like latlon or (web) Mercator; a rough sketch of the idea follows below.

Would you consider that too hacky?
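
For illustration only (not the actual terracotta or rio-tiler code), here is a rough sketch of that idea, assuming geographic source bounds in degrees; only the resolution of the returned transform would be reused, since its origin ends up shifted:

from rasterio.warp import calculate_default_transform

def dateline_safe_default_transform(src_crs, dst_crs, width, height, bounds):
    # Sketch: unwrap dateline-crossing bounds before computing the transform.
    west, south, east, north = bounds

    # A raster wrapping across the antimeridian has its east edge "west" of
    # its west edge; shift the east edge by 360 degrees to unwrap it.
    if east < west:
        east += 360.0

    # Valid only when the target resolution (latlon, web Mercator) does not
    # depend on longitude; the returned transform's origin is shifted and
    # would need correcting before any other use.
    return calculate_default_transform(
        src_crs, dst_crs, width, height, west, south, east, north
    )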

Doesn’t work, but I’m close to a fix.

Oops, sorry! We actually have an open issue for datasets spanning the date line, since we never tested it (and I suspected it wouldn’t work):

https://github.com/DHI-GRAS/terracotta/issues/53

I’ll see if I can fix it, thanks for the test data.