invest: raster masking gives unexpected output if nodata is nan

There have been a few user forum posts recently that have caused issues because the users input raster had a nodata value of nan. Should checking / warning of nan nodata values be a part of UI validation? Or a part of model validation? Or both?

The recent forum post where this came up: https://community.naturalcapitalproject.org/t/no-data-value-ndr-model/2345/7

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 21 (21 by maintainers)

Commits related to this issue

Most upvoted comments

>>> numpy.isclose(numpy.nan, numpy.nan, equal_nan=True)
True
>>> numpy.isclose(numpy.nan, numpy.nan)
False

this might complicate things 😕

Dang, the nan case is an interesting one.

I wonder if it’s worth coming back to an idea that @emlys had some time ago: would it be worth having a utils function to compare nodata values? We now have a couple interesting reasons why such a function might be worth it:

  1. This nan case, requiring that we remember to pass an extra parameter to numpy.isclose every time we use it in the raster_calculator context and we have a user-defined nodata value
  2. Integer value comparison (a != b) is noticeably faster in numpy than numpy.isclose(a, b), but we can only do integer comparison on integer datasets and sometimes users provide a floating-point LULC so it’s a gamble to rely on integer equality.
  3. It would make testing the function a whole lot more direct, while still leaving open the option to include nan comparison in hypothesis-based testing later on

A centralized implementation of this might be as simple as

def this_is_a_bad_name_but_lets_check_nodata(array, nodata):
    # comparing an integer array against numpy.nan works correctly
    if numpy.issubdtype(array.dtype, numpy.integer):
        return array != nodata
    return numpy.isclose(array, nodata, equal_nan=True)

We have one other example of a code snippet we use basically everywhere in utils.make_suffix_string, so maybe this would make sense for us? I’m curious what everyone thinks.