astropy: broken FITS table round-trip with masked Tables

When the fill_value of a masked Table Column is equal to an unmasked value, the round-trip to a FITS format file is broken [EDIT taldcroft example changed]. Note that #7481 in astropy 3.1 provides a work-around by specifying a serialize_method, but does not fix the default behavior.

In [23]: from astropy.table.table_helpers import simple_table
In [24]: t = simple_table(masked=True)
In [25]: t['a'].fill_value = 2
In [26]: t.write('junk.fits', overwrite=True)
In [27]: t2 = Table.read('junk.fits')

In [28]: t
Out[28]: 
<Table masked=True length=3>
  a      b     c  
int64 float64 str1
----- ------- ----
   --     1.0    c
    2     2.0   --
    3      --    e

In [29]: t2
Out[29]: 
<Table masked=True length=3>
  a      b      c   
int64 float64 bytes1
----- ------- ------
   --     1.0      c
   --     2.0      N
    3     nan      e

This is because the Table stores the fill_value in the TNULLn keywords to flag masked values, so on read every element equal to that value is masked, even if it wasn’t originally masked. [EDIT taldcroft] In addition, handling of floating point masking via NaN is broken, as is string-type masking.
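The collision can be reproduced without FITS at all. The following is a toy model using plain numpy masked arrays, not astropy's actual writer code; the helper names `write_with_sentinel` and `read_with_sentinel` are made up for illustration. It only assumes what the paragraph above describes: masked slots are written out as fill_value, and fill_value is recorded as the null sentinel.

```python
import numpy as np


def write_with_sentinel(col):
    """Toy 'writer': fill masked slots with fill_value and record that
    same value as the null sentinel (the role TNULLn plays in FITS)."""
    return col.filled(), col.fill_value


def read_with_sentinel(data, null):
    """Toy 'reader': mask every element equal to the sentinel."""
    return np.ma.MaskedArray(data, mask=(data == null))


# Same shape as column 'a' in the example: first element masked,
# and fill_value set to 2, which is also a valid unmasked value.
orig = np.ma.MaskedArray([1, 2, 3], mask=[True, False, False], fill_value=2)

stored, null = write_with_sentinel(orig)
back = read_with_sentinel(stored, null)

print("original mask:  ", orig.mask.tolist())
print("round-trip mask:", back.mask.tolist())
```

The round-trip mask gains an extra masked element wherever the data happened to equal fill_value, which matches the extra `--` in column a of t2.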

To fix this, it seems that the Table needs to separate the concept of which values are masked (using TNULLn, following the FITS standard) from what the fill_value should be for those masked elements. AFAIK, the FITS standard doesn’t define how to store a fill_value, but TFILLn would seem reasonable.
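A minimal sketch of that separation for an integer column, again with plain numpy rather than astropy internals. Everything here is an assumption for illustration: `pick_null`, `encode` and `decode` are hypothetical helpers, the keyword dictionary stands in for a FITS header, and TFILL is the non-standard keyword suggested above, not something the FITS standard or astropy defines.

```python
import numpy as np


def pick_null(col):
    """Pick an integer sentinel that does not occur among the unmasked values."""
    valid = set(col.compressed().tolist())
    candidate = int(np.iinfo(col.dtype).min)
    while candidate in valid:
        candidate += 1
    return candidate


def encode(col):
    """Write masked slots as the sentinel; keep fill_value separately."""
    null = pick_null(col)
    data = col.filled(null)
    # "TNULL" marks masked elements; "TFILL" is a hypothetical keyword.
    return data, {"TNULL": null, "TFILL": col.fill_value.item()}


def decode(data, keywords):
    """Rebuild the mask from the sentinel and restore fill_value."""
    return np.ma.MaskedArray(
        data,
        mask=(data == keywords["TNULL"]),
        fill_value=keywords["TFILL"],
    )


orig = np.ma.MaskedArray(
    [1, 2, 3], mask=[True, False, False], fill_value=2, dtype=np.int64
)
data, keywords = encode(orig)
back = decode(data, keywords)

print(back.mask.tolist(), back.fill_value)
```

Because the sentinel is chosen to avoid every unmasked value, the mask survives even when fill_value collides with real data, and fill_value itself is carried independently.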

The current behavior is especially problematic when coupled with issue #4707 where filtering masked tables resets the fill_value to 1 – a common number that can result in any other 1 getting masked after round tripping to a FITS file.

About this issue

  • State: open
  • Created 8 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

I’m not sure if this is related to this issue (maybe the same underlying problem?), if it’s not I can open a separate issue but here goes:

When round tripping a binary table, and adding a row while the data is in its Astropy Table incarnation, the TNULL value actually gets changed to 999999. This only seems to happen when a row is added, not when editing individual values in the table or when new columns are added. For example, fitsdiff afterwards shows:

Keyword TNULL19  has different values:
        a> -2147483647
        b> 999999
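One possible reading, not confirmed anywhere in this thread: 999999 is numpy's default fill_value for integer masked arrays, so the diff above would be consistent with the column's fill_value being reset to the default when the row is added. The snippet below only checks the numpy default; it does not exercise astropy's add_row.

```python
import numpy as np

# Default fill value numpy assigns to integer masked arrays.
int_default = np.ma.default_fill_value(np.dtype("int32"))

# A freshly built masked array with no explicit fill_value reports the same default.
fresh = np.ma.MaskedArray([1, 2, 3], mask=[True, False, False], dtype=np.int32)

print(int_default, fresh.fill_value)
```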

To add to this issue, even without the corner case of the int NULL value overlapping with a valid data value, it appears that masked table support is fairly broken. It does not seem to round trip for float or string columns. I spent a fair bit of time on #7481 to provide an optional work-around for these problems, but somebody should work on the core FITS support for missing data. I think this is a FITS issue, not Table. 😄.

In [1]: astro
astropy=3.0.1

In [2]: t = simple_table(masked=True)

In [3]: t
Out[3]: 
<Table masked=True length=3>
  a      b     c  
int64 float64 str1
----- ------- ----
   --     1.0    c
    2     2.0   --
    3      --    e

In [4]: t.write('junk.fits', overwrite=True)

In [5]: t2 = Table.read('junk.fits')

In [6]: t2
Out[6]: 
<Table masked=True length=3>
  a      b      c   
int64 float64 bytes1
----- ------- ------
   --     1.0      c
    2     2.0      N
    3     nan      e

In [7]: t2['b'].mask
Out[7]: array([False, False, False], dtype=bool)

In [8]: t2['c'].mask
Out[8]: array([False, False, False], dtype=bool)