astropy: No longer able to read BinnedTimeSeries with datetime column saved as ECSV after upgrading from 4.2.1 -> 5.0+

Description

Hi, This commit merged in PR #11569 breaks my ability to read an ECSV file created using Astropy v 4.2.1, BinnedTimeSeries class’s write method, which has a datetime64 column. Downgrading astropy back to 4.2.1 fixes the issue because the strict type checking in line 177 of ecsv.py is not there.

Is there a reason why this strict type checking was added to ECSV? Is there a way to preserve reading and writing of ECSV files created with BinnedTimeSeries across versions? I am happy to make a PR on this if the strict type checking is allowed to be scaled back or we can add datetime64 as an allowed type.

Expected behavior

The file is read into a BinnedTimeSeries object from ecsv file without error.

Actual behavior

ValueError is produced and the file is not read because ECSV.py does not accept the datetime64 column. ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')

Steps to Reproduce

The file is read using:
BinnedTimeSeries.read('<file_path>', format='ascii.ecsv') which gives a long error.

The file in question is a binned time series created by astropy.timeseries.aggregate_downsample. which itself is a binned version of an astropy.timeseries.TimeSeries instance with some TESS data. (loaded via TimeSeries.from_pandas(Tess.set_index(‘datetime’)). I.e., it has a datetime64 index. The file was written using the classes own .write method in Astropy V4.2.1 from an instance of said class:
myBinnedTimeSeries.write('<file_path>',format='ascii.ecsv',overwrite=True)

I’ll attach a concatenated version of the file (as it contains private data). However, the relevant part from the header is on line 4:

# %ECSV 0.9
# ---
# datatype:
# - {name: time_bin_start, datatype: datetime64}

as you can see, the datatype is datetime64. This works fine with ECSV V0.9 but not V1.0 as some sort of strict type checking was added.

Full error log:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [3], in <module>
---> 49 tsrbin = BinnedTimeSeries.read('../Photometry/tsr_bin.dat', format='ascii.ecsv')

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/timeseries/binned.py:285, in BinnedTimeSeries.read(self, filename, time_bin_start_column, time_bin_end_column, time_bin_size_column, time_bin_size_unit, time_format, time_scale, format, *args, **kwargs)
    230 """
    231 Read and parse a file and returns a `astropy.timeseries.BinnedTimeSeries`.
    232 
   (...)
    279 
    280 """
    282 try:
    283 
    284     # First we try the readers defined for the BinnedTimeSeries class
--> 285     return super().read(filename, format=format, *args, **kwargs)
    287 except TypeError:
    288 
    289     # Otherwise we fall back to the default Table readers
    291     if time_bin_start_column is None:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/table/connect.py:62, in TableRead.__call__(self, *args, **kwargs)
     59 units = kwargs.pop('units', None)
     60 descriptions = kwargs.pop('descriptions', None)
---> 62 out = self.registry.read(cls, *args, **kwargs)
     64 # For some readers (e.g., ascii.ecsv), the returned `out` class is not
     65 # guaranteed to be the same as the desired output `cls`.  If so,
     66 # try coercing to desired class without copying (io.registry.read
     67 # would normally do a copy).  The normal case here is swapping
     68 # Table <=> QTable.
     69 if cls is not out.__class__:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/registry/core.py:199, in UnifiedInputRegistry.read(self, cls, format, cache, *args, **kwargs)
    195     format = self._get_valid_format(
    196         'read', cls, path, fileobj, args, kwargs)
    198 reader = self.get_reader(format, cls)
--> 199 data = reader(*args, **kwargs)
    201 if not isinstance(data, cls):
    202     # User has read with a subclass where only the parent class is
    203     # registered.  This returns the parent class, so try coercing
    204     # to desired subclass.
    205     try:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/connect.py:18, in io_read(format, filename, **kwargs)
     16     format = re.sub(r'^ascii\.', '', format)
     17     kwargs['format'] = format
---> 18 return read(filename, **kwargs)

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ui.py:376, in read(table, guess, **kwargs)
    374     else:
    375         reader = get_reader(**new_kwargs)
--> 376         dat = reader.read(table)
    377         _read_trace.append({'kwargs': copy.deepcopy(new_kwargs),
    378                             'Reader': reader.__class__,
    379                             'status': 'Success with specified Reader class '
    380                                       '(no guessing)'})
    382 # Static analysis (pyright) indicates `dat` might be left undefined, so just
    383 # to be sure define it at the beginning and check here.

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/core.py:1343, in BaseReader.read(self, table)
   1340 self.header.update_meta(self.lines, self.meta)
   1342 # Get the table column definitions
-> 1343 self.header.get_cols(self.lines)
   1345 # Make sure columns are valid
   1346 self.header.check_column_names(self.names, self.strict_names, self.guessing)

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ecsv.py:177, in EcsvHeader.get_cols(self, lines)
    175 col.dtype = header_cols[col.name]['datatype']
    176 if col.dtype not in ECSV_DATATYPES:
--> 177     raise ValueError(f'datatype {col.dtype!r} of column {col.name!r} '
    178                      f'is not in allowed values {ECSV_DATATYPES}')
    180 # Subtype is written like "int64[2,null]" and we want to split this
    181 # out to "int64" and [2, None].
    182 subtype = col.subtype

ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')

System Details

(For the version that does not work) Python 3.10.2 | packaged by conda-forge | (main, Feb 1 2022, 19:28:35) [GCC 9.4.0] Numpy 1.22.2 pyerfa 2.0.0.1 astropy 5.0.1 Scipy 1.8.0 Matplotlib 3.5.1

(For the version that does work) Python 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0] Numpy 1.20.3 pyerfa 2.0.0.1 astropy 4.2.1 Scipy 1.7.0 Matplotlib 3.4.2

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 20 (20 by maintainers)

Most upvoted comments

I would be highly supportive of a backwards compatibility bugfix for V0.9, and then an API change for V5.1 that changes the spec. I would be willing to work on a PR for it.

emirkmo on Feb 18, 2022

@weaverba137 @emirkmo - sorry that the updates in ECSV reading are breaking back-compatibility, I am definitely sensitive to that. Perhaps we can do a bug-fix release which checks for ECSV 0.9 (as opposed to 1.0) and silently reads them without warnings. This will work for files written with older astropy.

@weaverba137 - ~~can you provide an example file with an object column?~~ [EDIT - I saw the example and read the discussion in the linked issue]. Going forward (astropy >= 5.0), object columns are written (and read) as described at https://github.com/astropy/astropy-APEs/blob/main/APE6.rst#object-columns. This is limited to object types that can be serialized to standard JSON (without any custom representations).

taldcroft on Feb 18, 2022

Our group has recently encountered errors very closely related to this. In our case the ECSV 0.9 type is object. I think the ECSV 1.0 equivalent is string subtype: json, but I haven’t been able to to confirm that yet.

In general, what is the policy on backward-compatibility when reading ECSV files?

weaverba137 on Feb 17, 2022