astropy: Windows ascii (and Table) fails on integer data that can't be represented as int32 (but could be as int64)

For example:

from astropy.io import ascii

ascii.read("""num,ra,dec,radius,mag
100000000000000000,32.23222,10.1211,0.8,18.1
2,38.12321,-88.1321,2.2,17.0
""")

gives the following warnings:

WARNING: OverflowError converting to IntType in column num, reverting to String. [astropy.io.ascii.fastbasic]
WARNING:astropy:OverflowError converting to IntType in column num, reverting to String.

However, the value in the first row can easily be represented as int64:

import numpy as np

ascii.read("""num,ra,dec,radius,mag
100000000000000000,32.23222,10.1211,0.8,18.1
2,38.12321,-88.1321,2.2,17.0
""")['num'].astype(np.int64)
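
The overflow comes down to integer range: the first num value is larger than the int32 maximum but well within int64. A minimal sketch of that check using np.iinfo (the helper name fits is hypothetical, used here only for illustration):

```python
import numpy as np

value = 100000000000000000  # first "num" entry from the example table


def fits(value, dtype):
    """Return True if the Python int `value` lies within the range of `dtype`."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max


print(fits(value, np.int32))  # False: larger than 2**31 - 1
print(fits(value, np.int64))  # True: smaller than 2**63 - 1
```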

Could the default integer type (I know that’s a mess on Windows) be set to int64?

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

I would vote for Option 1 to be consistent. One thing to consider is that numpy 2.0 is changing the default integer on 64-bit Windows to int64 (https://numpy.org/devdocs/numpy_2_0_migration_guide.html#windows-default-integer). If we wait until astropy requires numpy >= 2.0, then I think we get this for free (meaning any downstream breakage is due to numpy, not astropy).
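
To see which default integer a given numpy installation uses, one can inspect np.dtype(int). A small sketch; the printed result depends on the platform and numpy version, so no expected output is shown:

```python
import numpy as np

# dtype that numpy picks for plain Python ints on this platform/version
default_int = np.dtype(int)
print(np.__version__, default_int)

# Arrays built from Python ints pick up the same default dtype.
same_default = np.array([1, 2, 3]).dtype == default_int
print(same_default)
```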

Given that we have to_pandas and from_pandas, that gives a very strong case to go with Option 1. Thanks for checking!

pandas uses np.int64 by default on Windows as well (Option 1 behavior).

from io import StringIO

import pandas as pd

df = pd.read_csv(StringIO("""num,cadence,mag
100000000000000000,1,13.1
987,2,13.2
"""))
# Print the inferred dtype of each column
for c in df.columns:
    print(c, df[c].dtype)

It gives:

num int64
cadence int64
mag float64