dask: Cannot compute min or max of dates in dask array when converted from dask dataframe using to_dask_array
What happened:
Computing the minimum date from a dask array raises an exception. The dask array was converted from a dask dataframe using to_dask_array.
What you expected to happen:
I expected to be able to compute the minimum date from the dask array.
Minimal Complete Verifiable Example:
Using to_dask_array raises an exception:
import pandas as pd
from datetime import date
import dask.dataframe as dd
dates_df = pd.Series(pd.date_range(date(2014,1,1),date(2015,1,1), freq="M"))
dates_dd = dd.from_pandas(dates_df, npartitions=1)
dates_da = dates_dd.to_dask_array()
print(dates_da.min().compute())
UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]')
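The message above comes from NumPy itself: the add ufunc has no loop for two datetime64 operands, while ordering (and therefore min/max) is well defined for them. The sketch below shows that behaviour in plain NumPy, without dask. That dask's reduction machinery ends up calling add on datetime values somewhere internally is an assumption drawn from the error text, not something verified here; the variable names are illustrative only.

```python
import numpy as np

a = np.datetime64("2014-01-31", "ns")
b = np.datetime64("2014-02-28", "ns")

# Ordering is defined for datetimes, so min/max work in plain NumPy.
dates = np.array([b, a])
earliest = dates.min()

# Adding two datetimes has no meaning; NumPy raises a TypeError subclass
# (UFuncTypeError), the same family of error shown in the traceback above.
try:
    np.add(a, b)
    add_failed = False
except TypeError as exc:
    add_failed = True
    print(type(exc).__name__, exc)

# Subtracting two datetimes is allowed and yields a timedelta.
delta = b - a
```

So a min over datetime64 data is valid on its own; the failure suggests an addition is attempted along the way rather than a problem with the min itself.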
Using from_array works as expected:
import pandas as pd
from datetime import date
import dask.dataframe as dd
dates_df = pd.Series(pd.date_range(date(2014,1,1),date(2015,1,1), freq="M"))
# Note: dd.from_array returns a dask dataframe collection (here a Series), not a dask array
dates_da = dd.from_array(dates_df.values)
print(dates_da.min().compute())
2014-01-31 00:00:00
Environment:
- Dask version: 2.16.0
- Python version: 3.7.6
- Operating System: macOS Catalina 10.15.5
- Install method (conda, pip, source): conda
About this issue
- State: open
- Created 4 years ago
- Comments: 16 (16 by maintainers)
I think it's just in Array.min(), .sum(), etc. Anything going through reduction.