xarray: "write to read-only" Error in xarray.open_mfdataset() with opendap datasets

I get an error when loading data from a THREDDS server, and I can’t find any information on what might be causing it based on the error messages themselves.

Code Sample

import datetime as dt

import pandas as pd
import xarray as xr


def list_dates(start, end):
    # Dates from start (inclusive) up to, but not including, end.
    num_days = (end - start).days
    return [start + dt.timedelta(days=x) for x in range(num_days)]

start_date = dt.date(2017, 3, 1)
end_date = dt.date(2017, 3, 31)
date_list = list_dates(start_date, end_date)
window = dt.timedelta(days=5)

url = 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/{0:%Y%m}/avhrr-only-v2.{0:%Y%m%d}.nc'
data = []
for cur_date in date_list:
    # Open every daily file in the window around cur_date and average SST over time.
    date_window = list_dates(cur_date - window, cur_date + window)
    url_list = [url.format(x) for x in date_window]
    window_data = xr.open_mfdataset(url_list).sst
    data.append(window_data.mean('time'))

# Stack the per-day window means along a new time dimension.
dataf = xr.concat(data, dim=pd.DatetimeIndex(date_list, name='time'))

Expected Output

No error, with dataf containing a data array for the dates listed above.

Error Description

Error 1:

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/201703/avhrr-only-v2.20170322.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

Error 2:

OSError: [Errno -37] NetCDF: Write to read only: b'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/201703/avhrr-only-v2.20170322.nc'

Versions

python: 3.7.4
xarray: 0.15.0
pandas: 0.25.1
numpy: 1.16.5
scipy: 1.3.1
netcdf4: 1.5.3

About this issue

State: closed
Created 4 years ago
Comments: 26 (2 by maintainers)

Most upvoted comments

@EliT1626, I confirmed that this problem exists on Windows, but not on Linux.

The error:

 IOError: [Errno -37] NetCDF: Write to read only: 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201703/oisst-avhrr-v02r01.20170304.nc'

suggested some kind of cache problem, and as you noted it always fails after a certain number of dates, so I tried increasing the number of cached files from the default 128 to 256:

xr.set_options(file_cache_maxsize=256)

but that had no effect.
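
For a sense of scale, here is a small sketch that counts how many file opens the loop in the original code sample requests, and how many distinct URLs those cover. It uses only the standard library and opens nothing remotely. The helper name `count_opens` is made up for illustration, and the dates and URL pattern are taken from the original post (V2.0), not from the V2.1 paths used in this comment. It only does the arithmetic and does not identify the root cause.

```python
import datetime as dt

URL = ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/'
       '{0:%Y%m}/avhrr-only-v2.{0:%Y%m%d}.nc')


def list_dates(start, end):
    # Dates from start (inclusive) up to, but not including, end.
    return [start + dt.timedelta(days=x) for x in range((end - start).days)]


def count_opens(start, end, half_width):
    # Hypothetical helper: every URL the loop would pass to open_mfdataset,
    # across all iterations, plus the number of distinct URLs among them.
    requested = [
        URL.format(day)
        for center in list_dates(start, end)
        for day in list_dates(center - half_width, center + half_width)
    ]
    return len(requested), len(set(requested))


total, distinct = count_opens(dt.date(2017, 3, 1), dt.date(2017, 3, 31),
                              dt.timedelta(days=5))
print(total, distinct)  # prints: 300 39
```

So the loop asks for 300 opens in total, but because consecutive windows overlap heavily, only 39 distinct daily files are involved.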

Just to see if it would fail earlier, I then tried decreasing the number of cached files:

xr.set_options(file_cache_maxsize=10)

and to my surprise, it ran all the way through: https://nbviewer.jupyter.org/gist/rsignell-usgs/c52fadd8626734bdd32a432279bc6779
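
A minimal sketch of how that workaround could be folded into the original loop, in case it helps. The function names `window_urls` and `windowed_mean_sst` are hypothetical. Two things here go beyond the notebook and are assumptions: `xr.set_options` is used as a context manager instead of a global call, and `.load()` is called so each window mean is computed while the smaller cache size is still in effect. This has not been run against the NCEI server.

```python
import datetime as dt


def list_dates(start, end):
    # Dates from start (inclusive) up to, but not including, end.
    return [start + dt.timedelta(days=x) for x in range((end - start).days)]


def window_urls(url_template, center, half_width):
    # Hypothetical helper: one URL per day in the window around `center`.
    days = list_dates(center - half_width, center + half_width)
    return [url_template.format(day) for day in days]


def windowed_mean_sst(url_template, dates, half_width, cache_size=10):
    # Imported here so the URL helpers above stay usable without xarray.
    import pandas as pd
    import xarray as xr

    means = []
    # As a context manager, set_options restores the previous cache size
    # on exit, so the smaller limit applies only inside this block.
    with xr.set_options(file_cache_maxsize=cache_size):
        for day in dates:
            urls = window_urls(url_template, day, half_width)
            sst = xr.open_mfdataset(urls).sst
            # Assumption: load now so the remote reads happen while the
            # reduced cache size is active, not later at compute time.
            means.append(sst.mean('time').load())
    return xr.concat(means, dim=pd.DatetimeIndex(dates, name='time'))
```

With the variables from the original post, the call would look like `windowed_mean_sst(url, date_list, window)`. Calling `xr.set_options(file_cache_maxsize=10)` once at the top of the script, as in the notebook, has the same effect for the rest of the session.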

I’m hoping someone who worked on the caching (@shoyer?) might have some idea of what is going on, but at least you can execute your workflow now on Windows!

rsignell-usgs on Jun 4, 2020