astroquery: ESO retrieve data - Files keep failing
Hi,
I am attempting to download some ESPRESSO data from ESO’s archive via the command line using astroquery. To do so, I did the following (as per ESO’s Support suggestion):
- Install astroquery
- find its location in the python path (look for the directory: site-packages/astroquery/eso)
- create a parallel directory astroquery/esocas
- copy the content of astroquery/eso/ into astroquery/esocas
- modify astroquery/esocas by editing both core.py and __init__.py, replacing every occurrence of wdb/wdb/eso with wdb/wdb/cas
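The copy-and-patch steps above can be sketched in a few lines of Python. This is only an illustration, not ESO's or astroquery's own tooling: the helper name `make_esocas_copy` is hypothetical, and it assumes both `core.py` and `__init__.py` exist in the copied directory.

```python
import shutil
from pathlib import Path


def make_esocas_copy(astroquery_dir, old="wdb/wdb/eso", new="wdb/wdb/cas"):
    """Copy <astroquery_dir>/eso to <astroquery_dir>/esocas and patch the URL fragment.

    Hypothetical helper: returns the names of the files that were modified.
    """
    src = Path(astroquery_dir) / "eso"
    dst = Path(astroquery_dir) / "esocas"
    # copytree raises FileExistsError if esocas is already there,
    # so an existing copy is never silently overwritten.
    shutil.copytree(src, dst)

    patched = []
    for name in ("core.py", "__init__.py"):
        path = dst / name
        text = path.read_text(encoding="utf-8")
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            patched.append(name)
    return patched
```

The astroquery directory itself can be located with `Path(astroquery.__file__).parent` after `import astroquery`.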
To query the ESO server, I use:
from astroquery.esocas import Eso
eso = Eso()
eso.ROW_LIMIT = -1
eso.USERNAME = username
eso.login(username, store_password=True)
query_results = eso.query_instrument('ESPRESSO', column_filters={'night': NIGHT, 'dp_cat': 'SCIENCE'}, cache=False)
data_files = eso.retrieve_data(query_results['DP.ID'], with_calib='raw', destination=DATA_ROOT, request_all_objects=True)
The problem I am facing is that sometimes the download fails on specific files when doing the above, although I am able to download them when I use the web interface.
For example, when selecting NIGHT = '2018-09-01', the download consistently failed at filename “ESPRE.2018-09-05T18:26:46.039.fits.Z”, multiple times over several hours during the same night.
Here is the error:
Traceback (most recent call last):
  File "dl_from_eso_archive.py", line 176, in <module>
    main()
  File "dl_from_eso_archive.py", line 121, in main
    data_files = eso.retrieve_data(query_results['DP.ID'], with_calib='raw', destination=DATA_ROOT request_all_objects = True)
  File "/opt/anaconda3/envs/esoPy/lib/python3.7/site-packages/astroquery/esocas/core.py", line 720, in retrieve_data
    state = root.select('span[id=requestState]')[0].text
IndexError: list index out of range
I also had cases where the downloaded file was not a fits.Z file, but a bank login webpage, so I guess the connection somehow timed out?
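One way to catch the case where a web page is saved in place of the data is to look at the first bytes of each downloaded file. The sketch below is independent of astroquery and relies on the fact that files produced by Unix compress (.Z) start with the bytes 0x1F 0x9D; the function name `classify_download` and the HTML heuristic are assumptions made for illustration.

```python
COMPRESS_MAGIC = b"\x1f\x9d"  # first two bytes of Unix compress (.Z) output


def classify_download(path):
    """Return 'compressed', 'html', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(512)
    if head.startswith(COMPRESS_MAGIC):
        return "compressed"
    # Crude heuristic: a login or error page usually begins with one of these tags.
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html"
    return "unknown"
```

Any file reported as 'html' or 'unknown' could then be queued for another download attempt instead of being passed on to the reduction pipeline.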
Any idea what is happening?
PS: not sure if it helps or causes issues, but I normally run the code inside “screen”.
About this issue
- State: closed
- Created 5 years ago
- Comments: 25 (11 by maintainers)
Hi @bsipocz, I am sorry, I thought I had answered this already. Upgrading with pip solved my problem, thanks! Cheers, Jorge
@keflavich
Yes, I agree. I only have a couple of suggestions that I believe might help: a) setting up a way to verify periodically, e.g. every 15 to 30 min, that the connection has not timed out (more difficult to implement); b) adding an option to the ‘retrieve_data’ function so that it downloads only the list of files and the request number instead of the files themselves. This would allow the datasets to be downloaded at a later time without making a new dataset request (useful in the case of timeouts).
Thank you for all the help! Cheers, Jorge
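Until something like suggestion (a) exists, a user-side workaround could be to retry a failed download after logging in again. The sketch below is not the periodic check proposed above and is not part of astroquery: `retrieve_with_retry` is a hypothetical helper that takes two caller-supplied callables, and its defaults (three attempts, 60 s wait) are arbitrary assumptions.

```python
import time


def retrieve_with_retry(download, relogin, attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call download(); if it raises, call relogin(), wait, and try again.

    download and relogin are zero-argument callables supplied by the caller.
    The last exception is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except Exception as err:  # broad on purpose: this is only a sketch
            last_error = err
            if attempt < attempts:
                relogin()
                sleep(wait_seconds)
    raise last_error
```

With the calls from the original post, `download` could be `lambda: eso.retrieve_data(query_results['DP.ID'], with_calib='raw', destination=DATA_ROOT)` and `relogin` could be `lambda: eso.login(username, store_password=True)`.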