galaxy: ansible-galaxy collection install timeout
Bug Report
SUMMARY
We’ve seen ERROR! Unexpected Exception, this is probably a bug: ('The read operation timed out',)
(after a 10-minute timeout) quite a few times. The size of the collection doesn’t seem to be related.
Is there any logging on Galaxy to see how common this is?
ansible-galaxy -vvv collection install fortinet.fortios
01:49 Downloading https://galaxy.ansible.com/download/fortinet-fortios-1.0.7.tar.gz to /root/.ansible/tmp/ansible-local-666KgfAMW/tmpXSNpnv
# Note 10 minutes have passed
01:59 ERROR! Unexpected Exception, this is probably a bug: ('The read operation timed out',)
STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS
https://app.shippable.com/github/ansible-collections/community.general/runs/164/3/console
01:40 + ansible-galaxy -vvv collection install fortinet.fortios
01:43 [WARNING]: You are running the development version of Ansible. You should only
01:43 run Ansible from "devel" if you are modifying the Ansible engine, or trying out
01:43 features under development. This is a rapidly changing source of code and can
01:43 become unstable at any point.
01:43 [DEPRECATION WARNING]: Setting verbosity before the arg sub command is
01:43 deprecated, set the verbosity after the sub command. This feature will be
01:43 removed in version 2.13. Deprecation warnings can be disabled by setting
01:43 deprecation_warnings=False in ansible.cfg.
01:43 ansible-galaxy 2.10.0.dev0
01:43 config file = None
01:43 configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
01:43 ansible python module location = /root/venv/lib/python2.7/site-packages/ansible
01:43 executable location = /root/venv/bin/ansible-galaxy
01:43 python version = 2.7.15+ (default, Feb 9 2019, 11:33:22) [GCC 5.4.0 20160609]
01:43 No config file found; using defaults
01:43 Found installed collection ansible.posix:0.1.1 at '/root/.ansible/ansible_collections/ansible/posix'
01:43 Found installed collection ansible.netcommon:0.0.2 at '/root/.ansible/ansible_collections/ansible/netcommon'
01:43 Found installed collection community.crypto:0.1.0 at '/root/.ansible/ansible_collections/community/crypto'
01:43 Found installed collection community.kubernetes:0.10.0 at '/root/.ansible/ansible_collections/community/kubernetes'
01:43 [WARNING]: Collection at '/root/.ansible/ansible_collections/community/general'
01:43 does not have a MANIFEST.json file, cannot detect version.
01:43 Found installed collection community.general:* at '/root/.ansible/ansible_collections/community/general'
01:43 Found installed collection f5networks.f5_modules:1.2.1 at '/root/.ansible/ansible_collections/f5networks/f5_modules'
01:43 Found installed collection cisco.intersight:1.0.3 at '/root/.ansible/ansible_collections/cisco/intersight'
01:43 Found installed collection cisco.mso:0.0.4 at '/root/.ansible/ansible_collections/cisco/mso'
01:43 Found installed collection check_point.mgmt:1.0.4 at '/root/.ansible/ansible_collections/check_point/mgmt'
01:43 Found installed collection ovirt.ovirt_collection:1.0.1 at '/root/.ansible/ansible_collections/ovirt/ovirt_collection'
01:43 Process install dependency map
01:43 Processing requirement collection 'fortinet.fortios'
01:43 Opened /root/.ansible/galaxy_token
01:45 Collection 'fortinet.fortios' obtained from server default https://galaxy.ansible.com/api/
01:49 Starting collection install process
01:49 Installing 'fortinet.fortios:1.0.7' to '/root/.ansible/ansible_collections/fortinet/fortios'
01:49 Downloading https://galaxy.ansible.com/download/fortinet-fortios-1.0.7.tar.gz to /root/.ansible/tmp/ansible-local-666KgfAMW/tmpXSNpnv
01:59 ERROR! Unexpected Exception, this is probably a bug: ('The read operation timed out',)
01:59 the full traceback was:
01:59
01:59 Traceback (most recent call last):
01:59 File "/root/venv/bin/ansible-galaxy", line 123, in <module>
01:59 exit_code = cli.run()
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/cli/galaxy.py", line 479, in run
01:59 context.CLIARGS['func']()
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/cli/galaxy.py", line 990, in execute_install
01:59 no_deps, force, force_deps, context.CLIARGS['allow_pre_release'])
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/galaxy/collection.py", line 601, in install_collections
01:59 collection.install(output_path, b_temp_path)
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/galaxy/collection.py", line 203, in install
01:59 self.b_path = self.download(b_temp_path)
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/galaxy/collection.py", line 188, in download
01:59 headers=headers)
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/galaxy/collection.py", line 1105, in _download_file
01:59 unredirected_headers=['Authorization'], http_agent=user_agent())
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/module_utils/urls.py", line 1383, in open_url
01:59 unredirected_headers=unredirected_headers)
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/module_utils/urls.py", line 1288, in open
01:59 return urllib_request.urlopen(request, None, timeout)
01:59 File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
01:59 return opener.open(url, data, timeout)
01:59 File "/usr/lib/python2.7/urllib2.py", line 429, in open
01:59 response = self._open(req, data)
01:59 File "/usr/lib/python2.7/urllib2.py", line 447, in _open
01:59 '_open', req)
01:59 File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
01:59 result = func(*args)
01:59 File "/root/venv/lib/python2.7/site-packages/ansible/module_utils/urls.py", line 448, in https_open
01:59 req
01:59 File "/usr/lib/python2.7/urllib2.py", line 1201, in do_open
01:59 r = h.getresponse(buffering=True)
01:59 File "/usr/lib/python2.7/httplib.py", line 1121, in getresponse
01:59 response.begin()
01:59 File "/usr/lib/python2.7/httplib.py", line 438, in begin
01:59 version, status, reason = self._read_status()
01:59 File "/usr/lib/python2.7/httplib.py", line 394, in _read_status
01:59 line = self.fp.readline(_MAXLINE + 1)
01:59 File "/usr/lib/python2.7/socket.py", line 480, in readline
01:59 data = self._sock.recv(self._rbufsize)
01:59 File "/usr/lib/python2.7/ssl.py", line 772, in recv
01:59 return self.read(buflen)
01:59 File "/usr/lib/python2.7/ssl.py", line 659, in read
01:59 v = self._sslobj.read(len)
01:59 SSLError: ('The read operation timed out',)
About this issue
- State: open
- Created 4 years ago
- Reactions: 28
- Comments: 67 (3 by maintainers)
Commits related to this issue
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-collection-ceph by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-collection-gitlab by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-collection-gnome by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-collection-kubernetes by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ansible by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-audacious by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-audacity by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-bamboo by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-bitbucket by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-bleachbit by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-blender by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-bootstrap by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-buildah by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-catatonit by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_common by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_mds by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_mgr by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_mon by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_osd by hswong3i 4 years ago
- https://github.com/ansible/galaxy/issues/2302 — committed to alvistack/ansible-role-ceph_rgw by hswong3i 4 years ago
Right now (and earlier today), timeouts seem to happen a lot more.
I worked around this, as someone already noted here, by bypassing the Galaxy API and installing directly from Git. Since then I have had no problems with timeouts. (Once the fix is confirmed, I’ll revert to the API.)
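The commenter's exact workaround is not shown in the thread. One possible sketch, assuming an ansible-galaxy recent enough to accept `git+` sources (Ansible 2.10 or later), is below. The function name `install_from_git` and the `GALAXY_BIN` override are hypothetical, introduced only for illustration and to allow stubbing the command.

```shell
# Hypothetical helper: install a collection straight from a Git repository,
# bypassing the Galaxy download endpoint.
# Assumption: ansible-galaxy supports "git+<url>,<ref>" sources (Ansible 2.10+).
# GALAXY_BIN is an illustrative override, not an Ansible setting.
install_from_git() {
    repo_url=$1   # e.g. an https:// clone URL of the collection repository
    ref=$2        # branch, tag, or commit to install
    "${GALAXY_BIN:-ansible-galaxy}" collection install "git+${repo_url},${ref}"
}
```

A call would look like `install_from_git https://github.com/example-org/example-collection.git main`, where the repository URL is a placeholder rather than a real collection.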
Hitting this more frequently in the last week or so as well.
Services behind galaxy.ansible.com were restarted about an hour ago. Also some worker restart thresholds have been increased.
This was the workaround I used to bypass this
I see this a lot in openstack-ansible CI and it’s likely the most frequent cause of change-unrelated job failures currently.
Update from the Ansible side: it appears that someone is scraping galaxy.ansible.com on the hour (every hour), which is causing increased load and making other requests time out. We are adding logging of HTTP headers to the API service to help identify the source.
Looks like --timeout 60 fixed it for me. Still, I don’t think a timeout option should be required to make the command work properly; this seems to be a server-side problem that can’t be fixed properly on the client side (only worked around by appending a timeout option).
A customer also reported this issue; I proposed the modification above to increase the timeout, and it resolved their problem.
Should we raise an RFE against ansible/ansible? It would help if customers could configure a timeout value in ansible.cfg or something similar.
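For reference, the `--timeout` workaround mentioned above can be wrapped in a small helper. This is a minimal sketch, not an official recommendation. The function name `install_with_timeout` and the `GALAXY_BIN` override are hypothetical, and 60 seconds is simply the value reported to work in this thread.

```shell
# Hypothetical helper: install a collection with an explicit timeout.
# GALAXY_BIN is an illustrative override (so the command can be stubbed);
# it is not an Ansible setting.
install_with_timeout() {
    collection=$1
    timeout_seconds=${2:-60}   # default taken from the value reported in this thread
    "${GALAXY_BIN:-ansible-galaxy}" collection install "$collection" --timeout "$timeout_seconds"
}
```

For example, `install_with_timeout fortinet.fortios 120` would run the install with a two-minute timeout. This only masks slow server responses; it does not address the server-side cause.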
I’m seeing this quite a bit in github actions: https://github.com/cognifloyd/community.mongodb/runs/1130183174?check_suite_focus=true
I’ve added some retry logic, but that only partially works. It looks like ansible-galaxy has a hard-coded 20 second timeout: https://github.com/ansible/ansible/blob/fa1fb2d13bdf948dc319be57e8465a9ef48c7fe3/lib/ansible/galaxy/api.py#L195-L197
I’ll go mention it in #ansible-galaxy
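The retry logic mentioned above is not shown in the thread. A minimal sketch of one way to do it in POSIX shell follows. The function name `retry` and its argument order are assumptions made for illustration. As the comment notes, retrying only partially helps while the server itself is slow.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# sleeping DELAY_SECONDS between attempts.
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_n=1
    until "$@"; do
        if [ "$retry_n" -ge "$retry_max" ]; then
            echo "giving up after $retry_n attempts: $*" >&2
            return 1
        fi
        retry_n=$((retry_n + 1))
        sleep "$retry_delay"
    done
}
```

A CI job could then call, for example, `retry 5 10 ansible-galaxy collection install fortinet.fortios`. A delay between attempts avoids adding load to a server that is already struggling.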
This appears to still be an issue. Happening for us weekly on various collections
“ERROR! Unknown error when attempting to call Galaxy at ‘https://galaxy.ansible.com/api/v3/collections/vyos/vyos/versions/4.0.2/’: The read operation timed out”
@mickaelvieira 2 minutes seems a bit high for one request and unlikely to succeed. Ansible should really get their galaxy servers in check, or at least try to fix this in code.
For anyone who is having this issue, increasing the timeout might help.
Is there a canonical solution for this bug yet? Seeing:
python3 -m pip install https://github.com/WATonomous/ansible/archive/galaxy_timeout.tar.gz
🤷
I’d say so, since it was an API issue. IMHO this is resolved for now: I had no issues at the end of last week, but I haven’t read any official announcement. Ansible only confirmed the problem and has given no update since then.
Well, yes and no. Your 1 sec just adds to the timeout. It’s not like it’s permanently hammering the API. In fact I used to have a sleep 1 there first but dropped it.
At this point any attempt is making the situation worse. That’s why I completely dropped Galaxy for now and install via git.
If anyone needs an example, this is my quick and dirty solution:
I’m running into this right now. I am getting a CloudFlare branded 504 which means the origin server (Galaxy) gave a gateway timeout.
I’ve been trying to install community.general since yesterday evening. I managed to install part of the dependencies straight away, but then:
By adding -vvv and wget-ing the actual package URL, it looks like there is a redirect to S3, which answers after some delay.
What worked for me was to change the default 10-second timeout to 30 seconds in open_url here: https://github.com/ansible/ansible/blob/7f0eb7ad799e531a8fbe5cc4f46046a4b1aeb093/lib/ansible/module_utils/urls.py#L1524
Isn’t 10 seconds a little too optimistic?
Luckily all these dependencies will be gone for community.general 2.0.0 😃
Is there any status page or API?
And any workaround? Maybe a sed to change the hard-coded value?