salt: 2017.7.4 to 2018.3.0 upgrade issue: Salt request timed out. The master is not responding

Description of Issue/Question

I have a standby Salt master on 2017.7.4 and I am trying to upgrade it to the just-released 2018.3.0. After the upgrade to 2018.3.0, I get the following error.

[root@salt01 ~]# sudo salt minion01 test.version
Salt request timed out. The master is not responding. 
You may need to run your command with `--async` in order to bypass the congested event bus. 
With `--async`, the CLI tool will print the job id (jid) and exit immediately without
 listening for responses.  You can then use `salt-run jobs.lookup_jid` to look up 
the results of the job in the job cache later.
[root@salt01 ~]#

Setup

(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)

Steps to Reproduce Issue

  • snapshot salt01 VM (VMware image) for backup.
  • disable /etc/yum.repos.d/salt-latest.repo
  • yum update -y && reboot
  • enable /etc/yum.repos.d/salt-latest.repo
  • yum update -y salt && reboot
  • run test.ping on all minions or just one minion
[root@salt01 ~]# sudo salt minion01 test.version
Salt request timed out. The master is not responding. 
You may need to run your command with `--async` in order to bypass 
the congested event bus. With `--async`, the CLI tool will print the job id (jid) and 
exit immediately without listening for responses. You can then 
use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.
[root@salt01 ~]#
  • If I revert salt01’s VM image back to Salt 2017.7.4, the problem disappears.

Versions Report

  • errors in systemctl status -l salt-master.
[root@salt01 ~]# systemctl status -l salt-master
● salt-master.service - The Salt Master Server
   Loaded: loaded (/usr/lib/systemd/system/salt-master.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2018-04-04 09:08:33 CDT; 19min ago
     Docs: man:salt-master(1)
           file:///usr/share/doc/salt/html/contents.html
           https://docs.saltstack.com/en/latest/contents.html
 Main PID: 1090 (salt-master)
   CGroup: /system.slice/salt-master.service
           +-1090 /usr/bin/python /usr/bin/salt-master
           +-1480 /usr/bin/python /usr/bin/salt-master
           +-1707 /usr/bin/python /usr/bin/salt-master
           +-1708 /usr/bin/python /usr/bin/salt-master
           +-1715 /usr/bin/python /usr/bin/salt-master
           +-1716 /usr/bin/python /usr/bin/salt-master
           +-1717 /usr/bin/python /usr/bin/salt-master
           +-1718 /usr/bin/python /usr/bin/salt-master
           +-1724 /usr/bin/python /usr/bin/salt-master
           +-1726 /usr/bin/python /usr/bin/salt-master
           +-1727 /usr/bin/python /usr/bin/salt-master
           +-1728 /usr/bin/python /usr/bin/salt-master
           +-1729 /usr/bin/python /usr/bin/salt-master
           +-1730 /usr/bin/python /usr/bin/salt-master

Apr 04 09:12:15 salt01 salt-master[1090]: pub = salt.crypt.get_rsa_pub_key(pubfn)
Apr 04 09:12:15 salt01 salt-master[1090]: File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 210, in get_rsa_pub_key
Apr 04 09:12:15 salt01 salt-master[1090]: key = RSA.load_pub_key(path)
Apr 04 09:12:15 salt01 salt-master[1090]: File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 406, in load_pub_key
Apr 04 09:12:15 salt01 salt-master[1090]: return load_pub_key_bio(bio)
Apr 04 09:12:15 salt01 salt-master[1090]: File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 422, in load_pub_key_bio
Apr 04 09:12:15 salt01 salt-master[1090]: rsa_error()
Apr 04 09:12:15 salt01 salt-master[1090]: File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 302, in rsa_error
Apr 04 09:12:15 salt01 salt-master[1090]: raise RSAError, m2.err_reason_error_string(m2.err_get_error())
Apr 04 09:12:15 salt01 salt-master[1090]: RSAError: no start line
[root@salt01 ~]#

  • version report
[root@salt01 ~]# salt --versions-report
Salt Version:
           Salt: 2018.3.0

Dependency Versions:
           cffi: 1.6.0
       cherrypy: unknown
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: 1.3.8
         Jinja2: 2.7.2
        libgit2: 0.24.6
        libnacl: 1.4.3
       M2Crypto: 0.21.1
           Mako: 0.8.1
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: 1.2.5
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: 0.24.2
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: 0.2.4
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.4.1708 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.21.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.4.1708 Core

[root@salt01 ~]#
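
The `RSAError: no start line` in the status output above is the message OpenSSL gives when a PEM parser does not find the `-----BEGIN ...-----` line it expects, which can happen when a key file is empty, is not PEM at all, or carries a different label than the loader wants. A minimal sketch for seeing which label each key file starts with follows. It assumes Python 3, and the helper names `pem_label` and `scan_key_dir` are hypothetical, not part of Salt.

```python
import os
import re

# Matches a PEM start line and captures its label, e.g. "PUBLIC KEY".
PEM_BEGIN = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_label(path):
    """Return the label of the first PEM BEGIN line in the file, or None."""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            match = PEM_BEGIN.match(line.strip())
            if match:
                return match.group(1)
    return None


def scan_key_dir(directory):
    """Map each regular file in `directory` to its PEM label (None if absent)."""
    labels = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            labels[name] = pem_label(path)
    return labels
```

Calling `scan_key_dir("/etc/salt/pki/master/minions")` on the master would list the accepted minion keys, assuming the default PKI directory is in use. Any file mapped to `None`, or to a label other than the one seen on the working keys, is a candidate for the failure.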

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 98 (65 by maintainers)

Most upvoted comments

I’m having this same issue with 2018.3.0. My salt-minion is also on 2018.3.0.

This is what I’m getting in /var/log/salt/master:

2018-04-04 15:40:43,095 [tornado.application:123 ][ERROR   ][2361] Future exception was never retrieved: Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 676, in handle_message
    stream.send(self.serial.dumps(self._auth(payload['load'])))
  File "/usr/lib/python2.7/site-packages/salt/transport/mixins/auth.py", line 436, in _auth
    pub = salt.crypt.get_rsa_pub_key(pubfn)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 210, in get_rsa_pub_key
    key = RSA.load_pub_key(path)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 426, in load_pub_key
    return load_pub_key_bio(bio)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 441, in load_pub_key_bio
    rsa_error()
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 330, in rsa_error
    raise RSAError(Err.get_error_message())
RSAError: no start line
# salt --versions-report
Salt Version:
           Salt: 2018.3.0

Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.28.2
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.4.1708 Core
         locale: ANSI_X3.4-1968
        machine: x86_64
        release: 4.13.13-6-pve
         system: Linux
        version: CentOS Linux 7.4.1708 Core
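
The two version reports in this thread differ in several dependencies (for example, M2Crypto is 0.21.1 in the first and 0.28.2 in the second). When comparing reports from different hosts, it can help to diff them mechanically. This is a minimal sketch, assuming Python 3 and that the output of `salt --versions-report` has been saved as text. The function names are hypothetical.

```python
def parse_versions_report(text):
    """Parse `salt --versions-report` output into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines, shell prompts
        name, value = name.strip(), value.strip()
        if not value:
            continue  # section headers such as "Dependency Versions:"
        versions[name] = value
    return versions


def diff_reports(first, second):
    """Return {name: (first_value, second_value)} for entries that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }
```

Entries that appear in only one report show up with `None` on the other side, so a missing package is as visible as a version mismatch.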

When will 2018.3.1 be released? I applied the latest release (2018.3) for minions and master and now I cannot run even ‘cmd.run uptime’ or ‘grains.get saltversion’. I have tried restarting the master. Thanks!

# salt b* grains.get saltversion
Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

David

@wongchao in your case that is definitely because of your old version of msgpack-python (0.3.0). Can you upgrade? @brianeclow can you check whether msgpack-python on your minion is also 0.4.7? @garethgreenaway ext_hook doesn’t exist in the msgpack-python version listed in requirements.txt. That is probably causing at least that class of error people are seeing.

As @mattp- mentioned above, there are several issues going on in this thread. The key issue is fixed at the head of the 2018.3 branch in #46930 and will be available in the 2018.3.1 release. The msgpack version will also be fixed in the packages for the next release, as @dmurphy18 mentioned above.

If you are seeing issues that are not related to either of these two problems and find yourself here, please file a new issue with all of the relevant information so we can track that problem separately.

Thank you everyone for your help in tracking down these bugs and for your help with reproducing and testing. It is very much appreciated.

For posterity, the issue was fixed in pycryptodome version 3.4.7:

http://pycryptodome.readthedocs.io/en/latest/src/changelog.html#id16
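
The original report lists pycryptodome 3.4.3, which is older than the 3.4.7 named here. A minimal sketch for checking a host against that version follows. It assumes Python 3.8+ for `importlib.metadata` and that the package is installed under the distribution name `pycryptodome`. The helper names are hypothetical.

```python
import re
from importlib import metadata

FIXED_IN = (3, 4, 7)  # pycryptodome version named in the comment above


def version_tuple(text):
    """Turn '3.4.3' into (3, 4, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group(0)))
    return tuple(parts)


def has_fix(version_text):
    """True if the given version is at least FIXED_IN."""
    return version_tuple(version_text) >= FIXED_IN


def installed_pycryptodome():
    """Return the installed pycryptodome version string, or None if absent."""
    try:
        return metadata.version("pycryptodome")
    except metadata.PackageNotFoundError:
        return None
```

On a given host, `installed_pycryptodome()` returns the version string to pass to `has_fix`, or `None` when the package is not installed.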