salt: saltutil.cmd giving an exception on the minion end

Description of Issue

Trying to issue this on the salt-master:

salt 'minion1' saltutil.cmd '*' grains.items

or

salt-call saltutil.cmd '*' grains.items

results in an exception below, the log from the target minion:

2019-10-23 18:21:45,663 [salt.minion      :1605][INFO    ][9981] Starting a new job 20191023182145625298 with PID 9981
2019-10-23 18:21:45,668 [salt.utils.lazy  :107 ][DEBUG   ][9981] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2019-10-23 18:21:45,669 [salt.config      :2161][DEBUG   ][9981] Reading configuration from /etc/salt/minion
2019-10-23 18:21:45,671 [salt.config      :2265][DEBUG   ][9981] Missing configuration file: /root/.saltrc
2019-10-23 18:21:45,672 [salt.utils.event :324 ][DEBUG   ][9981] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
2019-10-23 18:21:45,673 [salt.utils.event :327 ][DEBUG   ][9981] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
2019-10-23 18:21:45,698 [salt.client      :1709][ERROR   ][9981] Unable to connect to the salt master publisher at /var/run/salt/master
2019-10-23 18:21:45,699 [salt.minion      :1738][WARNING ][9981] The minion function caused an exception
2019-10-23 18:21:45,699 [salt.utils.event :324 ][DEBUG   ][9981] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_7d1840618d_pub.ipc
2019-10-23 18:21:45,699 [salt.utils.event :327 ][DEBUG   ][9981] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_7d1840618d_pull.ipc
2019-10-23 18:21:45,699 [salt.transport.ipc:261 ][DEBUG   ][9981] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_7d1840618d_pull.ipc
2019-10-23 18:21:45,701 [salt.utils.event :742 ][DEBUG   ][9981] Sending event: tag = _salt_error; data = {'message': u'The minion function caused an exception', 'args': ('The minion function caused an exception',), u'_stamp': '2019-10-23T16:21:45.701169'}
2019-10-23 18:21:45,701 [salt.transport.ipc:405 ][DEBUG   ][9981] Closing IPCMessageClient instance
2019-10-23 18:21:45,702 [salt.minion      :2355][DEBUG   ][1553] Minion of x.x.x.x is handling event tag '_salt_error'
2019-10-23 18:21:45,702 [salt.minion      :2528][DEBUG   ][1553] Forwarding salt error event tag=_salt_error
2019-10-23 18:21:45,702 [salt.minion      :767 ][DEBUG   ][9981] Minion return retry timer set to 10 seconds (randomized)
2019-10-23 18:21:45,703 [salt.minion      :1941][INFO    ][9981] Returning information for job: 20191023182145625298
2019-10-23 18:21:45,703 [salt.transport.zeromq:132 ][DEBUG   ][1553] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506', u'aes')
2019-10-23 18:21:45,703 [salt.transport.zeromq:132 ][DEBUG   ][9981] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506', u'aes')
2019-10-23 18:21:45,703 [salt.crypt       :463 ][DEBUG   ][9981] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'minoin1', u'tcp://x.x.x.x:4506')
2019-10-23 18:21:45,703 [salt.crypt       :463 ][DEBUG   ][1553] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'minoin1', u'tcp://x.x.x.x:4506')
2019-10-23 18:21:45,704 [salt.transport.zeromq:203 ][DEBUG   ][9981] Connecting the Minion to the Master URI (for the return server): tcp://x.x.x.x:4506
2019-10-23 18:21:45,704 [salt.transport.zeromq:203 ][DEBUG   ][1553] Connecting the Minion to the Master URI (for the return server): tcp://x.x.x.x:4506
2019-10-23 18:21:45,704 [salt.transport.zeromq:1178][DEBUG   ][9981] Trying to connect to: tcp://x.x.x.x:4506
2019-10-23 18:21:45,705 [salt.transport.zeromq:1178][DEBUG   ][1553] Trying to connect to: tcp://x.x.x.x:4506
2019-10-23 18:21:45,774 [salt.transport.zeromq:227 ][DEBUG   ][1553] Closing AsyncZeroMQReqChannel instance
2019-10-23 18:21:45,793 [salt.transport.zeromq:227 ][DEBUG   ][9981] Closing AsyncZeroMQReqChannel instance
2019-10-23 18:21:45,793 [salt.minion      :1783][DEBUG   ][9981] minion return:{u'fun_args': [u'minion1', u'grains.items'], u'jid': u'20191024113408001798', u'return': u'The minion function caused an exception: Traceback (most recent call last):\n  File "/usr/lib/python2.7/dist-packages/salt/minion.py", line 1660, in _thread_return\n    return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/executors/direct_call.py", line 12, in execute\n    return func(*args, **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1490, in cmd\n    client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1454, in _exec\n    for ret_comp in _cmd(**cmd_kwargs):\n  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 841, in cmd_iter\n    **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 341, in run_job\n    \'The salt master could not be contacted. Is master running?\'\nSaltClientError: The salt master could not be contacted. Is master running?\n', u'retcode': 1, u'success': False, u'fun': u'saltutil.cmd', u'out': u'nested'}

the result to the master is:

    The minion function caused an exception: Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/salt/minion.py", line 1660, in _thread_return
        return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/executors/direct_call.py", line 12, in execute
        return func(*args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1490, in cmd
        client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1454, in _exec
        for ret_comp in _cmd(**cmd_kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 841, in cmd_iter
        **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 341, in run_job
        'The salt master could not be contacted. Is master running?'
    SaltClientError: The salt master could not be contacted. Is master running?


Setup

salt master and minion are running flourine with python 2.7

Steps to Reproduce Issue

installed salt fluorine on both master and minion and then issued:

salt 'minion1' saltutil.cmd '*' grains.items

Versions Report

master:

Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 2.5.3
      docker-py: Not Installed
          gitdb: 2.0.0
      gitpython: 2.1.1
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: 0.24.5
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.24.2
         Python: 2.7.13 (default, Sep 26 2018, 18:42:22)
   python-gnupg: Not Installed
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.1
        timelib: Not Installed
        Tornado: 4.4.3
            ZMQ: 4.2.1

System Versions:
           dist: debian 9.8
         locale: UTF-8
        machine: x86_64
        release: 4.9.0-8-amd64
         system: Linux
        version: debian 9.8

minion:

Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: 1.12.3
       cherrypy: Not Installed
       dateutil: 2.5.3
      docker-py: 4.0.2
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: Not Installed
      pycparser: 2.19
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.13 (default, Sep 26 2018, 18:42:22)
   python-gnupg: Not Installed
         PyYAML: 3.13
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.4.3
            ZMQ: 4.2.1

System Versions:
           dist: debian 9.11
         locale: UTF-8
        machine: x86_64
        release: 4.9.0-9-amd64
         system: Linux
        version: debian 9.11

About this issue

  • Original URL
  • State: open
  • Created 5 years ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

So, I was able to reproduce this. Here is my setup…

  • Ubuntu minion is running 2019.2.0 with python 2.7.14
  • Windows minion is running 2019.2.0. with python 3.5.4

I ran salt * saltutil.cmd '*' grains.items and the ubuntu minion returned, but my windows minion did not, and gave me this instead:

windows:
    The minion function caused an exception: Traceback (most recent call last):
      File "c:\dev\salt\salt\client\__init__.py", line 1730, in pub
        raise SaltReqTimeoutError()
    salt.exceptions.SaltReqTimeoutError

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\dev\salt\salt\client\__init__.py", line 334, in run_job
        **kwargs)
      File "c:\dev\salt\salt\client\__init__.py", line 1734, in pub
        'Salt request timed out. The master is not responding. You '
    salt.exceptions.SaltReqTimeoutError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\dev\salt\salt\minion.py", line 1664, in _thread_return
        return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)
      File "c:\dev\salt\salt\executors\direct_call.py", line 12, in execute
        return func(*args, **kwargs)
      File "c:\dev\salt\salt\modules\saltutil.py", line 1503, in cmd
        client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)
      File "c:\dev\salt\salt\modules\saltutil.py", line 1467, in _exec
        for ret_comp in _cmd(**cmd_kwargs):
      File "c:\dev\salt\salt\client\__init__.py", line 838, in cmd_iter
        **kwargs)
      File "c:\dev\salt\salt\client\__init__.py", line 346, in run_job
        raise SaltClientError(general_exception)
    salt.exceptions.SaltClientError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

So far it seems that 2019.2.0 and python3 is where we see this.

Same issue with Salt 3003 and Python 3.7.3 (master and minion)

This issue was successfully reproduced by @xeacott and therefore has not gone stale as it’s still just trying to live its life.

*snaps whip at bot*

I just gave this a test run and here’s my results…

root@DESKTOP-SQIU2I1:/home/eacott# salt-call saltutil.cmd '*' grains.items
local:
    ----------
    DESKTOP-SQIU2I1.localdomain:
        ----------
        jid:
            20191031135824729292
        out:
            nested
        ret:
            ----------
... (snipped) ...

And a versions report

root@DESKTOP-SQIU2I1:/home/eacott# salt --versions-report
Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: 4.1.0
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.7
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.15+ (default, Oct  7 2019, 17:39:04)
   python-gnupg: 0.4.1
         PyYAML: 3.13
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 4.4.0-18362-Microsoft
         system: Linux
        version: Ubuntu 18.04 bionic

I’m not a maintainer or anything, so whenever they pick up this issue they might have more insight on this, but master/minion communication is handled by tornado and zmq. The issue is happening when the results of the command are being iterated through (according to your backtrace) which comes from the returner, so maybe fiddling around with either updating those two modules or tinkering with returner options might help.

I’ve had numerous problems with the different returners until I rewrote the etcd returner and have been using that. So if updating those two modules doesn’t work, and you’re in a tight place while waiting for a developer to get assigned this issue, you can maybe try using a different returner.