salt: Minion did not return. [No response] appears occasionally, but once it has happened, the minion never returns

Description of Issue

Setup

(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)

  • The minion and master config files are as follows; sensitive info is replaced by ***. Because more than one minion is set up on the host, we use /etc/salt2019 as our config path.

cat minion | grep -v "^#" | grep -v "^$"

master: ***
user: root
pidfile: /var/run/salt2019-minion.pid
conf_file: /etc/salt2019/minion
pki_dir: /etc/salt2019/pki/minion
id: ***
cachedir: /var/cache/salt2019/minion
sock_dir: /var/run/salt2019/minion
file_roots: /srv/salt2019
log_file: /var/log/salt2019/minion
key_logfile: /var/log/salt2019/key
log_level: debug
tcp_keepalive: True
  • master config file

cat master | grep -v "^#" | grep -v "^$"

user: ***
publisher_acl:
  user***:
    - test.*
    - state.*
    - cmd.*

Steps to Reproduce Issue

(Include debug logs if possible and relevant.)

The following is the debug log; the minion and master IPs are replaced by 1.2.3.4 and 2.2.2.2 respectively. We are executing an SLS file. The salt minion runs well for a couple of days and then suddenly can no longer return data to the master. I traced this in the source code and found that the minion stops at “Connecting the Minion to the Master URI (for the return server)”, after which no more debug log output is printed.

  • Here is the log when minion does not return
 2020-03-20 14:50:40,465 [salt.minion                                                      :1465][INFO    ][1397] User saltapi Executing command state.sls with jid 20200320145808931182
2020-03-20 14:50:40,493 [salt.minion                                                      :1472][DEBUG   ][1397] Command details {'fun': 'state.sls', 'ret': '', 'tgt': '1.2.3.4', 'user': 'saltapi', 'tgt_type': 'glob', 'jid': '20200320145808931182', 'arg': ['------sensitive info is replaced ----------', 'pillar={"tasktype":"ent-serial-collect"}', {'queue': True, '__kwarg__': True}]}
2020-03-20 14:50:40,526 [salt.minion                                                      :1605][INFO    ][17093] Starting a new job 20200320145808931182 with PID 17093
2020-03-20 14:50:40,550 [salt.utils.lazy                                                  :107 ][DEBUG   ][17093] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2020-03-20 14:50:40,570 [salt.utils.lazy                                                  :104 ][DEBUG   ][17093] LazyLoaded state.sls
2020-03-20 14:50:40,582 [salt.utils.lazy                                                  :104 ][DEBUG   ][17093] LazyLoaded saltutil.is_running
2020-03-20 14:50:40,587 [salt.utils.lazy                                                  :104 ][DEBUG   ][17093] LazyLoaded grains.get
2020-03-20 14:50:40,589 [salt.loader.2.2.2.2.int.module.config                       :398 ][DEBUG   ][17093] key: test, ret: _|-
2020-03-20 14:50:40,611 [salt.transport.zeromq                                            :132 ][DEBUG   ][17093] Initializing new AsyncZeroMQReqChannel for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506', 'aes')
2020-03-20 14:50:40,611 [salt.crypt                                                       :463 ][DEBUG   ][17093] Initializing new AsyncAuth for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506')
2020-03-20 14:50:40,613 [salt.transport.zeromq                                            :203 ][DEBUG   ][17093] Connecting the Minion to the Master URI (for the return server): tcp://2.2.2.2:4506
2020-03-20 14:50:41,995 [salt.minion                                                      :1465][INFO    ][1397] User saltapi Executing command state.sls with jid 20200320145810461355
2020-03-20 14:50:41,995 [salt.minion                                                      :1472][DEBUG   ][1397] Command details {'fun': 'state.sls', 'ret': '', 'tgt': '1.2.3.4', 'user': 'saltapi', 'tgt_type': 'glob', 'jid': '20200320145810461355', 'arg': ['------sensitive info is replaced ----------', 'pillar={"tasktype":"ent-serial-collect"}', {'queue': True, '__kwarg__': True}]}
2020-03-20 14:50:42,006 [salt.minion                                                      :1605][INFO    ][17213] Starting a new job 20200320145810461355 with PID 17213
2020-03-20 14:50:42,010 [salt.utils.lazy                                                  :107 ][DEBUG   ][17213] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2020-03-20 14:50:42,014 [salt.utils.lazy                                                  :104 ][DEBUG   ][17213] LazyLoaded state.sls
2020-03-20 14:50:42,018 [salt.utils.lazy                                                  :104 ][DEBUG   ][17213] LazyLoaded saltutil.is_running
2020-03-20 14:50:42,020 [salt.utils.lazy                                                  :104 ][DEBUG   ][17213] LazyLoaded grains.get
2020-03-20 14:50:42,022 [salt.loader.2.2.2.2.int.module.config                       :398 ][DEBUG   ][17213] key: test, ret: _|-
2020-03-20 14:50:42,038 [salt.transport.zeromq                                            :132 ][DEBUG   ][17213] Initializing new AsyncZeroMQReqChannel for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506', 'aes')
2020-03-20 14:50:42,038 [salt.crypt                                                       :463 ][DEBUG   ][17213] Initializing new AsyncAuth for ('/etc/salt2019/pki/minion', '1.2.3.4', 'tcp://2.2.2.2:4506')
2020-03-20 14:50:42,040 [salt.transport.zeromq                                            :203 ][DEBUG   ][17213] Connecting the Minion to the Master URI (for the return server): tcp://2.2.2.2:4506
2020-03-20 15:01:15,337 [salt.utils.schedule                                              :1627][DEBUG   ][1397] schedule: Job __mine_interval was scheduled with jid_include, adding to cache (jid_include defaults to True)
2020-03-20 15:01:15,337 [salt.utils.schedule                                              :1630][DEBUG   ][1397] schedule: Job __mine_interval was scheduled with a max number of 2
2020-03-20 15:01:15,337 [salt.utils.schedule                                              :1647][INFO    ][1397] Running scheduled job: __mine_interval
2020-03-20 15:01:15,458 [salt.utils.schedule                                              :689 ][DEBUG   ][10760] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'pid': 10760, 'id': '1.2.3.4', 'jid': '20200320150115452799', 'schedule': '__mine_interval', 'fun_args': []}
2020-03-20 15:01:15,461 [salt.utils.lazy                                                  :104 ][DEBUG   ][10760] LazyLoaded mine.update
2020-03-20 15:01:15,463 [salt.utils.lazy                                                  :104 ][DEBUG   ][10760] LazyLoaded config.merge
2020-03-20 15:01:15,463 [salt.utils.schedule                                              :836 ][DEBUG   ][10760] schedule.handle_func: Removing /var/cache/salt2019/minion/proc/20200320150115452799
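To make the pattern in the log above easier to spot in a longer file, here is a minimal sketch that groups log lines by PID and reports the PIDs whose last logged message is the "Connecting the Minion to the Master URI" line. The regular expression is an assumption derived from the line layout shown in this excerpt (a custom log format), and `find_stalled_pids` is a hypothetical helper, not part of Salt. A PID flagged this way may also just be a job that was still running when the log was captured, so treat the result as a hint only.

```python
import re

# Assumed layout, taken from the excerpt above:
# "<timestamp> [<module>   :<lineno>][<LEVEL>  ][<pid>] <message>"
LOG_LINE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<module>[^\s:]+)\s*:(?P<lineno>\d+)\s*\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<pid>\d+)\] (?P<msg>.*)$"
)

MARKER = "Connecting the Minion to the Master URI"


def find_stalled_pids(lines, marker=MARKER):
    """Return {pid: timestamp} for PIDs whose last log line contains marker."""
    last_seen = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            # Later lines overwrite earlier ones, so only the last one is kept.
            last_seen[int(match.group("pid"))] = (
                match.group("ts"),
                match.group("msg"),
            )
    return {pid: ts for pid, (ts, msg) in last_seen.items() if marker in msg}


# A few lines from the excerpt above, with padding trimmed and one
# message abbreviated.
SAMPLE = [
    "2020-03-20 14:50:40,526 [salt.minion :1605][INFO    ][17093] "
    "Starting a new job 20200320145808931182 with PID 17093",
    "2020-03-20 14:50:40,613 [salt.transport.zeromq :203 ][DEBUG   ][17093] "
    "Connecting the Minion to the Master URI (for the return server): "
    "tcp://2.2.2.2:4506",
    "2020-03-20 15:01:15,458 [salt.utils.schedule :689 ][DEBUG   ][10760] "
    "schedule.handle_func: adding this job to the jobcache",
    "2020-03-20 15:01:15,463 [salt.utils.schedule :836 ][DEBUG   ][10760] "
    "schedule.handle_func: Removing "
    "/var/cache/salt2019/minion/proc/20200320150115452799",
]

if __name__ == "__main__":
    for pid, ts in sorted(find_stalled_pids(SAMPLE).items()):
        print("PID", pid, "last logged the connect line at", ts)
```

For a real log, the lines could be read with `open("/var/log/salt2019/minion")` (the `log_file` path from the minion config above) and passed to `find_stalled_pids`.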

Versions Report

(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

  • master version

[root@localhost chenyanyan]# salt-master -V

Salt Version:
           Salt: 2018.3.4

Dependency Versions:
           cffi: 1.6.0
       cherrypy: 5.6.0
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: 0.8.1
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug 27 2018, 16:21:36)
   python-gnupg: Not Installed
         PyYAML: 3.10
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist:
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.21.1.el7.x86_64
         system: Linux
        version: Not Installed
  • minion version

linux-jc57:/var/log # /root/miniconda3/bin/salt-minion -V

Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.10.1
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.6.1
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:53:06)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.1.5

System Versions:
           dist: SuSE 11 x86_64
         locale: UTF-8
        machine: x86_64
        release: 3.0.76-0.11-default
         system: Linux
        version: SUSE Linux Enterprise Server  11 x86_64

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 30 (11 by maintainers)

Most upvoted comments

We found out that Python 3.5 has a bug that caused the problem; here’s the bug report: https://bugs.python.org/issue29386. Please check your Python version first. What I would also like to know is what you were doing: did you use salt to execute nested shell scripts?
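A quick way to run the version check suggested above is a small script executed with the same interpreter the minion uses. This is only a sketch: `is_python_35` is a hypothetical helper name, and the check only tells you whether the interpreter is a 3.5 release, not whether a given build actually contains the bug from the linked report.

```python
import sys


def is_python_35(version_info=sys.version_info):
    """Return True if the given version tuple is a Python 3.5.x release."""
    return tuple(version_info[:2]) == (3, 5)


if __name__ == "__main__":
    print("Python", ".".join(str(part) for part in sys.version_info[:3]))
    if is_python_35():
        print("This interpreter is Python 3.5; see the bug report linked above.")
    else:
        print("This interpreter is not Python 3.5.")
```

The minion in the versions report above runs from /root/miniconda3/bin and reports Python 3.5.2, so the script should be run with the Python binary from that environment (its exact path is an assumption) rather than the system Python.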

@sagetherage @DmitryKuzmenko and I had a conversation about turning on certain debugging, but I’m on holiday atm and haven’t seen the issue for a little while. Unfortunately it seems extremely intermittent - but I’ll be sure to confirm if it comes up again what the logs are.

@sagetherage - hey, I’d really love to get that help. This issue is driving me mad. Sorry to chase, but can we try to set up that session with a team member?

@sagetherage, that’d be great.

If you want to take a look at https://calend.ly/edhgoose that’s probably a good start? Generally evenings (UK time) are pretty good too.