salt: daily highstate fails after 2019.2 upgrade
After upgrading masters and minions to 2019.2, I observed the following behavior:
- the daily scheduled highstate fails to run UNLESS a pillar refresh has been run manually recently (within the last 12 hours or so), and even that doesn’t always make it run
- when the highstate does run and there are any failures, I receive 20 or so failure reports per minion via the returner, as opposed to just one with 2018.3.4; it appears that the minion retries state.apply over and over again for about two hours, after which it stops
This happens on two different masters.
Here is the pillar/schedule.sls for one of the masters:
schedule:
  highstate:
    function: state.highstate
    when: 4:30am
    splay: 600
    maxrunning: 1
    returner: highstate
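(In case it helps with reproduction: since the schedule comes from pillar, the sanity check I’d suggest on a minion that misses its run is roughly the following; the minion ID is just a placeholder.)

    salt 'some-minion' saltutil.refresh_pillar
    salt 'some-minion' pillar.get schedule
    salt 'some-minion' schedule.list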
Downgrading salt and salt-minion on the minions fixed the problem.
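(If anyone needs to do the same on CentOS 7, something like the following should roll a minion back to the previous packaged release; the exact result depends on which versions your yum repo still carries.)

    yum downgrade salt salt-minion
    systemctl restart salt-minion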
Here is one of the upgraded minions:
Salt Version:
    Salt: 2019.2.0
Dependency Versions:
    cffi: Not Installed
    cherrypy: Not Installed
    dateutil: 1.5
    docker-py: Not Installed
    gitdb: Not Installed
    gitpython: Not Installed
    ioflo: Not Installed
    Jinja2: 2.7.2
    libgit2: Not Installed
    libnacl: Not Installed
    M2Crypto: 0.31.0
    Mako: Not Installed
    msgpack-pure: Not Installed
    msgpack-python: 0.5.6
    mysql-python: 1.2.5
    pycparser: Not Installed
    pycrypto: 2.6.1
    pycryptodome: Not Installed
    pygit2: Not Installed
    Python: 2.7.5 (default, Apr 9 2019, 14:30:50)
    python-gnupg: Not Installed
    PyYAML: 3.11
    PyZMQ: 15.3.0
    RAET: Not Installed
    smmap: Not Installed
    timelib: Not Installed
    Tornado: 4.2.1
    ZMQ: 4.1.4
System Versions:
    dist: centos 7.6.1810 Core
    locale: UTF-8
    machine: x86_64
    release: 3.10.0-957.12.2.el7.x86_64
    system: Linux
    version: CentOS Linux 7.6.1810 Core
Here is the master:
Salt Version:
    Salt: 2019.2.0
Dependency Versions:
    cffi: 1.6.0
    cherrypy: Not Installed
    dateutil: 1.5
    docker-py: Not Installed
    gitdb: Not Installed
    gitpython: Not Installed
    ioflo: Not Installed
    Jinja2: 2.7.2
    libgit2: 0.26.3
    libnacl: Not Installed
    M2Crypto: 0.31.0
    Mako: Not Installed
    msgpack-pure: Not Installed
    msgpack-python: 0.5.6
    mysql-python: 1.2.5
    pycparser: 2.14
    pycrypto: 2.6.1
    pycryptodome: Not Installed
    pygit2: 0.26.4
    Python: 2.7.5 (default, Apr 9 2019, 14:30:50)
    python-gnupg: Not Installed
    PyYAML: 3.11
    PyZMQ: 15.3.0
    RAET: Not Installed
    smmap: Not Installed
    timelib: 0.2.4
    Tornado: 4.2.1
    ZMQ: 4.1.4
System Versions:
    dist: centos 7.6.1810 Core
    locale: UTF-8
    machine: x86_64
    release: 3.10.0-957.12.2.el7.x86_64
    system: Linux
    version: CentOS Linux 7.6.1810 Core
Thanks.
About this issue
- State: closed
- Created 5 years ago
- Comments: 38 (8 by maintainers)
Was not able to test today. Will try to have a test result by Monday.
Alright. I’m hoping this is somewhat helpful or at least might give us a direction to move in.
Recap: all of the minions attached to one master, and that master itself, have been upgraded to the latest version. Yesterday I set three minions, plus the master, to run in the foreground (roughly as sketched below).
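By "running in the foreground" I mean stopping the service and starting the daemon by hand with debug logging; a rough sketch of the idea, not my exact invocations:

    # on each test minion
    systemctl stop salt-minion
    salt-minion -l debug

    # on the master
    systemctl stop salt-master
    salt-master -l debug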
One of the states in my highstate ensures that the salt-minion service is running, so on the three minions I was running in the foreground, the highstate started the salt-minion service back up. Those boxes now look like this:
Drilling down a bit: this particular host has another scheduled job (not the highstate) that is supposed to run once every morning, but this morning it started and has kept running every few minutes. Here’s the pillar schedule file:
And here’s the debug data for what has been running on that minion every few minutes since this morning:
I don’t know (a) why that scheduled job keeps running every few minutes, (b) whether it could be caused by having more than one minion process running, or (c) whether this is related at all to the original problem this issue was opened for. On some other minions, which I am not running in the foreground, I can see a highstate run that repeated over and over again this morning, even though there were no failures in the run. The salt-minion logs on those minions are empty, so I don’t have any data to report from them. It seems like it could be a similar problem, though, to the one I /do/ have debug data for here, even though that one is not a highstate run.
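Regarding (b), a quick way to check for stray minion processes is just a process listing, e.g. (illustrative only):

    pgrep -af salt-minion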
Hope this helps somehow. Let me know what to try next. Thanks much!
Catching up on this and I might have been able to reproduce it. Need to do a bit more testing.
I’m going to try and reproduce the issue with what we got, I’ll let you know if I need more info, thank you!
I am now testing with a 24 hour interval (debug level logging).
Thanks @H20-17 for all the work.
I need to upgrade my master to the latest update, which I will do today.
I’ll upgrade a bunch of minions, too, and let you know what happens.
In my experience, running the highstate more often than every 8-12 hours keeps the bug from showing up, so I’m going to leave it at every 24 hours for a few days.
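For clarity, the 24-hour test schedule is interval-based rather than time-of-day based; roughly like this (the job name and returner are just carried over from the original pillar, not necessarily what anyone else needs):

    schedule:
      highstate:
        function: state.highstate
        hours: 24
        splay: 600
        maxrunning: 1
        returner: highstate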
Thanks, @Akm0d for putting this at the top of your list.
I’m running a highstate every 15 minutes now on a test minion and I’m failing to reproduce the issue (schedules from pillar on the master). When I was experiencing this I was running a highstate just once a week on every minion (also scheduled from pillar on the master). When it was happening, it would happen on just about every Windows minion and all it took was the one scheduled highstate. I will keep my test minion highstating all night (every 15 minutes). It’s possible that the issue was inadvertently resolved by other changes (other bug fixes, perhaps?).
I’ll let you guys know if I find anything. For the time being everything is working. (I’m now fully up to date with 2019.2.3 (py3) on both the master and the Windows minion.)
I will look into this too. I haven’t tried running scheduled highstates since 2019.2.0 because of this, but I’ll see what happens now.
Thanks, @Akm0d. Sorry for not clarifying: the errors were state.apply failures, not errors per se.
I should note that this has followed me into CentOS 7.7, and persists even with the latest 2019.2 versions.