salt: service does not restart after watched config

os: opensuse
service_provider: rh_service

salt-call --versions-report
Salt: 2014.1.5
Python: 2.7.2 (default, Aug 19 2011, 20:41:43) [GCC]
Jinja2: 2.6
M2Crypto: 0.21.1
msgpack-python: 0.2.4
msgpack-pure: Not Installed
pycrypto: 2.3
PyYAML: 3.10
PyZMQ: 13.0.0
ZMQ: 3.2.2

scenario:

on openSUSE, install the postgresql-server package
start the service so it generates the default configs in /var/lib/pgsql
edit pg_hba.conf with file.replace
expect the service to restart, but Salt just checks whether postgresql is running and returns true

If we edit pg_hba.conf again so that file.replace makes another change, the service restarts fine, but not on the first run. I think it’s somehow related to postgres_first_run starting the service, but I can’t find anything relevant in the logs.

example states from debug output:

postgresql:
  pkg:
    - installed
    - name: postgresql-server
  service:
    - running
    - enable: True
    - watch:  
      - file: /var/lib/pgsql/data/pg_hba.conf 

/var/lib/pgsql/data/pg_hba.conf: 
  file.replace:
    - pattern: (^host.*127\.0\.0\.1/32\s*).*$ 
    - repl: \1md5 
    - watch: 
      - pkg: postgresql 

postgres_first_run: 
  cmd.wait:
    - name: /etc/init.d/postgresql start >/dev/null 2>&1; sleep 5; echo -e '\nchanged=yes\n' 
    - stateful: True 
    - watch: 
      - pkg: postgresql 
    - watch_in: 
      - file: /var/lib/pgsql/data/pg_hba.conf

About this issue

State: closed
Created 10 years ago
Reactions: 1
Comments: 66 (42 by maintainers)

Commits related to this issue

Allow automatic plugins installation service restart in jenkins/plugins.sls doesn't work at the moment (see https://github.com/saltstack/salt/issues/14183) — committed to dlax/jenkins-formula by deleted user 9 years ago

Most upvoted comments

@cmclaughlin Okay, so I discovered that you are definitely finding some unexpected behavior! It took a while for me to nail down exactly what’s happening - I’m not sure if it’s just a bug in our documentation, or if it’s an actual bug.

What’s happening here is kind of crazy.

If you remove the enable: True, or split it into

enable-apache:
  service.enabled:
    - name: httpd24

Everything works 🎉

The question you might have then is, “But why??”

Which is a darn good question 🤣

Turns out that the answer is pretty simple though - the only thing that’s going wrong here is that by starting (but not enabling) a service, when Salt looks at the service it says, “Ah, yeah, running is good. Oh, but let me enable this service. Cool, I’ve made some changes to it. What, you have a watch? No, I don’t need to do anything, I’ve already changed this service!”

Oops :trollface:

I definitely agree that it seems strange, or perhaps even unintuitive, but it is at least consistent, and now I understand exactly why 🙂

Just had a quick conversation, and this is definitely intended behavior - I’m going to see about adding some information to our docs that calls it out and perhaps gives a better explanation.
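
A minimal sketch of the split described above, assuming the service is named httpd24 as in the snippet. The config file ID, its source path, and the apache-running ID are hypothetical placeholders, not taken from the original report. The idea is that with enable kept out of service.running, that state reports no changes of its own when the service is already up, so the watch should be free to trigger the restart.

```yaml
# Hypothetical config file; the path and source are placeholders.
/etc/httpd/conf.d/example.conf:
  file.managed:
    - source: salt://apache/files/example.conf
    - watch_in:
      - service: apache-running

# Only ensures the service is running. There is no `enable` here, so this
# state makes no changes of its own when the service is already running.
apache-running:
  service.running:
    - name: httpd24

# Boot-time enablement is handled in its own state, as suggested above.
enable-apache:
  service.enabled:
    - name: httpd24
```

This has not been run against the setup in this thread, so treat it as an illustration of the suggestion rather than a verified fix.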

waynew on Jan 17, 2019

This is a 4-year-old bug and it still exists in the latest version, 2018.3.2. Any idea when or whether this will get a fix? @basepi @rallytime

rongshengfang on Oct 2, 2018

Here is an extract of my state file:

in apache/install.sls:

apache_service:
  service.running:
    - name: apache2
    - enable: True

in apache/site.sls:

/etc/apache2/sites-available/{{ name }}{{ site_extension }}:
  file.managed:
    - user: root
    - group: root
    - mode: 644
    - source: salt://apache/files/apache_conf/{{ name }}{{ site_extension }}
    - watch_in:
      - service: apache_service

I cannot see what is wrong with it.

As a temporary workaround I put this extra state in apache/install.sls (the onchanges is just there to ensure it is not run if nothing has changed):

apache_service_restart:
  cmd.run:
    - name: service apache2 restart
    - user: root
    - group: root
    - onchanges:
      - service: apache_service

and in apache/site.sls:

/etc/apache2/sites-available/{{ name }}{{ site_extension }}: 
  file.managed: 
    - user: root 
    - group: root 
    - mode: 644 
    - source: salt://apache/files/apache_conf/{{ name }}{{ site_extension }} 
    - onchanges_in: 
      - cmd: apache_service_restart

And it works fine.

I cannot figure out what is wrong in my first version.

doc75 on Jun 29, 2016

@waynew I finally had time to come up with a streamlined setup of what we are doing here when the problem occurs. I’m attaching a tarball that makes it 100% reproducible with the help of terraform and aws (only free tier resources are used). Running the exhibit.sh script will do the trick and it should be fairly easy to follow.

In short, it spins up an Ubuntu server, installs the latest Salt in masterless mode, copies the state files and pillar, and applies the highstate. It then changes a pillar value and re-applies the highstate, at which point the bug is triggered and the service is not restarted as expected.

In this case, it boils down to a bad interaction between onchanges and watch. Here is the sls for reference:

/tmp/test/index.html:
  file.managed:
    - makedirs: True
    - contents:
      - Hello World
{# this will break the watch on `test_running` #}
    - onchanges_in:
      - test_running

test_systemd_unit:
  file.managed:
    - name: /etc/systemd/system/test.service
    - source: salt://test.service.jinja
    - template: jinja
    - mode: 600
  module.run:
    - name: service.systemctl_reload
    - onchanges:
      - file: test_systemd_unit

test_running:
  service.running:
    - name: test
    - enable: True
    - watch:
      - module: test_systemd_unit

After changing the pillar, the salt-call output is:

local:
----------
          ID: /tmp/test/index.html
    Function: file.managed
      Result: True
     Comment: File /tmp/test/index.html is in the correct state
     Started: 15:48:28.159757
    Duration: 26.06 ms
     Changes:
----------
          ID: test_systemd_unit
    Function: file.managed
        Name: /etc/systemd/system/test.service
      Result: True
     Comment: File /etc/systemd/system/test.service updated
     Started: 15:48:28.186076
    Duration: 24.858 ms
     Changes:
              ----------
              diff:
                  ---
                  +++
                  @@ -9,7 +9,7 @@
                   Type=simple
                   Restart=on-failure
                   WorkingDirectory=/tmp/test
                  -ExecStart=/usr/bin/python3 -m http.server 8080
                  +ExecStart=/usr/bin/python3 -m http.server 8081

                   [Install]
                   WantedBy=multi-user.target
----------
          ID: test_systemd_unit
    Function: module.run
        Name: service.systemctl_reload
      Result: True
     Comment: Module function service.systemctl_reload executed
     Started: 15:48:28.212099
    Duration: 474.369 ms
     Changes:
              ----------
              ret:
                  True
----------
          ID: test_running
    Function: service.running
        Name: test
      Result: True
     Comment: State was not run because none of the onchanges reqs changed
     Started: 15:48:28.687950
    Duration: 0.015 ms
     Changes:

Summary for local
------------
Succeeded: 4 (changed=2)
Failed:    0
------------
Total states run:     4
Total run time: 525.302 ms

As you can see, the file.managed state of test_systemd_unit runs and produces changes, which are picked up by the onchanges on module.run, and the systemd unit file is reloaded. But then the test_running state is not triggered, on the grounds that none of its onchanges reqs changed. This is technically true, since its onchanges points to the file.managed of /tmp/test/index.html, which didn’t change. However, it happily ignores the watch, which does trigger it correctly if you remove the onchanges_in from the index.html state.
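
One possible way to avoid the clash, sketched under the assumption that the intent is to restart test whenever either the page or the unit file changes: declare both dependencies as watch requisites, so that no onchanges gate sits on test_running. The test_systemd_unit state is assumed to stay exactly as in the SLS above. This is an untested illustration, not a confirmed fix.

```yaml
/tmp/test/index.html:
  file.managed:
    - makedirs: True
    - contents:
      - Hello World
    # watch_in instead of onchanges_in: this adds a watch on test_running
    # rather than an onchanges condition that can skip the state entirely.
    - watch_in:
      - service: test_running

test_running:
  service.running:
    - name: test
    - enable: True
    - watch:
      - module: test_systemd_unit
```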

Cheers,

Olivier

odormond on Jan 17, 2019

Thanks for your patience… I’m all over the place on this one. Maybe my internal code doesn’t match up with the tests we’ve come up with here. I’ll take a closer look at my internal Salt code and see if I can track down the problem.

cmclaughlin on Jan 15, 2019

Same here… Amazon Linux and Apache. My scenario is:

Install Apache
Manage with service.running
Add new config file or cert with watch_in on the service
Service does not restart

However, editing the config file or cert after the initial write to the filesystem and re-applying the state does restart the service. At least in my case, it seems watch_in only applies to changes to existing files.

cmclaughlin on Oct 13, 2017