salt: Using mine.get while targeting grain returns data from dead minions
Hello,
I’m currently running 2014.7.2. After removing some minions that were VMs using Foreman with the foreman-salt plugin (hence deleting their keys), the data I get from the mine while targeting by grain is always from the minions that are now gone.
I’ve tried `salt-run cache.clear_all tgt='*'`, `salt '*' saltutil.clear_cache` and `salt '*' mine.flush`, and nothing seems to change the outcome.
What I’m trying to run is `salt-call mine.get 'kernel:Linux' backend_ip_addr grain` in a template, so it looks like `{% for host, ips in salt['mine.get']('kernel:Linux', 'backend_ip_addr', 'grain').items() %}`.
If I do `salt-call mine.get '*' backend_ip_addr` I get data without the dead minions. Weirdly enough, if I do something like `salt-call mine.get 'os:Ubuntu' backend_ip_addr grain`, I only get data from the same host as the minion I’m running this on, even though there are many more minions running Ubuntu.
I’ve tried to target minions using pillar and I get either nothing or things that don’t make sense…
Thank you.
About this issue
- State: closed
- Created 9 years ago
- Reactions: 2
- Comments: 45 (13 by maintainers)
What I ended up having to do was go into `/var/cache/salt/master` and delete all the minions that are dead by removing their whole directories, then run `salt '*' saltutil.clear_cache`. This finally seemed to work, and the data I receive is now correct.
As far as I can see, flush_mine_on_destroy must be supported by the salt-cloud driver; for now I see it only for nova and the drivers with libcloud.
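The manual cleanup described above (removing the cache directories of minions whose keys no longer exist) could be scripted roughly as follows. This is a minimal sketch, not part of Salt: the function name `find_stale_minion_caches` is hypothetical, and the default paths assume a stock master layout where the minion data cache lives under `/var/cache/salt/master/minions` and accepted keys under `/etc/salt/pki/master/minions`. It only lists candidates and deletes nothing.

```python
import os


def find_stale_minion_caches(
    cache_dir="/var/cache/salt/master/minions",    # assumed default cache location
    accepted_keys_dir="/etc/salt/pki/master/minions",  # assumed default accepted-keys dir
):
    """Return cached minion IDs that have no accepted key on the master.

    Hypothetical helper: a cache directory without a matching key file is
    treated as belonging to a removed ("dead") minion.
    """
    accepted = set(os.listdir(accepted_keys_dir))
    return sorted(
        name
        for name in os.listdir(cache_dir)
        if os.path.isdir(os.path.join(cache_dir, name)) and name not in accepted
    )
```

After reviewing the returned list, each directory could be removed by hand (or with `shutil.rmtree`), followed by `salt '*' saltutil.clear_cache` as in the comment above.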
I’ve done more to debug issue #35439 since I last wrote, and I believe this issue and that one are the same thing. I wanted to give some information to other ops professionals out there who might read this, and also workarounds.
tl;dr: use `saltutil.cmd` in an orchestration, if orchestration is an option.
Grain Targeting is Broken
Deeply situated in the logic of salt is a function dealing with how to target minions based on grains. When salt does this, it consults the minion data cache. By default, the minion data cache is found in `/var/cache/salt/master/minions/`. If there is an entry for a particular minion in the cache, salt uses it to determine if the minion should be targeted. If there is not an entry for the minion, salt assumes the minion should be targeted.
The last sentence is what fundamentally messes things up for mine users. It is actually a fairly safe assumption in the case where you are using grain targeting to run something like `state.apply`, since when a minion is targeted which shouldn’t be targeted (that is, the grain used in targeting isn’t set on that minion), it simply ignores the request and you get “Minion did not return” when the call returns.
However, the same logic is used for figuring out what entries are or even should be in the mine. On `mine.get` calls, say in the jinja of a state file, this causes minions which shouldn’t be returned by `mine.get` to be returned, causing mayhem. I’m hazier on the details here, but I’m sure this is what is happening.
So in reality, there is no problem with the mine code; it’s deeper than that. It’s in the grain targeting code in salt.
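The fallback described above can be modelled with a few lines of Python. This is a toy sketch of the behaviour as described in this comment, not Salt's actual code; the function name `grain_match` and the in-memory `cache` dict are hypothetical stand-ins for the matcher and the minion data cache.

```python
def grain_match(candidates, cache, grain, value):
    """Toy model of grain targeting against a minion data cache.

    candidates: iterable of minion IDs known to the master
    cache:      dict mapping minion ID -> dict of cached grains
    """
    matched = []
    for minion_id in candidates:
        grains = cache.get(minion_id)
        if grains is None:
            # No cache entry: the minion is assumed to match. This is the
            # assumption that lets unwanted minions into mine.get results.
            matched.append(minion_id)
        elif grains.get(grain) == value:
            # Cache entry present: decide from the cached grains.
            matched.append(minion_id)
    return matched


cache = {"web1": {"kernel": "Linux"}, "db1": {"kernel": "Windows"}}
print(grain_match(["web1", "db1", "new1"], cache, "kernel", "Linux"))
```

Here `new1` has no cache entry, so it is returned alongside `web1` even though nothing is known about its grains.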
If you want this fixed, please comment on issue #35439. It’s where I documented all of my debugging work, and where I found out why this is happening.
Workaround: Use Glob Targeting
You might not be able to do this, but if you are, it’s easily the best way to get around this issue. For example, instead of using this:
You may want to use this instead:
This seems to work under all sorts of conditions. It also is sad 😦, since grain targeting is awesome.
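To illustrate why the glob workaround avoids the problem: a glob is matched against minion IDs only, so cached grains play no part in the decision. The sketch below models that with Python's `fnmatch`; the minion IDs and the `web-*` naming scheme are hypothetical, and this is not Salt's actual matcher.

```python
from fnmatch import fnmatchcase


def glob_match(minion_ids, pattern):
    """Return the minion IDs whose name matches the shell-style pattern."""
    return [m for m in minion_ids if fnmatchcase(m, pattern)]


# Hypothetical minion IDs that encode the role in the hostname.
print(glob_match(["web-01", "web-02", "db-01"], "web-*"))
```

In a template this would correspond to passing a glob such as `'web-*'` as the target of `mine.get` and leaving out the `'grain'` argument, which only works if your minion IDs follow a naming scheme you can match on.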
Workaround: Salt Orchestration
@almoore pointed this one out 😃
One way to get around using the mine entirely is to use `saltutil.cmd` in conjunction with inline pillars in an orchestration.
As an example, the following salt orchestration provides network IP address information of other minions as a pillar, instead of using the mine to accomplish the same thing:
This workaround has the advantage of getting the “mined data” immediately before the state is called.
It has the disadvantage that it can’t be called using `salt-call`; this orchestration must be run from the master.
Workaround: Consul
You can use Consul as a makeshift mine. You would create one state to populate Consul with “mine data”, and one state to consume the mine data. All the minions that need to put data into Consul get the “populate” state applied, and all the minions that need to consume it have states run on them which contain the appropriate `salt['consul.get']()` calls.
The advantage is that you can use this as a drop-in replacement for the mine. Since Consul is populated using a state call, it should be safe to use grain targeting with this option. `salt-call`s should work as well as calling the states from the master.
The disadvantage is that it’s a bit complicated to set up. That said, I have set up a POC, and I know it works.
Or just this:
I’ve been using this solution in an orchestrate state for 3 months already without any issues.
`salt '*' mine.flush` removes the `/var/cache/salt/master/minions/<node>/mine.p` file, but only for minions that are ‘alive’. The removed minions still have their directory containing mine.p and data.p. Neither `salt-run cache.clear_all tgt='*'` nor `salt '*' saltutil.clear_cache` appears to do anything.
My resolution steps (not all steps might be necessary, but I wasn’t taking chances):