mitogen: Templating broken when constructing value for `ansible_ssh_common_args`
Hi, I’m on ansible-core-2.12.2 (thx for all the work in getting that done) and mitogen v0.3.2.
We have some basic jinja inside one of our vars files:
---
# Use the correct jump host
ansible_ssh_common_args: >-
  -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@{{ hostvars.jumphost.public_ip_address }}'
This causes errors:
TASK [Waiting for connection] *********************************************************************************************************************************************************
task path: /Users/dick.visser/git/deploy_dick/data/acc/site.yml:417
[WARNING]: Unhandled error in Python interpreter discovery for host acc_proxy1: EOF on stream; last 100 lines received: ssh: Could not resolve hostname {{: nodename nor servname
provided, or not known kex_exchange_identification: Connection closed by remote host
If I hardcode it like this:
---
# Use the correct jump host
ansible_ssh_common_args: >-
  -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@18.13.233.18'
then things work…
The jinja inside the inventory works fine with ansible v3.4.0 (ansible-base 2.10.x)
Any thoughts?
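A minimal sketch of why the error mentions the hostname `{{`, assuming the template string reaches ssh unrendered. It uses Python's `shlex` as an approximation of shell word splitting; the real parsing is done by ssh and the shell that runs the ProxyCommand, so treat this as an illustration that is consistent with the log line, not as a trace of what Mitogen does.

```python
import shlex

# The value from the vars file, as it looks if Jinja2 never renders it.
common_args = (
    "-o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null "
    "-o StrictHostKeyChecking=no -W %h:%p "
    "admin@{{ hostvars.jumphost.public_ip_address }}'"
)

# First split: how the outer ssh command line is tokenized.
# The quoted ProxyCommand stays together as a single argument.
outer = shlex.split(common_args)

# Second split: the ProxyCommand string is itself run as a command,
# so the spaces inside "{{ ... }}" now separate words.
proxy_command = outer[1].split("=", 1)[1]
inner = shlex.split(proxy_command)

# The first user@host token is what the inner ssh would treat as its destination.
destination = next(tok for tok in inner if "@" in tok)
user, host = destination.split("@", 1)

print(outer[0], "|", destination, "|", host)
```

Under these assumptions the inner ssh is asked to connect to a host literally named `{{`, which matches "Could not resolve hostname {{" in the warning.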
About this issue
- State: open
- Created 2 years ago
- Reactions: 14
- Comments: 29 (6 by maintainers)
Same issue for me while installing kubespray with mitogen 0.3.3.
After playing a little with the Python script and the responsible file (thx @Zocker1999NET), I found a way to fix it. However, I haven’t yet taken the time to check whether the change has side effects or causes other issues; my guess was that hostvars is the view of vars for each host. Hope this is right!
The fix is to replace
self._task_vars.get("vars", {}) with self._task_vars.get("hostvars", {}).get(self._inventory_name, {}) in PlayContextSpec, around line 483 (method ssh_args). Result looks like:
I won’t be able to verify the fix until August, but if someone can play with it, let’s share the result! Edit: meaning I was not able to run the playbook to the end to be sure it works, but it definitely fixes the blocking task.
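The lookup change described above can be sketched with a toy `task_vars` dict. This is only an illustration: the real object is built by Ansible, and the assumption that the per-host entry under `hostvars` holds already-rendered values is hard-coded here rather than demonstrated. The host name `acc_proxy1` and the address are taken from the original report.

```python
# Toy stand-in for the task_vars mapping (assumption: "vars" holds the raw
# template, while hostvars[<host>] holds the rendered value).
task_vars = {
    "vars": {
        "ansible_ssh_common_args":
            "-o ProxyCommand='ssh -W %h:%p admin@{{ hostvars.jumphost.public_ip_address }}'",
    },
    "hostvars": {
        "acc_proxy1": {
            "ansible_ssh_common_args":
                "-o ProxyCommand='ssh -W %h:%p admin@18.13.233.18'",
        },
    },
}
inventory_name = "acc_proxy1"


def variables_before(task_vars):
    """Lookup used before the patch: the task-level 'vars' mapping."""
    return task_vars.get("vars", {})


def variables_after(task_vars, inventory_name):
    """Lookup used by the patch: the per-host view under 'hostvars'."""
    return task_vars.get("hostvars", {}).get(inventory_name, {})


before = variables_before(task_vars)["ansible_ssh_common_args"]
after = variables_after(task_vars, inventory_name)["ansible_ssh_common_args"]

print("before:", before)
print("after: ", after)
```

Note that the patched lookup falls back to an empty dict for a host that is missing from `hostvars`, so any option read from it would then come from its configured default.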
Any way to get this merged? I also need to apply the patch to get my setup working…
Hello, PR #956 sent.
You’re right, no other change in this commit affects the outcome of my tests. But it may also be that an internal change in Ansible (also) causes this bug; I’m not quite sure:
While trying to find the smallest partial revert of commit c61c063b4f9b2b63dcaa86443631a268c9f72870, I detected a difference in the result of my small ping test depending on the version of Ansible used.
Beginning from tag v0.3.2, after applying the diff at the end (which reverts the commit partially), running
ansible -m ping host with a small test inventory works for Ansible 2.10 as expected but stops working for Ansible 5.4.0 (core 2.12.3) with the same error message:
So partially reverting this change does work for older Ansible versions (~ 2.10) but not for newer ones (~ 5.4.0 / 2.12.3).
This is the diff mentioned above:
Just to update: no more error with ansible.posix.synchronize, as described in this thread.
Hello,
Thanks for the patch, @momiji. It works for a bastion host with
ansible_ssh_common_args in a template. Unfortunately, after applying the patch to both ssh_args methods in mitogen/ansible_mitogen/transport_config.py, it introduces another issue with the ansible.posix.synchronize module (ansible.posix collection 1.2.0). When using use_ssh_args: true to rsync a folder, templating does not seem to work for synchronize. https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html
I have playbook tasks:
Playbook run error:
@momiji I think this can be a valid fix for this issue. I applied the change to both ssh_args methods in
mitogen/ansible_mitogen/transport_config.py and ran a fairly large Ansible repo I maintain in check mode, and everything seemed fine. It could connect to all expected hosts, even with templates in ansible_ssh_common_args, and did not report any new diffs or errors. Can you create a PR with this patch so it can be reviewed?
I took the time to inspect further and found a difference in how C.config.get_config_value is called by Ansible and by Mitogen.
To get the configuration of ssh_common_args, Mitogen calls:
https://github.com/mitogen-hq/mitogen/blob/89c0cc94d16218e2647bb8bb32b011231def0fd7/ansible_mitogen/transport_config.py#L478
Ansible plugins (here ssh) use a helper, AnsiblePlugin.get_option, which does (if GitHub does not render the code, click on the links):
https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/connection/ssh.py#L743-L744
https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/__init__.py#L55-L62
Intercepting these calls to get_config_value reveals that the calls from the official ssh plugin set the argument variables to a dict containing all host variables already resolved (i.e. no longer in template form, after Jinja2). However, Mitogen’s connection plugin sets the argument to a dict that probably contains the task variables unresolved (i.e. still in template form, before Jinja2).
Meaning in practice: Given these example host vars:
Then the argument variables of get_config_value looks like
- {…, "ansible_ssh_common_args": "--my-option", …} if called from Ansible’s ssh plugin
- {…, "ansible_ssh_common_args": "{{ other var }}", …} if called from Mitogen’s connection plugin
I do not know Ansible’s Python code well enough to fix this, probably by resolving the variables properly before passing them to get_config_value, but maybe this helps someone else.
Encountering the same issue
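The idea of resolving variables before they are handed to the config lookup can be sketched as follows. Everything here is a stand-in: `resolve`, `resolve_all` and `lookup_config_value` are hypothetical helpers, the regex substitution is a toy replacement for Jinja2 (it only handles plain `{{ name }}` references), and the variable name `other_var` is assumed for illustration. Ansible's real templating and `get_config_value` are considerably more involved.

```python
import re

# Matches simple "{{ name }}" references only (toy stand-in for Jinja2).
_TEMPLATE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(value, variables, max_depth=10):
    """Substitute {{ name }} references until the string stops changing."""
    for _ in range(max_depth):
        rendered = _TEMPLATE.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value
        )
        if rendered == value:
            break
        value = rendered
    return value


def resolve_all(variables):
    """Return a copy of the mapping with every string value rendered."""
    return {
        key: resolve(val, variables) if isinstance(val, str) else val
        for key, val in variables.items()
    }


def lookup_config_value(name, variables):
    """Hypothetical, heavily simplified lookup: read ansible_<name>."""
    return variables.get("ansible_" + name)


host_vars = {
    "other_var": "--my-option",
    "ansible_ssh_common_args": "{{ other_var }}",
}

# Passing the unresolved mapping yields the template text itself.
raw = lookup_config_value("ssh_common_args", host_vars)
# Resolving first yields the value the ssh plugin would see.
resolved = lookup_config_value("ssh_common_args", resolve_all(host_vars))

print(repr(raw), "->", repr(resolved))
```

In this toy model the only difference between the two results is whether the mapping was rendered before the lookup, which mirrors the difference in the `variables` argument observed in the comment above.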