mitogen: Templating broken when constructing value for `ansible_ssh_common_args`

Hi, I’m on ansible-core-2.12.2 (thx for all the work in getting that done) and mitogen v0.3.2.

We have some basic jinja inside one of our vars files:

---
# Use the correct jump host
ansible_ssh_common_args: >-
  -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@{{ hostvars.jumphost.public_ip_address }}'

This causes errors:

TASK [Waiting for connection] *********************************************************************************************************************************************************
task path: /Users/dick.visser/git/deploy_dick/data/acc/site.yml:417
[WARNING]: Unhandled error in Python interpreter discovery for host acc_proxy1: EOF on stream; last 100 lines received: ssh: Could not resolve hostname {{: nodename nor servname
provided, or not known  kex_exchange_identification: Connection closed by remote host

If I hardcode it like this:

---
# Use the correct jump host
ansible_ssh_common_args: >-
  -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@18.13.233.18'

then things work…

The jinja inside the inventory works fine with ansible v3.4.0 (ansible-base 2.10.x)

Any thoughts?
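
One way to see why ssh ends up complaining about a host named `{{`: if the Jinja expression is never rendered, the raw string gets tokenized as is. The sketch below only approximates that with Python's standard `shlex` module (an outer split for the option string, an inner split for what the shell does with the ProxyCommand). It is not Mitogen's or Ansible's actual code path, and the variable names are made up for illustration.

```python
import shlex

# The raw value from the vars file, with the Jinja2 expression left unrendered.
raw_common_args = (
    "-o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null "
    "-o StrictHostKeyChecking=no -W %h:%p "
    "admin@{{ hostvars.jumphost.public_ip_address }}'"
)

# Outer split: roughly what happens when the option string becomes argv.
outer = shlex.split(raw_common_args)

# outer[1] is "ProxyCommand=ssh ... admin@{{ hostvars... }}"; keep the command part.
proxy_command = outer[1].split("=", 1)[1]

# Inner split: approximates the shell tokenizing the ProxyCommand on whitespace.
inner = shlex.split(proxy_command)

# The first token containing "@" is what the inner ssh treats as user@host.
destination = next(tok for tok in inner if "@" in tok)
user, host = destination.split("@", 1)

print(user)  # admin
print(host)  # {{
```

The whitespace inside `{{ ... }}` is what breaks the expression apart, so the inner ssh sees `admin@{{` as its destination, which matches the "Could not resolve hostname {{" message.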

About this issue

State: open
Created 2 years ago
Reactions: 14
Comments: 29 (6 by maintainers)

Most upvoted comments

Same issue here while installing kubespray with mitogen 0.3.3.

After playing a little with the Python script and the responsible file (thx @Zocker1999NET), I found a way to fix it. However, I haven't yet taken the time to check whether the change has side effects or causes other issues; my guess is that hostvars is the view of vars for each host. Hope this is right!

The fix is to replace self._task_vars.get("vars", {}) with self._task_vars.get("hostvars", {}).get(self._inventory_name, {}) in PlayContextSpec, around line 483 (method ssh_args).

Result looks like:

    def ssh_args(self):
        return [
            mitogen.core.to_text(term)
            for s in (
                C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {})),
                C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {})),
                C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {}))
            )
            for term in ansible.utils.shlex.shlex_split(s or '')
        ]

I won’t be able to verify the fix until August, but if someone can try it, please share the result! Edit: meaning I was not able to run the playbook to the end to be sure it works, but it definitely fixes the blocking task.

momiji on Jul 7, 2022

Any way to get this merged? I also need to apply the patch to get my setup working…

sebastianreloaded on Oct 6, 2022

Hello, PR #956 sent.

momiji on Aug 22, 2022

It may be ansible_mitogen/transport_config.py

You’re right, none of the other changes in this commit affect the outcome of my tests. But it may also be that an internal change in Ansible (also) causes this bug; I’m not quite sure:

While trying to find the smallest partial revert of commit c61c063b4f9b2b63dcaa86443631a268c9f72870, I detected a difference in the result of my small ping test depending on the version of Ansible used.

Beginning from tag v0.3.2, after applying the diff at the end (which reverts the commit partially), running ansible -m ping host with a small test inventory works for Ansible 2.10 as expected but stops working for Ansible 5.4.0 (core 2.12.3) with the same error message:

host | UNREACHABLE! => {
    "changed": false,
    "msg": "EOF on stream; last 100 lines received:\nssh: Could not resolve hostname {%: Name or service not known\r",
    "unreachable": true
}

So partially reverting this change does work for older Ansible versions (~ 2.10) but not for newer ones (~ 5.4.0 / 2.12.3).

This is the diff mentioned above:

diff --git a/ansible_mitogen/transport_config.py b/ansible_mitogen/transport_config.py
index 4babbde3..344c3d84 100644
--- a/ansible_mitogen/transport_config.py
+++ b/ansible_mitogen/transport_config.py
@@ -467,9 +467,9 @@ class PlayContextSpec(Spec):
         return [
             mitogen.core.to_text(term)
             for s in (
-                C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
-                C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
-                C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {}))
+                getattr(self._play_context, 'ssh_args', ''),
+                getattr(self._play_context, 'ssh_common_args', ''),
+                getattr(self._play_context, 'ssh_extra_args', '')
             )
             for term in ansible.utils.shlex.shlex_split(s or '')
         ]
@@ -696,9 +696,22 @@ class MitogenViaSpec(Spec):
         return [
             mitogen.core.to_text(term)
             for s in (
-                C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
-                C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
-                C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {}))
+                (
+                    self._host_vars.get('ansible_ssh_args') or
+                    getattr(C, 'ANSIBLE_SSH_ARGS', None) or
+                    os.environ.get('ANSIBLE_SSH_ARGS')
+                    # TODO: ini entry. older versions.
+                ),
+                (
+                    self._host_vars.get('ansible_ssh_common_args') or
+                    os.environ.get('ANSIBLE_SSH_COMMON_ARGS')
+                    # TODO: ini entry.
+                ),
+                (
+                    self._host_vars.get('ansible_ssh_extra_args') or
+                    os.environ.get('ANSIBLE_SSH_EXTRA_ARGS')
+                    # TODO: ini entry.
+                ),
             )
             for term in ansible.utils.shlex.shlex_split(s)
             if s

Zocker1999NET on Mar 7, 2022

Hello,

Thanks for the patch @momiji. It works for a bastion host with a template in ansible_ssh_common_args. Unfortunately, after applying the patch to both ssh_args methods in mitogen/ansible_mitogen/transport_config.py, another issue appears with the ansible.posix.synchronize module (ansible.posix collection 1.2.0). When use_ssh_args: true is set to rsync a folder, the template does not seem to be rendered for synchronize. https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html

I have playbook tasks:

  tasks:
    - name: Sync scripts
      ansible.posix.synchronize:
        src: ../roles/my_server/files/opt/scripts/
        dest: /opt/scripts/
        recursive: true
        use_ssh_args: true
        archive: false
        rsync_opts:
          - '--chmod=0750'
          - '-o'
          - '-g'
          - '--chown=root:mycustomgroup'

Playbook run error:

{
  "rc": 255,
  "cmd": "sshpass -d18 /usr/bin/rsync --delay-updates -F --compress --recursive --rsh=/usr/bin/ssh -S none -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\" --rsync-path=sudo rsync --chmod=0750 -o -g --chown=root:mycustomgroup --out-format=<<CHANGED>>%i %n%L /runner/project/roles/my_server/files/opt/scripts/ ansible@myserver:/opt/scripts/",
  "msg": "ssh: Could not resolve hostname {{: Name or service not known\r\nkex_exchange_identification: Connection closed by remote host\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]\n",
  "invocation": {
    "module_args": {
      "src": "/runner/project/roles/my_server/files/opt/scripts/",
      "dest": "ansible@myserver:/opt/scripts/",
      "recursive": true,
      "archive": false,
      "rsync_opts": [
        "--chmod=0750",
        "-o",
        "-g",
        "--chown=root:mycustomgroup"
      ],
      "_local_rsync_path": "rsync",
      "_local_rsync_password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
      "private_key": null,
      "rsync_path": "sudo rsync",
      "ssh_args": "-o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\"",
      "delete": false,
      "_substitute_controller": false,
      "checksum": false,
      "compress": true,
      "existing_only": false,
      "dirs": false,
      "copy_links": false,
      "set_remote_user": true,
      "rsync_timeout": 0,
      "ssh_connection_multiplexing": false,
      "partial": false,
      "verify_host": false,
      "mode": "push",
      "dest_port": null,
      "links": null,
      "perms": null,
      "times": null,
      "owner": null,
      "group": null,
      "link_dest": null
    }
  },
  "_ansible_no_log": false,
  "changed": false
}

Just an update: with ansible.posix 1.4.0, and with the patch from https://github.com/mitogen-hq/mitogen/pull/956 applied on the latest commit https://github.com/mitogen-hq/mitogen/commit/572636a9d3c5a4ac4e8591c42f29763cb56fe602, the ansible.posix.synchronize error above no longer occurs.

hungpr0 on Oct 3, 2022

Hello,

I have playbook tasks:

  tasks:
    - name: Sync scripts
      ansible.posix.synchronize:
        src: ../roles/my_server/files/opt/scripts/
        dest: /opt/scripts/
        recursive: true
        use_ssh_args: true
        archive: false
        rsync_opts:
          - '--chmod=0750'
          - '-o'
          - '-g'
          - '--chown=root:mycustomgroup'

Playbook run error:

{
  "rc": 255,
  "cmd": "sshpass -d18 /usr/bin/rsync --delay-updates -F --compress --recursive --rsh=/usr/bin/ssh -S none -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\" --rsync-path=sudo rsync --chmod=0750 -o -g --chown=root:mycustomgroup --out-format=<<CHANGED>>%i %n%L /runner/project/roles/my_server/files/opt/scripts/ ansible@myserver:/opt/scripts/",
  "msg": "ssh: Could not resolve hostname {{: Name or service not known\r\nkex_exchange_identification: Connection closed by remote host\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]\n",
  "invocation": {
    "module_args": {
      "src": "/runner/project/roles/my_server/files/opt/scripts/",
      "dest": "ansible@myserver:/opt/scripts/",
      "recursive": true,
      "archive": false,
      "rsync_opts": [
        "--chmod=0750",
        "-o",
        "-g",
        "--chown=root:mycustomgroup"
      ],
      "_local_rsync_path": "rsync",
      "_local_rsync_password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
      "private_key": null,
      "rsync_path": "sudo rsync",
      "ssh_args": "-o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\"",
      "delete": false,
      "_substitute_controller": false,
      "checksum": false,
      "compress": true,
      "existing_only": false,
      "dirs": false,
      "copy_links": false,
      "set_remote_user": true,
      "rsync_timeout": 0,
      "ssh_connection_multiplexing": false,
      "partial": false,
      "verify_host": false,
      "mode": "push",
      "dest_port": null,
      "links": null,
      "perms": null,
      "times": null,
      "owner": null,
      "group": null,
      "link_dest": null
    }
  },
  "_ansible_no_log": false,
  "changed": false
}

hungpr0 on Jul 13, 2022

@momiji I think this can be a valid fix for this issue. I applied the change to both ssh_args methods in mitogen/ansible_mitogen/transport_config.py and ran a relatively huge Ansible repo I maintain in check mode and everything seemed fine. It could connect to all hosts expected even with templates in ansible_ssh_common_args and did not report any new diffs or errors. Can you create a PR with this patch so it might be reviewed?

Zocker1999NET on Jul 7, 2022

I took the time to inspect further and found a difference in how Ansible and Mitogen call C.config.get_config_value.

For getting the configuration of ssh_common_args, Mitogen calls:

https://github.com/mitogen-hq/mitogen/blob/89c0cc94d16218e2647bb8bb32b011231def0fd7/ansible_mitogen/transport_config.py#L478

Ansible plugins (here ssh) use a helper AnsiblePlugin.get_option which does (if GitHub does not render the code, click on the links):

https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/connection/ssh.py#L743-L744

https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/__init__.py#L55-L62

Intercepting these calls to get_config_value reveals that the official ssh plugin sets the variables argument to a dict containing all host variables already resolved (i.e. rendered by Jinja2, no longer in template form). Mitogen’s connection plugin, however, sets the argument to a dict that probably contains the task variables unresolved (i.e. still in template form, before Jinja2 rendering).

Meaning in practice: Given these example host vars:

ansible_ssh_common_args: "{{ other_var }}"
other_var: "--my-option"

Then the variables argument of get_config_value looks like

{…, "ansible_ssh_common_args": "--my-option", …} if called from Ansible’s ssh plugin
{…, "ansible_ssh_common_args": "{{ other_var }}", …} if called from Mitogen’s connection plugin

I do not know Ansible’s Python code well enough to fix this (probably by resolving the variables properly before passing them to get_config_value), but maybe this helps someone else.
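
To make the resolved vs. unresolved difference concrete, here is a minimal sketch using the two example host vars from above. The `resolve` and `resolve_all` helpers are hypothetical and only handle the simplest `{{ name }}` form with a regex. They are a stand-in for Jinja2 and Ansible's templating, not what either plugin actually runs.

```python
import re

# Matches only the simplest form, {{ name }}; real Jinja2 expressions are far richer.
_SIMPLE_EXPR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(value, variables, max_depth=10):
    """Substitute {{ name }} references until the value stops changing."""
    for _ in range(max_depth):
        rendered = _SIMPLE_EXPR.sub(lambda m: str(variables[m.group(1)]), value)
        if rendered == value:
            return rendered
        value = rendered
    raise ValueError("template nesting too deep or cyclic")


def resolve_all(variables):
    """Return a copy of the dict with every string value resolved."""
    return {
        key: resolve(val, variables) if isinstance(val, str) else val
        for key, val in variables.items()
    }


host_vars = {
    "ansible_ssh_common_args": "{{ other_var }}",
    "other_var": "--my-option",
}

# What a caller sees if it is handed the variables in template form.
unresolved = dict(host_vars)
# What a caller sees if the variables were resolved first.
resolved = resolve_all(host_vars)

print(unresolved["ansible_ssh_common_args"])  # {{ other_var }}
print(resolved["ansible_ssh_common_args"])    # --my-option
```

Whichever dict is passed as variables determines whether the option value still contains template syntax by the time it is split into ssh arguments.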

Zocker1999NET on May 10, 2022

Encountering the same issue

ansible 2.10.17
mitogen-0.3.2

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=3m -o ForwardAgent=yes
control_path = ~/.ssh/ansible-%%C

ansible_ssh_jumphost: "{{ hostvars[groups['jumphost_servers'][0]]['ansible_host'] }}"
ansible_ssh_common_args: '-o ProxyCommand="ssh -W %h:%p -q {{ ansible_ssh_user }}@{{ ansible_ssh_jumphost }}"'

kex_exchange_identification: Connection closed by remote host

guytet on Mar 12, 2022