cloudflared: Tunnel errors with "cannot unmarshal DNS message"

INFO[0000] ResolveEdgeIPs err
ERRO[0000] Quitting due to error                         error="lookup cloudflarewarp.com on 127.0.0.53:53: cannot unmarshal DNS message"
INFO[0000] Metrics server stopped

I’m running Ubuntu 18.04.2 LTS with kernel 4.15.0-45-generic.

127.0.0.53 is the default caching DNS stub resolver on Ubuntu Server, provided by systemd-resolved. It’s set up to forward queries to 1.1.1.1 and 1.0.0.1 and cache the answers.

If I change the DNS server in /etc/resolv.conf and hard-code it to 1.1.1.1, the tunnel starts. However, it’s odd that it doesn’t work on a default Ubuntu install.
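To check whether a machine is still pointing at the systemd-resolved stub before and after editing /etc/resolv.conf, something like the following Go sketch could be used. The helper names `nameservers` and `usesStub` are made up for this illustration and are not part of cloudflared; the sample file content is also an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// nameservers returns the addresses found on "nameserver" lines of a
// resolv.conf-style text, in the order they appear.
func nameservers(conf string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) >= 2 && f[0] == "nameserver" {
			out = append(out, f[1])
		}
	}
	return out
}

// usesStub reports whether the first nameserver is 127.0.0.53, the
// systemd-resolved stub listener mentioned in this report.
func usesStub(conf string) bool {
	ns := nameservers(conf)
	return len(ns) > 0 && ns[0] == "127.0.0.53"
}

func main() {
	// Hypothetical file content; a real check would read /etc/resolv.conf.
	conf := "# managed by systemd-resolved\nnameserver 127.0.0.53\noptions edns0\n"
	fmt.Println(nameservers(conf), usesStub(conf))
}
```

To run it against the real file, the string could be loaded with `os.ReadFile("/etc/resolv.conf")` instead of the inline sample.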

Can you check to see if you can reproduce on your end?

# systemd-resolve --status
Global
         DNS Servers: 1.1.1.1
                      1.0.0.1
                      94.237.127.9
                      94.237.40.9
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 4 (eth2)
      Current Scopes: none

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (2 by maintainers)

Most upvoted comments

I have the exact same issue on Ubuntu on GCP. I changed my resolv.conf temporarily to 1.1.1.1 to get the tunnel service to start. Any idea whether the tunnel will stay up when resolv.conf gets overwritten by the DHCP client?

Hi, I’m trying to run cloudflared on Ubuntu 18 on an NVIDIA Jetson Nano (arm64), but I’m also facing this issue, even with the latest 2023.7.1. I tried installing and using dnsmasq, but I’m still getting the same error. Any idea how to fix it?

Log:

2023-07-17T10:47:18Z INF Starting tunnel tunnelID=9ecf8095-0fb5-4161-b419-874a8ceef77c
2023-07-17T10:47:18Z INF Version 2023.7.1
2023-07-17T10:47:18Z INF GOOS: linux, GOVersion: go1.19.11, GoArch: arm64
2023-07-17T10:47:18Z INF Settings: map[no-autoupdate:true token:*****]
2023-07-17T10:47:18Z INF Generated Connector ID: fb01e07c-6696-4411-abc7-01314b417ee2
2023-07-17T10:47:18Z INF Initial protocol quic
2023-07-17T10:47:18Z INF ICMP proxy will use 172.17.0.2 as source for IPv4
2023-07-17T10:47:18Z INF ICMP proxy will use :: as source for IPv6
2023-07-17T10:47:18Z ERR edge discovery: error looking up Cloudflare edge IPs: the DNS query failed error="lookup argotunnel.com on 100.100.100.100:53: cannot unmarshal DNS message" event=0
2023-07-17T10:47:18Z ERR Please try the following things to diagnose this issue: event=0
2023-07-17T10:47:18Z ERR   1. ensure that argotunnel.com is returning "origintunneld" service records. event=0
2023-07-17T10:47:18Z ERR      Run your system's equivalent of: dig srv _origintunneld._tcp.argotunnel.com event=0
2023-07-17T10:47:18Z ERR   2. ensure that your DNS resolver is not returning compressed SRV records. event=0
2023-07-17T10:47:18Z ERR      See GitHub issue https://github.com/golang/go/issues/27546 event=0
2023-07-17T10:47:18Z ERR      For example, you could use Cloudflare's 1.1.1.1 as your resolver: event=0
2023-07-17T10:47:18Z ERR      https://developers.cloudflare.com/1.1.1.1/setting-up-1.1.1.1/ event=0
2023-07-17T10:47:18Z INF Starting metrics server on 127.0.0.1:41147/metrics
2023-07-17T10:47:18Z ERR edge discovery: error looking up Cloudflare edge IPs: the DNS query failed error="lookup argotunnel.com on 100.100.100.100:53: cannot unmarshal DNS message" event=0
2023-07-17T10:47:18Z ERR Please try the following things to diagnose this issue: event=0
2023-07-17T10:47:18Z ERR   1. ensure that argotunnel.com is returning "origintunneld" service records. event=0
2023-07-17T10:47:18Z ERR      Run your system's equivalent of: dig srv _origintunneld._tcp.argotunnel.com event=0
2023-07-17T10:47:18Z ERR   2. ensure that your DNS resolver is not returning compressed SRV records. event=0
2023-07-17T10:47:18Z ERR      See GitHub issue https://github.com/golang/go/issues/27546 event=0
2023-07-17T10:47:18Z ERR      For example, you could use Cloudflare's 1.1.1.1 as your resolver: event=0
2023-07-17T10:47:18Z ERR      https://developers.cloudflare.com/1.1.1.1/setting-up-1.1.1.1/ event=0
2023-07-17T10:47:18Z INF Tunnel server stopped
2023-07-17T10:47:18Z ERR Initiating shutdown error="Could not lookup srv records on _v2-origintunneld._tcp.argotunnel.com: lookup argotunnel.com on 100.100.100.100:53: cannot unmarshal DNS message"
2023-07-17T10:47:18Z INF Metrics server stopped
Could not lookup srv records on _v2-origintunneld._tcp.argotunnel.com: lookup argotunnel.com on 100.100.100.100:53: cannot unmarshal DNS message

I can confirm it works in Ubuntu on GCP now. Thanks for the quick fix.

We have just released 2019.3.0, which addresses this issue.

Would y’all please try that version and let us know if it resolves your errors?

I had to reduce/kill the tunnels to dodge this bug. Anything using Ubuntu 18.04 is failing.

A simplistic solution could be to use DoH with cloudflared, for example:

curl -v 'https://1.1.1.1/dns-query?ct=application/dns-json&name=cloudflare.com'
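As a rough illustration of the DoH suggestion above, here is a Go sketch that builds a JSON DoH query URL and decodes a response body. It deliberately makes no HTTP request. The `type` query parameter and the JSON field names (`Status`, `Answer`, `name`, `type`, `TTL`, `data`) are assumptions based on Cloudflare's JSON DoH format and should be checked against the current documentation; `dohURL`, `parseDoH` and the sample body are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// dohAnswer and dohResponse model only the fields this sketch needs.
type dohAnswer struct {
	Name string `json:"name"`
	Type int    `json:"type"`
	TTL  int    `json:"TTL"`
	Data string `json:"data"`
}

type dohResponse struct {
	Status int         `json:"Status"`
	Answer []dohAnswer `json:"Answer"`
}

// dohURL builds a query URL against the 1.1.1.1 endpoint used in the
// curl command above. The "type" parameter is an assumption.
func dohURL(name, qtype string) string {
	q := url.Values{}
	q.Set("name", name)
	q.Set("type", qtype)
	return "https://1.1.1.1/dns-query?" + q.Encode()
}

// parseDoH decodes a JSON DoH response body.
func parseDoH(body []byte) (dohResponse, error) {
	var r dohResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	fmt.Println(dohURL("_v2-origintunneld._tcp.argotunnel.com", "SRV"))

	// Made-up response body, for illustration only.
	sample := []byte(`{"Status":0,"Answer":[{"name":"example.com","type":1,"TTL":300,"data":"192.0.2.1"}]}`)
	r, err := parseDoH(sample)
	fmt.Println(r.Status, len(r.Answer), err)
}
```

Because the lookup goes over HTTPS, it bypasses the local stub resolver entirely, which is why it sidesteps the unmarshal error.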

I’ve tested both dnsmasq 2.79 and unbound 1.9.0 as a local DNS cache on Ubuntu 18.04.2, as well as putting nameserver 1.1.1.1 at the top of /etc/resolv.conf. Based on those tests, I’m pretty sure this is purely a systemd-resolved issue. Hopefully they can fix it and backport the fix soon.

Perhaps Cloudflare could look into allowing the DNS server IP address to be specified on the CLI or in the config?
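For reference, Go's standard library already allows a program to direct its lookups at a chosen server. The sketch below shows the general technique with `net.Resolver`; it is not how cloudflared is implemented, and `resolverFor` is a hypothetical helper name. The SRV name is the one shown in the log output in this thread.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolverFor returns a resolver that sends every query to addr
// (host:port) instead of the servers listed in /etc/resolv.conf.
func resolverFor(addr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use Go's built-in resolver so Dial below is honored
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 5 * time.Second}
			return d.DialContext(ctx, network, addr)
		},
	}
}

func main() {
	r := resolverFor("1.1.1.1:53")
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Queries _v2-origintunneld._tcp.argotunnel.com via the chosen server.
	_, srvs, err := r.LookupSRV(ctx, "v2-origintunneld", "tcp", "argotunnel.com")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, s := range srvs {
		fmt.Printf("%s:%d\n", s.Target, s.Port)
	}
}
```

The `Dial` hook ignores the address Go would normally use (taken from the system configuration) and substitutes the one passed in, so the local stub at 127.0.0.53 is never consulted.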

Thanks for the report!

I suspect this is related to https://github.com/golang/go/issues/27546. In particular, a commenter there specifically mentions SRV responses from systemd-resolved on Ubuntu 18.04. There’s an upstream systemd commit to address the issue: https://github.com/systemd/systemd/pull/9828

But I have a GCP Ubuntu instance here running systemd version 239 (that commit landed in version 240), and I’m seeing correct resolution. So that may not be the whole story.
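To make "compressed SRV records" concrete: in the DNS wire format (RFC 1035), a name may end in a two-byte pointer whose first byte has its top two bits set, referring back to an earlier position in the message. The Go sketch below only detects such a pointer in a raw name. It is not Go's actual parser, and `nameUsesCompression` and the sample bytes are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// nameUsesCompression walks the domain name that starts at msg[off] and
// reports whether it contains an RFC 1035 compression pointer.
func nameUsesCompression(msg []byte, off int) (bool, error) {
	for {
		if off >= len(msg) {
			return false, errors.New("name runs past end of message")
		}
		b := msg[off]
		switch {
		case b == 0:
			return false, nil // root label: name ended without a pointer
		case b&0xC0 == 0xC0:
			return true, nil // top two bits set: compression pointer
		case b&0xC0 != 0:
			return false, errors.New("reserved label type")
		default:
			off += 1 + int(b) // skip the length byte plus the label itself
		}
	}
}

func main() {
	// Hypothetical SRV target "edge.example." written out in full.
	plain := []byte("\x04edge\x07example\x00")
	// Same target, with "example." replaced by a pointer to offset 12.
	packed := []byte("\x04edge\xc0\x0c")

	for _, m := range [][]byte{plain, packed} {
		c, err := nameUsesCompression(m, 0)
		fmt.Println(c, err)
	}
}
```

A resolver that emits the second form inside SRV record data is what the linked Go issue and the cloudflared diagnostic message refer to.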