spec: IPSocket.getaddress with unknown host returning IP address

Hello,

I am seeing odd behavior from IPSocket.getaddress on RubyCI’s arm64 Ubuntu jammy server “arm64-neoverse-n1”. This server is not yet visible on the rubyci.org site.

The spec file library/socket/ipsocket/getaddress_spec.rb is failing in make test-spec.

https://rubyci.s3.amazonaws.com/arm64-neoverse-n1/ruby-master/recent.html
https://rubyci.s3.amazonaws.com/arm64-neoverse-n1/ruby-master/log/20231013T130005Z.fail.html.gz

I can also reproduce this failure with the latest ruby/spec commit 59bdcb4ea95c60159bb2bfc8c73022364da8ec0d and a fairly recent Ruby master (511571b5ff3aaab3ac013edc166a1bcf61f6d6d4), using the following command.

$ ruby -v
ruby 3.3.0dev (2023-10-13T14:21:33Z master 511571b5ff) [aarch64-linux]

$ ../mspec/bin/mspec library/socket/ipsocket/getaddress_spec.rb
$ ruby /home/jaruga/git/ruby/mspec/bin/mspec-run library/socket/ipsocket/getaddress_spec.rb
ruby 3.3.0dev (2023-10-13T14:21:33Z master 511571b5ff) [aarch64-linux]
                                                                                             
1)
Socket::IPSocket#getaddress raises an error on unknown hostnames FAILED
Expected SocketError but no exception was raised ("170.178.183.18" was returned)
/home/jaruga/git/ruby/spec/library/socket/ipsocket/getaddress_spec.rb:21:in `block (2 levels) in <top (required)>'
/home/jaruga/git/ruby/spec/library/socket/ipsocket/getaddress_spec.rb:4:in `<top (required)>'
[- | ==================100%================== | 00:00:00]      1F      0E 

Finished in 0.078555 seconds

1 file, 3 examples, 5 expectations, 1 failure, 0 errors, 0 tagged

A minimal reproducer

On the arm64 server

According to library/socket/ipsocket/getaddress_spec.rb, the following command is expected to raise SocketError. However, it returns the IP address 170.178.183.18 for the host “rubyspecdoesntexist.fallingsnow.net”.

$ ruby -v
ruby 3.3.0dev (2023-10-13T14:21:33Z master 511571b5ff) [aarch64-linux]
$ ruby -e 'require "socket"; p IPSocket.getaddress("rubyspecdoesntexist.fallingsnow.net")'
"170.178.183.18"

It also returns the same IP for a different subdomain.

$ ruby -e 'require "socket"; p IPSocket.getaddress("a.fallingsnow.net")'
"170.178.183.18"

It raises SocketError as expected when I specify a nonexistent host under a domain we manage, “rubyspecdoesntexist.ruby-lang.org”.

$ ruby -e 'require "socket"; p IPSocket.getaddress("rubyspecdoesntexist.ruby-lang.org")'
-e:1:in `getaddress': getaddrinfo: Name or service not known (SocketError)
  from -e:1:in `<main>'
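One way to narrow this down is to compare the lookup of the name as given with the lookup of the same name written as an absolute name (with a trailing dot). Resolvers generally do not apply the search list from /etc/resolv.conf to absolute names, so a difference between the two results would point at the search list rather than at the domain itself. The sketch below is only an illustration: `compare_lookups` and its `resolver:` parameter are hypothetical names, not part of ruby/spec or the socket library.

```ruby
require "socket"

# Hypothetical helper: resolve a name as given and as an absolute name
# (trailing dot), and report what each lookup produced.
# `resolver` is injectable so the logic can be checked without a network;
# by default it calls IPSocket.getaddress.
def compare_lookups(host, resolver: ->(name) { IPSocket.getaddress(name) })
  absolute = host.end_with?(".") ? host : "#{host}."
  [host, absolute].uniq.to_h do |name|
    result =
      begin
        resolver.call(name)
      rescue SocketError => e
        "SocketError: #{e.message}"
      end
    [name, result]
  end
end
```

Calling `compare_lookups("rubyspecdoesntexist.fallingsnow.net")` on the affected server and on a healthy machine should show whether only the relative form of the name resolves.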

The DNS client tools nslookup and dig do not resolve the host, though; both report NXDOMAIN:

$ nslookup rubyspecdoesntexist.fallingsnow.net
Server:		127.0.0.53
Address:	127.0.0.53#53

** server can't find rubyspecdoesntexist.fallingsnow.net: NXDOMAIN

$ dig rubyspecdoesntexist.fallingsnow.net

; <<>> DiG 9.18.12-0ubuntu0.22.04.3-Ubuntu <<>> rubyspecdoesntexist.fallingsnow.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 389
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;rubyspecdoesntexist.fallingsnow.net. IN	A

;; AUTHORITY SECTION:
fallingsnow.net.	300	IN	SOA	ns1.dnsimple.com. admin.dnsimple.com. 2013022005 86400 7200 604800 300

;; Query time: 32 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Mon Oct 16 15:21:30 UTC 2023
;; MSG SIZE  rcvd: 122

So, do you know why this happens? And what is the “fallingsnow.net” domain that the spec uses?

As the arm64 server is hosted on Equinix Cloud, if the issue comes from the server’s DNS server, I can ask the admin to fix it, provided we can reproduce the issue with a general DNS client tool.

On my local Fedora Linux x86_64

It raises SocketError as expected.

$ ruby -v
ruby 3.3.0dev (2023-10-13T14:21:33Z master 511571b5ff) [x86_64-linux]

$ ruby -e 'require "socket"; p IPSocket.getaddress("rubyspecdoesntexist.fallingsnow.net")'
-e:1:in `getaddress': getaddrinfo: Name or service not known (SocketError)
	from -e:1:in `<main>'

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 16 (14 by maintainers)

Most upvoted comments

Ah, it looks like it’s because systemd-resolved starts before bond0 appears (note the status=1/FAILURE on the ExecStartPost job in the systemctl status call) so my hacky plan needs adjustment. I don’t know much about Ubuntu’s networking setup, so you’ll have to fiddle a bit, but the trick is making sure the job runs after the interface appears.

You will want to remove the override (another call to systemctl edit or just deleting the override file should work). You can then add a new unit with systemctl edit --force --full hacky.service

[Unit]
Description="Gross hack to work around Ubuntu's broken ifupdown scripts"
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/bin/resolvectl domain bond0 ""

[Install]
WantedBy=multi-user.target

And then enable it. It’ll run after network-online.target which might be good enough. There’s probably a unit running the ifupdown script so setting After=<whatever that service is> might work too.

Edit: actually, reading the logs more closely helps. I see an ifup@bond0.service unit mentioned; try setting hacky.service (or whatever more helpful name you give it) to run after that with After=ifup@bond0.service, I think.

@jeremycline Thank you! I was able to fix the issue using your approach! After rebooting the OS, I don’t see any errors, and /etc/resolv.conf is set properly!

# systemctl edit --force --full systemd-resolved-hacky.service

I added the following content, which is essentially the same as yours. After=network-online.target worked.

# This service fixes the domain value by replacing the `search DOMAINS` with
# `search .` in /etc/resolv.conf.
# https://github.com/ruby/spec/issues/1095
# https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1978351
[Unit]
Description="Gross hack to work around Ubuntu's broken ifupdown scripts"
After=network-online.target

[Service]
Type=oneshot
# Fix the domain value.
# $ resolvectl status
# ...
# Link 4 (bond0)
# Current Scopes: none
#      Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
#     DNS Domain: DOMAINS
# ...
ExecStart=/usr/bin/resolvectl domain bond0 ""

[Install]
WantedBy=multi-user.target

$ sudo systemctl enable systemd-resolved-hacky.service
Created symlink /etc/systemd/system/multi-user.target.wants/systemd-resolved-hacky.service → /etc/systemd/system/systemd-resolved-hacky.service.
$ sudo systemctl start systemd-resolved-hacky.service

Then I rebooted and checked the result with resolvectl status.

$ sudo reboot
$ resolvectl status
Global
           Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
    resolv.conf mode: stub
  Current DNS Server: NNN.NNN.NNN.NNN (masking by myself)
         DNS Servers: NNN.NNN.NNN.NNN NNN.NNN.NNN.NNN (masking by myself)
Fallback DNS Servers: NNN.NNN.NNN.NNN NNN.NNN.NNN.NNN (masking by myself)

Link 2 (enp1s0f0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 3 (enp1s0f1)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 4 (bond0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

$ grep -v ^# /etc/resolv.conf

nameserver 127.0.0.53
options edns0 trust-ad
search .

$ sudo systemctl status systemd-resolved-hacky.service
○ systemd-resolved-hacky.service - "Gross hack to work around Ubuntu's broken ifupdown scripts"
     Loaded: loaded (/etc/systemd/system/systemd-resolved-hacky.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Thu 2023-10-19 10:21:53 UTC; 11min ago
    Process: 2105 ExecStart=/usr/bin/resolvectl domain bond0  (code=exited, status=0/SUCCESS)
   Main PID: 2105 (code=exited, status=0/SUCCESS)
        CPU: 11ms

Oct 19 10:21:53 ruby1 systemd[1]: Starting "Gross hack to work around Ubuntu's broken ifupdown scripts"...
Oct 19 10:21:53 ruby1 systemd[1]: systemd-resolved-hacky.service: Deactivated successfully.
Oct 19 10:21:53 ruby1 systemd[1]: Finished "Gross hack to work around Ubuntu's broken ifupdown scripts".

And the following ruby command now raises an error as expected.

$ ruby -e 'require "socket"; p IPSocket.getaddress("a.fallingsnow.net")'
-e:1:in `getaddress': getaddrinfo: Name or service not known (SocketError)
	from -e:1:in `<main>'

Thank you! The issue was fixed.
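For readers wondering why changing `search DOMAINS` to `search .` matters, here is a rough model of how a stub resolver expands a name using the `search` list in /etc/resolv.conf. It is a simplified sketch based on the rules described in resolv.conf(5) (the last `search` line wins, and with the default ndots of 1 a name containing a dot is tried as-is first). It is not the glibc or systemd-resolved implementation, and `search_domains` and `candidate_names` are hypothetical names. The thread does not confirm the exact lookup path, but an extra candidate with the bogus `DOMAINS` suffix would be consistent with the behavior reported above.

```ruby
# Simplified model of search-list expansion; real resolvers have more rules.

# Return the domains from the last `search` line of a resolv.conf text.
def search_domains(resolv_conf_text)
  line = resolv_conf_text.lines.map(&:strip).reverse.find { |l| l.match?(/\Asearch\s/) }
  line ? line.split.drop(1) : []
end

# List the absolute names a resolver might try for `host`, in order.
def candidate_names(host, resolv_conf_text, ndots: 1)
  return [host] if host.end_with?(".") # absolute name: search list is skipped

  as_is = "#{host}."
  suffixed = search_domains(resolv_conf_text).map do |domain|
    domain = domain.chomp(".")
    domain.empty? ? as_is : "#{host}.#{domain}." # `search .` adds nothing new
  end
  ordered = host.count(".") >= ndots ? [as_is, *suffixed] : [*suffixed, as_is]
  ordered.uniq
end

broken = "nameserver 127.0.0.53\nsearch DOMAINS\n"
fixed  = "nameserver 127.0.0.53\noptions edns0 trust-ad\nsearch .\n"
p candidate_names("a.fallingsnow.net", broken)
p candidate_names("a.fallingsnow.net", fixed)
```

In this model the broken configuration yields a second candidate ending in `.DOMAINS.`, while the fixed configuration leaves only the name itself, so an NXDOMAIN for the real name is final.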