coredns: Segfault when forwarder list contains local server itself

I recently migrated from dnsmasq and very much like coredns overall (now running everywhere), but of course also immediately found a bug. 😃

With the following config:

. {
	forward . /etc/resolv.conf
	cache
	log
}

and the following resolv.conf:

# host-local instance for resilience/caching
nameserver 127.0.0.1
# LAN-local instance that forwards to my ISP etc.
nameserver 192.168.100.222

coredns reproducibly crashes with the following log a few seconds after startup and a bit of initial random lookup activity:

coredns -conf crash.conf
.:53
2018/10/05 01:25:11 [INFO] CoreDNS-1.2.2
2018/10/05 01:25:11 [INFO] linux/amd64, go1.11, eb51e8bac90fac86d34c9e1cb89b04ea0936b034
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8bac90fac86d34c9e1cb89b04ea0936b034
127.0.0.1:56626 - [05/Oct/2018:01:25:15 +0200] 35427 "PTR IN 166.241.155.62.in-addr.arpa. udp 45 false 512" NXDOMAIN qr,rd,ra 139 0.017405332s
127.0.0.1:50454 - [05/Oct/2018:01:25:15 +0200] 35427 "PTR IN 166.241.155.62.in-addr.arpa. udp 45 false 512" NXDOMAIN qr,rd,ra 139 0.01795297s
127.0.0.1:56209 - [05/Oct/2018:01:25:15 +0200] 22338 "PTR IN 38.118.5.217.in-addr.arpa. udp 43 false 512" NXDOMAIN qr,rd,ra 140 0.000773091s
127.0.0.1:51105 - [05/Oct/2018:01:25:24 +0200] 57904 "AAAA IN ntp3.ptb.de. udp 29 false 512" NOERROR qr,rd,ra 112 0.000827708s
127.0.0.1:58737 - [05/Oct/2018:01:25:24 +0200] 61978 "A IN ntp3.ptb.de. udp 29 false 512" NOERROR qr,rd,ra 100 0.000752208s
127.0.0.1:56626 - [05/Oct/2018:01:25:24 +0200] 61978 "A IN ntp3.ptb.de. udp 29 false 512" NOERROR qr,rd,ra 100 0.001321987s
127.0.0.1:51105 - [05/Oct/2018:01:25:24 +0200] 61978 "A IN ntp3.ptb.de. udp 29 false 512" NOERROR qr,rd,ra 100 0.001900355s
127.0.0.1:60269 - [05/Oct/2018:01:25:25 +0200] 63069 "PTR IN 3.0.1.0.0.0.0.0.0.0.0.0.0.0.0.0.1.0.e.b.0.1.6.0.8.3.6.0.1.0.0.2.ip6.arpa. udp 90 false 512" NOERROR qr,rd,ra 191 0.000858299s
127.0.0.1:59572 - [05/Oct/2018:01:25:41 +0200] 29616 "AAAA IN time1.google.com. udp 34 false 512" NOERROR qr,rd,ra 78 0.009314037s
127.0.0.1:56671 - [05/Oct/2018:01:25:41 +0200] 64920 "A IN time1.google.com. udp 34 false 512" NOERROR qr,rd,ra 66 0.022090466s
127.0.0.1:59572 - [05/Oct/2018:01:25:41 +0200] 64920 "A IN time1.google.com. udp 34 false 512" NOERROR qr,rd,ra 66 0.022850007s
127.0.0.1:56671 - [05/Oct/2018:01:25:42 +0200] 53596 "PTR IN 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.f.0.0.8.4.0.3.1.0.0.0.0.3.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 181 0.068109175s
127.0.0.1:42052 - [05/Oct/2018:01:25:42 +0200] 53596 "PTR IN 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.f.0.0.8.4.0.3.1.0.0.0.0.3.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 181 0.068753824s
127.0.0.1:58481 - [05/Oct/2018:01:25:42 +0200] 10822 "PTR IN 1.0.0.0.1.0.0.0.0.0.0.0.0.0.0.0.8.4.0.0.0.0.0.8.0.5.4.1.0.0.a.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 184 0.027052733s
127.0.0.1:56671 - [05/Oct/2018:01:25:42 +0200] 10822 "PTR IN 1.0.0.0.1.0.0.0.0.0.0.0.0.0.0.0.8.4.0.0.0.0.0.8.0.5.4.1.0.0.a.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 184 0.027765411s
127.0.0.1:36055 - [05/Oct/2018:01:25:42 +0200] 10822 "PTR IN 1.0.0.0.1.0.0.0.0.0.0.0.0.0.0.0.8.4.0.0.0.0.0.8.0.5.4.1.0.0.a.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 184 0.028311023s
127.0.0.1:50097 - [05/Oct/2018:01:25:42 +0200] 20716 "PTR IN c.6.1.2.0.0.0.0.0.0.0.0.0.0.0.0.1.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.042064483s
127.0.0.1:44948 - [05/Oct/2018:01:25:42 +0200] 23716 "PTR IN 1.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.f.d.1.1.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.036674568s
127.0.0.1:58481 - [05/Oct/2018:01:25:42 +0200] 60421 "PTR IN 4.7.8.f.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.029485152s
127.0.0.1:56671 - [05/Oct/2018:01:25:42 +0200] 60421 "PTR IN 4.7.8.f.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.030067642s
127.0.0.1:47395 - [05/Oct/2018:01:25:42 +0200] 60421 "PTR IN 4.7.8.f.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.030589227s
127.0.0.1:56061 - [05/Oct/2018:01:25:42 +0200] 49242 "PTR IN 1.4.e.c.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.02515867s
127.0.0.1:50750 - [05/Oct/2018:01:25:42 +0200] 26028 "PTR IN 8.8.3.d.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.022117572s
127.0.0.1:60533 - [05/Oct/2018:01:25:43 +0200] 54255 "PTR IN 9.0.1.d.0.0.0.0.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.125546181s
127.0.0.1:49160 - [05/Oct/2018:01:25:43 +0200] 6629 "PTR IN 2.6.3.e.0.0.0.4.c.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.1.0.0.2.ip6.arpa. udp 90 false 512" NXDOMAIN qr,rd,ra 192 0.117271752s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x11e3939]

goroutine 42 [running]:
github.com/coredns/coredns/plugin/forward.(*Proxy).Healthcheck.func1(0x4363f9, 0x1938450)
	/tmp/portage/net-dns/coredns-1.2.2/work/coredns-1.2.2/src/github.com/coredns/coredns/plugin/forward/proxy.go:54 +0x29
github.com/coredns/coredns/plugin/pkg/up.(*Probe).Do.func1(0xc0004a86b0, 0x1dcd6500, 0xc00014a5c0)
	/tmp/portage/net-dns/coredns-1.2.2/work/coredns-1.2.2/src/github.com/coredns/coredns/plugin/pkg/up/up.go:38 +0x36
created by github.com/coredns/coredns/plugin/pkg/up.(*Probe).Do
	/tmp/portage/net-dns/coredns-1.2.2/work/coredns-1.2.2/src/github.com/coredns/coredns/plugin/pkg/up/up.go:36 +0xa0

This happens both with my local build (Gentoo x86_64, Go 1.11) and with the prebuilt binaries, for both 1.2.0 and 1.2.2. I don't know much Go, so I'm afraid the stack trace above is all I can provide; however, I can test and verify patches as long as they apply easily to 1.2.2.

I understand of course that forwarding to yourself is strictly speaking nonsense, but IMHO it should nevertheless work (just as it does with dnsmasq), e.g. by having the forward plugin filter out its own server, or - in the worst case - by failing cleanly instead of segfaulting. Using resolv.conf as above was simply the first thing I tried, and it immediately triggered the bug. I now manually specify a forwarder list that no longer contains the local instance, and all is well.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

@hhoffstaette, could you cherry-pick the change from #2165 and check whether it fixes your problem?

I mean, the problem definitely exists, and we now have a better idea of which direction to dig in.