netty: DNS resolver failing to find valid DNS record

Expected behavior

The DNS resolver should find valid DNS records.

Actual behavior

Exception thrown:

Caused by: io.netty.resolver.dns.DnsNameResolverContext$SearchDomainUnknownHostException: Search domain query failed. Original hostname: 'host.toplevel' failed to resolve 'host.toplevel.search.domain' after 7 queries 
	at io.netty.resolver.dns.DnsNameResolverContext.finishResolve(DnsNameResolverContext.java:721)
	at io.netty.resolver.dns.DnsNameResolverContext.tryToFinishResolve(DnsNameResolverContext.java:663)
	at io.netty.resolver.dns.DnsNameResolverContext.query(DnsNameResolverContext.java:306)
	at io.netty.resolver.dns.DnsNameResolverContext.query(DnsNameResolverContext.java:295)
	at io.netty.resolver.dns.DnsNameResolverContext.tryToFinishResolve(DnsNameResolverContext.java:636)
	at io.netty.resolver.dns.DnsNameResolverContext$3.operationComplete(DnsNameResolverContext.java:342)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
	at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
	at io.netty.resolver.dns.DnsQueryContext.setSuccess(DnsQueryContext.java:197)
	at io.netty.resolver.dns.DnsQueryContext.finish(DnsQueryContext.java:180)
	at io.netty.resolver.dns.DnsNameResolver$DnsResponseHandler.channelRead(DnsNameResolver.java:969)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1412)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:943)
	at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)

Steps to reproduce

  1. Configure a top-level domain someDomain on a DNS server you own.
  2. Configure a host under the new top-level domain: someHost.someDomain.
  3. Configure multiple resolvers on the DNS client machine that will run the Netty code, e.g. 8.8.8.8, 192.168.1.1, and 10.0.0.1. (I have 3 resolvers configured, each pointing to a different DNS master: global DNS, local personal private network, and company private network over a VPN.)
  4. Configure a search domain that does not match the top-level domain, e.g. search.otherDomain, on the DNS client machine that will run the Netty code.
  5. Ask Netty to resolve someHost.someDomain.
  6. Resolution fails with the exception shown under "Actual behavior".
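
For readers unfamiliar with search domains, the sketch below models how a resolv.conf-style resolver expands a host name into the list of names it queries. It is a simplified illustration using only the JDK, not Netty's actual implementation; the class name SearchDomainCandidates and the ndots value of 1 are assumptions made for the example. The exception message above is consistent with the search-domain-suffixed name being the last candidate tried.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of resolv.conf-style search handling. NOT Netty's code.
final class SearchDomainCandidates {

    // Returns the fully qualified names a resolver would try, in order.
    static List<String> candidates(String hostname, List<String> searchDomains, int ndots) {
        List<String> result = new ArrayList<>();
        if (hostname.endsWith(".")) {
            // A trailing dot marks the name as absolute: search domains do not apply.
            result.add(hostname);
            return result;
        }
        long dots = hostname.chars().filter(c -> c == '.').count();
        // With at least ndots dots, the name is tried as-is before any search domain.
        boolean asIsFirst = dots >= ndots;
        if (asIsFirst) {
            result.add(hostname + ".");
        }
        for (String domain : searchDomains) {
            result.add(hostname + "." + domain + ".");
        }
        if (!asIsFirst) {
            result.add(hostname + ".");
        }
        return result;
    }

    public static void main(String[] args) {
        // Names taken from the reproduction steps; ndots = 1 is an assumed setting.
        System.out.println(candidates("someHost.someDomain", Arrays.asList("search.otherDomain"), 1));
    }
}
```

With the names from the steps above, the as-is name is queried first and the name with the search domain appended second, so a lookup that reports a failure on the second name has already failed on the first.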

Minimal yet complete reproducer code (or URL to code)

I’m not using Netty directly so I’m not sure what to put here. Do you want my Redisson code?

Netty version

Breaks when I upgrade to Redisson 3.6+, which pulls in Netty 4.1.20+. When forcing a downgrade to Netty 4.1.13, the problem still shows, but with a slightly different stack trace.

JVM version (e.g. java -version)

java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)

OS version (e.g. uname -a)

Windows 10, CentOS 7, Ubuntu 16.04

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 93 (47 by maintainers)

Most upvoted comments

@normanmaurer @johnjaylward

Bingo! Finally managed to reproduce the issue. Below is the test code:

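        import io.netty.channel.nio.NioEventLoopGroup;
        import io.netty.channel.socket.nio.NioDatagramChannel;
        import io.netty.resolver.AddressResolver;
        import io.netty.resolver.dns.DnsAddressResolverGroup;
        import io.netty.resolver.dns.DnsServerAddressStreamProviders;
        import io.netty.util.concurrent.Future;
        import java.net.InetSocketAddress;
        import java.net.URI;
        import java.net.URISyntaxException;
        import java.util.List;
        import java.util.concurrent.CountDownLatch;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        public class DnsResolveRepro {
            public static void main(String[] args) throws InterruptedException {
                NioEventLoopGroup niogroup = new NioEventLoopGroup();
                // Use the platform's default DNS servers, in their configured order.
                DnsAddressResolverGroup group = new DnsAddressResolverGroup(NioDatagramChannel.class, DnsServerAddressStreamProviders.platformDefault());
                ExecutorService es = Executors.newFixedThreadPool(2);
                CountDownLatch latch = new CountDownLatch(100);
                // Resolve the same host 100 times from two worker threads.
                for (int i = 0; i < 100; i++) {
                    es.execute(new Runnable() {
                        @Override
                        public void run() {
                            AddressResolver<InetSocketAddress> resolver = group.getResolver(niogroup.next());
                            try {
                                URI uri = new URI("redis://dev.myredis.com:6379");
                                Future<List<InetSocketAddress>> allNodes = resolver.resolveAll(InetSocketAddress.createUnresolved(uri.getHost(), uri.getPort()));
                                // syncUninterruptibly() rethrows the resolution failure, if any.
                                List<InetSocketAddress> list = allNodes.syncUninterruptibly().getNow();
                                System.out.println(list);
                            } catch (URISyntaxException e) {
                                e.printStackTrace();
                            } finally {
                                latch.countDown();
                            }
                        }
                    });
                }
                // Wait for all lookups to finish, then release resources.
                latch.await();
                group.close();
                es.shutdown();
                niogroup.shutdownGracefully();
            }
        }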

MaraDNS (http://maradns.samiam.org/) is running on 127.0.0.1:53. Here is its config: https://github.com/netty/netty/files/1941275/maradns-config.zip

If that server comes first in the list of DNS servers, everything works fine:

Default DNS servers: [/127.0.0.1:53, /8.8.8.8:53, /87.98.175.85:53, /51.254.25.115:53]
Log without errors: https://gist.github.com/mrniko/8ea3153888aa3dd8315b4549a77c34bb

But if it is not first, a lot of errors arise:

Default DNS servers: [/8.8.8.8:53, /127.0.0.1:53, /87.98.175.85:53, /51.254.25.115:53]
Log with errors: https://gist.github.com/mrniko/59529c03450c69094d7b379515919ea2

Netty 4.1.30.Final, JDK 11
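
One plausible reading of the two logs is that a negative answer from the first server in the list ends the lookup, so a name known only to a later server is never found. The toy model below contrasts that policy with one that moves on to the next server. It is a sketch under stated assumptions: each "server" is just an in-memory map, the class and method names are hypothetical, the address 192.0.2.10 is made up, and none of this is Netty's resolver logic.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model only: each "server" is a map from host name to address.
// This is not real DNS and not Netty's resolver logic.
final class NameServerOrderDemo {

    // Policy A: whatever the first server says is final, including "no such name".
    static Optional<String> firstAnswerIsFinal(List<Map<String, String>> servers, String host) {
        if (servers.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(servers.get(0).get(host));
    }

    // Policy B: a negative answer moves the lookup on to the next server.
    static Optional<String> tryNextOnMiss(List<Map<String, String>> servers, String host) {
        for (Map<String, String> server : servers) {
            String address = server.get(host);
            if (address != null) {
                return Optional.of(address);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The public server does not know the private name; the local one does.
        Map<String, String> publicDns = Collections.emptyMap();
        Map<String, String> localDns = Collections.singletonMap("dev.myredis.com", "192.0.2.10");
        List<Map<String, String>> publicFirst = Arrays.asList(publicDns, localDns);
        List<Map<String, String>> localFirst = Arrays.asList(localDns, publicDns);

        System.out.println("A, local first:  " + firstAnswerIsFinal(localFirst, "dev.myredis.com"));
        System.out.println("A, public first: " + firstAnswerIsFinal(publicFirst, "dev.myredis.com"));
        System.out.println("B, public first: " + tryNextOnMiss(publicFirst, "dev.myredis.com"));
    }
}
```

Under policy A the result depends on server order, which matches the symptom reported here; under policy B it does not.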

OK, let me do a PR for the refresh in the meantime.

On 14.10.2018 at 09:01, Trustin Lee notifications@github.com wrote:

I think there are two action items:

  1. Refreshing the system DNS resolver configuration periodically. This may or may not be relevant to this issue, but we should fix this.
  2. Changing our DNS resolver so that its behavior is on par with that of the host OS. This probably will fix this issue, but I’m not 100% sure it’s worth fixing given the complexity and its platform-dependent nature. It’d be natural to behave in the same way as the host OS from Windows users’ standpoint, though.

Would love to hear what other folks think about this.

Also, I checked the JDK implementation again, and it reloads the DNS server config from time to time. We should probably do the same…
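
As an illustration of the periodic refresh idea, here is a minimal JDK-only sketch. It is not the change that went into Netty: the class RefreshingServerList, its loader argument (standing in for whatever reads the OS configuration), and the string representation of servers are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder that re-reads the DNS server list on a schedule.
final class RefreshingServerList implements AutoCloseable {
    private final Supplier<List<String>> loader; // stands in for reading the OS config
    private final AtomicReference<List<String>> current;
    private final ScheduledExecutorService scheduler;

    RefreshingServerList(Supplier<List<String>> loader) {
        this.loader = loader;
        this.current = new AtomicReference<>(snapshot(loader.get()));
        // Daemon thread so the refresher never keeps the JVM alive.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dns-config-refresh");
            t.setDaemon(true);
            return t;
        });
    }

    private static List<String> snapshot(List<String> servers) {
        return Collections.unmodifiableList(new ArrayList<>(servers));
    }

    // Starts reloading every `period`; the first reload happens after one period.
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::refresh, period, period, unit);
    }

    // Reloads once; on failure the last known good list is kept.
    void refresh() {
        try {
            current.set(snapshot(loader.get()));
        } catch (RuntimeException e) {
            // Keep serving the previous list.
        }
    }

    List<String> servers() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        AtomicReference<List<String>> osConfig = new AtomicReference<>(Arrays.asList("8.8.8.8"));
        try (RefreshingServerList list = new RefreshingServerList(osConfig::get)) {
            osConfig.set(Arrays.asList("127.0.0.1", "8.8.8.8")); // simulated config change
            list.refresh(); // a scheduled run would do the same
            System.out.println(list.servers());
        }
    }
}
```

Readers always see an immutable snapshot, so a lookup in progress is never affected by a reload that happens midway.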

On 11.10.2018 at 19:45, Trustin Lee notifications@github.com wrote:

This comment explains the difference between glibc and Windows resolver: https://bugzilla.redhat.com/show_bug.cgi?id=160914#c5


Any idea when this will be fixed?