uptime-kuma: [dns] query A fails

👟 Reproduction steps

Set up a DNS monitor using the default Cloudflare resolver server, 1.1.1.1.

👍 Expected behavior

The monitor shouldn’t regularly trigger as DOWN when the actual domain’s DNS is resolving just fine.

To better diagnose the underlying problem, I set up a nearly identical UK DNS monitor using Google DNS (8.8.8.8/8.8.4.4), and no UK incidents have been seen since! An added bonus: Google DNS seems to support ‘ANY/ALL’ DNS queries whereas Cloudflare does not, meaning we have a way to gather most of the DNS record types for the domain.

👎 Actual Behavior

UK frequently detects the domain’s DNS A record as DOWN with the message:

queryA ESERVFAIL domain.com

We have many A record DNS monitors in place for multiple domain names, and we have experienced this across all of them.
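The Cloudflare versus Google comparison described above can also be approximated outside Uptime Kuma. The following is a minimal sketch, not Uptime Kuma's own monitor code: it uses Node's built-in `dns.promises.Resolver` to send the same A query to each resolver and reports either the records or the error code. The function names `makeResolver`, `checkA`, and `compareResolvers` are made up for this illustration.

```javascript
const { Resolver } = require("dns").promises;

// Build a resolver that ignores the system DNS settings and talks
// only to the given server (for example "1.1.1.1" or "8.8.8.8").
function makeResolver(server) {
  const resolver = new Resolver();
  resolver.setServers([server]);
  return resolver;
}

// Query the A records for `hostname` through `server` and report either
// the addresses or the DNS error code (for example "ESERVFAIL").
async function checkA(server, hostname) {
  try {
    const records = await makeResolver(server).resolve4(hostname);
    return { server, ok: true, records };
  } catch (err) {
    return { server, ok: false, code: err.code, message: err.message };
  }
}

// Run the same query against several resolvers side by side.
async function compareResolvers(hostname, servers = ["1.1.1.1", "8.8.8.8"]) {
  return Promise.all(servers.map((server) => checkA(server, hostname)));
}
```

Calling `compareResolvers("domain.com").then(console.log)` on a timer, with the real domain in place of the placeholder, should show whether the ESERVFAIL responses come from one resolver only or from both.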

๐Ÿป Uptime-Kuma version

1.9.1

๐Ÿ’ป Operating System

Ubuntu 20.04

๐ŸŒ Browser

Any

๐Ÿ‹ Docker

N/A

๐Ÿท๏ธ Docker Image Tag

N/A

๐ŸŸฉ NodeJS Version

14.8.1

๐Ÿ“ Relevant log output

Up	2021-10-31 01:16:24	Records: 123.123.123.123
Down	2021-10-31 01:15:01	queryA ESERVFAIL domain.com
Up	2021-10-30 19:24:56	Records: 123.123.123.123
Down	2021-10-30 19:23:32	queryA ESERVFAIL domain.com
Up	2021-10-30 15:42:27	Records: 123.123.123.123
Down	2021-10-30 15:41:04	queryA ESERVFAIL domain.com
Up	2021-10-30 12:49:59	Records: 123.123.123.123
Down	2021-10-30 12:48:35	queryA ESERVFAIL domain.com
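To see how long each DOWN period in the log above lasted, the entries can be paired up programmatically. This is a rough sketch that assumes the log format shown above (status, timestamp, message, separated by whitespace). It treats the timestamps as UTC purely to compute differences, since the log does not state a time zone. `parseLine` and `outageDurations` are illustrative names.

```javascript
const LOG = `Up\t2021-10-31 01:16:24\tRecords: 123.123.123.123
Down\t2021-10-31 01:15:01\tqueryA ESERVFAIL domain.com
Up\t2021-10-30 19:24:56\tRecords: 123.123.123.123
Down\t2021-10-30 19:23:32\tqueryA ESERVFAIL domain.com
Up\t2021-10-30 15:42:27\tRecords: 123.123.123.123
Down\t2021-10-30 15:41:04\tqueryA ESERVFAIL domain.com
Up\t2021-10-30 12:49:59\tRecords: 123.123.123.123
Down\t2021-10-30 12:48:35\tqueryA ESERVFAIL domain.com`;

// Parse one log line into { status, time, message }, or null if it does not match.
function parseLine(line) {
  const match = /^(Up|Down)\s+(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(.*)$/.exec(line);
  if (!match) return null;
  const [, status, date, clock, message] = match;
  // Assumption: interpret the timestamp as UTC; only differences are used.
  return { status, time: Date.parse(`${date}T${clock}Z`), message };
}

// Return the length in seconds of each Down -> Up period, oldest first.
function outageDurations(logText) {
  const events = logText
    .split("\n")
    .map(parseLine)
    .filter(Boolean)
    .sort((a, b) => a.time - b.time);
  const durations = [];
  let downSince = null;
  for (const event of events) {
    if (event.status === "Down" && downSince === null) {
      downSince = event.time;
    } else if (event.status === "Up" && downSince !== null) {
      durations.push((event.time - downSince) / 1000);
      downSince = null;
    }
  }
  return durations;
}

console.log(outageDurations(LOG)); // [ 84, 83, 84, 83 ]
```

Each of the four outages in this excerpt lasted under a minute and a half before the monitor recovered.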

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn’t find a similar issue

🛡️ Security Policy

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 30 (12 by maintainers)

Most upvoted comments

cacheable-lookup does not work properly in some cases. As of 1.19.x, the DNS cache can be disabled in Settings.

I’m just wondering if the problem w/ the short names is that the cached DNS record is shared across every monitor using the same connection options? Should that key be more complex (include the ID of the monitor, for example)?


I don’t think so, because an HTTP agent is reusable across requests that share the same agent options. An HTTP agent is not tied to a single domain.

You can see the example in https://github.com/szmarczak/cacheable-lookup#attaching-cacheablelookup-to-an-agent

So far, I have not received a large number of similar bug reports, so I assume it is a very specific issue. As @ljurk said, he is using Windows DNS Server.
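To illustrate why sharing one cache across monitors does not by itself mix up domains, here is a minimal sketch of a TTL cache keyed by hostname. It is a hypothetical stand-in written for this thread, not cacheable-lookup's actual implementation. `HostnameCache` is an invented name, and the address is the placeholder from the log above.

```javascript
// Hypothetical per-hostname TTL cache (NOT cacheable-lookup's real code).
class HostnameCache {
  constructor(now = Date.now) {
    this.entries = new Map(); // hostname -> { addresses, expiresAt }
    this.now = now;           // injectable clock, handy for testing
  }

  // Store the addresses for a hostname for `ttlMs` milliseconds.
  set(hostname, addresses, ttlMs) {
    this.entries.set(hostname.toLowerCase(), {
      addresses,
      expiresAt: this.now() + ttlMs,
    });
  }

  // Return the cached addresses, or undefined if missing or expired.
  get(hostname) {
    const key = hostname.toLowerCase();
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.addresses;
  }
}

// One cache instance shared by every monitor that uses the same agent options.
const shared = new HostnameCache();
shared.set("server", ["123.123.123.123"], 60000);
shared.set("dns-on-same-server", ["123.123.123.123"], 60000);
```

Because the key is the hostname rather than the monitor ID, two monitors for different hostnames never read each other's entries. On the other hand, under this assumption a bad entry for one hostname would affect every monitor that targets that hostname until it expires.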

I had a few other monitors like this one that started failing w/ queryA ESERVFAIL after the server rebooted. I left them alone and after 1 day the failures went away. There must be some other cache/matching that happens elsewhere causing it for me… I did reset the server DNS cache (which is also probably what happened when the server rebooted).

I started getting this after release 18. The only change in the monitor code was the DNS cache. I’m using an internal DNS server with a ton of monitors, but only three specific monitors for Apache Solr are failing. Other sites monitored on the same server resolve properly.

I wonder if it is because of the port or something. The failing URLs are like http://server:8983/solr/ and the passing URLs are like http://dns-on-same-server.
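One way to check the port theory is to look at which part of each URL actually reaches the DNS lookup. A small sketch using Node's built-in `URL` class is shown below; `lookupTarget` is an illustrative helper, not part of Uptime Kuma.

```javascript
// Split a monitor URL into the part that is resolved via DNS (hostname)
// and the part that is only used when opening the TCP connection (port).
function lookupTarget(monitorUrl) {
  const url = new URL(monitorUrl);
  return {
    hostname: url.hostname,
    // An empty `port` means the scheme default (80 for http, 443 for https).
    port: url.port || (url.protocol === "https:" ? "443" : "80"),
  };
}

console.log(lookupTarget("http://server:8983/solr/"));
// { hostname: 'server', port: '8983' }
console.log(lookupTarget("http://dns-on-same-server"));
// { hostname: 'dns-on-same-server', port: '80' }
```

Only the hostname is passed to the resolver, so the port on its own would not be expected to change the DNS result. The difference between the failing and passing monitors is more likely to lie in the names being resolved.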

I tried adding the server name to the hosts file, but no luck.

Any other ideas?

https://github.com/louislam/uptime-kuma/commit/2073f0c28476bb46fb953ecefb9622273e8819d9