unbound: Resolving records through more than 8 CNAMEs fails due to hardcoded MAX_RESTART_COUNT
I’ve run into an edge case where a name passed through 9 CNAMEs (10 total resolutions to get the final answer). Because MAX_RESTART_COUNT is hardcoded, there’s no way to tune the configuration to work around this situation.
The original bug report is in the pfSense bug tracker here, where detailed traces from drill are available.
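To make the failure mode concrete, here is a minimal toy model of a restart-limited CNAME walk. It is not Unbound's code: the `resolve` and `make_chain` helpers, the `hopN.example.` names, and the exact comparison against the limit are assumptions made for illustration. It only shows why a chain of 9 CNAMEs fails under a limit of 8 and would pass under a limit of 11.

```python
from typing import Dict, Tuple

# Hypothetical toy zone data: name -> (record type, value).
Zone = Dict[str, Tuple[str, str]]


class TooManyRestarts(Exception):
    """Raised when the CNAME walk exceeds the allowed number of restarts."""


def resolve(name: str, zone: Zone, max_restarts: int = 8) -> str:
    """Follow CNAMEs until an A record is found.

    Each CNAME that is followed counts as one restart of the lookup.
    Assumption: the lookup fails once the count exceeds max_restarts.
    """
    restarts = 0
    while True:
        rtype, value = zone[name]
        if rtype == "A":
            return value
        restarts += 1
        if restarts > max_restarts:
            raise TooManyRestarts(f"gave up after {max_restarts} restarts")
        name = value  # restart the lookup at the CNAME target


def make_chain(cnames: int) -> Zone:
    """Build a chain of `cnames` CNAME records ending in a single A record."""
    zone: Zone = {
        f"hop{i}.example.": ("CNAME", f"hop{i + 1}.example.")
        for i in range(cnames)
    }
    zone[f"hop{cnames}.example."] = ("A", "192.0.2.1")
    return zone
```

In this sketch, `resolve("hop0.example.", make_chain(9))` raises `TooManyRestarts` with the default limit of 8, while passing `max_restarts=11` returns the address.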
About this issue
- State: closed
- Created 3 years ago
- Comments: 19 (6 by maintainers)
Commits related to this issue
- Bump MAX_RESTART_COUNT to 11 from 8; in relation to #438. This allows longer CNAME chains in Unbound. Committed to NLnetLabs/unbound by gthess 3 years ago.
Hi all,
Thanks for your input. For now we are bumping the value to 11 from the previous 8 for the next version. This will allow both examples given here (one of them is already fixed in DNS) to work.
Since this is the first time we get something similar regarding CNAME chain length, we are reluctant to jump to a configuration option at this moment.
Failures I ran into with this for the sake of anyone else confused:
Granted, those are Microsoft Windows problems, but 8 is arbitrarily low for a cheap way to prevent CNAME loops. PR #461 is nice, but perhaps the default needs to be bumped up significantly too. There are a lot of MS Windows systems and CDNs out there.
The only workaround I’ve found beyond hardcoding Microsoft’s IPs is to not use unbound.
dig @8.8.8.8 logincdn.msauth.net A
returns full results, so I figured forwarding msauth.net would work, but unbound insists on prioritizing and chaining CNAMEs locally and hits the limit. I believe https://github.com/NLnetLabs/unbound/issues/132 is relevant here. It’s surprising to me that “forward” injects more behavior than the forwarded server would provide, with no way to disable it, rather than being a true transparent forwarder.
This issue is very annoying; at least make it possible to change the limit in configuration rather than hardcoding it.
Yes, a long chain of CNAMEs is probably not a good design, but we can’t simply ask Akamai, Microsoft, Apple, or other CDN and cloud companies, “Hey, my software is not compatible with your design, can you change your design?”
Chances are the answer will be: “No. Why are others fine and only you are complaining? We will not change our design for a minority of users who are having issues with their own software. Just switch to other software that works.”
So, please make this a configuration parameter rather than hardcoded.
With regard to why this change might be helpful: if you’re in a situation (like I was) where you’re maintaining a system that uses unbound (like pfSense), and swapping out the package for a custom build isn’t particularly easy or well supported, then you’re forced to use another DNS resolver altogether to get unblocked. One of the reasons I thought this would be valuable as a configuration option is that it avoids defaulting everyone into a higher count just to cover the few cases where it could help, while giving people who have identified the problem a way to work around it.
Part of this discussion has become what the limit should be increased to, and while that might help for the next instance, I think it would just be kicking the can down the road. There’s still a good reason to have a limit: even without cycles in the resolution, it would be possible to create a domain that DoSes a resolver by walking it through an absurdly long CNAME chain, so having a limit makes sense. But being able to override that limit in config would have allowed me to be unblocked while waiting for the service to be fixed.
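The point about long chains without cycles can be illustrated with a small sketch. This is an assumption-laden toy, not how Unbound implements anything: `follow_cnames`, its `max_hops` parameter, and the `nN.example.` names are hypothetical. It shows that remembering visited names catches a loop but happily walks an arbitrarily long acyclic chain, whereas a hop limit bounds the work in both cases.

```python
from typing import Dict, Optional, Tuple


def follow_cnames(
    name: str,
    cnames: Dict[str, str],
    max_hops: Optional[int] = None,
) -> Tuple[str, int]:
    """Follow a CNAME mapping and return (final_name, hops_taken).

    Raises ValueError on a loop, or when max_hops is set and exceeded.
    """
    seen = {name}
    hops = 0
    while name in cnames:
        name = cnames[name]
        hops += 1
        if name in seen:
            # Loop detection only helps when a name repeats.
            raise ValueError("CNAME loop detected")
        if max_hops is not None and hops > max_hops:
            # The hop limit bounds the work even when no name repeats.
            raise ValueError("CNAME chain longer than max_hops")
        seen.add(name)
    return name, hops


# A two-name loop, and a long chain in which no name ever repeats.
loop = {"a.example.": "b.example.", "b.example.": "a.example."}
long_chain = {f"n{i}.example.": f"n{i + 1}.example." for i in range(10_000)}
```

With no limit, `follow_cnames("n0.example.", long_chain)` performs all 10,000 hops before returning; with `max_hops=11` it stops early. Exposing such a value as configuration with a conservative default is the design being asked for in this thread.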
I’d very much like #461 to be adopted. All of the defaults could remain in place so existing deployments see no new behavior, but the next time I get into this state because someone set up an unusual DNS configuration (very easy to do accidentally) and only tested it against a few resolvers, I can still unblock myself.
@gthess Yeah, I fully understand that. We are currently dealing with this bug because we never thought someone would make a CNAME chain this long. To prevent this from coming up in the future, I would think we’d either need to make it a high number, or make it configurable so there’s an easy workaround if something “stupid” happens in the future. Right now, there are a LOT of downstream users who can’t (or don’t know how to) upgrade because they use unbound in a project such as pfSense. The bug won’t be fixed for them until a fix is released and pfSense then upgrades the package. A configuration option would let users mitigate this in the future.