Unbound randomly fails to resolve names
RayG
rgsub1 at btinternet.com
Thu Jul 23 13:23:15 UTC 2020
Hi George,
OK thanks for the confirmation of the other issues I have seen.
With respect to use-caps-for-id: when you say "as long as the other side supports it", CloudFlare does support it, because it works most of the time.
You also said:
"When unbound asks for an.ExaMple.domAin.NeT and the record is not cached in the forwarder, the answer will contain the correct case.
Afterwards, when the answer is cached, the wrong casing (always lowercase) will be used, and until the TTL expires I assume.
This results in a mismatch between query and reply if use-caps-for-id is used."
So am I right in understanding that when the initial query goes to CloudFlare and CloudFlare has to go off and resolve the name itself (it's not in the cache), the correct case is preserved? Now CloudFlare has the name stored in its cache. A subsequent query for the same DNS name then arrives with different random casing, a case-insensitive match is found in CloudFlare's cache, and the casing of the cache entry is returned rather than the casing the client just sent. Then you get the error.
This sounds like a bug to me, and something CloudFlare should be made aware of and fix. Is that what you were referring to when you said "I will try to reach the people involved"?
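To make the sequence above concrete, here is a minimal sketch in Python. It is not Unbound's or CloudFlare's code: the LowercasingForwarder class and the helper names are hypothetical, and the forwarder behaviour (echo the query case on a cache miss, return lowercase on a cache hit) is only the behaviour George described, taken as an assumption.

```python
import random

def randomize_case(name, rng):
    """0x20-style encoding: pick upper or lower case at random for each letter."""
    return "".join(
        (ch.upper() if rng.random() < 0.5 else ch.lower()) if ch.isalpha() else ch
        for ch in name
    )

class LowercasingForwarder:
    """Hypothetical forwarder: echoes the query case on a cache miss,
    answers in lowercase once the name is cached (assumed behaviour)."""

    def __init__(self):
        self.cache = set()

    def resolve(self, qname):
        key = qname.lower()
        if key in self.cache:
            return key       # cache hit: the stored lowercase name comes back
        self.cache.add(key)
        return qname         # cache miss: the query's own casing is echoed

def reply_matches(sent, received):
    """Case-sensitive comparison, as a client validating 0x20 would do."""
    return sent == received

fwd = LowercasingForwarder()

# First query: random casing, name not yet cached by the forwarder.
q1 = randomize_case("an.example.domain.net", random.Random(0))
first = reply_matches(q1, fwd.resolve(q1))

# Second query: the mixed-case name from the example, now answered from cache.
q2 = "an.ExaMple.domAin.NeT"
second = reply_matches(q2, fwd.resolve(q2))

print(first, second)
```

Under these assumptions the first reply matches and the second does not, which is the mismatch that a resolver with use-caps-for-id enabled would reject.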
For the "tcp error" it would be nice to have a bit more information in the log as to what actually failed, e.g. connection refused, timed out, etc. It would help to show that the problem was at the other end and not in Unbound.
I was using just CloudFlare so that, when I submitted the logs, I had a repeatable, consistent issue to look at. And yes, since setting 'use-caps-for-id' to 'no' there have been no issues.
I have now set 'use-caps-for-id' to 'yes' and changed the list of forwarders to see what happens:
forward-zone: # MyForwardZones.conf
    name: "."
    forward-tls-upstream: yes
    forward-first: yes
    # Cloudflare DNS
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com
    forward-addr: 2606:4700:4700::1111@853#cloudflare-dns.com
    forward-addr: 2606:4700:4700::1001@853#cloudflare-dns.com
    # Quad9
    forward-addr: 2620:fe::fe@853#dns.quad9.net
    forward-addr: 9.9.9.9@853#dns.quad9.net
    #forward-addr: 2620:fe::9@853#dns.quad9.net
    #forward-addr: 149.112.112.112@853#dns.quad9.net
    # Google
    forward-addr: 8.8.8.8@853#Dns.google
    forward-addr: 8.8.4.4@853#Dns.google
    forward-addr: 2001:4860:4860::8888@853#Dns.google
    forward-addr: 2001:4860:4860::8844@853#Dns.google
    # DNS Privacy
    forward-addr: 94.130.110.185@853#ns1.dnsprivacy.at
    forward-addr: 94.130.110.178@853#ns2.dnsprivacy.at
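For anyone reading the entries above, each forward-addr value packs three things into one string: the server address, the port after "@", and the TLS authentication name after "#". A small Python sketch that splits one value apart may help when checking a list like this; parse_forward_addr is a hypothetical helper written for illustration, not part of Unbound.

```python
def parse_forward_addr(value):
    """Split an 'ip@port#auth-name' value into (ip, port, auth_name).

    Hypothetical helper for illustration only. Port and name are
    returned as None when they are not present in the value.
    """
    addr, _, auth_name = value.strip().partition("#")
    ip, _, port = addr.partition("@")
    return ip, (int(port) if port else None), (auth_name or None)

# IPv6 addresses contain ':' but neither '@' nor '#', so they split the same way.
print(parse_forward_addr("2620:fe::fe@853#dns.quad9.net"))
```

The name after "#" is what the server's certificate is checked against when forward-tls-upstream is on, so a typo there shows up as a TLS failure rather than a DNS one.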
Thanks again for the information; it has been useful to me and, I suspect, to others.
Ray
-----Original Message-----
From: George Thessalonikefs <george at nlnetlabs.nl>
Sent: 22 July 2020 17:30
To: RayG <rgsub1 at btinternet.com>
Cc: unbound-users at lists.nlnetlabs.nl
Subject: Re: Unbound randomly fails to resolve names
Hi Ray,
On 21/07/2020 13:26, RayG wrote:
> Hi George, Oliver, Andi,
>
> @George: Thanks for your reply.
>
> I have made the adjustment we will see how it goes.
>
> But as Oliver Psotta says at https://calomel.org/unbound_dns.html there are good reasons for having it enabled.
>
> Also on the page: https://www.grc.com/dns/dns.htm there is a "Spoofability" test which also suggests having mixed case is good.
Having the option enabled is good as long as the other side supports it.
This is not the case for you, at least for now.
If you want to keep it enabled you can enrich your forwarders configuration with other public DoT resolvers.
You can find more information at
https://dnsprivacy.org/wiki/display/DP/DNS+Privacy+Public+Resolvers#DNSPrivacyPublicResolvers-DNS-over-TLS(DoT).
>
> There are also the TCP Errors e.g.:
> 21/07/2020 11:15:01 C:\Program Files\Unbound\unbound.exe[16308:0]
> debug: tcp error for address 1.0.0.1 port 853
Nothing wrong here, seems like a tcp error to that IP and port. Unbound couldn't make the connection (maybe network routing problems, unavailability from the other side or the local system) and it should go on to try the next available server.
> These are unexplained so far as are some of the other entries like:
> 21/07/2020 10:41:40 C:\Program Files\Unbound\unbound.exe[16308:0]
> debug: request E.ROOT-SERVERS.NET. has exceeded the maximum number of glue fetches 65
> 21/07/2020 10:41:40 C:\Program Files\Unbound\unbound.exe[16308:0]
> debug: return error response SERVFAIL
>
> And
>
> 21/07/2020 10:41:40 C:\Program Files\Unbound\unbound.exe[16308:0]
> debug: request has exceeded the maximum number of nxdomain nameserver lookups with 13
> 21/07/2020 10:41:40 C:\Program Files\Unbound\unbound.exe[16308:0]
> debug: return error response SERVFAIL
>
> All of which are still occurring, should they be happening?
Both of the above occur because resolution has exceeded a set of limits and the query is considered to have hit a dead end from unbound's point of view (there seem to be no available servers that can provide an answer).
Unbound stops resolution and returns SERVFAIL to the client(s).
As you are forwarding to a limited set of resolvers (in contrast with reaching the different authoritative nameservers during normal resolution), those kinds of limits could be reached more easily and more quickly if there are communication issues, as the same upstream is solely responsible for all the delegation points.
>
> Also I have been able to look back at some of my backup images these were all running the same way as currently and the event log messages like:
>
> Level Date and Time Source Event ID Task Category
> Warning 16/07/2020 15:48:44 Microsoft-Windows-DNS-Client 1014 (1014) Name resolution for the name enews.synology.com timed out after none of the configured DNS servers responded.
>
> Are present.
>
> These started occurring after the release of V1.9.4. The event log on that backup image (which I am able to run as a virtual machine) did not contain any of the above errors.
> So V1.9.4 was OK
>
> Unfortunately, between the above VM and the next one I can run there was V1.9.5, whose download file is dated 19/11/2019; the V1.9.6 download is dated 12/12/2019.
> I can say that the above type of errors started appearing just after V1.9.5 was downloaded. I normally install the new version on the same day that I download it. So something happened somewhere between V1.9.4 and V1.9.5 and has been the same ever since.
I believe the difference in behavior is only a coincidence for unbound.
1.9.5 was a CVE release that was solving a security vulnerability in the ipsecmod module. It had nothing to do with upstream connections, tcp connections, or DoT and if unbound is not compiled with the ipsecmod, the code should be identical to the 1.9.4 version.
Best regards,
-- George
>
> I hope that helps.
>
> Thanks for any further information/comments
>
> Ray
> -----Original Message-----
> From: George Thessalonikefs <george at nlnetlabs.nl>
> Sent: 20 July 2020 15:09
> To: unbound-users at lists.nlnetlabs.nl
> Subject: Re: Unbound randomly fails to resolve names
>
> Hi Ray, Andi,
>
> I see from Ray's log that use-caps-for-id: is enabled.
> I also see that the forwarding resolvers used seem to have an issue with 0x20 replies (use-caps-for-id related).
>
> For example:
> When unbound asks for an.ExaMple.domAin.NeT and the record is not cached in the forwarder, the answer will contain the correct case.
> Afterwards, when the answer is cached, the wrong casing (always
> lowercase) will be used, and until the TTL expires I assume. This results in a mismatch between query and reply if use-caps-for-id is used.
>
> Unbound's fallback may or may not help at that time. From your log I see that the fallback does not help (returns SERVFAIL after some further tries) and consecutive queries try without 0x20.
>
> I will try to reach the people involved but for now turning off use-caps-for-id should help.
>
> Let us know how it goes.
>
> Best regards,
> -- George
>
>