[Dnssec-trigger] Why Does unbound Fail on So Many Requests?
Garry T. Williams
gtwilliams at gmail.com
Sun Apr 20 00:57:30 UTC 2014
On 4-19-14 20:16:58 Paul Wouters wrote:
> On Sat, 19 Apr 2014, Garry T. Williams wrote:
>
> > unbound[773]: [773:1] info: validation failure t6021.network-dns-unbound-user.dnstalk.us.dlv.isc.org. DLV IN
> > unbound[773]: [773:0] info: validation failure natenom.name.dlv.isc.org. DLV IN
> > unbound[773]: [773:0] info: validation failure platform.twitter.com.dlv.isc.org. DLV IN
>
> > garry at vfr$ dig +dnssec t6021.network-dns-unbound-user.dnstalk.us @127.0.0.1
> >
> > ; <<>> DiG 9.9.4-P2-RedHat-9.9.4-12.P2.fc20 <<>> +dnssec t6021.network-dns-unbound-user.dnstalk.us @127.0.0.1
> > ;; global options: +cmd
> > ;; Got answer:
> > ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56300
>
> That should not happen. I've seen at times that there are timing
> failures when it takes long to get to the hotspot. To test that, you
> can try to restart unbound but load it with the same forwarders
> after you have authenticated with the hotspot:
Thanks for the reply. I should have mentioned that my first trial
with this stuff is on a desktop system at home. No hotspot here. I'm
just doing some browsing and my Web browser reports errors
occasionally for various domains.
While trying a broken domain name, I noticed that one of my ISP's
servers was not responding at all and dig timed out waiting for it.
The other two responded with A and RRSIG records. My local unbound
gives back SERVFAIL after a shorter wait.
garry at vfr$ dig +dnssec test.dnssec-or-not.com @65.68.49.50
; <<>> DiG 9.9.4-P2-RedHat-9.9.4-12.P2.fc20 <<>> +dnssec test.dnssec-or-not.com @65.68.49.50
;; global options: +cmd
;; connection timed out; no servers could be reached
garry at vfr$ dig +dnssec test.dnssec-or-not.com @127.0.0.1
; <<>> DiG 9.9.4-P2-RedHat-9.9.4-12.P2.fc20 <<>> +dnssec test.dnssec-or-not.com @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 37221
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;test.dnssec-or-not.com. IN A
;; Query time: 1075 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Apr 19 20:28:42 EDT 2014
;; MSG SIZE rcvd: 51
garry at vfr$ fc -Dl
...
10843 0:15 dig +dnssec test.dnssec-or-not.com @65.68.49.50
10844 0:06 dig +dnssec test.dnssec-or-not.com @127.0.0.1
garry at vfr$
Here, the BellSouth (now AT&T) server at 65.68.49.50 doesn't respond
within the 15 seconds dig waits before giving up. Unbound returns
SERVFAIL after six seconds. Queries to the other two BellSouth servers
return normally in under one second.
garry at vfr$ dig +dnssec test.dnssec-or-not.com @205.152.37.23
...
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43577
...
;; Query time: 719 msec
;; SERVER: 205.152.37.23#53(205.152.37.23)
;; WHEN: Sat Apr 19 20:31:44 EDT 2014
;; MSG SIZE rcvd: 910
garry at vfr$ fc -Dl
...
10846 0:01 dig +dnssec test.dnssec-or-not.com @205.152.37.23
My conclusion is that unbound doesn't manage to go around the
unresponsive server in my ISP's network.
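
The manual dig comparison above can be repeated for a whole list of
forwarders with a small script. What follows is only a sketch, assuming
plain UDP queries without EDNS or the DO bit (so it tests reachability,
not DNSSEC support). The function names build_query, rcode_of, probe and
report are made up for this illustration and are not part of unbound or
dnssec-trigger.

```python
import socket
import struct


def build_query(name, qid=0x1234, qtype=1):
    """Encode a minimal recursive DNS query (QTYPE 1 = A, QCLASS 1 = IN)."""
    # Header: id, flags (only RD set), one question, no other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def rcode_of(response):
    """Return the RCODE (low four bits of the flags word) of a response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def probe(server, name, timeout=3.0):
    """Return the RCODE sent by `server`, or None if no usable reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _ = sock.recvfrom(4096)
        except OSError:  # timeouts and unreachable-network errors
            return None
    if len(response) < 4:
        return None
    return rcode_of(response)


def report(servers, name):
    """Print one line per forwarder: its RCODE or 'no response'."""
    for server in servers:
        rcode = probe(server, name)
        print(server, "no response" if rcode is None else f"rcode={rcode}")


# Example, using the two forwarder addresses from the dig runs above:
# report(["65.68.49.50", "205.152.37.23"], "test.dnssec-or-not.com")
```

An RCODE of 0 is NOERROR and 2 is SERVFAIL. A forwarder that prints
"no response" here is one that dig would also time out on.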
> I think dnssec-trigger/unbound should have a combination to make
> negative-ttl much much shorter on "enduser systems" to avoid these
> kind of timing errors.
Perhaps.
My observation on this /one case/ tells me dnssec-trigger/unbound
needs to avoid forwarders that have become unresponsive, and should
not cache the non-response as an answer returned to clients. But I
don't know how to accomplish that.
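
One possible manual workaround, sketched below, is to point unbound at
only the forwarders known to respond and then flush the cached
failures. The sketch only builds the unbound-control command lines
(forward, flush_bogus and flush_negative are existing unbound-control
subcommands). It assumes remote control is enabled in unbound and that
the commands are run with sufficient privileges. It also assumes that
dnssec-trigger will not immediately overwrite the forwarder list, which
it may do on the next network change. The helper names
recovery_commands and apply_commands are hypothetical.

```python
import subprocess


def recovery_commands(responsive_forwarders):
    """Build unbound-control invocations; nothing is executed here."""
    if not responsive_forwarders:
        raise ValueError("need at least one responsive forwarder")
    return [
        # Forward the root zone only to servers known to answer.
        ["unbound-control", "forward", *responsive_forwarders],
        # Drop data that failed validation.
        ["unbound-control", "flush_bogus"],
        # Drop negative answers, so earlier failures are not replayed.
        ["unbound-control", "flush_negative"],
    ]


def apply_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Example with the one forwarder shown responding above:
# apply_commands(recovery_commands(["205.152.37.23"]))
```

This does not make unbound skip a dead forwarder on its own; it only
removes that forwarder from the list by hand.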
I don't remember seeing so many failures when I was running dnsmasq
instead of unbound. Of course that may be because nothing was ever
logged by dnsmasq -- unbound is very noisy in my system journal.
Anyway, I think you have given me something to go on. Thank you.
--
Garry T. Williams