Strange behaviour unbound server

Eduard Ahmatgareev e.ahmatgareev at gmail.com
Thu Jul 11 09:45:56 UTC 2019


Hi Ralph,

Thank you for the response. Running dump_cache from cron is a good idea; I
can probably combine it with the command to get the request list and run
both from the cron job I already use to collect tcpdump traffic.
What I know so far:
Amazon does not handle NXDOMAIN well. When a query arrives for a nonexistent
domain, unbound forwards it to the AWS VPC DNS server, and AWS takes a long
time to return the answer.
This may be the cause of our issue, but I am not 100% sure.
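
One way to check the NXDOMAIN suspicion is to time a lookup for an existing
name against a lookup for a nonexistent one, sent directly to the upstream
resolver. The following is only a minimal sketch using the Python standard
library, not a definitive test. The resolver address 169.254.169.253 and both
query names are assumptions chosen for illustration; substitute the address
that unbound actually forwards to.

```python
import socket
import struct
import time


def build_query(name, qid=0x1234, qtype=1):
    """Build a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def rcode_of(response):
    """Return the RCODE (low 4 bits of the flags word); 3 means NXDOMAIN."""
    flags = struct.unpack("!H", response[2:4])[0]
    return flags & 0x000F


def timed_query(server, name, timeout=5.0):
    """Send one UDP query; return (rcode or None on timeout, elapsed seconds)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
        return rcode_of(response), time.monotonic() - start
    except socket.timeout:
        return None, time.monotonic() - start
    finally:
        sock.close()


if __name__ == "__main__":
    # Assumption: address of the upstream (VPC) resolver; adjust as needed.
    upstream = "169.254.169.253"
    # Assumption: illustrative names; ".invalid" is reserved and never resolves.
    for qname in ("example.com", "nonexistent-host.invalid"):
        rcode, elapsed = timed_query(upstream, qname)
        print(f"{qname}: rcode={rcode} elapsed={elapsed * 1000:.1f} ms")
```

Running this several times during a period when clients see timeouts, and
comparing the two elapsed times, should show whether nonexistent names are
answered noticeably slower upstream.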



Thu, 11 Jul 2019 at 12:36, Ralph Dolmans via Unbound-users <
unbound-users at nlnetlabs.nl>:

> Hi Eduard,
>
> Hard to say why this happens periodically to you. Do you see an increase
> in the incoming queries when this happens? Maybe running out of some
> buffer space? Or do you by any chance periodically perform an expensive
> operation on unbound, like doing a dump_cache from cron? Are there any
> errors written to the log?
>
> -- Ralph
>
> On 11-07-19 10:34, Eduard Ahmatgareev via Unbound-users wrote:
> > Hi everyone,
> >
> > I have run into an interesting issue with an unbound server and could
> > not figure it out without your help.
> > We use unbound as the primary DNS resolver in our AWS infrastructure, but
> > from time to time the unbound server does not respond to queries from our
> > clients.
> > With tcpdump and Wireshark I also found a lot of retransmitted DNS
> > requests from clients in the subnets.
> > The issue occurs intermittently: our clients get timeouts
> > during the day.
> > Out of 100 queries, 3-8 time out.
> >
> > For debugging I used the command:
> > perf trace -p $(pidof unbound) --duration=10
> > and got the following:
> >     13.285 (599.741 ms): unbound/15943 epoll_pwait(epfd:
> > 54<anon_inode:[eventpoll]>, events: 0x564955c6ae10, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = -1 EINTR Interrupted system call
> >    616.016 (94.403 ms): unbound/15943 epoll_pwait(epfd:
> > 54<anon_inode:[eventpoll]>, events: 0x564955c6ae10, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >    710.662 (130.206 ms): unbound/15943 epoll_pwait(epfd:
> > 54<anon_inode:[eventpoll]>, events: 0x564955c6ae10, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >    616.649 (224.502 ms): unbound/15952 epoll_pwait(epfd:
> > 42<anon_inode:[eventpoll]>, events: 0x7faea89ea7f0, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >    850.606 (112.947 ms): unbound/15952 epoll_pwait(epfd:
> > 42<anon_inode:[eventpoll]>, events: 0x7faea89ea7f0, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >     13.453 (1160.129 ms): unbound/15951 epoll_pwait(epfd:
> > 37<anon_inode:[eventpoll]>, events: 0x7faea47ca3e0, maxevents: 64,
> > timeout: -1, sigsetsize: 8) = 1
> >    840.904 (335.113 ms): unbound/15943 epoll_pwait(epfd:
> > 54<anon_inode:[eventpoll]>, events: 0x564955c6ae10, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >    710.891 (465.469 ms): unbound/15950 epoll_pwait(epfd:
> > 36<anon_inode:[eventpoll]>, events: 0x7faeac8b2680, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >     13.769 (1174.857 ms): unbound/15954 epoll_pwait(epfd:
> > 48<anon_inode:[eventpoll]>, events: 0x7fae98747c20, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >   1176.048 (17.121 ms): unbound/15943 epoll_pwait(epfd:
> > 54<anon_inode:[eventpoll]>, events: 0x564955c6ae10, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = -1 EINTR Interrupted system call
> >   1175.740 (21.495 ms): unbound/15951 epoll_pwait(epfd:
> > 37<anon_inode:[eventpoll]>, events: 0x7faea47ca3e0, maxevents: 64,
> > timeout: -1, sigsetsize: 8) = 1
> >   1177.587 (19.955 ms): unbound/15950 epoll_pwait(epfd:
> > 36<anon_inode:[eventpoll]>, events: 0x7faeac8b2680, maxevents: 128,
> > timeout: 264, sigsetsize: 8) = 1
> >   1196.914 (11.097 ms): unbound/15954 epoll_pwait(epfd:
> > 48<anon_inode:[eventpoll]>, events: 0x7fae98747c20, maxevents: 128,
> > timeout: -1, sigsetsize: 8) = 1
> >
> >
> >
> > our infra:
> > ec2: c5.2xlarge (16gb mem, 8cores, 60gb gp2)
> > dist: amazon linux 2
> >
> > unbound-libs-1.6.6-1.amzn2.0.2.x86_64
> > unbound-python-1.6.6-1.amzn2.0.2.x86_64
> > unbound-1.6.6-1.amzn2.0.2.x86_64
> >
> > conf:
> > server:
> >     verbosity: 1
> >     num-threads: 8
> >     statistics-interval: 0
> >     extended-statistics: yes
> >     statistics-cumulative: no
> >     msg-cache-slabs: 4
> >     rrset-cache-slabs: 4
> >     infra-cache-slabs: 4
> >     key-cache-slabs: 4
> >     rrset-cache-size: 100m
> >     msg-cache-size: 50m
> >     so-rcvbuf: 4m
> >     so-sndbuf: 4m
> >     so-reuseport: yes
> >     outgoing-range: 8192
> >     num-queries-per-thread: 4096
> >     do-daemonize: no
> >     prefetch: yes
> >     rrset-roundrobin: yes
> >     logfile: ""
> >     use-syslog: no
> >     directory: "/etc/unbound"
> >     chroot: ""
> >     log-queries: no
> >     access-control: 0.0.0.0/0 allow
> >     interface: 0.0.0.0
> >     interface-automatic: yes
> >     port: 53
> >     do-ip4: yes
> >     do-ip6: no
> >     do-udp: yes
> >     do-tcp: yes
> >     username: "unbound"
> >     pidfile: "/var/run/unbound/unbound.pid"
> >     root-hints: /etc/unbound/root.hints
> >     key-cache-size: 32m
> >     local-zone: "10.in-addr.arpa." nodefault
> >
> > remote-control:
> >     control-enable: yes
> >
> > any ideas?
> >
>