DoT with unbound
Havard Eidnes
he at uninett.no
Sun Nov 24 23:04:34 UTC 2019
Hm, following up on the saga...
My unbound is listening as follows:
server:
# Listen on port 853 on both logical and physical interfaces:
interface: 158.38.0.1 at 853
interface: 2001:700:0:ff00::1 at 853
interface: 158.38.0.168 at 853
interface: 2001:700:0:503::ca53 at 853
and ends up with typically the following FDs open:
# fstat -n -p 24864
USER CMD PID FD DEV INUM MODE SZ|DV R/W
unbound unbound 24864 root 18,4 46615746 40755 512 r
unbound unbound 24864 wd 18,4 46615746 40755 512 r
unbound unbound 24864 0 18,0 85874 20666 2,2 rw
unbound unbound 24864 1 18,0 85874 20666 2,2 rw
unbound unbound 24864 2 18,0 85874 20666 2,2 rw
unbound unbound 24864 3* crypto 0xfffffe841bcbae70
unbound unbound 24864 4* internet dgram udp 158.38.0.1:853
unbound unbound 24864 5* internet stream tcp 158.38.0.1:853
unbound unbound 24864 6* internet6 dgram udp [2001:700:0:ff00::1]:853
unbound unbound 24864 7* internet6 stream tcp [2001:700:0:ff00::1]:853
unbound unbound 24864 8* internet dgram udp 158.38.0.168:853
unbound unbound 24864 9* internet stream tcp 158.38.0.168:853
unbound unbound 24864 10* internet6 dgram udp [2001:700:0:503::ca53]:853
unbound unbound 24864 11* internet6 stream tcp [2001:700:0:503::ca53]:853
unbound unbound 24864 12* unix stream <-> /var/run/unbound-ctl [creat]
unbound unbound 24864 13* unix dgram <-> /var/run/log [using]
The TCP/158.38.0.1:853 port is the "main" service point (receives the
most queries), fd 5.
I added some verbose logging to handle_select(), and it looks like
when unbound takes inordinately long time to return questions on
158.38.0.1:853 (or doesn't do so within even 2 minutes), I get logging
that fd 5 is *not* part of the read-set passed to select(!)
I would have thought that FDs 4-11 would *always* be included in the
read-set in the select loop, but that doesn't appear to be the case.
This also appears to be when there starts to pile up connections which
are established by the TCP stack but which are not being handled by
unbound (not yet picked up with accept()):
# netstat -f inet -n | grep 853
tcp 147 0 158.38.0.1.853 158.38.204.xxx.33336 CLOSE_WAIT
tcp 165 0 158.38.0.1.853 158.38.204.yyy.44656 ESTABLISHED
tcp 147 0 158.38.0.1.853 158.38.216.zzz.41239 ESTABLISHED
tcp 147 0 158.38.0.1.853 158.38.204.xx6.37978 CLOSE_WAIT
tcp 147 0 158.38.0.1.853 158.38.204.zz8.49063 ESTABLISHED
tcp 147 0 158.38.0.1.853 158.38.204.xx1.37366 ESTABLISHED
tcp 147 0 158.38.0.1.853 128.39.150.aa8.37272 ESTABLISHED
tcp 147 0 158.38.0.1.853 158.38.204.xx7.51850 ESTABLISHED
tcp 147 0 158.38.0.1.853 158.38.204.xx9.44854 CLOSE_WAIT
tcp 517 0 158.38.0.1.853 158.38.204.xx1.37362 ESTABLISHED
tcp 517 0 158.38.0.1.853 158.38.204.xx2.46394 ESTABLISHED
...
Looking at the code in util/, there doesn't seem to be any code
outside of mini_event.c which messes with the "reads" set, it's zeroed
by event_init(), used by handle_select(), set by event_add(), and
cleared by event_del(). Except, of course, one may operate with
several "contexts" created by event_init().
So I added some code to event_del() and event_add(), and I see fd 5
being both added and removed, and now last in the log I have 2
minutes+ where it's been deleted, so new connections are not picked up
during that entire duration, which is obviously bad.
What might be causing this?
Regards,
- Håvard
More information about the Unbound-users
mailing list