DoT with unbound

Havard Eidnes he at
Sun Nov 24 14:35:18 UTC 2019

Earlier, I wrote:

> we're using unbound (1.9.4) on NetBSD 8.0 to provide DNS-over-TLS
> service to a diverse set of customers, at the moment mostly as an
> experiment.
> What I find is that over time there will pile up a set of TCP
> connections in CLOSE_WAIT state, all with 147 bytes stuck in the
> receive queue of the socket (according to "netstat -n -f inet").

I've taken a closer look, but not found a solution to this issue.

I'm doing testing with "kdig" from the knot package.  What I find
is that it can take an inordinately long time to get a response,

% time kdig @ +timeout=120 +tls a
0.000u 0.008s 0:31.80 0.0%      0+0k 0+0io 0pf+0w

31 seconds, in other words, and I've seen response times as long
as 1minute+.  The randomness pool on the server host is full, as
is it at the client, so it's not that this is running low.

While it's waiting, I can find:

tcp      337      0     ESTABLISHED

in "netstat -f inet -n | grep 853" output on the server.  In
other words: the TCP connection is established, but the server
isn't consuming the data the client has sent.  Other clients will
leave a little less data, typically 147 bytes:

tcp      147      0   ESTABLISHED
tcp      147      0         158.36.214.xx.41054    ESTABLISHED
tcp      147      0   ESTABLISHED
tcp      147      0   ESTABLISHED

These will in all probability end up as CLOSE_WAIT sockets,
because the client times out, and by the looks of it, the server
is very slow in cleaning these out as well.

Furthermore, I observe this behaviour both with libevent (2.1.11)
and when using the unbound-internal "mini-event" code.

OpenSSL is the in-tree code from NetBSD, version OpenSSL 1.0.2k
26 Jan 2017.

In the process, I ran across

but wedging that into libevent 2.1.11 didn't make any significant
improvement that I could see.

I added some logging at the top of ssl_handshake() in
util/netevent.c, and the latest test I did resulted in:

% time kdig @ +timeout=120 +tls a
;; Time 2019-11-24 15:06:21 CET
;; From at 853(TCP) in 11.1 ms
0.000u 0.007s 1:08.18 0.0%      0+0k 0+0io 0pf+0w

and the corresponding log entry:

Nov 24 15:07:29 oliven unbound: [13367:0] notice: ssl handshake with port 64713

So ... apparently unbound is Very Slow in getting around to
handling the socket and handing it off for SSL handshake.  The
reason for this has yet to be found.  Hints gratefully accepted.


- Håvard

More information about the Unbound-users mailing list