1.9.4: TCP queries when some threads are full

Wouter Wijngaards wouter at nlnetlabs.nl
Mon Oct 21 12:30:17 UTC 2019


On 10/21/19 2:13 PM, Patrik Lundin via Unbound-users wrote:
> Hello,
> I have noticed a machine running unbound sometimes
> ignoring TCP requests, tested like so on the server:
> ===
> # dig +tcp @ CH TXT hostname.bind +tries=1
> [...]
> ;; connection timed out; no servers could be reached
> ===
> It works sometimes and then it won't; UDP appears unaffected. The TCP
> statistics look like this:
> ===
> # unbound-control stats_noreset | grep tcp
> thread0.tcpusage=10
> thread1.tcpusage=1
> thread2.tcpusage=0
> thread3.tcpusage=4
> thread4.tcpusage=0
> thread5.tcpusage=0
> thread6.tcpusage=0
> thread7.tcpusage=0
> thread8.tcpusage=10
> thread9.tcpusage=10
> thread10.tcpusage=10
> thread11.tcpusage=1
> thread12.tcpusage=5
> thread13.tcpusage=0
> thread14.tcpusage=0
> thread15.tcpusage=1
> total.tcpusage=52
> ===
> The machine is running with the default "incoming-num-tcp" of 10. So it appears
> some of the threads are fully utilized. My first question: Is it

You should probably increase that count to 100 or, better, 1000:
incoming-num-tcp: 1000
Perhaps also increase the upstream limit (outgoing-num-tcp) if you have
upstream TCP or TLS configured.

This allocates more buffers, which is useful for a server with many
clients on it.

The setting is similar to one in NSD that likewise controls the number
of buffers for client streams.

> possible that the intermittently failing requests are the result of a
> request being dispatched to a "full" thread even when there are unused
> threads available?

I don't know; it depends on how the OS dispatches it.  I think Unbound
stops accepting connections on a thread when that thread is full.
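
To see at a glance which threads have hit the limit, a rough sketch like
the one below could be fed the output of "unbound-control stats_noreset".
The function name full_threads and the sample input are made up for
illustration; the limit argument is assumed to be whatever
incoming-num-tcp is set to (10 in the report above).

```python
import re

def full_threads(stats_text, limit=10):
    """Return the names of threads whose tcpusage has reached the limit.

    `limit` is assumed to equal the configured incoming-num-tcp value.
    """
    full = []
    for line in stats_text.splitlines():
        # Match per-thread lines only; total.tcpusage is skipped.
        m = re.match(r"(thread\d+)\.tcpusage=(\d+)$", line.strip())
        if m and int(m.group(2)) >= limit:
            full.append(m.group(1))
    return full

# Shortened sample in the same format as the stats output quoted above.
sample = """thread0.tcpusage=10
thread1.tcpusage=1
thread8.tcpusage=10
total.tcpusage=52"""

print(full_threads(sample))
```

If some threads show up here while others sit at 0, the full ones are
the likely candidates for the refused connections.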

> My second question is what the expected behaviour of unbound is for TCP
> connections that are idling. From unbound.conf(5) I see "tcp-idle-timeout"
> defaults to 30000ms, so this tells me a TCP connection that is silent for
> 30 seconds will be dropped. But maybe this only matters until we have seen
> an initial query, and the connection is then left open forever?
> I tracked down the file descriptor for one of the TCP connections to
> unbound, found it was created over 12 hours ago, and then filtered for
> traffic for the host and port that was holding the connection with
> tcpdump, and not a single packet appeared for the several minutes I was running
> it.

When TCP is nearly full it should use an even shorter timeout, and not
allow such very long idle connections.  That looks like it went wrong.
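
As a sketch of what "shorter timeout" is meant to look like: the
tcp-idle-timeout entry in unbound.conf(5) describes the value being
reduced as free incoming TCP buffers run low (1% of the configured value
below 50% free, 0.2% below 35% free, 0 below 20% free, with a 200 ms
floor).  The function below only restates that description; its name and
arguments are hypothetical and it is not taken from the Unbound source.

```python
def effective_idle_timeout_ms(configured_ms, free_buffers, total_buffers):
    """Idle timeout per the unbound.conf(5) description (an approximation)."""
    if free_buffers * 100 < 20 * total_buffers:
        timeout = 0                      # fewer than 20% of buffers free
    elif free_buffers * 100 < 35 * total_buffers:
        timeout = configured_ms // 500   # 0.2% of the configured value
    elif free_buffers * 100 < 50 * total_buffers:
        timeout = configured_ms // 100   # 1% of the configured value
    else:
        timeout = configured_ms          # plenty of buffers free
    # A minimum of 200 ms is observed regardless of the value used.
    return max(200, timeout)

# Default 30000 ms timeout with incoming-num-tcp at 10, as in the report.
for free in (10, 4, 3, 0):
    print(free, effective_idle_timeout_ms(30000, free, 10))
```

By that description a full thread should be dropping idle connections
after a fraction of a second, not holding them for 12 hours.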

Best regards, Wouter
