unbound vs idle tcp connections
david at gwynne.id.au
Mon Aug 19 23:03:24 UTC 2019
At work we're hitting a problem where unbound accumulates a lot of TCP connections that the daemon never seems to close. This means we hit the TCP connection limit (incoming-num-tcp?), which prevents new TCP connections from being accepted, and that part of the service is then effectively DoSed.
I'm not sure it matters, but the cause of these long-lived TCP connections appears to be the dns-sd service on macOS boxes. We're using DNS-SD (RFC 6763 I think) to advertise printers to a lot of differently wired networks, and those records are fairly chunky so they end up coming in over TCP. dns-sd seems to like to keep the connection open in case it wants to enumerate stuff again quickly, but in some situations the client machine goes away and the unbound boxes don't notice.
For example, a macOS client may be a laptop. If it is on a wired network and still has a TCP connection open to the unbound server, but someone yanks the network cable out to take the laptop somewhere else, the client will not be able to send a FIN or RST packet telling the server to close the connection. The unbound box won't know immediately that the client is gone, but in this situation I'd expect unbound to close the idle connection, presumably once the tcp-idle-timeout period is hit. We're not seeing that at all though. We've got TCP connections listed in netstat output that have been there for 12 hours now, while the machine in question has been taken to the person's home (on a completely different network).
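For anyone wanting to check whether they're seeing the same thing, here is a rough Python sketch of counting established TCP connections to port 53 per remote address from netstat output. It is only an illustration: the function names (split_endpoint, count_established) are made up for this example, the sample addresses are placeholders, and it assumes the usual netstat -n column layout where the local address, foreign address, and state are the fourth, fifth, and sixth fields.

```python
from collections import Counter


def split_endpoint(endpoint):
    """Split 'addr.port' (BSD style) or 'addr:port' (Linux style) at the last separator."""
    idx = max(endpoint.rfind(":"), endpoint.rfind("."))
    return endpoint[:idx], endpoint[idx + 1:]


def count_established(netstat_text, port="53"):
    """Count ESTABLISHED TCP connections to the given local port, per remote address."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, non-TCP sockets, and anything that is not ESTABLISHED.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        _, local_port = split_endpoint(fields[3])
        remote_addr, _ = split_endpoint(fields[4])
        if local_port == port:
            counts[remote_addr] += 1
    return counts


# Hypothetical sample input, not real output from our servers.
SAMPLE = """\
tcp        0      0 192.0.2.1.53       192.0.2.99.51234   ESTABLISHED
tcp        0      0 192.0.2.1.53       192.0.2.99.51240   ESTABLISHED
tcp        0      0 192.0.2.1:53       198.51.100.7:40000 ESTABLISHED
tcp        0      0 192.0.2.1.22       192.0.2.50.50000   ESTABLISHED
tcp        0      0 *.53               *.*                LISTEN
"""

if __name__ == "__main__":
    for addr, n in count_established(SAMPLE).most_common():
        print(addr, n)
```

Running something like this against successive netstat snapshots would show which clients hold connections across long intervals, though it says nothing about whether those connections are actually idle.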
So can anyone confirm whether unbound has code that actively maintains timeouts on client TCP connections, or does it assume other events will occur (e.g. TCP keepalives go unanswered, or the client actively disconnects) that will cause the connections to get cleaned up?
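To be concrete about what I mean by "actively maintains timeouts", here is a minimal Python sketch of the kind of bookkeeping I'd expect a server to do. This is not unbound's code, and I'm not claiming it works this way; the IdleTracker class and its method names are hypothetical, and the caller is assumed to supply timestamps (for example from time.monotonic()) and to close whatever connections are reported as expired.

```python
class IdleTracker:
    """Track the last activity time per connection and report idle ones."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_activity = {}  # connection id -> time of last activity

    def touch(self, conn_id, now):
        """Record activity (accept, read, or write) on a connection."""
        self.last_activity[conn_id] = now

    def forget(self, conn_id):
        """Stop tracking a connection that has been closed."""
        self.last_activity.pop(conn_id, None)

    def expired(self, now):
        """Return the connections that have been idle for at least timeout_s."""
        return [
            conn_id
            for conn_id, seen in self.last_activity.items()
            if now - seen >= self.timeout_s
        ]


if __name__ == "__main__":
    tracker = IdleTracker(timeout_s=30)
    tracker.touch("laptop", now=0)
    tracker.touch("desktop", now=20)
    # The laptop was unplugged and never sent a FIN or RST, so only a
    # periodic check like this one would ever notice it has gone quiet.
    for conn_id in tracker.expired(now=35):
        print("closing idle connection:", conn_id)
        tracker.forget(conn_id)
```

The point of the sketch is that the check runs on a timer, independent of any packet arriving from the client, which is what matters when the client has silently vanished.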