[nsd-users] nsd server OS hang

Jeroen Koekkoek jeroen at nlnetlabs.nl
Fri Sep 16 15:24:10 UTC 2022


Hi Franky,

Glad to hear the server is behaving a lot better. --enable-mmap uses
mmap rather than malloc. It probably has lower overhead in specific
scenarios, but if things are working as expected right now, I suggest
leaving it as is.

- Jeroen 
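
To double-check which options a rebuilt binary actually carries, something
like the sketch below may help. It assumes that `nsd -v` prints the
configure line used for the build (possibly on stderr); if nsd is not
installed, it falls back to a few of the options quoted in this thread.
has_flag is a hypothetical helper, not part of NSD.

```shell
#!/bin/sh
# has_flag LINE FLAG: print "yes" if FLAG appears as a whole word in LINE,
# otherwise "no". Hypothetical helper for illustration, not part of NSD.
has_flag() {
    case " $1 " in
        *" $2 "*) echo yes ;;
        *)        echo no ;;
    esac
}

# Assumption: `nsd -v` reports the configure line of the build.
# Newlines and single quotes are turned into spaces so that each
# option can be matched as a separate word.
if command -v nsd >/dev/null 2>&1; then
    line=$(nsd -v 2>&1 | tr '\n' ' ' | tr "'" ' ')
else
    # Fallback: a subset of the options from the compile line in this thread.
    line="--enable-checking --enable-recvmmsg --enable-packed --enable-memclean"
fi

for flag in --enable-recvmmsg --enable-mmap --enable-checking; do
    printf '%s: %s\n' "$flag" "$(has_flag "$line" "$flag")"
done
```

The whole-word match avoids treating an option as present merely because
it is a prefix of a longer one.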


On Fri, 2022-09-16 at 15:45 +0200, Franky Van Liedekerke wrote:
> Hi Jeroen,
> 
> I've recompiled with --enable-recvmmsg and left --enable-checking
> (for now).
> However, the old version was also compiled with "--enable-mmap" which
> I now deactivated since it is marked as being experimental.
> This is the current compile line:
> 
> ./configure --prefix=/usr --with-configdir=/etc/nsd \
>   --with-nsd_conf_file=/etc/nsd/nsd.conf --with-pidfile= \
>   --with-dbfile=/var/lib/nsd/nsd.db --with-zonesdir=/etc/nsd \
>   --with-xfrdfile=/var/lib/nsd/xfrd.state --enable-root-server \
>   --enable-ratelimit --enable-checking --enable-dnstap \
>   --enable-systemd --enable-pie --enable-relro-now \
>   --enable-recvmmsg --enable-packed --enable-memclean
> 
> Franky
> 
> On Fri, 2022-09-16 at 13:19 +0200, Jeroen Koekkoek wrote:
> > Hi Franky,
> > 
> > You may want to disable "--enable-checking", that's enabling debug
> > information and negatively impacts performance. --enable-recvmmsg
> > is something you do want to enable because it receives multiple UDP
> > messages with one syscall and thus improves performance.
> > 
> > Maybe it helps if you set the reload timeout a bit higher? It's hard
> > to tell with the provided information what can be changed to keep
> > the server from becoming unresponsive. Maybe you can share the
> > configuration? You may want to have a look at the tuning section of
> > the manual
> > (https://nsd.docs.nlnetlabs.nl/en/latest/running/tuning.html). I
> > wouldn't bother with Processor Affinity just yet, the first section
> > may already do wonders for your setup.
> > 
> > Best,
> > Jeroen
> > 
> > 
> > On Fri, 2022-09-16 at 10:34 +0200, Franky Van Liedekerke via
> > nsd-users wrote:
> > > Hi,
> > > 
> > > I seem to have an issue with one nameserver (the one running nsd
> > > 4.6.0, but it also happened with the nsd package that came with
> > > ubuntu itself):
> > > 
> > > On a regular basis the server just hangs. No coredumps (the server
> > > is configured to coredump), nothing in nsd logs, nothing in syslog
> > > except always the same final message that happens to arrive on the
> > > central logserver just before the OS hang:
> > > "TCP: request_sock_TCP: Possible SYN flooding on port 53. Sending
> > > cookies."
> > > 
> > > After that message, it's game over for that server: not even the
> > > console is responsive anymore. It's a vm, so we see the cpu spiking
> > > in the vm stats on the host so I'm assuming something is taking up
> > > all cpu causing a huge load, but I'm unable to pinpoint it since
> > > ... it hangs :-) . Other dns servers (running bind) with the same
> > > kernel parameters for flooding (burst) don't show the message (so
> > > maybe just 1 server is being targeted, but it still shouldn't crash
> > > like that).
> > > Any hints on how to debug this? If someone might think it is
> > > related to nsd, this is the compile line:
> > > ./configure --prefix=/usr --with-configdir=/etc/nsd \
> > >   --with-nsd_conf_file=/etc/nsd/nsd.conf \
> > >   --with-pidfile=/run/nsd/nsd.pid \
> > >   --with-dbfile=/var/lib/nsd/nsd.db --with-zonesdir=/etc/nsd \
> > >   --with-xfrdfile=/var/lib/nsd/xfrd.state --disable-largefile \
> > >   --disable-recvmmsg --enable-root-server --enable-mmap \
> > >   --enable-ratelimit --enable-checking --enable-dnstap \
> > >   --enable-systemd
> > > 
> > > (I see there's an option for tcp_fastopen, but it was not used by
> > > the person who compiled it, and I can't really explain the reason
> > > for --disable-largefile --disable-recvmmsg, but those two shouldn't
> > > have any impact.)
> > > server-count is 2 (the server has 2 vCPUs), no memory issues seen.
> > > The server is serving (as secondary) more than 7000 zones (so many
> > > xfr requests, but currently we left the xfrd-reload-timeout at 1
> > > second).
> > > 
> > > With friendly regards,
> > > Franky
> > > _______________________________________________
> > > nsd-users mailing list
> > > nsd-users at lists.nlnetlabs.nl
> > > https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users
> > 