[nsd-users] NSD as slave leaves a child process behind
Ville Mattila
vmattila at csc.fi
Mon Sep 14 14:25:26 UTC 2009
Hi,
We've run into problems with NSD v3.2.3 (on Red Hate Enterprise Linux
5.4 x86_64) failing to kill one of it's children processes while NSD
is reloading after it has received an update to a zone from master.
Everything seems to running fine, but 'nsdc stop' and 'nsdc patch'
etc. really don't work because the reload updates NSD the pid file but
the child that was left behind really is handling all the queries
and zone updates.
Our NSD is a slave to some thousands of zones and because of the way we
automatically create NSD configuration (we configure NSD to use every
other NS of the zone as master even though typically only one of the other
NS servers of a zone actually allows an AXFR for us), our NSD typically
has n*10 zone transfers in SYN_SENT state destined to fail after connect
timeout and for a lot of AXFR requests our NSD gets REFUSED response.
I'd appreciate help hunting down the cause for this problem; we need to
get 'nsdc stop && nsdc start' working to be able to restart NSD reliably
after generating new nsd.conf automatically. (Of course I could write
a script which kills NSD some other way instead of 'nsdc stop' but I'd
prefer fixing NSD/nsdc/our environment.)
-----
Here's how the problem evolves on our systems:
1. Initially (after e.g. server reboot) NSD is running fine. Three nsd
processes (with default server-count setting), one of which (pid 20925
here) is the parent to the other two (20933 and 20934). The parent NSD
owns all tcp LISTEN and udp sockets, and child 20933 is the only one
trying to do some AXFRs from masters:
% fgrep 'nsd[20925]' /var/log/messages
Sep 14 14:59:50 isar nsd[20925]: nsd started (NSD 3.2.3), pid 20925
% ps -ef | grep nsd.conf
nsd 20925 1 56 14:59 ? 00:00:04 /usr/sbin/nsd -c /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
nsd 20933 20925 11 14:59 ? 00:00:00 /usr/sbin/nsd -c /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
nsd 20934 20925 1 14:59 ? 00:00:00 /usr/sbin/nsd -c /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
% sudo netstat -anp | grep nsd | grep -v -E '(LISTEN|ESTABLISHED)'
tcp 0 0 128.214.248.137:53 0.0.0.0:* LISTEN 20925/nsd
tcp 0 0 193.167.245.84:53 0.0.0.0:* LISTEN 20925/nsd
tcp 0 0 195.148.12.100:53 0.0.0.0:* LISTEN 20925/nsd
tcp 0 0 2001:708:10:70::55:1:53 :::* LISTEN 20925/nsd
tcp 0 0 2001:708:10:55::53:53 :::* LISTEN 20925/nsd
udp 0 0 128.214.248.137:53 0.0.0.0:* 20925/nsd
udp 0 0 193.167.245.84:53 0.0.0.0:* 20925/nsd
udp 0 0 195.148.12.100:53 0.0.0.0:* 20925/nsd
udp 0 0 2001:708:10:70::55:1:53 :::* 20925/nsd
udp 0 0 2001:708:10:55::53:53 :::* 20925/nsd
% sudo netstat -anp -A inet,inet6 | grep nsd | grep SYN_SENT
(master server IP addresses obfuscated as masterX)
tcp 0 1 128.214.248.137:57130 masterA:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:53660 masterA:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:37449 masterB:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:50400 masterC:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:48250 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:35477 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:34076 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:48529 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:48535 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:35901 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:37632 masterF:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:47745 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:32931 masterF:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:49364 masterG:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:37719 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:36117 masterF:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:37209 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:45596 masterG:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:60195 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:54687 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:54131 masterF:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:60484 masterE:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:55516 masterD:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:50900 masterF:53 SYN_SENT 20933/nsd
tcp 0 1 128.214.248.137:36627 masterG:53 SYN_SENT 20933/nsd
2. Soon NSD receives an update to one of the zones and reloads, and
updates the pid file:
(from /var/log/messages:)
Sep 14 15:01:55 isar nsd[20925]: signal received, reloading...
Sep 14 15:01:55 isar nsd[21072]: memory recyclebin holds 535424 bytes
% cat $(nsd-checkconf -o pidfile /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf)
21072
3. Now observe the child pid 20933 still is running directly under init process (pid 1).
The child 20933 now owns the LISTEN sockets.
% ps -ef | grep nsd.conf
nsd 20933 1 0 14:59 ? 00:00:01 /usr/sbin/nsd /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
nsd 21072 1 2 15:01 ? 00:00:00 /usr/sbin/nsd /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
nsd 21073 21072 1 15:01 ? 00:00:00 /usr/sbin/nsd /v/net/ns-secondary.funet.fi/etc/nsd/nsd-isar.conf
% sudo netstat -anp | grep nsd | grep -v -E '(LISTEN|ESTABLISHED)'
tcp 0 0 128.214.248.137:53 0.0.0.0:* LISTEN 20933/nsd
tcp 0 0 193.167.245.84:53 0.0.0.0:* LISTEN 20933/nsd
tcp 0 0 195.148.12.100:53 0.0.0.0:* LISTEN 20933/nsd
tcp 0 0 2001:708:10:70::55:1:53 :::* LISTEN 20933/nsd
tcp 0 0 2001:708:10:55::53:53 :::* LISTEN 20933/nsd
udp 0 0 128.214.248.137:53 0.0.0.0:* 20933/nsd
udp 0 0 193.167.245.84:53 0.0.0.0:* 20933/nsd
udp 0 0 195.148.12.100:53 0.0.0.0:* 20933/nsd
udp 0 0 2001:708:10:70::55:1:53 :::* 20933/nsd
udp 0 0 2001:708:10:55::53:53 :::* 20933/nsd
Steps 2-4 repeat over and over: process 20933 keeps on running and
responding to DNS queries.
I'm not sure if process 20933 should terminate in reloads but for sure
nsdc cannot find it because nsdc depends on pidfile being correct.
-----
Regards,
--
Ville Mattila, System Specialist, Funet network, CSC
PO Box 405, FIN-02101 Espoo, Finland, fax +358 9 457 2302
CSC is the Finnish IT Center for Science, http://www.csc.fi/, email:
ville.mattila at csc.fi
More information about the nsd-users
mailing list