[nsd-users] Assumed Memory Leak in NSD-3.2.3 ??

Matthijs Mekking matthijs at NLnetLabs.nl
Fri Apr 16 11:58:18 UTC 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Christian,

As Shane pointed out, it only makes sense to measure memory at certain
points in time. IXFR increases the memory usage of NSD, and nsdc patch
cleans up the IXFR changes. So it makes sense to measure memory right
after a patch.

There is a known issue with memory management in NSD: if a domain is
removed from the zone, NSD cleans up the records, but not the owner
name. That means that for every domain that was once in your zone but
has since been removed, NSD keeps a small number of bytes in memory. On
my system, this is 68 bytes per domain. If you use NSEC3, it might be a
bit more.
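
To give a feeling for the size of this effect, here is a small sketch in
Python. The 68 bytes is the figure from my system mentioned above; the
number of removed domains is a made-up example, and retained_bytes is
just an illustrative name.

```python
# Bytes kept per removed owner name; measured on one system only and
# possibly higher when NSEC3 is in use.
BYTES_PER_REMOVED_NAME = 68

def retained_bytes(removed_domains, per_name=BYTES_PER_REMOVED_NAME):
    """Estimate memory still held for owner names of removed domains."""
    return removed_domains * per_name

# Hypothetical example: one million domains removed since NSD was started.
print(round(retained_bytes(1_000_000) / 2**20, 1), "MiB")
```

With these assumed numbers the retained memory is roughly 65 MiB, so this
issue alone is unlikely to explain a growth of several GB.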

About the 9.6 GB of RAM that top is showing you: I claimed that 5 GB
should be sufficient. I can come up with two reasons why this does not
match:

1. My claim assumed that the database file is 1.1 GB, while that is
actually the size of the text zonefile. If the database file is larger,
more RAM is needed. The general rule is 4x sizeof(nsd.db) (plus a bit of
overhead).

2. You possibly cannot simply add up the resident sizes of the nsd
processes that top is showing you. NSD attempts to get the OS to share
memory between its processes using copy-on-write. I think you can look
this up in /proc/pid/smaps. So, while top shows you three nsd processes
that add up to 9.6 GB of RAM, NSD might not actually be using 9.6 GB of
RAM.
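
Both points can be checked with a short script. This is only a sketch
under assumptions: it needs a Linux kernel that exposes /proc/<pid>/smaps,
the Pss field is only present on kernels that report it, and the function
names are illustrative, not part of NSD.

```python
import re

def estimated_ram_bytes(db_size_bytes, factor=4):
    """Rule of thumb from point 1: factor x the size of nsd.db (overhead not included)."""
    return factor * db_size_bytes

def smaps_totals(smaps_text):
    """Sum selected per-mapping fields (in kB) from the text of an smaps file."""
    totals = {"Rss": 0, "Pss": 0, "Private_Clean": 0, "Private_Dirty": 0}
    for line in smaps_text.splitlines():
        # Field lines look like "Rss:                 4 kB"; the mapping
        # header lines (address ranges) do not match this pattern.
        m = re.match(r"(\w+):\s+(\d+) kB", line)
        if m and m.group(1) in totals:
            totals[m.group(1)] += int(m.group(2))
    return totals

def read_smaps(pid):
    """Read /proc/<pid>/smaps; needs permission to inspect that process."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()

# Usage (the path and pid are placeholders you have to fill in):
#   import os
#   estimated_ram_bytes(os.path.getsize("/path/to/nsd.db"))
#   smaps_totals(read_smaps(18025))
```

Run smaps_totals(read_smaps(pid)) for each nsd pid. If the Private_* sums
are much smaller than Rss, most of the resident memory is shared between
the processes, and adding up the RES column of top counts it more than
once.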

Will you be at RIPE60? Perhaps we can make some time to look into your
situation with a bit more detail.

Hope this helps.

Best regards,

Matthijs

Christian Petrasch wrote:
> Hi Matthijs,
> 
> Thank you for your fast response. The version we used prior to
> NSD-3.2.3 was NSD-2.1.4. We run nsd-patch at a 6-hour interval
> and delete the ixfr.db afterwards, so it does not keep growing.
> But this did not help in the past.
> 
>>> The zone is about twice as big in memory as on disk. At most
>>> it can be four times as large. So a RAM of 5 GB should be sufficient.
> 
> You tell me that 5 GB should be sufficient for a 1.1 GB zonefile, but
> if I look at a top output like the following, I see that nsd uses
> 9.6 GB of RAM. How can this be?
> Or do you mean we need 5 GB for each process?
> 
> 
> top - 13:20:33 up 55 days,  1:21,  0 users,  load average: 0.00, 0.00, 0.00
> Tasks:  52 total,   2 running,  50 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,
>  0.0%st
> Mem:  12582912k total,  8555160k used,  4027752k free,    52316k buffers
> Swap:  9999992k total,   856700k used,  9143292k free,  1861800k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 18025 dnsadm    15   0 3709m 3.6g  332 S    0 29.9   0:00.44 nsd
> 18026 dnsadm    18   0 3709m 3.6g  136 S    0 29.9   0:00.00 nsd
> 22273 dnsadm    15   0 3282m 2.4g  616 S    0 20.0   0:14.31 nsd
> 
> NSD was not restarted between this mail and the last one. Only IXFR
> updates are running.
> If you compare the older top output with the newer one from today,
> you can see that the RAM footprint is increasing.
> 
> old
> 
>> top - 15:14:33 up 50 days,  3:15,  2 users,  load average: 0.00, 0.23, 0.34
>> Tasks:  61 total,   1 running,  60 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,
>>  0.0%st
>> Mem:  12582912k total,  6505944k used,  6076968k free,     1088k buffers
>> Swap:  9999992k total,   689732k used,  9310260k free,     8348k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 20616 dnsadm    15   0 86632 1612  912 S    0  0.0   0:00.00 sshd
>> 20617 dnsadm    15   0 67276 1988  988 S    0  0.0   0:00.01 tcsh
>> 20909 dnsadm    15   0 63536  992  836 S    0  0.0   0:00.00 less
>> 21936 dnsadm    15   0 86628 1608  912 S    0  0.0   0:00.00 sshd
>> 21937 dnsadm    15   0 67268 1984  988 S    0  0.0   0:00.02 tcsh
>> 22259 dnsadm    15   0 10704  992  776 R    0  0.0   0:00.15 top
>> 22273 dnsadm    16   0 3282m *2.6g*  608 S    0 21.3   0:14.30 nsd
>> 22284 dnsadm    15   0 3439m *3.3g*  412 S    0 27.7   5:40.85 nsd
>> 22526 dnsadm    18   0 3439m *3.3g*  136 S    0 27.7   0:00.00 nsd
> 
> 6.6 GB + 2.6 GB = 9.2 GB
> 
> new
> 
> top - 13:20:33 up 55 days,  1:21,  0 users,  load average: 0.00, 0.00, 0.00
> Tasks:  52 total,   2 running,  50 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,
>  0.0%st
> Mem:  12582912k total,  8555160k used,  4027752k free,    52316k buffers
> Swap:  9999992k total,   856700k used,  9143292k free,  1861800k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 18025 dnsadm    15   0 3709m *3.6g*  332 S    0 29.9   0:00.44 nsd
> 18026 dnsadm    18   0 3709m *3.6g*  136 S    0 29.9   0:00.00 nsd
> 22273 dnsadm    15   0 3282m *2.4g*  616 S    0 20.0   0:14.31 nsd
> 
> 7.2 GB + 2.4 GB = 9.6 GB
> 
> 
> Do you have any ideas?
> 
> Thank you very much.
> 
> Kind regards,
> 
> Christian
> 
> 
> -- 
> Christian Petrasch
> IT-Services
> 
> DENIC eG
> Kaiserstraße 75-77
> 60329 Frankfurt am Main
> GERMANY
> 
> E-Mail: petrasch at denic.de
> Fon: +49 69 27235-429
> Fax: +49 69 27235-239
> http://www.denic.de
> 
> PGP-KeyID: 17613DFA, Fingerprint: 791A 40DF 47EF DBBD D8E3 72D0 9A6A
> 846E  1761 3DFA
> 
> Information pursuant to § 25a (1) GenG:
> DENIC Domain Verwaltungs- und Betriebsgesellschaft eG (registered
> office: Frankfurt am Main)
> Executive Board: Sabine Dolderer, Marcus Schäfer, Carsten Schiefner,
> Dr. Jörg Schweiger
> Chairman of the Supervisory Board: Elmar Knipp
> Registered under No. 770 in the Register of Cooperatives, Amtsgericht
> Frankfurt am Main
> 
> 
> 
> From:        Matthijs Mekking <matthijs at NLnetLabs.nl>
> To:        Christian Petrasch <petrasch at denic.de>
> Cc:        nsd-info at nlnetlabs.nl, labs at nlnetlabs.nl,
> nsd-users at nlnetlabs.nl, Jürgen Geinitz <geinitz at denic.de>, Elmar Bins
> <bins at denic.de>, Wolfgang Kriegleder <kriegleder at denic.de>
> Date:        2010-04-09 17:35
> Subject:        Re: Assumed Memory Leak in NSD-3.2.3 ??
> ------------------------------------------------------------------------
> 
> 
> 
> Hi Christian,
> 
> We are more than happy to help you with this problem.
> What was the version of NSD you used prior to 3.2.3?
> 
> I assume you run nsdc patch at a regular interval?
> 
> I have some more comments inline.
> 
> Best regards,
> 
> Matthijs Mekking
> NLnet Labs
> 
> Christian Petrasch wrote:
>> Hello,
> 
>> We (DENIC eG) have been using your nameserver software NSD for
>> several years now.
>> After switching to NSD-3.2.3 we are having trouble with the memory
>> footprint of NSD-3.2.3.
>> The memory consumption rises until the physical limit is reached,
>> then swap is used.
>> After all limits are reached, NSD-3.2.3 can't start an IXFR anymore.
> 
>> It looks as if swapped memory is not freed.
>> When NSD is started with one process, it forks a second process as usual.
>> Then another process is started, which is independent of the others.
>> I assume the forked process is used to perform the IXFR.
>> I don't know the purpose of the independent process.
> 
> The 'independent' process is in fact the xfr daemon. It requests AXFR
> and IXFR.
> 
> 
>> We run our servers on Xen virtual machines with 12 GB of RAM and 10 GB
>> of swap.
>> The size of our zonefile is about 1.1 GB.
>> What size of RAM and swap do you recommend for a zonefile of this size?
>> In the past, using NSD-2.1.4, a RAM size of 8 GB had been sufficient.
> 
> The zone is about twice as big in memory as on disk. At most
> it can be four times as large. So a RAM of 5 GB should be sufficient.
> 
>> Here is the output of top on our server after a fresh start:
> 
>> top - 15:14:33 up 50 days,  3:15,  2 users,  load average: 0.00, 0.23, 0.34
>> Tasks:  61 total,   1 running,  60 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,
>>  0.0%st
>> Mem:  12582912k total,  6505944k used,  6076968k free,     1088k buffers
>> Swap:  9999992k total,   689732k used,  9310260k free,     8348k cached
> 
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 20616 dnsadm    15   0 86632 1612  912 S    0  0.0   0:00.00 sshd
>> 20617 dnsadm    15   0 67276 1988  988 S    0  0.0   0:00.01 tcsh
>> 20909 dnsadm    15   0 63536  992  836 S    0  0.0   0:00.00 less
>> 21936 dnsadm    15   0 86628 1608  912 S    0  0.0   0:00.00 sshd
>> 21937 dnsadm    15   0 67268 1984  988 S    0  0.0   0:00.02 tcsh
>> 22259 dnsadm    15   0 10704  992  776 R    0  0.0   0:00.15 top
>> 22273 dnsadm    16   0 3282m 2.6g  608 S    0 21.3   0:14.30 nsd
>> 22284 dnsadm    15   0 3439m 3.3g  412 S    0 27.7   5:40.85 nsd
>> 22526 dnsadm    18   0 3439m 3.3g  136 S    0 27.7   0:00.00 nsd
> 
> 
> 
>> Here is the output of top after approx. 1 week of running:
> 
>> top - 14:53:59 up 50 days,  2:54,  2 users,  load average: 0.00, 0.00, 0.00
>> Tasks:  61 total,   2 running,  59 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,
>>  0.0%st
>> Mem:  12582912k total,  9097360k used,  3485552k free,    94196k buffers
>> Swap:  9999992k total,  2720028k used,  7279964k free,  1953724k cached
> 
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 19608 dnsadm    15   0 5758m 5.6g  340 S    0 46.5   0:00.84 nsd
>> 19609 dnsadm    18   0 5758m 5.6g  136 S    0 46.5   0:00.00 nsd
>> 20197 dnsadm    15   0 3282m 704m  616 S    0  5.7   0:14.68 nsd
>> 20616 dnsadm    15   0 86632 1624  912 S    0  0.0   0:00.00 sshd
>> 20617 dnsadm    15   0 67276 1988  988 S    0  0.0   0:00.01 tcsh
>> 20909 dnsadm    15   0 63536  992  836 S    0  0.0   0:00.00 less
>> 21936 dnsadm    16   0 86628 1616  912 R    0  0.0   0:00.00 sshd
>> 21937 dnsadm    15   0 67268 1984  988 S    0  0.0   0:00.02 tcsh
>> 21970 dnsadm    15   0 10704  992  776 R    0  0.0   0:00.00 top
> 
> 
>> As you can see, NSD is using a lot of RAM.
>> The longer NSD runs, the more memory is allocated, until the limit
>> is reached and the famous oom-killer (out of memory) inside the
>> kernel kills some process.
> 
>> There seems to be a memory leak somewhere.
> 
>> Can you assist us with this problem?
> 
>> Thank you very much.
> 
>> Kind regards,
> 
>> --
>> Christian Petrasch
>> IT-Services
> 
>> DENIC eG
>> Kaiserstraße 75-77
>> 60329 Frankfurt am Main
>> GERMANY
> 
>> E-Mail: petrasch at denic.de
>> Fon: +49 69 27235-429
>> Fax: +49 69 27235-239
>> http://www.denic.de
> 
>> PGP-KeyID: 17613DFA, Fingerprint: 791A 40DF 47EF DBBD D8E3 72D0 9A6A
>> 846E  1761 3DFA
> 
>> Information pursuant to § 25a (1) GenG:
>> DENIC Domain Verwaltungs- und Betriebsgesellschaft eG (registered
>> office: Frankfurt am Main)
>> Executive Board: Sabine Dolderer, Marcus Schäfer, Carsten Schiefner,
>> Dr. Jörg Schweiger
>> Chairman of the Supervisory Board: Elmar Knipp
>> Registered under No. 770 in the Register of Cooperatives, Amtsgericht
>> Frankfurt am Main
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQEcBAEBAgAGBQJLyFDTAAoJEA8yVCPsQCW5Nz8IAIGQF7EYEgIsSSvqMvryAEy6
Ud1x7Bj4L/RQs/t87sq4Ax8PQHz/wjeNkOUbxo51ZuhHT8+TxRq2wCeZIwQYJ0vX
/vUKy4vj1C5RBk0kaPpxyYjQ8QXm7lXcuoT/MhBzdfZ5GIFR5bn040y0unAknhKV
AnVa9PwVMeAmyOSrDXg6Mh9cRKVJA251VYSU3fi1YwHgltrPKMZtV2N/uOBC/Tn8
6JN9xpsWtmZSxCq4c8vE6gzXWMbJR9dl2VzFkXORejPjCpU+xqmWyzS9+xD3ii1L
grD+H3kXCmFC7E3P6ejAojSlMOTsfASbs91MRIJPnMqyydslx997oV8P1hyfke8=
=8s/x
-----END PGP SIGNATURE-----