[nsd-users] fork-failed only on certain servers

Klaus Darilion klaus.mailinglists at pernau.at
Mon Sep 23 10:00:37 UTC 2019


Hello!

We use NSD as a slave on ~20 servers. Once in a while, when there is a
huge IXFR, the fork fails. Strangely, it fails on only 4 of these 20
identical servers.

These VMs are really identical: same dom0, same amount of RAM, CPUs,
disk space, kernel, sysctl settings, and NSD settings.

When I compare a failed server with a good server, RAM usage before the
IXFR was 10.5GB on both. Both have 25GB of RAM installed and the IXFR
was ~2GB, so there should be sufficient RAM available.
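
Free RAM alone does not necessarily decide whether fork() succeeds on
Linux: the kernel's commit accounting can also refuse it, depending on
vm.overcommit_memory. Since the sysctl settings are reported as
identical, the runtime counters may be the more interesting thing to
compare between a good and a failed VM. The following is a minimal
sketch, assuming a Linux guest that exposes CommitLimit and
Committed_AS in /proc/meminfo; the function names are hypothetical and
are not part of NSD.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most values are in kB; a few counters (e.g. HugePages_Total) have no unit.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def commit_headroom_kb(info):
    """Return CommitLimit minus Committed_AS in kB.

    The kernel only enforces CommitLimit when vm.overcommit_memory is 2,
    so in the other modes this number is just a hint for comparison.
    """
    return info["CommitLimit"] - info["Committed_AS"]


if __name__ == "__main__":
    # Run this on a good VM and on a failed VM, then compare the output.
    try:
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        with open("/proc/sys/vm/overcommit_memory") as fh:
            mode = int(fh.read().strip())
    except OSError as exc:
        print("cannot read /proc (not Linux?):", exc)
    else:
        print("vm.overcommit_memory =", mode)
        print("Committed_AS (kB)    =", info.get("Committed_AS"))
        print("CommitLimit (kB)     =", info.get("CommitLimit"))
        if "CommitLimit" in info and "Committed_AS" in info:
            print("headroom (kB)        =", commit_headroom_kb(info))
```

If the headroom differs noticeably between the two groups of VMs
shortly before a large IXFR, that would be one possible lead; if it
does not, the cause is probably elsewhere.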

NSD logs look identical, except that the fork failed on one (see below).

Do you have any hints why the fork fails on some VMs?

thanks
Klaus



Good Server:
05:44:58 vie nsd[22157]: notify for xxx. from 1.2.3.4 serial 2019092314
05:44:58 vie nsd[22157]: notify for xxx. from 1234::5 serial 2019092314
05:49:45 vie nsd[669]: xfrd: zone xxx committed "received update to
serial 2019092314 at 2019-09-23T05:49:45 from 1.2.3.20 TSIG verified
with key mykey"
05:51:14 vie nsd[672]: rehash of zone xxx. with parameters 1 0 5
939fffb0948cbf34
05:51:27 vie nsd[672]: nsec3 xxx 1 %
05:51:33 vie nsd[672]: nsec3 xxx 17 %
05:51:39 vie nsd[672]: nsec3 xxx 25 %
05:51:45 vie nsd[672]: nsec3 xxx 31 %
05:51:49 vie nsd[22157]: notify for xxx. from 1.2.3.4 serial 2019092315
05:51:49 vie nsd[22157]: notify for xxx. from 1234::5 serial 2019092315
05:51:49 vie nsd[669]: xfrd: zone xxx committed "received update to
serial 2019092315 at 2019-09-23T05:51:49 from 1.2.3.4 TSIG verified with
key mykey"
05:51:49 vie nsd[22157]: notify for xxx. from 1.2.3.20 serial 2019092315
05:51:49 vie nsd[22157]: notify for xxx. from 2345::5 serial 2019092315
05:51:51 vie nsd[672]: nsec3 xxx 39 %
05:51:57 vie nsd[672]: nsec3 xxx 45 %
05:52:03 vie nsd[672]: nsec3 xxx 54 %
05:52:09 vie nsd[672]: nsec3 xxx 61 %
05:52:15 vie nsd[672]: nsec3 xxx 68 %
05:52:21 vie nsd[672]: nsec3 xxx 77 %
05:52:27 vie nsd[672]: nsec3 xxx 84 %
05:52:33 vie nsd[672]: nsec3 xxx 91 %
05:52:39 vie nsd[672]: nsec3 xxx 98 %
05:52:41 vie nsd[672]: zone xxx. received update to serial 2019092314 at
2019-09-23T05:49:45 from 1.2.3.20 TSIG verified with key mykey of
1815276647 bytes in 411.196 seconds
05:52:45 vie nsd[669]: xfrd: zone xxx committed "received update to
serial 2019092315 at 2019-09-23T05:52:45 from 2345::5 TSIG verified with
key mykey"
05:52:57 vie nsd[672]: zone xxx. received update to serial 2019092315 at
2019-09-23T05:51:49 from 1.2.3.4 TSIG verified with key mykey of 792413
bytes in 0.03947 seconds
05:53:05 vie nsd[669]: zone xxx serial 2019092314 is updated to 2019092315.



Failed Server:
05:44:59 nyc nsd[344]: notify for xxx. from 1234::5 serial 2019092314
05:44:59 nyc nsd[344]: notify for xxx. from 1.2.3.4 serial 2019092314
05:49:54 nyc nsd[10937]: xfrd: zone xxx committed "received update to
serial 2019092314 at 2019-09-23T05:49:54 from 2345::5 TSIG verified with
key mykey"
05:51:14 nyc nsd[10939]: rehash of zone xxx. with parameters 1 0 5
939fffb0948cbf34
05:51:25 nyc nsd[10939]: nsec3 xxx 1 %
05:51:31 nyc nsd[10939]: nsec3 xxx 17 %
05:51:37 nyc nsd[10939]: nsec3 xxx 25 %
05:51:43 nyc nsd[10939]: nsec3 xxx 31 %
05:51:49 nyc nsd[10939]: nsec3 xxx 38 %
05:51:49 nyc nsd[344]: notify for xxx. from 1.2.3.4 serial 2019092315
05:51:49 nyc nsd[344]: notify for xxx. from 1234::5 serial 2019092315
05:51:50 nyc nsd[344]: notify for xxx. from 2345::5 serial 2019092315
05:51:50 nyc nsd[344]: notify for xxx. from 1.2.3.20 serial 2019092315
05:51:50 nyc nsd[10937]: xfrd: zone xxx committed "received update to
serial 2019092315 at 2019-09-23T05:51:50 from 1.2.3.4 TSIG verified with
key mykey"
05:51:55 nyc nsd[10939]: nsec3 xxx 45 %
05:52:02 nyc nsd[10939]: nsec3 xxx 54 %
05:52:09 nyc nsd[10939]: nsec3 xxx 62 %
05:52:15 nyc nsd[10939]: nsec3 xxx 71 %
05:52:21 nyc nsd[10939]: nsec3 xxx 78 %
05:52:27 nyc nsd[10939]: nsec3 xxx 84 %
05:52:33 nyc nsd[10939]: nsec3 xxx 90 %
05:52:39 nyc nsd[10939]: nsec3 xxx 97 %
05:52:42 nyc nsd[10939]: zone xxx. received update to serial 2019092314
at 2019-09-23T05:49:54 from 2345::5 TSIG verified with key mykey of
1815276647 bytes in 418.798 seconds
05:52:43 nyc nsd[10939]: fork failed: Cannot allocate memory
05:52:45 nyc nsd[10937]: process 10939 exited with status 256
05:52:45 nyc nsd[4570]: handle_reload_cmd: reload closed cmd channel
05:52:45 nyc nsd[4570]: Reload process 10939 failed, continuing with old
database
05:52:46 nyc nsd[10937]: xfrd: zone xxx committed "received update to
serial 2019092315 at 2019-09-23T05:52:46 from 1.2.3.20 TSIG verified
with key mykey"
05:53:10 nyc nsd[10937]: xfrd: zone xxx: soa serial 2019092315 update
failed, restarting transfer (notified zone)
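
To check how often this happens across all 20 servers, the relevant
entries could be pulled out of each log with a small script. This is
only a sketch: it assumes the "HH:MM:SS host nsd[pid]: message" layout
shown in the excerpts above, treats any non-matching line as a wrapped
continuation of the previous entry, and uses hypothetical function
names.

```python
import re

# Matches entries such as: 05:52:43 nyc nsd[10939]: fork failed: ...
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) (\S+) nsd\[(\d+)\]: (.*)$")


def join_wrapped(lines):
    """Re-join log entries that were wrapped onto several lines."""
    entries = []
    for line in lines:
        if LINE_RE.match(line):
            entries.append(line.rstrip())
        elif entries and line.strip():
            # Continuation of the previous entry.
            entries[-1] += " " + line.strip()
    return entries


def find_failures(lines, needles=("fork failed", "Reload process")):
    """Return (time, host, pid, message) for entries containing a needle."""
    hits = []
    for entry in join_wrapped(lines):
        m = LINE_RE.match(entry)
        msg = m.group(4)
        if any(n in msg for n in needles):
            hits.append((m.group(1), m.group(2), int(m.group(3)), msg))
    return hits
```

Calling find_failures() on the lines of each server's log gives a list
that can be compared per host and per day.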



