[nsd-users] xfr errors

Michael Tokarev mjt at tls.msk.ru
Sun Aug 1 09:31:52 UTC 2021


01.08.2021 12:19, Michael Tokarev via nsd-users wrote:
> 01.08.2021 12:12, Anand Buddhdev via nsd-users wrote:
> 
>> TSIG requires the time on the primary and secondary to be synchronised
>> to within 5 minutes. Check the system time on your two servers. One of
>> them has probably drifted more than 5 minutes. If you're not already
>> running something like ntp or chrony, you should do that to keep the
>> time accurate on these servers.
> 
> The time is syncronized within microseconds. We're running time sync
> daemons on both, and both works as expected. This is the first thing
> I checked - if the time is okay - yes it is, ticking exactly in line
> on both servers.
> 
> I'm trying to rebuild NSD with #ifdebug turned on in the place where
> BADTIME is returned, but so far it doesn't work, - apparently it tries
> to print to a closed filedescriptor :)

And after actually enabling the debug stuff in there I found out that
the time differs by exactly 3 hours - which is our GMT offset. Which
lead me to realize that the problem apparently is wrong timezone
picked by one of the NSDs, - after restart of NSD on master, the
secondary were finally able to transfer the zones.

There was an issue still, - but this looks more like a prob is with
systemd integration (debian), not in nsd itself:

Aug 01 12:24:15 panda.tls.msk.ru systemd[1]: Starting Name Server Daemon...
Aug 01 12:24:15 panda.tls.msk.ru nsd[3195]: nsd starting (NSD 4.3.5)
Aug 01 12:24:15 panda.tls.msk.ru nsd[3195]: [2021-08-01 12:24:15.242] nsd[3195]: notice: nsd starting (NSD 4.3.5)
Aug 01 12:24:15 panda.tls.msk.ru nsd[3196]: nsd started (NSD 4.3.5), pid 3195
Aug 01 12:24:15 panda.tls.msk.ru nsd[3196]: [2021-08-01 12:24:15.274] nsd[3196]: notice: nsd started (NSD 4.3.5), pid 3195
Aug 01 12:24:16 panda.tls.msk.ru nsd[3195]: zone corpit.ru serial 2020071628 is updated to 2020071632
Aug 01 12:24:16 panda.tls.msk.ru nsd[3195]: [2021-08-01 12:24:16.375] nsd[3195]: info: zone corpit.ru serial 2020071628 is updated to 2020071632
.... a few more zones updated ....
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: nsd.service: start operation timed out. Terminating.
Aug 01 12:25:45 panda.tls.msk.ru nsd[3196]: signal received, shutting down...
Aug 01 12:25:45 panda.tls.msk.ru nsd[3196]: [2021-08-01 12:25:45.228] nsd[3196]: warning: signal received, shutting down...
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: nsd.service: Failed with result 'timeout'.
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: Failed to start Name Server Daemon.
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: nsd.service: Scheduled restart job, restart counter is at 1.
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: Stopped Name Server Daemon.
Aug 01 12:25:45 panda.tls.msk.ru systemd[1]: Starting Name Server Daemon...
Aug 01 12:25:45 panda.tls.msk.ru nsd[3240]: nsd starting (NSD 4.3.5)
Aug 01 12:25:45 panda.tls.msk.ru nsd[3240]: [2021-08-01 12:25:45.512] nsd[3240]: notice: nsd starting (NSD 4.3.5)
Aug 01 12:25:45 panda.tls.msk.ru nsd[3241]: nsd started (NSD 4.3.5), pid 3240
Aug 01 12:25:45 panda.tls.msk.ru nsd[3241]: [2021-08-01 12:25:45.543] nsd[3241]: notice: nsd started (NSD 4.3.5), pid 3240
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: nsd.service: start operation timed out. Terminating.
Aug 01 12:27:15 panda.tls.msk.ru nsd[3241]: signal received, shutting down...
Aug 01 12:27:15 panda.tls.msk.ru nsd[3241]: [2021-08-01 12:27:15.728] nsd[3241]: warning: signal received, shutting down...
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: nsd.service: Failed with result 'timeout'.
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: Failed to start Name Server Daemon.
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: nsd.service: Scheduled restart job, restart counter is at 2.
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: Stopped Name Server Daemon.
Aug 01 12:27:15 panda.tls.msk.ru systemd[1]: Starting Name Server Daemon...
Aug 01 12:27:16 panda.tls.msk.ru nsd[3266]: nsd starting (NSD 4.3.5)

(for some reason some lines are repeated in the log but with different prefix,
this is yet another thing to look at).

Anyway, the initial - main - issue is now solved, and I'm trying to understand
what actually happened.

Thank you!

/mjt


More information about the nsd-users mailing list