[nsd-users] Xfrd scalability problem
W.C.A. Wijngaards
wouter at NLnetLabs.nl
Mon Mar 1 08:16:17 UTC 2010
Hi Martin,
Thanks for the perf measurements. I did not know that. I wrote that
code some time ago, and decided against optimizing xfrd like this,
because the netio handler is also used by the server processes. Those
processes listen on only a limited number of sockets, so a simple list
scan is more efficient for them. If this is the only bottleneck for a
larger number of zones, then it may be relatively easy to fix.
Best regards,
Wouter
On 02/28/2010 08:30 PM, Martin Svec wrote:
> Hello again,
>
> I think the xfrd daemon suffers from a scalability problem with respect
> to the number of zones. For every zone, xfrd adds a netio_handler to the
> linked list of handlers. Then, every netio_dispatch call sequentially
> scans the entire list for "valid" file descriptors and timeouts. With a
> large number of zones, this scan is expensive and superfluous, because
> almost all zone file descriptors/timeouts are usually not assigned. The
> problem is most obvious during "nsdc reload". Because the server_reload
> function sends SOA info for all zones to xfrd, xfrd performs a full scan
> of the linked list for every zone, so the resulting complexity of a
> reload is O(n^2). Just try "nsdc reload" with 65000 zones and you'll see
> that the xfrd daemon consumes 100% CPU for several _minutes_! However, I
> guess that the scalability problem is not limited to reload, because
> _every_ socket communication with xfrd goes through the same
> netio_dispatch. Here is the "perf record" result for the xfrd process
> during reload:
>
> # Overhead  Command  Shared Object  Symbol
> # ........  .......  .............  ......
> #
>     98.69%  nsd      /usr/sbin/nsd  [.] netio_dispatch
>      0.06%  nsd      [kernel]       [k] unix_stream_recvmsg
>      0.05%  nsd      /usr/sbin/nsd  [.] rbtree_find_less_equal
>      0.04%  nsd      [kernel]       [k] kfree
>      0.04%  nsd      [kernel]       [k] copy_to_user
>
> Then, "perf annotate netio_dispatch" shows that the heart of the problem
> is indeed the loop scanning the linked list (because of gcc
> optimizations, the line numbers are only approximate):
>
> 48.24% /work/nsd-3.2.4/netio.c:158
> 45.41% /work/nsd-3.2.4/netio.c:158
> 2.14% /work/nsd-3.2.4/netio.c:172
> 2.14% /work/nsd-3.2.4/netio.c:156
> 1.81% /work/nsd-3.2.4/netio.c:172
>
> I wonder why the linked list in xfrd contains the netio_handlers of _all_
> zones. Wouldn't it be better to dynamically add/remove zone handlers only
> when their file descriptors/timeouts are assigned/cleared? And perhaps
> replace the linked list with a more scalable data structure? (Or is NSD
> intentionally designed to serve only a small number of zones? ;-))
>
> Best regards
> Martin Svec
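
For illustration only, here is a minimal sketch of the "add/remove handlers only when they are active" idea from the quoted message. It is not NSD code: the names zone_handler, active_list, handler_update and dispatch_scan are hypothetical, and the real netio structures and functions differ. It assumes a handler counts as active when it has a file descriptor assigned (fd != -1) or a pending timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-zone handler; NOT NSD's netio_handler_type. */
struct zone_handler {
    int fd;                       /* -1 when no socket is assigned */
    int has_timeout;              /* nonzero when a timeout is pending */
    int linked;                   /* nonzero while on the active list */
    struct zone_handler *prev;
    struct zone_handler *next;
};

/* Doubly linked list holding only the handlers that are active. */
struct active_list {
    struct zone_handler *head;
    size_t count;
};

static void active_list_init(struct active_list *l)
{
    l->head = NULL;
    l->count = 0;
}

/* Link a handler at the head of the list; O(1), no-op if already linked. */
static void handler_activate(struct active_list *l, struct zone_handler *h)
{
    assert(l != NULL && h != NULL);
    if (h->linked)
        return;
    h->prev = NULL;
    h->next = l->head;
    if (l->head != NULL)
        l->head->prev = h;
    l->head = h;
    h->linked = 1;
    l->count++;
}

/* Unlink a handler; O(1), no-op if it is not linked. */
static void handler_deactivate(struct active_list *l, struct zone_handler *h)
{
    assert(l != NULL && h != NULL);
    if (!h->linked)
        return;
    if (h->prev != NULL)
        h->prev->next = h->next;
    else
        l->head = h->next;
    if (h->next != NULL)
        h->next->prev = h->prev;
    h->prev = NULL;
    h->next = NULL;
    h->linked = 0;
    l->count--;
}

/* Call after changing fd or has_timeout to keep the list in sync. */
static void handler_update(struct active_list *l, struct zone_handler *h)
{
    if (h->fd != -1 || h->has_timeout)
        handler_activate(l, h);
    else
        handler_deactivate(l, h);
}

/* Stand-in for the dispatch loop: returns how many handlers it visits. */
static size_t dispatch_scan(const struct active_list *l)
{
    size_t visited = 0;
    const struct zone_handler *h;
    for (h = l->head; h != NULL; h = h->next)
        visited++;
    return visited;
}
```

Under these assumptions, a dispatch pass costs time proportional to the number of active handlers rather than the total number of zones, so a reload that triggers one pass per zone would no longer be quadratic in the zone count because of this scan alone.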