[nsd-users] Xfrd scalability problem

Martin Švec martin.svec at zoner.cz
Sun Feb 28 19:30:11 UTC 2010

Hello again,

I think that the xfrd daemon has a scalability problem with respect to
the number of zones. For every zone, xfrd adds a netio_handler to the
linked list of handlers. Every netio_dispatch call then sequentially
scans the entire list for "valid" file descriptors and timeouts. With a
large number of zones, this scan is expensive and mostly wasted work,
because almost all zones usually have no file descriptor or timeout
assigned. The problem is most obvious during "nsdc reload". Because the
server_reload function sends the SOA info of all zones to xfrd, xfrd
performs a full scan of the linked list for every zone, so the resulting
complexity of a reload is O(n^2). Just try "nsdc reload" with 65000 zones
and you'll see the xfrd daemon consume 100% CPU for several _minutes_!
I suspect the problem is not limited to reload, though, because _every_
socket communication with xfrd goes through the same netio_dispatch.
Here is the "perf record" result for the xfrd process during reload:

# Overhead  Command        Shared Object  Symbol
# ........  .......  ...................  ......
    98.69%      nsd  /usr/sbin/nsd        [.] netio_dispatch
     0.06%      nsd  [kernel]             [k] unix_stream_recvmsg
     0.05%      nsd  /usr/sbin/nsd        [.] rbtree_find_less_equal
     0.04%      nsd  [kernel]             [k] kfree
     0.04%      nsd  [kernel]             [k] copy_to_user

Then, "perf annotate netio_dispatch" shows that the heart of the problem
is indeed the loop that scans the linked list (because of gcc
optimizations, the line numbers are only approximate):

48.24% /work/nsd-3.2.4/netio.c:158
45.41% /work/nsd-3.2.4/netio.c:158
2.14% /work/nsd-3.2.4/netio.c:172
2.14% /work/nsd-3.2.4/netio.c:156
1.81% /work/nsd-3.2.4/netio.c:172
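
To make the O(n^2) claim concrete, here is a minimal C model of the
effect. It is not NSD's code: struct handler, dispatch_scan and
simulate_reload are names I made up for illustration, and the model
assumes one full list walk per zone during reload, as described above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a per-zone handler (hypothetical, not NSD's struct). */
struct handler {
    int fd;                 /* -1 means no file descriptor assigned */
    int has_timeout;        /* 0 means no timeout assigned */
    struct handler *next;
};

/* One dispatch pass: visits every handler, idle or not.
 * Returns the number of handlers visited. */
static unsigned long long dispatch_scan(const struct handler *head)
{
    unsigned long long visited = 0;
    for (const struct handler *h = head; h != NULL; h = h->next)
        visited++;          /* idle zones still cost one visit each */
    return visited;
}

/* Model of a reload: one handler per zone, one dispatch pass per zone.
 * Returns the total number of handler visits, i.e. zones * zones. */
unsigned long long simulate_reload(size_t zones)
{
    struct handler *nodes = calloc(zones, sizeof *nodes);
    if (nodes == NULL)
        return 0;           /* allocation failed or zones == 0 */

    /* Chain all handlers into one list; every zone is idle. */
    for (size_t i = 0; i < zones; i++) {
        nodes[i].fd = -1;
        nodes[i].next = (i + 1 < zones) ? &nodes[i + 1] : NULL;
    }

    unsigned long long total = 0;
    for (size_t i = 0; i < zones; i++)
        total += dispatch_scan(&nodes[0]);

    free(nodes);
    return total;
}
```

In this model, simulate_reload(1000) returns 1000000, and 65000 zones
would mean 65000 * 65000 = 4225000000 handler visits for one reload.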

I wonder why the linked list in xfrd contains the netio_handlers of _all_
zones. Wouldn't it be better to add and remove zone handlers dynamically,
only when their file descriptors or timeouts are assigned or cleared? And
perhaps replace the linked list with a more scalable data structure? (Or
is NSD intentionally designed to serve only a small number of zones? ;-))
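
A rough sketch of what I mean by adding and removing handlers
dynamically. All names here (zone_handler, active_list, handler_update
and so on) are hypothetical and do not come from the NSD sources; a
doubly linked list is just one possible choice that gives O(1) removal.

```c
#include <stddef.h>

/* Hypothetical per-zone handler. It is linked into the active list only
 * while it has a file descriptor or a timeout assigned. */
struct zone_handler {
    int fd;                         /* -1 means no file descriptor */
    int has_timeout;                /* 0 means no timeout */
    int linked;                     /* 1 while on the active list */
    struct zone_handler *prev, *next;
};

struct active_list {
    struct zone_handler *head;
    size_t count;
};

/* Link a handler at the head of the active list (no-op if already linked). */
void active_add(struct active_list *l, struct zone_handler *h)
{
    if (h->linked)
        return;
    h->prev = NULL;
    h->next = l->head;
    if (l->head != NULL)
        l->head->prev = h;
    l->head = h;
    h->linked = 1;
    l->count++;
}

/* Unlink a handler in O(1) (no-op if it is not linked). */
void active_remove(struct active_list *l, struct zone_handler *h)
{
    if (!h->linked)
        return;
    if (h->prev != NULL)
        h->prev->next = h->next;
    else
        l->head = h->next;
    if (h->next != NULL)
        h->next->prev = h->prev;
    h->prev = h->next = NULL;
    h->linked = 0;
    l->count--;
}

/* Set a zone's fd/timeout state and keep list membership in sync. */
void handler_update(struct active_list *l, struct zone_handler *h,
                    int fd, int has_timeout)
{
    h->fd = fd;
    h->has_timeout = has_timeout;
    if (fd != -1 || has_timeout)
        active_add(l, h);
    else
        active_remove(l, h);
}

/* A dispatch pass now visits only active handlers; returns how many. */
size_t active_dispatch_scan(const struct active_list *l)
{
    size_t visited = 0;
    for (const struct zone_handler *h = l->head; h != NULL; h = h->next)
        visited++;
    return visited;
}
```

With something like this, the cost of one dispatch pass depends on the
number of zones that currently have a transfer or timer pending, not on
the total number of configured zones.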

Best regards
Martin Svec
