[Unbound-users] Uneven load on threads
lst_hoe02 at kwsoft.de
Thu Jul 5 20:40:20 UTC 2012
Quoting Sven Ulland <sveniu at opera.com>:
> What determines how queries are scheduled to the available threads?
> We're seeing very uneven load on the 16 threads we have configured:
>
> thread0: 9.60%
> thread1: 1.40%
> thread2: 26.97%
> thread3: 3.05%
> thread4: 0.75%
> thread5: 6.20%
> thread6: 2.09%
> thread7: 4.37%
> thread8: 1.40%
> thread9: 8.38%
> thread10: 0.97%
> thread11: 2.69%
> thread12: 14.85%
> thread13: 6.92%
> thread14: 9.49%
> thread15: 0.87%
>
> We have around 15-20k queries per second in total on this node, and
> while the node is not struggling by any means (and we could easily run
> with fewer threads), it would be interesting to understand what's
> happening. Queries are coming in from nodes spread across a few
> subnets, with random source ports.
>
> We have 8 queues on the network card, so we distribute interrupts
> between 8 CPUs (manually configured by echoing cpu masks into
> /proc/irq/n/smp_affinity). I was thinking that there's a relationship
> here that causes packets received on a certain queue -- and thus on
> a certain CPU -- to end up being handled by the Unbound thread running
> on the same CPU, or copied to another CPU with a NET_RX softirq. This
> could be way off. Perhaps this would work better with forked
> operation.
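>
> For concreteness, the affinity setup amounts to something like the
> sketch below (the IRQ numbers are invented; the real ones come from
> /proc/interrupts on each box):
>
>     # Sketch only: pin eth0 queue N's interrupt to CPU N by writing
>     # a one-bit CPU mask into /proc/irq/<irq>/smp_affinity (as root).
>     irqs = {0: 64, 1: 65, 2: 66, 3: 67,   # eth0 queue -> IRQ number
>             4: 68, 5: 69, 6: 70, 7: 71}   # (machine-specific)
>     for queue, irq in irqs.items():
>         with open('/proc/irq/%d/smp_affinity' % irq, 'w') as f:
>             f.write('%x\n' % (1 << queue))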
>
> Network card input queue interrupt rates:
>
> eth0-0: 0.6%
> eth0-1: 12.1%
> eth0-2: 11.7%
> eth0-3: 32.1%
> eth0-4: 10.6%
> eth0-5: 10.6%
> eth0-6: 10.5%
> eth0-7: 11.9%
>
> If anyone could shed some light on this, both on how the uneven load
> comes about and on how the packet-to-queue-to-CPU-to-application-thread
> path works (and how it should be set up for optimal performance,
> possibly with reference to [1] and taskset), it would be much
> appreciated!
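>
> The kind of experiment I have in mind is pinning each unbound worker
> thread to the CPU that services one NIC queue, roughly like the sketch
> below (the pid is made up and would really come from unbound's
> pidfile; it crudely treats every task of the process as a worker):
>
>     # Sketch only: pin thread i of pid 1234 to CPU i % 8 via taskset.
>     import os, subprocess
>     tids = sorted(int(t) for t in os.listdir('/proc/1234/task'))
>     for i, tid in enumerate(tids):
>         subprocess.check_call(
>             ['taskset', '-p', '-c', str(i % 8), str(tid)])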
>
> Relevant parts of unbound.conf, version 1.4.16:
> num-threads: 16
> msg-cache-slabs: 16
> rrset-cache-slabs: 16
> infra-cache-slabs: 16
> key-cache-slabs: 16
> rrset-cache-size: 2000m
> msg-cache-size: 500m
> outgoing-range: 8192 # Yes, --with-libevent
> num-queries-per-thread: 4096
> so-rcvbuf: 8m
> so-sndbuf: 8m
> extended-statistics: yes
>
> [1]: Documentation/networking/scaling.txt
> <URL:http://lxr.linux.no/#linux+v3.4.4/Documentation/networking/scaling.txt>
This has been discussed recently, and as far as I understand, the
distribution of queries between the threads is an OS duty
(https://unbound.net/pipermail/unbound-users/2012-February/002240.html).
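
A quick way to see this is a toy receiver where all threads block on
the same UDP socket: which thread the kernel wakes up for each datagram
is entirely its choice, and in practice the spread is often uneven.
A minimal sketch (the port and thread count are arbitrary):

    import socket, threading, time

    counts = [0] * 16
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('127.0.0.1', 5300))

    def worker(n):
        while True:
            sock.recvfrom(4096)  # every thread blocks on the same fd
            counts[n] += 1       # each thread only touches its own slot

    for n in range(16):
        t = threading.Thread(target=worker, args=(n,))
        t.daemon = True
        t.start()

    # Fire datagrams at ourselves and see how the kernel spread them.
    out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(10000):
        out.sendto(b'x', ('127.0.0.1', 5300))
    time.sleep(1)
    print(counts)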
Regards
Andreas