Question about thread logging statistics
Yorgos Thessalonikefs
yorgos at nlnetlabs.nl
Wed May 21 08:24:15 UTC 2025
Hi Mike,
The "exceeded" number counts queries that were dropped because the
request list (queries from clients) was full.
However, versions 1.21.0 up to and including 1.22.0 wrongfully also use
that statistic for queries that exceed the discard-timeout [1] and/or
wait-limit [2] options.
Version 1.23.0 fixes that by introducing explicit counters for those
drops, accessible from the 'stats' command
(total.num.queries_discard_timeout and total.num.queries_wait_limit [3]),
and by no longer counting them in "exceeded".
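If it helps once you are on 1.23.0, here is a minimal Python sketch for
pulling those counters out of the key=value text that
'unbound-control stats_noreset' prints. The helper names (parse_stats,
drop_summary) and the sample text are my own illustration, not part of
Unbound, and the sample numbers are made up.

```python
def parse_stats(text):
    """Parse the key=value lines printed by 'unbound-control stats'."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a key=value line
        try:
            stats[key] = float(value)
        except ValueError:
            continue  # skip values that are not numeric
    return stats


# Counter names as documented for unbound-control.
DROP_COUNTERS = (
    "total.requestlist.exceeded",
    "total.num.queries_discard_timeout",
    "total.num.queries_wait_limit",
)


def drop_summary(stats):
    """Return the drop-related counters, using 0 for any that are absent."""
    return {name: int(stats.get(name, 0)) for name in DROP_COUNTERS}


# Illustrative input only (hypothetical numbers, not real output).
sample = """\
total.num.queries=1200
total.requestlist.exceeded=3
total.num.queries_discard_timeout=57
total.num.queries_wait_limit=4
"""
print(drop_summary(parse_stats(sample)))
```

On versions before 1.23.0 the two new counters are simply missing from
the output, so the sketch reports them as 0.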
What I believe happens in your case is that, because you increased
max-sent-count to 200, those queries are now slow to get an answer.
Unbound then drops the replies to those clients because discard-timeout
is exceeded, or, because the queries are slow, those clients exceed
their wait-limit.
(And it wrongfully counts those drops as "exceeded" in the log output.)
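To see how the logged "exceeded" value develops over time (for example
before and after an upgrade), a rough Python sketch like the following
could extract the numbers from the per-thread statistics lines. It
assumes each log entry sits on a single line (the excerpts quoted in this
thread are wrapped by the mail client), and the function name
parse_requestlist is just my own choice.

```python
import re

# Matches the per-thread statistics line that Unbound writes to its log.
STATS_RE = re.compile(
    r"server stats for thread (?P<thread>\d+): "
    r"requestlist max (?P<max>\d+) avg (?P<avg>[\d.]+) "
    r"exceeded (?P<exceeded>\d+) jostled (?P<jostled>\d+)"
)


def parse_requestlist(line):
    """Return the requestlist numbers from one log line, or None."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "thread": int(m["thread"]),
        "max": int(m["max"]),
        "avg": float(m["avg"]),
        "exceeded": int(m["exceeded"]),
        "jostled": int(m["jostled"]),
    }


# One of the log entries from this thread, joined onto a single line.
line = ("[1747757273] unbound[1:0] info: server stats for thread 0: "
        "requestlist max 78 avg 68.4251 exceeded 84 jostled 0")
print(parse_requestlist(line))
```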
Btw, did increasing max-sent-count actually help in your case?
Is your Unbound configured specially for domain.com, or does it just use
a '.' forwarder?
I am mainly asking about the last error log you shared.
Best regards,
-- Yorgos
[1]
https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound.conf.html#unbound-conf-discard-timeout
[2]
https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound.conf.html#unbound-conf-wait-limit
[3]
https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound-control.html#statistic-counters
On 20/05/2025 18:37, Mike Durkin via Unbound-users wrote:
> Hi
>
> We are using unbound docker containers (version 1.22) in our corporate
> environment and after fixing an issue with DNSSEC records, I wanted to
> ask about some of the logging statistics to see if there still might be
> a performance issue.
>
> Last week, we were getting reports of certain domains not resolving and
> I saw error messages like the following in the logs:
>
> [1747490624] unbound[1:2] info: validation failure
> <wpad.domain.com. A IN>: SERVFAIL [exceeded the maximum number of sends]
> no DS for DS domain.com. while building chain of trust
> [1747493945] unbound[1:1] error: SERVFAIL <wpad.domain.com. A IN>:
> exceeded the maximum number of sends
>
> I ended up adding the following which seemed to resolve the issue:
>
> max-sent-count: 200
>
> I had tried some lower values initially, but that didn't resolve the
> problem until I bumped it up to 200.
>
>
> So at the moment we are not getting any reports for DNS client failures,
> but I am seeing the following in the logs:
>
> [1747757273] unbound[1:0] info: server stats for thread 0:
> requestlist max 78 avg 68.4251 exceeded 84 jostled 0
> [1747757333] unbound[1:0] info: server stats for thread 0:
> requestlist max 72 avg 66.9528 exceeded 55 jostled 0
> [1747757393] unbound[1:0] info: server stats for thread 0:
> requestlist max 78 avg 66.9892 exceeded 62 jostled 0
>
>
> The thread server stats is always showing a significant number for
> exceeded. The host where the container is running is not overloaded. I
> do see in the logs that there are a significant number of requests for
> legacy subdomains that are no longer in use and cause error messages
> like the following:
>
> [1747758108] unbound[1:0] error: SERVFAIL <db01-dev.dev.domain.com.
> A IN>: all the configured stub or forward servers failed, at zone
> domain.com. from 10.10.32.2 got SERVFAIL
>
> My main question is, would those requests that are being forwarded and
> timing out with a client error "no servers could be reached" be a source
> for the "exceeded" count in the thread server stats?
>
> Thanks,
>
> -Mike Durkin