Forwarder behavior, infra-cache interaction, and feature ideas
couloum
couloum at gmail.com
Tue Mar 31 09:37:37 UTC 2026
Hello,
We are currently using Unbound (v1.24.2) as a local cache and
forwarder to regional DNS recursors, and I would appreciate some
feedback on a few observations and potential improvements before
opening a feature request on GitHub.
**Context**
Our typical query path is `application -> local Unbound instance
listening on 127.0.0.1 -> regional DNS recursor -> Internet`.
We use a configuration similar to:
```
forward-zone:
    name: "."
    forward-addr: X.X.X.X
    forward-addr: X.X.X.Y
    forward-first: no
```
Depending on the region we can have from 2 to 6 DNS recursors to
configure. In smaller regions, we may rely on forwarders from another
region over a private VPN.
For reference, we also use the following settings:
```
server:
    qname-minimisation: no
    do-ip6: no
    #
    # Performance tuning
    #
    cache-max-negative-ttl: 60
    outbound-msg-retry: 1
    infra-cache-min-rtt: 1000
    # Serve expired entries before they are refreshed in cache
    serve-expired: yes
    serve-expired-ttl: 300
    # cache size tuning
    msg-cache-size: 128m
    rrset-cache-size: 256m
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
```
On the client side, applications use a 6s timeout with up to 3 retries
(no retry on SERVFAIL).
**Observations / Issues**
1. Infra cache behavior with forwarders
From my understanding, the infra cache mechanism monitors DNS response
times and selects the best servers based on latency, within a band of
400 ms. The timeout is also calculated automatically, based on the
answers received (source:
https://www.nlnetlabs.nl/documentation/unbound/info-timeout/). This
makes sense when querying authoritative servers for a zone.
However, in a forwarder setup, RTT variability is often driven by the
queried domain rather than the forwarder itself. As a result,
forwarders may be penalized due to slow domains rather than actual
network latency.
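To make the concern concrete, here is a minimal, illustrative model (not Unbound's actual code) of the documented behavior: a smoothed per-server RTT estimate, and selection among the servers whose estimate falls within 400 ms of the fastest. The `alpha` smoothing factor and the helper names are assumptions for illustration only; the point is that one slow-domain answer inflates a nearby forwarder's estimate:

```python
import random

BAND_MS = 400  # selection band documented for Unbound server selection

def select_upstream(srtt):
    """srtt: dict mapping server -> smoothed RTT estimate in ms.
    Pick at random among servers within BAND_MS of the fastest."""
    best = min(srtt.values())
    candidates = [s for s, rtt in srtt.items() if rtt <= best + BAND_MS]
    return random.choice(candidates)

def update_srtt(srtt, server, sample_ms, alpha=0.125):
    # Classic exponentially weighted moving average (SRTT-style smoothing).
    srtt[server] = (1 - alpha) * srtt[server] + alpha * sample_ms

# Forwarder "a" is local (20 ms), "b" is remote (30 ms). A single answer
# for a slow authoritative zone, relayed through "a", inflates its
# estimate even though the network path to "a" did not change.
srtt = {"a": 20.0, "b": 30.0}
update_srtt(srtt, "a", 2000.0)  # slow domain, not a slow forwarder
```

In this sketch, `srtt["a"]` jumps from 20 ms to well over 200 ms after one slow-domain sample, which is the penalization effect described above.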
2. Lack of prioritization between forwarders
In our setup, some forwarders are local (same region) and others are
remote (over VPN). Ideally, we would prefer local forwarders and only
fall back to remote ones when needed.
**Ideas / Possible Improvements**
These are exploratory suggestions:
• Ability to define a fixed timeout for forwarders, bypassing
automatic RTT-based adjustments.
• Option to disable infra-cache-based selection for forwarders
(while still detecting unresponsive ones).
• Optional round-robin strategy instead of latency-based selection.
• Support for prioritization between forwarders (e.g., prefer
local over remote).
• (Optional) Ability to probe forwarders using a specific domain
to assess availability/latency.
Example of configuration (illustrative only):
```
forward-zone:
    name: "."
    forward-addr: X.X.X.X%10   # %10 means a priority of 10
    forward-addr: X.X.X.Y%20   # %20 means a priority of 20; used only
                               # if all forwarders with a lower
                               # priority could not answer
    timeout: 200               # timeout, in milliseconds
    infra-cache-disable: yes   # disable the infra-cache mechanism
                               # (is defining the parameters above enough?)
```
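To clarify the intended semantics of the hypothetical `%priority` suffix above, here is a small sketch (names and behavior are my own assumptions, not an existing Unbound feature): forwarders are grouped by priority, lower priority values are tried first, and a higher-priority group is only consulted when every server in the preceding group has failed:

```python
from itertools import groupby

def ordered_attempts(forwarders):
    """forwarders: list of (addr, priority) pairs.
    Returns priority groups in the order they would be tried."""
    ordered = sorted(forwarders, key=lambda f: f[1])
    return [[addr for addr, _ in grp]
            for _, grp in groupby(ordered, key=lambda f: f[1])]

# Matches the illustrative config: X.X.X.X has priority 10, X.X.X.Y 20.
groups = ordered_attempts([("X.X.X.Y", 20), ("X.X.X.X", 10)])
# groups[0] is tried first; groups[1] only if all of groups[0] fail
```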
**Questions**
Does this interpretation of infra-cache behavior in a forwarder setup
seem accurate?
Are there existing configuration options that already address some of
these needs?
Would these ideas be appropriate for a feature request on GitHub?
Thank you for your feedback.