unbound timeouts to auth-zone records when server lost access to Internet

me at 16depo.xyz me at 16depo.xyz
Thu Sep 23 09:09:37 UTC 2021


Hello! We need to ensure that most of our devices keep working even 
after our offices lose access to the Internet. To that end we set up 
several auth-zones:

auth-zone:
	name: "printers.company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/printers.company.org"

auth-zone:
	name: "cctv.company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/cctv.company.org"

auth-zone:
	name: "company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/company.org"

When we have Internet access, unbound works as intended and resolves 
records recursively.

When our office loses access, we expect to still be able to resolve the 
records saved in the slave zones, but the results vary:

  * When the number of devices is small, unbound serves requests for
    domains in the auth-zones, and everything works fine.
  * When the number of devices is large, unbound stops serving all
    requests.


We tried to reproduce this problem in our lab with dnsperf. For our 
test setup we used the latest version of unbound (1.13.2) on FreeBSD 12 
(config attached):
dnsperf -s dns-lab.company.org -f inet6 -Q 100000 -d data -l 300 -q 200

- When 50% of the domains in the data file are from our auth-zones and 
50% are from elsewhere, unbound struggled but continued to serve our 
records.
- When the data file is composed of 10000 third-party domains and 300 of 
our domains, unbound lost its ability to serve any request, and every 
resolution attempt ended in a timeout.
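
For anyone who wants to reproduce the second case, a data file of that 
shape could be generated roughly as follows. This is only a sketch, not 
the script we used: the function name and the host names in the usage 
note are placeholders, and it relies only on the dnsperf input format of 
one "name type" pair per line.

```python
import random


def build_dnsperf_queries(own_names, third_party_names, qtype="A", seed=0):
    """Return dnsperf input lines ("<name> <qtype>"), both groups shuffled together.

    own_names:         names expected to be answered from the auth-zones
    third_party_names: names that need recursion to the Internet
    """
    lines = [f"{name} {qtype}" for name in list(own_names) + list(third_party_names)]
    # A fixed seed keeps the generated file identical between test runs.
    random.Random(seed).shuffle(lines)
    return lines
```

Calling it with 300 names under printers.company.org and 10000 unrelated 
names, then writing the returned lines to a file named "data" (one per 
line), gives an input for the dnsperf command above.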

When we traced the process with dtrace, we found that unbound spends 
most of its time in processQueryTargets -> iter_filter_unsuitable 
(flamegraph attached as svg).

This looks consistent with the documented timeout behavior (see 
https://www.nlnetlabs.nl/documentation/unbound/info-timeout/): unbound 
tries to reach the root servers to resolve records, but ultimately 
cannot without Internet access.
Also, "Many threads can have many packets outstanding to an IP address, 
all at the same time. The infra-cache data is shared between threads." 
Since all threads try to reach the same unreachable servers, and the 
number of requests from clients that lost Internet access grows 
manifold, unbound sometimes locks up in the umtxn state. Even when 
unbound is not locked in umtxn (we tried forked operation in the lab), 
it cannot serve locally saved data from the auth-zones.
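
A small probe along these lines can be used to check, during an outage, 
whether a name from an auth-zone is still answered or times out. It is a 
rough sketch using only the Python standard library; the function names, 
the 2-second timeout and the fixed query ID are assumptions made for 
illustration, and the server has to be given as an IP literal.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Encode a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server, name, timeout=2.0, port=53, query_id=0x1234):
    """Return the RCODE of the reply, or None if nothing usable arrives in time."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, query_id=query_id), (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:
            # Covers timeouts as well as unreachable/refused errors.
            return None
    # Ignore truncated replies and replies that do not match our query ID.
    if len(reply) < 12 or reply[:2] != struct.pack("!H", query_id):
        return None
    return reply[3] & 0x0F
```

For example, probe("::1", "host1.printers.company.org") (a placeholder 
name) returns the RCODE of the answer, where 0 means NOERROR, or None 
when the query times out.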

Can you confirm that this behavior is expected from unbound? How can we, 
by changing config or other means, provide our devices with working DNS 
when we lose Internet access?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: usr.svg
Type: image/svg+xml
Size: 778999 bytes
Desc: not available
URL: <http://lists.nlnetlabs.nl/pipermail/unbound-users/attachments/20210923/95506e3a/attachment-0001.svg>
-------------- next part --------------
remote-control:
	control-enable: "yes"
	control-interface: "127.0.0.1"
	control-interface: "::1"
	control-port: "1953"
	server-key-file: "/usr/local/etc/unbound/unbound_server.key"
	server-cert-file: "/usr/local/etc/unbound/unbound_server.pem"
	control-key-file: "/usr/local/etc/unbound/unbound_control.key"
	control-cert-file: "/usr/local/etc/unbound/unbound_control.pem"

server:
	verbosity: 1

	num-threads: 2

	interface: 0.0.0.0
	interface: ::0

	outgoing-port-permit: 1024-65535

	outgoing-port-avoid: 1-1023

	interface-automatic: yes

	outgoing-range: 1024
	num-queries-per-thread: 512

	outgoing-num-tcp: 64

	incoming-num-tcp: 128

	so-rcvbuf: 512k

	so-sndbuf: 512k

	edns-buffer-size: 1450

	msg-cache-size: 100m

	msg-cache-slabs: 2

	rrset-cache-slabs: 2

	cache-max-ttl: 86400

	infra-cache-slabs: 2

	access-control: 0.0.0.0/0 deny
	access-control: ::0/0 deny
	access-control: 127.0.0.0/8 allow
	access-control: ::1 allow
	access-control: ::ffff:127.0.0.1 allow
	access-control: 172.21.0.0/16 allow
	access-control: 10.0.0.0/8 allow

	key-cache-size: 64m

	key-cache-slabs: 2

	neg-cache-size: 4m


	local-zone: "10.in-addr.arpa." nodefault 
	local-zone: "24.172.in-addr.arpa." nodefault
	local-zone: "25.172.in-addr.arpa." nodefault
	local-zone: "26.172.in-addr.arpa." nodefault
	local-zone: "27.172.in-addr.arpa." nodefault
	local-zone: "28.172.in-addr.arpa." nodefault
	local-zone: "29.172.in-addr.arpa." nodefault
	local-zone: "30.172.in-addr.arpa." nodefault
	local-zone: "31.172.in-addr.arpa." nodefault

stub-zone:
	name: "24.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "25.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "26.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "27.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "28.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "29.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "30.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "31.172.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"
	stub-prime: "yes"

stub-zone:
	name: "10.in-addr.arpa."
	stub-addr: "2001:db8::1001"
	stub-addr: "2001:db8::1:1"
	stub-first: "no"


auth-zone:
	name: "printers.company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/printers.company.org"
      
auth-zone:
	name: "cctv.company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/cctv.company.org"
	
auth-zone:
	name: "company.org"
	master: "198.51.100.55"
	master: "2001:db8:::ffff"
	fallback-enabled: "yes"
	for-downstream: "no"
	for-upstream: "yes"
	zonefile: "/usr/local/etc/unbound/slave_zones/company.org"


