Unbound strange stub_zone behavior?

Andrew Forgue andrew at forgue.io
Tue Jul 14 18:48:46 UTC 2020



> On Jul 13, 2020, at 11:55 AM, Jan Komissar (jkomissa) <jkomissa at cisco.com> wrote:
> 
> Hi Andrew,
> 
> I believe that stub-zones will not work correctly for +norecurse (RD (recursion desired) flag unset) queries. Also, if your blah.example.com has delegations to subzones (even on the same server) and you use a non-standard port, you would need a stub-zone for each sub-zone.

After restarting unbound, non-recursive queries work fine for several days, until they don't (not sure why). My understanding is that a stub-zone presents as if it were local data, and the behavior you're describing sounds more like that of a forward zone.
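
For context, the setup under discussion can be sketched roughly as below. This is only an illustration, not the actual configuration: the port 5353 and the sub-zone name "sub.blah.example.com" are assumptions. The second stub-zone reflects Jan's point that a delegated sub-zone on a non-standard port would need its own entry.

```
server:
    # Unbound does not send queries to localhost by default;
    # this must be turned off for a stub on 127.0.0.1 to be used.
    do-not-query-localhost: no

stub-zone:
    name: "blah.example.com"
    # assumed: authoritative server on localhost, port 5353
    stub-addr: 127.0.0.1@5353

# hypothetical delegated sub-zone served from the same address and port
stub-zone:
    name: "sub.blah.example.com"
    stub-addr: 127.0.0.1@5353
```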

> I would follow Eric's advice to use an auth-zone, either as primary or secondary server (depending on your authoritative requirements).

Yeah, thanks Eric & Jan. I'll take a look at that. I'm not sure the "proxied" DNS server can send notifies, but it seems to be a good lead.
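
A minimal sketch of the auth-zone approach suggested by Eric and Jan, with Unbound acting as a secondary for the zone. The address and port (127.0.0.1@5353) and the zonefile name are assumptions for illustration, and the upstream server would have to permit zone transfers to Unbound. If the upstream cannot send notifies, Unbound should still pick up changes by checking the SOA on the zone's refresh timer, only less promptly.

```
auth-zone:
    name: "blah.example.com"
    # assumed: upstream authoritative server on localhost, port 5353;
    # newer releases also accept "primary:" for this option
    master: 127.0.0.1@5353
    # answer client queries (including non-recursive ones) from the zone data
    for-downstream: yes
    # also use the zone data when the resolver itself looks up names
    for-upstream: yes
    # do not fall back to normal resolution if the zone fails to load or expires
    fallback-enabled: no
    # hypothetical file name; keeps a local copy across restarts
    zonefile: "blah.example.com.zone"
```

Notifies from the configured masters are accepted by default; `allow-notify:` is only needed for additional notify sources.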

-Andrew

> Regards,
> 
> Jan.
> 
> On 7/12/20, 12:00 PM, "Unbound-users on behalf of Eric Luehrsen via Unbound-users" <unbound-users-bounces at lists.nlnetlabs.nl on behalf of unbound-users at lists.nlnetlabs.nl> wrote:
> 
>    On 7/11/20 11:49 AM, Andrew Forgue via Unbound-users wrote:
>> I have an unbound server that acts as a recursive resolver for clients and also acts as a target for fully delegated DNS (i.e. unbound is the NS record). For the fully-delegated domain it is a simple stub zone with an upstream of localhost on a different port.  Let's call it "blah.example.com".
>> 
>> Occasionally, unbound (has happened on versions 1.10.1 and 1.7.3) will start responding to non-recursive queries with the list of root servers instead of a response from the stub-zone.  Clients that set the `rd` (recursion desired) flag are fine and continue to receive correct records from unbound (using the stub server).  All records in seemingly all stub zones have this behavior simultaneously.
>> 
>> I don't know what triggers it, but a full restart of unbound is the only thing that fixes it.  I've tried flushing the cache, flushing infra, and everything else; nothing seems to help. I've seen only 2 things that may point to the issue.
>> 
>> - With verbosity turned up to 10, there's an entry produced in strace (but not in the actual log - maybe a misconfig): "unbound[2213085:5] debug: answer from the cache failed"
>> 
>> - stracing the "broken" unbound process shows a very tight loop of recvmsg() (of the request) and sendmsg() (with the root servers) with no syscalls in between.
>> 
>> Again, using dig with +recurse works all the time, even when unbound gets into this state.  So it seems like an unbound bug / cache corruption or something?
> 
>    If it is a bug, you may want to try a workaround while waiting for a 
>    fix. You could try "auth-zone:" instead of "stub-zone:" or as a 
>    companion to "stub-zone:". You may need to give the authoritative server 
>    permission for a wholesale zone transfer to the Unbound instance. This 
>    may help avoid some undiscovered bug in piecemeal zone recursion.
>    - Eric
> 


