Would unbound be a good candidate to replace systemd-resolved on the desktop?

Fri May 27 23:13:22 UTC 2022

On 5/27/22 16:40, Paul Wouters wrote:
> On Fri, 27 May 2022, Petr Menšík wrote:
>
>> They obviously dislike DNSSEC and consider it to break too many
>> things. Often their first advice is to try turning DNSSEC off. Worked?
>> Okay, close the issue.
>
> So that goes to my point of letting them ask "what do we need to do to
> be considered for a system default resolver", and one point would be
> "do not disable DNSSEC". If they keep thinking DNSSEC is not core to
> the DNS protocol, they have no business being the default resolver.
I agree. But they have a well-integrated implementation, which solves
*some* users' problems. Of course it breaks others, but that is a
different story.
>
>> Even though I disagree with them on quite a few things, they have a
>> working resolver even in weird network situations. For example,
>> routerboard wifi routers do not support EDNS0 or DNSSEC.
>
> EDNS is an RFC from 1999. In 2019, there was a global DNS "flag day" to
> remove all workarounds dealing with badly or not implemented EDNS0:
>
> https://kb.isc.org/docs/dns-flag-day-will-it-affect-you
>
> It seems like systemd-resolved would be the single DNS resolver that
> still feels it needs to hack around broken EDNS0 support?

This does not concern only systemd-resolved. They have a ticket about
the broken DBahn train network. I was talking about the MikroTik
system, which provides decent routers with up-to-date firmware [1]. They
have many good features, even DNS over TLS support. I have filed a
ticket, but unless customers demand DNSSEC support, they won't fix
that.

So resolvers without DNSSEC or even EDNS0 exist. We should not create
hacks for their invalid responses. But if they ignore EDNS0 or the
DNSSEC OK bit, that is well specified. We have to allow that to work
somehow, if DNSSEC validation is to become enabled by default. It might
emit a warning in a status icon or something like that, but it should
connect and resolve. I would consider that a well-implemented
DNSSEC=allow-downgrade, which resolved simply gets wrong.

>
>> If I have a laptop and connect
>> to such network, I would like to receive working resolution.
>
> See the work in the ADD group. Basically, you will move anyway towards
> not giving coffeeshops and hotels your DNS traffic, as that is a privacy
> issue, and your system will start using a Trusted Remote Resolver (TRR).
> Mozilla is already doing this for some of its users, eg in the US. That
> reduces your use case to doing proper captive portal in cases of broken
> DNS. And that is another battle in the opensource software stack, as
> everyone and no one takes responsibility for captive portals. systemd?
> NetworkManager? Gnome? Firefox? Everyone mucks a little and not enough.
>
>> That is not
>> as straightforward with unbound. With its default configuration it
>> would result in servfail and not working connectivity.
>
>> I would recommend
>> anyone to request such network provider to fix the DNS cache/forwarder,
>> but such networks are not very uncommon. If unbound should be
>> preinstalled also on Laptops moving from network to network, this has to
>> be solved somehow.
>
> The proper engineered solution is an "upstream" container that is only
> used for captive portaling and will not need DNSSEC and shouldn't need
> EDNS0, that uses a contained cookieless browser (so the hotspot doesn't
> get your facebook cookies) and once internet connectivity has been
> enabled, you tell NM the network is up, and NM tells all the apps, and
> those use their own ADD/TRR discovered DNS servers that will work with
> EDNS0 and DNSSEC. Those without TRR could use DoH or DoT to well known
> DNS providers (1.1.1.1, 8.8.8.8 or 9.9.9.9)

It is not so simple. There might be some local services which you still
want reachable, for example a local network printer, a hotel reception
page or a train page with a services menu. But in the recent RFC 8801
[2] I found a reference to the aging RFC 6731 [3]. Both should be
implemented in NM. Then it could provide unbound with a list of local
resource domains for each connection.

I think the containerized solution can be achieved by setting max TTL
to 1 until the captive portal dashboard passes. Once the dashboard login
passes, the max TTL restriction can be removed and things continue just
fine.
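
As a rough sketch of that idea, the Python helper below only builds the
unbound-control command lines, it does not run them. The function name
and the 86400 second "normal" value (unbound's documented default for
cache-max-ttl) are my assumptions. I have also not verified that
cache-max-ttl really takes effect at runtime through set_option, since
the manual warns that not every option does.

```python
def captive_portal_commands(portal_passed, normal_max_ttl=86400):
    """Return unbound-control argv lists for the current portal state.

    While the portal has not been passed, cached answers are capped at
    one second, so hijacked portal answers cannot linger in the cache.
    """
    ttl = normal_max_ttl if portal_passed else 1
    return [["unbound-control", "set_option", "cache-max-ttl:", str(ttl)]]
```

Whatever detects the portal (NM, for example) could pass each returned
list to subprocess.run() when the state changes.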

But right, spawning another cache configured for just the tested
forwarders would be cleaner. Then a browser directed to use that NS
would process the dashboard. But wait, it could then be directed to the
tested NS itself, without an intermediate cache, right?

I think it would not be bad if you could assign a tag to selected
forwarders. Then, when you remove the forwarders with that tag, a
simple action could clear all cached records obtained from those
forwarders. That would be useful for clearing hosts when you disconnect
from a LAN or VPN and those hosts become unreachable, because they
work only on that connection. RFC 8801 mentions such a case.
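
To show what I mean by tagging, here is a toy Python cache. It is purely
hypothetical, unbound has no such feature today as far as I know, and
the class and method names are made up for this example. Each cached
answer remembers the tag of the forwarder it came from, so one call can
drop everything learned over a connection.

```python
class TaggedCache:
    """Toy cache that records which forwarder tag each answer came from."""

    def __init__(self):
        self._entries = {}  # (name, rrtype) -> (tag, list of rdata)

    def store(self, name, rrtype, rdata, tag):
        self._entries[(name.lower(), rrtype)] = (tag, list(rdata))

    def lookup(self, name, rrtype):
        entry = self._entries.get((name.lower(), rrtype))
        return entry[1] if entry else None

    def flush_tag(self, tag):
        """Drop every entry learned via forwarders with `tag`; return count."""
        stale = [key for key, (t, _) in self._entries.items() if t == tag]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

On VPN disconnect the caller would remove the forwarders and then call
flush_tag("vpn"), leaving answers from other connections untouched.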

>
>> I would recommend ability to set DNSSEC validation per connection in NM.
>> But anyway, this should be solved automatically if unbound would be
>> installed on every desktop.
>
> That should not be needed if you engineer the above suggested solution.

This seems like a well-integrated dnssec-trigger. I would like it to be
possible, but that is another level. The basic level would be a working
connection over the servers the network offered. That might be mandatory
in some organizations. For example, in the Red Hat network DNS queries
are monitored for communication with botnets or similar potential
threats. I want the maximum security that the network I use can provide.

Using alternative servers is out of scope for that.

>
>>> It works for bsd as default resolver. It has a dedicated team and
>>> budget. Its developers attend IETF and RIPE meetings and are
>>> closely involved with operational issues and protocol development.
>>> None of this is true for systemd-resolved.
>> Well, Red Hat pays many of the systemd developers. It is not in the
>> same situation as dnsmasq, for example. Its developer base is not small.
>> But I admit they do not follow DNS best practices very often. It should
>> be noted that there are also other name resolution methods on Linux
>> workstation. It is not just DNS, but also mDNS, LLMNR or various nss
>> plugins. I don't know whether there is any work group working with those
>> methods as well.
>
> As long as we have a few dozen open systemd-resolved bugs, I am
> concluding no systemd people are working on this part of systemd.
That depends on how important those bugs are considered by them. Some
issues are fixed overnight. Some are open for years.
>> It may not be hard, but there is no existing implementation for that I
>> would know about. For example OpenVPN announced they would like
>> dbus-only configuration of those forwarders.
>
> I'm sure unbound wouldn't mind a varlink API that can accept a wrapped
> version of "unbound-control forward" commands. I think they should stay
> away from the aging dbus.
Okay, good point. Whatever is usable by third parties is acceptable.
>> I have not read all ADD drafts. But do they solve how to direct name
>> queries to multiple resolvers when I am connected to multiple networks?
>
> Yes. Each network can advertise the "local domains" they deem they are
> authoritative for so you can reconfigure it. Note that the
> systemd-resolved solution of "multiple networks" is "throw all queries
> to all networks", which in a meeting with systemd people 10 years ago,
> I told them is an unacceptable privacy issue. It was discarded as not
> a use case they considered.
I think they have already abandoned that crazy idea. I know you have
mostly had conflicts with them, but they are getting slightly better.
But something like this is mentioned in RFC 6731. If multiple
connections provide the same name, you should ask all servers with
equivalent priority. I don't think this is possible with a normal DNS
resolver, because it considers the first NXDOMAIN a final answer.
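
A small Python sketch of the combining rule I have in mind, for servers
that have equal priority for one name only, not for sending every query
everywhere. The policy is my own assumption and not text from RFC 6731:
any positive answer wins, NXDOMAIN is final only when every server says
so, and a mix with failures is treated as SERVFAIL.

```python
NOERROR, SERVFAIL, NXDOMAIN = 0, 2, 3  # standard DNS RCODE values


def combine_answers(responses):
    """Combine (rcode, records) pairs from equal-priority servers."""
    if not responses:
        return SERVFAIL, []
    rcodes = set()
    for rcode, records in responses:
        if rcode == NOERROR and records:
            return NOERROR, list(records)  # a positive answer wins
        rcodes.add(rcode)
    if NOERROR in rcodes:
        return NOERROR, []    # name exists somewhere, but no such data
    if rcodes == {NXDOMAIN}:
        return NXDOMAIN, []   # every server agreed the name is absent
    return SERVFAIL, []       # some server failed, so we cannot be sure
```

With this rule a name known only to the VPN server still resolves, even
when the wifi server answers NXDOMAIN first.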
>
>> There are attempts to solve those problems right now somehow, even
>> before final standard is made. Example might be hack with ~ prefix to
>> search domains in DHCP. Then they are just added to local domains set
>> for a connection, but not into /etc/resolv.conf search clause.
>
> The mixing of search domains and DHCP domains has also been problematic,
> and another case where people only look at their one use case. Red Hat
> wanted to have their local domains "just work" when people connect to
> their wifi, and so they cause modification of the search path, which is
> a security issue. When I am at starbucks, I don't want my unqualified
> queries for "mail" to go to "mail.starbucks.com". In general, my search
> domains are my own, and should not be modified by the network at all.
> That is, search domain for unqualified queries should not be taken from
> the network ever. If you want to set it for your users, provision them
> with some config/package that adds "redhat.com" to the search domains.

Here I think it should be possible to ignore local domains. Whether they
are provided by RDNSS or search, the user should be able to override
them or ignore them all. That would be an important NM feature.

If you don't want mail to reach starbucks.com, just use the full name on
their network. But I would like to be able to use "ping router" at home
or at work, if those hosts have the same name. When search is used on a
trusted connection, I think that is okay. Again, this should be
configurable per connection in NM.

I think search domains were used because it was already possible to
configure them in existing software. They are visible in NM and DHCP. It
also makes sense to have them set in most cases. RFC 6731 would be
better, but that is not yet implemented on Fedora AFAIK.

>
>> I am for ability to validate DNSSEC from every machine possible. However
>> some resolvers offered to network users do not allow it. Not only
>> resolved with DNSSEC=no is breaking it. If that is not important for a
>> network operator, okay. The user should have an easy way to be
>> notified that this network does not offer servers with working
>> validation. But it should work even in that case.
>
> See above if you want to really properly engineer yourself out of this
> problem. But if you are connecting to a network that does not answer
> DNS queries with the DO bit set in 2022, then that network gear has not
> been updated for so long, it should be considered under malicious
> control, and you should REALLY not let it force you to use unprotected
> DNS.
Unfortunately that is not correct. Even one of our internal RH resolvers
has it disabled for a debatable reason. And MikroTik's latest system
still does not support even EDNS. I expect more such vendors exist. I
think it should be visible to the user that this network has some
issues, but it should not prevent a working connection.
>
> These arguments are all from individual users having one individual use
> case. They are not engineering for the real world.

I know of multiple cases where this happens. Denying them would not lead
to a better alternative to systemd-resolved.

>> Interesting. Not sure about that. The only user of varlink that I know
>> is resolved. I guess any generic enough API would work. But dbus
>> integration on workstation would still make sense.
>
> I had hoped varlink would have gained more steam now to replace dbus.
> The libreswan team is planning to add support for it.
Thanks for the information. I will have to look at what it is and how
it works. I have never seen code working with it.
>>> This is actually a bug. Systemd-resolved changes its answer based on
>>> whether gethostbyname() or getaddrinfo() is used. One of the
>>> reasons I remove this from nsswitch.conf in all my systems.
>> Not sure what you mean. Can you explain?
>
> If you change resolv.conf to a real server instead of 127.0.0.53, and
> leave "resolve" in nsswitch.conf, then glibc will still intercept
> gethostbyname() calls and use systemd-resolved to resolve those. While
> getaddrinfo() complies to POSIX still and MUST use the nameserver in
> /etc/resolv.conf, so glibc honours that. You get a frankenresolver based
> on what lowlevel DNS calls apps or libraries are using.

No, that is not true. Both gethostbyname() and getaddrinfo() go through
nsswitch on Linux (or BSD); DNS itself is just one plugin there,
libnss_dns.so. Other plugins can provide addresses even before ANY DNS
server is contacted. One possibility is /etc/hosts, another might be
the mdns or libvirt plugin.

Sure, that can lead to a situation where "getent hosts localhost" gets
an address, but "dig localhost" doesn't. A minor advantage of the
resolve plugin is that it can also cache /etc/hosts and make lookups
from it faster. A pure DNS resolver usually ignores that file. dnsmasq
reads that file too. I think an additional plugin monitoring that file
and filling local data from it would also help with unbound on
workstations. What do you think?
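
To sketch the "filling local data" part: the Python function below
turns hosts-file text into unbound local-data and local-data-ptr lines.
Monitoring the file for changes is left out, and the function name is
mine. The output could be written to an include file or fed to a running
unbound, but I have not tested how it interacts with unbound's built-in
localhost zone.

```python
import ipaddress


def hosts_to_local_data(hosts_text):
    """Convert /etc/hosts style text into unbound.conf local-data lines."""
    lines = []
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # strip comments
        if len(fields) < 2:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not an address, ignore the line
        rrtype = "A" if addr.version == 4 else "AAAA"
        for name in fields[1:]:
            lines.append(f'local-data: "{name}. {rrtype} {addr}"')
        # The first name on the line is the canonical one for reverse lookup.
        lines.append(f'local-data-ptr: "{addr} {fields[1]}."')
    return lines
```

For a line "192.0.2.7 nas nas.home" this yields two forward records and
one PTR record pointing back to "nas.".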

>> I would like to move LLMNR and mDNS resolution to nss plugins only. But
>> there is no simple to use asynchronous API, which can request names
>> including nss plugins from application. If we insist on DNS, we end in
>> the same mess resolved now has. We have getdns library, which is nice.
>> But DNS specific. I think we miss a simple library or service, which can
>> provide getaddrinfo() library calls in non-blocking way and easily
>> integrated into desktop application.
>
> Maybe others know more about this and can point you somewhere.
>
>> Resolved has dbus resolve1 API [3], which seems too specific for
>> resolved. I don't like the API itself, but provides a way to
>> asynchronously resolve address without extra pain.
>
> I am amused at the sentence containing "dbus" and "without extra pain".
> I tried using the various example codes and none of it worked when I
> was playing with it.

I mean desktop people might already know how to work with D-Bus well.
For them that should be easy. It is not easy for me. But the thing is,
otherwise you have to implement some thread stuff: wait for the
blocking getaddrinfo(), then send an event to the event loop.

I think it should be simple to have a single thread and event loop,
where I say connect to "example.org" and tell me when the socket is
ready or when that failed. I have found the nice QHostInfo class [4] for
Qt, but I doubt there is something similar for GTK. I think even console
applications using glib might need asynchronous resolution from time to
time. I am not sure how well getaddrinfo_a is supported, but it exists
only on Linux, not on *BSD.
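
For comparison, this is roughly the shape of API I would like to have
in C. Python's asyncio offers it by doing exactly the thread trick for
you: loop.getaddrinfo() runs the blocking getaddrinfo() in a worker
thread and hands the result back to the event loop, so nss plugins are
still consulted. The resolve() helper is only my illustration.

```python
import asyncio
import socket


async def resolve(host, port):
    """Resolve host without blocking the event loop; return address strings."""
    loop = asyncio.get_running_loop()
    # Runs socket.getaddrinfo() in the loop's default thread pool executor.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in infos]


async def main():
    # Other tasks keep running while the lookup is in flight.
    addresses = await resolve("127.0.0.1", 80)
    print(addresses)


if __name__ == "__main__":
    asyncio.run(main())
```

The application code stays single-threaded from its own point of view,
which is the part I miss in a plain C or glib program.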

>>> Very very few people need mdns or LLMNR, especially on servers or
>>> containers. I like my enterprise DNS not optimizing for printer
>>> discovery or a developers need to reach a fake .box TLD to login to
>>> their router.
>>
>> I understand servers rarely need LLMNR. It breaks too many things in
>> current systemd-resolved implementation. Filed [1] recently, but I
>> doubt they would change it enough.
>
> Thanks for filing that. It is indeed yet another bug where the RFC is
> not followed, and things break :(
I am not sure; the LLMNR RFC might in fact be followed. But it seems even
Microsoft does not follow it and restricts it to just a few types.
>
>> Anyway, if I configure root forwarders to some local server and it can
>
> another nice feature of unbound is support for local root, RFC 8806.
> Which also prevents a lot of junk queries from leaking on the internet
> and increases privacy for the users on typos not leaking to root servers
> via ISP.
Sometimes you need to reach local resources via a local forwarder. For
example, the tplinkmodem.net name is used to configure some TP-Link
routers. Once we have a way to specify just a few local resources on a
network, okay. But until then we need local name overrides in
per-connection configuration.
>
>> respond to 'machine.example.' query with unsigned content, it is quite
>> lame if my unbound turns that response to NXDOMAIN, because example. is
>> proven non-existent at root. That server obviously knows it. If that
>> domain is immediately above root, there is no danger just accepting this
>> with NTA exception IMO. Many existing networks misuse some kind of
>> private TLD. If we want unbound with DNSSEC enabled to be a default, it
>> should not break on every such network, where it worked for years
>> before. Of course they should be told to get and use own domain. There
>> is not a small collection, which might be accepted by default [2].
>
> That is something that can again be easily provisioned using an
> enterprise package (or home user config) that adds such domains to an
> unbound include file in /etc/unbound.d/ which is why I created that
> directory in the fedora unbound package to allow files to be dropped in
> there without needing to change the core /etc/unbound/unbound.conf.
It doesn't work that way if you have a mobile device. When I am at work,
I want only the local resources of that network. At home I want
different ones. There has to be a way for the network to provide them.
RFC 8801 seems able to provide even TLS-protected network
identification. Until that is common, I think a per-connection override
would work. But I don't want it hardcoded in a single system.
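
A per-connection override could be driven from connection up and down
events, roughly as in the Python sketch below. It only builds
unbound-control command lines using forward_add, forward_remove and
flush_zone. The "+i" flag also adds or removes domain-insecure for the
zone, which covers the private TLD case. The function names and the idea
that NM would call them are my assumptions.

```python
def connection_up(zones, servers, insecure=True):
    """Commands to forward each local zone to the connection's servers."""
    flag = ["+i"] if insecure else []
    return [
        ["unbound-control", "forward_add", *flag, zone, *servers]
        for zone in zones
    ]


def connection_down(zones, insecure=True):
    """Commands to drop the forwards and flush what was cached from them."""
    flag = ["+i"] if insecure else []
    commands = []
    for zone in zones:
        commands.append(["unbound-control", "forward_remove", *flag, zone])
        commands.append(["unbound-control", "flush_zone", zone])
    return commands
```

The zone list would come from the connection profile, so the work and
home networks each bring their own names and nothing is hardcoded
system-wide.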
>
>> Well, something can be changed via Merge Requests to systemd. But they
>> have own ideas how it is correct.
>
> Based on my comments and RFC quoting in their github issue tracker, I
> wouldn't invest my time in fixing code to only see it rejected as not
> their use case. Sorry :(
Understood. That is why I started here. I would rather improve other
products than have flame wars with Lennart.
>
>> My efforts can change something. But fact is systemd team has more
>> developers and I am alone again in our Infrastructure Services team on
>> DNS related products for RHEL.
>
> That mirrors my situation there, except that I had the weight of a DNS
> RFC author as well, and still that did not help :(
>
>> In fact systemd maintainer asked me for a
>> requirements summary on modern system. Any idea for a good mailing list
>> for that question? Or working group? Except what we already discussed, I
>> think easy registration of local hostnames would be useful. As a
>> replacement for libvirt-nss plugin, which provides names only via
>> getaddrinfo calls. Unless I have something equivalent or better than
>> systemd-resolved, I am afraid the resolution implementation in next
>> major RHEL release would become resolved.
>
> As long as systemd-resolved remains split off in its own package we can
> rpm -e, I can live with that. For most users, the browser's DNS is what
> really matters, and there firefox is going full TRR anyway, so whatever
> the OS will do in the future will not really matter anymore.
>
> Paul
>
I think TRR belongs not in the browser, but in the system. I want to be
protected by GDPR in the EU anyway. Unbound can use DNS over TLS even
without stubby. That is great. But local resources have to be solved
somehow. An application-specific DNS resolver will never be able to
handle that IMO. How to make it configurable by non-experts as well is
a different task.

I would like to propose more DNSSEC features. I like SSHFP, for example.
But that requires a user-friendly validator. dnssec-trigger is quite
annoying when your connection breaks for some reason. I hope
sufficiently secure DNS will become possible on every system.

Btw. the original NM unbound plugin, which called dnssec-trigger, was
removed recently from NM [5]. Time to start a real one?

Cheers,
Petr

1. https://mikrotik.com/products/group/wireless-for-home-and-office
2. https://www.rfc-editor.org/rfc/rfc8801.html
3. https://www.rfc-editor.org/rfc/rfc6731.html
4. https://doc.qt.io/archives/qt-4.8/qhostinfo.html
5. https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/5da17c689be5e66ea2f63dea6f1846625e652998

-- 
Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemensik at redhat.com
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB