False failure in capsforid fallback due to additional rrset ordering
Wouter Wijngaards
wouter at nlnetlabs.nl
Mon Feb 4 15:24:11 UTC 2019
Hi Alex,
On 1/31/19 2:14 AM, Alex Zorin via Unbound-users wrote:
> Hi Wouter,
>
> Some background: The .pl ccTLD currently has some issues with 0x20[1] that causes Unbound to go into capsforid fallback when it encounters one or two nameservers used by pl.
>
> That's not a problem in itself, but I believe Unbound may be occasionally triggering a false positive failure during capsforid fallback.
Thanks for the detailed report. Yes, you are right, it should get
sorted. I have made a fix, also in the code repository that sorts the
rrsets before comparison.
I hope this can solve your problem.
Best regards, Wouter
Index: iterator/iter_utils.c
===================================================================
--- iterator/iter_utils.c (revision 5099)
+++ iterator/iter_utils.c (working copy)
@@ -882,10 +882,34 @@
return 1;
}
+/** compare rrsets and sort canonically. Compares rrset name, type, class.
+ */
+static int
+rrset_canonical_sort_cmp(const void* x, const void* y)
+{
+ struct ub_packed_rrset_key* rrx = (struct ub_packed_rrset_key*)x;
+ struct ub_packed_rrset_key* rry = (struct ub_packed_rrset_key*)y;
+ int r = dname_canonical_compare(rrx->rk.dname, rry->rk.dname);
+ if(r != 0)
+ return r;
+ if(rrx->rk.type != rry->rk.type) {
+ if(ntohs(rrx->rk.type) > ntohs(rry->rk.type))
+ return 1;
+ else return -1;
+ }
+ if(rrx->rk.rrset_class != rry->rk.rrset_class) {
+ if(ntohs(rrx->rk.rrset_class) > ntohs(rry->rk.rrset_class))
+ return 1;
+ else return -1;
+ }
+ return 0;
+}
+
int
reply_equal(struct reply_info* p, struct reply_info* q, struct
regional* region)
{
size_t i;
+ struct ub_packed_rrset_key** sorted_p, **sorted_q;
if(p->flags != q->flags ||
p->qdcount != q->qdcount ||
/* do not check TTL, this may differ */
@@ -899,16 +923,40 @@
p->ar_numrrsets != q->ar_numrrsets ||
p->rrset_count != q->rrset_count)
return 0;
+ /* sort the rrsets in the authority and additional sections before
+ * compare, the query and answer sections are ordered in the sequence
+ * they should have (eg. one after the other for aliases). */
+ sorted_p = (struct ub_packed_rrset_key**)regional_alloc_init(
+ region, p->rrsets, sizeof(*sorted_p)*p->rrset_count);
+ if(!sorted_p) return 0;
+ log_assert(p->an_numrrsets + p->ns_numrrsets + p->ar_numrrsets <=
+ p->rrset_count);
+ qsort(sorted_p + p->an_numrrsets, p->ns_numrrsets,
+ sizeof(*sorted_p), rrset_canonical_sort_cmp);
+ qsort(sorted_p + p->an_numrrsets + p->ns_numrrsets, p->ar_numrrsets,
+ sizeof(*sorted_p), rrset_canonical_sort_cmp);
+
+ sorted_q = (struct ub_packed_rrset_key**)regional_alloc_init(
+ region, q->rrsets, sizeof(*sorted_q)*q->rrset_count);
+ if(!sorted_q) return 0;
+ log_assert(q->an_numrrsets + q->ns_numrrsets + q->ar_numrrsets <=
+ q->rrset_count);
+ qsort(sorted_q + q->an_numrrsets, q->ns_numrrsets,
+ sizeof(*sorted_q), rrset_canonical_sort_cmp);
+ qsort(sorted_q + q->an_numrrsets + q->ns_numrrsets, q->ar_numrrsets,
+ sizeof(*sorted_q), rrset_canonical_sort_cmp);
+
+ /* compare the rrsets */
for(i=0; i<p->rrset_count; i++) {
- if(!rrset_equal(p->rrsets[i], q->rrsets[i])) {
- if(!rrset_canonical_equal(region, p->rrsets[i],
- q->rrsets[i])) {
+ if(!rrset_equal(sorted_p[i], sorted_q[i])) {
+ if(!rrset_canonical_equal(region, sorted_p[i],
+ sorted_q[i])) {
regional_free_all(region);
return 0;
}
- regional_free_all(region);
}
}
+ regional_free_all(region);
return 1;
}
>
> I have attached:
> - 0001-capsforid-verbose-logging.patch: a verbose printf-debugging patch affecting comparison routines used during capsforid fallback. (Against Unbound 1.9.0rc1).
> - unbound-host.log.gz: the output of the unbound-host invocation below that triggers this failure, with the above patch.
> - pl.pcap: the packet capture that correlates directly to the unbound-host invocation.
> - unbound.conf: the conf file used for the unbound-host invocation.
>
> This is the command that triggers the issue. Please keep in mind you may need to try as many as 10+ times because due to the unlikely selection of the pl nameserver(s) with the 0x20 issue that triggers the fallback in the first place:
>
> $ unbound-host pie3aq.pl -t caa -d -vvv -C unbound.conf
>
> My hypothesis for what is happening is that Unbound might not performing a canonical sort of the additional RRs, which then triggers the capsforid fail:
>
> [1548894424] libunbound[5931:0] info: rrset 3 not equal
> [1548894424] libunbound[5931:0] info: canonical basic compare, dname_len: 16 vs 16
> [1548894424] libunbound[5931:0] info: canonical basic compare, flags: 0 vs 0
> [1548894424] libunbound[5931:0] info: canonical basic compare, type: 256 vs 256
> [1548894424] libunbound[5931:0] info: canonical basic compare, rrset_class: 256 vs 256
> [1548894424] libunbound[5931:0] info: canonical basic compare, ttl: 0 vs 0
> [1548894424] libunbound[5931:0] info: canonical basic compare, count: 1 vs 1
> [1548894424] libunbound[5931:0] info: canonical basic compare, rrsig_count: 0 vs 0
> [1548894424] libunbound[5931:0] info: canonical basic compare, trust: 1 vs 1
> [1548894424] libunbound[5931:0] info: canonical basic compare, security: 0 vs 0
> [1548894424] libunbound[5931:0] info: d vs d
> [1548894424] libunbound[5931:0] info: n vs n
> [1548894424] libunbound[5931:0] info: s vs s
> [1548894424] libunbound[5931:0] info: 2 vs 1
> [1548894424] libunbound[5931:0] info: rrset 3 not canonical equal
> [1548894424] libunbound[5931:0] info: Capsforid fallback: getting different replies, failed
>
> In the above, we see that the label "dns1" being compared to "dns2" is what ultimately triggers the capsforid failure.
>
> In the pcap, I believe this refers to a comparison of DNS response 0x00006547 with DNS response 0x000006bd, where in the latter, the ordering of the additional records is:
>
> dns2.pie3aq.pl: type A, class IN, addr 5.135.61.93
> dns1.pie3aq.pl: type A, class IN, addr 51.75.34.178
>
> and in the former, dns1 comes first.
>
> My very shoddy reading of Unbound's code is that these additional records should first undergo a canonical sort, in which case these messages should be matching under capsforid.
>
> I might be barking entirely up the wrong tree, but your guidance is very much appreciated.
>
> Thanks,
> Alex
>
> --
>
> 1. https://lists.dns-oarc.net/pipermail/dns-operations/2019-January/018359.html
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://lists.nlnetlabs.nl/pipermail/unbound-users/attachments/20190204/50d31508/attachment.bin>
More information about the Unbound-users
mailing list