False failure in capsforid fallback due to additional rrset ordering

Wouter Wijngaards wouter at nlnetlabs.nl
Mon Feb 4 15:24:11 UTC 2019


Hi Alex,

On 1/31/19 2:14 AM, Alex Zorin via Unbound-users wrote:
> Hi Wouter,
> 
> Some background: The .pl ccTLD currently has some issues with 0x20[1] that causes Unbound to go into capsforid fallback when it encounters one or two nameservers used by pl.
> 
> That's not a problem in itself, but I believe Unbound may be occasionally triggering a false positive failure during capsforid fallback. 

Thanks for the detailed report.  Yes, you are right, it should get
sorted.  I have made a fix, also in the code repository that sorts the
rrsets before comparison.

I hope this can solve your problem.

Best regards, Wouter


Index: iterator/iter_utils.c
===================================================================
--- iterator/iter_utils.c	(revision 5099)
+++ iterator/iter_utils.c	(working copy)
@@ -882,10 +882,34 @@
 	return 1;
 }

+/** compare rrsets and sort canonically.  Compares rrset name, type, class.
+ */
+static int
+rrset_canonical_sort_cmp(const void* x, const void* y)
+{
+	struct ub_packed_rrset_key* rrx = (struct ub_packed_rrset_key*)x;
+	struct ub_packed_rrset_key* rry = (struct ub_packed_rrset_key*)y;
+	int r = dname_canonical_compare(rrx->rk.dname, rry->rk.dname);
+	if(r != 0)
+		return r;
+	if(rrx->rk.type != rry->rk.type) {
+		if(ntohs(rrx->rk.type) > ntohs(rry->rk.type))
+			return 1;
+		else	return -1;
+	}
+	if(rrx->rk.rrset_class != rry->rk.rrset_class) {
+		if(ntohs(rrx->rk.rrset_class) > ntohs(rry->rk.rrset_class))
+			return 1;
+		else	return -1;
+	}
+	return 0;
+}
+
 int
 reply_equal(struct reply_info* p, struct reply_info* q, struct
regional* region)
 {
 	size_t i;
+	struct ub_packed_rrset_key** sorted_p, **sorted_q;
 	if(p->flags != q->flags ||
 		p->qdcount != q->qdcount ||
 		/* do not check TTL, this may differ */
@@ -899,16 +923,40 @@
 		p->ar_numrrsets != q->ar_numrrsets ||
 		p->rrset_count != q->rrset_count)
 		return 0;
+	/* sort the rrsets in the authority and additional sections before
+	 * compare, the query and answer sections are ordered in the sequence
+	 * they should have (eg. one after the other for aliases). */
+	sorted_p = (struct ub_packed_rrset_key**)regional_alloc_init(
+		region, p->rrsets, sizeof(*sorted_p)*p->rrset_count);
+	if(!sorted_p) return 0;
+	log_assert(p->an_numrrsets + p->ns_numrrsets + p->ar_numrrsets <=
+		p->rrset_count);
+	qsort(sorted_p + p->an_numrrsets, p->ns_numrrsets,
+		sizeof(*sorted_p), rrset_canonical_sort_cmp);
+	qsort(sorted_p + p->an_numrrsets + p->ns_numrrsets, p->ar_numrrsets,
+		sizeof(*sorted_p), rrset_canonical_sort_cmp);
+
+	sorted_q = (struct ub_packed_rrset_key**)regional_alloc_init(
+		region, q->rrsets, sizeof(*sorted_q)*q->rrset_count);
+	if(!sorted_q) return 0;
+	log_assert(q->an_numrrsets + q->ns_numrrsets + q->ar_numrrsets <=
+		q->rrset_count);
+	qsort(sorted_q + q->an_numrrsets, q->ns_numrrsets,
+		sizeof(*sorted_q), rrset_canonical_sort_cmp);
+	qsort(sorted_q + q->an_numrrsets + q->ns_numrrsets, q->ar_numrrsets,
+		sizeof(*sorted_q), rrset_canonical_sort_cmp);
+
+	/* compare the rrsets */
 	for(i=0; i<p->rrset_count; i++) {
-		if(!rrset_equal(p->rrsets[i], q->rrsets[i])) {
-			if(!rrset_canonical_equal(region, p->rrsets[i],
-				q->rrsets[i])) {
+		if(!rrset_equal(sorted_p[i], sorted_q[i])) {
+			if(!rrset_canonical_equal(region, sorted_p[i],
+				sorted_q[i])) {
 				regional_free_all(region);
 				return 0;
 			}
-			regional_free_all(region);
 		}
 	}
+	regional_free_all(region);
 	return 1;
 }



> 
> I have attached:
> - 0001-capsforid-verbose-logging.patch: a verbose printf-debugging patch affecting comparison routines used during capsforid fallback. (Against Unbound 1.9.0rc1).
> - unbound-host.log.gz: the output of the unbound-host invocation below that triggers this failure, with the above patch.
> - pl.pcap: the packet capture that correlates directly to the unbound-host invocation.
> - unbound.conf: the conf file used for the unbound-host invocation.
> 
> This is the command that triggers the issue. Please keep in mind you may need to try as many as 10+ times because due to the unlikely selection of the pl nameserver(s) with the 0x20 issue that triggers the fallback in the first place:
> 
>     $ unbound-host pie3aq.pl -t caa -d -vvv -C unbound.conf
> 
> My hypothesis for what is happening is that Unbound might not performing a canonical sort of the additional RRs, which then triggers the capsforid fail:
> 
>   [1548894424] libunbound[5931:0] info: rrset 3 not equal
>   [1548894424] libunbound[5931:0] info: canonical basic compare, dname_len: 16 vs 16
>   [1548894424] libunbound[5931:0] info: canonical basic compare, flags: 0 vs 0
>   [1548894424] libunbound[5931:0] info: canonical basic compare, type: 256 vs 256
>   [1548894424] libunbound[5931:0] info: canonical basic compare, rrset_class: 256 vs 256
>   [1548894424] libunbound[5931:0] info: canonical basic compare, ttl: 0 vs 0
>   [1548894424] libunbound[5931:0] info: canonical basic compare, count: 1 vs 1
>   [1548894424] libunbound[5931:0] info: canonical basic compare, rrsig_count: 0 vs 0
>   [1548894424] libunbound[5931:0] info: canonical basic compare, trust: 1 vs 1
>   [1548894424] libunbound[5931:0] info: canonical basic compare, security: 0 vs 0
>   [1548894424] libunbound[5931:0] info: d vs d
>   [1548894424] libunbound[5931:0] info: n vs n
>   [1548894424] libunbound[5931:0] info: s vs s
>   [1548894424] libunbound[5931:0] info: 2 vs 1
>   [1548894424] libunbound[5931:0] info: rrset 3 not canonical equal
>   [1548894424] libunbound[5931:0] info: Capsforid fallback: getting different replies, failed
> 
> In the above, we see that the label "dns1" being compared to "dns2" is what ultimately triggers the capsforid failure.
> 
> In the pcap, I believe this refers to a comparison of DNS response 0x00006547 with DNS response 0x000006bd, where in the latter, the ordering of the additional records is:
> 
>   dns2.pie3aq.pl: type A, class IN, addr 5.135.61.93
>   dns1.pie3aq.pl: type A, class IN, addr 51.75.34.178
> 
> and in the former, dns1 comes first.
> 
> My very shoddy reading of Unbound's code is that these additional records should first undergo a canonical sort, in which case these messages should be matching under capsforid.
> 
> I might be barking entirely up the wrong tree, but your guidance is very much appreciated.
> 
> Thanks,
> Alex
> 
> --
> 
> 1. https://lists.dns-oarc.net/pipermail/dns-operations/2019-January/018359.html
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://lists.nlnetlabs.nl/pipermail/unbound-users/attachments/20190204/50d31508/attachment.bin>


More information about the Unbound-users mailing list