From cra at WPI.EDU Sun Feb 1 00:21:16 2015
From: cra at WPI.EDU (Chuck Anderson)
Date: Sat, 31 Jan 2015 19:21:16 -0500
Subject: [Dnssec-trigger] persistent cache needed?
In-Reply-To: <20150131235759.GD4025@angus.ind.WPI.EDU>
References: <20150131235759.GD4025@angus.ind.WPI.EDU>
Message-ID: <20150201002115.GE4025@angus.ind.WPI.EDU>

On Sat, Jan 31, 2015 at 06:58:00PM -0500, Chuck Anderson wrote:
> After booting up and re-opening Firefox, restoring 50-100 tabs causes
> so much DNS traffic that unbound goes unresponsive, and queries
> repeatedly timeout for many minutes until things finally settle down.
> I thought Firefox's behavior was to not reload every tab until you
> activate the tab, but maybe it is still doing DNS pre-fetches for the
> inactive tabs? I don't know.
>
> I think we need a persistent cache, saved across restarts/reboots.
> What else can we do to solve this problem?
>
> Or is the verbosity the cause of the problem:
>
> #journalctl -b -u unbound | wc -l
> 24581
>
> unbound.conf:
>
> server:
>     # verbosity number, 0 is least verbose. 1 is default.
>     verbosity: 3

Nope, I turned this back down to 1, and the problem is the same after
rebooting.

I also confirmed that only some DNS queries time out. For example,
www.yahoo.com and www.nasa.gov time out (or sometimes return SERVFAIL),
but www.google.com works fine. Probably any DNS queries that are
already cached before the flood of queries comes into unbound will
work fine.

I also confirmed that the problem only begins when Firefox is
reloading the previous session. It takes about 5 minutes for things
to settle down enough for queries to finish without timing out.

From paul at nohats.ca Sun Feb 1 18:46:53 2015
From: paul at nohats.ca (Paul Wouters)
Date: Sun, 1 Feb 2015 13:46:53 -0500 (EST)
Subject: [Dnssec-trigger] persistent cache needed?
In-Reply-To: <20150131235759.GD4025@angus.ind.WPI.EDU>
References: <20150131235759.GD4025@angus.ind.WPI.EDU>
Message-ID:

On Sat, 31 Jan 2015, Chuck Anderson wrote:

> After booting up and re-opening Firefox, restoring 50-100 tabs causes
> so much DNS traffic that unbound goes unresponsive, and queries
> repeatedly timeout for many minutes until things finally settle down.

Why is that causing timeouts and failures on DNS for you?

I do think unbound needs an option to tell it that it is operating on
an end node and not a network-wide cache, where it can be a little more
aggressive on negative cache entries and retry more.

> I think we need a persistent cache, saved across restarts/reboots.
> What else can we do to solve this problem?

I would like that. But it would require the cache to have some kind of
timestamp associated with it, so the loading unbound can calculate how
much to lower the TTLs of the cached data. Otherwise you would end up
with badly cached data that has in reality expired (and might have
changed).

Note this is the reverse of another problem people have, which is when
switching networks they want the cache to be wiped because some networks
might have split-DNS entries that aren't valid elsewhere.

> Or is the verbosity the cause of the problem:
>
> #journalctl -b -u unbound | wc -l
> 24581

Verbosity causes a significant performance drop, so for your original
problem it might be worth reducing it to 1 again and seeing if your
problem disappears.

Paul

From wouter at nlnetlabs.nl Mon Feb 2 08:27:18 2015
From: wouter at nlnetlabs.nl (W.C.A. Wijngaards)
Date: Mon, 02 Feb 2015 09:27:18 +0100
Subject: [Dnssec-trigger] persistent cache needed?
In-Reply-To:
References: <20150131235759.GD4025@angus.ind.WPI.EDU>
Message-ID: <54CF34E6.8040308@nlnetlabs.nl>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

On 01/02/15 19:46, Paul Wouters wrote:
> On Sat, 31 Jan 2015, Chuck Anderson wrote:
>
>> After booting up and re-opening Firefox, restoring 50-100 tabs
>> causes so much DNS traffic that unbound goes unresponsive, and
>> queries repeatedly timeout for many minutes until things finally
>> settle down.
>
> Why is that causing timeouts and failures on DNS for you?

If unbound was compiled with libevent, it should not have any issues
coping with the traffic. But I heard that 'nat boxes' have trouble
with many connections. So, I do not know how to fix this, the network
won't allow the amount of traffic you are trying to do ...

Best regards,
   Wouter

> I do think unbound needs an option to tell it that it is operating
> on an end node and not a network-wide cache, where it can be a little
> more aggressive on negative cache entries and retry more.
>
>> I think we need a persistent cache, saved across
>> restarts/reboots. What else can we do to solve this problem?
>
> I would like that. But it would require the cache to have some
> kind of timestamp associated with it, so the loading unbound can
> calculate how much to lower the TTLs of the cached data. Otherwise
> you would end up with badly cached data that has in reality expired
> (and might have changed).
>
> Note this is the reverse of another problem people have, which is
> when switching networks they want the cache to be wiped because some
> networks might have split-DNS entries that aren't valid elsewhere.
>
>> Or is the verbosity the cause of the problem:
>>
>> #journalctl -b -u unbound | wc -l
>> 24581
>
> Verbosity causes a significant performance drop, so for your
> original problem it might be worth reducing it to 1 again and seeing
> if your problem disappears.
>
> Paul
> _______________________________________________
> dnssec-trigger mailing list
> dnssec-trigger at NLnetLabs.nl
> http://open.nlnetlabs.nl/mailman/listinfo/dnssec-trigger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBCAAGBQJUzzTmAAoJEJ9vHC1+BF+N0FQP/i7juffUKyFfRfPM9g+AX/qP
gWXWdWy7E1bQeMxy7eniLk25zcAM1gD39d3GjJAgT9ujbU/8exJzEeDDLech/4z0
lzDup6QGJSUKH36A78G/cZXmWhfZSHFP5w0iZo3wrvvv6NnQ3UcyvdbTEsEo99U5
9gEcpQNeA2RbTTz6xgeyW/JoHcg9PJGaAbQ7e5xqzBtAZ6pLfthW+EWkIIXZJGRd
H+nGxL58d5lKGM+3lFzF4YmFiGd2VRreXqy4y+e20SdxvxenGZ8e1GBhC8LVvfP7
vlTzhjYdUiV9pKyACC/5jng8BrDqFuqNif+n8stI1Z1CuoAwSQbXf/kCSO9hnpPw
Jg/SiA/9tLX9Z4RFDG6SmXYsKQMfkVEzPhnNmUtg8s7i8N1+Kt2HTEgFtIi0cI3y
udQMW09VVckXJaLd6zj6t2BVYUZ/9RhxWJO4ieCuzfnuBnVNepj3T6+hgmjOEX0o
mHB5nkcEuDk23MqFV5Tj1ac80JuJrzuO2c4BOPcD0uw5jQWmSEwnImvlAd1q8ng6
tkzTptkLPoFaNo5xDkNhLPNOH0d3OdgXaurH4AbExb2pepQSkMyKA73kS+9K0QWH
4KgPlf7ew6HU4F63h+Xn19gLvNOrWfZJSab0CSW71kk6GjiHW+Z22h554jNbxJKF
kiRcK4BvjxPtYOIEtZLs
=t9kc
-----END PGP SIGNATURE-----

From cra at WPI.EDU Mon Feb 2 14:54:55 2015
From: cra at WPI.EDU (Chuck Anderson)
Date: Mon, 2 Feb 2015 09:54:55 -0500
Subject: [Dnssec-trigger] persistent cache needed?
In-Reply-To: <54CF34E6.8040308@nlnetlabs.nl>
References: <20150131235759.GD4025@angus.ind.WPI.EDU> <54CF34E6.8040308@nlnetlabs.nl>
Message-ID: <20150202145454.GI4025@angus.ind.WPI.EDU>

On Mon, Feb 02, 2015 at 09:27:18AM +0100, W.C.A. Wijngaards wrote:
> Hi,
>
> On 01/02/15 19:46, Paul Wouters wrote:
> > On Sat, 31 Jan 2015, Chuck Anderson wrote:
> >
> >> After booting up and re-opening Firefox, restoring 50-100 tabs
> >> causes so much DNS traffic that unbound goes unresponsive, and
> >> queries repeatedly timeout for many minutes until things finally
> >> settle down.
> >
> > Why is that causing timeouts and failures on DNS for you?

I'm unsure why. It happens even with verbosity set back to 1.

> If unbound was compiled with libevent, it should not have any issues
> coping with the traffic.
> But I heard that 'nat boxes' have trouble
> with many connections. So, I do not know how to fix this, the network
> won't allow the amount of traffic you are trying to do ...

That sounds plausible. After fixing a bug in dnssec-trigger-script
that was causing it to crash (TRUE -> True), the forwarders are now
being set properly via DHCP. The behavior is the same either way:
without any forwarders, or with one forwarder set to 192.168.1.1, a
NetGear router with stock firmware. I have a CeroWRT router that I'll
test next. At least there I should be able to monitor the connection
limit to see if that is the problem.
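
For reference, Paul's point earlier in the thread (a persistent cache
needs a save timestamp so that TTLs can be lowered on reload) can be
sketched roughly as below. This is only an illustration under stated
assumptions, not how unbound stores or loads its cache: CachedRecord,
reload_cache, and the example names and addresses are all hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedRecord:
    # Hypothetical record layout, for illustration only.
    name: str
    rtype: str
    rdata: str
    ttl: int  # seconds of validity left at the moment the cache was saved


def reload_cache(records, saved_at, now=None):
    """Return the records still valid at `now`, with TTLs lowered by the
    time that passed since the cache was saved at `saved_at`."""
    if now is None:
        now = time.time()
    elapsed = int(now - saved_at)
    if elapsed < 0:
        # The clock went backwards, so the saved TTLs cannot be trusted.
        return []
    fresh = []
    for rec in records:
        remaining = rec.ttl - elapsed
        if remaining > 0:
            # Still valid: keep it, but only for the time that is left.
            fresh.append(CachedRecord(rec.name, rec.rtype, rec.rdata, remaining))
        # Otherwise the record expired while the machine was down: drop it.
    return fresh


# Example: the cache was saved at t=1000 and is reloaded 120 seconds later.
saved = [
    CachedRecord("www.example.com.", "A", "192.0.2.1", 300),
    CachedRecord("www.example.org.", "A", "192.0.2.2", 60),
]
reloaded = reload_cache(saved, saved_at=1000.0, now=1120.0)
```

In this example only the www.example.com. record survives the reload,
with 180 seconds left. Wiping the cache on a network change, the
split-DNS concern raised in the same message, would be a separate step
that this sketch does not cover.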