<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal>I am running Unbound on an Amazon EC2 large instance (4 ECUs, 2 cores), configured as shown below. I am seeing an average response time of 150ms+, as reported by the namebench Alexa 2K result. To reduce my lookup times, I run an hourly scan of 35K sites (from the namebench dat files) so that my clients get a cached response whenever possible. On average, my cache miss rate is 6%, as shown below. My cache-min-ttl is 1 hour, so these entries should be cached at all times. I am guessing the cache misses come from sites that my pythonmod looks up and responds to in a special way:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>6.5Mbytes of free RAM<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>total.num.cachehits=3185<o:p></o:p></p><p class=MsoNormal>total.num.cachemiss=188<o:p></o:p></p><p class=MsoNormal>mem.cache.rrset=8319405<o:p></o:p></p><p class=MsoNormal>mem.cache.message=8729827<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>(forked configuration)<o:p></o:p></p><p class=MsoNormal>server:<o:p></o:p></p><p class=MsoNormal> #disable chroot as it caused several issues with python's PYTHONHOME vars<o:p></o:p></p><p class=MsoNormal> chroot: ""<o:p></o:p></p><p class=MsoNormal> verbosity: 0<o:p></o:p></p><p class=MsoNormal> # set to num of cores or cpus<o:p></o:p></p><p class=MsoNormal><b> num-threads: 2<o:p></o:p></b></p><p class=MsoNormal> ##slabs <o:p></o:p></p><p class=MsoNormal> rrset-cache-slabs: 1<o:p></o:p></p><p class=MsoNormal> infra-cache-slabs: 1<o:p></o:p></p><p class=MsoNormal> key-cache-slabs: 1<o:p></o:p></p><p class=MsoNormal> msg-cache-slabs: 1<o:p></o:p></p><p class=MsoNormal> ##cache sizes<o:p></o:p></p><p class=MsoNormal> msg-cache-size: 250m<o:p></o:p></p><p class=MsoNormal> #2X msg-cache-size<o:p></o:p></p><p class=MsoNormal> rrset-cache-size: 
500m<o:p></o:p></p><p class=MsoNormal> outgoing-range: 950<o:p></o:p></p><p class=MsoNormal> #2X outgoing range<o:p></o:p></p><p class=MsoNormal> num-queries-per-thread: 512<o:p></o:p></p><p class=MsoNormal> # sudo sysctl -w net.core.rmem_max=8388608<o:p></o:p></p><p class=MsoNormal> so-rcvbuf: 8m<o:p></o:p></p><p class=MsoNormal> interface: 0.0.0.0<o:p></o:p></p><p class=MsoNormal> interface: ::0<o:p></o:p></p><p class=MsoNormal> port: 53<o:p></o:p></p><p class=MsoNormal> access-control: 0.0.0.0/0 allow<o:p></o:p></p><p class=MsoNormal> module-config: "python iterator"<o:p></o:p></p><p class=MsoNormal><b> prefetch: yes<o:p></o:p></b></p><p class=MsoNormal><b> cache-min-ttl: 3600<o:p></o:p></b></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>python:<o:p></o:p></p><p class=MsoNormal> python-script: "XYZ"<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>remote-control:<o:p></o:p></p><p class=MsoNormal> control-enable: yes<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>forward-zone:<o:p></o:p></p><p class=MsoNormal> name: "."<o:p></o:p></p><p class=MsoNormal> forward-addr: XYZ<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Question:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Even with this setup, most of the domains return a TTL of 3600 at the start of a random namebench run, which means they were iterated/recursed over instead of being looked up from the cache. This results in an average response time of 150ms+ for these 35K sites. namebench scans the exact same 35K sites, so why aren&#8217;t they looked up from the cache instead of being iterated over? Are these sites not cached for the full 3600 seconds? 
<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>With prefetch enabled and a cache-min-ttl of 1 hour, why isn&#8217;t an hourly scan of these 35K sites populating my cache and giving me an average response time of &lt;50ms?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>With the same setup, if I take a fixed set of 500 sites and run namebench against them back to back, my average response time starts approaching 40-50ms, which is where I am trying to be with the 35K sites. <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Where am I going wrong, and how can I debug and fix this issue?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Vinay.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p></div></body></html>