Make DNS fly with dnsmasq --all-servers

The other day my friend Andrew messaged me out of the blue:

--all-servers
By default, when dnsmasq has more than one upstream server available, it will
send queries to just one server. Setting this flag forces dnsmasq to send all
queries to all available servers. The reply from the server which answers first
will be returned to the original requester.

Rather than finding it odd to receive a pasted chunk of a manpage via IM, I was fascinated by the content. By configuring dnsmasq to be a slightly obnoxious netizen, you can effectively always have your queries answered by the fastest server — whatever it happens to be for that query.
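To make the idea concrete, here is a minimal Python sketch of the "ask everyone, keep the first answer" pattern. It is not how dnsmasq is implemented; the function names build_query and first_reply are made up for illustration, and the query builder only handles a plain A-record lookup.

```python
import select
import socket
import struct
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (illustrative only)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack(">HH", 1, 1)


def first_reply(payload, servers, timeout=2.0):
    """Send payload to every (ip, port) in servers over UDP.

    Returns (server, reply) for whichever server answers first,
    or None if nobody answers within timeout seconds.
    Servers must be given as IP literals so replies can be matched.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for server in servers:
            sock.sendto(payload, server)
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            readable, _, _ = select.select([sock], [], [], remaining)
            if not readable:
                return None
            reply, sender = sock.recvfrom(4096)
            if sender in servers:  # ignore datagrams from anyone else
                return sender, reply
    finally:
        sock.close()
```

Something like first_reply(build_query("example.com"), [("8.8.8.8", 53), ("4.2.2.1", 53)]) would race two of the public resolvers mentioned in this post. Every slower reply is simply thrown away, which is exactly where the "obnoxious" part comes from.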

What I wondered was how big of a difference it would actually make. Would there be a measurable difference? 5%? 20%? I decided to take namebench for a spin and find out.

The results left me so incredulous that I disabled dnsmasq’s cache entirely, figuring that it was maybe giving it an unfair advantage. (Though I don’t think it is — all my upstream servers are running caches.) Re-running with dnsmasq being forced to go upstream for every query, it still ended up 124% faster than the next-best choice. (Wat?!) It strains belief, but see for yourself:

dnsmasq running against the 5 resolvers in my /etc/resolv.conf, using the --all-servers flag, had a mean response time of 32.00 milliseconds. (Yes, that’s a suspiciously round number, but the fastest response was 12.2ms and the slowest was 666.7ms, so I don’t think 32.00 holds any special value.) namebench compared it against my current nameserver, my ISP’s local nameserver, which averaged 71.68ms. (It had the same 12.2ms minimum, and the worst case was 3.5 seconds.)

In fairness, namebench found a couple of servers that are slightly faster than my ISP’s, and dnsmasq wasn’t 124% faster than them. But it’s still a huge improvement. The fastest is another of my ISP’s servers, with an average of 53.65ms. Level 3’s 4.2.2.1 open resolver averaged 60.48ms, OpenDNS averaged 69.21ms, and Google’s 8.8.8.8 averaged 75.82ms.
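For anyone checking the arithmetic, here is a small sketch that computes "percent faster" as the ratio of mean response times. That definition is my assumption; it reproduces the 124% figure, but I have not confirmed it is exactly how namebench derives it. The labels are just shorthand for the servers named above.

```python
def percent_faster(baseline_ms, candidate_ms):
    """How much faster candidate is than baseline, as a percentage.

    Assumes "X% faster" means the baseline's mean time is (1 + X/100)
    times the candidate's mean time.
    """
    return (baseline_ms / candidate_ms - 1.0) * 100.0


DNSMASQ_MS = 32.00  # mean with --all-servers and caching disabled

# Mean response times reported in this post, in milliseconds.
others = {
    "current ISP nameserver": 71.68,
    "fastest ISP nameserver": 53.65,
    "Level 3 (4.2.2.1)": 60.48,
    "OpenDNS": 69.21,
    "Google (8.8.8.8)": 75.82,
}

for name, mean_ms in others.items():
    print(f"dnsmasq vs {name}: {percent_faster(mean_ms, DNSMASQ_MS):.0f}% faster")
```

By this measure dnsmasq comes out roughly 68% faster than even the best single server in the list.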

What’s interesting to me here is that, by asking multiple servers and always returning the fastest, dnsmasq with --all-servers ends up being considerably faster on average than any single server. It averaged 32.00ms, while the next best one averaged 53.65ms. And bear in mind that this is with dnsmasq’s cache disabled because I worried that it would bias results in favor of dnsmasq.
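Why would the race beat every individual server? Each server is occasionally slow, but they are rarely all slow on the same query. A toy simulation shows the effect. The latency model here (a fixed floor plus an exponentially distributed tail, three hypothetical servers) is invented for illustration and is not fitted to my namebench results.

```python
import random


def simulate(num_queries=10_000, seed=1):
    """Compare each server's mean latency with the mean of the fastest reply.

    Returns (per_server_means, raced_mean), all in milliseconds.
    """
    rng = random.Random(seed)
    # Hypothetical servers: (minimum latency, mean of the random tail).
    servers = {"a": (12.0, 40.0), "b": (15.0, 45.0), "c": (20.0, 55.0)}

    totals = {name: 0.0 for name in servers}
    raced_total = 0.0
    for _ in range(num_queries):
        # One simulated query: every server gets asked, each with its own delay.
        sample = {
            name: floor + rng.expovariate(1.0 / tail)
            for name, (floor, tail) in servers.items()
        }
        for name, ms in sample.items():
            totals[name] += ms
        # With all-servers behaviour, only the fastest reply matters.
        raced_total += min(sample.values())

    means = {name: total / num_queries for name, total in totals.items()}
    return means, raced_total / num_queries


if __name__ == "__main__":
    means, raced = simulate()
    for name, ms in means.items():
        print(f"server {name}: {ms:.1f} ms mean")
    print(f"fastest-of-all: {raced:.1f} ms mean")
```

The raced mean can never be worse than the best single server, because on every query the minimum is at most that server's time. How much better it gets depends on how long and how independent the slow tails are.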

Graphs, for those who love them:

(Don’t mind the “Internal 192-1-1” entry — it’s a broken internal DNS cache that slipped in erroneously. It shouldn’t be used much on the LAN and uses Comcast as its upstream, so the fact that it’s slightly faster than Comcast suggests that there was some room for caching here.)

Recall, too, that this is with caching disabled in dnsmasq! It had one hand tied behind its back and it still won by an incredible amount.

For what it’s worth, here’s my dnsmasq.conf:

port=53
domain-needed
bogus-priv
resolv-file=/etc/resolv.conf
interface=lo0
listen-address=127.0.0.1
no-dhcp-interface=127.0.0.1
cache-size=0
no-negcache
bogus-nxdomain=67.215.65.132

Take note that you should not use the “cache-size=0” or “no-negcache” entries — they hurt performance. (And very badly in the real world!) I included them only to rule out the possibility of dnsmasq being faster because it was working from local cache. Of course, you really shouldn’t blindly copy-and-paste this, and it’s a very vanilla setup.

I ran dnsmasq in the foreground with sudo dnsmasq -d --all-servers.

What I’m wrestling with is the question of whether this is such poor “netizenship” that it’s bad practice. With --all-servers set, dnsmasq will send a copy of your query to every nameserver in /etc/resolv.conf. That’s five entries for me, effectively increasing load on DNS servers five times. That’s pretty obnoxious. (Though having five entries in /etc/resolv.conf is unusually many.) But at the same time, perhaps the biggest advantage of dnsmasq isn’t its borderline-magical --all-servers flag, but that it’s a caching resolver, which helps reduce the number of DNS queries that make it out of your network. So perhaps the impact isn’t quite as bad as it might seem. And in an ideal world, you’d only hit your ISP’s servers, or a free service like Google’s or OpenDNS’s public servers, which would answer out of cache. So it does feel a little selfish and like a tragedy of the commons, but I don’t think it’s actually overly harmful.
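A back-of-the-envelope sketch of that trade-off, with made-up numbers: the 80% cache hit rate and the 1,000 lookups are purely hypothetical, since I haven't measured my real hit rate. Only the five upstream servers come from my setup.

```python
def upstream_queries(client_queries, cache_hit_rate, num_servers):
    """Total queries leaving the network when every cache miss is
    sent to all upstream servers (the --all-servers behaviour)."""
    misses = client_queries * (1.0 - cache_hit_rate)
    return misses * num_servers


CLIENT_QUERIES = 1000   # hypothetical number of lookups made on the LAN
HIT_RATE = 0.8          # hypothetical dnsmasq cache hit rate
NUM_SERVERS = 5         # entries in my /etc/resolv.conf

# Worst case for the upstreams: fan-out with the cache disabled.
no_cache = upstream_queries(CLIENT_QUERIES, 0.0, NUM_SERVERS)

# Fan-out with the cache enabled.
with_cache = upstream_queries(CLIENT_QUERIES, HIT_RATE, NUM_SERVERS)

print(f"all-servers, no cache:   {no_cache:.0f} upstream queries")
print(f"all-servers, with cache: {with_cache:.0f} upstream queries")
print(f"per server, with cache:  {with_cache / NUM_SERVERS:.0f} queries")
```

Under those assumptions the fan-out and the cache roughly cancel out in total, and each individual server sees about a fifth of the original lookups. With a lower hit rate the picture gets worse quickly, so treat this as a way to plug in your own numbers rather than as a result.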

1 Or something like that. I don’t actually have a transcript of the conversation.

The Browsers of Technical Users

I’ve been experimenting with gaug.es for analytics on a few of my sites. On my other blog, most of the traffic comes from searches for various technical topics, implying that the audience is likely a bit more tech-savvy than the general Internet populace.

Do you notice any browsers absent from this chart?

Trivia: “Other” is a 50/50 tie between Internet Explorer and Opera.

Also: gaug.es is very cool. Possibly cool enough that I will start paying when the 7-day free trial expires.