Breaking the Internet

This is a ‘nerd post’ for nerds and aspiring nerds (though it might be nominally interesting to non-nerds as well) . . . just to get that out in the open right away.

The Domain Name System (DNS) is part of the core functionality of the Internet and was central in making it usable for mere mortals. Basically, without DNS, web site addresses would look something like ‘209.85.171.99’, which is a bit harder to remember than ‘google.com’, the name under the DNS system for the exact same place on the Internet. To oversimplify, this is how the system works:

  • Google has a ‘nameserver’ (probably something like ns.google.com). That nameserver has been set up by Google to know that ‘google.com’ really means ‘209.85.171.99’.
  • The Domain Name registrar that Google uses to register ‘google.com’ to them has been set up to know that if people ask for google.com, they need to talk to ns.google.com to find out where to go.
  • The ‘root’ DNS servers on the Internet pretty much know the right other nameservers for pretty much every domain name, and update that information periodically.
  • ISPs that people use to access the Internet have nameservers too, which talk to the root servers to figure out where their customers go on the Internet when they type in ‘google.com’.

Now, imagine for a moment that you just turned on all of these servers today for the first time, so none of them knows anything yet except for the basic things you’ve put in yourself manually. This is what happens when the first user of this imaginary Internet types in ‘google.com’ and presses enter in Firefox:

  • Your computer (which is, say, on a Verizon DSL network) talks to the Verizon DSL nameserver and says, “Hey, I’m looking for ‘google.com’. Where do I go?”
  • Verizon DSL’s nameserver says, “I dunno, let me check.” It then contacts one of the ‘root’ nameservers and asks the same question.
  • The ‘root’ nameserver says, “I dunno, let me check.” It then calls up the Google nameserver and asks how to find ‘google.com’.
  • The Google nameserver says, “Oh yeah, you need to go to ‘209.85.171.99’.”
  • The ‘root’ nameserver says, “Awesome. Let me write that down so I don’t forget.” It puts the correct information for ‘google.com’ in its own cache in case anybody else asks today, then tells the Verizon nameserver the information.
  • The Verizon nameserver likewise says, “Awesome. Let me write that down so I don’t forget.” It, just like the ‘root’ nameserver, puts the correct information in its own cache in case anybody else asks today and then passes the information back to the user.
  • The user’s computer gets provided with the right information, starts talking to ‘209.85.171.99’, and Google’s web site comes up on the screen.

This all happens, most likely, within a few seconds. That’s not long, but it’s too long for billions of people visiting billions of web sites every day. That’s why, at each step, the system is remembering what it learns. The second Verizon DSL user to type in ‘google.com’ in their browser that day gets routed through much quicker:

  • Your computer talks to the Verizon DSL nameserver and says, “Hey, I’m looking for ‘google.com’. Where do I go?”
  • The Verizon DSL nameserver says, “Oh yeah, I got asked that earlier. You want ‘209.85.171.99’.”
  • The user’s computer gets provided with the right information, starts talking to ‘209.85.171.99’, and Google’s web site comes up on the screen.
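For the code-minded, here is a minimal sketch of the two walkthroughs above. It follows this post's oversimplified chain (ISP nameserver, then ‘root’ nameserver, then Google's nameserver); as far as I know, real root servers refer resolvers onward rather than fetching answers themselves. The `Nameserver` class, the `lookup` helper, and the server names are all made up for illustration.

```python
class Nameserver:
    """A toy nameserver: answers from its own records or cache, else asks someone else."""

    def __init__(self, name, records=None, delegations=None, upstream=None):
        self.name = name
        self.records = dict(records or {})          # answers put in by hand
        self.delegations = dict(delegations or {})  # domain -> nameserver that knows it
        self.upstream = upstream                    # fallback server to ask
        self.cache = {}                             # answers learned from other servers

    def resolve(self, domain, trail):
        trail.append(self.name)                     # note every server that gets asked
        if domain in self.records:
            return self.records[domain]
        if domain in self.cache:
            return self.cache[domain]               # "I got asked that earlier"
        next_server = self.delegations.get(domain, self.upstream)
        if next_server is None:
            raise LookupError(f"{self.name} cannot resolve {domain}")
        answer = next_server.resolve(domain, trail)
        self.cache[domain] = answer                 # "let me write that down"
        return answer


def lookup(isp_nameserver, domain):
    """Return the IP address plus the list of servers consulted along the way."""
    trail = []
    ip = isp_nameserver.resolve(domain, trail)
    return ip, trail


google_ns = Nameserver("ns.google.com", records={"google.com": "209.85.171.99"})
root = Nameserver("root", delegations={"google.com": google_ns})
verizon = Nameserver("Verizon DSL", upstream=root)

first_ip, first_trail = lookup(verizon, "google.com")    # first user of the day
second_ip, second_trail = lookup(verizon, "google.com")  # second user of the day

print(first_ip, first_trail)
print(second_ip, second_trail)
```

The first lookup touches all three servers; the second never leaves the Verizon nameserver, because the answer is already sitting in its cache.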

Similarly, if a user on, say, Comcast Cable Internet wants to go to Google that day, the only addition to that process is that the Comcast nameserver has to ask the ‘root’ nameserver, which already has the information cached. This caching has the benefit of making the Internet speedier for all, but has the drawback that if Google needed to change their IP address (the string of numbers) for some reason, that change can take some time to propagate through the system. This used to take up to 72 hours or so, but nowadays generally happens much quicker (like, 2 or 3 hours) because all the root servers and other servers are always pinging each other for information in a psycho, anarchistic hodge-podge of communication that somehow manages to work pretty reliably most of the time.
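One common way a cache limits how long it hangs on to an answer is a time-to-live (TTL) attached to each entry. The sketch below shows why a changed address can lag behind: until the entry expires, the cache keeps handing out the old answer. The `TtlCache` class is hypothetical, the clock is passed in by hand to keep the example predictable, and the three-hour TTL is just an assumption picked to match the rough figure above.

```python
class TtlCache:
    """Holds each answer only until its time-to-live (TTL) runs out."""

    def __init__(self):
        self._entries = {}  # domain -> (ip, expires_at), times in seconds

    def put(self, domain, ip, ttl, now):
        self._entries[domain] = (ip, now + ttl)

    def get(self, domain, now):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if now >= expires_at:
            del self._entries[domain]  # expired: forget it so the next lookup asks again
            return None
        return ip


HOUR = 3600
cache = TtlCache()
cache.put("google.com", "209.85.171.99", ttl=3 * HOUR, now=0)

# Suppose Google changes its address one hour in. The cache has no idea.
still_old = cache.get("google.com", now=2 * HOUR)  # the old address is still served
expired = cache.get("google.com", now=3 * HOUR)    # None: time to ask upstream again

print(still_old, expired)
```

Shorter TTLs mean changes show up sooner, at the cost of more questions being passed up the chain.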

Now this brings us to the subject for the day, which is breaking the system. There’s an interesting approach for destabilizing the Internet that has been used at times in the past called ‘DNS cache poisoning’. Basically (again, putting it in oversimplified terms), if you could get wrong information into a DNS server’s cache—especially at the root nameserver level, but even at the other levels—that wrong information could propagate through much of the system, pointing untold users to the wrong place.

In my example, even if you successfully poisoned just one nameserver (the Verizon DSL one) to think that ‘google.com’ was really supposed to go to the IP address for my web site, then every Verizon DSL user who tried to go to Google that day would mysteriously arrive at Off on a Tangent. That wouldn’t really hurt anything (except a bit of Google’s ad revenue and a big jump in my hosting fees), but imagine that somebody pointed a bank’s website to a fake lookalike site with the intent of collecting people’s login information! Now you’re seeing how this could potentially be a lucrative criminal act, if successfully implemented.
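To make the damage concrete, here is a toy version of a poisoned ISP cache. It only shows the effect of a bad cache entry, not how an attacker would get it in there. The `isp_cache` dictionary and `isp_lookup` function are invented for this sketch, and `203.0.113.7` is a placeholder from an address range reserved for documentation, not a real site.

```python
REAL_GOOGLE = "209.85.171.99"
FAKE_SITE = "203.0.113.7"  # placeholder address, stands in for the attacker's server

# The ISP nameserver's cache, shared by every customer of that ISP.
isp_cache = {"google.com": REAL_GOOGLE}


def isp_lookup(domain):
    """Answer straight from the cache, with no way to tell a good entry from a bad one."""
    return isp_cache[domain]


# Three customers look up google.com before the poisoning.
before = [isp_lookup("google.com") for _ in range(3)]

# The poisoning itself: a single bad entry lands in the shared cache.
isp_cache["google.com"] = FAKE_SITE

# Three more customers look it up afterward.
after = [isp_lookup("google.com") for _ in range(3)]

print(before)
print(after)
```

One bad write, and every customer who asks afterward is sent to the wrong place until the entry is replaced.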

Well, this sounds good in theory (if you’re a bad guy). The problem is that it’s hard, which is why it has hardly ever happened on any wide scale. You have to be able to step in at just the right moment (like, for example, right when the ‘root’ nameserver asks Google’s nameserver for how to get to ‘google.com’), you have to be able to provide a response that looks to the ‘root’ nameserver like it came from Google’s nameserver (which means it needs to have the right randomized ID code in it), and so on. Tough stuff.
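Here is a rough sketch of the ‘randomized ID code’ part. DNS query IDs are 16 bits wide, so there are 65,536 possible values, and a forged reply with the wrong one gets thrown away. The `PendingQuery` class and `chance_of_hit` function are hypothetical, and the sketch ignores the timing problem entirely (a forged reply also has to arrive before the genuine one).

```python
import secrets

ID_SPACE = 2 ** 16  # DNS query IDs are 16 bits wide: 65,536 possible values


class PendingQuery:
    """A question a nameserver has sent out and is waiting to hear back about."""

    def __init__(self, domain):
        self.domain = domain
        self.query_id = secrets.randbelow(ID_SPACE)  # the randomized ID code

    def accepts(self, reply_domain, reply_id):
        # A reply only counts if it answers the right question with the right ID.
        return reply_domain == self.domain and reply_id == self.query_id


def chance_of_hit(forged_replies):
    """Chance that at least one of n forged replies, each with a different ID, matches."""
    return min(forged_replies, ID_SPACE) / ID_SPACE


query = PendingQuery("google.com")
genuine_ok = query.accepts("google.com", query.query_id)
wrong_id = (query.query_id + 1) % ID_SPACE
forged_ok = query.accepts("google.com", wrong_id)

print(genuine_ok, forged_ok)
print(chance_of_hit(1))  # one blind guess: 1 in 65,536
```

With odds like that, a single blind guess almost never lands.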

Or not.

Security researcher Dan Kaminsky accidentally discovered a fundamental flaw in the DNS system about six months ago that would allow a moderately talented hacker to initiate widespread DNS cache poisoning and, quite possibly, destabilize the entire Internet at a rate of about 10 seconds per poisoning. Incredibly, this wasn’t a bug with a particular piece of DNS software but, rather, a bug in the actual DNS protocol and thus a bug that affects essentially every DNS server product on every operating system platform.

Kaminsky took the prudent course, releasing the information to the software folks who make DNS server software before releasing it publicly. He has been criticized for this, but I support him (at least initially) on keeping the details of this bug under wraps. Because of his efforts, an unprecedented synchronized release was made by many of these software vendors—including Microsoft, Sun, and various other open- and closed-source application developers—last month to resolve or significantly mitigate this bug. Assuming that system administrators actually apply these patches, an Internet disaster has likely been prevented. Of course, you can never really assume that system administrators will apply a patch—even one this important—so it will be interesting to watch.

But at this point, six months after the bug was discovered, I’m increasingly frustrated by the lack of published detail. Of course there have been leaks that accurately describe the potential attack this bug would allow, and the sources of those leaks have apologized for leaking them, but six months later, and after the major vendors have all fixed the problem, the time for ‘security by obscurity’ is well past. Publish the details already! If anything, releasing details of the bug now will force ISPs and other nameserver managers who haven’t applied the patch yet to apply it, thus getting the Internet better protected sooner. The bad guys already have the info they need to start doing bad things, so the ‘cone of silence’ method of handling this issue is no longer productive or prudent.

All in all, though, I do think Kaminsky handled this right, and I applaud all who worked hard to fix or mitigate this bug and to release and apply the patches. Because of their efforts, the Internet will probably be humming along just fine for the immediate future.

Update 7/25/2008: Details of this bug (essentially the layman’s explanation I’ve written above with some more detail) have finally been published by Dan Kaminsky.

Scott Bradford is a writer and technologist who has been putting his opinions online since 1995. He believes in three inviolable human rights: life, liberty, and property. He is a Catholic Christian who worships the trinitarian God described in the Nicene Creed. Scott is a husband, nerd, pet lover, and AMC/Jeep enthusiast with a B.S. degree in public administration from George Mason University.