Wednesday, October 4, 2006

Internet Troubles

Warning: techno-geek content follows:

Just after 2 am this morning, we lost our connection to the Internet. I know this not because I was up and watching it, but because the last email my computer picked up was at 2:08 AM, which I discovered when I logged in at my normal time (around 4:30 AM). The symptoms were familiar, and frustrating as all get out: if I connected my WildBlue satellite modem directly to my laptop, it worked fine — but just for the laptop, of course. If I connected the modem to my Cisco 806 firewall/router, it didn’t work. We’ve seen this movie before.

But hopefully, never again — as I managed to troubleshoot the issue, and (I think!) fix it, with an electronic version of a Rube Goldberg device…

I knew from troubleshooting that I had done on previous occurrences of this problem that the basic issue was that the Cisco router wasn’t getting provisioned by DHCP via the satellite modem. For some reason, DHCP was working fine with the laptop, but not with the router.

Today I discovered why the router wasn’t being provisioned, through much googling for help on the web and correlating that with what I was observing. The problem was that the DHCP server (back in WildBlue-land somewhere) was taking so long to respond (as much as two minutes) that the router just silently timed out and tried again, which restarted the timer. It did this in an endless cycle of failures. The laptop (running Windows XP, of course), on the other hand, just stupidly sent out a single DHCP request and waited forever for a response, which WildBlue eventually coughed up. The intermittent nature of the problem turns out to be caused by the varying speed of the WildBlue DHCP server: when it was fast (presumably because of a light load), the Cisco worked fine because it didn’t time out. But when the DHCP server was slow (most likely due to a heavy load), the Cisco would time out and never get provisioned. And we’d be sawed off from the Internet, except for one pathetic connection on the laptop.
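For the curious, here is a toy model of the two behaviors described above. It is only a sketch: the 60-second timeout and the retry limit are numbers I made up for illustration (I don't know the Cisco's actual timer), and it assumes each retry throws away the pending request, so a late reply to an earlier attempt never counts.

```python
def impatient_client(server_delay_s, timeout_s, max_attempts):
    """Seconds until a lease is obtained, or None if every attempt times out.

    Models a client that abandons its request when the timer expires and
    starts over, so a reply slower than timeout_s is never accepted.
    """
    elapsed = 0
    for _ in range(max_attempts):
        if server_delay_s <= timeout_s:
            return elapsed + server_delay_s
        elapsed += timeout_s  # gave up on this attempt; try again from scratch
    return None


def patient_client(server_delay_s):
    """Models a client that sends one request and waits however long it takes."""
    return server_delay_s


# timeout_s=60 and max_attempts=10 are hypothetical values, not measured ones.
print(impatient_client(5, timeout_s=60, max_attempts=10))    # 5
print(impatient_client(78, timeout_s=60, max_attempts=10))   # None
print(patient_client(78))                                    # 78
```

With a fast server both clients get an address. With a slow one, the impatient client never does, no matter how many times it retries, while the patient one simply gets its answer late.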

Once I figured out the why, the next problem was how to fix it. “There’s gotta be a way!” is my motto. Somewhere along the line I had an idea, a grotesque and perverted idea, one that will make every elegance-loving network engineer cringe and wince: to buy a cheap, simple, stupid broadband router and put it in between the Cisco router and the satellite modem. My thinking was that an el crappo especiale would quite likely use the same brainless algorithm to get provisioned as Windows does on the laptop, and that the Cisco would in turn get provisioned from the broadband router, quickly.

So off I went down the hill, thinking I’d go to Fry’s (about an hour away). But as I was passing the shopping center closest to our house (a mere half hour away), it occurred to me to see if Target might carry such a thing. And those fine folks, it turns out, carry exactly the kind of router that before today had been banned from my home: the D-Link EBR-2310, for a mere $49. I snarfed it and headed home, figuring I had about a 20% chance of this gambit succeeding.

On arriving home, I unpacked the little-bitty router (about the size of a small paperback book), followed the very simple directions — and in about two minutes, I had it installed and running. The D-Link was able to get provisioned from WildBlue without a problem, even though the DHCP server took 78 seconds to respond. And the Cisco router very happily provisioned itself from the D-Link, and now everything is working.

But following the tortured course of a packet coming into our home from the Internet (or leaving to the Internet) will definitely make your head hurt. There are 3 NATs (one each in the satellite modem, the D-Link, and the Cisco) before you get onto my LAN’s DMZ, and yet one more NAT before you land on one of my workstations. There are two firewalls those packets must traverse, plus whatever is going on inside the D-Link. Not to mention the physical path, which resembles a magnet winding more than it does a network.
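To make the headache concrete, here is a rough sketch of an outbound connection being rewritten at each of the four NATs, and of the reply finding its way back. Every address and port here is invented for illustration (private and documentation ranges only), the translation logic is the textbook version rather than what any of these boxes actually does, and "inner NAT" stands in for the fourth device, which isn't named above.

```python
from dataclasses import dataclass, field


@dataclass
class Nat:
    name: str
    outside_ip: str  # hypothetical address on the Internet-facing side
    table: dict = field(default_factory=dict)  # outside port -> inside (ip, port)
    next_port: int = 40000

    def outbound(self, src):
        """Rewrite the source to this NAT's outside address and remember the mapping."""
        port = self.next_port
        self.next_port += 1
        self.table[port] = src
        return (self.outside_ip, port)

    def inbound(self, dst):
        """Map a reply's destination back to whoever sent the original packet."""
        return self.table[dst[1]]


# Listed in the order an outbound packet meets them.
chain = [
    Nat("inner NAT (LAN to DMZ)", "192.168.2.2"),
    Nat("Cisco 806", "192.168.0.2"),
    Nat("D-Link EBR-2310", "10.0.0.2"),
    Nat("WildBlue modem", "203.0.113.7"),
]


def send_out(src, chain):
    for nat in chain:
        src = nat.outbound(src)
    return src


def deliver_reply(dst, chain):
    for nat in reversed(chain):
        dst = nat.inbound(dst)
    return dst


workstation = ("192.168.3.10", 51515)
public = send_out(workstation, chain)
print(public)                        # ('203.0.113.7', 40000)
print(deliver_reply(public, chain))  # ('192.168.3.10', 51515)
```

Four rewrites on the way out and four more on the way back, for every single connection.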

But it works. The brain-dead little D-Link router works just fine, where the vastly more sophisticated Cisco fails in an obscure and frustratingly symptomless fashion. There’s a lesson in here somewhere, if only I were clever enough to suss it out…