Monday, May 23, 2005

Time

I work for a company in a regulated industry (securities trading). Some of the regulations require certain of the company's computers to keep verifiably accurate time, to within plus or minus three seconds of the official national time reference, kept at the National Institute of Standards and Technology (NIST) Laboratories, in Boulder, Colorado. This is the F1 Cesium Fountain Atomic Clock, shown in the photo at right.

When I first ran into this requirement a few years ago, I thought this would be a simple matter of implementing the Network Time Protocol (NTP). This is a very well-known and widely deployed protocol for the specific purpose of allowing a computer's clock to be precisely synchronized to time standards on the Internet. These "Stratum 1" Internet time servers are themselves synchronized directly to the NIST time standard.

I ran into a showstopper with this approach: the "verifiable" part of the regulatory requirement. The trouble is that NTP works by communicating with the time servers over the Internet, and the time it takes messages to travel back and forth over the Internet is far from certain. Here's a thought experiment that illustrates the problem:

Imagine that you want to set your kitchen clock accurately, but the only time reference available is your neighbor's clock, which you know is accurate. Then imagine that the only way you can communicate with your neighbor is by talking. So to set your clock, you organize twelve of your friends to form a line between your kitchen and your neighbor's house, where each person is within shouting distance of the next person in line. Now you're ready: your neighbor shouts to the first person in the line when it's exactly twelve noon, and that person relays it to the next one, and so on, until finally you hear the shout from the last of your twelve friends in line, and at exactly that moment you set your kitchen clock to twelve noon.

Now how accurate do you suppose your clock is? It's going to be off by whatever the delay was; perhaps 20 or 30 seconds in this case. But you have an idea: you'll practice this relaying ahead of time, and time the delay, so you know what offset to subtract to set your kitchen clock correctly! Well, that's basically what NTP does. But it doesn't always work correctly. For instance, suppose one of your friends sneezed exactly when the adjacent friend shouted, so that the shout had to be repeated? Oops — guess that one will take a little longer than usual! Also, how consistent do you suppose that shouting is? Not very — you'll find that the time varies considerably from attempt to attempt. No matter how you slice it, if you set your kitchen clock by the method just described, there's no way you can be positively, absolutely certain that your clock is correctly set. In other words, it's not verifiable.
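For the curious, here is a minimal sketch in Python of the arithmetic behind that "time the delay" idea. It is not NTP's actual source code, and the function and variable names are made up for illustration. It uses the standard four-timestamp calculation: the client notes when it sends the request (t1) and when the reply arrives (t4), and the server notes when it received the request (t2) and when it sent the reply (t3). The big assumption is that the message takes equally long in each direction; when it doesn't, the estimate can be wrong by up to half the round trip.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    offset: float  # estimated (server clock - client clock), in seconds
    delay: float   # round-trip network delay, in seconds


def ntp_sample(t1: float, t2: float, t3: float, t4: float) -> Sample:
    """Estimate clock offset and round-trip delay from one exchange.

    t1: client clock when the request was sent
    t2: server clock when the request arrived
    t3: server clock when the reply was sent
    t4: client clock when the reply arrived
    """
    # Average of the two one-way differences; exact only if the
    # outbound and return trips take the same amount of time.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Total time on the wire, excluding the server's processing time.
    delay = (t4 - t1) - (t3 - t2)
    return Sample(offset, delay)


def worst_case_error(sample: Sample) -> float:
    """Largest possible error in the offset if the path is fully lopsided."""
    return sample.delay / 2.0


if __name__ == "__main__":
    # Hypothetical numbers: the client is about half a second slow, and
    # the round trip over the network takes about 60 milliseconds.
    s = ntp_sample(t1=100.000, t2=100.530, t3=100.531, t4=100.061)
    print(f"offset ~ {s.offset:.3f} s, delay ~ {s.delay:.3f} s, "
          f"uncertainty ~ +/-{worst_case_error(s):.3f} s")
```

The last function is the part that matters for the "verifiable" problem: the offset is only known to within half the round-trip delay, and over the Internet that delay changes from one exchange to the next.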

It turns out there's a perfectly conventional way to solve this problem: you buy your own stratum 1 time server, and put it on your own network. This eliminates the uncertainties of the Internet, and leaves you vulnerable only to the vagaries of your own network. In most corporate IT environments (and even in many homes), these "local area" networks (LANs) are implemented in a way (called "fully switched") that essentially eliminates any uncertainties in the message delivery time. In this situation, the computers that need to have their clocks accurately set can synchronize to this local stratum 1 time server, and this will satisfy (easily) the regulatory requirements such as those my company faces.

One problem with this approach: it's very expensive. The cheapest stratum 1 time servers work by listening to cell phone signals (which are synchronized to NIST's national time reference) or by listening to GPS satellites (which have atomic clocks on board that are in turn synchronized to NIST's national time reference). These servers cost about $5,000 to $10,000 each — and a company like mine really needs to have two of them, just in case one happens to break.

At the time my company was facing this problem, we really didn't want to spend that kind of money. So we looked for alternative solutions, and found that it's possible to purchase "time receivers" that listen to the aforementioned cell phone signals, GPS satellites, or the WWVB radio station maintained by NIST for the specific purpose of transmitting time information. For a number of reasons, including price, we chose a WWVB receiver. We bought two of these, and connected them to two computers running the Linux operating system. Then we modified the source code for NTP (this is reasonably well documented, and it's open source) to match the WWVB receiver, and installed the modified and recompiled NTP. Voila! We had our own stratum 1 time servers, for a lot less money than the commercial equipment.

Now, being the geek that I am, I just had to have something even better (and cheaper!) at home. It took a couple of years for technology to catch up with what I wanted, but it finally has. Garmin recently came out with a new device called the GPS 18 LVC. This little darlin' is about the size and shape of a couple of drink coasters piled on top of each other; a little disk with a wire attached to it. It's a complete GPS sensor, but the part I care about (for these purposes) is that it has (a) an NMEA-compatible interface, and (b) a pulse-per-second (PPS) output. "NMEA" is the National Marine Electronics Association, and those folks long ago created a standard protocol ("language") for GPS units. The standard NTP source code distribution contains an NMEA driver, so this compatibility makes it easy to interface the GPS 18 with NTP. The bottom line: the GPS 18 LVC has everything you need to create a spectacularly good stratum 1 NTP server. And it costs about $80!
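To give a flavor of what that NMEA "language" looks like, here is a rough Python sketch that checks the checksum on one sentence and pulls the UTC date and time out of it. This is only an illustration, not the NTP driver: the helper names are invented, the sample sentence is made up rather than captured from a GPS 18, and it assumes the receiver is sending the common $GPRMC sentence. The checksum rule (XOR of every character between the "$" and the "*", written as two hex digits) is part of the NMEA format.

```python
from datetime import datetime, timezone
from functools import reduce


def nmea_checksum(body: str) -> int:
    """XOR of every character between '$' and '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def with_checksum(body: str) -> str:
    """Wrap a sentence body as '$body*HH' (hypothetical test helper)."""
    return f"${body}*{nmea_checksum(body):02X}"


def nmea_checksum_ok(sentence: str) -> bool:
    """Return True if the sentence is framed correctly and the checksum matches."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    try:
        return nmea_checksum(body) == int(given[:2], 16)
    except ValueError:
        return False


def parse_gprmc_time(sentence: str) -> datetime:
    """Extract the UTC timestamp from a $GPRMC sentence."""
    if not nmea_checksum_ok(sentence):
        raise ValueError("bad or missing checksum")
    fields = sentence.strip()[1:].partition("*")[0].split(",")
    if fields[0] != "GPRMC" or len(fields) < 10:
        raise ValueError("not a complete GPRMC sentence")
    if fields[2] != "A":  # 'A' = valid fix, 'V' = receiver warning
        raise ValueError("receiver reports no valid fix")
    hhmmss, ddmmyy = fields[1][:6], fields[9]  # drop any fractional seconds
    stamp = datetime.strptime(ddmmyy + hhmmss, "%d%m%y%H%M%S")
    return stamp.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    # Made-up sample sentence; the checksum is computed, not hard-coded.
    sample = with_checksum(
        "GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W"
    )
    print(sample)
    print(parse_gprmc_time(sample).isoformat())
```

Note that the sentence only says which second it is, and it arrives over a serial line some time after that second began. That is why the PPS output matters: the pulse marks the start of the second precisely, and the NMEA text just puts a label on it.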

So I bought a couple of these (to make a redundant time standard). I've designed some simple circuitry (RS-232 line drivers) to allow me to put the GPS receivers up on my roof, and yet still have a stable signal travel down 100 feet or so of wire. I will be implementing NTP with the standard NMEA driver (modified if necessary to suit the GPS 18), and I'll be feeding the PPS output to NTP's PPS driver. In the end, I'll have a stratum 1 NTP server synchronized to one millisecond or better with the NIST national time reference. The total cost of this project, including the dual GPS 18 LVCs, is less than $250 (using existing Linux-based computers as the NTP servers).

If you're interested in the design, I'd be happy to send you a Visio of the schematic. Just drop me a line! And once I've built it and got it running with NTP, I'd also be happy to share the NTP driver sources and configurations...
