Monday, July 6, 2015

Floating point follies...

I ran across this blog post in my morning reading, and as I read it I thought, “Jeez, this is like a litany of all the floating point problems I’ve seen programmers stumble into!”  Two memorable occasions:
  1. While working at a firm that provided hedge fund IT infrastructure as a service, I worked with a team building an algorithmic trading server.  The server implemented a relatively simple algorithm called “pairs trading”.  Part of the calculation involved raising a pair of fixed numbers to a series of powers, looking for a particular relationship between the results.  The powers ran from 1.00 to 1.99, in increments of 0.01.  The programmer's initial approach was to calculate the two fixed numbers raised to the 0.01 power once, then accumulate the result as he iterated through the increments.  He did this in the belief that he'd speed up the calculation by avoiding the power computation (which wasn't actually true on the server the code executed on, though it was true on his laptop).  However, much to his surprise and consternation, there was a considerable error in his computation by the time he got to the end.  He sought my help on this, and we changed a few lines of code to eliminate the accumulation by computing the power directly on each iteration, and the error disappeared (of course).  I was tempted to tell the programmer that the problem was caused by his choice of Visual Basic as the programming language, but that didn't seem fair :)
  2. While working at a different electronic trading firm, we had a problem wherein the cost we computed for a trade didn't match what NASDAQ computed and charged us.  Usually the difference was a few cents and we just ignored it, but one day a trader bought a huge number of shares of a very low-priced stock, and the difference was over $100, enough for my boss to tell me “Go figure that out, damn it!”  When I dug into our calculations, I could find no errors, so I called my peer at NASDAQ, a pleasant fellow named Gary.  He told me he'd run into this a few times before, but had never figured it out.  He sent me the block of code that did the calculations on his end.  Our code was Java; theirs was C++, but in both cases the code used the underlying x86 machine's floating point processor.  After staring at the two pieces of code side-by-side for a while, I realized that we were doing a particular multiplication with the operands in a different order.  We were multiplying shares × price, and they were multiplying price × shares.  When I swapped the order of our multiplication, the difference suddenly disappeared.  I called Gary back and told him what the problem was, and he told me I was crazy: multiplication was associative and that was that!  An hour or so later he called me back and said he'd duplicated my results.  That was one very disillusioned programmer!  After that, he and I had several phone calls (at his request) so I could educate him a bit about floating point numbers.  Despite having a very responsible position on staff in a large exchange, he knew next to nothing about them, most especially about the sorts of things outlined in the blog post linked above.
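
Both effects are easy to reproduce. What follows is only a rough sketch in Python, not the Visual Basic, Java, or C++ from the stories above, and every concrete value in it is a hypothetical stand-in: the base of 1.5, the share counts, the prices, and the 0.003 fee rate. `power_series_drift` compares powers built up by repeated multiplication against powers computed directly for each exponent. In double precision the drift is tiny but nonzero; a narrower numeric type would make it far more visible. `grouping_mismatches` assumes the trade cost involved a third factor, such as a fee rate, which the stories do not say. A lone two-operand product of IEEE 754 doubles gives the same result in either order, so the sketch varies the grouping of three factors instead.

```python
import itertools


def power_series_drift(base, steps=99):
    """Largest relative gap between accumulated and directly computed powers.

    Walks the exponents 1.01, 1.02, ... up to 1 + steps/100. The accumulated
    value is built by repeated multiplication with base**0.01; the direct
    value uses one power computation per exponent.
    """
    step = base ** 0.01   # rounded once, then reused in every multiplication
    acc = base            # base ** 1.00
    worst = 0.0
    for k in range(1, steps + 1):
        acc *= step                      # every multiply adds a rounding error
        direct = base ** (1 + k / 100)   # single power computation
        worst = max(worst, abs(acc - direct) / direct)
    return worst


def grouping_mismatches(shares_list, prices, rate):
    """Return the (shares, price) pairs whose product depends on grouping."""
    found = []
    for shares, price in itertools.product(shares_list, prices):
        left = (shares * price) * rate    # multiply shares by price first
        right = shares * (price * rate)   # multiply price by rate first
        if left != right:
            found.append((shares, price))
    return found


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    print("worst relative drift:", power_series_drift(1.5))
    prices = [n / 100 for n in range(1, 200)]   # $0.01 through $1.99
    shares = [100000.0, 250000.0, 999900.0]
    bad = grouping_mismatches(shares, prices, 0.003)
    print(len(bad), "of", len(shares) * len(prices), "pairs differ by grouping")
```

When the pennies have to match someone else's books, the usual remedy is to avoid binary floating point for money altogether, for example with integer cents or a decimal type, and to agree on the exact order of operations.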
