Thursday, April 20, 2017

Prehistoric computing...

Prehistoric computing...  Ken Shirriff does his usual incredibly nerdy computing archaeology...

Tough choices...

Tough choices...  The algorithmic trading server I worked on a few years ago used three different numeric types to represent quantities of money (all USD, in its case).  We used scaled integers (32 bits, scaled by 100) to represent stock prices (trades, bids, and asks) because that precision was known to be sufficient for the task, and because arithmetic on native integer types is fast.  We used scaled longs (64 bits, scaled by 10,000) for foreign exchange prices (again trades, bids, and asks) for the same reasons.  Finally, we used BigDecimal (Java's built-in arbitrary-precision decimal type) for aggregate calculations, because it was the only built-in decimal type with enough precision, and because calculations like that were relatively infrequent, so the performance penalty wasn't too bad.
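To make the scaled-integer idea concrete, here's a minimal sketch in Java, assuming the two-decimal-place scale described above for stock prices.  The class and method names are illustrative only, not anything from the actual server:

```java
import java.math.BigDecimal;

// Sketch of scaled-integer price handling: prices stored as an int count
// of cents (scale 100), so ordinary arithmetic stays in fast native types.
public class ScaledPrice {
    // "123.45" -> 12345; intValueExact() throws rather than silently
    // truncating if the text has too many decimal places or overflows.
    static int parse(String text) {
        return new BigDecimal(text).movePointRight(2).intValueExact();
    }

    // 12345 -> "123.45"
    static String format(int scaled) {
        return BigDecimal.valueOf(scaled, 2).toPlainString();
    }

    public static void main(String[] args) {
        int bid = parse("123.45");
        int ask = parse("123.50");
        int spread = ask - bid;             // plain int arithmetic: fast
        System.out.println(format(spread)); // 0.05
    }
}
```

The speed comes from the middle step: once the values are in cents, comparisons, sums, and differences are single machine instructions.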

But using several different types for the same logical purpose had very bad consequences for our system's reliability.  Why?  Because for every calculation involving money, anywhere in the system, the programmer had to think carefully about the operands, make sure they were in the same representation, convert them correctly if they weren't, do the calculation, and then convert the result (if necessary) to the desired type.  At the same time, the programmer had to correctly anticipate and handle any possible overflow.
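As an illustration of that bookkeeping, here's a sketch of mixing the two scaled representations described above: a 100-scaled stock price and a 10,000-scaled FX rate.  The helper name, the round-half-up choice, and the example values are my own inventions, not the server's code:

```java
// When a 100-scaled value is multiplied by a 10_000-scaled value, the
// product is scaled by 1_000_000 -- the programmer must track the combined
// scale by hand, divide back down with an explicit rounding rule, and
// guard against overflow, at every single call site.
public class MixedScales {
    // Hypothetical helper: price in hundredths times rate in
    // ten-thousandths, result back in hundredths (round half up,
    // assuming non-negative inputs).
    static long convertToHundredths(int priceHundredths, long rateTenThousandths) {
        // multiplyExact throws ArithmeticException instead of silently
        // wrapping on overflow.
        long product = Math.multiplyExact((long) priceHundredths, rateTenThousandths);
        return (product + 5_000) / 10_000;
    }

    public static void main(String[] args) {
        int stockPrice = 12345;  // $123.45, scaled by 100
        long fxRate    = 10850;  // 1.0850, scaled by 10_000
        // $123.45 * 1.0850 = $133.94 (to the nearest cent)
        System.out.println(convertToHundredths(stockPrice, fxRate)); // 13394
    }
}
```

Every one of those details (the widening cast, the combined scale, the rounding constant, the overflow check) is a place for an oversight to hide.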

So how bad could that be?  Here's a real-world example from that same algorithmic trading server.  Periodically we would look at the bid/ask stack for particular stocks, considering entries within 5% of the last trade price.  For a highly liquid stock that might involve several thousand entries, where each entry was a price and a quantity.  We needed the sum of the extended price (price times quantity) for each entry, and from that an average price for the ask stack and, separately, for the bid stack.  The prices were scaled integers and the quantities plain integers, but their product could exceed what even our scaled longs could hold.  So we had to convert each price and each quantity to a BigDecimal, then multiply to get the extended price.  Then we had to sum those extended prices, separately for the bid and ask stacks.  The quantities could safely be summed in a long, so we did that.  Then we had to convert the summed quantity to a BigDecimal so we could divide the sum of extended prices by it.  The result was a BigDecimal, but we needed a scaled integer, so we had to convert it back, which meant (very carefully!) checking for overflow, and also setting the BigDecimal's rounding mode correctly.  Nearly every step of that process had an error in it, due to some oversight on the programmer's part.  We spent days tracking those errors down and fixing them.  And that's just one of hundreds of similar calculations that one server was doing!
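The steps above can be sketched in Java.  This is a hedged reconstruction with made-up data, assuming 100-scaled int prices and plain int quantities as described; it is not the server's actual code, just the same sequence of conversions:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// The stack-averaging calculation: extended prices go through BigDecimal
// because price * quantity can exceed even the scaled-long range; the
// quantity sum fits safely in a long; the final result must come back
// out as a 100-scaled int, with rounding mode and overflow both handled.
public class StackAverage {
    // Returns the quantity-weighted average price as a 100-scaled int.
    static int averagePriceScaled(int[] prices, int[] quantities) {
        BigDecimal extendedSum = BigDecimal.ZERO;
        long quantitySum = 0;  // a long is safe for the quantity total
        for (int i = 0; i < prices.length; i++) {
            BigDecimal price = BigDecimal.valueOf(prices[i], 2); // unscale
            BigDecimal qty   = BigDecimal.valueOf(quantities[i]);
            extendedSum = extendedSum.add(price.multiply(qty));
            quantitySum += quantities[i];
        }
        // Rounding mode must be chosen explicitly, or divide() throws on
        // a non-terminating result.
        BigDecimal avg = extendedSum.divide(
                BigDecimal.valueOf(quantitySum), 2, RoundingMode.HALF_UP);
        // Back to the scaled-int representation; intValueExact throws
        // rather than silently truncating on overflow.
        return avg.movePointRight(2).intValueExact();
    }

    public static void main(String[] args) {
        int[] prices     = {12345, 12350, 12360}; // $123.45, $123.50, $123.60
        int[] quantities = {500, 300, 200};
        System.out.println(averagePriceScaled(prices, quantities)); // 12350
    }
}
```

Even in this compressed form there are four distinct hazards: the unscaling, the division's rounding mode, the rescaling, and the overflow check on the way back to an int.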

Ideally there would be one way to represent a quantity of money: one that used scaled integers when feasible, and arbitrary-precision numbers when it wasn't.  It's certainly possible to have a single Java class, with a single API, that “wrapped” several internal representations.  This is much easier if instances of the class are immutable, which would be good design practice in any case.  The basic rule for that class at construction time would be to choose the most performant representation that has the precision required.  We could have used such a class on that algorithmic trading server, and that would have saved us many errors – but none of us thought of it at the time...
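Here's a minimal sketch of what such a class might look like.  Everything here is hypothetical (the class name, the 10,000 scale for the fast path, the single operation shown); it's only meant to show the shape of the idea: pick the representation at construction, fall back when arithmetic would overflow:

```java
import java.math.BigDecimal;

// One immutable money type wrapping two internal representations: a
// 10_000-scaled long on the fast path, BigDecimal otherwise.  Callers
// never see which one is in use.
public final class Money {
    private final long scaled;     // value * 10_000, valid when big == null
    private final BigDecimal big;  // arbitrary-precision fallback

    private Money(long scaled, BigDecimal big) {
        this.scaled = scaled;
        this.big = big;
    }

    public static Money of(BigDecimal value) {
        try {
            // Fast path: exactly representable as a 10_000-scaled long.
            return new Money(value.movePointRight(4).longValueExact(), null);
        } catch (ArithmeticException tooBigOrTooPrecise) {
            return new Money(0, value);
        }
    }

    public Money plus(Money other) {
        if (big == null && other.big == null) {
            try {
                return new Money(Math.addExact(scaled, other.scaled), null);
            } catch (ArithmeticException overflow) { /* fall through */ }
        }
        // Slow path: widen to BigDecimal, then re-normalize.
        return of(toBigDecimal().add(other.toBigDecimal()));
    }

    public BigDecimal toBigDecimal() {
        return big != null ? big : BigDecimal.valueOf(scaled, 4);
    }
}
```

The payoff is that the conversion and overflow logic from the earlier example lives in exactly one place, instead of being re-implemented (and re-botched) at hundreds of call sites.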

Note: this is the last post of this series on representing and calculating monetary values in Java.  I've provided a link for the series at right.  If perchance some other semi-coherent thought occurs to me, I reserve the right to add to this mess...