Sunday, April 16, 2017

How about scaled integers for monetary amounts?

A friend recently wondered why I wouldn't simply use scaled integers for monetary amounts.  For instance, if I determined that all I needed was 10 significant integer digits plus 4 decimal places, then I could exactly represent any decimal value within that range by multiplying it by 10,000 and using the resulting integer.  For example, I could represent 382.03 as 3,820,300.  When it came time to present that number to a human, I'd just divide by 10,000.
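Here's a minimal sketch of that conversion in Java, assuming the 4-decimal-place scale from the example.  The class and method names (ScaledMoney, toScaled, toDisplay) are made up for illustration, and it leans on BigDecimal just to do the parsing and formatting:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ScaledMoney {
    // Assumed scale: 4 decimal places, i.e. a scale factor of 10,000.
    static final int DECIMAL_PLACES = 4;

    // Convert a decimal string such as "382.03" into its scaled integer form.
    static long toScaled(String amount) {
        return new BigDecimal(amount)
                .setScale(DECIMAL_PLACES, RoundingMode.UNNECESSARY) // throws if the input has more than 4 places
                .unscaledValue()
                .longValueExact(); // throws if the scaled value does not fit in a long
    }

    // Convert a scaled integer back into a decimal string for display.
    static String toDisplay(long scaled) {
        return BigDecimal.valueOf(scaled, DECIMAL_PLACES).toPlainString();
    }

    public static void main(String[] args) {
        long scaled = toScaled("382.03");
        System.out.println(scaled);            // 3820300
        System.out.println(toDisplay(scaled)); // 382.0300
    }
}
```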

Scaled integers work particularly well for addition and subtraction, and for many financial applications that's the bulk of what they do.  Consider this addition example, unscaled on the left and scaled by 10,000 on the right:

      773.32       7733200
    +  27.99     +  279900
    --------     ---------
      801.31       8013100
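In code, the scaled addition is just an ordinary integer add, because both operands carry the same scale factor.  A small sketch (the class name ScaledAdd is hypothetical):

```java
public class ScaledAdd {
    // Add two amounts that share the same scale factor; no rescaling is needed.
    static long add(long a, long b) {
        return Math.addExact(a, b); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        // 773.32 + 27.99, both scaled by 10,000
        System.out.println(add(7_733_200L, 279_900L)); // 8013100
    }
}
```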

Multiplies aren't quite as lovely, though. The scale factor gets multiplied along with the actual number, so you get a result that has to be divided by the scale factor to get a correctly scaled result. Example:

        4.45            44500
    x   6.02     x      60200
    --------     ------------
      26.789       2678900000 rescaled to 267890
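A sketch of that rescaling step, assuming a scale factor of 10,000 and non-negative operands (the rounding here is half-up, which is my choice for the illustration, not something the example above dictates):

```java
public class ScaledMultiply {
    // Assumed scale factor: 4 decimal places.
    static final long SCALE = 10_000L;

    // The raw product carries the scale factor twice, so divide it out once.
    // Assumes non-negative inputs; rounds half up.
    static long multiply(long a, long b) {
        long raw = Math.multiplyExact(a, b);          // throws ArithmeticException on overflow
        return Math.addExact(raw, SCALE / 2) / SCALE; // rescale with rounding
    }

    public static void main(String[] args) {
        // 4.45 x 6.02, both scaled by 10,000
        System.out.println(multiply(44_500L, 60_200L)); // 267890
    }
}
```

Note that the intermediate product needs roughly twice as many digits as either operand, so it can overflow well before the final result does.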

And then there's division, where the scale factor essentially is canceled out – requiring you to multiply the result by the scale factor to re-scale it.

      773.32       7733200
    /  27.99     /  279900
    --------     ---------
       27.63         27.63 rescaled to 276300 (results are rounded)
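One common variation, sketched here under the same assumptions (scale factor of 10,000, positive operands, half-up rounding), is to multiply the dividend by the scale factor before dividing, rather than rescaling the rounded quotient afterward.  That keeps all four decimal places, so this sketch yields 27.6284 where the example above shows the rounded 27.63:

```java
public class ScaledDivide {
    // Assumed scale factor: 4 decimal places.
    static final long SCALE = 10_000L;

    // Division cancels the scale factor, so pre-multiply the dividend to restore it.
    // Assumes positive inputs; rounds half up.
    static long divide(long a, long b) {
        long numerator = Math.multiplyExact(a, SCALE); // throws ArithmeticException on overflow
        return Math.addExact(numerator, b / 2) / b;
    }

    public static void main(String[] args) {
        // 773.32 / 27.99, both scaled by 10,000
        System.out.println(divide(7_733_200L, 279_900L)); // 276284, i.e. 27.6284
    }
}
```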

If these numbers at our desired precision all fit into a native integer type, this would be a bit unwieldy, a little less performant than native, but workable.  In an earlier post I figured that we needed a range that encompassed at least 30 decimal digits just to represent amounts of money.  The binary equivalent of 30 decimal digits is about 100 bits.  The largest native integer in Java (the long) has 63 significant bits – not even close.
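The bit counts are easy to check.  This little sketch (class and method names are mine) asks BigInteger how many bits the largest n-digit decimal number needs, and compares that with what a long offers:

```java
import java.math.BigInteger;

public class BitsNeeded {
    // Number of bits needed to hold the largest value with the given count of decimal digits.
    static int bitsFor(int decimalDigits) {
        return BigInteger.TEN.pow(decimalDigits).subtract(BigInteger.ONE).bitLength();
    }

    public static void main(String[] args) {
        System.out.println(bitsFor(30));                             // 100
        System.out.println(Long.SIZE - 1);                           // 63 (one bit is the sign)
        System.out.println(String.valueOf(Long.MAX_VALUE).length()); // 19 decimal digits
    }
}
```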

Well, what if we used two longs?  That would give us 126 significant bits, which is plenty of room.  Addition and subtraction are still simple with this scheme.  Multiplication is a bit harder, but still workable.  Division is a bear, though, and substantially slower than a native implementation.  Those aren't necessarily deal-killers, just a consideration.  A similar issue arises from the fact that with this scheme it takes 16 bytes to store any number.  That's expensive in databases, mass storage, network transmission, and CPU cache (for any application that uses lots of values, i.e. most of them).  But still not necessarily a deal-killer.
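To give a feel for why addition stays simple, here is a sketch of adding two values that are each stored as a pair of longs.  The layout is an assumption of mine for illustration (a signed high word plus a low word treated as unsigned), not the only way to do it, and the sketch makes no attempt to detect overflow of the high word:

```java
public class TwoLongAdd {
    // Hypothetical layout: result[0] is the high 64 bits (signed),
    // result[1] is the low 64 bits (treated as unsigned).
    static long[] add(long aHi, long aLo, long bHi, long bLo) {
        long lo = aLo + bLo; // wraps modulo 2^64
        // If the low word wrapped around, the unsigned sum is smaller than an operand.
        long carry = Long.compareUnsigned(lo, aLo) < 0 ? 1L : 0L;
        long hi = aHi + bHi + carry; // overflow of the high word is NOT checked here
        return new long[] {hi, lo};
    }

    public static void main(String[] args) {
        // (2^64 - 1) + 1 = 2^64, which is hi = 1, lo = 0
        long[] sum = add(0L, -1L, 0L, 1L);
        System.out.println(sum[0] + ", " + sum[1]); // 1, 0
    }
}
```

Multiplication and division need considerably more machinery than this (partial products, multi-word long division), which is where the complexity and the performance cost come from.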

But, as my mother-in-law would say, there's a worser problem with scaled integers.  It derives from the fact that you don't just represent monetary values in an application – you also do math with them.  I've used the simple example of multiplying price times quantity to get extended price, but many financial applications do much more than such simple math.

Just to pick one example out of my checkered past: I once was charged with building applications that modeled the performance of complex bonds (that is, those with fancy terms in them, not just simple interest), over wide ranges of multiple environmental variables (LIBOR rate, inflation rate, etc.) in combination with each other.  These models had multi-dimensional tables with millions of entries, each of which contained a calculated probable value.  In some of the models I built, these values could be as small as 10^-10, with around 8 significant digits.  That's not something exotic and unusual, either – it's a perfectly normal sort of financial application.
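To see what happens to a value like that under the 4-decimal-place scheme, consider this sketch.  The sample value 1.2345678E-10 is made up to match the description (about 10^-10, with 8 significant digits), and the names are hypothetical:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class SmallValueDemo {
    // Scale a decimal string to the given number of decimal places, rounding half up.
    static long toScaled(String value, int decimalPlaces) {
        return new BigDecimal(value)
                .setScale(decimalPlaces, RoundingMode.HALF_UP)
                .unscaledValue()
                .longValueExact();
    }

    // Number of decimal places needed to hold the value without losing digits.
    static int placesNeeded(String value) {
        return new BigDecimal(value).stripTrailingZeros().scale();
    }

    public static void main(String[] args) {
        String sample = "1.2345678E-10"; // hypothetical model output
        System.out.println(toScaled(sample, 4)); // 0: every significant digit is lost
        System.out.println(placesNeeded(sample)); // 17
    }
}
```

With only 4 decimal places the value silently becomes zero; keeping all 8 significant digits would take 17 decimal places.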

Here's the real point, though: financial applications need to do math with money, and we really can't predict what the range of numbers they'll need will be.  This point has been driven home for me by a number of bad experiences out in that pesky real world.  Every application I've ever worked on that used a fixed point numeric representation (of which scaled integers are one example) has run into problems with the range of numbers it could represent.  The failure modes can be very bad, too, especially if the fixed point implementation isn't good at catching overflows (and many of them don't even try, because of the performance penalty).
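As an illustration of that failure mode in Java: plain long multiplication wraps around silently on overflow, while Math.multiplyExact throws.  The sketch below (names are mine) multiplies two amounts of one billion, each scaled by 10,000, so the raw product is 10^26 and cannot fit in a long:

```java
public class OverflowDemo {
    // Plain multiplication: overflow wraps around silently and yields a wrong value.
    static long uncheckedMultiply(long a, long b) {
        return a * b;
    }

    // Checked multiplication: reports whether the product would overflow a long.
    static boolean overflows(long a, long b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long oneBillionScaled = 10_000_000_000_000L; // 1,000,000,000.0000 scaled by 10,000
        System.out.println(overflows(oneBillionScaled, oneBillionScaled));         // true
        System.out.println(uncheckedMultiply(oneBillionScaled, oneBillionScaled)); // a wrapped, meaningless number
    }
}
```

The checked version costs an extra test on every operation, which is the performance penalty that tempts implementations to skip it.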

This hard stop on the range of numeric values held is the real deal-killer for me with fixed point representations.  The performance and size issues just make it a little bit worse.  In my opinion, fixed point representation and manipulation of monetary values is a dangerous source of fragility in financial applications.  Further, it's one that is very difficult to repair – once the decision to use a particular fixed point representation is made, that decision creates tendrils of dependency that find their way into every nook and cranny of the application.

So how do you avoid this?  There's a good solution, but it comes with its own costs: decimal floating point.  That will be the subject of a few more posts...
