Sunday, November 1, 2009

Computers Suck at Math...

Well, not really.  But they do make mistakes that surprise most people, including most professional programmers.  The basic problem is the way that computers usually represent numbers internally – with something called “binary floating point numbers”.  This article does a reasonably good job of explaining the problems in layman's terms.

The most common place that errors in computer math show up is in financial calculations, where it's not unusual to need 15 or more digits of precision, and where calculations are often done repetitively (as in adding a large column of numbers).  Just to give a simple example, if the computer represents $0.01 as $0.00999934821, and you add enough of those pennies together, your result will be off by more than a penny.  Even though a computer did the math.
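Here's a rough Python sketch of that penny problem, assuming the usual 64-bit binary floats.  The names (sum_pennies, stored_penny, drift) are just made up for illustration.  The real error per penny is far smaller than in my example figure above, but it piles up the same way:

```python
from decimal import Decimal
from fractions import Fraction


def sum_pennies(count):
    """Add 0.01 to a running binary floating point total `count` times."""
    total = 0.0
    for _ in range(count):
        total += 0.01
    return total


# Decimal(float) converts exactly, so this shows the value the computer
# really stores when you write 0.01.
stored_penny = Decimal(0.01)

# Fraction(float) is exact too; 1/100 has no finite binary expansion,
# so the stored value cannot be exactly one hundredth.
penny_is_exact = Fraction(0.01) == Fraction(1, 100)

# A million pennies should come to exactly 10000 dollars.
total = sum_pennies(1_000_000)
drift = abs(Decimal(total) - Decimal(10000))

if __name__ == "__main__":
    print("stored penny:", stored_penny)
    print("penny is exact:", penny_is_exact)
    print("float total:", total)
    print("drift from 10000:", drift)
```

The drift after a million additions is tiny, but it grows with the number of operations and the size of the amounts involved.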

There are other ways that computer math can go horribly wrong, too.  All of these are well-known and covered by extensive literature, yet programmers keep making the same mistakes over and over again.  The basic error is to choose binary floating point to represent money (or other critical values).  Other representations exist.

So why do programmers keep doing this?  I've asked that question of programmers working for me on numerous occasions.  The most common response, by far, is something along the lines of “These errors only show up because the incorrect rounding was used, and I'm using the correct rounding.”  That's just plain wrong, though most programmers appear to be unaware of this.  But probably the most important reason programmers keep using binary floating point numbers is that it's easy, and often the problems with that choice don't show up until after a software system has been in use for a while.
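One way to see why “correct rounding” doesn't rescue you: by the time you round, the value is already off.  A small Python sketch, again assuming standard 64-bit binary floats (the variable names are mine, purely for illustration):

```python
from decimal import Decimal, ROUND_HALF_UP

# The literal 2.675 cannot be stored exactly in binary; the stored value
# is slightly below 2.675, so rounding to two places goes down, not up.
float_rounded = round(2.675, 2)

# The exact value the float actually holds.
stored_value = Decimal(2.675)

# Built from a string, the decimal value is exactly 2.675, so
# round-half-up to two places behaves the way an accountant expects.
decimal_rounded = Decimal("2.675").quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)

if __name__ == "__main__":
    print("float rounded:", float_rounded)
    print("stored value:", stored_value)
    print("decimal rounded:", decimal_rounded)
```

The rounding rule did exactly what it was told.  The trouble is the number it was handed.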

The fact is that binary floating point numbers cannot represent all real world numbers with sufficient accuracy for all uses.  It's not a rounding problem, it can't be “worked around” – they're simply not up to the job.  There are many alternative ways to represent numbers in a computer that do not have any of the problems of binary floating point.  These methods are a teensy bit harder to use in most programming languages, and not at all harder in some.  There is, in most cases, a small performance penalty – irrelevant in all but the most compute-intensive algorithms.
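To make the alternatives a bit more concrete, here's a minimal Python sketch of two common ones: decimal arithmetic and plain integer counts of cents.  The function names are hypothetical, and I'm assuming money amounts with a fixed two decimal places:

```python
from decimal import Decimal


def sum_pennies_decimal(count):
    """Add one cent `count` times using decimal arithmetic."""
    penny = Decimal("0.01")  # built from a string, so it is exactly 0.01
    total = Decimal("0.00")
    for _ in range(count):
        total += penny
    return total


def sum_pennies_cents(count):
    """Add one cent `count` times, storing money as an integer of cents."""
    total_cents = 0
    for _ in range(count):
        total_cents += 1
    return total_cents


if __name__ == "__main__":
    # 100,000 pennies should be exactly 1000.00 dollars either way.
    print(sum_pennies_decimal(100_000))
    cents = sum_pennies_cents(100_000)
    print(f"{cents // 100}.{cents % 100:02d}")
```

Both versions land on exactly 1000.00.  The decimal version is the “teensy bit harder” one: you have to remember to build values from strings rather than from floats.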

So be careful with those binary floating point numbers, y'all...
