Why you should take statistics with a pinch of salt
I occasionally use the arithmetic mean approach to timing methods. It's quite simple and goes something like this:
long startTime = System.currentTimeMillis();
for (int i = 0; i < 1000; i++) {
    methodUnderTest();
}
// Divide by 1000.0 (not 1000) so the average is not truncated to whole milliseconds
System.out.println("Avg. method execution time (ms)=" +
        ((System.currentTimeMillis() - startTime) / 1000.0));
Now this may not be the best approach, but it works for me. I had assumed that if you used a large enough sample, you could get pretty accurate results. Except, this time, while I was demonstrating the result to Bill Hubertus, a disturbing thought struck me. Statistical approaches such as this could so easily be misused by people, if they so desired. This was hardly an epiphany though. In an instant I recalled all those J2EE vs. .NET showdowns and imagined crooked old nerds concocting their slimy recipes to skew the results one way or the other. But no, this action hero is not out to rid the earth of its scum. So back to our story...
In fact, the arithmetic mean approach itself is awfully flawed. For those who took a photography class in place of statistics, here's a refresher: the arithmetic mean is the standard average, in that you take the sum of all the numbers in a sequence and divide it by the number of items in that sequence. If you have a sequence of {1, 2, 3, 4, 5}, the mean is 3. Looks good, eh? But if you calculated the mean net worth of, say, Bill Gates, Warren Buffett and Ashish Shetty, you'd be misled into believing all three were members of the Forbes 500 club. For the record, it would be money lost for Mr Gates and Mr Buffett if they even bent down to pick up a bag of cash equivalent to my annual paycheck from the pavement. The arithmetic mean, then, lets a few extreme values skew the result higher than what is typical. That in itself is not a big problem for my test above. All an occasional slow run does is show a higher execution time; not quite as evil but misleading nevertheless.
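To see the skew in timing terms, here is a minimal sketch in Java. The numbers are made up for illustration, not measured: the class name MeanSkew is hypothetical, and the single 500 ms sample stands in for something like a garbage-collection pause.

```java
final class MeanSkew {
    // Arithmetic mean: sum of all samples divided by the number of samples.
    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    public static void main(String[] args) {
        // Hypothetical per-call timings in milliseconds (illustrative only).
        long[] typical = {10, 11, 9, 10, 10};
        long[] withOutlier = {10, 11, 9, 10, 500};

        System.out.println("mean without outlier = " + mean(typical));     // 10.0
        System.out.println("mean with outlier    = " + mean(withOutlier)); // 108.0
    }
}
```

Four of the five calls took about 10 ms, yet one slow call is enough to make the "average" call look ten times slower.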
The median is another way to look at statistics, perhaps a better one. It is the number that sits exactly midway in a sequence of numbers once they are sorted. It is the value below which 50% of the scores fall (and, need I add, the value above which 50% of the scores fall). When there is an even number of scores, the median is the mean of the two centermost scores.
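Applied to method timing, the median might look something like the sketch below. It assumes each call is timed individually with System.nanoTime() rather than timing the whole loop, so there are samples to sort. The class name MedianTiming and the body of methodUnderTest() are placeholders of mine, not anything from a real project.

```java
import java.util.Arrays;

final class MedianTiming {
    // Median: middle value of the sorted samples, or the mean of the
    // two centermost values when the number of samples is even.
    static double median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Placeholder for whatever is being measured.
    static void methodUnderTest() {
        Math.sqrt(12345.678);
    }

    public static void main(String[] args) {
        int runs = 1000;
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            methodUnderTest();
            nanos[i] = System.nanoTime() - start; // one sample per call
        }
        System.out.println("Median execution time (ns) = " + median(nanos));
    }
}
```

For a sample such as {10, 11, 9, 10, 500} the median is 10, so a single freak run no longer moves the reported figure.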
And then there is the mode. It is a measure of frequency of occurrence. So the most frequently occurring number in a sequence is the mode. I do not know if this is practical in timing methods.
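For what it's worth, here is a rough sketch of how a mode could be computed over timings. It assumes the samples are already rounded to something coarse like whole milliseconds, since at nanosecond resolution hardly any two samples would match. The class name ModeTiming, the sample values, and the tie-breaking rule (smallest value wins) are all my own choices for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class ModeTiming {
    // Mode: the most frequently occurring value.
    // Ties are broken in favor of the smallest value (an arbitrary choice).
    static long mode(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        Map<Long, Integer> counts = new HashMap<>();
        for (long s : samples) {
            counts.merge(s, 1, Integer::sum); // count occurrences of each value
        }
        long best = samples[0];
        int bestCount = 0;
        for (Map.Entry<Long, Integer> e : counts.entrySet()) {
            long value = e.getKey();
            int count = e.getValue();
            if (count > bestCount || (count == bestCount && value < best)) {
                best = value;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical per-call timings, rounded to whole milliseconds.
        long[] millis = {10, 11, 10, 9, 10, 500, 11};
        System.out.println("Mode (ms) = " + mode(millis)); // 10
    }
}
```

Whether this tells you anything the median does not is another matter; it depends heavily on how coarsely the timings are rounded.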
So the next time you see a report throwing statistical inswinging yorkers your way, think of this: 98% of all heroin addicts started out by drinking milk. Go figure! And my personal favorite (I use this to discredit most claims): 47.5% of all statistics are made on the spot.
If you have any pet tricks for measuring method execution times reliably, please comment. But keep them honest.