Reproducibility and Precision of the Numbers
This is my first post to this forum, so please be gentle, even if this post is a bit of a rant, sorry.
My problem with the test suite is that numbers like 11632.56 mips are very problematic from a scientific/statistical point of view. They suggest a precision and reproducibility on the order of 10^-6 (seven significant digits), which the test suite surely can't guarantee. These numbers are even more problematic since no error bounds are given.
So my advice would be to run every test several times on different setups and machines in order to
1) check whether the error really follows a known and expected distribution (Gaussian or Poisson, for example),
2) estimate the percentage error from that,
3) at least mention the error in the verbatim results, and
4) round the numbers in the charts to that range of error.
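To make points 2) to 4) concrete, here is a rough Python sketch of what I mean. The readings are made up for illustration, the function names are my own, and using the standard error of the mean as "the error" is just one reasonable choice that assumes the runs are independent and roughly Gaussian.

```python
import math
import statistics

def summarize_runs(samples):
    """Return (mean, standard error of the mean, relative error in percent)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate an error")
    mean = statistics.mean(samples)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean, std_err, 100.0 * std_err / abs(mean)

def round_to_error(value, error):
    """Round value and error to the decimal place of the error's leading digit."""
    if error <= 0:
        return value, error
    digits = -math.floor(math.log10(error))
    return round(value, digits), round(error, digits)

def format_result(value, error, unit="mips"):
    """Format a result as 'value +/- error unit' with matching decimals."""
    value, error = round_to_error(value, error)
    if error <= 0:
        return f"{value} {unit}"
    decimals = max(-math.floor(math.log10(error)), 0)
    return f"{value:.{decimals}f} +/- {error:.{decimals}f} {unit}"

# Made-up readings from five hypothetical runs of the same test.
runs = [11632.56, 11598.12, 11701.40, 11655.03, 11610.88]
mean, err, rel = summarize_runs(runs)
print(format_result(mean, err), f"({rel:.2f} % relative error)")
```

With a spread like this, only the first three or four digits of the result carry any meaning, so the chart should show a rounded value with its error instead of two decimal places.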
I think this would really improve the quality of the results and their trustworthiness, which is IMHO an important issue for benchmarks. If the results don't follow an expected distribution, it would surely be even more interesting to investigate whether a systematic error is spoiling the test.
Just giving a result like 11632.56 mips is, sorry to say, simply unscientific, if not wrong.
Last edited by furanku; 01-15-2010 at 07:04 AM.