Following up on the post I just made (brought to you by 1 Minute Edit Limit):
Ideas for ways to reduce download size!
1. If a test is based on a real game, the full game distribution has many, many assets that the benchmark never uses. On a per-game basis, figure out which assets can be excluded and just rm them. If the game freaks out because it can't find every asset it expects, find out how to bypass that. If it's open source, there's always a way; if it's closed source, you can give up if the check for those resources seems to be built into the game binaries with no bypass.
2. For all tests (except maybe "untarring the Linux kernel"), switch compression over to LZMA2, using the .tar.xz file format. This codec has an excellent compression ratio and very fast decompression. The only drawback of LZMA2 is compression time, but that's a one-time cost paid when the test is written, so you can be patient.
3. Strip out docs and help files from test source distributions. If compiling from source, you can (hopefully) use something equivalent to ./configure --disable-docs to conveniently tell it not to worry about docs. I guess the only doc you need to leave in is the license file, for legality's sake. But certain programs ship megabytes of plaintext, HTML- or XML-based docs.
4. Make sure that tests which use the same program also use the same version of it, across as many tests as possible (especially those in popular suites). You don't want to download foo-1.4.0.tar.xz for one test, then foo-1.4.1.tar.xz for the next, especially when this can be avoided within a suite (since most users will run suites, not individual tests).
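To make points 2 and 3 a bit more concrete, here is a rough Python sketch of repacking an existing tarball as .tar.xz while dropping documentation directories and keeping license files. Everything named here (repack_as_xz, is_doc_path, is_license, and the DOC_DIRS list) is made up for illustration and is not part of the test suite; which directories actually count as docs would have to be decided per program.

```python
import tarfile

# Assumption: directory names treated as documentation. Adjust per program.
DOC_DIRS = {"doc", "docs", "help", "man"}


def is_doc_path(name):
    """Return True if any path component looks like a docs directory."""
    return any(part.lower() in DOC_DIRS for part in name.split("/"))


def is_license(name):
    """Return True for files whose base name starts with LICENSE or COPYING."""
    base = name.rsplit("/", 1)[-1].upper()
    return base.startswith(("LICENSE", "COPYING"))


def repack_as_xz(src_path, dst_path):
    """Copy a tarball (any compression) into a new .tar.xz, skipping docs.

    Returns the list of member names that were kept.
    """
    kept = []
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:xz") as dst:
        for member in src:
            # Drop anything under a docs directory, except license files.
            if is_doc_path(member.name) and not is_license(member.name):
                continue
            # Only regular files carry data; dirs and links are header-only.
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
            kept.append(member.name)
    return kept
```

Calling repack_as_xz("foo-1.4.0.tar.gz", "foo-1.4.0.tar.xz") would write the slimmed-down archive next to the original. If the build system expects the docs directory to exist, the build would still need something like the --disable-docs switch mentioned in point 3.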
Also, while disk space isn't a concern on my desktop, it is on my laptop. It's much harder to run a graphics benchmark on my laptop with such space constraints when the tests are so huge (especially the ones that compile from source; all the intermediate objects plus the final build are quite massive).
If I had an uncapped connection, I would happily run tests on my desktop where space isn't an issue.
Maybe you can add a parameter to openbenchmarking.org letting users query tests by test download size?
