Originally Posted by Ren Höek
Not really...
If you have 1% Linux users, the probability of drawing a Windows user is 0.99 = 99%.
If you draw two users, the probability that both are Windows users is 0.99^2 = 98.01%.
If you draw 1000 users, the probability that all of them are Windows users is 0.99^1000 = ~0.0043%.
For 5 million, it's ~0% and not "quite possible"...
The probability is even lower than that, since you aren't placing users back in the pool once you've picked one (otherwise the same account could be picked twice per month). Only the first user has a 99% probability of being a Windows user. With 50 million accounts and 1% Linux users, the 5 millionth pick has only a 98.9% probability of being the 5 millionth Windows user in a row (44,500,001 Windows users left among 45,000,001 remaining accounts). The probability of picking 5 million Windows users in a row is NaN.
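For anyone who wants to check the numbers, here is a rough Python sketch of both cases. It works with log10 of the probability, because the probability itself is far too small for a normal float. The 50 million accounts, 1% Linux share and 5 million picks are just the figures used in this thread, not anything Valve has published, and the function names are made up for this example.

```python
import math

def log10_all_windows_with_replacement(p_windows, draws):
    """log10 of p_windows**draws (users put back in the pool after each pick)."""
    return draws * math.log10(p_windows)

def log10_all_windows_without_replacement(windows, total, draws):
    """log10 of C(windows, draws) / C(total, draws) (no account picked twice)."""
    if draws > windows:
        return float("-inf")  # impossible: not enough Windows users

    def log_comb(n, k):
        # Natural log of the binomial coefficient, via lgamma to avoid overflow.
        return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

    return (log_comb(windows, draws) - log_comb(total, draws)) / math.log(10)

def prob_kth_pick_is_windows(windows, total, k):
    """Chance the k-th pick is a Windows user, given the first k-1 all were."""
    already_picked = k - 1
    return (windows - already_picked) / (total - already_picked)

# Figures assumed in this thread (hypothetical, not official numbers).
TOTAL = 50_000_000
WINDOWS = 49_500_000   # 99% of TOTAL
DRAWS = 5_000_000

if __name__ == "__main__":
    print(10 ** log10_all_windows_with_replacement(0.99, 1000))
    print(log10_all_windows_with_replacement(0.99, DRAWS))
    print(log10_all_windows_without_replacement(WINDOWS, TOTAL, DRAWS))
    print(prob_kth_pick_is_windows(WINDOWS, TOTAL, DRAWS))
```

The two log10 values are the exponents of the probabilities, so a result of -20000 would mean a probability of 10^-20000. The without-replacement exponent comes out more negative than the with-replacement one, which is the point made above.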

Which is precisely why the result for a large enough sample size will only deviate by ~1% from the "correct" result. As someone above said, if Valve were using too small a sample, you would see large swings in the results from month to month. And even in the astronomically unlikely event of picking only Windows users in a single month... try repeating that for several months in a row...
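To put a rough number on how much a sampled share can wander, here is a small sketch using the textbook standard error of a proportion, with the finite population correction for sampling without replacement. It assumes a simple random sample, which may or may not be how the survey actually picks users, and the function name and figures are only for illustration.

```python
import math

def share_standard_error(p, sample_size, population=None):
    """Standard error of a sampled share p for a simple random sample.

    If population is given, the finite population correction is applied,
    which accounts for drawing without replacement.
    """
    se = math.sqrt(p * (1 - p) / sample_size)
    if population is not None:
        se *= math.sqrt((population - sample_size) / (population - 1))
    return se

if __name__ == "__main__":
    # A 1% Linux share in a hypothetical pool of 50 million accounts.
    for n in (1_000, 100_000, 5_000_000):
        print(n, share_standard_error(0.01, n, population=50_000_000))
```

The standard error shrinks roughly with the square root of the sample size, so a small sample would show visible month-to-month noise while a sample in the millions would not.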

Really, stop fooling yourself that the results would be any better. In any case, why would Valve introduce a bias that favors Windows users, now that they have committed themselves to producing a Linux-based gaming platform? If anything, they would be favoring Linux users to make the platform more attractive to their business customers (i.e. publishers) if they were manipulating the results.