Homer,
With the data, would it be best to randomize it before pulling your sets, so you could avoid picking "bad runs" or "good runs"? Or, since the numbers will be run against themselves several times, will that remove the error by itself?
Thanks man.
When you sample from your original data - you should pull the data at random. Myself, I used random numbers from random.org because, unlike computer-generated random numbers, they're drawn from atmospheric noise that is thought to be truly random (due to turbulence). Computer-generated random numbers are just pseudo-random (although their period is extremely long).
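If it helps to see it concretely, here's a minimal Python sketch of pulling a sub-sample at random. It uses Python's built-in pseudo-random generator rather than digits from random.org, and the win-percentage list is just a made-up placeholder, so treat it as illustrative only:

```python
import random

# Placeholder stand-ins for the 19 actual win-percentage values (NOT the real data).
win_pcts = [round(random.uniform(0.2, 0.95), 2) for _ in range(19)]

# Draw 12 of the 19 points at random, without replacement.
# (I used random.org digits for this step; random.sample is the pseudo-random equivalent.)
sub_sample = random.sample(win_pcts, 12)
print(sub_sample)
```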
Anyhow, before I even did the analysis, my intuition was shaped by the fact that we only have 19 data points. All things considered, that's not a lot of data. Furthermore, there is definitely a bit of spread in the data - thus, my intuition was that the reliability of our results would come with some definite error.
Here's what I did: I used the naïve choice of 12 sample values per draw and created an ensemble of 10 sub-sampled data sets. People can fairly attack these choices - they're far from optimized for the given data set ... but they seem adequately reasonable.
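For concreteness, here's a rough Python sketch of that procedure as a whole - 10 sub-samples of 12 points each, with a mean and (sample) standard deviation computed for each. Again, the win-percentage list below is a placeholder, not the actual data:

```python
import random
import statistics

# Placeholder stand-ins for the 19 actual win-percentage values (NOT the real data).
win_pcts = [round(random.uniform(0.2, 0.95), 2) for _ in range(19)]

n_subsamples = 10   # size of the ensemble
sample_size = 12    # points drawn per sub-sample

for i in range(1, n_subsamples + 1):
    sub = random.sample(win_pcts, sample_size)   # draw without replacement
    mean = statistics.mean(sub)
    stdev = statistics.stdev(sub)                # sample (n-1) standard deviation
    print(f"sub-sample {i}: mean = {mean:.4f} stdev = {stdev:.4f}")
```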
Anyhow, here is what I found:
For the original data set of 19 data points: mean = 0.5887 stdev = 0.2065
sub-sample 1: mean = 0.5926 stdev = 0.2400
sub-sample 2: mean = 0.6206 stdev = 0.2576
sub-sample 3: mean = 0.7319 stdev = 0.1283
sub-sample 4: mean = 0.5479 stdev = 0.2310
sub-sample 5: mean = 0.6550 stdev = 0.2330
sub-sample 6: mean = 0.5810 stdev = 0.1403
sub-sample 7: mean = 0.6101 stdev = 0.2345
sub-sample 8: mean = 0.5880 stdev = 0.0878
sub-sample 9: mean = 0.6333 stdev = 0.1757
sub-sample 10: mean = 0.5985 stdev = 0.1818
Combining everything: we can take the mean of the means now - and that will give us another "measure" of the mean. More importantly, the standard deviation of these means can conceptually be viewed as a measure of the ERROR in the mean.
mean (of the mean values) = 0.615879
Interestingly, for a 13-game season this win percentage lands right at 8 wins (0.615879 × 13 ≈ 8.0).
standard deviation in the sub-sampled mean-values = 0.050349
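Just to show where those two numbers come from, here's the same arithmetic redone in Python on the ten sub-sample means listed above (the sample, n-1, standard deviation is what reproduces the quoted figure, up to the rounding in the table):

```python
import statistics

sub_means = [0.5926, 0.6206, 0.7319, 0.5479, 0.6550,
             0.5810, 0.6101, 0.5880, 0.6333, 0.5985]

print(statistics.mean(sub_means))    # ~0.6159, the mean-of-means quoted above
print(statistics.stdev(sub_means))   # ~0.0503, the "error in the mean" quoted above
```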
The first thing to notice is that this measure is telling us something DIFFERENT from the standard deviation of the original full data set. The original standard deviation is telling us about the spread in the distribution. This standard deviation is more explicitly addressing the reliability of our calculated mean value. The implication of this is rather important.
We could expect a mean value drawn from the underlying distribution to be as high as 0.666228. This would correspond (in a 13-game season) to an expectation value of about 8.66 wins. We could also expect a mean value drawn from the underlying distribution to be as low as 0.565530. This would correspond (in a 13-game season) to an expectation value of about 7.35 wins.
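Spelling that arithmetic out (the only inputs are the two numbers above):

```python
mean_of_means, err = 0.615879, 0.050349

upper = mean_of_means + err   # 0.666228 -> 0.666228 * 13 ≈ 8.66 wins
lower = mean_of_means - err   # 0.565530 -> 0.565530 * 13 ≈ 7.35 wins
print(upper, upper * 13)
print(lower, lower * 13)
```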
Notice how the calculated mean from the full data set nicely fits within the "error bars" supplied above. The value of 0.5887 falls within accepted limits. However, it is interesting to note that it certainly is much closer to the LOWER limit than the HIGHER limit.
Mind you, the above supplies us with a spread in MEAN values. This isn't saying anything (yet) about the variability in the spread of the distribution itself.
mean of the standard deviations = 0.190988 (this is rather close to the 0.2065 drawn from the full data set)
standard deviation of the standard deviations = 0.057315 (as in the case of the mean, this supplies us with information about the ERROR in the spread of the empirical distribution)
Taking that 0.057315 relative to the mean spread of 0.190988 ... this is telling us that we have around 30% error in our measure of how spread out the distribution is. Again, given 19 data points and the variability in the data, this shouldn't really surprise us.
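And the same quick check for the spread numbers, using the ten sub-sample stdevs listed above:

```python
import statistics

sub_stdevs = [0.2400, 0.2576, 0.1283, 0.2310, 0.2330,
              0.1403, 0.2345, 0.0878, 0.1757, 0.1818]

mean_spread = statistics.mean(sub_stdevs)   # ~0.1910 (the 0.190988 quoted above)
err_spread = statistics.stdev(sub_stdevs)   # ~0.0573 (the 0.057315 quoted above)
print(err_spread / mean_spread)             # ~0.30 -> roughly 30% relative error
```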
Given the quality of this data (or lack thereof) - at first blush, I'd likely conclude that we wouldn't have to look much beyond the conceptual statistics here to draw conclusions.
What the data tells us is that in any given year, the Hawks are likely to average anywhere from around 7 wins to 9 wins. The spread in the data isn't terribly helpful ... but it states the obvious: sometimes we can do much better ... and sometimes much worse (kind of a "duh" statement - but one that must be uttered).