Indeterminate Length Determined
Dataset
I've downloaded episodes 523 through 722 of Buzz Out Loud (BOL) to try to determine the show's "indeterminate length". To check some theories on predicting episode length, I joined in historical weather data from Weather Underground to produce a CSV dataset, and created a bol.R script file for the open source statistical tool R.
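The join of episode data with daily weather can be sketched roughly as follows. This is in Python rather than R, and it is not the bol.R script: the column names (`episode`, `date`, `minutes`, `mean_temp_f`) and all of the values are hypothetical placeholders, since the layout of the real CSV is not shown here.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two source files. The column
# names and values are made up for illustration, not taken from the dataset.
episodes_csv = """episode,date,minutes
ep1,2008-01-02,38.5
ep2,2008-01-03,12.0
ep3,2008-01-04,41.2
"""
weather_csv = """date,mean_temp_f
2008-01-02,52
2008-01-03,55
2008-01-04,49
"""

def join_on_date(episodes_text, weather_text):
    """Attach each day's weather columns to the episode released that day."""
    weather_by_date = {
        row["date"]: row for row in csv.DictReader(io.StringIO(weather_text))
    }
    joined = []
    for ep in csv.DictReader(io.StringIO(episodes_text)):
        weather = weather_by_date.get(ep["date"])
        if weather is None:
            continue  # no weather record for that day, so skip the episode
        joined.append({**ep, **weather})
    return joined

rows = join_on_date(episodes_csv, weather_csv)
for row in rows:
    print(row["episode"], row["minutes"], row["mean_temp_f"])
```

Keying the weather rows by date keeps the join to a single pass over the episodes; episodes without a matching weather record are dropped rather than filled in.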
Data Distribution
Here is a histogram of all the episodes.
There is clearly something strange going on with episodes shorter than 20 minutes (episodes 522, 553, 554, 571, 602, 609, 629, 630, 635, and 636). Closer analysis reveals these to be 'special episodes'. Let's ignore these for now, take a look at the filtered data, and examine a QQ plot to see if the data is normally distributed.
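As a rough illustration of the filtering and QQ comparison, here is a minimal Python sketch (not the original R script). The sample lengths are made up, and the plotting positions `(i + 0.5) / n` are one common convention, assumed here rather than taken from the source.

```python
from statistics import NormalDist, mean, stdev

def qq_points(lengths, cutoff=20.0):
    """Return (theoretical, observed) quantile pairs for episodes >= cutoff minutes."""
    regular = sorted(x for x in lengths if x >= cutoff)
    n = len(regular)
    # Normal model fitted to the filtered data.
    model = NormalDist(mean(regular), stdev(regular))
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [model.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, regular))

# Made-up lengths in minutes, for illustration only (not the BOL data).
sample = [8.0, 31.5, 35.0, 36.2, 38.0, 39.4, 41.0, 44.8, 12.5]

for expected, observed in qq_points(sample):
    print(f"{expected:6.1f}  {observed:6.1f}")
```

If the data are close to normal, the two columns track each other; points that drift apart at the low end would correspond to more short episodes than the model predicts.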
Looks pretty normal (though short episodes are more frequent than a normal model would predict). Now we can calculate the mean and standard deviation:
Mean: 37.9 minutes
Standard Deviation: 5.3 minutes
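So there you have it: Buzz Out Loud: CNET's 38 ± 10.6 minute podcast (if the lengths are normal, about 95% of regular episodes fall within two standard deviations of the mean, roughly 27 to 49 minutes).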
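The ±10.6 figure is two standard deviations (2 × 5.3). A short Python check, assuming the episode lengths really do follow a normal model with the mean and standard deviation reported above:

```python
from statistics import NormalDist

MEAN_MIN = 37.9  # mean reported above, in minutes
SD_MIN = 5.3     # standard deviation reported above, in minutes

model = NormalDist(MEAN_MIN, SD_MIN)
low = MEAN_MIN - 2 * SD_MIN
high = MEAN_MIN + 2 * SD_MIN

# Share of a normal distribution lying within two standard deviations.
coverage = model.cdf(high) - model.cdf(low)

print(f"{low:.1f} to {high:.1f} minutes covers {coverage:.1%} of the model")
```

The coverage comes out at a little over 95%, which is where the "~95%" in the tagline comes from.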
Additional Theories
We can also check a few more interesting theories:
Question 1) Are our hosts getting more or less long-winded over time?
Answer 1) Nope!
Question 2) Are Friday shows shorter?
Answer 2) Unexpectedly, Wednesday shows actually appear to be shorter than average, but an ANOVA of the data indicates that the differences between days are not statistically significant.
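For readers unfamiliar with ANOVA, the one-way F statistic compares the variation between weekday means with the variation within each weekday. The following is a minimal Python sketch of that calculation, not the analysis from bol.R; the `by_weekday` values are invented for illustration.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of label -> list of values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up episode lengths in minutes, grouped by weekday (not the BOL data).
by_weekday = {
    "Mon": [38.0, 40.5, 36.0],
    "Wed": [35.0, 37.5, 34.0],
    "Fri": [39.0, 37.0, 41.0],
}

f_stat, df_b, df_w = one_way_anova_f(by_weekday)
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

Turning F into a p-value needs the F distribution, which the Python standard library does not provide; `scipy.stats.f_oneway` computes both in one call if SciPy is available.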
Question 3) Do nice days make for shorter podcasts?
Answer 3) Not quite. The data appear to be correlated, with podcasts getting shorter as the weather gets nicer, but the p-value is about 0.10: a correlation this strong would turn up roughly 10% of the time by chance alone even if there were no real relationship. That misses the conventional 5% threshold, so we can't quite be sure that the correlation between mean temperature and podcast length is significant.
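A correlation test of this kind usually computes Pearson's r and converts it to a t statistic with n - 2 degrees of freedom. The Python sketch below shows that calculation under the assumption that this is the test in question; the source does not say which test bol.R ran, and the temperatures and lengths here are invented.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

# Made-up mean temperatures (F) and episode lengths (minutes), not the BOL data.
temps = [48.0, 52.0, 55.0, 60.0, 63.0, 67.0]
lengths = [41.0, 37.5, 39.0, 36.0, 38.5, 35.0]

r = pearson_r(temps, lengths)
t = t_statistic(r, len(temps))
print(f"r = {r:.2f}, t = {t:.2f} on {len(temps) - 2} degrees of freedom")
```

A negative r means warmer days go with shorter episodes; the p-value then comes from comparing t against Student's t distribution.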
Btrapp 20:57, 16 May 2008 (UTC)