Thursday, 6 July 2017

Lies etc

The snip being taken from a graph in my bread workbook which purports to show, in red, the mean number of days per batch, on the right hand scale, with the mean computed over the previous 50 batches. The blue shows total days since the beginning of bread, on the left hand scale, for these purposes running from day 188, otherwise mid July 2011, to day 2371, a period of nearly six years. Batches along the bottom, starting at batch 51. I dare say a bit more effort would get the labels right.
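For anyone wanting to reproduce the red line outside Excel, here is a minimal sketch in Python. It assumes the workbook boils down to one day number per batch; the function name, the list layout and the sample data are all hypothetical, not taken from the workbook itself.

```python
def trailing_mean_gap(batch_days, window=50):
    """Mean days per batch over the previous `window` batches.

    batch_days[i] is the day number (days since the start of bread) on
    which batch i + 1 was baked. This layout is an assumption.
    """
    means = {}
    for i in range(window, len(batch_days)):
        # The last `window` gaps between batches add up to one difference,
        # so there is no need to sum them one by one.
        means[i + 1] = (batch_days[i] - batch_days[i - window]) / window
    return means  # keyed by batch number, first key is window + 1


# Hypothetical data: 60 batches, one baked every 6 days.
steady = [6 * n for n in range(60)]
red_line = trailing_mean_gap(steady)
print(min(red_line), red_line[51])
```

Written this way, the red line is just the slope of the blue line measured over the last 50 batches, which may be why a blue line that looks straight on a scale of 2,500 can still produce visible humps on a scale of 8.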

But more important, why do we have a blue line which looks very straight but a two humped red line? Two humps over six years? Further investigation needed. Have I blundered?

Which all goes to show that whoever invented the well known phrase about lies and statistics clearly knew his onions.

It also served to remind me of the power of the Excel 'datediff' function (DateDiff in VBA, with a more limited cousin, DATEDIF, available on the worksheet), a function which will compute the interval between two dates, supplied in ordinary calendar format, and give you the answer in days, weeks or whatever you fancy. I dare say I could write such a function for myself, but I hate to think how long it would take to be sure that I had got it right. So I am glad that MS took on this particular bit of work for me.
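The same sort of sum can be sketched outside Excel with Python's standard datetime module, which also leaves the calendar arithmetic to someone else. The start date below is an assumption ("mid July 2011" taken to be the 15th), the end date is the date of this post, and the times of day are made up for illustration.

```python
from datetime import date, datetime

start = date(2011, 7, 15)  # assumed: "mid July 2011" read as the 15th
end = date(2017, 7, 6)     # the date of this post

# Subtracting two dates gives a timedelta, leap years included.
days = (end - start).days
weeks, odd_days = divmod(days, 7)
print(days, weeks, odd_days)

# With times attached, the same subtraction yields hours, minutes, seconds.
t0 = datetime(2017, 7, 6, 9, 0, 0)    # hypothetical times
t1 = datetime(2017, 7, 6, 17, 30, 0)
seconds = int((t1 - t0).total_seconds())
print(seconds // 3600, seconds // 60, seconds)
```

Under that assumed start date the answer comes to 2183 days, the same as 2371 less 188.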

PS: it will also do hours, minutes and seconds if you include times with your dates, but it does not do moons, non trivial as the number of days in a moon, as seen from the earth, varies from about 29.18 to about 29.93. A tricky sort of number, no doubt accounting for all kinds of lunar mumbo jumbo. Perhaps native peoples everywhere should put in a bid for an enhancement.

1 comment:

  1. The wiggly red line is to a scale of 8, so a difference of 1 is large; the blue line is to a scale of 2500, so a difference of 30 is small. This is because you have done the right thing and provided the unadjusted data. I did once see a paper where the correlation lines of several studies had been compared and concatenated. Needless to say, the average line of the average lines was pretty straight: a perfect example of lies (all data is an extraction from reality), damn lies (all statistical methods are designed to "smooth" the bumps from data), and statistics (if you steamroller the data enough times it becomes rock-solid but terribly thin).

    Regarding the Moon, don't forget that, mythologically speaking, he/she is both a trickster and a reliable calendar. Full moon is a good time to go hunting because it's almost as bright as daylight, so you can closely track injured prey for three days continuously. Also lions, hyenas, and other predators that can see well in the dark can be seen by humans in moonlight, so are much less of a threat. The moon is an important calendar which provides its own justification. That may be why the inner circle at Stonehenge consists of 29-and-a-bit stones (cf. Lionel Sims, various dates).
