Weekly links for April 29 (one day late)

From Josh Wills at Cloudera, a post on reservoir sampling.

Evelyn Lamb has compiled a list of mathy ladies to follow on Twitter.

Stephen Wolfram (and presumably part of the army of people working for him) has some interesting visualizations in “Data Science of the Facebook World”.

Brian Hayes maps the Hilbert curve.

Dana Mackenzie at Slate writes on the mathematics of jury sizes. Also at Slate, Phil Plait writes for the Bad Astronomy blog on the analemma.

How to sort comments intelligently and this post on Bayesian methods for multi-armed bandits are part of Cam Davidson-Pilon’s book Probabilistic Programming and Bayesian Methods for Hackers. I found Davidson-Pilon via his list of machine learning counterexamples.

Kenneth Appel (of four colors suffice fame) died.

Reverse Anscombe

At Cross Validated, someone asked why they get wildly different histograms from the same data. The user Glen_b gave an excellent answer built around an example in which data sets that differ from each other only by an added constant have very different-looking histograms. Other commenters suggest using kernel density estimates or cumulative distribution plots, neither of which would fail on this particular question.
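To see how this can happen, here is a minimal sketch with made-up toy data (not Glen_b's example). The `histogram` helper, the bin width of 1, and the six data points are all assumptions chosen for illustration. The bin edges stay fixed at the integers; only the data moves.

```python
import math
from collections import Counter

def histogram(data, bin_width=1.0, origin=0.0):
    """Count observations in bins [origin + k*w, origin + (k+1)*w).

    Returns the counts for every bin from the lowest occupied one
    to the highest occupied one, including empty bins in between.
    """
    counts = Counter(math.floor((x - origin) / bin_width) for x in data)
    lo, hi = min(counts), max(counts)
    return [counts.get(k, 0) for k in range(lo, hi + 1)]

# Toy data: pairs of points straddling the bin edges at 1, 3 and 5.
data = [0.9, 1.1, 2.9, 3.1, 4.9, 5.1]
# The same data with a constant added, so each pair now shares a bin.
shifted = [x + 0.2 for x in data]

flat = histogram(data)      # one point in each of six adjacent bins
spiky = histogram(shifted)  # two points in every other bin

print(flat)
print(spiky)
```

The first histogram looks uniform and the second looks like three separate spikes, even though the two data sets have the same spread and differ only in location. A kernel density estimate or an empirical CDF of the two would simply be shifted copies of each other.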

Anscombe’s quartet comes to mind – four bivariate data sets with the same mean and variance of each coordinate and the same correlation, which look wildly different when plotted. This is sort of a reverse-Anscombe: here data sets that look essentially the same when plotted have wildly different summary statistics.

Weekly links for April 8

From metafilter, mesmerizing visualizations of genetic algorithms.

The paper and pencil cosmological calculator.

Zipfian Academy is offering to train people to become data scientists in twelve intense weeks. (via.)

A prize is on offer for improving prediction of flight delays.

Sebastian Bubeck’s blog on “topics in optimization, probability, and statistics.”

A roundup of 100 statistics blogs.

A tumblr of transit maps. (Yes, not really about math, but it sort of tickles the same part of the brain, no?)

E. O. Wilson on why scientists don’t need math, and Jeremy Fox on why they do.

Pi(e) approximations in practice

Tonight the God Plays Dice art department made blondies!

These are supposed to be made, according to the recipe, in a pan which is an eight-inch square. But we have no such thing. We do have a nine-inch circular pan, though. Will that do?

Well, what matters is that the two pans have the same area – and therefore that the same volume of batter will have the same thickness and cook roughly the same. (If you thought I was going to solve some PDEs and work out how the heat transfers, you haven’t been paying attention.)

A nine-inch circle has area $\pi (9/2)^2 = 81\pi/4$ square inches, which is about 63.62. An eight-inch square, of course, has area 64 square inches. Not bad!

What would it take for this approximation to be exactly correct? This would require that $81\pi/4 = 64$ exactly; solving for $\pi$ gives $\pi = 256/81$, which is often credited as an Egyptian approximation to $\pi$, as it implicitly appears in the Rhind papyrus, an ancient Egyptian document of problems in mathematics. In fact the setting in which this is established there is almost exactly this one – a circle of diameter 9 and a square of side 8 are said to have the same area. See for example these slides for a history of math class by Bill Cherowitzo.

This isn’t the greatest approximation of $\pi$ – in fact $81\pi$ is about 254.47, not 256 – but it has the added “virtue” that 256 is a power of two, and 81 is a power of three. We could write $\pi \approx 2^8/3^4$ – it looks nicer that way, I think.
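For anyone who wants to check the arithmetic, here is a quick sketch. The variable names are mine, and it does nothing more than compare the two pan areas and the value of $\pi$ that would make them equal.

```python
import math

# Areas of the two pans, in square inches.
circle_area = math.pi * (9 / 2) ** 2   # nine-inch round pan: 81*pi/4
square_area = 8 ** 2                   # eight-inch square pan

# The value pi would need to take for the areas to match: 4*64/81 = 256/81.
implied_pi = 4 * square_area / 81
relative_error = (implied_pi - math.pi) / math.pi

print(f"circle: {circle_area:.2f} sq in, square: {square_area} sq in")
print(f"256/81 = {implied_pi:.4f}, off from pi by {relative_error:.2%}")
```

The implied value overshoots $\pi$ by about 0.6%, which is the same relative amount by which the square pan is bigger than the round one. That is well within blondie tolerance.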

And because Internet law forbids me from mentioning food without posting a picture of it:


Weekly links for April 1

From Decision Science News, some ideas on communicating risk to the general public.

The Expression of Emotions in 20th Century Books, via the Wall Street Journal. Over the course of the 20th century, authors in English used fewer “mood” words, and the decline has been stronger in British texts than in American ones.

Is predictive modeling different from interpolation?

Wolfram on Mandelbrot (via Gelman)

Network theory approach reveals altitude sickness to be two different diseases.

27-game streak? For Heat, 50-1 shot

The Fifth problem: math & anti-Semitism in the Soviet Union by Edward Frenkel.

A series from Bloomberg on gerrymandering: part one, two, three, and a couple of graphics.