## Flightstats statistical mumbo-jumbo

From FlightStats, describing a flight that is the first leg of a two-leg itinerary I’m flying in the near future – obviously this is the sort of flight where one wants to know whether it tends to be on time, because one does not like being stuck in Charlotte:

This flight has an on-time performance of 84%. Statistically, when controlling for sample size, standard deviation, and mean, this flight is on-time more often than 95% of other flights.

I didn’t realize one could control for standard deviation and mean.

(Presumably controlling for “sample size” could mean some Bayesian approach, where if there is a small amount of data for a flight they tend to give moderate predictions. This is probably not too influential as

## Weekly links for February 25

Matthew Barsalou does a Bayesian analysis of the deaths of redshirts in Star Trek.

Charles Radin asks why can you stand on ice but not on water? in the Notices of the AMS.

Igor Pak has a blog; his most recent post is on the history of Catalan numbers.

Brian MacDonald has a paper on realignment in the four major sports leagues, with a view towards minimizing the total amount of travel required for teams. At Hockey Prospectus he’s written a three-part series on this paper (emphasizing hockey, of course): part one, part two, part three.

From It’s Okay to be Smart, Drake’s equation applied to finding love.

Rick Durrett writes Cancer modeling: a personal perspective for the Notices of the AMS.

Disease spreads like ripples on a pond, but only if you have the right metric.

## Oscars edition

Nate Silver fivethirtyeights the Oscars. (Yes, that’s a verb.) That is, he predicts who’s going to win Academy Awards tonight by looking at who’s won (or been nominated for) awards previously in this awards season, weighting the results in proportion to how well those results have predicted Oscar results in the past. See also his 2009 and 2011 (behind NYT paywall) attempts at the same, which try to take some other variables into account; Silver seems to believe that he may have overfit, hence the simplification.
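To make the weighting idea concrete, here is a minimal sketch, not Silver’s actual model: each precursor award gets a weight equal to its historical hit rate, and a film’s score is the sum of the weights of the awards it won. The award names, film names, and hit rates below are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical: fraction of past years in which each precursor award's
# winner also went on to win the Oscar. These numbers are invented.
historical_hit_rate = {
    "precursor_1": 0.8,
    "precursor_2": 0.6,
    "precursor_3": 0.4,
}

# Hypothetical: this season's winner of each precursor award.
this_season_winner = {
    "precursor_1": "Film X",
    "precursor_2": "Film Y",
    "precursor_3": "Film X",
}


def weighted_scores(hit_rate, winners):
    """Sum, for each film, the hit rates of the precursor awards it won."""
    scores = defaultdict(float)
    for award, film in winners.items():
        scores[film] += hit_rate[award]
    return dict(scores)


scores = weighted_scores(historical_hit_rate, this_season_winner)
favorite = max(scores, key=scores.get)
print(scores, favorite)
```

A film that sweeps the historically reliable awards comes out ahead of one that wins only the unreliable ones, which is the whole point of weighting by past predictive success.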

Meanwhile, John Lopez of Vanity Fair reports on a 2008 paper by Jonas Krauss, Stefan Nann, Daniel Simon, Kai Fischbach, and Peter Gloor, “Predicting Movie Success and Academy Awards Through Sentiment and Social Network Analysis”; at least at the time, the IMDB comments section gave lots of useful information. But there was no Twitter at the time of the paper (which was based on data from 2006); the folks at Topsy have an Oscars Index.

(I will refrain from predicting, because unlike Nate Silver I don’t have minions to clean the data for me.)

## (Bi-)weekly links for February 18

Larry Wasserman: statistics declares war on machine learning.

Natalie Wolchover at Wired: In Mysterious Pattern, Math and Nature Converge, on random matrix theory.

A draft book by John Hopcroft and Ravi Kannan, CS theory for the information age (large PDF). Used in this CMU course by Venkatesan Guruswami and Ravi Kannan on modern mathematics for computer science, emphasizing high-dimensional geometry, probability, and other non-discrete mathematics.

2^57885161 − 1 is prime, says GIMPS. Liz Landau blogged about it and people at Metafilter talked about it.

Daniel Navarro of the University of Adelaide has a free e-book, Learning statistics with R: A tutorial for psychology students and other beginners.

sarah-marie belcastro writes Adventures in Mathematical Knitting for American Scientist.

## Simpson’s paradox in the wild

Found on Wikipedia by Kate Owens: a chart of education by income and race. At each level of education, white Americans outearn Asian Americans. But overall, Asian Americans outearn white Americans. How does this happen?

The answer, of course, is that Asian Americans have a higher level of education overall. If the two groups had the same overall level of education, white Americans would outearn Asian Americans. It’s an example of Simpson’s paradox in the wild. (Note: one example of Simpson’s paradox at the Wikipedia article involves characters called “Lisa” and “Bart”.)
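Here is a small numerical sketch of how the reversal can happen. The numbers are hypothetical, chosen only to produce the effect; they are not the 2003 data behind the Wikipedia chart, and the groups are labeled A and B to keep that clear.

```python
# Hypothetical mean incomes (in thousands) by group and education level.
# Group A earns more than group B at every level.
income = {
    "A": {"no_degree": 40.0, "degree": 70.0},
    "B": {"no_degree": 38.0, "degree": 68.0},
}

# Hypothetical education mix: fraction of each group at each level.
# Group B is more heavily concentrated at the higher level.
share = {
    "A": {"no_degree": 0.7, "degree": 0.3},
    "B": {"no_degree": 0.3, "degree": 0.7},
}


def overall_mean(group, mix):
    """Mean income of `group` if its education mix were `mix`."""
    return sum(mix[level] * income[group][level] for level in income[group])


# Each group with its own education mix: B comes out ahead overall.
actual_a = overall_mean("A", share["A"])  # about 49
actual_b = overall_mean("B", share["B"])  # about 59

# Both groups given the same (50/50) mix: A comes out ahead again.
common_mix = {"no_degree": 0.5, "degree": 0.5}
standardized_a = overall_mean("A", common_mix)  # 55
standardized_b = overall_mean("B", common_mix)  # 53

print(actual_a, actual_b, standardized_a, standardized_b)
```

The within-level comparison and the overall comparison point in opposite directions only because the two groups are weighted differently across levels; give them a common mix and the paradox disappears.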

The Wikipedia chart is based on 2003 data. I would like to reconstruct it with present-day data, but unfortunately the more recent data does not seem to break out Asian Americans separately.

## Fractal broccoli

Did you know that broccoli is fractal in nature? It’s self-similar – little bits of broccoli look like big bits of broccoli.

To illustrate this, here’s a big piece of broccoli from tonight’s dinner at God Plays Dice headquarters:

And here’s a small piece of broccoli, against a backdrop of a smaller pattern:

They look quite similar!

I’m not the first to notice this: see Fractal Broccoli for the Gardening Geek and Fractal Broccoli with a Macro Lens, which features better photography. But what do you want before dinner?
My art department has a variety of fabric backdrops, mostly from recent quilting pursuits. More about that, perhaps, in a future post.