## Cross Validated thread on intro Bayesian statistics

From Cross Validated (stats.stackexchange.com, a web site which I think deserves to be better known): What is the best introductory Bayesian statistics textbook?

Some of the recommendations from this thread that I’ve seen before:

## Error propagation for Atwood’s machine, by simulation

A few weeks ago I mentioned that the propagation of errors is a bit tricky. Say we want to predict the acceleration in an Atwood machine. The machine consists of a string passed over a pulley with a mass at each end; the masses are M and m, with M > m. The acceleration is given by $a = g{M-m \over M+m}$

where g is the acceleration due to gravity, which we assume is known exactly. Let’s set g = 1, so we’ll have $a = {M-m \over M+m}.$

We previously found by analytic methods that if $M = 100 \pm 1$ and $m = 50 \pm 1$, then $a = (1/3) \pm 0.01$. But it’s instructive to do a simulation.

Specifically, fix some large $n$. For $i = 1, 2, \ldots, n$, let $M_i$ be normally distributed with mean 100 and standard deviation 1; let $m_i$ be normally distributed with mean 50 and standard deviation 1; and let $a_i = (M_i-m_i)/(M_i+m_i)$. Then the mean and standard deviation of the $a_i$ are estimates of the expected acceleration and its error.

This is very easy in R:

    n = 10^4
    M = rnorm(n, 100, 1)   # heavier mass: mean 100, sd 1
    m = rnorm(n, 50, 1)    # lighter mass: mean 50, sd 1
    a = (M-m)/(M+m)        # simulated accelerations (g = 1)

When I ran this code I got mean(a) = 0.3333875, sd(a) = 0.00993982. Furthermore, the computed values of a are roughly normally distributed, as shown by this histogram and Q-Q plot. (The line on the Q-Q plot passes through the point (0, mean(a)) and has slope sd(a).)

This works even if the errors are not normally distributed. For example, we can draw the simulated data from a uniform distribution with the given mean and standard deviation:

    # a uniform distribution on (c - sqrt(3), c + sqrt(3)) has mean c and sd 1
    Mu = runif(n, 100-sqrt(3), 100+sqrt(3))
    mu = runif(n, 50-sqrt(3), 50+sqrt(3))
    au = (Mu-mu)/(Mu+mu)

I got mean(au) = 0.3332557 and sd(au) = 0.009936519. The distribution of the simulated results is a bit unusual-looking.

There’s also a way to compute an approximation to the error of the result using calculus, but simulation is cheap.
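For comparison, here is a sketch of that calculus approach, using the usual first-order error propagation formula. It assumes the errors in M and m are independent and small relative to the masses; the symbols $\sigma_M$, $\sigma_m$ and $\sigma_a$ (the standard deviations of M, m and a) are my notation, not anything defined earlier in the post.

```latex
\begin{align*}
\frac{\partial a}{\partial M} &= \frac{2m}{(M+m)^2}, \qquad
\frac{\partial a}{\partial m} = -\frac{2M}{(M+m)^2}, \\
\sigma_a &\approx \sqrt{\left(\frac{\partial a}{\partial M}\right)^2 \sigma_M^2
  + \left(\frac{\partial a}{\partial m}\right)^2 \sigma_m^2}
  = \frac{2\sqrt{m^2\sigma_M^2 + M^2\sigma_m^2}}{(M+m)^2}, \\
\sigma_a &\approx \frac{2\sqrt{50^2 + 100^2}}{150^2}
  = \frac{2\sqrt{12500}}{22500} \approx 0.00994
  \quad \text{for } M = 100,\ m = 50,\ \sigma_M = \sigma_m = 1.
\end{align*}
```

That agrees with both simulated standard deviations to about three significant figures. It also suggests why the uniform case behaves so similarly: to first order, only the standard deviations of the inputs matter, not the shapes of their distributions.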

## Interactive population density map

World population density visualizer, by Derek Watkins, via Metafilter and Gizmodo. The original idea goes back to William Bunge’s “Continents and Islands of Mankind”, redrawn at Making Maps. There we have a map of the areas where population density is greater than 30 per square kilometer, roughly “where people live”; Watkins adds a slider so you can change that number “30” to anything from 5 to 500. Here’s a static map of the same data.

You should in theory be able to determine the population of the world from something like this, but the slider only goes up to 500, so you can’t tell how many people live at densities greater than 500 per square kilometer; these are “urban” densities (roughly) and so that’s a lot of people. Robert Talbert mentioned something similar on Twitter a few days ago: can you estimate the population of Colorado from a population density map? Not really, since the population of Colorado is very concentrated.
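To make the “in theory” part concrete, here is a sketch. It assumes you could measure the shaded area at each slider setting, which the visualizer may not actually report. Write $\rho(x)$ for the population density at location $x$ and $A(t)$ for the total area where the density exceeds $t$; both symbols are my notation. Then the total population $P$ is the integral of $A(t)$ over all thresholds:

```latex
\begin{align*}
A(t) &= \operatorname{area}\{x : \rho(x) > t\}, \\
P &= \int \rho(x)\,dx = \int_0^\infty A(t)\,dt
   = \int_0^{5} A(t)\,dt + \int_5^{500} A(t)\,dt + \int_{500}^\infty A(t)\,dt .
\end{align*}
```

The slider only gives access to the middle term. The last term is the part contributed by densities above 500 per square kilometer, which is exactly what the map can’t show.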

## But Officer, you didn’t see me stop!

The paper is dated 1 April 2012, so it may be a joke, but the idea is at least theoretically reasonable. The ticket in question was for not stopping at a stop sign. The idea is that to a police officer, a driver might appear to not stop, despite having actually reached zero speed for a moment, if another car happens to obstruct the officer’s view at the critical moment.

But is that a “stop” anyway? Is there some minimum amount of time one must be stopped at a stop sign? Still, it’s a nice little piece of mathematical modeling.

## Weekly links for April 15

Andrew Gelman asks: do statisticians practice what we preach in teaching? (His conclusion: no.)

Samuel Arbesman, Probability and game theory in The Hunger Games and Brett Keller, Hunger Games survival analysis. Andrew Gelman writes: “I think it’s always good to get practice. Analyzing a book/movie is like doing sports statistics; it can keep you in shape.”

Here’s that annual list of the year’s best jobs. Mathematician is #10.

Luis Apiolaza has some ideas about what an introductory book on Bayesian statistics should be like; commenters there have listed some of their favorite such books. He also has a post on first impressions of Kruschke’s book “Doing Bayesian Data Analysis” (which has puppies on the cover!) and a link to a free-for-noncommercial-purposes PDF version of Joseph Kadane’s Principles of Uncertainty.

## Pink is a real color

They did it to Pluto, but not to pink! Please not pink!

Robert Krulwich points out that there is no pink in the rainbow, linking to a YouTube video.

Saying that pink isn’t a color is a little silly, though. The space of colors that humans perceive is three-dimensional. Start with an actual light source, which has a continuous spectrum; human vision roughly projects from the space of possible spectra to a three-dimensional space, each dimension corresponding to one of the three types of cones. The “pure” colors (single wavelengths) correspond to a two-dimensional manifold in that space: one dimension for hue, one for brightness. Does the fact that you wouldn’t call anything in that two-dimensional space “pink” mean that pink isn’t a real color?

There must be some aliens, somewhere out there in space, that have yellow-sensitive cones as well and are offended that we don’t think yeen and grellow are real colors.

(hat tip: LA)

## “The Joy of Stats” on KQED tonight

Last-minute announcement for people in the Bay Area: the excellent documentary The Joy of Stats is playing on KQED tonight at 10pm. (Also at 4 AM Friday, and on the KQED Life subchannel at 9 PM tomorrow and 3 AM Saturday.)

I would have mentioned this sooner, but I actually just learned this from flipping around to different channels.