Links for November 9

Pierre Cartier wrote a long profile of Grothendieck for the new journal Inference: International Review of Science.

Jonathan Touboul wrote a paper, The hipster effect: why nonconformists all look the same. Here’s a popular summary by Gabe Bergado at Mic. Basically, if you don’t want to look like everyone else, but it takes you some time to figure out what everyone else is doing, you’ll end up synchronizing with the other people with the same preference for nonconformity.

Portrait of the Hilbert curve by Aldo Cortesi.

Becca Cudmore and Jennifer Daniel at Nautilus show us five ways to lie with charts.

Here’s an interesting paper on best practices for scientific computing. One of the authors, Greg Wilson, is from an organization called Software Carpentry which teaches programming to scientific researchers.

From Natalie Wolchover and Peter Byrne at Quanta, In a multiverse, what are the odds? (First in a series; the second one should be out tomorrow.)

jasmcole has written about the mathematics of stereographic lampshades, which are made so that the light that shines through them makes interesting patterns on your walls. This was inspired by a blog post of Alex Bellos on the work of Henry Segerman and Saul Schleimer. Of course this can be made into reality with a 3-D printer, and you can buy it at Segerman’s Shapeways store.

Is it professional for a professor to ask “surprise” questions on a test?, from Academia Stack Exchange. (Short version: the question is poorly phrased, but yes, and perhaps it is part of the professor’s duty, because being able to figure out things you haven’t seen before is usually one of the things you should learn in a course.)

We always think we’re right, but we don’t think we’re always right.

Jordan Ellenberg on how many states Nate Silver is going to get wrong, according to Nate Silver. (This refers to the elections of US Senators taking place tomorrow.) For each state Silver gives a probability of winning; we can give a probability that Silver will be wrong, which is just his own predicted probability that the underdog wins. Summing these over all the states gives an expected value of 2.5 wrong calls. Silver has been saying since the 2012 election that he got lucky in calling all fifty states correctly. In some sense it would have been more impressive if he’d missed a couple, which would have shown his predictions were calibrated correctly. (I remember trying to explain this to colleagues at my job at the time, where I’d been for a bit over a month; I think I did so successfully, but it’s a subtle point.)
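Here is a minimal sketch of that calculation in Python. The probabilities are made-up placeholders, not Silver’s actual forecasts, and the function name is just for illustration.

```python
def expected_misses(favorite_probs):
    """Expected number of wrong calls, given the forecast probability
    that the favorite wins each race."""
    # A race is called wrong with probability 1 - p (the underdog's chance).
    # By linearity of expectation these simply add up, whether or not
    # the races are independent.
    return sum(1 - p for p in favorite_probs)

# Hypothetical forecast for five races (NOT real data).
example = [0.95, 0.9, 0.75, 0.6, 0.55]
print(round(expected_misses(example), 2))
```

Correlation between races doesn’t change this expected value; it only changes how likely a clean sweep is.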

Silver’s famous 50-for-50 2012 presidential predictions are still available; according to his own predictions, he would have expected to get about 1.8 states wrong, on average. It’s hard to say just how good going 50-for-50 is, though, because the errors are correlated.

However, it almost always makes more sense to look not at binary outcomes but at the continuous outcomes that they collapse. (For example, when looking at sports data use the difference in points scored instead of win-loss records.) Andrew Mooney at the Boston Globe did exactly this, and saw that 68% of the time Silver got within his stated one-standard-deviation margin of error, and 96% of the time within two standard deviations.

Index of ignorance, or just innumeracy?

From Zach Wener-Fligner at Quartz, <a href="http://qz.com/288707/everything-you-think-you-know-about-the-news-is-probably-wrong/">Everything you think you know about the news is probably wrong</a>, based on <a href="https://www.ipsos-mori.com/researchpublications/researcharchive/3466/Perceptions-are-not-reality-10-things-the-world-gets-wrong.aspx">this Ipsos MORI study</a> of online panels in fourteen countries: Australia, Belgium, Canada, France, Germany, Hungary, Italy, Japan, Poland, South Korea, Spain, Sweden, Great Britain and the United States of America. Ipsos MORI compute an “index of ignorance” – but to some extent this may just be an index of innumeracy.

For example, the average American, when asked, guessed that 24% of girls aged 15-19 give birth each year. The actual value is 3%. In every country surveyed people were off by a factor of at least five. I’d posit that this is not a question of being uninformed so much as innumerate. If 24% of girls aged 15-19 give birth each year, and nobody gives birth before 15, then the average number of children of a woman at age twenty would be 5 \times 0.24 = 1.2. Do people seriously think the average twenty-year-old woman has more than one child? I doubt it.

The other questions were percentage of Muslims (most people overestimate), Christians (most underestimate), immigrants (overestimate), percentage who voted in the last major election (underestimate), percentage “unemployed and looking for work” (overestimate), and life expectancy of a child born in 2014 (pretty much right on).

One of these numbers is not like the others. We’ll all die someday, and we all have some idea of how long people live, so we naturally get this right. But the others are asking for percentages, and I don’t think most people could tell the difference between “10% of people have this trait” and “20% of people have this trait” just by guessing. South Koreans and Japanese overestimate the number of Christians – and those are the two countries on this list in which Christians are a minority. I wonder, if you looked at the estimates people gave for a lot of these percentages, if they’d show a peak at 50%, the thought process being “well, people with trait X exist, but not everybody has trait X, so what’s a number in between? I’ll just pick the simplest rational number between 0 and 1.”

I’m a bit puzzled about the unemployment numbers, though. These are generally fairly loudly trumpeted in the media, so I’d expect people to at least give estimates in line with those ranges, and yet, for example, the US guess is 32 percent. (The percentage of people “unemployed and looking for work” is actually lower than the unemployment rate, by definition, as the unemployment rate is the percentage of people in the labor force who are looking for work – the unemployment rate has the same numerator and a smaller denominator.) Even if people think about the experiences of those close to them instead of the public at large, on average this shouldn’t change things unless unemployed people happen to have lots of friends and family who take these surveys.
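The numerator-and-denominator point in the parenthetical can be sketched in a few lines of Python. The head counts are invented for illustration only, and the function and parameter names are my own.

```python
def unemployment_figures(unemployed, employed, not_in_labor_force):
    """Return (unemployment rate, share of whole population unemployed)."""
    labor_force = unemployed + employed
    population = labor_force + not_in_labor_force
    # Same numerator in both; the rate divides by the smaller labor force.
    rate = unemployed / labor_force
    share = unemployed / population
    return rate, share

# Hypothetical counts (in millions), NOT real statistics.
rate, share = unemployment_figures(unemployed=8, employed=92, not_in_labor_force=60)
print(rate, share)
```

As long as anyone at all is outside the labor force, the share of the population comes out below the rate.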

I’d also be interested to see how estimates correlated with political views. For example, are people who think there are more immigrants more likely to be anti-immigration? Do people who think the unemployment rate is higher support policies that would stimulate their nations’ economies?

Links for November 2

John Rauser of Pinterest gave a talk at the Strata Conference / Hadoop World on statistics without the agonizing pain – you replace the pain of statistics with the pain of simulation, which for an audience of programmers is much less painful. Via Revolution Analytics.

Surfacing this week on Hacker News was this article by NASA engineer Don Pettit on “The tyranny of the rocket equation”.

Tim Gowers writes on the results of an experiment concerning computer-generated mathematical writing.

Vi Hart made a scary Halloween video featuring candy corn and the Sierpinski calendar.

The video from the Online Encyclopedia of Integer Sequences conference is available online.

A place value puzzle

Fawn Nguyen, a middle-school math teacher, has written on finding the greatest product of a three-digit number and a two-digit number made up from some set of five digits. For example: if you’re given the digits 8, 7, 5, 4, and 2, you’d have (to pick a product at random) 745 \times 82 = 61090. The question is to write the largest possible such product (ideally without doing the multiplication explicitly).

In this case it’s pretty obvious that if you can make one factor larger by just switching around its digits, you should do it: so 754 \times 82 > 745 \times 82. But how can you move around the digits between the factors? Which is larger, 854 \times 72 or 754 \times 82? The trick here is to rewrite as 10 \times 85.4 \times 72 and 10 \times 75.4 \times 82, and recall that of two pairs of numbers with the same sum, the ones closer together have a larger product. That is, for a, b \ge 0, (x-a)(x+a) > (x-b)(x+b) if and only if a<b. Since 85.4 + 72 = 75.4 + 82, we can conclude that 85.4 \times 72 < 75.4 \times 82.
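The same-sum comparison can be sketched in Python as a function that never multiplies. The function name is my own, and the small tolerance is only there because 85.4 and 75.4 are floating-point numbers.

```python
def larger_product_same_sum(pair1, pair2):
    """Given two pairs with equal sums, return the pair with the larger
    product by comparing the gaps only (no multiplication needed)."""
    (p, q), (r, s) = pair1, pair2
    if abs((p + q) - (r + s)) > 1e-9:
        raise ValueError("pairs must have the same sum")
    # (x - a)(x + a) = x^2 - a^2, so the pair with the smaller gap wins.
    return pair1 if abs(p - q) < abs(r - s) else pair2

print(larger_product_same_sum((85.4, 72), (75.4, 82)))  # (75.4, 82)
```

The gaps here are 13.4 and 6.6, so the second pair wins.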

So 754 \times 82 is larger. But now we can switch the 2 and the 4 by the same sort of logic: 754 + 82 = 752 + 84 and so 754 \times 82 < 752 \times 84. This, it turns out, is the best we can do, as we can check by brute force – but how do we know this holds up generally? I haven’t used the differences between digits explicitly, only their ordering, so perhaps everything only depends on the order of the digits. Let’s let our digits be a, b, c, d, e, with a > b > c > d > e. Then there are just ten possible products to look at if we’re trying to find the largest one, since the digits in each factor have to be decreasing:

abc \times de, abd \times ce, abe \times cd, acd \times be, ace \times bd, ade \times bc, bcd \times ae, bce \times ad, bde \times ac, cde \times ab.

(Note to pedants: juxtaposition of letters means juxtaposition of the corresponding digits, so when I write ad I mean 10a + d, and so on.)

We want to show that bce \times ad is the largest of these products. We can make “moves” of the form bde \times ac = 100 \times b.de \times ac > 100 \times c.de \times ab = cde \times ab, for example, to show that it’s larger than cde \times ab; the inequality in the middle follows from b.de + ac = c.de + ab. If I’m not mistaken, any time two of these ten products differ by just switching two letters we can prove an inequality between them. And after a seriously grungy case analysis (which I won’t bore you all with) I believe we get that bce \times ad is always the largest product. In any case, for this particular problem there are only {9 \choose 5} = 126 possibilities so you could check by brute force. (Not a good exercise for people who you’re trying to teach to think about place value, but also not a bad programming exercise…)
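Here is a rough sketch of that brute-force check in Python. It assumes the digits are distinct and drawn from 1 through 9 (that is where the 126 comes from). The function names are mine.

```python
from itertools import combinations, permutations

def best_split(digits):
    """Try every arrangement of five digits as (3-digit) x (2-digit) and
    return (product, three_digit, two_digit) for the largest product."""
    best = (0, 0, 0)
    for p in permutations(digits):
        three = 100 * p[0] + 10 * p[1] + p[2]
        two = 10 * p[3] + p[4]
        best = max(best, (three * two, three, two))
    return best

def conjectured_split(digits):
    """The claimed winner bce x ad, where a > b > c > d > e."""
    a, b, c, d, e = sorted(digits, reverse=True)
    return (100 * b + 10 * c + e, 10 * a + d)

print(best_split([8, 7, 5, 4, 2]))  # (63168, 752, 84)

# Compare brute force with the bce x ad pattern on all 126 digit sets.
all_match = all(
    best_split(ds)[1:] == conjectured_split(ds)
    for ds in combinations(range(1, 10), 5)
)
print(all_match)
```

This is 126 sets times 120 arrangements each, so it runs instantly.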

Is there a general rule, when the number of digits is not five? Certainly we want to spread out the large digits, but how exactly?