Blizzard of 2015

Nate Silver wrote an excellent blog post on how (in New York) it’s been snowing the same amount it always has but on fewer days. If it’s snowing on fewer days, those must be bigger storms. He ran the tests to show that the increase in very large storms since about 2000 is significant. As he observes, “Anthropogenic global warming, as I’ve said, is a plausible cause.”

(I was going to write this post. Now I don’t have to dig up the weather data.)

Since there was much less snow than forecast in New York, national (i.e. New York-centric) media have been wringing their hands about the difficulties of forecasting. See for example Adam Chandler in The Atlantic on meteorologists apologizing for bad forecasts, Harry Enten at FiveThirtyEight on how the forecasts went wrong, Eric Holthaus at Slate on the same, and Zeynep Tufekci at Medium on probabilistic forecasts. The common thread is that there’s generally an incentive to forecast high, because predicting too much snow causes less harm than predicting too little.

Meanwhile, here in Atlanta, just about a year ago we had two inches of snow coupled with a poorly timed forecast, and people were stranded overnight. One local station is doing “Snow Jam: Then and Now” on tomorrow evening’s news. I don’t know about you, but I’d take two feet in Boston — which I saw a few times in my four years there — over two inches in Atlanta.

A crossword clue

48-Across in today’s New York Times crossword: “group you can rely on when it counts?”

Twelve letters.

I’ve referred to their work as our nation’s once-every-decade exercise in large-scale enumerative combinatorics. (Although of course they do more than just the every ten years count required by the Constitution, and their mathematically trained employees are surely statisticians.)

Why everyone seems to have cancer

George Johnson wrote a few weeks ago in the New York Times on why everyone seems to have cancer. This is, somewhat paradoxically, good news – or at least not bad news. A larger proportion of people are dying of cancer now than in the past not because we’re getting worse at treating cancer – we’re actually getting better. But we’re getting better at treating other things (like heart disease) faster. See the graph on p. 10 of this CDC report on 2010 death statistics. Rates of death from complications of Alzheimer’s disease are actually increasing – and Alzheimer’s is even more of a disease of old age than cancer is.

This reminds me of a fact about cancer I learned from John Allen Paulos’s book Beyond Numeracy (1992), from a chapter explaining why correlation is not causation:

Nations that add fluoride to their water have a higher cancer rate than those that don’t. … Is fluoridation a plot? … [T]hose nations that add fluoride to their water are generally wealthier and more health-conscious, and thus a greater percentage of their citizens live long enough to develop cancer, which is, to a large extent, a disease of old age.

If we do cure cancer – which is a disease of old age because it essentially is due to accumulated errors in cell mutations – will we just find something that’s a disease of even older age?

Convex hulls, the TSP, and long drives

I’ve been reading the book In Pursuit of the Traveling Salesman by William Cook lately. In Chapter 10, on “The Human Touch”, Cook mentions a paper by James MacGregor and Thomas Ormerod, “Human performance on the traveling salesman problem”. One thing this paper mentions is that humans reach solutions to the TSP by looking at global properties such as the convex hull: there’s a theorem that an optimal tour must visit the vertices of the convex hull in the order in which they appear around the hull.
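The convex hull property can be checked on a small example. The following is a minimal sketch, not anything from Cook's book or the MacGregor and Ormerod paper: the point set is made up, the function names (`convex_hull`, `optimal_tour`, `visits_hull_in_order`) are hypothetical, and the brute-force search is only feasible for a handful of points.

```python
from itertools import permutations
from math import dist


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def tour_length(tour):
    """Length of the closed tour that returns to its starting point."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))


def optimal_tour(points):
    """Brute-force shortest closed tour, with the first point fixed as the start."""
    start, rest = points[0], points[1:]
    best = min(permutations(rest), key=lambda order: tour_length((start,) + order))
    return [start, *best]


def visits_hull_in_order(tour, hull):
    """True if the hull vertices appear along the tour in hull order (either direction)."""
    seq = [p for p in tour if p in hull]
    backwards = hull[::-1]
    rotations = [hull[i:] + hull[:i] for i in range(len(hull))]
    rotations += [backwards[i:] + backwards[:i] for i in range(len(hull))]
    return seq in rotations


# Made-up points: five on the hull, three in the interior.
points = [(0, 0), (6, 1), (7, 5), (3, 8), (-1, 4), (2, 3), (4, 4), (3, 2)]
hull = convex_hull(points)
tour = optimal_tour(points)
print(hull)
print(tour, visits_hull_in_order(tour, hull))
```

The interior points can be slotted in anywhere, but the hull vertices keep their cyclic order, which is the "global property" a human solver can read off the picture.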

This reminds me of one of my favorite travelogues, Barry Stiefel’s fifty states in a week’s vacation, in which Stiefel visited all fifty states in a single week (driving through the lower 48 and flying to Alaska and Hawaii). Stiefel writes, in regard to getting to Kentucky:

Now things were going to get complicated. Until then, my route had been primarily one of following the inside perimeter of the states on the outside perimeter of the 48 states. When planning the route, I had stared at the map for over an hour before concluding that Kentucky was going to be the toughest state to get. I just couldn’t plan a route that efficiently went through it. So I had to do a several-hour out-and-back loop to get it.

It’s not obvious that Kentucky would be so hard. Kentucky borders Virginia and Ohio, both of which are on that outside perimeter, while a bunch of states further west don’t border any of the perimeter states. (By my eyeballing, those are Kansas, Nebraska, and Missouri.) Stiefel chose to approach Kentucky from the southeast, though, and southeastern Kentucky is not exactly crawling with highways. And as you might gather from the title of Stiefel’s page, he was in search of a shortest-time route, not a shortest-distance one.

Rich states have Androids, rich people have iPhones

CNN reports on a study claiming that smarter people use iPhones (here’s the original white paper by Chitika, a mobile ad network). Several different models (linear, stepwise linear, and logistic) are used to predict iPhone usage share for a state. Population density and education level (percent with bachelor’s degree) turn out to have positive coefficients; median income and median age have negative coefficients.

Yet within the US, iPhone use is correlated with income; check out mapbox’s maps. The areas where iPhones predominate (red on the map) are richer than those where Android phones predominate (green on the map). This is a textbook example of the ecological fallacy: rich states have Androids, rich people have iPhones. (Compare the political fact in the US that rich people vote Republican but rich states vote Democrat.) I’d be interested to see a better study of this.
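To see how both statements can hold at once, here is a toy illustration. The numbers are entirely made up to exhibit the pattern and have nothing to do with the Chitika or Mapbox data; the two "states" and the helper names are hypothetical.

```python
# Entirely invented toy data: each person is (state, income in $1000s, phone).
# State A is richer overall; within each state, iPhone owners are the richer group.
people = (
    [("A", 150, "iphone")] * 3 + [("A", 90, "android")] * 7
    + [("B", 60, "iphone")] * 6 + [("B", 30, "android")] * 4
)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def state_summary(state):
    """Return (mean income, iPhone share) for one state."""
    rows = [p for p in people if p[0] == state]
    return mean(r[1] for r in rows), mean(r[2] == "iphone" for r in rows)


def mean_income_by_phone(phone):
    """Mean income of all owners of the given phone, pooled across states."""
    return mean(p[1] for p in people if p[2] == phone)


for state in ("A", "B"):
    print(state, state_summary(state))
for phone in ("iphone", "android"):
    print(phone, round(mean_income_by_phone(phone), 1))
```

In this invented population the richer state has the lower iPhone share, while iPhone owners pooled across both states have the higher mean income. A state-level regression and an individual-level comparison would therefore point in opposite directions.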

What’s the probability that an n-digit palindrome chosen at random is divisible by 11?

James Tanton asked in a tweet: what’s the probability that an n-digit palindrome chosen at random is divisible by 11?

This depends heavily on $n$. In particular, if $n$ is even, the probability is 1. Notice that $11, 1001, 100001, \ldots$ are each multiples of 11 (they are $11 \times 1, 11 \times 91, 11 \times 9091, \ldots$), so we can write any palindrome with an even number of digits as a sum of multiples of these. For example,
$52744725 = 5 \times 10000001 + 2 \times 10 \times 100001 + 7 \times 100 \times 1001 + 4 \times 1000 \times 11$
is a multiple of 11.
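A quick brute-force check of the even case, as a sketch: the helper name `even_palindromes` is my own, and the six-digit decomposition below is just one more instance of the same idea.

```python
def even_palindromes(n):
    """Yield every n-digit palindrome (n even) by mirroring its first half."""
    half = n // 2
    for first in range(10 ** (half - 1), 10 ** half):
        s = str(first)
        yield int(s + s[::-1])


# One more decomposition in the same style: 123321 built from 100001, 1001 and 11.
terms = [1 * 100001, 2 * 10 * 1001, 3 * 100 * 11]
assert sum(terms) == 123321
assert all(t % 11 == 0 for t in terms)

# Every palindrome with 2, 4, 6 or 8 digits is divisible by 11.
for n in (2, 4, 6, 8):
    assert all(p % 11 == 0 for p in even_palindromes(n))
print("all even-length palindromes checked are multiples of 11")
```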

Now consider the case where the number of digits is odd. This is a bit trickier. Mike Lawler (https://mikesmathpage.wordpress.com/2015/01/20/a-nice-divisibility-rule-problem-from-james-tanton/) and his younger son worked out that, of the 900 palindromes with five digits, 82 are divisible by 11. In the three-digit case it’s not hard to find the full list of palindromes divisible by 11: there are eight of them, out of a total of 90. They’re 121, 242, 363, 484, 616, 737, 858, 979. There’s an obvious pattern here – consecutive elements of the list differ by 121, except when they don’t (going from 484 to 616).

So let’s go back to the notation and say we’re considering an $n$-digit palindrome where $n = 2k+1$. We want to iterate over possible “first halves” of the palindrome and figure out what the middle digit should be: we’re answering questions like “what five-digit palindrome starting with 28 is divisible by 11?” For any given first $k$ digits, there is at most one $(2k+1)$-digit palindrome divisible by 11. Consider the numbers
$28082, 28182, \ldots, 28982.$
Each of these differs from the previous one by 100, which is one more than a multiple of 11, so when we reduce mod 11 we get
$r, r+1, \ldots, r+9$
which are all different. These cover ten of the eleven residues mod 11, missing only $r-1$, so unless 28082 happens to be one more than a multiple of 11, one of these is a multiple of 11. As it turns out, 28082 is one less than a multiple of 11, and 28182 is a multiple of 11: $28182 = 11 \times 2562$.

So there is either zero or one palindrome of each of the forms $10x01, 11x11, 12x21, \ldots, 99x99$. When is there no such palindrome? Observe that, for example,
$12000 - 21 = (10000 + 2000) - (20 + 1) = (10000 - 1) + 10 \times 2 \times (100 - 1)$
and so 12000 and 21 differ by a multiple of 11. So $12x21$ and $24x00$ are congruent mod 11. To get $24x00$ to be a multiple of 11 we need $24x$ (that is, $240 + x$) to be a multiple of 11 – so we take $x = 2$. Indeed, 12221 is a multiple of 11, with $12221 = 11 \times 1111$.

But consider, for example, the number $27x72$. This is congruent to $54x00$ mod 11, and is a multiple of 11 if and only if $54x$ is – but none of $540, 541, \ldots, 549$ is divisible by 11, since 539 and 550 are consecutive multiples of 11. Working backwards, this can be traced back to the fact that $27 \equiv 5 \pmod{11}$.
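The two examples suggest a shortcut for five-digit palindromes. If $N$ is the two-digit first half, the argument above needs $20N + x$ to be a multiple of 11, and since $-20 \equiv 2 \pmod{11}$ this is the same as $x \equiv 2N$. That closed form is my own restatement rather than something stated in the post, so the sketch below checks it against a direct search; the function names are hypothetical.

```python
def middle_digit(first_half):
    """Middle digit x making the five-digit palindrome divisible by 11, or None.

    Uses the closed form x = 2N mod 11 (an assumption derived from 20N + x = 0 mod 11).
    """
    x = (2 * first_half) % 11
    return x if x <= 9 else None


def middle_digit_brute(first_half):
    """Same answer by trying all ten middle digits."""
    s = str(first_half)
    hits = [x for x in range(10) if int(s + str(x) + s[::-1]) % 11 == 0]
    assert len(hits) <= 1  # at most one middle digit can work
    return hits[0] if hits else None


# The closed form and the brute-force search agree for every two-digit first half.
assert all(middle_digit(n) == middle_digit_brute(n) for n in range(10, 100))

print(middle_digit(12), middle_digit(28), middle_digit(27))
# First halves with no valid middle digit are exactly those congruent to 5 mod 11.
print([n for n in range(10, 100) if middle_digit(n) is None])
```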

So the five-digit palindromes divisible by 11 are in one-to-one correspondence with the integers in $10, 11, \ldots, 99$ which are not congruent to 5 mod 11, of which there are $90 - 8 = 82$. This generalizes to $(4k+1)$-digit palindromes: there are 8182 nine-digit palindromes divisible by 11 (out of 90000 nine-digit palindromes), 818182 thirteen-digit palindromes (out of 9000000), and so on. These look an awful lot like the decimal expansion of 9/11, and in fact 82, 8182, and 818182 are the closest integers to $900/11$, $90000/11$, and $9000000/11$ respectively.

For palindromes with three, seven, eleven, … digits the arithmetic works out a bit differently – but the final result can be expressed in this form: the number of $(2k+1)$-digit palindromes divisible by 11 is the closest integer to $(9 \times 10^k)/11$. So for odd $n$ the probability that Tanton asked for is almost exactly 1/11, the same as the probability that a random integer is divisible by 11. The fact that the palindrome constraint has no effect is… obvious? Totally surprising. To be honest I don’t know.
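The nearest-integer formula is easy to check by brute force for small $k$. This is only a sketch; the function names are my own, and the loop stops at nine-digit palindromes to keep the running time short.

```python
def count_divisible_odd(k):
    """Count (2k+1)-digit palindromes divisible by 11 by direct enumeration."""
    count = 0
    for first in range(10 ** (k - 1), 10 ** k):
        s = str(first)
        for mid in "0123456789":
            if int(s + mid + s[::-1]) % 11 == 0:
                count += 1
    return count


def predicted_count(k):
    """Nearest integer to 9 * 10**k / 11, computed in integer arithmetic."""
    total = 9 * 10 ** k
    return (2 * total + 11) // 22


# Columns: number of digits, palindromes divisible by 11, all palindromes, prediction.
for k in range(1, 5):
    print(2 * k + 1, count_divisible_odd(k), 9 * 10 ** k, predicted_count(k))
```

The second and fourth columns should agree on every row, with the ratio of the second to the third column approaching 1/11 as the number of digits grows.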