How statistics lost their power – and why we should fear what comes next, by William Davies at the Guardian. Note that “statistics” shares a root with “state”, and states (in the sense of “sovereign state”, not “US state”) have always been quite interested in this sort of data.
Simon Parkin for Nautilus on how luck is engineered into video games. Some examples include near misses on slot machines (it’s actually illegal to deliberately manipulate these into happening, although since slot machines are now fully electronic it’s technically possible), the game Really Bad Chess (chess with random pieces), and combat mechanics in video games that are tweaked to not follow the laws of probability. Because people get mad at actual randomness.
FiveThirtyEight’s puzzle feature, the Riddler, had a puzzle this week called Don’t throw out that calendar!. (Spoilers below; also, I’m delaying posting this until after the submission period for solutions ends.) This puzzle is due to Ben Zimmer.
Sometime in the 21st century, the following conversation takes place:
“Don’t throw out that calendar! You could reuse it in the future, when the days and dates on the calendar match up again.”
“OK, but that won’t happen for a long time. Forty years, in fact.”
“You’re right! In fact, this calendar has never had a 40-year gap before.”
What year is it?
We can start by observing that there are 14 possible calendars. To know what calendar we can use in a given year, we have to know whether it’s a common (non-leap) or leap year, and the day of the week of one particular day. Let’s pay attention to what John Conway called “Doomsday”, i. e. the last day of February (February 28 in common years, and February 29 in leap years). Doomsday moves forward one day in the week each year, except two days in each leap year – so, for example, Doomsday (February 28) 2015 was a Saturday, Doomsday (February 29) 2016 was a Monday, and Doomsday (February 28) 2017 will be a Tuesday.
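As a quick check on the count of 14 and on the Doomsday examples, here is a minimal sketch using Python’s standard library. The helper names `doomsday` and `calendar_type` are mine, and `datetime` assumes the Gregorian rules apply to every year it is given.

```python
import calendar
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def doomsday(year):
    """Weekday name of the last day of February in the given year."""
    last_of_february = date(year, 3, 1) - timedelta(days=1)
    return DAY_NAMES[last_of_february.weekday()]

def calendar_type(year):
    """A year's calendar is fixed by leap-ness plus the weekday of Doomsday."""
    return (calendar.isleap(year), doomsday(year))

# One full 400-year Gregorian cycle is enough to see every calendar type.
distinct_types = {calendar_type(y) for y in range(2000, 2400)}
print(len(distinct_types))

for y in (2015, 2016, 2017):
    print(y, doomsday(y))
```

The set comprehension collects each (leap, weekday) pair once, so its size is the number of different calendars.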
So let’s consider when a 40-year gap will happen. A 40-year gap can’t begin with a common year. Say the year Y is the year after a leap year; then Doomsday of Y+6 will be the same as Doomsday of Y, since over those six years Doomsday will move forward six days in the week, plus one day for the leap year. If Y is two or three years after a leap year, then Y + 11 and Y will have the same Doomsday – Doomsday moves forward 11 days, plus three for the leap years. (This is basically cribbed from the calendar FAQ.)
The exception is if Y is very near the end of a century that isn’t divisible by 400 – near enough that one of the otherwise intervening leap years would be a common year. In any of these cases Y and Y + 12 share a calendar – there are two intervening leap years.
(In most cases Y and Y + 28 share a calendar – that is, if no end-of-century common year intervenes. But we’d still have to do the casework anyway.)
So we must be in a leap year. But a 40-year gap, from Y to Y + 40, should lead to a Doomsday being one day later (40 + 10 = 50, which is one more than a multiple of seven). We must be in a leap year for which there are only NINE leap years in the next 40 – i. e. we’re in one of 2064, 2068, …, 2096 – since 2100 isn’t a leap year. But the calendars 2064 and 2068 will be repeated in 2092 and 2096 (28 years later, with 7 leap years), and 2092 and 2096 will be repeated twelve years later (with two leap years). So we must be in one of the five years 2072, 2076, 2080, 2084, and 2088.
Which one is it? See the note “You’re right! In fact, this calendar has never had a 40-year gap before.” By the same reasoning as above, 40-year gaps in the past could only have started in one of the leap years 1672-1688, 1772-1788, 1872-1888, since the Gregorian calendar – which instituted our current leap-year policy and thus the nine-in-forty loophole – started in 1582. (1600 and 2000 were leap years, so 15xx and 19xx candidates are out.)
Doomsday in 1672 in the Gregorian calendar was Monday. I cheated by looking it up. For example you can type ncal -sFR 1672 at your friendly Unix prompt, where “FR” is the country code for France, a country that adopted the Gregorian calendar at the beginning. Honestly, it doesn’t matter what day of the week Doomsday in 1672 is, only the relationships between the Doomsdays. In 1772 Doomsday is Saturday (124 days later in the week, that is, 100 years plus 24 leap years); in 1872 it’s Thursday. So we can work out the doomsdays for each leap year. These are:
1672 through 1688: Mon, Sat, Thu, Tue, Sun
1772 through 1788: Sat, Thu, Tue, Sun, Fri
1872 through 1888: Thu, Tue, Sun, Fri, Wed
2072 through 2088: Mon, Sat, Thu, Tue, Sun
All seven days of the week occur here, so the puzzle seems to be broken. But if we’re in a country which switched calendars after 1672 but before 1772 – like, say, Great Britain and its colonies, which changed over in 1752 – then the doomsday-Monday calendar won’t have occurred yet but the other six have. (Again from the calendar FAQ, we could also be pretty much anywhere else in Protestant-dominated Europe or colonies thereof.) The answer is 2072.
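The casework can also be checked by brute force. This is only a sketch under stated assumptions: the function names are hypothetical, a calendar is identified by leap-ness plus the weekday of January 1, and `first_full_year` stands for the first complete year under Gregorian rules (1583 for the original adopters, 1753 for Great Britain and its colonies).

```python
import calendar
from datetime import date

def signature(year):
    """Leap-ness plus weekday of January 1 pins down the whole calendar."""
    return (calendar.isleap(year), date(year, 1, 1).weekday())

def gap_to_next_reuse(year):
    """Years until the same calendar can next be reused."""
    target = signature(year)
    later = year + 1
    while signature(later) != target:
        later += 1
    return later - year

def forty_year_candidates(first=2001, last=2100):
    """Years of the 21st century whose calendar sits unused for 40 years."""
    return [y for y in range(first, last + 1) if gap_to_next_reuse(y) == 40]

def answers(first_full_year):
    """Candidates whose calendar never had a 40-year gap in an earlier year."""
    result = []
    for y in forty_year_candidates():
        earlier = [p for p in range(first_full_year, y)
                   if signature(p) == signature(y) and gap_to_next_reuse(p) == 40]
        if not earlier:
            result.append(y)
    return result

print(forty_year_candidates())
print(answers(1583))  # counting from the original 1582 adoption
print(answers(1753))  # counting from the 1752 British changeover
```

Counting from 1583 leaves no valid year at all, which is the “puzzle seems to be broken” observation; counting from 1753 leaves only 2072.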
Jason Crease shows that 2016 was, indeed, a year of surprisingly many celebrity deaths. The hard part is defining “celebrity”. There had been a previous BBC analysis based on the number of deaths of people with prewritten obituaries, but that is naturally skewed towards what one particular news organization finds useful. Crease’s analysis uses Wikipedia data – both the length of the article and the number of revisions. It turns out that the number of revisions of the Wikipedia article is a more useful metric than the length of the article – a long article can be long in part because it includes lists of relatively uncontroversial material.
Other analyses include:
- Snopes, based on lists of notable deaths put out by various media organizations – but of course there’s probably some bias towards keeping the lists roughly the same length as in previous years.
- Researchers at the MIT media lab, C. Candia-Castro-Vallejos, Cristian Jara-Figueroa, César A. Hidalgo, who concluded that fewer famous people died in 2016 than expected (although not many fewer). Their notion of fame attempts to be more cross-cultural and looks at the number of languages someone has a Wikipedia article in.
It may just be that the Anglo-American axis had a bad year (and of course Brexit and the ascendancy of Donald Trump can’t have helped the mood in (the media in) either of those countries…)
But 2017 has a total solar eclipse in the US, so we’ll be okay.
The game traditionally played with the dreidel is unfair, as Ben Blatt showed by simulation and Robert Feinerman showed analytically, but this is assuming that all four sides of the top are equally likely to come up when it is spun. The Nemiroffs took this one step further and checked whether the four sides of the dreidel are equally likely to come up. They took three dreidels and spun them (800, 1000, and 750 times respectively) and showed that these dreidels were unfair even in this more basic sense.
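One standard way to test whether spin counts are consistent with four equally likely sides is a chi-square goodness-of-fit test. The sketch below is not taken from the Nemiroffs’ paper: the function name is mine, the counts are made up purely for illustration, and 7.815 is the usual 5% critical value for three degrees of freedom.

```python
def chi_square_uniform(counts):
    """Chi-square statistic against the hypothesis that all sides are equally likely."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

# Hypothetical counts for 800 spins (NOT data from the paper).
hypothetical_counts = [230, 180, 220, 170]

CRITICAL_5_PERCENT_DF3 = 7.815  # chi-square critical value, 3 degrees of freedom

statistic = chi_square_uniform(hypothetical_counts)
print(statistic)
print("reject fairness" if statistic > CRITICAL_5_PERCENT_DF3 else "consistent with fair")
```

A statistic above the critical value means that counts this lopsided would be unlikely from a fair top.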
Interestingly, the patterns seem to tell a story about how the dreidels the Nemiroffs used were flawed. I reproduce their Table 1 here (and yes, they had a dreidel with Christmas imagery on it…)
|Dreidel|ג (gimel) / Santa|נ (nun) / candy cane|ש or פ (shin or pei) / tree|ה (he) / snowman|total spins|
The letters נ (nun) and ה (he) appear opposite each other, as do ג (gimel) and whichever of ש or פ (shin or pei) is used. So what we see here is that:
- on the “old wooden” dreidel and the “santa” dreidel, two sides opposite each other are preferred – perhaps the dreidel is slightly wider in one direction than the other
- on the “cheap plastic” dreidel, one side is preferred and the side opposite it is dis-preferred – perhaps the dreidel is slightly heavier on one side or the handle is slightly off-center.
Presumably dreidels are allowed to be so unfair because nobody is playing dreidel for high stakes, so there’s no real incentive to construct the things properly.
Today is my 33rd birthday. In honor of that, here are some interesting properties of 33.
One from Wikipedia’s list, which I like because I have a soft spot for integer partition problems, is that it’s the largest positive integer that cannot be expressed as a sum of different triangular numbers. The others are 2, 5, 8, 12, and 23: see OEIS A053614. There’s an almost-proof of this fact in this compilation of problems from mathematical olympiad selection tests; that compilation cites this review paper of Erdős and Graham on results in combinatorial number theory, but I can’t find the result there! If I make it to 128, it’s the largest number that is not the sum of distinct squares.
An idea of the proof is as follows: check by enumeration that 34 through 66 can be written as the sum of distinct triangular numbers, where 66 is not used: 34 = 28 + 6, 35 = 28 + 6 + 1, 36 = 36, 37 = 36 + 1, 38 = 28 + 10, …, 66 = 55 + 10 + 1. Then add 66 to each of these to get a way of expressing 100, 101, …, 132 as a sum of distinct triangular numbers – for example 104 = 66 + 38 = 66 + 28 + 10. (The numbers 67 through 99 fall between the two ranges and need the same kind of direct check, for example 67 = 45 + 21 + 1.) Add the largest triangular number less than 132 (this turns out to be 120) to each of the decompositions of 34 through 132 to write each of 154, …, 252 as such a sum, again checking the short gap below that directly. And so on.
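The enumeration step is easy to hand to a computer. Here is a minimal sketch using a subset-sum table; the function names and the search limit of 2000 are my own choices, and a finite search like this only supports the claim up to the limit, it does not replace the induction.

```python
def non_sums_of_distinct(parts, limit):
    """Positive integers up to limit that are NOT a sum of distinct elements of parts."""
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for part in parts:
        # Go downward so each part is used at most once.
        for total in range(limit, part - 1, -1):
            if reachable[total - part]:
                reachable[total] = True
    return [n for n in range(1, limit + 1) if not reachable[n]]

def triangular_up_to(limit):
    """Triangular numbers 1, 3, 6, 10, ... not exceeding limit."""
    result, k = [], 1
    while k * (k + 1) // 2 <= limit:
        result.append(k * (k + 1) // 2)
        k += 1
    return result

def squares_up_to(limit):
    """Squares 1, 4, 9, 16, ... not exceeding limit."""
    result, k = [], 1
    while k * k <= limit:
        result.append(k * k)
        k += 1
    return result

LIMIT = 2000
print(non_sums_of_distinct(triangular_up_to(LIMIT), LIMIT))
print(max(non_sums_of_distinct(squares_up_to(LIMIT), LIMIT)))
```

The same function handles the squares version, which is where the 128 comes from.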
Why is this worth singling out from the list? Many of the others include some arbitrary constant, such as:
- “the sum of the first four positive factorials”
- “the smallest odd repdigit that is not a prime number” (a “repdigit” is a number that consists of the same digit repeated, so the constant 10 is hiding here; in fact you could argue this is basically a strange way of stating the identity 33 = 3(10+1))
It’s also pretty cool that 33 is a Blum integer – that is, a product of two distinct primes, each of which is congruent to 3 mod 4. (But it’s not the first Blum integer – that’s 21.)
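A short sketch of the definition, using plain trial division; the helper names are mine, and this is meant for small numbers only.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

def is_blum(n):
    """True if n = p * q with p, q distinct primes, both congruent to 3 mod 4."""
    if n % 2 == 0:
        return False
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            # p is the smallest prime factor of n, so only q needs a primality test.
            q = n // p
            return p != q and is_prime(q) and p % 4 == 3 and q % 4 == 3
    return False  # n is 1 or an odd prime

print([n for n in range(1, 100) if is_blum(n)])
```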
Another property of 33, which is less negative, is that it’s the first member of the first cluster of three semiprimes (33 = 3 x 11, 34 = 2 x 17, 35 = 5 x 7). That is, it’s the first member of this sequence. In OEIS terms, I’d say that being the first member of a sequence, or the last member of a sequence, is more interesting than being just out in the middle of the sequence somewhere.
The semiprime thing appears to have an arbitrary constant of 3. But there are no clusters of four or more consecutive semiprimes – out of four consecutive integers, one is divisible by 4, and the only semiprime divisible by 4 is 4 itself, whose neighbors 3 and 5 are prime – so 33 is the first member of the first cluster of semiprimes of maximal length.
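Both claims are easy to poke at numerically. In this sketch the function names and the search limits are my own; the search for a run of four is only a sanity check on the divisibility argument, not a proof.

```python
def prime_factor_count(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1
    return count

def is_semiprime(n):
    """A semiprime is a product of exactly two primes (possibly equal)."""
    return prime_factor_count(n) == 2

def first_run(length, limit):
    """Smallest start below limit of `length` consecutive semiprimes, or None."""
    for start in range(2, limit):
        if all(is_semiprime(start + i) for i in range(length)):
            return start
    return None

print(first_run(3, 1000))
print(first_run(4, 20000))
```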
Want to know what’s interesting about some number? You could trawl the OEIS or Wikipedia, or you could go to Erich Friedman’s list, which is a bit more selective, only listing one property of each number. In fact both of my interesting properties of 33 appear here – the semiprime one is, for Friedman, a property of 34, “the smallest number with the property that it and its neighbors have the same number of divisors”.