Math bracketologyology

Jordan Ellenberg wrote a piece for Sunday’s New York Times on The Math of March Madness. It’s centered around a paper by Michael J. Lopez and Gregory J. Matthews which claims that a model combining point spreads and “possession based team efficiency metrics” (i. e. average numbers of points scored or given up per possession) did quite well in Kaggle’s 2014 March Madness competition. (For legal reasons, Kaggle had to call it “March Machine Learning Mania”.)

Sadly, this article doesn’t include Jordan’s own contribution to bracketology, the “Math Bracket”, in which the school with the better math department is picked to win each game: here are the 2015, 2014, 2013, and 2010 (the original). If there are 2011 or 2012 math brackets, they don’t appear on his blog. In 2010 the math bracket picked Berkeley (excuse me, “Cal”, since we’re talking athletics) to win; in 2013 through 2015, Harvard.

I don’t know how well a totally random bracket (i. e. picked by coin flips) would do, but the math bracket at least starts out better than this. The math bracket usually picks the higher-seeded team in the first round – 19 of 32 in 2015, 22 of 32 in 2014, 23 of 32 in 2013, and 23 of 32 in 2010. This is because bigger schools (up to a point) tend to have better math departments and are better at basketball. (Quality of a department is judged by how many people an anonymous group of number theorists and geometric group theorists can name so is correlated with department size, which in turn is correlated with undergraduate enrollment, etc.)
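To put a number on "starts out better", here is a minimal sketch that pools the four first-round counts quoted above. The 50% baseline is an assumption: it is what a coin-flip bracket would average in siding with the higher seed, not a figure from Jordan's blog.

```python
# Higher-seeded first-round picks per math bracket, as quoted in the text.
higher_seed_picks = {2015: 19, 2014: 22, 2013: 23, 2010: 23}
GAMES_PER_FIRST_ROUND = 32

total_picks = sum(higher_seed_picks.values())
total_games = GAMES_PER_FIRST_ROUND * len(higher_seed_picks)
math_bracket_share = total_picks / total_games

# Assumption: a coin-flip bracket agrees with the seeding half the time on average.
coin_flip_share = 0.5

print(f"{total_picks} of {total_games} picks favor the higher seed "
      f"({math_bracket_share:.1%} vs. {coin_flip_share:.0%} for coin flips)")
```

Pooled over the four years, the math bracket sides with the higher seed in roughly two games out of three.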

The math brackets seem to break down in the later rounds, though. The 2010 bracket has a final four featuring teams seeded 2, 4, 8, and 11; 2013 is 2, 6, 12, and 14; 2014 is 2, 2, 10, and 12; 2015 is 7, 11, 11, and 13. The average final four team in the math brackets is therefore an 8 seed (the average of those numbers is 127/16 = 7.9375); the average team in the tournament is of course an 8.5 seed. The very best basketball teams just aren’t at schools with the best math departments.
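The seed arithmetic is easy to check. This short sketch uses only the final four seeds listed above and compares their average with the average seed over a full 1-to-16 seeding.

```python
# Final four seeds in each math bracket, as listed in the text.
final_four_seeds = {
    2010: [2, 4, 8, 11],
    2013: [2, 6, 12, 14],
    2014: [2, 2, 10, 12],
    2015: [7, 11, 11, 13],
}

all_seeds = [seed for seeds in final_four_seeds.values() for seed in seeds]
math_bracket_average = sum(all_seeds) / len(all_seeds)

# Every region has one team at each seed from 1 to 16,
# so the average seed in the field is the mean of 1..16.
field_average = sum(range(1, 17)) / 16

print(sum(all_seeds), len(all_seeds), math_bracket_average, field_average)
```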


Worst pi approximation ever

The average sinuosity of rivers – their length as the river travels, divided by the distance from source to mouth as the crow flies – is supposed to be π.  But it’s actually 1.94, and depends heavily on the river.  Via James Grime, writing for the Grauniad.  

That’s worse than the Indiana pi bill or 1 Kings 7:23.  To be fair to the Biblical text, which refers to a “molten sea” that is “round all about”, thirty cubits around and ten cubits across, nobody there is claiming those are high-precision measurements.  And if you want to try really hard to make things work out you can argue that the inner brim was thirty cubits around and the outer diameter was ten cubits, which is stated three verses later.  
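For a sense of how much worse, here is a rough comparison of the three "approximations". The 3.2 for the Indiana bill is an assumption: it is the value most often attributed to the bill, and the post itself doesn't state one.

```python
import math

approximations = {
    "river sinuosity": 1.94,     # average quoted above
    "Indiana pi bill": 3.2,      # assumption: the commonly cited reading of the bill
    "1 Kings 7:23": 30 / 10,     # thirty cubits around, ten cubits across
}

# Absolute error of each value against pi.
errors = {name: abs(value - math.pi) for name, value in approximations.items()}

for name, error in sorted(errors.items(), key=lambda item: item[1]):
    print(f"{name}: off by {error:.3f}")
```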

(It’s a shame that approximation isn’t in 1 Kings 7:22…)

This post is scheduled to go out at 12:26, because I am in Eastern time and Pi Day was originally invented in San Francisco. Or actually because I forgot to do a Pi Day post until after I saw that other people had.


Time zones

It’s that day when everyone in the US pays attention to time zones, because we all lose an hour of sleep tonight.  And at least in my case, I’ll be a bit bitter tomorrow morning, when the sun rises at 7:56 in Atlanta – a city that really should be on Central Time, but is presumably Eastern because, well, look at a map, Georgia is on the East Coast.  (For a cheap thrill, drive to Alabama in less than an hour – as you can do on I-20. Then set your clock back and have arrived before you left.)

Time zones are basically a clustering problem with some extra restrictions.  You want to set times so that:

  • Almost everybody’s time differs by a whole number of hours from UTC;
  • Clock times are not too far from solar time;
  • The time where you are is the same as the time in nearby places you communicate with;
  • Time zone boundaries align with geographical boundaries.

The second of these criteria keeps time zones narrow; the third keeps them wide.  The Basement Geographer has some examples where keeping the time zones wide – as in countries like Brazil, Russia, India, or China – means time zone boundaries with neighboring countries that aren’t the standard one-hour change upon moving east or west. And Allison Schrager at Quartz has suggested that the US should have two time zones, one hour apart.  (These would be UTC-5 in the east and UTC-6 in the west.)  All of Western Europe being on UTC+1 is another example – although from what I understand there’s some World War II history tied up in this. France, for example, was on UTC before the war – although the law called it Paris mean time, retarded by nine minutes and twenty-one seconds. Anything to avoid letting the British win.
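The "not too far from solar time" criterion can be made concrete with a little arithmetic: mean solar time shifts by four minutes per degree of longitude. The sketch below is a rough illustration only. The longitudes are approximate values assumed for the example, and it ignores the equation of time.

```python
MINUTES_PER_DEGREE = 4.0  # 24 hours * 60 minutes / 360 degrees


def clock_minus_solar_minutes(longitude_deg_east, utc_offset_hours):
    """How far the clock runs ahead of local mean solar time, in minutes."""
    mean_solar_offset = longitude_deg_east * MINUTES_PER_DEGREE
    return utc_offset_hours * 60.0 - mean_solar_offset


# Assumed approximate longitudes (degrees east; west is negative).
ATLANTA_LONGITUDE = -84.39
PARIS_LONGITUDE = 2.3372

# Atlanta on daylight time: Eastern (UTC-4) vs. a hypothetical Central (UTC-5).
atlanta_eastern_daylight = clock_minus_solar_minutes(ATLANTA_LONGITUDE, -4)
atlanta_central_daylight = clock_minus_solar_minutes(ATLANTA_LONGITUDE, -5)

# Paris mean time relative to Greenwich, in seconds.
paris_lead_seconds = PARIS_LONGITUDE * MINUTES_PER_DEGREE * 60.0

print(round(atlanta_eastern_daylight), round(atlanta_central_daylight))
print(divmod(round(paris_lead_seconds), 60))
```

Under these assumptions Atlanta's clocks run more than an hour and a half ahead of the sun on Eastern daylight time, and the Paris figure lands within a second of the nine minutes and twenty-one seconds in the French law.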



Indian food (and wine) pairing

Scientists have figured out what makes Indian food so delicious, from Roberto A. Ferdman at Wonkblog. In Western cuisines, ingredients in a dish are more likely to share flavor components than ingredients picked at random; in Indian cuisines, ingredients in a dish are less likely to share flavor components than ingredients picked at random. (East Asian cuisines are like Indian ones in this respect.) This is a result from the paper “Spices form the basis of food pairing in Indian cuisine” by Anupam Jain, Rakhi N K, and Ganesh Bagler.
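One way to make "share flavor components" concrete is to average the number of compounds shared by each pair of ingredients in a recipe. The sketch below assumes that definition, which may differ in detail from the measure used in the paper. The ingredient names and compound labels are made-up placeholders, not real data.

```python
from itertools import combinations


def mean_shared_compounds(recipe, compounds):
    """Average number of flavor compounds shared per pair of ingredients."""
    pairs = list(combinations(recipe, 2))
    if not pairs:
        return 0.0
    shared = sum(len(compounds[a] & compounds[b]) for a, b in pairs)
    return shared / len(pairs)


# Hypothetical toy data: placeholder ingredients and compound labels.
TOY_COMPOUNDS = {
    "ingredient_a": {"c1", "c2", "c3"},
    "ingredient_b": {"c2", "c3", "c4"},
    "ingredient_c": {"c5", "c6"},
    "ingredient_d": {"c7"},
}

overlapping_recipe = ["ingredient_a", "ingredient_b"]
disjoint_recipe = ["ingredient_a", "ingredient_c", "ingredient_d"]

overlapping_score = mean_shared_compounds(overlapping_recipe, TOY_COMPOUNDS)
disjoint_score = mean_shared_compounds(disjoint_recipe, TOY_COMPOUNDS)
print(overlapping_score, disjoint_score)
```

Comparing such a score for real recipes against the same score for randomly assembled ingredient sets is the kind of test that would separate "positive" from "negative" pairing.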

The paper describes this sort of “negative food pairing” as possibly originating from a “copy-mutate model”, which comes from a paper called The nonequilibrium nature of culinary evolution by Osame Kinouchi, Rosa Diez-Garcia, Adriano Holanda, Pedro Zambiachi, and Antonio Roque. The copy-mutate model supposes that recipes (well, bags of ingredients) evolve by copying and mutation, where ingredients have an intrinsic fitness and mutations involve replacing inferior ingredients with superior ones. I’m not convinced by this, because there’s no reason to think that Indian food would be more prone to evolution than any other.

I learned about the first paper from my wife, a sommelier. So this raises an interesting question: how do you pair wine with Indian food? Do you pair food with wine that contains the same flavor compounds (which is roughly the Western way of thinking about wine)? Or would it be more appropriate, on some level, to pair with a wine that doesn’t contain the same compounds? Here are some recommendations from Serious Eats for pairing wine with Indian food, and here are some recommendations from a wine pairing web site by the British food and wine writer Fiona Beckett. Draw your own conclusions.


Psych journal bans hypothesis testing

The journal Basic and Applied Social Psychology is banning the NHSTP. (That’s the “null hypothesis significance testing procedure” you might remember from an intro stats course.) This includes banning confidence intervals, thanks to the duality between confidence intervals and hypothesis tests. The journal’s editors write that:

We hope and anticipate that banning the NHSTP will have the effect of increasing the quality of submitted manuscripts by liberating authors from the stultified structure of NHSTP thinking thereby eliminating an important obstacle to creative thinking.

I’ve seen the fixation on statistical significance be a big block in presenting results in business settings. I can’t count the times I’ve had to explain, especially with “big data” sets, that just because something is statistically significant doesn’t mean it’s practically significant. Presumably the same is true in the social science setting; in both cases you have people who have some statistical education and are generally used to looking at numbers, but who are not statisticians or otherwise specialists in a quantitative field. How much of the cult of statistical significance comes from the choice of words?
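The "big data" point is easy to see with made-up numbers. The sketch below runs a plain two-sample z-test, assuming equal group sizes and a known standard deviation. The effect size and sample sizes are hypothetical, chosen only to show how sample size alone drives the p-value.

```python
import math


def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means (equal n, known sd)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


# Hypothetical effect: a difference of 0.01 standard deviations.
TINY_EFFECT = 0.01

z_big, p_big = two_sample_z(TINY_EFFECT, sd=1.0, n_per_group=1_000_000)
z_small, p_small = two_sample_z(TINY_EFFECT, sd=1.0, n_per_group=1_000)

print(f"n = 1,000,000 per group: z = {z_big:.2f}, p = {p_big:.2g}")
print(f"n = 1,000 per group:     z = {z_small:.2f}, p = {p_small:.2g}")
```

The effect is the same in both runs; only the amount of data changes, and with it the verdict on "significance".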

However, this may be a bit of an overreaction. To say “thou shalt not do X” could be just as restrictive as “thou shalt always do X”.


Nate Silver on sports data

Nate Silver has written on what’s so great about sports data. In short: it’s rich data (not just big data), we know the rules, and feedback comes quickly.

This is from ESPN The Magazine’s “Analytics Issue”, which comes out each year in connection with the Sloan Sports Analytics Conference, held this year in Boston on Friday, February 27 and Saturday, February 28. I went in 2013, when I was working for a sports ticketing company; a lot of interesting talks happen there. Most talks from past conferences have been posted online, so it’s worth poking around if you have an interest.


Chinese New Year during Lent?

As I’m writing this, tomorrow is Chinese New Year (in the US) and today is Ash Wednesday. (I suspect it’ll be a day later when you read this.) This raises a question: does Chinese New Year often fall during Lent (that is, on or after Ash Wednesday)? The coincidence creates conflicts for many Asian Catholics: see e. g. here, here, here, here.

Chinese New Year falls on the second new moon after the winter solstice.

Ash Wednesday is 46 days before Easter. (Yes, 46. I know, you thought Lent was 40 days. Sundays don’t count.) Easter is the Sunday after the first full moon on or after the spring equinox (the “Paschal full moon”).

How far apart are these? Well, there are about 90 days between the winter solstice (December 22) and the spring equinox (March 21). This is between three and three-and-a-half lunar months (of 29.5 days each), so from Chinese New Year to the Paschal full moon (i. e. the full moon on or after the spring equinox) is either one-and-a-half or two-and-a-half lunar months. In the cases when it’s one-and-a-half, Ash Wednesday will fall around Chinese New Year; when it’s two-and-a-half, Ash Wednesday will be a month or so after Chinese New Year.

The short interval happens when Chinese New Year is relatively late in the window of dates it can occur, which is January 21 to February 20. In particular, if Chinese New Year is less than about 44 days (that is, one-and-a-half lunar months) before the spring equinox (March 21), then it’s 1.5 lunar months from the Paschal full moon, and we get a situation like this year’s. That is, Chinese New Year is roughly around the beginning of Lent if it’s around February 5 or later – about half the time. Not so unusual after all.
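The February 5 cutoff and the "about half" both follow from date arithmetic. This sketch uses the post's own simplifications: a fixed March 21 equinox, 29.5-day lunar months, and a January 21 to February 20 window in which every date is treated as equally likely. That last point is an assumption made only for the estimate.

```python
from datetime import date, timedelta

LUNAR_MONTH_DAYS = 29.5
YEAR = 2015  # any non-leap year gives the same cutoff

equinox = date(YEAR, 3, 21)
short_interval_days = round(1.5 * LUNAR_MONTH_DAYS)  # about 44 days
cutoff = equinox - timedelta(days=short_interval_days)

# Window of possible Chinese New Year dates, as given in the text.
window_start = date(YEAR, 1, 21)
window_end = date(YEAR, 2, 20)
window_days = (window_end - window_start).days + 1

# Assumption: each date in the window is equally likely.
late_days = (window_end - cutoff).days + 1
late_share = late_days / window_days

print(cutoff, f"{late_days} of {window_days} dates", f"{late_share:.0%}")
```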

As for the day of the week – Ash Wednesday works out to be between 39 and 45 days before the Paschal full moon. Chinese New Year is about 44 days before in half of years. So Chinese New Year can only fall within the first couple days of Lent. (If you want to do the calculations: can it fall later than Thursday? I suspect this might be possible, because the two calendars work on different rules – the Chinese calendar is based on astronomical observation whereas the Christian ecclesiastical calendar is based on computations that can be done with relatively simple arithmetic.)
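For anyone who does want to do the calculations, the ecclesiastical half really is simple arithmetic. The sketch below uses the widely published "anonymous Gregorian" computus for Easter; that particular algorithm is a choice made here, not something the post specifies. Chinese New Year has to be supplied from a table, since it depends on astronomical data. The 2015 date is the one implied by the opening of this post.

```python
from datetime import date, timedelta


def easter(year):
    """Gregorian Easter Sunday via the anonymous Gregorian computus."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def ash_wednesday(year):
    """Ash Wednesday is 46 days before Easter Sunday."""
    return easter(year) - timedelta(days=46)


# Chinese New Year 2015, taken from the post (the day after Ash Wednesday).
chinese_new_year_2015 = date(2015, 2, 19)
days_into_lent = (chinese_new_year_2015 - ash_wednesday(2015)).days

print(easter(2015), ash_wednesday(2015), days_into_lent)
```

Feeding in a table of Chinese New Year dates for other years and checking `days_into_lent` would answer the "later than Thursday" question empirically.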
