# Internet Problem Solving Contest

Some readers may find the Internet Problem Solving Contest of interest. You’ll have five hours to solve some puzzles of a recreational-mathematical nature; the best strategies are almost always programming-based. I’m reminded of Project Euler. It takes place this Saturday, June 2, from 10:00 UTC to 15:00 UTC. (This is unfortunately timed for Americans; the organizers are Slovakian.)

# Most extroverted cities list

Fun with small sample sizes: The most extroverted city in the US is Keota, Iowa.

Keota has 1,009 people. The cutoff for being included in this study was, supposedly, 1,000 people.

The top ten cities are listed in the Des Moines Register article; they have populations 1009, 1431, 3183, 1000, about 350, 206, 1468, 950, and 249. (Yes, you read that right; a lot of these cities appear to have populations smaller than 1000. I suspect there’s some conflation of “city” and “area served by that city’s post office” going on.)

This is the same effect that I mentioned last week with regard to Amazon’s best-read cities list, and again I wouldn’t be surprised if the most introverted cities are also very small. The Des Moines Register article says that pyco plans to release this list as well.

# Clustering of college graduates: is it getting worse?

An article in today’s New York Times, by Sabrina Tavernise, is entitled As College Grads Cluster, Some Cities Are Left Behind. A lot of old US cities with economies that used to be based on manufacturing are having trouble making the transition to our current, post-manufacturing economy. And one difficulty such cities face is a lack of college graduates.

Historically, most American cities had relatively similar shares of college graduates, in part because fewer people went to college. In 1970, the difference between the most-educated and least-educated cities, in terms of the portion of residents with four-year degrees, was 16 percentage points, and nearly all metro areas were within 5 points of the average. Today the spread is double that, and only half of all metro areas are within 5 points of the average, the Brookings research shows.

But what does “relatively similar” mean here? The proportion of adults in metropolitan areas that have college degrees, according to the accompanying infographic, has risen from 12% in 1970 to 32% in 2010. I would guess that a city with, say, 9% college graduates in 1970 is comparable to a city with 24% college graduates in 2010 — both have three-fourths of the average.

(Admittedly this doesn’t hold up if the percentages are quite large. For example, let’s say we’re looking at literacy rates; I’d say that a metropolitan area having 40% literacy at a time when the national rate is 50% is relatively better off than one having 76% literacy when the national rate is 95%, even though both are at 80% of the national rate. But bear with me.)

Indeed, from the infographic you can also get the actual distribution of the percentage of college graduates in each of the metro areas in question. (The study includes 100 metro areas in each of 1970 and 2010.) In 1970 the average metropolitan area had 11.5 percent college graduates, with SD 2.9 percent; the standard deviation is 25 percent of the mean. In 2010 the average metropolitan area had 29.4 percent college graduates, with SD 6.2 percent; the standard deviation is 21 percent of the mean. In these terms, the disparity has gotten smaller!

So let’s normalize the share for every metropolitan area by comparing to the average. In 1970, for example, Washington, DC had 22.1% college graduates, compared to the average of 11.5%, so it had 1.92 times the average. In 2010, Washington, DC had 46.8% college graduates, compared to the average of 29.4%, so it had 1.59 times the average. In this respect it looks like Washington is getting more like the US, not less. (Washington was the most college-degreed metropolitan area, in both samples, which presumably has something to do with its dominant industry being government.)
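The arithmetic in the last two paragraphs is easy to reproduce. Here is a minimal sketch in Python using only the figures quoted above; the function and variable names are mine, chosen for illustration.

```python
# Figures quoted in the text (percent of adults with a college degree).
stats = {
    1970: {"mean": 11.5, "sd": 2.9, "washington": 22.1},
    2010: {"mean": 29.4, "sd": 6.2, "washington": 46.8},
}


def relative_spread(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


def normalized_share(share, mean):
    """A metro area's share expressed as a multiple of the average."""
    return share / mean


for year, s in stats.items():
    cv = relative_spread(s["mean"], s["sd"])
    dc = normalized_share(s["washington"], s["mean"])
    print(f"{year}: SD/mean = {cv:.0%}, Washington = {dc:.2f} x average")
```

Dividing by the mean is what makes the two years comparable: both the spread and Washington's lead shrink on this scale, even though the raw percentage-point gaps grow.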

If we make histograms of these normalized shares for 1970 and 2010 and superimpose them, we get the plot below. Black is 1970, red is 2010. The distribution gets narrower, not wider, as time passes when viewed on this scale.

I don’t mean to take away from the fact that this disparity exists between metropolitan areas. But the real problem is probably not so much that the educational disparity is growing as that the returns to a college education are larger with the departure of manufacturing jobs.

Matt Yglesias has commented on this from an economic point of view, echoing some points that Enrico Moretti makes in The New Geography of Jobs.  In particular, what’s the point of states funding public education if people are just going to move away from those states?

Edited to add, 4:49 pm: Junk Charts comments on the graphic itself.

# A couple short mathematical films by Cristóbal Vila

Inspirations is a short mathematical film by Cristóbal Vila, who works in computer graphics. The underlying conceit is “what might Escher’s workplace look like?”, although in a completely imaginary sense. Nature by numbers is similarly impressive, featuring animations of the appearance of the Fibonacci sequence in phyllotaxis. These aren’t really “educational” videos (you won’t learn anything from them if you don’t recognize what’s going on already), but some of you may recognize what’s going on, and if you don’t, Vila has produced pages on the mathematics behind Inspirations and Nature by numbers.

# Weekly links for May 28

(Usually I’ve been posting weekly links on Sundays, but I was on a road trip all week and so I couldn’t keep up with what was coming over the wires too well. Also, road trips inspire some mathematical thoughts; you’ll see them later this week, I hope.)

An excellent MetaFilter post has links to lots of different types of graph paper.

A free online forecasting textbook (under construction) by Rob Hyndman and George Athanasopoulos.

A MetaFilter thread on the history of the stockbroker scam.

# Fajita pricing

Seen in a Mexican restaurant: fajita meal for three, \$36. For four, \$48. For five, \$60. They then offer the helpful suggestion that if your group has nine people, you could order the fajita meal for four and the fajita meal for five, and you’d be set. Of course this is exactly the same as saying that the fajita meal is \$12 per person, minimum of three people, since any integer that’s at least three can be written as a sum of threes, fours, and fives.
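The claim that every group of three or more can be covered is easy to check by brute force. This is a small sketch, not anything the restaurant published; `fajita_orders` is a made-up helper that builds one valid combination of meal sizes for each group size.

```python
def fajita_orders(people, sizes=(3, 4, 5)):
    """Return one list of meal sizes summing to `people`, or None if impossible."""
    # reachable[n] holds one combination of sizes summing to n (or None).
    reachable = [None] * (people + 1)
    reachable[0] = []
    for n in range(1, people + 1):
        for size in sizes:
            if n >= size and reachable[n - size] is not None:
                reachable[n] = reachable[n - size] + [size]
                break
    return reachable[people]


for group in range(3, 13):
    order = fajita_orders(group)
    print(group, order, f"{12 * sum(order)} dollars")
```

The reason it always works: 3, 4, and 5 cover three consecutive integers, and adding another meal for three to any of them reaches the next three, and so on.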

# Amazon’s best-read cities

Via Berkeleyside, I learned about Amazon.com’s best-read cities list. The top five are Alexandria, VA; Cambridge, MA; Berkeley, CA; Ann Arbor, MI; Boulder, CO. The populations (2010 Census estimates) of these cities are 139,966; 105,162; 112,580; 113,934; and 97,385.

You won’t be surprised, then, to learn that the survey only includes cities with populations of over 100,000. A lot of these very high-ranked cities barely get over that line. Amazon hasn’t released the whole list as far as I can tell, but I would suspect that the worst-read cities are also just barely over 100,000 in population. To be fair, I’ve cherry-picked a bit here; #6 Miami and #9 Washington, D. C., for example, are quite a bit larger than 100,000.

But we expect larger deviations from the norm in smaller samples. I wouldn’t be surprised to learn that, say, some 100,000-person portion of San Francisco or Boston is better-read than Berkeley or Cambridge. Perhaps the list of most well-read zip codes, then, would be more revealing.
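To put a rough number on “larger deviations in smaller samples”, here is a sketch under a deliberately crude assumption: each resident independently has some trait with the same probability `p`. That is my simplification for illustration, not how Amazon (or anyone else) actually computes these rankings, and the value of `p` is hypothetical.

```python
import math


def standard_error(p, n):
    """Standard deviation of the observed share in a sample of n people,
    assuming each person independently has the trait with probability p."""
    return math.sqrt(p * (1 - p) / n)


p = 0.5  # hypothetical underlying rate, chosen only for illustration
for n in (1_000, 100_000, 1_000_000):
    print(f"n = {n:>9,}: typical deviation = {standard_error(p, n):.4f}")
```

Under this model the typical deviation scales like one over the square root of the population, so a town of 1,000 wanders about ten times as far from the norm as a city of 100,000. That alone is enough to crowd the top (and bottom) of a ranking with the smallest places that clear the cutoff.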

# Weekly links for May 20

How to win at Battleship, in advance of the movie of the same name. Link goes to the non-technical description at Slate; here’s the technical version, by Nick Berry.

How subway networks evolve, from Scientific American.

Given a sack of sugar, an unbalanced scale, and two 5-pound weights, measure exactly 10 pounds of sugar. (H/T Dave Richeson)

Should we get rid of the “=” sign in mathematics?, from Republic of Math.

A mathematical challenge to obesity: the New York Times’ Claudia Dreifus speaks with Carson Chow of Pitt, who has a blog.

# A real-life example of a bimodal (or trimodal?) distribution

There are currently 174 books on my Amazon wishlist that I could order directly from Amazon. (My wishlist has a total of 195 books, but 21 are only available from other sellers.) Total price is approximately \$3,549 (I rounded all prices to whole dollars), for a mean of approximately \$20 per book.

But the median price of a book on my wishlist is (again to the nearest whole dollar) \$16; the difference between the median and the mean is a hint that the distribution is skewed. And there are actually two peaks — one centered on \$10 and one centered on \$16-17. The distribution looks like this:

I’ve cut off the histogram at \$100, which omits Mitchell’s Machine Learning at a list price of \$168.16. Here’s a zoomed-in version omitting the 23 most expensive (all those over \$30):

The two peaks are easy to explain: paperbacks and hardcovers, respectively. The long right tail is pretty much exclusively made up of technical books. I’d suspect that for those who read a lot but don’t buy technical books, the bimodality holds up but there’s a lot less skewness.

(If you look closely you might see a third peak, at around \$60, but in a data set of this size I’m not sure that’s real.)

This is a much less depressing example than my standard example of a bimodal distribution, salaries of first-year lawyers.

# Sedgewick slides on “Algorithms for the Masses”

Robert Sedgewick has the slides for a talk, Algorithms for the Masses, on his web site.

My favorite slide is the one titled “O-notation considered harmful”: Sedgewick observes that it’s more useful to say that the running time of an algorithm is ~aN^c (and back this up with actual evidence from running the algorithm) than to have a theorem that it’s O(N^c) (based on a model of computation that may or may not be true in practice).
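One common way to get that kind of evidence is a doubling experiment: measure the cost at N and at 2N, and read the exponent off the ratio. The sketch below is my own illustration, not code from the slides. To keep it reproducible it counts loop steps of a brute-force all-pairs scan instead of measuring wall-clock time; the function names are hypothetical.

```python
import math


def count_pair_comparisons(n):
    """Count the steps of a brute-force all-pairs loop over n items."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps


def estimate_power_law(cost, n):
    """Fit cost(N) ~ a * N**c from measurements at N and 2N."""
    t1, t2 = cost(n), cost(2 * n)
    c = math.log2(t2 / t1)      # doubling N multiplies the cost by about 2**c
    a = t2 / (2 * n) ** c       # solve for the leading constant
    return a, c


a, c = estimate_power_law(count_pair_comparisons, 500)
print(f"cost ~ {a:.3f} * N^{c:.3f}")
```

For this loop the fit lands close to 0.5 N^2, which is more informative than the bare statement that the loop is O(N^2). With real timings you would replace the step counter with a stopwatch and repeat the doubling a few times to smooth out noise.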

The serious point of the talk, though, is that everyone should learn some computer science, preferably in the context of intellectually interesting real-world applications. This is what Sedgewick is doing in his Princeton course and in his book with Kevin Wayne, Algorithms, 4th edition, which I confess I have not read. There’s a Coursera course, in two six-week parts, starting in August and November respectively. For a lot of the heavy-duty mathematics you can see Sedgewick’s book with Flajolet, Analytic Combinatorics, a favorite of mine from my grad-school days. There’s even a Coursera course for that: a 5-week part 1 in February 2013 and a 5-week part 2 in March 2013.