Weekly links for August 19

From Deathsplanation, How tall is an alpaca? (and how long did people live in the past?)

From the Aperiodical, how to win at (the UK game show) Pointless.

Igor Pak has a collection of attempts to define combinatorics and a blog post on the subject.

From the MAA, some beautiful drawings of Platonic solids from the 16th-century printmaker Wenzel Jamnitzer.

From the Telegraph, Quants: the maths geniuses running Wall Street.

George Hart on making music with Möbius strips (following Dmitri Tymoczko).

Andrew Gelman and Kevin O’Rourke ask how statisticians pick their methods.

Carlos Furuti’s cartographical map projections.

OpenSignal on how phone batteries measure the weather (via Hacker News).

Hanging Hyena on unbeatable words in Hanging with Friends.

Emily Singer for Quanta magazine: In Natural Networks, Strength in Loops.

Hannah Fry on why everyone is more popular than you.

Jiri Matousek has a collection of Thirty-three miniatures: Mathematical and algorithmic applications of linear algebra.

A Quora question, What kind of math do you use in your work?

Revisiting Pythagoras goes linear

Let’s say for some reason that you know \sin \theta and \cos \theta, for some angle \theta, and you need to figure out what \theta is. Let’s say, furthermore, that you live in some benighted age which doesn’t have calculators or even trigonometric tables.

There are a few approaches to this. One is what I did in my “Pythagoras goes linear” post. We can fit a linear model to the points

(x_i, y_i) = (\cos \theta_i, \sin \theta_i)

where the \theta_i range over an arithmetic sequence with endpoints 0 and \pi/2, namely

0, {\pi \over 2n}, {2\pi \over 2n}, \cdots, {n\pi \over 2n}.

The model you get out is quite simple:

\theta \approx \pi/4 + 0.7520 \sin \theta - 0.7520 \cos \theta

This is actually just a few lines of R.

 # Regress theta on x = cos(theta) and y = sin(theta) over [0, pi/2]
 n = 10^6
 theta = (0:n)*pi/(2*n)
 x = cos(theta); y = sin(theta)
 summary(lm(theta~x+y))

I’ll save you the output, but be impressed: R^2 = 0.9992, and the root-mean-square error (R’s residual standard error) is 0.0126 radians, or 0.72 degrees. So we can write \theta \approx \pi/4 + 0.7520 (-\cos \theta + \sin \theta), in radians. In degrees this is \theta \approx 45^\circ + 43.08^\circ (-\cos \theta + \sin \theta).

For example, say you have an angle with \sin \theta = 5/13, \cos \theta = 12/13: the smaller angle of a 5-12-13 right triangle. Then we get \theta \approx 45^\circ + 43.08^\circ (-7/13) \approx 21.8^\circ. In fact \theta = \tan^{-1} (5/12) \approx 22.6^\circ, so we’re not off by much! And the error is never more than a couple of degrees, as you can see in the plot below.

[Figure: arctan error]
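If you’d rather check the numbers than trust the plot, here is a minimal R sketch. It assumes the rounded coefficient 0.7520 quoted above, and the function name approx_theta is just something I made up for illustration. It evaluates the linear rule on the 5-12-13 example and then looks at the worst-case error over the first quadrant.

```r
# Linear rule from the fit, in radians, with x = cos(theta), y = sin(theta).
# 0.7520 is the rounded coefficient quoted in the post (an assumption here).
approx_theta <- function(x, y) pi/4 + 0.7520 * (y - x)

# The 5-12-13 example: cos = 12/13, sin = 5/13
est_deg   <- approx_theta(12/13, 5/13) * 180/pi
exact_deg <- atan2(5/13, 12/13) * 180/pi
print(c(estimate = est_deg, exact = exact_deg))

# Worst-case absolute error on a grid over [0, pi/2], in degrees
theta <- seq(0, pi/2, length.out = 1001)
err   <- approx_theta(cos(theta), sin(theta)) - theta
max_err_deg <- max(abs(err)) * 180/pi
print(max_err_deg)
```

The largest errors turn up at the two endpoints, 0 and \pi/2, which is what you’d expect from a least-squares line through a curve.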

This was inspired by a post by Jordan Ellenberg that I came across recently: How to compute arctangent if you live in the 18th century, which refers back to my Pythagoras goes linear. A better approximation, although nonlinear, is

\theta \approx (3y)/(2 + x)

where x = \cos \theta, y = \sin \theta. This is essentially a simplification of the rule that Ellenberg’s source (Hugh Worthington’s 1780 textbook, The Resolution of Triangles) gives, which can be translated into our notation as

\theta \approx {86\pi \over 180} {y \over x/2 + 1}.

Since 86\pi/180 \approx 1.501 and y/(x/2 + 1) = 2y/(2 + x), this is very nearly 3y/(2 + x). Applying the simplified formula to our test angle with x = 12/13, y = 5/13, we get \theta \approx 15/38 = 0.39474 radians, while in truth \tan^{-1} (5/12) = 0.39479.
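Here is a small R sketch putting the two rules side by side on the test angle. It takes x = \cos \theta and y = \sin \theta; the function names simple and worthington are my own labels, and the second is only my transcription of the rule into our notation, not Worthington’s wording.

```r
# x = cos(theta), y = sin(theta); both functions return radians.
simple      <- function(x, y) 3 * y / (2 + x)
worthington <- function(x, y) (86 * pi / 180) * y / (x / 2 + 1)

# Test angle: the smaller angle of a 5-12-13 right triangle
x <- 12/13
y <- 5/13

exact <- atan2(y, x)
print(c(simple = simple(x, y),
        worthington = worthington(x, y),
        exact = exact))
```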

This formula \theta \approx (3y)/(2+x) is so nice that I can’t help but suspect there’s a simple derivation. Any takers?
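This is not a derivation, but a Taylor expansion at least shows why the formula does so well for small angles. Taking x = \cos \theta and y = \sin \theta and using only the standard series for sine and cosine, a sketch:

```latex
\frac{3 \sin\theta}{2 + \cos\theta}
  = \frac{\theta - \theta^3/6 + \theta^5/120 - \cdots}
         {1 - \theta^2/6 + \theta^4/72 - \cdots}
  = \theta - \frac{\theta^5}{180} + O(\theta^7)
```

So the formula agrees with \theta through fourth order. For the test angle, \theta^5/180 is about 5.3 \times 10^{-5}, which is roughly the gap between 0.39474 and 0.39479.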

Weekly links for August 12

Cathy O’Neil asks if lawmakers should use algorithms.

Daina Taimina gave a talk at TEDxRiga, Crocheting adventures in hyperbolic planes.

Corey Chivers gave a talk on competitive data science (Kaggle) using R and Python.

Colm Mulcahy has another nice mathematical card trick.

Mark Pearson on average distances to airports.

Jennifer Ouellette on the mathematics of learning language.

Geoffrey de Smet on false assumptions for vehicle routing.

From embed.ly, visualizing reddit discussions.

What’s the theoretical limit for the rate of success of predicting wins and losses in NHL games?

Some statistics from Tim Day’s experience solving Project Euler problems.

Alexander Klotz examines the gravity tunnel in a non-uniform Earth.

Carl Rasmussen and Christopher Williams’ book Gaussian Processes for Machine Learning is available free online.

Joseph Rickert summarizes Nate Silver’s appearance at this year’s Joint Statistical Meetings.

From Wired a couple years back, Jason Fagone, Teen Mathletes Do Battle at Algorithm Olympics (i.e., the IOI).

David Bau’s conformal map viewer.

Predicting the NFL using Twitter, by Shiladitya Sinha, Chris Dyer, Kevin Gimpel, and Noah A. Smith.

“A lottery is a taxation upon all the fools in creation” – Fielding

From James Harvey: The Powerball Jackpot is $425m. Should you play? (It’s too late for you to care – I’m writing this at 7:51 PM Pacific time, and the drawing is at 7:59 – but if nobody wins, you might care on Saturday, with a bigger number.) Harvey’s conclusion: it basically never makes sense to play, because as the jackpot goes up the likelihood of having to split it also goes up. But he’s still in it, for the entertainment value.

I’ve got some tickets, too, because some coworkers went in on a pool, and I don’t want to be the sucker who has to go to work tomorrow.

Weekly links for August 5

Roberto Tamassia at Brown is the editor of the Handbook of Graph Drawing and Visualization, which you can read online.

David Eppstein on how to play Planarity.

Edmund Harriss wrote a blog post on “Form Follows Functions”, a teaser for these notes of the same name graphically exploring how changes in functional forms translate into changes in the corresponding graphs.

Chris Stucchio on Mechanical Turk and error correcting codes.

Insight Data Science on the transition from PhD to data scientist (via Hacker News).

Miguel Rios at the official Twitter blog on visualizing volume data from Twitter; the corresponding paper is Miguel Rios and Jimmy Lin, “Visualizing the ‘Pulse’ of World Cities on Twitter”. Of particular interest is Twitter data from Riyadh, where prayer times and Ramadan are immediately observable.

Steve Staude at Fangraphs did a two-part analysis of forecasting strikeout rates in batter-pitcher matchups from the averages for the batter and the pitcher: part one, part two. It’s a hell of a lot better than using data on the individual matchup between batter X and pitcher Y, which is plagued with notoriously small sample sizes.

From Cathy O’Neil: The Stacks Project gets ever awesomer with new viz and Analyzing the complexity of the Stacks Project graphs; from Jordan Ellenberg, How much is the Stacks Project graph like a random graph? (The Stacks Project is an online textbook of algebraic geometry, but these posts are not about algebraic geometry.)

Timo Bingmann has constructed some visualizations/audibilizations of sorting; Brady Haran has a similar video on his Computerphile channel.

34 million pizzas is a massive understatement

During a baseball game today, I heard a commercial for Domino’s Pizza claiming that they have 34 million different pizzas. This is apparently a claim that Domino’s started making when arguing that they shouldn’t be forced to list calorie counts: see the Washington Post from June 2012.

In any case, I wondered where they got that number. From their online ordering tool I find:

  • on the first page (size & crust): 10 combinations of size and crust;
  • on the second page (cheese & sauce): six different amounts of cheese (none, light, normal, extra, double, triple) and thirteen sauce choices: four kinds of sauce each in three different amounts, or none;
  • on the third page, 25 different toppings, each of which you can order in six different amounts (none, light, normal, extra, double, triple).

The “34 million” number is presumably 2^{25} = 33,554,432 (yes or no for each topping), but a more accurate calculation of the number of possible pizzas is 10 \times 6 \times 13 \times 6^{25}, or about 2.2 \times 10^{22}. Call this number N. And even that’s ignoring that you can make independent choices for the two halves of your pizza, so if we really wanted to inflate the number, there are N + {N \choose 2} possible pizzas, where the {N \choose 2} pizzas are just unordered pairs of distinct choices for the two halves. This is around 2.5 \times 10^{44}.
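For anyone who wants to redo the arithmetic, here is a minimal R sketch using the counts I read off the ordering tool above. The variable names are mine, and the big numbers are floating-point approximations, not exact integers.

```r
# Counts as read off the ordering tool in the list above
n_toppings    <- 25
size_crust    <- 10
cheese_levels <- 6
sauce_choices <- 13   # 4 sauces x 3 amounts, plus "none"
topping_levels <- 6   # none, light, normal, extra, double, triple

# Yes/no for each topping: presumably where "34 million" comes from
yes_no_only <- 2^n_toppings

# Every choice counted: N in the text
N <- size_crust * cheese_levels * sauce_choices * topping_levels^n_toppings

# Allowing different halves: N whole pizzas plus choose(N, 2) mixed ones.
# N * (N - 1) / 2 is choose(N, 2), computed in floating point.
with_halves <- N + N * (N - 1) / 2

print(yes_no_only)
print(signif(N, 3))
print(signif(with_halves, 3))
```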

But nobody would have believed that. And how many of those pizzas are any good?