Weekly links for October 21

Rick Wicklin simulates playing craps with unfair dice.

StubHub has a data blog. The first post examines the correlations between preference in sport (baseball vs. football) and political party (Democratic vs. Republican).

Integrals don’t have anything to do within discrete math, do they?. By Mark Kayll, winner of the MAA’s Allendoerfer prize for articles published in Mathematics Magazine.

Allen Downey has a rough draft of his book Think Bayes.

Steven Strogatz has written a brief, nontechnical introduction to catastrophe theory for the New York Times.

Probability and game theory in The Hunger Games.

Daniel Engber asks how the Internet fell in love with a stats class cliche. (You know which one.)

Shapley’s layman’s account of the general principles of game theory. (Was anyone else surprised, when the Nobel was announced, to learn that Shapley was still alive?)

The Daily Show with Nate Silver.

Brian Hayes writes about sphere packing for American Scientist and talks about how he made the images for his blog.

David Barber has written a textbook, freely available online, entitled Bayesian reasoning and machine learning.

A right triangle optimization problem

James Tanton asked in a tweet a few days ago: “If a,b,c are the sides of a right triangle with hypotenuse c, what is the largest possible value of a/c + b/c?”

The answer is the square root of 2. But how to see this?

Well, without loss of generality we can assume c = 1. So we want to know, if a and b are the sides of a right triangle with hypotenuse 1, what is the largest possible value of a + b?

Now a and b are, respectively, cos θ and sin θ for some acute angle θ… so we want to maximize $\sin \theta + \cos \theta$ over acute angles $\theta$. But how? Differentiate? That sounds like work.

Take a look at the picture below. Imagine moving the upper right corner of the (black) right triangle along the (black) circular arc; we’re looking for the point where the sum of the lengths of the legs is maximized. But now say we can move the upper right corner of the right triangle wherever we want it. As long as we stay along one of those red lines — which have slope -1, i. e. make a 45-degree angle with each axis, the sum of the leg lengths stays constant!

As we move the upper right corner along the circular arc, the sum of the leg lengths is maximized when it’s locally constant — that is, when the circle is tangent to one of those lines. The red lines make a 45-degree angle with the coordinate axes; the optimal hypotenuse will be perpendicular to them, also making a 45-degree angle with the coordinate axes. The optimal right triangle is the one with 45-degree angles… so if its hypotenuse is 1, its legs are each $1/\sqrt{2}$ , and their sum $\sqrt{2}$ . (This is not the triangle in the plot — the triangle in the plot is in the 3-4-5 ratio. The sum of its legs is 1.4, not much short of optimality.)

This is, secretly, the principle behind using Lagrange multipliers to maximize a function subject to a constraint. It’s also much easier to see than to explain.

Older should mean bolder, says Speigelhalter

David Spiegelhalter suggests that older should mean bolder. Lethal risks are less dangerous if you’re old, because you have less to lose.

Bill Watterson thought of this too. (I thanks Michael Yingling’s Calvin and Hobbes search engine for helping me find this.)

Weekly links for October 14

Freakonomics Q&A with Nate Silver.

A new geometric minimal composition every day.

forecasting the Presidential election using regression, simulation, or dynamic programming.

Best practices for scientific computing. Worth reading if you are, like so many scientists, a self-taught software developer. (Actually, very little of what’s said here is specific to scientific computing.)

Stein’s paradox in statistics, by Bradley Efron and Carl Morris.

The Simons Foundation has an article by Erica Klarreich, Getting Into Shapes: From Hyperbolic Geometry to Cube Complexes and Back”, on Thurston’s geometrization conjecture.

Peter Norvig’s spelling corrector in 25 lines of Python

Square numbers as products of triangular numbers

James Tanton asks: “An old question: Triang nmbrs:1,3,6,10,15,.. For every triangular nmbr is there a second larger triang nmbr so that their product is square?”

Answer: yes! I thought “hey, this is a question I can answer in my head!” but didn’t get further than noticing that 1 times 36 is 36. Then I got home and wrote some Python. (You’ll probably see this blog’s language of choice shifting from R to Python, mostly because I’m using Python at work and this gives me a chance to practice.)

import math

def tri(n):
    return n*(n+1)/2

def is_square(integer):
    root = math.sqrt(integer)
    if int(root + 0.5) ** 2 == integer:
        return True
    else:
        return False

def next_tri(n, m):
    z = False
    i = 0
    while ((not z) and (i < m)):
        i += 1
        if is_square(tri(n)*tri(n+i)):
            return n+i
    return 0

def answers(n, m):
    list = [0]*n
    for i in range(1, n+1):
        list[i-1] = next_tri(i, m)
    return list

The first function here, tri, outputs triangular numbers; the second tells you if its input is a square, and is taken from this Stackoverflow answer. next_tri is where the “magic” happens; it outputs the index of the first triangular number k greater than n such that tri(n)*tri(k) is a square. But since this loop may, a priori, be infinite, we cut this off at k = n+m. If none of tri(n)tri(n+1), tri(n)tri(n+2), …, tri(n)tri(n+m) are square, it outputs 0. Finally, calling answers(n,m) outputs the list answers(1,m), answers(2,m), …, answers(n,m).

If you call answers(10, 1000) you get the output

[8, 24, 48, 80, 120, 168, 224, 49, 360, 440].

So, for example, tri(1) tri(8) = (1)(36) = 36 is square, namely 6²; tri(2) tri(24) = (3)(300) = 900 = 30²; and so on. If you ignore the eighth element of this list and stare for a moment, you start to see a quadratic! Namely,

$tri(n) tri(4n(n+1))$

appears to be a square. And in fact we can write

$tri(n) tri(4n(n+1)) = {n(n+1) \over 2} {(4n^2+4n)(4n^2+4n+1) \over 2} = {n(n+1) \times 4(n(n+1))(2n+1)^2 \over 4} = [n(n+1)(2n+1)]^2.$

This is an identity that’s not hard to prove, and answers Tanton’s question in the affirmative, but I think it’s hard to discover without computation. Of course you could do the computation by hand, but it’s 2012.

But this identity tells us that $tri(8) tri(288) = (8)(9)(17)^2$ — and this is true, but 288 is not the smallest solution to the problem that Tanton originally posed. In fact the computer search gives $tri(8) tri(49) = 210^2$ . When do smaller solutions exist?

Weekly links for October 7

Steven Strogatz digs up a clip about the birthday problem on the Tonight Show.

UK MPs have weak ability to do basic statistics. (I wonder if the same would be true of US congresspeople.)

In the Guardian, is it Mitt Romney’s fault, if he loses?, with regressions based on approval ratings.

Some data on job listings for data scientists

Weekly links for September 30

The Mathematics Behind xkcd: A conversation with Randall Munroe.

Carl Bialik at the Wall Street Journal writes about which voters are the most powerful in US presidential elections.

Forget your fancy data science: try overkill analytics.

Dynamic pricing for drinks, from Wired via io9.

An overview of mathematics at Google.

Identification of fraudulent elections. (Basically, if a region has abnormally high turnout and most of the votes go to a single candidate, be suspicious!)

Call me maybe and the prisoner’s dilemma.

Andrew G Haldane: The dog and the frisbee, via John D. Cook (Cook points out that you want to use a simple model when possible, because complicated models are less robust.)

Optimizing your baggage claim experience.

What is the optimal way to find a parking spot?

Weekly links for September 23

An analysis of (leaked) PIN numbers from DataGenetics

Friends you can count on, from Steven Strogatz’ current New York Times series “Me, Myself, and Math”.

The Toolbox, the first episode of Samuel Hansen’s 8-episode audio series featuring stories from the world of mathematics.

Terry Tao has written a probablistic heuristic justification of the abc conjecture.

This is old news, but Stanford has an archive of Knuth video lectures.

(Oh, hey, I got a job! I start tomorrow.)

Triangular numbers between square numbers

James Tanton asked on Twitter:

As there are “more” triang nmbrs than sq nmbrs http://www.jamestanton.com/?p=1009 let f(N) = nmbr triangs >= N^2 but < (N+1)^2. Curious:What graph like?

The $k$ th triangular number is about $k^2/2$ (more precisely, it’s $(k^2+k)/2$ .) So there are about $\sqrt{2} n$ triangular numbers less than $n^2$ . Therefore, “on average”, in each interval $[N^2, (N+1)^2)$ there are $\sqrt{2}$ triangular numbers.

For example, in the interval [9, 16) there are two triangular numbers, namely 10 and 15; this is f(3). In the interval [16, 25) there is one triangular number, namely 21; this is f(4).

Let’s write down an explicit formula for f(n). Let g(x) be the number of triangular numbers less than x. To figure this out, I’ll introduce a function t(x), which takes as input x and outputs the index of x in the triangular-number sequence. For example, t(10) = 4, t(15) = 5. But we also want to be able to compute, say, t(12). But that’s fine! t(n) is just the inverse of the function which takes n to the nth triangular number, the function $n \to (n^2+n)/2$ ; in particular, solving the quadratic,

$t(n) = {\sqrt{8n+1}-1 \over 2}.$

So $t(10) = (\sqrt{81}-1)/2 = (9-1)/2 = 4$ ; $t(12) = (\sqrt{97}-1)/2 \approx 4.42$ .

Next we write g(x) in terms of t(x). It’s tempting to say that g(x) = \lfloor t(x) \rfloor, but it’s not. t(10) = 4, for example, but we want g(10) = 3. We’ll say that $g(x) = \lfloor t(x-1/8) \rfloor$ — we’ll only need this formula to work when x is an integer. So, for example, $g(10) = \lfloor t(9.875) \rfloor$ , and the index of 9.875 in the triangular number sequence, whatever that means, is between 3 and 4. But $g(11) = \lfloor t(10.875) \rfloor = 4$ .

Why the constant 1/8? Because

$t(x-1/8) = {\sqrt{8x}-1 \over 2} = {\sqrt{2x}} - {1/2}$

which makes the formula marginally easier to write.

Finally $f(n) = g((n+1)^2) - g(n^2)$ . Take the number of triangular numbers less than $(n+1)^2$ , and subtract the number less than $n^2$ , and you get the number in the interval in between. For example $g(5^2) = \lfloor \sqrt{50} - 1/2 \rfloor = 6$ ; there are 6 triangular numbers less than 25, namely 1, 3, 6, 10, 15, and 21. And $g(4^2) = \lfloor \sqrt{32} - 1/2 \rfloor = 5$ . Thus $f(4) = g(5^2) - g(4^2) = 6-5 = 1$ , indicating the triangular number 21. So at long last we have the formula

$f(n) = \lfloor (n+1) \sqrt{2} - {1 \over 2} \rfloor - \lfloor n \sqrt{2} - {1 \over 2} \rfloor$ .

In particular the arguments of these two floor functions differ by $\sqrt{2}$ , which is between 1 or 2, so f(n) is always either 1 or 2. The graph that Tanton asked about is below.

You can see some hints of periodicity in the function; for example, from a quick glance at the graph it might look like $f(x)$ has period 12, each period containing five 2s and seven 1s. But this can’t hold, not unless $\sqrt{2} = 17/12$ . In fact $f(x)$ can’t be periodic, because $\sqrt{2}$ is irrational.

Weekly links for September 16

Is an auction the best way to solve the roommate/rent dilemma? At Freakonomics, referring to The rent is too damn fair! by Michael Jancsy et al. The title is a reference to “The Rent Is Too Damn High!”, political party and e-book by Matt Yglesias. (Conflict-of-interest disclosure: I know Jancsy, and I went to college with Yglesias’ wife.)

Larry Wasserman writes on Hunting for Manifolds. Given data that are close to some manifold, how do we estimate the underlying manifold?

RAND’s presidential election poll features some unorthodox methodology, including asking the same people repeatedly and asking them explicitly for the percentage chance that they’ll vote.

Brian Hayes explains the abc conjecture. (He’s done this before.)

Steven Strogatz has a new series of math blog posts at the New York Times.

John Allen Paulos on Letterman. (Presumably from 1988.)

Life expectancy doesn’t measure how long you’re expected to live.

Howard Wainer writes of the most dangerous equation: (ignorance of) what I call the “square root law”. (From Wainer’s website for his intro stat course which contains some other interesting links.)

Austin Mohr has created Spacebook, a searchable database of topological spaces inspired by Counterexamples in Topology.

Handouts from Geometry and the Imagination, a summer workshop by John Conway, Peter Doyle, Jane Gilman and Bill Thurston in 1991.

Animation of Bruce Springsteen’s diffusion. (For The Girlfriend and my father. The Girlfriend is from Arkansas, so this is a good excuse to point to the Walmart diffusion animation as well.)

An interesting visualization of prime factorization.

The best video I’ve ever seen on combinatorial explosion.