Explaining banding in a scatterplot of Goldbach’s function

David Radcliffe asks for an explanation of the “bands” in the scatterplot of the number of solutions to p + q = 2n in primes. To give an example, we have

2 × 14 = 28 = 23 + 5 = 17 + 11 = 11 + 17 = 5 + 23
2 × 15 = 30 = 23 + 7 = 19 + 11 = 17 + 13 = 13 + 17 = 11 + 19 = 7 + 23
2 × 16 = 32 = 29 + 3 = 19 + 13 = 13 + 19 = 3 + 29

and so, denoting this function by f, the elements of this sequence corresponding to n = 14, 15, 16 are f(14) = 4, f(15) = 6, and f(16) = 4 respectively. (Note that the counts here are of ordered sums, with 23 + 5 and 5 + 23 both counting; if you use unordered sums everything works out pretty much the same way, since every sum except those like p + p appears twice and I’m going to talk about ratios and inequalities and the like.)

My re-rendering of something similar to the original scatterplot is here:


and there are heuristic arguments that f(n) \approx n/(log(n)^2)), so let’s divide by that to get a plot of the “normalized” number of solutions:


There are definitely bands in these plots. Indeed the situation for n = 14, 15, 16 is typical: f(n) “tends to be” larger when n is divisible by 3 than when it isn’t. A handwaving justification for this is as follows: consider primes modulo 3. All primes (with the trivial exceptions of 2 and 3) are congruent to 1 or 5 modulo 6, and by the prime number theorem for arithmetic progressions these are equally likely. (For some data on this, see Granville and Martin on prime races, which is a nice expository paper.) So if we add two primes p and q together, there are four equally likely cases:

  • p is of form 3n+1, q is of form 3n+1, p+q is of form 3n+2
  • p is of form 3n+1, q is of form 3n+2, p+q is of form 3n
  • p is of form 3n22, q is of form 3n+1, p+q is of form 3n
  • p is of form 3n+5, q is of form 3n+2, p+q is of form 3n+1

So if we just add primes together, we get multiples of three fully half the time, and the remaining half of the results are evenly split between integers of forms 3n+2 and 3n+1.

We can make the bands “go away” by plotting, instead of f(n), the function which is f(n)/2 when n is divisible by 3 and f(n) otherwise. Call this f_3(n). But there’s still some banding:


Naturally we look to the next prime, 5. A given prime is equally likely to be of the form 5n+1, 5n+2, 5n+3, or 5n+4; if we work through the combinations we can see that there are 4 ways to pair these up to get a multiple of 5, and 3 ways to get each of the forms 5n+1, 5n+2, 5n+3, 5n+4. So it seems natural to penalize multiples of 5 by multiplying their $f(n)$ by 3/4; the banding then is even less strong, as you can see below.


The natural thing to do here is to just iterate over primes. For the prime p we get that there are p-1 ways to pair up residue classes 1, 2, \ldots, (n-1) \pmod p to get the residue class 0 (i. e. multiples of p) and p-2 ways to get each of the classes 1, 2, \ldots, n-1. That is, multiples of 2p are more likely than nonmultiples to be sums of randomly chosen primes, by a factor of (p-1)/(p-2). Correcting for this, let’s plot x against

f^*(n) = f(n) \times \left( \prod_{p|n} {p-2 \over p-1} \right);

in this case you get the plot below. The lack of banding in this plot is basically the extended Goldbach conjecture.


Although I didn’t know this when I started writing, apparently this is known as Goldbach’s comet: see e. g. Richard Tobin or Ben Vitale or this MathOverflow post.

And although this is a number-theoretic problem, much of this is an exercise in statistical model fitting; I proceeded by making a plot, checking out the residuals compared to some model to see if there was a pattern, and fitting a new model which accounted for those residuals. However, in this case there was a strong theory backing me up, so this is, thankfully, not a pure data mining exercise.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s