How many pieces can a puzzle have?

Patrick Honner tweeted a few days ago:

Of course you could easily make a 300-piece puzzle, for example 15 by 20. And you could make a 299-piece puzzle — that factors as 13 times 23. But a 301-piece puzzle would have to be 7 by 43, assuming that the pieces form a nice grid,and that doesn’t seem like a “reasonable” size for a puzzle.

So which numbers could actually be reasonable sizes for puzzles? This could be nice to know if, say, you want to know if you have all the edge pieces. (But a 500-piece puzzle could be 20 x 25, or some reasonable slightly larger numbers like 19 x 27 = 513, or 18 x 28 = 504.) It’s a nice puzzle nonetheless.

So let’s say that a number is a “puzzle number” if it’s of the form m \times n with m \le n \le 2m — that is, a puzzle has to have an aspect ratio between 1 (square) and 2. (The choice of 2 is arbitrary here, but any other constant would be more arbitrary.) We can easily work out the first few puzzle numbers:

2 = 1 \times 2, 4 = 2 \times 2, 6 = 2 \times 3, 8 = 2 \times 4, 9 = 3 \times 3
which is enough to find them in the OEIS: that’s A071562, defined as “Numbers n such that the sum of the middle divisors of n (A071090) is not zero.” (What’s a “middle divisor”, you ask? It’s a divisor of n that’s between \sqrt{n/2} and \sqrt{2n}.) Titling it that way seems a bit strange to me: I’d have called it “Numbers such that the number of middle divisors of n (A067742) is nonzero”.

The first 10,000 puzzle numbers have been calculated; the 10,000th is 35,956. There are 43 under 100, 336 under 1,000, and 2,932 under 10,000 — it doesn’t appear that they have contant density. It’s not hard to take this further — there are 26870 under 10^5, 252,756 under 10^6, 2409731 under 10^7, and 23169860 under 10^8. For example, in R, you can generate the puzzle numbers as follows. (This take a few seconds to run. Note that you don’t have to compute any prime factorizations.)

N = 10^8
= rep(0, N)

for (m in 1:floor(sqrt(N))){
max_n = min(2*m,floor(N/m))
a[m*(m:max_n)] = a[m*(m:max_n)]+1

puzzle = which(a >= 1)

I had at first thought this sequence had a natural density, because I was only trying numbers up to a few hundred in my head, and because the number of middle divisors appears to average somewhere around \log(2). There’s a nice heuristic for this – a middle divisor of n is somewhere between \sqrt{n/2} and \sqrt{2n}; the “probability” that a number is divisible by k is 1/k, so the “expected number” of middle divisors of n is

\sum_{k=\sqrt{n/2}}^{\sqrt{2n}} {1 \over k} \approx \int_{\sqrt{n/2}}^{\sqrt{2n}} {1 \over x} \: dx = \log \sqrt{2n} - \log \sqrt{n/2} = \log \sqrt{4} = \log 2.

But there must be more numbers as you go further out that have many middle divisors, and more zeroes. This is similar to the behavior you see for the problem of “how many ways can an integer be written as a sum of two squares”. In that case the “average” is \pi/4, but an integer greater than one can be written as a sum of two squares if and only if its prime decomposition contains no prime congruent to 3 mod 4 raised to an odd power; asymptotically the number of positive integers below x which are the sum of two squares goes like bx/\sqrt{\log x} for a constant b (the constant is the Landau-Ramanujan constant). That connection might stem from the fact that both sequences are multiplicative. For sums of two squares, that follows from the factorization-based characterization or from the fact that

(a^2 + b^2) (c^2 + d^2) = (ac-bd)^2 + (ad + bc)^2.

For puzzle numbers, it follows from a remark of Franklin T. Adams-Watters in the OEIS entry. So numbers which have lots of factors both tend to be sums of two squares and to be puzzle numbers in many different ways; as we get to bigger numbers those start “crowding out” the smaller numbers.

The math internet has noticed this at least once before. John D. Cook wrote in 2014: Jigsaw puzzles that say they have 1,000 pieces have approximately 1,000 pieces, but probably not exactly 1,000. Jigsaw puzzle pieces are typically arranged in a grid, so the number of pieces along a side has to be a divisor of the total number of pieces. This means there aren’t very many ways to make a puzzle with exactly 1,000 pieces, and most have awkward aspect ratios.

1000 as 25-by-40 seems reasonable, but 27 x 37 = 999 or 28 x 36 = 1008 would also work. I assume that you wouldn’t actually see a 999-piece puzzle because the lawyers would claim calling that 1000 was false advertising. A puzzle blog indicates that most “500-piece” puzzles are actually 27 x 19 = 513 and most “1000-piece” puzzles are 38 x 27 = 1026 – both of these aspect ratios are approximations of \sqrt{2}. That is a good aspect ratio to use if you want the ability to make both “500-piece” and “1000-piece” (or more generally, “N-piece” and “2N-piece”) versions of the same puzzle. Similarly the A0, A1, … series of paper sizes used in most of the world have that aspect ratio and therefore an An piece of paper can be cut into two A(n+1) pieces.

Will the rest of the hurricane season be worse?

Isaias will be the ninth tropical storm of the season, likely to form on July 30. This is unusually early. From CNN on July 28: “Once it is given the name Isaias — pronounced (ees-ah-EE-as) — it will be the earliest storm to begin with an “I” on record. The previous record was set on August 7, 2005, part of the busiest season to date.”  Storms are named in alphabetical order, so a storm starting with “I”  is the ninth storm of the season.

This did seem early to me. The recent example that I remember Irma, which made landfall on the mainland US on September 10, 2017. (I spent a day seeing lots of people with Florida license plates driving around Atlanta, thinking “at least I don’t live in Florida!”, and then the next day having my power out and being glad that the trees the storm took out in my backyard were ones that needed taking out anyway.)

What does this mean for the rest of the hurricane season? Should we expect a busy one?

There’s an R package, HURDAT, that compiles the NOAA hurricane database. For each year, we can count the number of storms. First let’s make a list of all the storms:

al <- get_hurdat(basin = 'AL') cutoff_month = 7 cutoff_day = 29 storms = al %>% group_by(Key) %>% filter(Status %in% c('TS', 'HU')) %>%
mutate(rk = rank(DateTime)) %>% filter(rk == 1) %>%
mutate(season_part = ifelse(month(DateTime) < cutoff_month |
month(DateTime) == cutoff_month & day(DateTime) <= cutoff_day, 'early', 'late')) %>%
select(Key, Name, DateTime)

and count the storms by time in the season:

storm_counts = storms %>% mutate(yr = year(DateTime)) %>%
group_by(yr, season_part) %>% count()

But how do we know about are storms? Not all storms were observed in the past, because some stay out to sea. The “satellite era” began in 1966, according to this article about this year’s Tropical Storm Gonzalo (the earliest “G” storm ever). So let’s plot the number of storms before and after July 29, separately for each of the two “eras”.

satellite_start = 1966
storm_counts %>% spread(season_part, n, 0) %>%
mutate(satellite = ifelse(yr >= satellite_start, 'satellite', 'pre-satellite')) %>%
ggplot() + facet_wrap(~satellite) + geom_jitter(aes(x=early, y=late)) +
stat_smooth(aes(x=early, y=late, group = satellite), method = 'lm') +
scale_x_continuous(paste0('storms before ', cutoff_month, '/', cutoff_day)) +
scale_y_continuous(paste0('storms after ', cutoff_month, '/', cutoff_day))


But 2005 is a big outlier! I remember that season. Of course it was the year of Katrina. And in December there was a hurricane named Epsilon, which I could not take seriously because epsilon is small. Indeed if we drop it the effect of early-season storms on late-season ones gets much smaller.


So there’s a small effect – but not as large as might be expected! So how many storms will we have this season? Calling predict(storm_model, data.frame(early = 9), interval = 'prediction', level = 0.95), gets a 95% prediction interval for the number of storms in the rest of the season: 12.5 plus or minus 7.2. So we could reasonably have as few as 5 more or as many as 20 more, for a total of 14 to 29.

A 50% prediction interval goes from 10 to 15 more storms in the remaining part of the season, for a total of 19 to 24. For comparison, the quartiles of the distribution of the number of storms after 7/29 in the satellite era are 7 and 12. So if we knew nothing about what’s happened so far, we’d give a 50% prediction interval of 16 to 21 storms for the number at the end of the season.

Let’s see what happens. And let’s hope these storms stay far out to sea, because pandemics and hurricanes don’t seem very compatible.

Double (or triple) and sort

Here’s a sequence I keep mulling over in my head:

1, 2, 4, 8, 16, 23, 46, 29, 58, 116, 223, 446, …

Does it grow without bound? First we should specify what it is. Each number is double the one before, but with the digits sorted into ascending order. The doubling makes the number larger… but the sorting makes it smaller, so perhaps this sequence tops out somewhere?

This sequence actually becomes periodic pretty quickly. The full sequence is:

1, 2, 4, 8, 16, 23, 46, 29, 58, 116, 223, 446, 289, 578, 1156, 1223, 2446, 2489, 4789, 5789, 11578, 12356, 12247, 24449, 48889, 77789, 155578, 111356, 122227, 244445, 48889, …
and then it repeats with period 6. The key is in the final step: once we get a number that ends with 5, there’s a 0 in its double so the next term is actually shorter. (It is surprisingly hard to keep the mental arithmetic straight to the point where I can do this in my head.) There are other such loops: if you start with 3 you eventually reach

69, 138, 267, 345, 69, …

(note that everything stays divisible by 3 here!). If you start with 7 you get to

167, 334, 668, 1336, 2267, 3445, 689, 1378, 2567, 1345, 269, 358, 167, …

and if you start with 9 you get to

9, 18, 36, 27, 45, 9, …

And there’s a series of these sequences. Every number in that period-6 loop that you reach if you start with 1 has a repeated digit, and you can have less or more of those:

49, 89, 178, 356, 127, 245, 49, …
489, 789, 1578, 1356, 1227, 2445, 489, …
4889, 7789, 15578, 11356, 12227, 24445, 4889, …

The structure is quite a bit richer than the triple-and-sort sequence, presumably because divisibility by three “plays well” with rearranging digits. There, as far as I can tell, no matter what number you start with you end up with 45. (Tripling that give 135, tripling again gives 405, which sorts back to 45.) The reason for this seems to be that any repeated digits, when tripled, give you repeated 3s or 6s, and tripling those gives repeated 0s. For example, consider the start number 9261632697:

  • Tripling gives 27784898091 – which incidentally has a zero – and sorting that gives 1247788899.
  • Tripling again gives 3743366697 – the repeated 7s and 8s “turn into” repeated 3s and 6s — and sorting gives 3334666779.
  • At this point everything collapses – tripling gives 10004000337, and sorting gives 13347.
  • Another collapsing step: 13347 × 3 = 40041, which sorts to 144.
  • The remainder of the sequence is uninteresting:  234, 27, 18, 45, 135, 45, 135, …

This is the “typical” behavior – but it’s a bit surprising to me that it always works. I suspect a proof would be in terms of vectors that show the number of each digit, rather than with the raw numbers themselves.

How full is the pool?

Riddler Express for July 3, 2020:

Suppose a queue of swimmers arrives at [a] pool when it opens at 9 a.m. One at a time, each person randomly picks a lane from among the lanes that are available (i.e., the lane has no swimmer already and is not adjacent to any lanes with swimmers), until no more lanes are available.

At this point, what is the expected number of swimmers in the pool?

For five lanes the pool will be full when lanes 1, 3, and 5; 1 and 4; or 2 and 5 are occupied.

If lane 2 is first to be filled then we will certainly end up with just two occupied lanes (2 and 4); similarly with 4.

If lane 3 is first we certainly end up with three occupied lanes, since 1 and 5 remain and they don’t interfere with each other.

If the first swimmer uses lane 1 we actually have to think. The remaining lanes are 3, 4, and 5. If the second swimmer fills one of the “new” edge lanes (3 or 5) first then we’ll end up with three full lanes; if the second swimmer fills up 4 we end up with two full lanes. So if the first swimmer fills lane 1 we have an expectation of 2 2/3 lanes full, and similarly for lane 5.

Putting it all together the expected number of full lanes is (2/5) (2) + (1/5) (3) + (2/5) (8/3) = 37/15.

This is sort of a discrete version of “Renyi’s parking problem”: if you have a street of length x and cars of length 1 park at random, how full is the street expected to be when nobody else can park?

In the n-lane case there’s a recursive solution. Let f(n) be the expected number of full lanes in an n-lane pool. Clearly f(1) = 1, f(2) = 1; we already worked out f(5) = 8/3.

To find f(n), consider where the first swimmer goes.

  • If they use lane 1 or lane n then there is one remaining sub-pool of n-2 lanes.
  • If they use lane 2 or lane n-1 then there is one remaining sub-pool of n-3 lanes.
  • If they use lane k with 3 \le k \le n-2 then there are two remaining sub-pools of k-2 and n-k-1 lanes.

So this gives us, if n \ge 4,
f(n) = 1 + {2 \over n} f(n-2) + {2 \over n} f(n-3) + \sum_{k=3}^{n-2} {1 \over n} (f(k-2) + f(n-k-1)) $
with the base cases f(1) = 1, f(2) = 1, f(3) = 5/3.

If we compute this for large n we get f(n) \sim 0.4323n, which agrees with the Monte Carlo simulations in Georgiou, Kranakis, and Krizanc 2009. (Simulation isn’t necessary here but it’s necessary if you consider an analogous problem on a grid.) The constant 0.4323 is (1-e^{-2})/2. Kranakis and Krizanc have some online notes on the urinal problem. as well, and the problem goes back at least to a problem posed in the SIAM Review in 1962 by Freedman and Shepp.

(2020-07-07: fixed numerical expression of constant 0.4323)

The smallest non-obvious composite

It’s been a while, I know.

My partner sent me a link to this tweet yesterday:

Testing divisibility by 17 is hard!  There’s a test: 10A + B is divisible by 17 if and only if A – 5B is.  (Proof: 10A + B is divisible by 17 if and only if 12(10A + B) is, since 12 and 17 are relatively prime; then observe 12(10A + B) – 17(7A + B) = A – 5B.). But nobody has internalized this test.

But 51 is also divisible by 3, and that’s easy to see if you know the usual test: a number is divisible by 3 if and only if the sum of its digits is.  5 + 1 = 6.

So assume you know this test. Also assume you know how to test for numbers that are divisible by 2 or 5 (you can look at the last digit), 11 (for two-digit numbers you can just check if the two digits are the same), and you know the squares.  Then the smallest number that looks prime but isn’t is the product of the first two distinct primes I haven’t mentioned, 7 × 13 = 91.

Back in 2016 Christian Lawson-Perfect collected data on this from a game he made that asked people which numbers were prime; 51 is the number people got wrong most often.  But 91 is the composite most often identified as prime that’s not a multiple of three.

57 also looks prime.

A tale of zloty and forint

I had a wonderful vacation in Vienna and Budapest in May.  (Vienna is entirely irrelevant to this story.)

My wife every so often talks about taking a trip to Poland for no particular reason. (Readers: why do I want to visit Poland?)

So I started looking through Wikitravel to answer the above question, and every so often it mentions a price. I need to get the exchange rate to put it in context. So I type “usd to pln” into Google:

Screen Shot 2017-08-04 at 7.28.59 PM

In the chart, it looks like the zloty has been appreciating against the dollar. Is that because the dollar has been going down or because the zloty has been going up? Well, let’s see if the forint has made the same move. Type “usd to huf” into Google:

Screen Shot 2017-08-04 at 7.30.16 PM

It has! Hmm, I guess the dollar isn’t doing so well in the past few months. You might wonder why I checked the exchange rate for Hungarian forints and not for euros. The reason is because I know the forint was trading at around 275-280 to the dollar in May, whereas I can’t remember what the euro was. It looks like it was around $1.15 US but I used “vacation math” and just treated prices in euros as if they were in US dollars.

And then because I like typing random things into Google, try “pln to huf”:

Screen Shot 2017-08-04 at 7.33.29 PM

Is there some currency arbitrage opportunity? Of course not, the math works out perfectly:

Screen Shot 2017-08-04 at 7.34.35 PM

Can you even directly trade zloty for forint, or would you end up going via some major currency anyway?

Also, somewhere in Budapest you can see this:


and here’s a closeup of the text at the bottom:


A transcription of the text (sorry if I have introduced any errors):

A Rubik-kocka bármely keverés utan legfeljebb 20 tekerésből kirakható, de az osszetettsége miatt osszesen 43252003274489856000-féle eltérő állás hozható létre. “Mindig van megoldas és nem is csak egy!”(Rubik Ernő)

I do not read Hungarian (nor any language related to it, because there are very few such languages) but I know what this says. I suspect some of you do too.

The riddle of the number chain

This week’s “Riddler Classic” puzzle, from FiveThirtyEight:

From Itay Bavly, a chain-link number problem:

You start with the integers from one to 100, inclusive, and you want to organize them into a chain. The only rules for building this chain are that you can only use each number once and that each number must be adjacent in the chain to one of its factors or multiples. For example, you might build the chain:

4, 12, 24, 6, 60, 30, 10, 100, 25, 5, 1, 97

You have no numbers left to place after 97, leaving you with a finished chain of length 12.

What is the longest chain you can build?

It turns out this question is literally as old as I am. Carl Pomerance wrote a paper On the longest simple path in the divisor graph, C. Pomerance, Proc. Southeastern Conf. Combinatorics, Graph Theory, and Computing, Boca Raton, Florida, 1983, Cong. Num. 40 (1983), 291–304. The result of this paper is that if f(n) is the length of the longest chain in \{1, 2, \ldots, n\}, then f(n) = o(n). That is, as n gets large, we can’t “use up” a constant fraction of the integers. This paper was dedicated to Erdos on his seventieth birthday.

But the paper is not constructive. The natural thing to do is to think of this as a graph-theory problem. Construct a graph with vertices 1, 2, \ldots, n and draw an edge between latex a$ to b if a | b. But finding the longest path in a graph is NP-hard.

For example, with n = 10 this graph appears as follows:


and you can see that 7 is going to cause something of a problem. If we just avoid it we can find the path

9, 3, 6, 1, 4, 8, 2, 10, 5

so f(n) \ge 9. You can also see something of the structure of this problem – numbers which share prime factors cluster together in the graph, and we end up putting numbers with the same prime factors together in the sequence. Here all the multiples of three are at the beginning, all the multiples of two in the middle, and all the multiples of five at the end.

In fact f(n) = 9, which we can figure out by arguing around 7. We can’t have a path that has 7 in the middle because 7 has degree 1 in this graph, so if 7 occurs at all it must occur at the start or the end. Without loss of generality let’s say we have a path that starts with 7, of length 10 We must have 7, 1, and then a path in this graph that goes through all eight vertices:


but such a path clearly doesn’t exist – if it existed it would have to go between the two vertices of degree 1, which are 9 and 5, but one can’t navigate properly around 2.

So what to do? I know nothing better than trial and error. We can quickly build the following path of length 30:

64, 32, 96, 48, 16, 8, 24, 72, 36, 12, 4, 2, 6, 18, 54, 27, 9, 81, 3, 1, 5, 15, 45, 90, 30, 10, 60, 20, 40, 80

which basically zig-zags through the 3-smooth numbers (that is, numbers with all factors 2 and 3) and then the numbers you get if you multiply those by 5. Then we want to insert little excursions out along different primes. For example between 3 and 1 we can insert

3, 51, 17, 68, 34, 1

which is essentially the path 3, 1, 4, 2 multiplied by 17. Doing a few of these insertions gives a length-49 sequence

64, 32, 96, 48, 16, 8, 24, 72, 36, 12,
4, 84, 28, 7, 14,
2, 88, 44, 22, 11, 99, 33, 66,
6, 18, 54, 27, 9, 81,
3, 51, 17, 68, 34,
1, 38, 76, 19, 95,
5, 15, 45, 90, 30, 10, 60, 20, 40, 80

where I’ve bolded the newly inserted bits, which go “out along” the primes 7, 11, 17, and 19 respectively. (Don’t ask me what happened to 13.) We can keep hacking away at this, for example sticking some multiples of 25 in between 5 and 15 to get

64, 32, 96, 48, 16, 8, 24, 72, 36, 12,
4, 84, 28, 7, 14,
2, 88, 44, 22, 11, 99, 33, 66,
6, 18, 54, 27, 9, 81,
3, 51, 17, 68, 34,
1, 38, 76, 19, 95,
5, 100, 50, 25, 75, 15, 45, 90, 30, 10, 60, 20, 40, 80

Since there are no 13s in here, let’s put them in. One way to do this is in the second line, which we can replace with the second and third lines here:

64, 32, 96, 48, 16, 8, 24, 72, 36, 12,
4, 56, 14, 42, 21, 84, 28,
7, 91, 13, 39, 78, 26, 52
2, 88, 44, 22, 11, 99, 33, 66,
6, 18, 54, 27, 9, 81,
3, 51, 17, 68, 34,
1, 38, 76, 19, 95,
5, 100, 50, 25, 75, 15, 45, 90, 30, 10, 60, 20, 40, 80

and this has length 62. Things get trickier now – the idea becomes to look for numbers that aren’t in the sequence but have all their factors small. 63 isn’t in the sequence and it feels insertable, and if we replace 27, 9, 81, 3 with 27, 81, 9, 63, 3 we can do it. A few more little moves like this get us up to length 67:

64, 32, 96, 48, 16, 8, 24, 72, 36, 12,
4, 84, 28, 56, 14, 98, 49,
7, 91, 13, 39, 78, 26, 52,
2, 88, 44, 22, 11, 99, 33, 66,
6, 18, 54, 27, 81,
9, 63, 21, 42,
3, 51, 17, 68, 34,
1, 38, 76, 19, 95,
5, 80, 40, 20, 100, 50, 25, 75,
15, 45, 90, 30, 60, 10, 70, 35

which is the solution I submitted.

The “missing” numbers now are

23 29 31 37 41 43 46 47 53 55 57 58 59 61 62 65 67 69 71 73 74 77 79 82 83 85 86 87 89 92 93 94 97

and you can see what the basic problem ends up being.  All of these have at least one prime factor that’s greater than or equal to 11. Can we squeeze any of these in? In particular 55 and 77 seem like they could be inserted along this basic route, although I can’t figure out how.

Picturing math

The Metropolitan Museum of Art (New York) has an exhibit Picturing Math: Selections from the Department of Drawings and Prints, running from January 31 to May 8.  I was in New York mid-January, and I may have reason to be there again in June or July, so this seems like they’re setting their schedules to toy with me.  Fortunately, you can “view exhibition objects” at their web site or you can look at this article about the exhibit by Allison Meier for Hyperallergic.

Do people on busier roads get more yard signs?

I live in Georgia’s sixth congressional district.  As you may have heard, we’re having a special election to replace Tom Price, the new secretary of health and human services.  Price was a popular incumbent, but Trump barely scraped out a win in this district, so there’s a lot of national attention on this race.  And even if there weren’t, Georgia has a system for special elections where everyone will run in a primary (on April 18), and if nobody gets 50% of the vote (which there won’t), then there will be a runoff between the top two candidates on June 20.

And I do mean everyone.  There are 11 declared Republican candidates and 5 declared Democrats.    What little polling there’s been indicates that we should end up with a standard partisan election in June, with Karen Handel (former Secretary of State of Georgia and former candidate for Governor and Senator) for the Republicans, and Jon Ossoff, former congressional aide, current maker of investigative films, and person who is younger than me and therefore unelectable due to things he posted on Facebook, for the Democrats.

But this post is not about politics.  It’s about yard signs.  If I’m not mistaken – and I very well may be mistaken – I’m seeing more yard signs than I did for the presidential election.  Now, I live in an American suburb, which means that there are houses on main roads that people pass by,  and there are houses in subdivisions that nobody drives by unless they live in the subdivision. And I’m noticing that yard signs seem to be more prevalent on the main roads.

Is this true? It’s hard to count and drive at the same time – it’s easy enough to count the signs, but counting the houses is tricky, and without the denominator this would be worthless – and I wouldn’t have been able to get a large enough sample anyway.  But it seems to make sense.  I live on one of those roads that nobody drives down unless they live on it.  If I had a sign, who would see it?   I’m not much for putting up signs anyway, but if I were, I’d be more likely to if I lived on a road where the sign would have more effect. As it turns out, I’m not the only one. The paper Understanding Visible Political Participation: An Analysis of Yard Sign-Displays during the 2008 Presidential Election came up with a model to predict which properties display yard signs, and traffic volume (subjectively coded into six categories from “dead-end” to “major artery” does turn out to be significant.  I found out about this from a Slate article by Sasha Issenberg which rounds up a lot of the research on signs.

By the way, this is one of those things that’s very hard to Google, because the advice on  how to put up signs swamps the research on signs.  But the conventional wisdom seems to agree that signs on busier roads are more valuable.  One sign vendor writes (emphasis added):

  1. The most effective way is to combine your dooknocking [sic] with your yard sign recruitment. Target the busiest streets for door to door work first, and for those people who are friendly, ask them if it would be okay to put a sign in the yard.

Perhaps this is why nobody bothers to canvass my neighborhood and offer me a sign.

A sequence of Fibonacci facts

The Hampshire College Summer Studies in Mathematics program has what it calls an “Interesting Test” that interested students are invited to take as part of their application. Their web page offers a sample of past test problems. One of these, under the heading “Problems which at first seem to lack sufficient information”, is as follows:

From the third term on, each term of the sequence of real numbers a_1, a_2, \ldots, a_{10} is the sum of the preceding two terms; that is, a_n = a_{n-1} + a_{n-2} for n = 3, 4, 5, \ldots, 10. If a_7 = 17, what is the sum of all the ten terms?

I’d heard this one before (though it’s hard to find a source for things like this – the cloesst I could find was this Math StackExchange question. The solution is as follows: we can write a_3, a_4, \ldots, a_{10} as linear combinations of a_1, a_2. In particular we have a_3 = a_1 + a_2, a_4 = a_1 + 2a_2, \ldots, a_{10} = 21a_1 + 34 a_2, where the general term is a_n = F_{n-2} a_1 + F_{n-1} a_n (we’ll use this later). Here F_n is the nth Fibonacci number, where F_1 = F_2 = 1, F_n = F_{n-1} + F_{n-2}. Adding everything up we get

a_1 + \ldots + a_{10} = \left( 1 + 0 + 1 + 1 + 2 + 3 + 5 + 8 + 13 + 21\right) a_1 + \left( 0 + 1 + 1 + 2 + 3 + 5 + 8 + 13 + 21 + 34 \right) a_2

and doing those sums gives a_1 + \ldots + a_{10} = 55 a_1 + 88 a_2. But a_7 = 5 a_1 + 8 a_2, and so the sum is just 11a_7 = 11 \cdot 17 = 187. For example, consider the sequence with a_1 = 5, a_2 = -1:

5, -1, 4, 3, 7, 10, 17, 27, 44, 71

which you can easily verify sums to 187. In the case where you actually have the whole sequence, this is a reasonably well-known “arithmetic trick” that one can show grade school children! The HCSSiM version, of course, doesn’t let you see the whole sequence.

But are there similar tricks to this with shorter or longer sequences? That’s usually the way with the Fibonacci numbers; facts about them tend to be parametrizable. A bit of experimentation shows us that with six terms we have

a_1 + \ldots + a_6 = 8 a_1 + 12 a_2 = 4 (2a_1 + 3a_2) = 4 a_5.

See for example the series 4, 1, 5, 9, 14, 23; its sum, 56, is four times its fifth term. And with fourteen terms we have

a_1 + a_2 + \ldots + a_{14} = 377 a_1 + 609 a_2 = 29 (13a_1 + 21a_2) = 29 a_9

(I’ll leave you to come up with your own example.) It would appear that we have a_1 + \ldots + a_{4k+2} = L_{2k+1} a_{2k+3}, where the L_j are the Lucas numbers L_1 = 1, L_2 = 3, L_n = L_{n-1} + L_{n-2}. Consider the sum of the first 4k+2 terms satisfying a Fibonacci recurrence:

a_1 + a_2 + \ldots + a_{4k+2} = F_{4k+2} a_1 + (F_{4k+3}-1) a_2.

Now we have F_{4k+2} = F_{2k+1} L_{2k+1} and F_{4k+3}-1 = F_{2k+2} L_{2k+1}. These are both special cases of the identity givnen in Wikipedia, F_{n+k} + (-1)^k F_{n-k}= L_k F_n. So we have

a_1 + a_2 + \ldots + a_{4k+2} = F_{2k+1} L_{2k+1} a_1 + F_{2k+2} L_{2k+1} a_2 = L_{2k+1} (F_{2k+1} a_1 + F_{2k+2} a_2) = L_{2k+1} a_{2k+3}.

So the sum of a fourteen-term Fibonacci sequence is equal to a multiple of the ninth term; the sum of a ten-term Fibonacci sequence, a multiple of the seventh term; the sum of a six-term Fibonacci sequence, a multiple of the fifth term. What of a sum of a two-term Fibonacci sequence? We get a_1 + a_2 = L_1 a_3. Recall L_1 = 1 — the sum of the first two terms is, trivially, a multiple of the third term.

(I don’t claim this is original. Only that I had some fun thinking about it. That probably means I need to get a life.)