Pairs of cities with the same population

I was looking at Wikipedia's list of US cities by population yesterday, because I noticed that Sunnyvale, a suburb of San Jose that I had occasion to visit, had a surprisingly large population of 140,095. There are a lot of places like this in California: despite having about 12% of the US population, it has 64 of the 275 largest cities (all those with population above 100,000), or about 23%.

And among those 275 cities there are three pairs with the same population in the 2010 Census:

  • Fargo, North Dakota and Norwalk, California, both at 105,549
  • Arvada, Colorado and Ventura, California, both at 106,433
  • Aurora, Illinois and Oxnard, California, both at 197,899

Of course census data shouldn’t actually be taken to be exact. But how many pairs like this would we expect?

The starting point here is Zipf’s law for cities, or the rank-size rule. This rule states that the nth largest city in a region will have population 1/n times that of the largest city. As it turns out, this isn’t quite true for the structure of cities in the US, but they do roughly follow a power law. If we regress log(population) against log(rank), we get the regression line

\log(pop) = 15.6103 - 0.7287 \log(rank)

or, if we exponentiate both sides,

pop = 6018207 \times rank^{-.7287}
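The fit itself is just an ordinary least-squares line through the points (log rank, log population). Here is a minimal sketch of that step in Python. The helper name `fit_loglog` is made up for illustration, and the input is synthetic data generated from the fitted rule itself, not the actual census table (which isn't reproduced here), so the fit simply recovers the coefficients it was built from.

```python
import math

def fit_loglog(ranks, pops):
    """Least-squares fit of log(pop) = intercept + slope * log(rank)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(p) for p in pops]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Synthetic stand-in for the census table (an assumption, not real data):
# 275 populations that follow the fitted rule exactly.
ranks = list(range(1, 276))
pops = [math.exp(15.6103) * r ** -0.7287 for r in ranks]

intercept, slope = fit_loglog(ranks, pops)
a = math.exp(intercept)  # exponentiating turns the intercept into the prefactor
print(intercept, slope, a)
```

With real data the points scatter around the line, so the recovered slope and intercept would be estimates and not exact values.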

For example, we predict that the hundredth-largest city should have population 6018207 \times 100^{-0.7287} = 209926. The actual hundredth-largest city is Spokane, Washington, with population 208916. See below for a graph of city size vs. city rank:

Because I don’t want to rewrite these numbers over and over, I’m going to rewrite that as p = a r^{-b}, and plug in the numbers at the end. Now let’s invert this relationship. How many cities do we expect to have population greater than some constant p? That’s just the rank that corresponds to p. Solving for r gives r = (p/a)^{-1/b}. Let’s write this as r = f(p).
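As a quick sanity check on the inversion, here is a small sketch with the fitted values of a and b plugged in. The function names are mine, chosen for illustration.

```python
A = 6018207.0  # prefactor a from the fit
B = 0.7287     # exponent b from the fit

def population_at_rank(r):
    """p = a * r^(-b)"""
    return A * r ** -B

def rank_at_population(p):
    """f(p) = (p / a)^(-1/b): expected number of cities with population >= p."""
    return (p / A) ** (-1.0 / B)

p100 = population_at_rank(100)
print(p100)                          # predicted size of the 100th-largest city
print(rank_at_population(p100))      # inverting recovers rank 100, up to rounding
print(rank_at_population(100_000))   # predicted number of cities above 100,000
```

The last line comes out near 277, which is in line with the 275 cities above 100,000 mentioned at the start.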

The expected number of cities having population exactly p is then

-f^\prime(p) = a^{1/b} {1 \over b} p^{-(1+1/b)}

Taking the derivative here is actually the crux of the analysis, so I’ll elaborate a bit. The expected number of cities having population at least p is f(p); the expected number of cities having population at least p+1 is f(p+1). The expected number of cities having population exactly p, then, is f(p)-f(p+1) = -(f(p+1) - f(p)). But f(p) varies slowly so we can approximate f(p+1) - f(p) by f^\prime(p). Let g(p) = -f^\prime(p) for later ease of notation.
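To see how little is lost by swapping the difference for the derivative, here is a sketch that compares f(p) - f(p+1) with g(p) at one population. The choice p = 105,000 is arbitrary, and the code assumes the fitted a and b from above.

```python
A, B = 6018207.0, 0.7287  # a and b from the fit

def f(p):
    """Expected number of cities with population at least p."""
    return (p / A) ** (-1.0 / B)

def g(p):
    """g(p) = -f'(p) = a^(1/b) / b * p^(-(1 + 1/b))"""
    return A ** (1.0 / B) / B * p ** -(1.0 + 1.0 / B)

p = 105_000
difference = f(p) - f(p + 1)   # the quantity we actually want
derivative = g(p)              # the approximation used in the post
print(difference, derivative, abs(difference - derivative) / derivative)
```

The relative gap is tiny because f changes by only a small fraction of itself when p moves by 1 at populations this large.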

Roughly speaking, g(p) is the density of cities per unit population, at p. For example, if we let p = 105,000 we get that we expect 0.0034 cities of population 105,000. Extrapolating to the range from 100,000 to 110,000, we expect 10,000 times this many cities, or 34, in that population range; there are in fact 39.
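The two numbers in that paragraph can be reproduced with a few lines. This sketch assumes the fitted a and b, and it also computes the model's count for the range directly as f(100,000) - f(110,000), as a cross-check on the "density times width" shortcut.

```python
A, B = 6018207.0, 0.7287  # a and b from the fit

def f(p):
    """Expected number of cities with population at least p."""
    return (p / A) ** (-1.0 / B)

def g(p):
    """Density of cities per unit population at p."""
    return A ** (1.0 / B) / B * p ** -(1.0 + 1.0 / B)

density = g(105_000)
by_density = 10_000 * density            # midpoint density times width of the range
by_counts = f(100_000) - f(110_000)      # same range via the cumulative form
print(density, by_density, by_counts)
```

Both routes land at about 34, so the midpoint shortcut is doing no harm here.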

So now take this expected value, and figure that the actual number of cities of population p is a Poisson random variable with mean g(p). The probability that such a random variable is equal to 2 is e^{-g(p)} g(p)^2/2. Since g(p) is very close to 0, I’ll drop the exponential term in what follows. Furthermore for ease of calculation, let’s assume these Poissons are never greater than 2. For example, the probability that a Poisson with mean 0.0034 is at least 2 is exactly

1 - e^{-0.0034} (1 + 0.0034) \approx 5.767 \times 10^{-6}

and I use the approximation 0.0034^2/2 = 5.78 \times 10^{-6}. The number of pairs of cities with population greater than c and the same population is then predicted to be

\sum_{p \ge c} g(p)^2/2

but I’d rather do an integral instead of a sum, so we’ll approximate this as

\int_{c}^\infty g(p)^2/2 \: dp.
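The g(p)^2/2 in this integrand rests on the small-mean Poisson shortcut, so it's worth checking how good that shortcut is. A minimal sketch follows. The mean 0.0034 is the one used above; the other two means are arbitrary values I added to show how the gap grows.

```python
import math

def poisson_tail_at_least_2(mean):
    """Exact P(X >= 2) for a Poisson variable: 1 - e^(-mean) * (1 + mean)."""
    return 1.0 - math.exp(-mean) * (1.0 + mean)

def shortcut(mean):
    """The approximation used in the post: mean^2 / 2."""
    return mean ** 2 / 2.0

# 0.0034 is from the post; 0.01 and 0.1 are illustrative extras.
for mean in (0.0034, 0.01, 0.1):
    print(mean, poisson_tail_at_least_2(mean), shortcut(mean))
```

The shortcut always overshoots slightly, and the overshoot only matters once the mean is no longer small, which is not the case for city populations above 100,000.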

Recalling that g(p) = a^{1/b} {1 \over b} p^{-(1+1/b)}, we get

\int_c^\infty {a^{2/b} \over 2b^2} p^{-(2+2/b)} \: dp

and doing the integral gives

{a^{2/b} \over 2b^2} {b \over b+2} c^{-(1+2/b)}

Plugging in the values from above, c = 100000, a = 6018207, b = 0.7287, gives 0.1924. So the expected number of such coincidences is about one-fifth; in the 2010 census it was three.
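Here is a sketch that evaluates the closed form and, as a check on replacing the sum by an integral, also adds up g(p)^2/2 term by term. The upper cutoff in the sum is an arbitrary choice of mine; the terms fall off fast enough that what lies beyond it is negligible.

```python
import math

A, B = 6018207.0, 0.7287  # a and b from the fit

def expected_pairs(c):
    """Closed form: a^(2/b) / (2 b^2) * b / (b + 2) * c^(-(1 + 2/b))."""
    return A ** (2.0 / B) / (2.0 * B ** 2) * (B / (B + 2.0)) * c ** -(1.0 + 2.0 / B)

def expected_pairs_by_sum(c, cutoff=3_000_000):
    """Direct sum of g(p)^2 / 2 over c <= p < cutoff (cutoff is arbitrary)."""
    coef = A ** (2.0 / B) / (2.0 * B ** 2)
    exponent = -(2.0 + 2.0 / B)
    return math.fsum(coef * p ** exponent for p in range(c, cutoff))

closed = expected_pairs(100_000)
summed = expected_pairs_by_sum(100_000)
print(closed, summed)
```

The two agree to several decimal places, so the integral is a perfectly good stand-in for the sum.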

If you compare data from 2000, the first such coincidence is at rank 467: Royal Oak, MI and Bristol, CT both had population 60,062 that year. (Note: I scanned the data by eye, so it’s possible I missed something.) You expect to start seeing coincidences this far down; plugging in c = 60000 with the 2010 coefficients gives 1.3. (Properly speaking I should use the 2000 coefficients, but I’d have to compute them first.) So 2010 is probably unusual. Still, I can’t help but suspect that the Census might be fudging the data a little bit to make these cities tie so that the lower-ranked member of each couplet doesn’t complain…
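To put a rough number on "unusual", here is a sketch that reuses the closed form for c = 60,000 and then treats the total number of ties as a single Poisson variable with that mean. That last step is an extra assumption of mine (a sum of independent Poissons is Poisson), not something derived above, and `prob_at_least` is a made-up helper name.

```python
import math

A, B = 6018207.0, 0.7287  # a and b from the 2010 fit

def expected_pairs(c):
    """Expected number of same-population pairs among cities above c."""
    return A ** (2.0 / B) / (2.0 * B ** 2) * (B / (B + 2.0)) * c ** -(1.0 + 2.0 / B)

def prob_at_least(k, mean):
    """P(X >= k) for a Poisson variable X with the given mean."""
    below = sum(mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - math.exp(-mean) * below

print(expected_pairs(60_000))                      # expected ties above 60,000
print(prob_at_least(3, expected_pairs(100_000)))   # three or more ties above 100,000
print(prob_at_least(1, expected_pairs(60_000)))    # at least one tie above 60,000
```

Under that assumption, three or more ties above 100,000 has probability on the order of 0.1%, while at least one tie above 60,000 is more likely than not.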

I’m looking for a job in the SF Bay Area. See my LinkedIn profile.


7 thoughts on “Pairs of cities with the same population”

  1. On the subject of “surprisingly big cities,” it’s kind of a fun exercise to go down the list and find out what’s the biggest US city you’ve never heard of. Mine was in California. (Biggest US university you’ve never heard of is also good.)

  2. Mine was #87, in Texas, which is the state I expected it would be in before I started looking. (I suspect before I moved to California it would have been in California.) For universities, if you go by the Wikipedia list of universities I come up dry at #4, but I’m guessing that’s not the list you had in mind – do you know a better one?

