Tony Wood, who writes about weather for my hometown paper, the Philadelphia Inquirer, observes that at the current rate, “2012 precipitation in Philadelphia would finish at 26.87, which would make it the driest year on record”. My native Philadelphia has had very low rain so far this year, only 14.06 inches through yesterday; this is the fourth-lowest amount of rain to have occurred through July 9, behind only 1992, 1995, and 1963. (Wood gives 1922 instead of 1992.)

(Wood’s extrapolation is essentially right, although the calculation is a bit off because 2012 is a leap year.) Not only would 26.87 inches be an all-time minimum, but it’s far below the old minimum of 29.48 inches in 1922. (1922 actually had 20.81 inches of rain by July 9, which is about average, but it had the second-driest second half in the 127 years of records I have.) (Disclaimer: I’m calculating this by adding up the Franklin Institute’s daily data, and it may differ from what you see tabulated elsewhere.)
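To make the leap-year quibble concrete, here is a minimal sketch of the extrapolation. It assumes Wood simply scaled the year-to-date total by 365 over the number of days elapsed; that is my guess at his method, and the function name `extrapolate` is made up for illustration.

```python
from datetime import date

RAIN_TO_DATE = 14.06  # inches through July 9, 2012 (figure quoted in the post)

# Day of the year for July 9, 2012; 2012 is a leap year, so February has 29 days.
days_elapsed = date(2012, 7, 9).timetuple().tm_yday


def extrapolate(rain, days_elapsed, days_in_year):
    """Scale a year-to-date total up to a full year at the same daily pace."""
    return rain * days_in_year / days_elapsed


# Assumed reconstruction of the quoted figure: a 365-day year.
common_year_estimate = extrapolate(RAIN_TO_DATE, days_elapsed, 365)
# Same pace, but counting all 366 days of 2012.
leap_year_estimate = extrapolate(RAIN_TO_DATE, days_elapsed, 366)

print(days_elapsed)                   # 191
print(f"{common_year_estimate:.2f}")  # 26.87
print(f"{leap_year_estimate:.2f}")    # 26.94
```

Either way the extrapolated total is well under the 29.48-inch record, so the leap-year correction doesn’t change the conclusion.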

So if Philly is only at fourth-lowest rain year-to-date right now, why would keeping up at the same pace lead to the all-time lowest amount of rain?

First, rain is seasonal. It turns out that this actually isn’t a problem, though, in Philadelphia’s climate; between 1873 and 1999, in an average year 51.7 percent of the rain fell in the 191 days, or 52.3 percent of the year, up to July 9. (That’s in common years; it’s a bit different in leap years.)

More importantly, though, there’s regression to the mean. Suppose that a rainier first half of the year really did mean we should expect a rainier second half as well. Even then, the rainiest first half is likely not to come in the same year as the rainiest second half, and the driest first half is likely not to come in the same year as the driest second half, since the correlation is imperfect.
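A quick simulation can illustrate the point. This is only a sketch on synthetic data: it assumes the two half-year totals are standardized, jointly normal values with a chosen correlation `rho`, which is not a claim about how Philadelphia rain is actually distributed. The function name and parameters are invented for the example.

```python
import random


def share_of_coinciding_extremes(rho, n_years=127, n_trials=1000, seed=0):
    """Fraction of simulated records in which the driest first half and the
    driest second half fall in the same year, for a given correlation rho."""
    rng = random.Random(seed)
    noise_scale = (1 - rho ** 2) ** 0.5
    hits = 0
    for _ in range(n_trials):
        # Synthetic standardized first-half totals (assumption: normal).
        first = [rng.gauss(0, 1) for _ in range(n_years)]
        # Second half = rho * first + independent noise, so the population
        # correlation between the two halves is rho.
        second = [rho * x + noise_scale * rng.gauss(0, 1) for x in first]
        driest_first = min(range(n_years), key=first.__getitem__)
        driest_second = min(range(n_years), key=second.__getitem__)
        hits += driest_first == driest_second
    return hits / n_trials


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9, 1.0):
        print(rho, share_of_coinciding_extremes(rho))
```

Only at `rho = 1.0` do the two extremes always land in the same year; for anything less than perfect correlation, the year with the driest first half usually does not also have the driest second half.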

Actually, the correlation is very imperfect. The coefficient of correlation between the amount of rain in the first half of the year and in the second half of the year is about -0.05. That’s right, it’s negative! (But it’s not significantly different from zero.) The amount of rain in the first half of the year tells us basically nothing about the second half. The regression line for predicting amount of rain in the second half of the year from the first half is

(second half rain) = (21.29 inches) – 0.05779 (first half rain)

but the slope of the line has standard error 0.1066. See the plot below:
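To see why that slope is not significantly different from zero, here is a small sketch that compares it with its standard error. The two-standard-error interval is a rough rule of thumb, not an exact 95 percent confidence interval, and the variable names are mine.

```python
SLOPE = -0.05779   # fitted slope from the regression above
STD_ERR = 0.1066   # its standard error

# How many standard errors the slope is from zero.
t_ratio = SLOPE / STD_ERR

# Rough interval of plus or minus two standard errors around the estimate.
low, high = SLOPE - 2 * STD_ERR, SLOPE + 2 * STD_ERR

print(f"t ratio: {t_ratio:.2f}")
print(f"rough interval: ({low:.3f}, {high:.3f})")
print("interval contains zero:", low < 0 < high)
```

The slope is only about half a standard error below zero, and the rough interval comfortably straddles zero.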

We should expect this year to be drier than average in Philadelphia, overall, but only because the first half was so dry. The regression line for predicting total year-end rain from first-half rain is

(year rain) = (21.29 inches) + 0.9422 (first half rain)

which you could have guessed; just add first half rain to the first equation. A scatterplot is below:

For this year, the first-half rain is 14.06 inches; the predicted second-half rain is 20.47 inches, for an overall total of 34.54 inches. This is drier than all but 21 years in the 127-year sample, or about one out of six. 2012 as a whole will likely be dry, but not historically dry.
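The arithmetic can be checked with a few lines. This sketch uses the rounded coefficients printed above, so the last digit may differ slightly from the figures in the text, which presumably came from unrounded coefficients. The function names are made up for illustration.

```python
INTERCEPT = 21.29        # inches, from the second-half regression
SLOPE_SECOND = -0.05779  # slope of second-half rain on first-half rain


def predict_second_half(first_half_rain):
    """Predicted second-half rain (inches) from first-half rain."""
    return INTERCEPT + SLOPE_SECOND * first_half_rain


def predict_year(first_half_rain):
    """Predicted yearly total: observed first half plus predicted second half."""
    return first_half_rain + predict_second_half(first_half_rain)


first_half = 14.06  # inches so far in 2012
second_half = predict_second_half(first_half)
total = predict_year(first_half)

# Adding first-half rain to the first equation gives a slope of 1 + SLOPE_SECOND.
slope_year = 1 + SLOPE_SECOND

print(f"predicted second half: {second_half:.2f} inches")
print(f"predicted year total:  {total:.2f} inches")
print(f"slope of year-total line: {slope_year:.4f}")
print(f"share of years that were drier: {21 / 127:.3f}")
```

The slope of the year-total line comes out to 0.9422, matching the second regression equation, and 21 out of 127 is about 0.165, or roughly one in six.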


And it’s worth noting that, according to your scatterplot, this is definitely not the driest ever first-half-of-the-year in Philadelphia; I see at least three dots that look lower.

In fact this has been the fourth-driest first half of the year in Philadelphia, as the data show. So there are exactly three dots that look lower. (My data set only covers 1873 to 1999, so I could be off; the point I’m trying to make, that first-half and second-half rain are pretty close to uncorrelated, shouldn’t be changed much by the missing 2000 to 2011 data, and I didn’t want to go to the trouble of getting it.)