# Santana’s no-hitter redux

Slate’s sports podcast, “Hang Up and Listen”, talked about Johan Santana’s June 1 no-hitter in their most recent episode; I mentioned it back on June 2. Starting at 47:47, they talk briefly about this post by tangotiger, who argues that of the 27 outs in the game, all but six were “routine” outs; he figures that given the distribution of batted balls, Santana should have given up about two hits. If, like me, you didn’t see the game, you can see video of all 27 outs at mlb.com. (The “blown call” that’s been mentioned in a lot of places came in the top of the sixth, with Carlos Beltran batting. It was a line drive down the third base line that was ruled foul.)

But every no-hitter has some degree of luck. Consider the following model: the batter hits the ball. Depending on where he hits it, that sets the probability of heads of a certain (imaginary) coin, i.e. a Bernoulli random variable. Take that probability to be 0 for a strikeout, 1 for a home run, and somewhere in between for balls in play. (Of course you could go back a step and start with the pitcher pitching.) Then if that coin comes up heads, the ball is a hit, and if not it’s an out. For each inning, record the number of hits until getting three outs; nine innings make a ball game.
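Here is a minimal sketch of that coin-flipping model in Python. The function names and the `toy_hit_prob` distribution are my own inventions for illustration; the numbers in it are made up, not fitted to any real batted-ball data.

```python
import random

def simulate_team_game(draw_hit_prob, rng, innings=9):
    """Simulate one team's batting under the coin-flip model.

    draw_hit_prob is a hypothetical callable that returns the hit
    probability of the next batted ball (0 for a strikeout, 1 for a
    home run, something in between for a ball in play).
    Returns (expected_hits, actual_hits).
    """
    expected = 0.0  # running sum of the coins' heads probabilities
    actual = 0      # number of coins that actually came up heads
    for _ in range(innings):
        outs = 0
        while outs < 3:
            p = draw_hit_prob(rng)
            expected += p
            if rng.random() < p:  # heads: the ball is a hit
                actual += 1
            else:                 # tails: the ball is an out
                outs += 1
    return expected, actual

def toy_hit_prob(rng):
    # Assumed mix, purely illustrative: 20% strikeouts, 3% home runs,
    # the rest balls in play with a hit probability between 0.05 and 0.6.
    r = rng.random()
    if r < 0.20:
        return 0.0
    if r < 0.23:
        return 1.0
    return rng.uniform(0.05, 0.6)

if __name__ == "__main__":
    rng = random.Random(2012)
    expected, actual = simulate_team_game(toy_hit_prob, rng)
    print(f"expected hits: {expected:.2f}, actual hits: {actual}")
```

Swapping in a different `draw_hit_prob` is how you would “go back a step” and model the pitcher as well.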

Then for each team in every ball game you get two numbers: the sum of those probabilities of heads, which you could call the “expected” number of hits, and the actual number of hits. On average they’ll be the same. And of course they’re highly correlated. But conditional on the actual number of hits being 0, which is well below the average, the sum of the probabilities of heads (the “expected” number of hits) will always be greater than 0. (Unless we’re talking about a 27-strikeout game, which happened once in the minors in 1952 and never in the majors.) This is just regression to the mean.
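To see the regression to the mean in action, here is a rough simulation sketch. Everything about the hit-probability distribution is an assumption of mine (25% strikeouts, otherwise a ball in play whose hit probability is uniform on [0, 0.5]); it is chosen to make no-hitters common enough to show up in a short run, not to resemble real baseball.

```python
import random

def toy_hit_prob(rng):
    # Invented distribution, not fitted to real batted-ball data.
    if rng.random() < 0.25:
        return 0.0               # strikeout
    return rng.uniform(0.0, 0.5)  # ball in play

def simulate_team_game(rng, outs_needed=27):
    """Return (expected_hits, actual_hits) for one team in one game."""
    expected, actual, outs = 0.0, 0, 0
    while outs < outs_needed:
        p = toy_hit_prob(rng)
        expected += p
        if rng.random() < p:
            actual += 1
        else:
            outs += 1
    return expected, actual

def summarize(n_games=20000, seed=0):
    """Simulate many games; return overall means and the expected-hit
    totals of the games in which no hits actually occurred."""
    rng = random.Random(seed)
    games = [simulate_team_game(rng) for _ in range(n_games)]
    mean_expected = sum(e for e, _ in games) / n_games
    mean_actual = sum(a for _, a in games) / n_games
    no_hit_expected = [e for e, a in games if a == 0]
    return mean_expected, mean_actual, no_hit_expected

if __name__ == "__main__":
    mean_expected, mean_actual, no_hit_expected = summarize()
    print(f"mean expected hits: {mean_expected:.2f}")
    print(f"mean actual hits:   {mean_actual:.2f}")
    print(f"no-hitters: {len(no_hit_expected)}")
    if no_hit_expected:
        avg = sum(no_hit_expected) / len(no_hit_expected)
        print(f"mean expected hits in no-hitters: {avg:.2f}")
```

The two overall means should come out close to each other, while the average “expected” hit total among the simulated no-hitters sits well above zero. How far above depends entirely on the assumed distribution.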

With the right data set you could empirically determine the probability that any given batted ball goes for a hit, and for recent no-hitters (where that data is presumably available somewhere) compute how much luck the “average” no-hitter involves. I don’t have that data, though. But some pitchers of no-hitters benefited more from luck than others, and this wouldn’t be a horrible way to quantify that.

I’m looking for a job in the SF Bay Area. See my LinkedIn profile.