Matt Yglesias writes on what he calls the hazy metaphysics of probability as it applies to election forecasting. For example, as of right now, Nate Silver’s model gives the Democrats (plus independents) a 42.2% chance of holding the Senate majority, while Sam Wang’s model puts that same number at 39%. As Yglesias points out, we’ll just never get enough data points to know which of these models is closer to the truth. (By the time we do, the political system will have changed underneath us.)
Yglesias’ Vox has a roundup of various forecasts as well. (Silver is the fox logo; Wang is the orange-and-black Princeton shield.) One thing that jumps out, for me, is that the Washington Post tends to give much more extreme win probabilities – that is, closer to 0 or 100 percent – than the other models. I suspect that the models with more extreme win probabilities generally have the same point estimate as the others – in all these models, the point estimate is basically the average of poll results, although of course you can quibble about which polls to average and how to weight them and so on. What differs is the assumed uncertainty: given the same point estimate, a model that assumes less movement between now and Election Day will report probabilities closer to 0 or 100 percent. The secret sauce of any of these models is how accurately the results today – nearly a month before the election – predict the results on Election Day. In his explanation of the models he uses, Silver claims Wang has historically gotten this wrong, and he critiqued Wang at Political Wire; here’s Wang’s response.
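To see how the same point estimate can yield very different win probabilities, here’s a minimal sketch under an assumed normal model (my illustration, not any forecaster’s actual method; the numbers are made up):

```python
# Illustration: two forecasts share the same point estimate but assume
# different uncertainty about movement between now and Election Day.
# Under a normal model, the win probability is P(outcome > threshold).
from math import erf, sqrt

def win_probability(point_estimate, threshold, sd):
    """P(outcome > threshold) for a normal distribution centered at point_estimate."""
    z = (point_estimate - threshold) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# Same point estimate (say, 51% expected vote share against a 50% win
# threshold), different assumed uncertainty:
cautious = win_probability(51, 50, sd=4)   # wide error bars: probability near 50%
confident = win_probability(51, 50, sd=1)  # narrow error bars: more extreme probability
print(cautious, confident)
```

A model assuming a standard deviation of 4 points gives a win probability around 60%, while one assuming 1 point gives around 84% – identical polling average, very different headline number.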
This is one of those places where the end-of-theory folks are just wrong. In some sense election forecasting is “big data” – at least, Silver, Wang, and the like are trying to predict a big-data result (millions of votes) – although the samples aren’t so large. We don’t get to run the elections over and over again – or, to go bigger, we get one shot at getting this climate change thing right. See for example Big Data: the end of theory in healthcare?, by Ben Wanamaker and Devin Bean. And there’s a reason that in my big data job I work with a bunch of people who were academic scientists in their former lives.