How the 538 model works

Here’s an explanation by Nate Silver of how his Senate prediction model works. It runs to 10,000 words and is denser than the typical FiveThirtyEight post, but it’s food for thought if you’ve been curious about what’s going on under the hood of FiveThirtyEight’s flagship product.

Make sure to click through to the footnotes – lots of links to subsidiary analyses from the past that explicate some of the interesting tidbits Silver and co. have built up over time.

A data scientist is…

My wife sent me this tweet by David M. Wessel this morning. It’s a photograph of a presentation slide giving three definitions of data scientists:

“A data scientist is a statistician who lives in San Francisco.
Data science is statistics on a Mac.
A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”

Now, at my last job I lived in San Francisco, used a Windows machine, and was called a “quantitative analyst”. Now I live in Atlanta, use a Mac, and am called a “data scientist”.

(Oh, yes. I forgot to mention that. In the turmoil of a cross-country move, blogging fell by the wayside. I’m hoping to get back in the habit.)

My conclusion (n = 1) is that the “uses Mac” variable has a higher weight than the “lives in San Francisco” variable. This may actually be true; a lot of data scientists are using Unix tools and those in general integrate better with Macs.

A final question: where are these quotes originally from?

It looks like the Mac quote is from Big Data Borat in August 2013.

The last quote (slightly rephrased) is probably due to Josh Wills in May 2012.

In a Quora answer from January 2014, Alon Amit attributes the San Francisco quote to Josh Wills, who says he was riffing on nivertech saying “‘Data Scientist’ is a Data Analyst who lives in California.” Most of the Google hits for this quote are from January through March of 2014, but I feel like I heard it earlier; can anyone find a better citation?