From the Census Bureau via Slate, on the income gaps within opposite-sex married couples:

in 3.9 percent of couples, the husband earns 5,000 to 9,999 more dollars (per year) than the wife;

in 25.4 percent of couples, the husband earns within 4,999 dollars of the wife;

in 2.8 percent of couples, the wife earns 5,000 to 9,999 more dollars than the husband.

(The rest of the couples have more than a 10,000-dollar differential.)

Something seems fishy here. Call the wife’s earnings $W$ and the husband’s earnings $H$; we’re interested here in the distribution of the random variable $H - W$. (Of course it’s difficult to write out the distribution of $H - W$; we know $H$ and $W$ are correlated, by assortative mating.) The three bins above correspond to $H - W$ being in the intervals $[5000, 9999]$, $[-4999, 4999]$ and $[-9999, -5000]$. The second interval is twice as wide as the others – so we’d expect twice as many couples to be in that middle bin as the ones on either side of it.

But instead we have roughly 6.5 to 9 times as many (25.4 percent against 3.9 and 2.8 percent). Any explanations? All I can think of to explain this phenomenon – if it’s real – is that there are a surprisingly large number of cases where the husband and wife do the same job (not just working at the same place, but actually doing the same thing, for the same pay)… but how many couples like that can there be? It seems more likely to be an artifact of how the survey works.
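The comparison between the bins is just arithmetic on the three percentages quoted above. Here is a minimal sketch of it; the variable names are mine, and the "expected" middle share assumes the density of the difference is roughly flat across all three bins, which is exactly the assumption in question.

```python
# Shares of couples in each bin, in percent, as quoted from the Census figures.
husband_more = 3.9   # husband earns 5,000 to 9,999 more
middle = 25.4        # spouses within 4,999 of each other
wife_more = 2.8      # wife earns 5,000 to 9,999 more

# The middle bin is about twice as wide as each side bin, so under a
# roughly flat density it would hold about twice the average side share.
expected_middle = husband_more + wife_more

ratio_vs_husband_bin = middle / husband_more
ratio_vs_wife_bin = middle / wife_more
excess = middle / expected_middle

print(round(ratio_vs_husband_bin, 1))  # 6.5
print(round(ratio_vs_wife_bin, 1))     # 9.1
print(round(excess, 1))                # 3.8
```

So the middle bin holds nearly four times what the flat-density guess would give it.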

3 thoughts on “Men and women making exactly the same (not a post about the pay gap)”

Your expectation that the middle one would have twice as many as the other bins assumes that men and women marry randomly with regard to income, and that incomes themselves have a uniform distribution. But “assortative mating” is a thing (couples meet in college, grad school, jobs, etc.), and the individual income distribution itself (something like lognormal) will make incomes naturally clumped together a bit too. 25.4% may in fact be high, but you’d expect some clumping near 0 difference…

Disclaimer: I am a mathematician but not a statistician.

Intuitively, it seemed natural to me that the distribution near 0 would be higher (even assuming no correlation).
I tried to make this formal.

For example, suppose that men’s and women’s incomes both have the same uniform distribution, say on [a,b]. Then, assuming no correlation, the distribution of the difference is an “upright triangle” with a maximum at 0, no?
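The triangle can be written out explicitly. This is a sketch under the stated assumptions only: $W$ and $H$ are independent and both uniform on $[a,b]$; the symbols $L = b - a$, $D = H - W$ and the half-width $c$ are introduced here for illustration.

```latex
% Density of the difference D = H - W of two independent U[a,b] incomes,
% with L = b - a: the overlap of [a,b] with its shift by d has length L - |d|.
f_D(d) = \int_a^b f_W(w)\, f_H(w + d)\, dw = \frac{L - |d|}{L^2},
\qquad |d| \le L .

% Probability that the two incomes are within c of each other, for 0 \le c \le L:
P(|D| \le c) = \int_{-c}^{c} \frac{L - |d|}{L^2}\, dd
             = 1 - \left(1 - \frac{c}{L}\right)^2 .
```

The density is largest at $d = 0$ and falls off linearly on either side, which is the triangle described above.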

If the original distribution itself has a central bulge, then this phenomenon is amplified and the difference distribution gets an even bigger bulge around 0.

Anyway, I may be missing something, but it looks to me as though there is not much that needs explaining: it suffices to assume that the distributions for both men and women have central bulges that are not too far apart from each other. (Again, this is even with no correlation. Correlation would amplify this even more.)
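One way to see how much of a bulge independence alone produces is to simulate it. The sketch below draws independent husband and wife incomes and counts how the differences fall into the three bins from the post. Everything in it is an assumption for illustration: the function names, the uniform range of 0 to 100,000, and the lognormal parameters are made up, not taken from the Census data.

```python
import random


def bin_shares(diffs):
    """Share of husband-minus-wife differences (in dollars) in each bin."""
    n = len(diffs)
    husband_more = sum(1 for d in diffs if 5000 <= d < 10000) / n
    middle = sum(1 for d in diffs if -5000 < d < 5000) / n
    wife_more = sum(1 for d in diffs if -10000 < d <= -5000) / n
    return husband_more, middle, wife_more


def simulate(draw_income, n=200_000, seed=0):
    """Draw n independent (husband, wife) income pairs and bin the differences."""
    rng = random.Random(seed)
    diffs = [draw_income(rng) - draw_income(rng) for _ in range(n)]
    return bin_shares(diffs)


def uniform_income(rng):
    # Hypothetical: incomes uniform on [0, 100000].
    return rng.uniform(0, 100_000)


def lognormal_income(rng):
    # Hypothetical: lognormal with median exp(10.7), about 44,000 dollars.
    return rng.lognormvariate(10.7, 0.7)


for name, draw in [("uniform", uniform_income), ("lognormal", lognormal_income)]:
    h, m, w = simulate(draw)
    # Ratio of the middle bin to the average of the two side bins.
    print(f"{name}: middle / average side = {2 * m / (h + w):.2f}")
```

For the uniform case with this particular range, the exact ratio from the triangular density is about 2.1; a narrower range gives a larger ratio. Changing the parameters, or adding correlation between the two draws, shows how sensitive the middle bin is to each assumption.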

Self-reporting of income to the census and joint filing of taxes probably create some noise in this data.