Reverse Anscombe

At Cross Validated, someone asked why they get wildly different histograms from the same data. The user Glen_b gave an excellent answer built around an example in which data sets that differ from each other only by an added constant have very different-looking histograms. Other commenters suggested using kernel density estimates or cumulative distribution plots, neither of which would fail on this particular example.
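Here is a minimal sketch of the effect in Python. The data are made up for illustration and are not Glen_b's example; the `bin_counts` helper, the bin width of 1, and the shift of 0.5 are likewise my own assumptions. The bins stay fixed at the integers while the data slide underneath them.

```python
import math

def bin_counts(data, width=1.0, n_bins=5):
    """Count points in fixed bins [0, width), [width, 2*width), ... (hypothetical helper)."""
    counts = [0] * n_bins
    for x in data:
        i = math.floor(x / width)
        if 0 <= i < n_bins:  # ignore anything outside the plotted range
            counts[i] += 1
    return counts

# Made-up data: pairs of points straddling the bin edges at 1 and 3.
data = [0.8, 0.9, 1.1, 1.2, 2.8, 2.9, 3.1, 3.2]
shifted = [x + 0.5 for x in data]  # same data, plus a constant

flat = bin_counts(data)        # the bin edges split each cluster in two
bimodal = bin_counts(shifted)  # each cluster now falls inside a single bin

print(flat)     # [2, 2, 2, 2, 0]
print(bimodal)  # [0, 4, 0, 4, 0]
```

With the same bin edges, the original data look flat and the shifted data look sharply bimodal, even though the gaps between the points have not changed at all.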

Anscombe’s quartet comes to mind: four bivariate data sets with the same mean and variance of each coordinate and the same correlation, which look wildly different when plotted. This is sort of a reverse-Anscombe: here data sets that are essentially the same, differing only by a constant shift, produce wildly different-looking histograms.

Published by Michael Lugo
