by Ray Poynter
When I am teaching my workshop on finding the story in the data, I make a point of encouraging people to look for alternative explanations for the phenomena they see in the data. For example, in the 1980s it was noticed that women being prescribed HRT were less likely to suffer from cardiac disease, which led to more women being prescribed HRT. However, a properly controlled randomized trial showed that a) HRT was slightly bad for women’s hearts and b) in the past it was women with better health outcomes who were being prescribed HRT, which is why it looked as though HRT was associated with better outcomes.
Simpson’s Paradox and the Sexual Discrimination that Didn’t Happen
Simpson’s Paradox describes a situation where the data at the total level do not represent the true picture, and where the true picture can only be revealed by looking at the data in more detail. It is sometimes phrased as: patterns that are visible within groups disappear or reverse when the groups are combined.
A commonly quoted example of Simpson’s Paradox is that of the University of California, Berkeley in 1973. The University noticed that the acceptance rate for male applicants (44%) was higher than for female applicants (just 35%), and it was concerned that discrimination might be taking place (and that it might be sued). However, when the University analysed the data by department, it found that in every department the acceptance rates for men and women were about equal, except in those cases where the acceptance rate for women was higher than for men. But if there was no difference at the department level, how could there be a difference at the total level? The answer was that men tended to apply to courses that were easier to get into, while women tended to apply to courses that were harder to get into. (Side note: this was at a time when there was plenty of funding for STEM subjects such as engineering, which meant more places were available, and at that time more men than women applied for STEM courses.)
We can show this effect with a small hypothetical example, using just two departments, and just 24 students, 12 men and 12 women.
Simplified Example, based on the University of California case.
In this example, 8 of the 12 men applied for a STEM course and 4 applied for a course in Classics. Amongst the women the picture is reversed, with 4 applying for STEM and 8 applying to study Classics. The acceptance rate for STEM is 75%, which means 6 of the 8 men are accepted and 3 of the 4 women. The acceptance rate for Classics is 25%, so just 1 of the 4 men who applied is successful, and just 2 of the 8 women who applied are accepted. Putting this all together, 7 of the 12 men were accepted (6 for STEM and 1 for Classics). Amongst the women just 5 were accepted (3 for STEM and 2 for Classics). So, with no difference at the individual course level, we end up with an acceptance rate of 58% for the men and 42% for the women – Simpson’s Paradox.
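The arithmetic above can be sketched in a few lines of code. This is only an illustration of the hypothetical 24-student example, not of the real Berkeley data: within each department the acceptance rates for men and women are identical, yet the combined rates differ.

```python
# (department, gender): (applicants, accepted) -- the 24-student example
applications = {
    ("STEM", "men"): (8, 6),
    ("STEM", "women"): (4, 3),
    ("Classics", "men"): (4, 1),
    ("Classics", "women"): (8, 2),
}

def rate(pairs):
    """Acceptance rate over a list of (applied, accepted) pairs."""
    applied = sum(a for a, _ in pairs)
    accepted = sum(c for _, c in pairs)
    return accepted / applied

# Within each department, men and women are accepted at the same rate.
for dept in ("STEM", "Classics"):
    m = rate([applications[(dept, "men")]])
    w = rate([applications[(dept, "women")]])
    print(f"{dept}: men {m:.0%}, women {w:.0%}")

# Combined across departments, the rates diverge: 58% vs 42%.
men_total = rate([v for k, v in applications.items() if k[1] == "men"])
women_total = rate([v for k, v in applications.items() if k[1] == "women"])
print(f"Overall: men {men_total:.0%}, women {women_total:.0%}")
```

The divergence comes entirely from where the applications went: most men applied to the 75%-acceptance department, most women to the 25%-acceptance one.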
(Side note: there is still the deeper issue of why fewer women were applying for STEM subjects, something which continues to be a problem in many countries today.)
Raising the IQ of Both Countries
There is a popular joke/comment that is told around the world about neighbouring regions or countries, so to avoid offence, I will tell it using Jonathan Swift’s Lilliput and Blefuscu. This story also illustrates the point about the total not apparently reflecting the parts. When somebody moves from Blefuscu to Lilliput, they raise the average IQ in both countries. How can this be? The joker’s reply is that only a less-than-averagely-bright citizen of Blefuscu would dream of moving to Lilliput, so when they leave, the average IQ in Blefuscu goes up. But even a less bright Blefuscu citizen is brighter than the average Lilliputian, so the average IQ in Lilliput goes up too. In market research, reallocating a customer (or group of customers) from one segment or category to another could similarly increase the average spend or the average satisfaction in both groups.
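The joke can be checked with made-up numbers (the IQ scores below are purely hypothetical, chosen to illustrate the mechanism): moving one member who is below the average of their old group but above the average of their new group raises both averages.

```python
# Hypothetical IQ scores for the two countries.
blefuscu = [110, 105, 100, 95]
lilliput = [85, 80, 75]

def avg(xs):
    return sum(xs) / len(xs)

before = (avg(blefuscu), avg(lilliput))  # (102.5, 80.0)

# The mover scores below Blefuscu's average but above Lilliput's.
mover = 95
blefuscu.remove(mover)
lilliput.append(mover)

after = (avg(blefuscu), avg(lilliput))  # (105.0, 83.75)

# Both averages have gone up, exactly as the joke claims.
assert after[0] > before[0] and after[1] > before[1]
```

The same mechanism applies to the segmentation example: a customer below one segment’s average spend but above another’s lifts both averages when reallocated.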
Other Examples of Simpson’s Paradox
If you Google Simpson’s Paradox, you will find a wide variety of examples, dealing with a variety of measures including averages, percentages, and even correlations. One of the key underlying causes of the paradox is the presence of a confounding variable. For example, in the case of HRT the confounding variable was money and/or health insurance (people with money and/or health insurance tend to be healthier in the US, and women with money and/or health insurance were more likely to be prescribed HRT). The differences both in health outcomes and in whether women were prescribed HRT were being driven by the confounding variable ‘access to healthcare’.
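The correlation version of the paradox can be sketched with invented data (the numbers below are purely illustrative): each group shows a perfect positive trend on its own, but pooling the two groups produces a strong negative correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

# Two hypothetical groups, each with a rising trend but at different levels.
group_a = ([1, 2, 3], [10, 11, 12])
group_b = ([11, 12, 13], [0, 1, 2])

print(pearson(*group_a))  # 1.0 -- perfectly positive within group A
print(pearson(*group_b))  # 1.0 -- perfectly positive within group B

# Pooled together, the group-level difference dominates and the sign flips.
pooled_x = group_a[0] + group_b[0]
pooled_y = group_a[1] + group_b[1]
print(pearson(pooled_x, pooled_y))  # about -0.95
```

Here group membership plays the role of the confounding variable: it shifts both x and y, and ignoring it reverses the apparent relationship.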
Implications for finding the story in the data
The main implication for anybody working with data is that the first impression might be misleading. Try to look for alternative explanations for what you see. The most dangerous cases are probably those where the overall impression agrees with our biases. In the case of the University of California, if you were predisposed to expect bias against women, then the difference in acceptance rates at the overall level (44% versus 35%) would confirm your bias and you might be tempted to accept it at face value. However, if you were to look to see where the bias was worst, you would soon find that this was an example of Simpson’s Paradox. So, even when the data show the pattern we expect, it is worth digging deeper to understand its range and scope – because doing so might show us that the pattern is illusory.
This article was also published on https://newmr.org/blog/simpsons-paradox/, 3 July 2019
For more information about the subject, please join the MOAcademy workshop "Finding and communicating the story in the data; identifying the insights" with Ray Poynter as workshop leader on October 24th.