
More Lemons, Safer Highways

As a leader, you can’t know everything. You have to hire smartly and be able to trust the people on your team to give you insights on topics you may know nothing about. But you have to be smart too, and ask questions so that you can be confident the advice or conclusions are well supported. Trust but verify.

In some organizations, the Quality and/or Continuous Improvement teams are respected and even admired for their advanced knowledge of Probability and Statistics. Findings are often presented that are backed by statistical “proof”. A little knowledge, however, can be a bad thing. Do you know enough to ask the right questions?

Correlation, according to Wordnik: The tendency for two values or variables to change together, in either the same or opposite way.

The graph of lemon imports versus highway deaths has been used so often that it’s hard to determine the original source. It’s commonly presented in introductory statistics courses to illustrate the concept of correlation.

Here, one can see the US highway fatality rate (on the vertical, Y axis) versus the tons of fresh lemons imported from Mexico to the United States (on the horizontal, X axis) between 1996 and 2000. The notation at the top means that there is a 97% correlation between the fall in highway deaths and the increase in lemon imports. Given this relationship, wouldn’t a natural conclusion be that to lower the death rate further, we should accelerate the import of Mexican lemons?

Preposterous? Of course! This is an example of how to lie with statistics. Just because two events follow the same pattern, or are inversely related, as in this example, does not mean that one causes the other. Correlation is not the same as causation. On the other hand, one thing that actually does cause another is likely to result in a high correlation, whether positive or negative. We’re led to believe, for example, that the number of cases of COVID-19 is highly negatively correlated with factors such as mask-wearing, hand-sanitizing and social distancing, which makes intuitive sense and is worth investigating.
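To see how easily a strong correlation can appear between unrelated quantities, here is a minimal Python sketch. The numbers are made up for illustration and are not the actual lemon or fatality figures; the `pearson_r` helper is just one simple way to compute the usual correlation coefficient by hand.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for five consecutive years (NOT real data).
# One series happens to rise over time, the other happens to fall.
imports = [100, 140, 190, 230, 290]
fatality_rate = [15.9, 15.7, 15.4, 15.3, 15.0]

r = pearson_r(imports, fatality_rate)
print(f"r = {r:.3f}")  # strongly negative, close to -1
```

Any two series that drift steadily in opposite directions over the same period will produce a coefficient near -1, whether or not one has anything to do with the other.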

Another misleading feature of this example is the scale of the graph. In order to show a dramatic relationship between the variables, the vertical axis starts at 14.8 and rises only to 16.0. What would the diagonal line look like if the scale started at 0 and ran to 16? It would be almost flat. The correlation would still be very high, 97%, but the visual would not be so striking.
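The effect of the axis choice can be put into rough numbers. The sketch below uses the two axis ranges mentioned above (14.8 to 16.0, and 0 to 16) together with hypothetical fatality-rate values, and computes how much of the chart's height the decline occupies in each case. The `fraction_of_axis` helper is an illustrative name, not a standard function.

```python
def fraction_of_axis(values, axis_min, axis_max):
    """Share of the vertical axis height spanned by the data's range."""
    return (max(values) - min(values)) / (axis_max - axis_min)

# Hypothetical fatality rates (NOT real data), chosen to fit inside 14.8-16.0
fatality_rate = [15.9, 15.7, 15.4, 15.3, 15.0]

truncated = fraction_of_axis(fatality_rate, 14.8, 16.0)  # axis as in the graph
full = fraction_of_axis(fatality_rate, 0.0, 16.0)        # axis starting at zero

print(f"Truncated axis: decline fills {truncated:.0%} of the chart height")
print(f"Zero-based axis: decline fills {full:.0%} of the chart height")
```

With these made-up values, the same decline fills most of the chart on the truncated axis and only a thin sliver on the zero-based one. The correlation itself is unaffected, because the Pearson coefficient does not change when the data are merely shifted or rescaled for plotting.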

There have been a number of books written on the theme of “How to Lie with Statistics”. Being an effective leader does not require facility with probability and statistics. But there’s so much value to be had, and so many ways to misinterpret the data, intentionally or unintentionally, that it’s worthwhile to gain a basic understanding of the key concepts.

