How to Lie with Numbers

There is a common saying in English, “numbers never lie”, which presumably expresses the idea that any argument backed up by data, statistics or quantifiable information must be trustworthy or inherently credible. One of the key tasks of TOK is to step back, analyse and reflect on the basis of anything presented as evidence for an argument. We live in a world where data is generated at an unprecedented rate. We are bombarded by constant information whilst being asked to trust scientists, economists, psychologists, geographers, mathematicians, politicians and other authority figures because “the numbers never lie”. Let us therefore step back and reflect on the ways in which numbers are, if not always used to lie outright, then misused, mishandled or abused in order to increase the persuasiveness of knowledge claims.

The worst examples are of course those where numbers are simply guessed or plucked out of thin air because there is no actual data available, or because the data is too unclear or unreliable to make a strong case. A case in point was the UK’s Vote Leave campaign claim that they would make available £350 million a week for the National Health Service. Once the vote was won, this claim simply disappeared or was described as part of a ‘series of possibilities’ by a key Vote Leave campaigner. What is clear is that the number was not based on any real data or research and no justification was ever provided for it.

All of us constantly and, according to many neuroscientists, subconsciously sift through the information we seek or are presented with. Selecting only the facts, data and numbers that suit our point of view or support our argument is a commonly used trick to mislead one’s audience. Governments of course do this all the time, as does anyone trying to sell you something. We naturally highlight the benefits of our approach whilst ignoring or downplaying the costs, and this is a common way of misusing data.

Estimates presented as cold hard facts are pretty common too. Much of the information we and others use to bolster our claims is acquired by a variety of means, surveys being one of the most common and one of the least reliable. Objective surveys are notoriously difficult, some would say impossible, to set up. Given the choice of questions, the way they are worded, the order in which they are asked, the way in which the data is collected and the reliance on the accuracy and honesty of the respondents, it is no wonder survey results seem as likely to mislead as to inform. Yet much of the data presented on a daily basis will have been gathered in precisely that way.

The issue of interpreting data must also be kept in mind. Even if, by some unlikely chance, the data one has gathered is indeed representative of the facts on which one is basing one’s conjecture or argument, there is still the question of what the numbers actually mean. It is not unusual to see people arrive at entirely opposite views of what the facts mean once they have attempted to interpret them. Interpreting data is fraught with problems: its incompleteness, its built-in bias, the exact context from which it emerged, its repeatability, and the assumptions and agenda of the interpreters. A naked fact might be true, but what it means may be a matter of contention even among the experts who use it.

Post Hoc Ergo Propter Hoc: this Latin expression, usually translated as “after this, therefore because of this”, names a very common error when handling data. The point is that two seemingly related sets of data may in fact not be causally connected at all, or only tangentially. The error is closely related to mistaking correlation for causation. Just because fact ‘a’ is always or often followed by fact ‘b’ does not necessarily mean that ‘a’ is the cause of ‘b’. This is a very common way in which data is misused. Proving causation is notoriously difficult, yet we all blithely make causal claims without showing that what we claim isn’t simply a case of coincidental correlation.
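
For readers who like to see the arithmetic, here is a minimal sketch in Python of how a strong correlation can appear without any causal link. The figures are invented purely for illustration (they are not real data), and the scenario is an assumption: daily temperature drives both ice cream sales and drownings, while neither of those two affects the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after removing the best straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

# Invented illustrative figures, NOT real data.
temperature = [14, 16, 18, 20, 22, 24, 26, 28]          # degrees Celsius
ice_cream = [145, 155, 175, 205, 225, 235, 255, 285]    # sales per day
drownings = [16, 14, 20, 18, 20, 26, 24, 30]            # incidents per month

# The naive comparison: the two series rise and fall together.
raw_r = pearson(ice_cream, drownings)

# Remove the part of each series explained by temperature, then compare
# what is left over.
adjusted_r = pearson(residuals(ice_cream, temperature),
                     residuals(drownings, temperature))

print(f"raw correlation:      {raw_r:.2f}")
print(f"after removing temp.: {adjusted_r:.2f}")
```

With these made-up figures the raw correlation is strong, yet it vanishes once the shared influence of temperature is removed. Real data is rarely this tidy, and a hidden third factor is only one of several ways a correlation can arise without causation.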

The numbers never lie? Producing and interpreting data, information, numbers and statistics accurately and reliably is fraught with problems, and it is difficult to see how one can ever fully trust them. We all mishandle, misinterpret, misuse and abuse data, sometimes unknowingly. Having become aware of this, should the saying not become “the numbers sometimes tell the truth”? However, I do not see that one catching on, can you?
