We live in a world flooded with numbers. News stories,
nutrition labels, results from research studies, and even sports highlights are
riddled with numbers! Numbers are comforting; they seem solid and dependable.
“The numbers don’t lie.” But what if they do? Numbers can be
manipulated to tell contradictory stories. Here are five things to think about
next time you see a reported number.
1. Error Margin
2. Sample Size
3. Sample Bias
4. Replication
5. Rounding Errors
1. Error Margin
Scientists obsess over error margins. One type of this error comes from our measuring tools. All measurements have uncertainty because we do not have infinitely precise tools.
Imagine that you have a foot-long ruler with no inches
marked. Any measurement will be precise only to the whole number of feet; the
number of inches will be a guess. A 5.5-foot height would be reported as 5.5 feet
plus or minus 0.5 feet.
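To make the "plus or minus" concrete, here is a minimal sketch in Python. It assumes the common convention that the uncertainty is half of the smallest division marked on the tool; the function name measurement_interval is made up for this illustration.

```python
def measurement_interval(reading, smallest_division):
    """Return (low, high) for a reading.

    Assumption: the uncertainty is half of the smallest division
    marked on the measuring tool.
    """
    margin = smallest_division / 2
    return reading - margin, reading + margin


# A ruler marked only in whole feet has a smallest division of 1 foot.
low, high = measurement_interval(5.5, 1.0)
print(f"5.5 ft plus or minus 0.5 ft means somewhere between {low} and {high} ft")
```

A reported 5.5 feet is really a range, and the range is what should be quoted.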
Unfortunately, these error margins are often neglected when
a number is quoted.
2. Sample Size
One out of one writers agree that sample size is important.
Scientists use large sample sizes to account for the
individual differences between people or animals.
Imagine you take your foot-long ruler to find the average
American height by measuring the heights of the first ten people you meet. You
find the average height to be 6.25 feet. The danger occurs if a headline reads
"The average person is 6.25 feet tall!" without mentioning the small
sample size.
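One rough way to see why ten people are too few is the standard error of the mean, which shrinks with the square root of the sample size. The sketch below assumes a random sample, and the 0.3-foot spread is a made-up value for illustration, not a measured one.

```python
import math


def standard_error(spread, sample_size):
    """Standard error of the mean: spread divided by sqrt(sample size)."""
    return spread / math.sqrt(sample_size)


ASSUMED_SPREAD_FT = 0.3  # hypothetical standard deviation of heights, in feet

for n in (10, 100, 1000):
    error = standard_error(ASSUMED_SPREAD_FT, n)
    print(f"n = {n:>4}: the average is uncertain by about {error:.3f} ft")
```

Going from 10 people to 1000 makes the average ten times more dependable, but only if the people were chosen at random.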
Humans love seeing patterns. We tend to use one example as
evidence for some larger truth: “my neighbor was sick, but then she ate some
sunflower seeds and got better. Hey, sunflower seeds cure sickness!”
The falsity of this pattern is summarized beautifully here:
“The plural of anecdote isn’t data.”
3. Sample Bias
General claims are only useful if they are based on samples representative of the whole population.
One famous example: researchers are now realizing that heart
attack signs in women are not always the same as the signs typically seen in men
(e.g., pain in the chest). Most heart attack studies were done using only male
subjects, so the "typical" signs do not necessarily apply to half of the
population.
Sample bias is also important for studies that take volunteers. The results of a phone poll will be skewed by the fact that certain types of people are more likely to be at home to receive the pollster's calls.
Any time you see a number referring to some population
result, try to find out the identities of the subjects.
4. Replication
Sometimes, a number seems too good to be true. One question to ask is whether it can be reproduced.
Let’s use a political race as an example. Five independent polls
take place. Four of them find that candidate A has a big lead, but one states
that candidate B has a big lead. Looking at all the results, you would conclude
that the fifth poll was erroneous. But if you had only the result from the fifth
poll, you would not have an accurate reading of the political race.
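The same idea in a few lines of Python. The poll numbers are invented for illustration (positive means candidate A is ahead), and taking the median is just one simple way to keep a single odd result from dominating.

```python
import statistics

# Hypothetical leads for candidate A, in percentage points (negative = B ahead).
poll_leads = [8, 10, 9, 7, -9]

typical_lead = statistics.median(poll_leads)
print(f"Median lead across all five polls: {typical_lead} points for A")
print(f"Fifth poll on its own: {poll_leads[-1]} points (B ahead)")
```

With all five polls, the odd one out barely moves the picture. With only the fifth, it is the whole picture.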
This phenomenon most commonly occurs when one lab publishes a
sensational scientific result, such as the claimed autism-vaccine connection,
which later studies did not reproduce.
Be excited about new results, but treat them with a healthy
amount of skepticism.
5. Rounding Errors
Spray butter: a delicious condiment used on corn-on-the-cob
or baked potatoes. A friend told me once that she could eat as much spray
butter as she wanted. The reason: the nutrition label stated that there were
zero calories.
Why does such a statement exist? If there are fewer than
five calories in a serving, the FDA allows a company to label that as zero
calories per serving.
The culprit in this case is a rounding error. If one serving of the
spray has 3 calories, the label rounds that serving down to zero. Five servings,
however, contain 5*3=15 calories, not 5*0=0 calories.
Errors propagate every time you round a number, so scientists
round only at the last step of a problem.
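Here is the spray butter arithmetic as a small Python sketch. The function label_calories is a simplified stand-in for the labeling rule described above: it only models "under five reads as zero" and leaves larger values unchanged, which is an assumption made to keep the example short.

```python
def label_calories(actual_calories):
    """Simplified labeling rule: fewer than 5 calories is shown as 0.

    Assumption: values of 5 or more are shown unchanged.
    """
    return 0 if actual_calories < 5 else actual_calories


CALORIES_PER_SERVING = 3  # the example value used above
SERVINGS = 5

# Rounding each serving first, then adding: 5 * 0
rounded_early = SERVINGS * label_calories(CALORIES_PER_SERVING)

# Adding first, then rounding once at the end: label(15)
rounded_last = label_calories(SERVINGS * CALORIES_PER_SERVING)

print(rounded_early, rounded_last)  # prints: 0 15
```

Same butter, same sprays, and the answer changes from 0 to 15 depending on when the rounding happens.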
These five features of numbers should help you navigate the
number-filled world we live in.
Woot, first comment. Very nice. I would add that even given well collected, processed, and annotated data, error often occurs in the human being interpreting the data into false causation. I would go as far as to say that this is the most common method used to manipulate the outcome of a test when there is a strong emotional investment in the results.