Noise A Flaw in Human Judgment Abstract
Qualities: Analytical
Noise
A Flaw in Human Judgment
Daniel Kahneman, Olivier Sibony and Cass R. Sunstein • William Collins © 2021 • 464 pages
Take-Aways
• Judgment seeks to find “true value,” which is not the same for everyone.
• Noise and bias contribute to errors in judgment.
• Mechanical judgment eliminates complexity and randomness, and is thus more reliable than clinical
judgment.
• System noise, level noise and pattern noise contribute to error in different proportions.
• Improve your judgments by using “decision observers” to reduce bias.
• Deploy “decision hygiene” methods to prevent noise before it happens.
• In ranking systems, noise occurs when judgments are absolute, not relative.
• Eliminating noise entirely may not always be worth the trouble.
Summary
Judgment seeks to find “true value,” which is not the same for everyone.
The human mind is a “measuring instrument,” and judgments are the measurements. A judgment is a
conclusion, not an argument. Computation resides on one side of the spectrum, and taste and opinion on the
other. Between them lies the realm of judgment.
Making a good judgment is not the same as having good judgment overall. Judgment does not factor into
matters of taste, where variability is expected and even desirable. Judgment aims at true value, which varies
from person to person. The unwanted variability of human judgment leads to human fallibility.
“A general property of noise is that you can recognize and measure it while knowing
nothing about the target or bias.”
Judgments fall into two categories in which inconsistency is problematic, but for different reasons:
1. Predictive judgment – Forecasters judge outcomes on the basis of probabilities. When two doctors or
two weather forecasters come to vastly different conclusions using the same data, that indicates noise.
2. Evaluative judgment – These judgments rely on values and preferences, and noise occurs when
decisions appear arbitrary, instead of conforming to agreed-on criteria.
Measuring the accuracy of predictive judgments after the fact is almost impossible, especially if they are
conditional or long-term. Disparities in evaluative judgments, particularly in systems, lead to unfairness.
Inconsistency tarnishes trust and credibility.
To understand the difference between bias and noise, imagine a target and shooters. Biased shooters, for
example, consistently miss the bull’s-eye in a recognizable pattern. Noisy shooters, on the other hand,
produce random scatter, which proves more difficult to measure because you cannot discern if they’re
aiming at the target. Bias indicates consistent deviation in one direction, such as a scale that
consistently adds five pounds to your weight. Noise indicates random scatter around an average, such as a
manager who sometimes underestimates and sometimes overestimates how long a project will take.
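The distinction between bias and noise can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that bias is measured as the average error and noise as the standard deviation of the errors; the function name and the sample readings are hypothetical. Under those assumptions, the mean squared error equals bias squared plus noise squared.

```python
from statistics import fmean, pstdev


def bias_and_noise(measurements, true_value):
    """Return (bias, noise, mse) for repeated measurements of one true value.

    Assumptions for illustration: bias is the mean error, noise is the
    population standard deviation of the errors.
    """
    errors = [m - true_value for m in measurements]
    bias = fmean(errors)                      # consistent, directional error
    noise = pstdev(errors)                    # scatter around the average error
    mse = fmean([e * e for e in errors])      # overall error
    return bias, noise, mse


# A scale that always adds five pounds: pure bias, no noise.
biased_scale = bias_and_noise([155, 155, 155], true_value=150)

# Estimates that scatter evenly around the truth: pure noise, no bias.
noisy_estimates = bias_and_noise([148, 152, 146, 154], true_value=150)
```

In both cases the third value returned (the mean squared error) matches the sum of the squares of the first two, which is why reducing either bias or noise reduces overall error.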
“What people usually claim to strive for in verifiable judgments is a prediction that
matches the outcome.”
www.getabstract.com
Many judgments are predictive and, therefore, verifiable. They teach a lot about noise. When professionals,
machines and simple rules are compared, professionals commit the most errors. To measure this error,
a noise audit uses the “percent concordant,” which makes it possible to compare clinical and mechanical
judgments to determine which is more accurate.
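As a rough illustration of the percent concordant, the sketch below counts, over every pair of cases, how often the case with the higher prediction also had the higher outcome. The function name is hypothetical, and skipping tied pairs is an assumption made here for simplicity, not something the text specifies.

```python
from itertools import combinations


def percent_concordant(predictions, outcomes):
    """Percentage of case pairs whose predicted order matches the actual order.

    Assumption for illustration: pairs tied on prediction or outcome are skipped.
    """
    concordant = 0
    compared = 0
    for (p1, o1), (p2, o2) in combinations(zip(predictions, outcomes), 2):
        if p1 == p2 or o1 == o2:
            continue  # tie: no order to compare (assumed handling)
        compared += 1
        if (p1 > p2) == (o1 > o2):
            concordant += 1
    if compared == 0:
        raise ValueError("no comparable pairs")
    return 100.0 * concordant / compared


# Hypothetical ratings of four candidates and their later performance scores.
score = percent_concordant([7, 5, 8, 6], [70, 55, 65, 60])
```

A value of 50 means the predictions order pairs no better than a coin flip, while 100 means every pair is ordered correctly.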
For example, take two candidates, and measure how accurately you can predict their eligibility for
a job. While the mechanical judgment has more constraints, and weights factors equally, its constraints
ensure reliability. Too often, human judgment relies on so many intuitive factors that decision-making
becomes almost random. You may think your judgment is more nuanced than a machine’s, but your mood,
the moment and your internal preferences keep you from matching the accuracy of a mechanical prediction.
“There is so much noise in judgment that a noise-free model of a judge achieves more
accurate predictions than the actual judge does.”
In recent times, machine learning – or AI – has come to prominence in making predictions on the basis
of vast troves of data. With greater accuracy than any human, AI is capable of predicting random events.
Humans have little tolerance for error in machines, though they tolerate it in themselves. People making
predictive judgments too often rely on gut instincts, leading to needless errors.
Wherever prediction exists, ignorance does also – and more than you might think. Admitting ignorance is
the first step to addressing uncertainty, and an improvement over allowing overconfidence to flourish and
noise to accumulate accordingly.
When people jump to conclusions, they stick to them – either by substituting a simpler question for a
difficult one, by “prejudging” and forcing a conclusion to match it, or by forming coherent impressions
quickly and declining to change them. These biases contribute to noise. Psychological bias can lead to
statistical bias, but everyone has different biases, which create system noise.
“Multiple, conflicting cues create the ambiguity that defines difficult judgment problems.”
When you face difficult, complex or ambiguous decisions, your mind seeks to fulfill two criteria: that your
judgment is worthy, and no better alternatives exist. What you believe and think others believe is not always
Three factors contribute to stable pattern noise: weight of ranking components, personal reactions, and
individual qualitative differences among judgments. If you add your unique experiences and your personal
quirks, your judgments can be even noisier, though they may exhibit internal consistency in line with your
personality.
Noise contributes more to error than bias contributes. Among the different kinds of noise, pattern noise is
significantly more prevalent than level noise – usually, by double.
To improve judgments, conduct a noise audit by having multiple judges assess the same problems.
The variability in their judgments is noise. If you have a problem with system noise, consider replacing
individuals with simple rules or algorithms. Be aware that AI cannot replace human judgment. Naturally,
you want to line up the best judges to improve your error rate, but factors that make someone a good judge
are not always clear. Start with people who already have a reputation for good judgment. They will be
confident in their judgments and able to explain their reasoning. With many years’ experience, they excel at
forming coherent narratives.
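A noise audit of the kind described above can be sketched in a few lines. This is a minimal illustration, assuming a complete table in which every judge rates every case; the function name and the numbers are hypothetical. System noise is taken as the spread of judgments across judges for the same case, level noise as the spread of the judges' average levels, and pattern noise as what remains once those levels and the case averages are removed. With these definitions, system noise squared equals level noise squared plus pattern noise squared.

```python
import math
from statistics import fmean


def noise_audit(ratings):
    """ratings[j][c] is the judgment of judge j on case c (complete table assumed)."""
    n_judges = len(ratings)
    n_cases = len(ratings[0])
    judge_means = [fmean(row) for row in ratings]
    case_means = [fmean([ratings[j][c] for j in range(n_judges)])
                  for c in range(n_cases)]
    grand_mean = fmean(judge_means)
    cells = [(j, c) for j in range(n_judges) for c in range(n_cases)]

    # Variability across judges looking at the same case.
    system_var = fmean([(ratings[j][c] - case_means[c]) ** 2 for j, c in cells])
    # Variability in how severe or lenient each judge is on average.
    level_var = fmean([(m - grand_mean) ** 2 for m in judge_means])
    # Residual: judge-by-case reactions not explained by level or case.
    pattern_var = fmean([
        (ratings[j][c] - judge_means[j] - case_means[c] + grand_mean) ** 2
        for j, c in cells
    ])
    return {
        "system": math.sqrt(system_var),
        "level": math.sqrt(level_var),
        "pattern": math.sqrt(pattern_var),
    }


# Three hypothetical judges rating the same three cases.
audit = noise_audit([[2, 4, 6], [4, 6, 8], [3, 8, 4]])
```

If two judges differ only by a constant offset, the sketch reports level noise and no pattern noise; disagreement about which cases deserve higher ratings shows up as pattern noise.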
“Bias leads to errors and unfairness. Noise does too – and yet, we do a lot less about it.”
Alternatively, seek judges with a cognitive style featuring careful thought. These people
interrogate information to ascertain whether it is accurate or trustworthy. They are usually more humble, as
well as open to criticism and to changing their minds as facts change. When working on a noise audit, these
people can observe the decision-making process and alert the team to unidentified biases.
Noise is harder to identify and fix than bias because it is less predictable and harder to explain. To
address noise, focus on prevention, not cure. This approach is decision hygiene, and resembles handwashing
among health professionals. You will never know which errors you prevented exactly, but you will have
statistically reduced their number.
“Just like handwashing and other forms of prevention, decision hygiene is invaluable but
thankless.”
In ranking systems, noise occurs when judgments are absolute, not relative.
Everyone dreads performance reviews, which have grown increasingly complex over the years. Although they
are ubiquitous, they remain useless for ascertaining an employee’s true worth. Defining scale in performance
ratings is a decision hygiene method. Choose a single dimension, and rank employees against one another,
rather than using absolute scales. Ranking reduces pattern noise and level noise, producing results that are
more consistent – and thus, more accurate.
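To see why relative ranking can remove level noise, consider the sketch below. It assumes two hypothetical raters, one lenient and one harsh, who agree on the order of three employees but use the absolute scale differently; the names, scores and function are illustrative only. Ties, if any, are broken by the order in which names were entered, which is an assumption of this sketch.

```python
def to_ranks(scores):
    """Convert absolute scores to ranks (1 = best) on a single dimension."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Same ordering, different use of the absolute scale (hypothetical data).
lenient_rater = {"Ana": 9, "Ben": 8, "Cy": 7}
harsh_rater = {"Ana": 6, "Ben": 4, "Cy": 3}

lenient_ranks = to_ranks(lenient_rater)
harsh_ranks = to_ranks(harsh_rater)
```

The absolute scores disagree for every employee, yet both raters produce the same ranking. The sketch illustrates only the level-noise part of the claim; it does not show how ranking affects pattern noise.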
“You can improve judgments by clarifying the rating scale and training people to use it
consistently.”
Noise is a problem when you’re hiring new people. Interviewers bring cognitive biases to the process. Often,
they rely on first impressions, and then seek coherence. The solution? Structure complex judgments by
aggregating different judges’ assessments. Google, for example, uses these principles in its structure:
1. Decomposition – Break the decision down into components. That focuses the judges on the relevant
information.
2. Independence – Ask predefined questions about candidates’ behavior in various situations.
3. Delayed holistic judgment – Do not exclude your intuition about a candidate. Delay it. Form a
committee to review all the data interviewers collected to make a collegial decision.
Data drive Google. Thus, the final decision is not mechanical, but informed by averaging combined scores.
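The three principles can be sketched as a simple aggregation step. This is an illustration under stated assumptions, not a description of Google's actual process: the component names, the scores and the function are hypothetical, and every interviewer is assumed to have scored independently before the numbers are combined.

```python
from statistics import fmean


def aggregate_scores(scorecards):
    """scorecards maps interviewer -> {component: score}, collected independently.

    Returns the average per component and the average of those averages.
    """
    components = sorted({c for card in scorecards.values() for c in card})
    per_component = {
        c: fmean([card[c] for card in scorecards.values() if c in card])
        for c in components
    }
    overall = fmean(list(per_component.values()))
    return per_component, overall


# Hypothetical independent scorecards for one candidate.
per_component, overall = aggregate_scores({
    "interviewer_1": {"problem_solving": 4, "leadership": 2},
    "interviewer_2": {"problem_solving": 2, "leadership": 4},
})
```

The averaged scores inform the committee's delayed holistic judgment rather than replacing it, in keeping with the point that the final decision is not mechanical.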
Costs can outweigh benefits when you’re trying to eliminate noise. Paramount among costs is unfairness,
since mechanical judgments can’t replace human discernment, particularly when people’s lives are at stake.
The financial costs may be too much for public institutions such as educational entities to bear.
Sometimes, noise reduction causes more errors than it fixes. For example, algorithms outperform humans
in making noise-free judgments. However, they allow unacceptable biases. Humans value their judgment
because it is more discerning and nuanced, and relies on moral underpinnings that no one wants to
disregard. Mercy, for example, is a human quality that no one wants an algorithm to eliminate. If the
noise-reduction methods are unfair or crude, but the noise causes irredeemable unfairness, the solution is
to create better noise-reduction methods, not to ignore the problem.
Social values evolve continuously, and flexibility in judgments can allow new values and beliefs to flourish.
In workplaces, having mechanical rules that govern your tasks can seem dehumanizing, and squelch
creativity. Noise reduction is beneficial in rules-based systems.
Regarding standards – which are more open to interpretation and, therefore, judgment – reducing noise
is more desirable. Standards are vague for a reason: They require more nuance. For example, a university
may have a standard policy regarding sexual harassment, but not rules for how to behave in every situation.
However, when you’re exercising judgment, remain aware that your goal is accuracy, not self-expression.