Heuristics are 'rules of thumb' that are cognitively undemanding and often produce approximately accurate answers. A good heuristic is efficient, quick, robust and generally right. Heuristics make decisions less cognitively demanding, which matters because time is short, information is limited, and our cognitive resources are finite.
For example: We go to a restaurant. We obviously want a great dish, but we have no idea how to go about choosing one from the many options available. We could do a rational analysis, evaluate how much we think we would enjoy each dish on the menu, and then pick the one we predict we'd prefer most. Though we would get the best dish, it would take far too long to choose this way, and it would take a lot of cognitive effort to compare all the dishes. So we may just order the special, or the same thing every time, and trust that it will be great (which it probably will), saving ourselves time and effort.
This demonstrates how a decision (based on the previous assumption that ordering the special usually turns out to be okay) results in a choice that is satisfactory, efficient and quick. Rather than thinking of every instance or calculating odds, we save time and effort by using methods that are rapid, economical and reasonably likely to work.
Inevitably, heuristics lead to systematic errors in our thinking that cause mistakes and biases. One such systematic error is known as confirmation bias, where people pay more attention to, and better remember, information that confirms their beliefs than information that disproves them. As a result, information is gathered and remembered selectively and interpreted in a biased way. Worryingly, this can result in someone with a strong opinion reading 10 pieces of evidence and only remembering the 2 that confirmed their opinion, while completely ignoring the other 8!
A second type of systematic error is pattern matching where people look for a pattern and assume that the correct information is that which fits into the pattern.
These biases are not attributable to motivational effects such as wishful thinking or the distortion of judgements by payoffs and penalties. Severe errors of judgement occur even when subjects in experiments are encouraged to be accurate or are rewarded for correct answers (Edwards, 1968).
Tversky & Kahneman (1974) proposed three common heuristics responsible for many biases:
The representativeness heuristic is where people assume that if A is representative of B then it probably comes from B. Probabilities are evaluated according to similarity: the degree to which A resembles B, or is representative of B, regardless of other information. For example: "Ken is big, has long hair, wears mostly black and rides a motorbike." Is Ken more likely to enjoy rock or pop music? Most people choose rock music because the description fits the idea of a 'rock music lover' more than it does a 'pop music lover'.
Base rate fallacy
"In Uganda 90% of the population are farmers, 5% work in business and accounting and 5% work for the government. James is a Ugandan who enjoys maths and is very good at it. He isn't very sociable, and hasn't got many friends. He dislikes sports but enjoys watching football on television. What do you think James' profession is?"
Many people will read the above example and say "accountant", as the information on James is representative of an accountant. However, this is not likely, as only 5% of the population are in this profession, and there will be a large number of people who enjoy maths but do not become accountants. The statement "In Uganda 90% of the population are farmers, 5% work in business and accounting and 5% work for the government" is called base rate information, and people routinely ignore it when given other information. This is called the "base rate fallacy".
We base our judgements on our past experiences or assumptions rather than base rate probabilities. Tversky and Kahneman (1973) asked participants to read a description of 'Jack' and then decide if Jack was an engineer or lawyer. They were all told the description had been selected at random from 100 descriptions. Half were told there were 70 lawyer and 30 engineer descriptions and half told there were 70 engineer and 30 lawyer descriptions. Participants took no account of the base rate information (70 to 30 split) and decided, on average, that there was a 0.90 probability that Jack was an engineer. They based their decision on the degree to which the description was representative of the two stereotypes, with no regard for the prior probabilities of the categories. It is also important to note that participants used probabilities correctly in the absence of a personality description. So, when judging the probability that an unknown person is an engineer or a lawyer, they deemed it to be 0.7 and 0.3 respectively. But as soon as a description was introduced, such probabilities were essentially ignored, even when the description was not informative.
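The role of base rates in the James example can be made concrete with Bayes' rule. The sketch below is illustrative only: the base rates come from the example, but the likelihoods (how probable the description is for each profession) are hypothetical numbers chosen for illustration, and `posterior` is a helper name introduced here.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a set of mutually exclusive categories."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

# Base rates taken from the James example.
priors = {"farmer": 0.90, "accountant": 0.05, "government": 0.05}

# HYPOTHETICAL likelihoods: how probable the description is for each
# profession. The description is assumed to fit an accountant ten times
# better than it fits a farmer.
likelihoods = {"farmer": 0.05, "accountant": 0.50, "government": 0.10}

james = posterior(priors, likelihoods)
print({k: round(v, 3) for k, v in james.items()})
```

Under these assumed numbers, farmer remains the most probable profession despite the description fitting an accountant far better, simply because farmers are so much more common. If the description is uninformative (equal likelihoods for every category), the posterior equals the base rates, which is the answer participants in the engineer/lawyer study failed to give.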
The conjunction fallacy (Tversky and Kahneman, 1983) is the mistaken belief that the conjunction of two events (A and B) is more probable than one of those events on its own (A or B alone).
The 'Linda is a bank teller' question is a good example:
Linda is 31 years old, single, outspoken, and very bright. She studied philosophy at University. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-war demonstrations.
Which is more likely?
A) Linda is a Bank Teller
B) Linda is a Bank Teller and is active in the feminist movement.
People have a tendency to choose option B because of the preceding information: option B is more representative of the description. In fact option A is at least as likely, because whenever B is true, A is also true, so the probability of A cannot be less than that of B, and it is probably greater.
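The conjunction rule behind the Linda problem can be checked with a line of arithmetic. In this minimal sketch the two probabilities are hypothetical (they are not from the study), and `joint_probability` is a name introduced here for illustration.

```python
def joint_probability(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a

p_teller = 0.05                 # hypothetical P(Linda is a bank teller)
p_feminist_given_teller = 0.90  # hypothetical, deliberately generous

p_both = joint_probability(p_teller, p_feminist_given_teller)
print(p_both <= p_teller)  # True

# The inequality holds for every pair of probabilities on a coarse grid,
# because P(B | A) can never exceed 1.
grid = [i / 10 for i in range(11)]
assert all(joint_probability(a, b) <= a for a in grid for b in grid)
```

However strongly the description suggests feminism, multiplying by a probability no greater than 1 can only shrink or preserve the chance that Linda is a bank teller.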
Tversky and Kahneman found that people made the same mistakes whether they were statistical experts or naive students and that even doctors make mistakes when diagnosing patients as they use intuitive judgements.
Gambler's fallacy (Tversky and Kahneman, 1974), sometimes loosely called the 'law of averages', refers to the belief that if an event has occurred less often than its probability suggests, it is more likely to happen in the near future. An example is as follows:
If a coin is tossed ten times and produces 10 heads, what is the likelihood that tails will be the next result?
Gambler's fallacy in this case is a tendency to overestimate the probability that the coin will fall on tails next, due to the belief that a tail is somehow "due" in the sequence because of the sheer number of heads beforehand. The 50/50 chance of either a head or a tail on any single toss is therefore ignored. This illustrates how the probability of tails coming next is judged by representativeness: chance is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium (Tversky and Kahneman, 1974). So, because a sequence that ends in "tails" looks more like a typical random sequence than 11 heads in a row, we overestimate the probability of tails appearing next.
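The independence of coin tosses can be verified by brute force. The sketch below assumes a fair coin and enumerates every possible sequence of 11 tosses, then looks only at those that begin with 10 heads.

```python
from fractions import Fraction
from itertools import product

# All 2**11 equally likely sequences of 11 fair coin tosses.
sequences = list(product("HT", repeat=11))

# Keep only the sequences whose first ten tosses are all heads.
after_ten_heads = [s for s in sequences if all(t == "H" for t in s[:10])]

# Of those, count the ones where the eleventh toss is tails.
tails_next = [s for s in after_ten_heads if s[10] == "T"]

p_tails = Fraction(len(tails_next), len(after_ten_heads))
print(p_tails)  # 1/2
```

Exactly half of the sequences that start with ten heads continue with tails, so the previous run gives no information about the next toss.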
Insensitivity to predictability
People generally use the present state of things to predict how things will be in the future. A problem arises when they do so even though they know the present information has little predictive value or is unreliable. This happens because, if something is described favourably, a good future seems more representative of it than a bad one.
For example: Jenny is a psychology student. Her year 11 arts teacher once said she was her favourite student. She is always on time. Do you think Jenny will do well/badly on her exams? We may take this information as a good sign, but is the opinion of 1 teacher really predictive of her abilities? And is being on time really a good predictive factor of her grades?
Regression towards the mean
Any set of measurements varies around a mean. If we pick out one measurement that is quite extreme, the next measurement probably won't be so extreme. When this happens we say the data are regressing towards the mean.
This occurs in the real world too. Let's say you record your 100m running time 15 times on one day and get an average of 12 seconds. If the next day you run it in 16 seconds (an extreme score compared to your mean), your next time would probably be less than 16 seconds, and therefore closer to the mean, because an extreme score is usually followed by a less extreme one.
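Regression towards the mean can be seen in a small simulation. This is a sketch under stated assumptions: the runner's times are modelled as independent draws from a normal distribution with a mean of 12 seconds and a standard deviation of 2 seconds (both numbers are illustrative), and "extreme" is taken to mean 15 seconds or slower.

```python
import random
from statistics import mean

rng = random.Random(0)           # fixed seed so the run is repeatable
TRUE_MEAN, NOISE_SD = 12.0, 2.0  # assumed average time and run-to-run noise

# Simulate many pairs of consecutive, independent runs.
pairs = [(rng.gauss(TRUE_MEAN, NOISE_SD), rng.gauss(TRUE_MEAN, NOISE_SD))
         for _ in range(20000)]

# Keep only the pairs whose first run was extreme (15 s or slower).
extreme = [(first, second) for first, second in pairs if first >= 15.0]

mean_first = mean(first for first, _ in extreme)
mean_second = mean(second for _, second in extreme)
print(round(mean_first, 2), round(mean_second, 2))
```

The runs selected for being extreme average well above 15 seconds, but the runs that follow them average close to 12 seconds again. Nothing has pulled the runner back; the first run was simply selected for being unusual.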
The availability heuristic involves estimating the frequency of events on the basis of how easy or difficult it is to retrieve relevant information from long-term memory. The easier it is to recall an event or piece of information, the greater the probability that will be assigned to it. For example, even though you are more likely to have an accident crossing the road, plane crashes are more vivid and so come to mind more easily than road accidents. They are more available to us. However, reliance on this heuristic leads to predictable biases.
The recognition heuristic is a 'child' of the availability heuristic.
Goldstein and Gigerenzer (2002) asked German and American students questions such as the following:
Which city has a larger population? –San Antonio or San Diego? –Hamburg or Cologne?
The study found that students were more likely to be correct about which city had the larger population when the question related to cities outside their own country. So, taking San Diego as the larger city for example's sake, the German students were more likely to get this correct than the American students! Conversely, American students were more likely to pick the correct German city than the German students, illustrating a 'less is more' effect. This may seem counter-intuitive, but the students made use of the information that was available to them when making their decisions. Although the American students knew less about the two German cities, they may have heard more about "Hamburg" (in the news, movies, etc.) than "Cologne". "Hamburg" therefore came to mind more easily because of its greater availability, and in this sense the availability heuristic is mostly right and robust. In contrast, because the American students knew more about their own two cities, they tended to overthink, weighing up more factors (landmarks, their own trips there, transport systems, etc.) when considering which had the larger population. Having more information available about their own cities therefore made them less likely to choose the correct answer.
Retrievability of Instances
This is the idea that a class whose instances are more easily retrieved will be judged more numerous than an equally frequent class whose instances are less retrievable. For example, Tversky and Kahneman (1974) asked people 'what is more likely, a plane or car crash?' People were likely to answer plane crash because plane crashes receive much more publicity than car crashes and so are recalled more easily, despite air travel being one of the safest forms of travel.
Lichtenstein et al (1978) asked participants to judge the relative likelihood of causes of death. Participants thought that accidents caused about the same number of deaths as diseases. They also thought that murder was more common than suicide. In actual fact, diseases cause approximately 16 times as many deaths as accidents, and suicide is twice as frequent as homicide. The participants' error can be interpreted in terms of the availability heuristic: media coverage of murder cases tends to be extensive, especially in comparison to suicide cases. Examples of murder are therefore retrieved much more easily than examples of suicide, causing us to judge murder to be much more frequent. Similarly, an accident is a more dramatic way of dying than disease and so may be stored as salient in memory, making it more accessible during retrieval.
Biases due to effectiveness of search set
This is the idea that some instances are easier to retrieve than others because the search strategy used to find them is more effective. For example, if asked to think of words beginning or ending with 'R', it is much easier to list words that start with 'R' because we can search memory by first letter. Organising words alphabetically, as we do, makes this kind of search easy; if we organised words by their final letter, it would perhaps be easier to recall words ending in 'R'.
Chapman and Chapman (1967) found a bias, known as illusory correlation, in judging how often two events co-occur. Participants overestimated the co-occurrence of natural associates such as suspiciousness and peculiar eyes. The judgement is based on the strength of the associative bond between the two events: when the bond is strong, one concludes that the events have frequently been paired.
Kahneman and Tversky argue that the illusory correlation is a result of the availability heuristic; if two things or events come to mind together, they may be mistakenly perceived as related, even if they are only remembered together because they occurred at the same time. Though it can lead to errors it is a reasonable assumption to make. After all, things that occur together often are related.
Adjustment and Anchoring
Once people make an estimate of a probability (their anchor), they adjust it according to new evidence. People tend to make insufficient adjustments (using the information that is most available and/or representative) or to use irrelevant information, and this leads to biases.
Tversky and Kahneman (1974) asked participants to estimate the percentage of African countries in the UN. Participants first asked whether it was more or less than 10% answered 25% on average, while those first asked whether it was more or less than 65% answered 45% on average. This shows how the percentage provided at the beginning affected the final answer. This also links to the framing effect (Tversky and Kahneman, 1981), in that the way the question is posed affects the decision that is made.
Another example is when participants were asked:
1x2x3x4x5x6x7x8x9 = ? This leads participants to give a relatively low estimate, say around 500-1000.
9x8x7x6x5x4x3x2x1 = ? This leads participants to give a relatively high estimate, e.g. 3000-5000.
In fact both are the same product (362,880), but participants multiply the first few numbers, take that partial result as their anchor and then adjust from it. The descending sequence starts with large numbers, so its anchor is higher and so is the final estimate. The two orders change the size of the initial anchor and so lead to different estimates, and both estimates fall far short of the true value because the adjustment is insufficient.
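The size of the anchor in each ordering is easy to compute. The sketch below assumes, purely for illustration, that a participant multiplies the first four numbers before running out of time and extrapolating.

```python
from math import prod

ascending = list(range(1, 10))  # 1, 2, ..., 9
descending = ascending[::-1]    # 9, 8, ..., 1

# The full products are identical regardless of order.
print(prod(ascending), prod(descending))          # 362880 362880

# Partial products after four steps (an assumed stopping point)
# give very different anchors.
print(prod(ascending[:4]), prod(descending[:4]))  # 24 3024
```

An anchor of 24 versus an anchor of 3024 makes it unsurprising that the descending sequence produces the higher estimate, even though both remain far below the true answer.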
It has been found that irrelevant information can bias the initial estimate. Slovic et al (1971) found that different starting points yield different estimates, each biased towards its initial value, a phenomenon termed 'anchoring'.
Several effects of anchoring were highlighted in a study by Bar-Hillel (1973) in which subjects were given the chance to bet on one of a pair of events drawn from three types: simple events (such as choosing a red marble from a bag containing 50 red and 50 white marbles), conjunctive events (such as choosing a red marble 7 times in a row from a bag containing 90 red and 10 white marbles) and disjunctive events (such as choosing a red marble at least once in 7 goes from a bag containing 10 red and 90 white marbles). Subjects tended to prefer betting on the conjunctive event (probability of 48%) over the simple event (probability of 50%). They also preferred the simple event over the disjunctive event, even though the disjunctive event had a probability of 52%. Most subjects therefore bet on the less likely event in both comparisons. This shows that people tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events.
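The three probabilities quoted for the marble bets can be reproduced directly. This sketch assumes each marble is returned to the bag after it is drawn, so that the draws are independent.

```python
DRAWS = 7

# Simple event: one red from a bag of 50 red and 50 white.
p_simple = 50 / 100

# Conjunctive event: red on all 7 draws from 90 red and 10 white.
p_conjunctive = (90 / 100) ** DRAWS

# Disjunctive event: at least one red in 7 draws from 10 red and 90 white,
# computed as 1 minus the chance of drawing white every time.
p_disjunctive = 1 - (90 / 100) ** DRAWS

print(round(p_simple, 3), round(p_conjunctive, 3), round(p_disjunctive, 3))
# 0.5 0.478 0.522
```

The single-draw probability (0.9 or 0.1) acts as the anchor: starting from 0.9 people do not adjust down far enough for the conjunctive bet, and starting from 0.1 they do not adjust up far enough for the disjunctive one.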
Framing effects cause us to solve problems differently based on how they are presented. An example of where this occurs is the 'Asian Disease Problem' constructed by Tversky and Kahneman (1981). Participants assumed the role of a mayor of a town, who had to decide between 2 plans of action in response to a 'deadly Asian disease' expected to kill 600 people. The options were worded differently so they were framed either as a gain (how many lives would be saved) or as a loss (how many people would die).
Gain condition: Program A: 200 people will be saved. Program B: there is a 1/3 probability that 600 people will be saved, and a 2/3 probability that no one will be saved.
Loss condition: Program 1: 400 people will die. Program 2: there is a 1/3 probability that no one will die, and a 2/3 probability that everyone will die.
Notice how the two pairs of programs describe exactly the same outcomes, just worded differently!
Framing the option as a gain or a loss affected people's choices, as 72% of participants in the gain condition chose the first option to save the 200 people, whereas the majority in the loss condition chose program 2.
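That the four programs are equivalent in expectation can be checked with exact arithmetic. The sketch assumes 600 people are at risk, as the wording of Program B implies; `expected_saved` is a helper name introduced here.

```python
from fractions import Fraction

AT_RISK = 600  # total number of people in the scenario (assumed)

def expected_saved(outcomes):
    """Expected lives saved from (probability, lives_saved) pairs."""
    return sum(p * saved for p, saved in outcomes)

third, two_thirds = Fraction(1, 3), Fraction(2, 3)

# Gain frame, stated directly as lives saved.
gain_a = expected_saved([(Fraction(1), 200)])
gain_b = expected_saved([(third, 600), (two_thirds, 0)])

# Loss frame, stated as deaths and converted to lives saved.
loss_1 = AT_RISK - 400
loss_2 = expected_saved([(third, AT_RISK - 0), (two_thirds, AT_RISK - 600)])

print(gain_a, gain_b, loss_1, loss_2)  # 200 200 200 200
```

All four options save 200 people on average, so any systematic difference in choices between the two conditions must come from the wording rather than from the outcomes.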
It could be argued that the way something is presented or 'framed' is what leads humans to make systematic errors. After all, heuristics do make us smarter; perhaps framing is what prevents us from using this tool effectively in our environment. Possibly the best way to overcome this is to use natural frequencies rather than probabilities. This was highlighted in a study reported by Gigerenzer and Edwards (2002), in which doctors were asked to state (in one of two conditions, using either probabilities or natural frequencies) what percentage of women would have breast cancer. Doctors who used probabilities gave answers ranging from 1% to 90%, when the correct answer is in fact 8%; doctors who used natural frequencies gave the correct answer or were close to it.
Framing links to the study by Thibodeau and Boroditsky (2011), who argue that different metaphors affect the way people make decisions about important social issues such as crime, even when they hold a particular political view.
Prospect Theory (Kahneman and Tversky, 1979, 1984)
Prospect theory was developed in response to the traditional economic view that people will always make the most rational decision by weighing gains and losses accurately.
There are two main assumptions:
(1) Individuals identify a reference point representing their current state.
(2) Individuals are much more sensitive to potential losses than to potential gains. This explains why people are unwilling to accept a 50-50 bet unless the amount they might gain is about twice the amount they might lose. People hate losing more than they like gaining. This is called loss aversion.
As a result, if asked "Would you rather I gave you £10, or flipped a coin and gave you £20 on heads and £0 on tails?" most people will choose the £10, whereas if the question is framed as "Would you rather I took £10 from you, or flipped a coin and took £20 on heads but £0 on tails?" people will take the risk on the coin flip.
The more we gain, the less each additional gain means to us; this is called diminishing marginal utility. It means that a pay rise of £2000 means a lot to someone who earns £15,000 a year, less to someone who earns £30,000 a year, and very little to someone who earns £100,000 a year.
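The £10 versus coin-flip choices can be reproduced with a simple value function that has the two properties described above: losses loom larger than gains, and each extra pound matters less than the one before. This is a sketch, not the full theory: the power-function form and the parameter values `ALPHA` and `LAMBDA` are illustrative assumptions that do not come from this text, and the weighting of probabilities is ignored.

```python
ALPHA = 0.88   # assumed curvature: diminishing sensitivity to larger amounts
LAMBDA = 2.25  # assumed loss-aversion multiplier: losses weigh about twice as much

def value(x):
    """Subjective value of a gain or loss of x pounds relative to the reference point."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * (-x) ** ALPHA

def coin_flip(a, b):
    """Subjective value of a 50-50 gamble between outcomes a and b."""
    return 0.5 * value(a) + 0.5 * value(b)

# Gains: the sure £10 is valued above the gamble (risk aversion).
print(value(10) > coin_flip(20, 0))    # True

# Losses: the sure loss of £10 is valued below the gamble (risk seeking).
print(value(-10) < coin_flip(-20, 0))  # True
```

With these assumed parameters, a 50-50 bet to win £20 or lose £10 also has negative subjective value, in line with the observation that the possible gain has to be more than about twice the possible loss before the bet is accepted.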
Tversky, A. and Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.