Is Human Probability Intuition Actually ‘Biased’?

February 22, 2023

Originally published at Economics from the Top Down

Blair Fix

According to behavioral economics, most human decisions are mired in ‘bias’. It muddles our actions from the mundane to the monumental. Human behavior, it seems, is hopelessly subpar.1

Or is it?

You see, the way that behavioral economists define ‘bias’ is rather peculiar. It involves 4 steps:

  1. Start with the model of the rational, utility-maximizing individual — a model known to be false;
  2. Re-falsify this model by showing that it doesn’t explain human behavior;
  3. Keep the model and label the deviant behavior a ‘bias’;
  4. Let the list of ‘biases’ grow.

Jason Collins (an economist himself) thinks this bias-finding enterprise is weird. In his essay ‘Please, Not Another Bias!’, Collins likens the proliferation of ‘biases’ to the accumulation of epicycles in medieval astronomy. Convinced that the Earth was the center of the universe, pre-Copernican astronomers explained the (seemingly) complex motion of the planets by adding ‘epicycles’ to their orbits — endless circles within circles. Similarly, when economists observe behavior that doesn’t fit their model, they add a ‘bias’ to their list.2

The accumulation of ‘biases’, Collins argues, is a sign that science is headed down the wrong track. What scientists should do instead is actually explain human behavior. To do that, Collins proposes, you need to start with human evolution.

The ‘goal’ of evolution is not to produce rational behavior. Evolution produces behavior that works — behavior that allows organisms to survive. If rationality does evolve, it is a tool to this end. On that front, conscious reasoning appears to be the exception in the animal kingdom. Most animals survive using instinct.

That brings me to the topic of this essay: the human instinct for probability. By most accounts, this instinct is terrible. And that should strike you as odd. As a rule, evolution does not produce glaring flaws. (It slowly removes them.) So if you see flaws everywhere, it’s a good sign that you’re observing an organism in a foreign environment, a place to which it is not adapted.

When it comes to probability, I argue that humans now live in a foreign environment. But it is of our own creation. Our intuition, I propose, was shaped by observing probability in short samples — the information gleaned from a single human lifetime. But with the tools of mathematics, we now see probability as what happens in the infinite long run. It’s in this foreign mathematical environment that our intuition now lives.

Unsurprisingly, when we compare our intuition to our mathematics, we find a mismatch. But that doesn’t mean our intuition is wrong. Perhaps it is just solving a different problem — one not usually posed by mathematics. Our intuition, I hypothesize, is designed to predict probability in the short run. And on that front, it may be surprisingly accurate.

‘Bias’ in an evolutionary context

As a rule, evolutionary biologists don’t look for ‘bias’ in animal behavior. That’s because they assume that organisms have evolved to fit their environment. When flaws do appear, it’s usually because the organism is in a foreign place — an environment where its adaptations have become liabilities.3

As an example, take a deer’s tendency to freeze when struck by headlights. This suicidal flaw is visible because the deer lives in a foreign environment. Deer evolved to have excellent night vision in a world without steel death machines attached to spotlights. In this world, the transition from light to dark happened slowly, so there was no need for fast pupil reflexes. Nor was there a need to flee from bright light. The evolutionary result is that when struck by light, deer freeze until their eyes adjust. It’s a perfectly good behavior … in a world without cars. In the industrial world, it’s a fatal flaw.

Back to humans and our ‘flawed’ intuition for probability. I suspect that many apparent ‘biases’ in our probability intuition stem from a change in our social environment, a change in the way we view ‘chance’. But before I discuss this idea, let’s review a widely known ‘flaw’ in our probability intuition — something called the gambler’s fallacy.

The gambler’s fallacy

On August 18, 1913, a group of gamblers at the Monte Carlo Casino lost their shirts. It happened at a roulette table, which had racked up a conspicuous streak of blacks. As the streak grew longer, the gamblers became convinced that red was ‘due’. And yet, with each new spin they were wrong. The streak finally ended after 26 blacks in a row. By then, nearly everyone had gone broke.

These poor folks fell victim to what we now call the gambler’s fallacy: the belief that if an event has happened more frequently than normal in the past, it is less likely to happen in the future. It is a ‘fallacy’ because in games like roulette, each event is ‘independent’. It doesn’t matter if a roulette ball landed on black 25 times in a row. On the next spin, the probability of landing on black remains the same (18/37 on a European wheel, or 18/38 on an American wheel).

Many gamblers know that roulette outcomes are independent events, meaning the past cannot affect the future. And yet their intuition consistently tells them the opposite. Gamblers at the Monte Carlo Casino had an overwhelming feeling that after 25 blacks, the ball had to land on red.

The mathematics tell us that this intuition is wrong. So why would evolution give us such a faulty sense of probability?

Games of chance as a foreign environment

It is in ‘games of chance’ (like roulette) that flaws in our probability intuition are most apparent. Curiously, it is in these same games where the mathematics of probability are best understood. I doubt this is a coincidence.

Let’s start with our intuition. Games of chance are to humans what headlights are to deer: a foreign environment to which we’re not adapted. As such, these games reveal ‘flaws’ in our probability intuition. The Monte Carlo gamblers who lost their shirts betting on red were the equivalent of deer in headlights, misled by their instinct.

And yet unlike deer, we recognize our flaws. We know that our instinct misguides us because we’ve developed formal tools for understanding probability. Importantly, these tools were forged in the very place where our intuition is faulty — by studying games of chance.

It was a gambler’s dispute in 1654 that led Blaise Pascal and Pierre de Fermat to first formalize the mathematics of probability. A few years later, Christiaan Huygens published a book on probability called De Ratiociniis in Ludo Aleae (‘the value of all chances in games of fortune’). The rules of probability were then further developed by Jakob Bernoulli and Abraham de Moivre, who again focused mostly on games of chance. Today, the same games remain the basis of probability pedagogy: the domain where students learn how to calculate probabilities and discover that their intuition is wrong.

Why did we develop the mathematics of probability in the place where our intuition most misleads? My guess is that it’s because games of chance are at once foreign yet controlled. In evolutionary terms, games of chance are a foreign environment — something we did not evolve to play. But in scientific terms, these games are an environment that we control. Why? Because we designed the game.

By studying games we designed, we get what I call a ‘god’s eye view’ of probability. We know, for instance, that the probability of drawing an ace out of a deck of cards is 1 in 13. We know this because we designed the cards to have this probability. It is ‘innate’ in the design.

When we are not the designers, the god’s eye view of probability is inaccessible. To see this fact, ask yourself — what is the ‘innate’ probability of rain on a Tuesday? It’s a question that is unanswerable. All we can do is observe that on previous Tuesdays, it rained 20% of the time. Is this ‘observed’ probability of rain the same as the ‘innate’ probability? No one knows. The ‘innate’ probability of rain is forever unobservable.

Because games of chance are an environment that is both controlled and foreign, they are a fertile place for understanding our probability intuition. As designers, we have a god’s eye view of the game, meaning we know the innate probability of different events. But as players, we’re beholden to our intuition, which knows nothing of innate probability.

This disconnect is important. As game designers, we start with innate probability and deduce the behavior of the game. But with intuition, all we have are observations, from which we must develop a sense for probability.

Here’s the crux of the problem. To get an accurate sense for innate probability, you need an absurdly large number of observations. And yet humans typically observe probability in short windows. This mismatch may be why our intuition appears wrong. It’s been shaped to predict probability within small samples.

Innate probability, show your face

When you flip a coin, the chance of heads or tails is 50–50.

So begins nearly every introduction to the mathematics of probability. What we have here is not a statement of fact, but an assumption. Because we design coins to be balanced, we assume that heads and tails are equally likely. From there, we deduce the behavior of the coin. The mathematics tell us that over the long run, innate probability will show its face as a 50–50 balance between heads and tails.

The trouble is, this ‘long run’ is impossibly long.

How many coin tosses have you observed in your life? A few hundred? A few thousand? Likely not enough to accurately judge the ‘innate’ probability of a coin.

To see this fact, start with Table 1, which shows 10 tosses of a simulated coin. For each toss, I record the cumulative number of heads, and then divide by the toss number to calculate the ‘observed’ probability of heads. As expected, this probability jumps around wildly. (This jumpiness is what makes tossing a coin fun. In the short run, it is unpredictable.)

Table 1: A simulated coin toss

Toss   Outcome   Cumulative number of heads   Observed probability of heads (%)
1      T         0                             0.0
2      T         0                             0.0
3      H         1                            33.3
4      H         2                            50.0
5      T         2                            40.0
6      H         3                            50.0
7      H         4                            57.1
8      H         5                            62.5
9      H         6                            66.7
10     T         6                            60.0
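
The bookkeeping behind Table 1 can be sketched in a few lines of R. This is a minimal sketch, not the code that produced the table: the seed and the variable names (tosses, cum_heads, obs_prob, table_1) are illustrative assumptions, so the outcomes it prints will differ from the ones shown above.

```r
# Illustrative sketch: the seed and all variable names are assumptions,
# not the code that produced Table 1.
set.seed(1)                          # arbitrary seed, only for reproducibility

n_toss    <- 10
tosses    <- round(runif(n_toss))    # 0 = tails, 1 = heads
cum_heads <- cumsum(tosses)          # cumulative number of heads
obs_prob  <- 100 * cum_heads / seq_len(n_toss)   # observed probability of heads (%)

table_1 <- data.frame(
  toss      = seq_len(n_toss),
  outcome   = ifelse(tosses == 1, "H", "T"),
  cum_heads = cum_heads,
  obs_prob  = round(obs_prob, 1)
)
print(table_1)
```

Raising n_toss and plotting obs_prob against the toss number gives the kind of curve discussed next.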

In the long run, this jumpiness should go away and the ‘observed’ probability of heads should converge to the ‘innate’ probability of 50%. But it takes a surprisingly long time to do so.

Figure 1 illustrates. Here I extend my coin simulation from 10 tosses to over 100,000 tosses. The red line is the coin’s ‘innate’ probability of heads (50%). This probability is embedded in my simulation code, but is accessible only to me, the simulation ‘god’. Observers know only the coin’s behavior — the ‘observed’ probability of heads shown by the blue line.

Figure 1: In search of ‘innate’ probability. I’ve plotted here the results of a simulated coin toss. The blue line shows the ‘observed’ probability of heads after the respective number of tosses. The red line shows the ‘innate’ probability of heads (50%), which is embedded in the simulation code but inaccessible to observers.

Here’s what Figure 1 tells us. If observers see a few hundred tosses of the coin, they will deduce the wrong probability of heads. (The coin’s ‘observed’ probability will be different from its ‘innate’ probability.) Even after a few thousand tosses, observers will be misled. In this simulation, it takes about 100,000 tosses before the ‘observed’ probability converges (with reasonable accuracy) to the ‘innate’ probability.4
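
One way to see why convergence is so slow is the standard error of an observed proportion, sqrt(p(1 − p)/n). This is a textbook result rather than something taken from the simulation. The sketch below evaluates it for a balanced coin; the toss counts and variable names are illustrative assumptions.

```r
# Typical size of the gap between 'observed' and 'innate' probability
# for a balanced coin, using the textbook formula sqrt(p * (1 - p) / n).
# The toss counts below are illustrative choices.
p       <- 0.5
n_toss  <- c(100, 1000, 10000, 100000)
std_err <- sqrt(p * (1 - p) / n_toss)

# Express the typical gap in percentage points
data.frame(tosses = n_toss, typical_gap_pct = round(100 * std_err, 2))
```

Because the gap shrinks with the square root of the number of tosses, cutting it by a factor of 10 takes 100 times as many tosses.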

Few people observe 100,000 tosses of a real coin. And that means their experience can mislead. They may conclude that a coin is ‘biased’ when it is actually not. Nassim Nicholas Taleb calls this mistake getting ‘fooled by randomness’.

Not only do we fool ourselves today, I suspect that we fooled ourselves repeatedly as we evolved. Before we designed our own games of chance, the god’s eye view of probability was inaccessible. All we had were observations of real-world outcomes, which could easily mislead.

For outcomes that were frequent, we could develop an accurate intuition. We are excellent, for instance, at using facial expressions to judge emotions — obviously because such judgment is a ubiquitous part of social life. But for outcomes that were rare (things like droughts and floods), patterns would be nearly impossible to see. The result, it seems, was not intuition but superstition. The worse our sense for god’s-eye probability, the more we appealed to the gods.

When coins have ‘memory’

Even when we know the god’s-eye probability, we find it difficult to suppress our intuition. Take the gambler’s fallacy, whereby we judge independent events based on the past. When a coin lands repeatedly on heads, we feel like tails is ‘due’. And yet logic tells us that this feeling is false. Each toss of the coin is an independent event, meaning past outcomes cannot affect the future. So why do we project ‘memory’ onto something that has none?

One reason may be that when we play games of chance, we are putting ourselves in a foreign environment, much like deer in headlights. As a social species, our most significant interactions are with things that do have a memory (i.e. other humans). So a good rule of thumb may be to project memory onto everything with which we interact. Sure, this intuition can be wrong. But if the costs of falsely projecting memory (onto a coin toss, for instance) are less than the costs of falsely not projecting memory (onto your human enemies, for example), this rule of thumb would be useful. Hence it could evolve as an intuition.

This explanation for our flawed intuition is well trodden. But there is another possibility that has received less attention. It could be that our probability intuition is not actually flawed, but is instead a correct interpretation of the evidence … as we see it.

Remember that our intuition has no access to the god’s eye view of ‘innate’ probability. Our intuition evolved based only on what our ancestors observed. What’s important is that humans typically observe probability in short windows. (For instance, we watch a few dozen tosses of a coin.) Interestingly, over these short windows, independent random events do have a memory. Or so it appears.

In his article ‘Aren’t we smart, fellow behavioural scientists’, Jason Collins shows you how to give a coin a ‘memory’. Just toss it 3 times and watch what follows a heads. Repeat this experiment over and over, and you’ll conclude that the coin has a memory. After a heads, the coin is more likely to return a tails.

To convince yourself that this is true, look at Table 2. The left-hand column shows all the possible outcomes for 3 tosses of a coin. For each outcome, the right-hand column shows the probability of getting tails after heads.

Table 2: The probability of tails after heads when tossing a coin 3 times
Outcome   Probability of tails after heads
HHH       0%
HHT       50%
HTH       100%
HTT       100%
THH       0%
THT       100%
TTH       undefined (no heads in the first two tosses)
TTT       undefined (no heads in the first two tosses)

Expected probability of tails after heads: 58%

Modeled after Jason Collins’ table in Aren’t we smart, fellow behavioural scientists.

To understand the numbers, let’s work through some examples:

  • In the first row of Table 2 we have HHH. There are no tails, so the probability of tails after heads is 0%.
  • In the second row we have HHT. One of the heads is followed by a tails, the other is not. So the probability of tails after heads is 50%.

We keep going like this until we’ve covered all possible outcomes.

To find the expected probability of tails after heads, we average over all the outcomes where heads occurred in the first two flips. (That means we exclude the last two outcomes.) The resulting probability of tails after heads is:

\displaystyle \begin{aligned}
P(T ~|~ H) &= \frac{0\% + 50\% + 100\% + 100\% + 0\% + 100\%}{6} \\
&= \frac{350\%}{6} \\
&\approx 58\%
\end{aligned}

When tossed 3 times, our coin appears to have a memory! It ‘remembers’ when it lands on heads, and endows the next toss with a higher chance of tails. Or so it would appear if you ran this 3-toss experiment many times.
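
The arithmetic can be checked by brute force. The following is a minimal sketch that enumerates every outcome of 3 tosses and averages the per-outcome probability of tails after heads; the function and variable names are illustrative assumptions, not taken from Collins or from the simulation code.

```r
# Enumerate all 2^3 outcomes of 3 tosses (1 = heads, 0 = tails)
outcomes <- as.matrix(expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1))

# For one outcome: the share of heads (in tosses 1 and 2) that are followed by tails.
# Returns NA when neither of the first two tosses is heads.
share_tails_after_heads <- function(x) {
  prev <- x[-length(x)]   # tosses 1 and 2
  nxt  <- x[-1]           # tosses 2 and 3
  if (!any(prev == 1)) return(NA_real_)
  mean(nxt[prev == 1] == 0)
}

p_by_outcome <- apply(outcomes, 1, share_tails_after_heads)

# Average over the 6 outcomes with at least one heads in the first two tosses
expected_p <- mean(p_by_outcome, na.rm = TRUE)
expected_p   # 7/12, about 0.583
```

Note that the average is taken over outcomes, not over individual tosses. Pooling every heads across all eight outcomes instead would give exactly 50%.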

The evidence would look something like the blue line in Figure 2. This is the observed probability of getting tails following heads in a simulated coin toss. Each iteration (horizontal axis) represents 3 tosses of the coin. The vertical axis shows the cumulative probability of tails after heads as we repeat the experiment. After a few thousand iterations, the coin’s preference for tails becomes unmistakable.

Figure 2: When tossed 3 times, a simulated coin favors tails after heads. I’ve plotted data from a simulation in which I repeatedly toss a balanced coin 3 times. The blue line shows the observed probability that tails follows heads. The horizontal axis shows the number of times I’ve repeated the experiment.

The data shouts at us that the coin has a ‘memory’. Yet we know this is impossible. What’s happening?

The coin’s apparent ‘memory’ is actually an artifact of our observation window of 3 tosses. As we lengthen this window, the coin’s memory disappears. Figure 3 shows what the evidence would look like. Here I again observe the probability of tails after heads during a simulated coin toss. But this time I change how many times I flip the coin. For an observation window of 5 tosses (red), tails bias remains strong. But when I increase the observation window to 10 tosses (green), tails bias decreases. And for a window of 100 tosses (blue), the coin’s ‘memory’ is all but gone.

Figure 3: Favoritism for tails (after heads) disappears as the observation window lengthens. I’ve plotted data from a simulation in which I repeatedly toss a balanced coin n times and measure the probability of tails after heads. As the observation window n increases (from 5 to 10 to 100 tosses), tails favoritism decreases. The vertical axis shows the cumulative probability of tails after heads. The horizontal axis shows the number of times I’ve repeated the experiment.

Here’s the take-home message. If you flip a coin a few times (and do this repeatedly), the evidence will suggest that the coin has a ‘memory’. Increase your observation window, though, and the ‘memory’ will disappear.
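
A rough way to try this yourself is sketched below. It is not the simulation behind Figures 2 and 3: the function names, the seed, and the iteration counts are illustrative assumptions, chosen only to keep the run time short.

```r
# Sketch only: function names, seed, and iteration counts are illustrative assumptions.
set.seed(42)

# Share of heads followed by tails within one window of n tosses
# (NA if no heads occurs before the final toss).
window_tails_after_heads <- function(n) {
  x    <- round(runif(n))        # 0 = tails, 1 = heads
  prev <- x[-n]                  # every toss except the last
  nxt  <- x[-1]                  # the toss that follows each of them
  if (!any(prev == 1)) return(NA_real_)
  mean(nxt[prev == 1] == 0)
}

# Average the per-window probability over many repetitions of the experiment
avg_p <- function(n, iterations) {
  mean(replicate(iterations, window_tails_after_heads(n)), na.rm = TRUE)
}

p_3   <- avg_p(3, 20000)    # short observation window
p_100 <- avg_p(100, 2000)   # long observation window
c(window_3 = p_3, window_100 = p_100)
```

With a 3-toss window the average should land near the 58% derived in Table 2, while the 100-toss window should sit much closer to 50%.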

The example above shows the coin’s apparent memory after a single heads. But what if we lengthen the run of heads? Then the coin’s memory becomes more difficult to wipe. Figure 4 illustrates.

Figure 4: Wiping a coin’s ‘memory’. I’ve plotted here the results of a simulated coin toss in which I measure the probability of getting tails after a run of heads. Each panel shows a different sized run (from top to bottom: 3, 5, 10, and 15 heads in a row). The horizontal axis shows the number of tosses observed. The vertical axis shows the observed probability (the average outcome over many iterations) of getting tails after the corresponding run of heads. The longer the run of heads, the more tosses you need to remove the coin’s apparent preference for tails.

Here’s how to interpret the data in Figure 4. When the observed probability of tails exceeds 50%, the coin appears to have a ‘memory’. As the observation window increases, this memory slowly disappears, and the observed probability eventually converges to the innate probability of 50%. But how long this convergence takes depends on the length of the run of heads. The longer the run, the larger the observation window needed to wipe the coin’s memory.

For a run of 3 heads (top panel), it takes a window of about 1000 tosses to purge the preference for tails. For 5 heads in a row (second panel), it takes a 10,000-toss window to purge tails favoritism. For 10 heads in a row (third panel), the memory purge requires a window of 100,000 tosses. And for 15 heads in a row (bottom panel), tails favoritism remains up to a window of 1 million tosses.

The corollary is that when we look at a short observation window, the evidence shouts at us that the coin has a ‘memory’. After a run of heads, the coin ‘prefers’ tails. The data tells us so!
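
The same experiment can be sketched for runs of heads. The code below is a small-scale illustration in R, not the C++ code used for Figure 4: the function names, seed, window sizes, and iteration count are all assumptions made for the example.

```r
# Sketch only: names, seed, window sizes, and iteration count are illustrative assumptions.
set.seed(7)

# Within one window of n tosses, the share of k-heads runs that are followed by tails.
# Returns NA if the window contains no run of k heads before its final toss.
tails_after_run <- function(n, k) {
  x <- round(runif(n))                    # 0 = tails, 1 = heads
  m <- embed(x, k + 1)                    # each row: x[t], x[t-1], ..., x[t-k]
  after_run <- rowSums(m[, -1, drop = FALSE]) == k   # previous k tosses all heads
  if (!any(after_run)) return(NA_real_)
  mean(m[after_run, 1] == 0)
}

# Average the per-window probability over many repetitions
avg_after_run <- function(n, k, iterations = 2000) {
  mean(replicate(iterations, tails_after_run(n, k)), na.rm = TRUE)
}

p_short <- avg_after_run(n = 20,   k = 3)   # short window: tails looks 'due'
p_long  <- avg_after_run(n = 2000, k = 3)   # long window: close to 50%
c(window_20 = p_short, window_2000 = p_long)
```

Increasing k while holding n fixed should push the observed probability of tails further above 50%, in line with the panels of Figure 4. Longer runs also become rare in short windows, so more iterations are needed for a stable average.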

Playing god with AI

With the above evidence in mind, imagine that we play god. We design an artificial intelligence that repeatedly observes the outcome of a coin toss and learns to predict probability.

Here’s the catch.

Every so often we force the AI to reboot. As it restarts, the machine’s parameters (its ‘intuition’) remain safe. But its record of the coin’s behavior is purged. This periodic reboot forces our AI to understand the coin’s probability by looking at short windows of observation. The AI never sees more than a few thousand tosses in a row.

We let the AI run for a few months. Then we open it up and look at its ‘intuition’. Lo and behold, we find that after a long run of heads, the machine has an overwhelming sense that tails is ‘due’. To the machine, the coin has a memory.

The programmers chide the AI for its flawed intuition. ‘Silly machine,’ they say. ‘Coins have no memory. Each toss is an independent random event. Your intuition is flawed.’

Then a programmer looks at the data that the machine was fed. And she realizes that the machine’s intuition is actually accurate. The AI is predicting probability not for an infinite number of tosses (where ‘innate’ probability shows its face), but for a small number of tosses. And there, she finds, the machine is spectacularly accurate. When the sample size is small, assuming the coin has a memory is a good way to make predictions.

The AI machine, you can see, is a metaphor for human intuition. Because our lives are finite, humans are forced to observe probability in short windows. When we die, the raw data gets lost. But our sense for the data gets passed on to the next generation.5 Over time, an ‘intuition’ for probability evolves. But like the AI, it is an intuition shaped by observing short windows. And so we (like the AI) feel that independent random events have memory.

Correct intuition … wrong environment

Let’s return to the idea that our probability intuition is ‘biased’. In economics, ‘bias’ is judged by comparing human behavior to the ideal of the rational utility maximizer. When we make this comparison, we find ‘bias’ everywhere.

From an evolutionary perspective, this labelling makes little sense. An organism’s ‘bias’ should be judged in relation to its evolutionary environment. Otherwise you make silly conclusions — such as that fish have a ‘bias’ for living in water, or humans have a ‘bias’ for breathing air.

So what is the evolutionary context of our probability intuition? It is random events viewed through a limited window — the length of a human life. In this context, it’s not clear that our probability intuition is actually biased.

Yes, we tend to project ‘memory’ onto random events that are actually independent. And yet when the sample size is small, projecting memory on these events is actually a good way to make predictions. I’ve used the example of a coin’s apparent ‘memory’ after a run of heads. But the same principle holds for any independent random event. If the observation window is small, the random process will appear to have a memory.

When behavioral economists conclude that our probability intuition is ‘biased’, they assume that its purpose is to understand the god’s eye view of innate probability — the behavior that emerges after a large number of observations. But that’s not the case. Our intuition, I argue, is designed to predict probability as we observe it … in small samples.

In this light, our probability intuition may not actually be biased. Rather, by asking our intuition to understand the god’s eye view of probability, we are putting it in a foreign environment. We effectively make ourselves the deer in headlights.

Simulating a coin toss

With modern software like R, it’s easy to simulate a coin toss. Here, for instance, is R code to generate a random series of 1000 tosses:

coin_toss = round( runif(1000) )

Let’s break it down. The runif function generates random numbers that are uniformly distributed between some lower and upper bound. The default bounds (which go unstated) are 0 and 1 … just what we need. Here, I’ve asked runif to generate 1000 random numbers between 0 and 1. I then use the round function to round these numbers to the nearest whole number. The result is a random series of 0’s and 1’s. Let 0 be tails and 1 be heads. Presto, you have a simulated coin toss. The results look like this:

 [1] 0 0 1 1 0 1 1 1 1 0 …

Modern computers are so incredibly fast that you can simulate millions of coin tosses in a fraction of a second. The more intensive part, however, is counting the results.

Suppose that we want to measure the probability of tails following 2 heads. In R, the best method I’ve found is to first convert the coin toss vector to a string of characters, and then use the stringr package to count the occurrence of different events.

First, we use the paste function to convert our coin_toss vector to a single character string:

coin_string = paste(coin_toss, collapse="")

That gives you a character string of 0’s and 1’s:

[1] "0011011110…"

Now suppose we want to find the probability that 2 heads are followed by tails. We start by counting the occurrences of HHH. In our binary system, that’s 111. We use the str_count function to count the occurrences of 111:


library(stringr)  # provides str_count
# count occurrence of heads after 2 heads (the lookahead lets overlapping matches count)
n_heads = str_count(coin_string, paste0("(?=","111",")"))

Next we count the occurrences of HHT. In our binary system, that’s 110:

# count occurrence of tails after 2 heads
n_tails = str_count(coin_string, paste0("(?=","110",")"))

The observed probability of tails following 2 heads is then:

p_tails = n_tails / (n_tails + n_heads)
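
As a cross-check that does not depend on stringr, the same counts can be obtained with plain vector indexing. This is a minimal sketch under the assumption that the series is the ten tosses printed earlier; the helper names (i, two_heads, next_toss) are illustrative.

```r
# Fixed example series (the ten tosses printed earlier): 0 = tails, 1 = heads
coin_toss <- c(0, 0, 1, 1, 0, 1, 1, 1, 1, 0)

n <- length(coin_toss)
i <- seq_len(n - 2)

# Positions where tosses i and i + 1 are both heads
two_heads <- coin_toss[i] == 1 & coin_toss[i + 1] == 1

# The toss that follows each pair of heads
next_toss <- coin_toss[i + 2][two_heads]

n_heads <- sum(next_toss == 1)   # overlapping occurrences of 111
n_tails <- sum(next_toss == 0)   # overlapping occurrences of 110
p_tails <- n_tails / (n_tails + n_heads)
p_tails
```

For this short series there are two occurrences each of 111 and 110, so the observed probability of tails after 2 heads is 0.5.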

If you’re simulating the coin toss series once, the code above will do the job. (You can download it here.) But if you want to run the simulation repeatedly (to measure the average probability across many iterations), you’ll need another tool.

To create the data shown in Figure 4, I wrote C++ code to simulate a coin toss and count the occurrence of different outcomes. You can download the code at GitHub. I simulated each coin-toss window 40,000 times and then measured the average probability across all iterations.


[Cover image: Pixabay]

  1. Here’s how novelist Cory Doctorow summarizes the behavioral economics ‘revolution’:

    Tellingly, the most exciting development in economics of the past 50 years is “behavioral economics” – a subdiscipline whose (excellent) innovation was to check to see whether people actually act the way that economists’ models predict they will.

    (they don’t)


  2. What behavioral economists are doing is essentially falsifying (over and over) the core neoclassical model of human behavior. To understand the response, it’s instructive to look at what happened in other fields when core models have failed.

    Take physics. In the late 19th century, most physicists thought that light traveled through an invisible substance called ‘aether’, a kind of background fabric that permeated all of space. Although invisible, the aether had a simple consequence for how light ought to behave. Since the Earth presumably traveled through the aether as it orbited the Sun, the speed of light on Earth ought to vary by direction.

    In 1887, Albert Michelson and Edward Morley went looking for this directional variation in the speed of light. They found no evidence for it. Instead, light appeared to have constant speed in all directions. Confusion ensued.

    In 1905, Albert Einstein resolved the problem with his theory of relativity. Einstein assumed that light needed no transmission medium, and that its speed was a universal constant for all observers. After Einstein, physicists abandoned the idea of ‘aether’ and moved on to better theories.

    In economics, the response to falsifying evidence has been quite different. Instead of abandoning their rational model of man, economists ensconced it as a kind of ‘human aether’: an invisible template used to judge how humans ought to behave. When humans don’t behave as the model predicts, economists label the behavior a ‘bias’.

  3. Seemingly ‘flawed’ behavior can also signal that the organism isn’t controlling its own actions. The parasite Toxoplasma gondii, for instance, turns mice into cat-seeking robots. That’s suicidal for the mouse, but good for the parasite, which needs to get inside a cat to reproduce.
  4. The mathematics tell us that ‘true’ convergence takes infinitely long. That is, only in the limit of infinitely many tosses is the observed probability of heads guaranteed to settle at exactly 50%. In any finite sample, the observed probability of heads will typically differ (however slightly) from the innate probability. For example, after 100,000 tosses my simulated coin returns heads at a rate of 49.992%, a deviation of 0.008 percentage points from the innate probability.
  5. OK, this is an oversimplification. What actually happens is that a person with intuition x reproduces more than the person with intuition y. And so intuition x spreads and evolves.