Rethinking Causation in the Social Sciences
January 11, 2020
Originally published on Economics from the Top Down
Blair Fix
For the last few weeks, I’ve been thinking about causation in the social sciences. As with many instances of reflection, this was prompted by rejection. A political economy journal recently rejected a paper that I had submitted.
The paper (available here) studied the correlation between hierarchical power and income. Throughout the paper, I stressed that I was dealing with correlation only. So I thought I was safe from the ‘causal police’ (I thank Steve Dahlke for this term). But I was wrong.
Among the minutiae of the rejection letter was the following sentence that will be forever seared in my memory. “Correlation, by itself,” the reviewer informed me, “is not of scientific interest.”
I was (and am) infuriated and astonished at this response. I’m infuriated that such a foolish response could be grounds for rejecting a paper. And I’m astonished that a scientist could say such a thing.
Correlation, in my view, is the backbone of science. Yes, science is ultimately about finding causes. But to discover the cause of something, we need to start somewhere. And that somewhere is usually correlation.
If correlation is “not of scientific interest”, we need to throw out a huge part of the scientific literature. We shouldn’t publish the correlation between smoking and lung cancer. Nor should we publish the correlation between global temperature and carbon dioxide concentration. Neither of these correlations can be definitively (as in lacking any doubt) shown to be causal. So by the reviewer’s logic, these correlations are of no interest. I hope you see how ludicrous this thinking is.
Now that I’ve gotten the indignation off my chest, I promise that the rest of this post won’t be a rant about a rejected paper. Instead, I’m going to write about some reflection that was prompted by the reviewer’s remarks.
Over the last few weeks I’ve been asking myself: what do social scientists mean when they say that x causes y? I’ve concluded that social scientists don’t have a coherent definition of causation. The problem, as I see it, is that we can’t talk about causation in a complex system without drawing boundaries. Without boundaries, we get an infinite regress of cause and effect.
As social scientists, we need to rethink what we mean by ‘causation’. To be coherent, our concept of causation needs to come with a boundary that defines the limits of cause and effect.
Notating causation
Much of my reflection here is inspired by the work of Judea Pearl. In The Book of Why, Pearl introduces a convenient way to notate causation. To say that A causes B, we write:
A → B
While simple, this causal notation is a powerful tool. It allows us to visualize causation in ways that would otherwise be difficult to think about.
Here’s a more complicated example:
A → B → C
Here A causes C, but the effect is mediated by B. To explain mediation, Judea Pearl uses the example of citrus fruit and scurvy. We know that eating citrus fruit prevents scurvy. But it’s not the fruit per se that prevents scurvy. Scurvy is caused by a vitamin C deficiency. Citrus fruit prevents scurvy because it contains vitamin C. In Pearl’s language, we say that vitamin C ‘mediates’ the relation between citrus fruit and scurvy:
citrus fruit → vitamin C → prevents scurvy
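Here is a minimal sketch of what mediation looks like in data. It is an illustration only, not something taken from Pearl's book: the chain A → B → C is simulated with made-up linear relations and Gaussian noise, and the helper names (`pearson`, `partial_corr`, `simulate_chain`) are hypothetical. The idea is that A and C are correlated, but the association should largely disappear once we account for the mediator B.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly adjusting both for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

def simulate_chain(n=20000, seed=0):
    """Draw n samples from a hypothetical linear chain A -> B -> C."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [x + rng.gauss(0, 1) for x in a]  # B depends only on A (assumed)
    c = [y + rng.gauss(0, 1) for y in b]  # C depends only on B (assumed)
    return a, b, c

a, b, c = simulate_chain()
r_ac = pearson(a, c)
r_ac_given_b = partial_corr(a, c, b)
print(f"corr(A, C)     = {r_ac:.3f}")
print(f"corr(A, C | B) = {r_ac_given_b:.3f}")
```

In this toy setup, A tells us nothing more about C once B is known. That is the sense in which vitamin C, not the fruit itself, does the work in the scurvy example.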
The concept of mediation is important. Let’s look at how it applies to the social sciences.
Mediating variables in the social sciences
My discussion of mediating variables in the social sciences will focus on income, because that’s what I know best. I’ll begin with the neoclassical theory of income.
According to neoclassical economics, individual income is caused by productivity. And this productivity, in turn, is caused by human capital (a stock of skills and knowledge endowed in individuals). Putting the two together, we get the following chain of causation:
human capital → productivity → income
There are many problems with this causal chain. For starters, the mediator (productivity) doesn’t vary enough to explain differences in income. Then there’s the fact that ‘human capital’ has no coherent definition, making it hard to measure. I’ve written here about the many problems with human capital theory.
What is most shocking about human capital theory, I’ve realized, is not what the theory includes. It’s what it excludes. Human capital theory excludes the social environment! The theory assumes that an individual’s income is explained entirely by personal traits.
I hope you see the absurdity of this thinking. Humans are social animals. Our social environment pervades every aspect of our behavior, including how we distribute resources. In other words, culture mediates our behavior.
Culture as a mediator
To illustrate culture as a mediator, let’s use the example of education and income.
In the United States, individuals with more education tend to earn more money. Does this mean that education causes greater income? That’s what neoclassical economists have concluded. Educated people are more productive, economists say, so they earn more money:
education → productivity → income
Here productivity mediates the relation between education and income. But let’s consider an alternative. What if culture mediates this relation? What if it’s our ideas about education (i.e. our values) that cause income to increase with education? We’d write this cultural mediation as:
education → culture → income
It’s easy to imagine why this might be true. Suppose we take a PhD-trained scientist and drop him/her into a society of hunter-gatherers. Does the scientist continue to earn a pay premium because of his/her extensive education? Unlikely. The hunter-gatherers don’t give a damn about the scientist’s knowledge. And so the hunter-gatherers are unlikely to let the scientist take a disproportionate chunk of the resource pie.
The point is simple. If a society values education, then education will affect income. But if a society does not value education, then education won’t affect income. So it is our ideas about education that cause education to affect income.
Does this mean that if we understand the relevant cultural ideas, we completely understand the cause of income? Not really. We still have to explain where the ideas came from. In other words, what caused the culture that leads education to affect income? As we expand the scope of analysis, we get an infinite regress of causation.
Causation requires a boundary definition
To talk coherently about causation, we need a boundary that limits the concept of ‘cause’. In the social sciences, I can think of three useful boundaries.
Boundary 1: Culture as a given
The simplest boundary is to exclude culture from our analysis. For instance, we might say that in the United States in 1970, an additional year of education ‘caused’ individual income to increase by 10%, on average.
Under this causal boundary, we take the cultural context (of the US in 1970) for granted. Although it’s rarely stated, this is the causal boundary used by most social scientists. And it’s easy to see why. When you take culture for granted, it’s easy to infer causation using regressions. Run some multivariate regressions, control for some ‘confounders’, and bam … you show that education ‘causes’ income to increase.
I have no problem with this limited notion of causation, as long as the boundaries are stated explicitly. However, they rarely are. Worse, social scientists often use inferences from a limited causal boundary to make sweeping generalizations.
When we take culture as a given, we need to report causation as follows: “x causes y, given culture z.” But all too often, the “given culture z” gets dropped, leaving the seemingly universal “x causes y”.
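To see why the “given culture z” matters, here is a small sketch with entirely made-up data. Two hypothetical societies are simulated: one in which each year of schooling adds about 0.10 to log income (roughly the 10% figure used above), and one in which schooling is not rewarded at all. The function names and all parameters are assumptions chosen for illustration, not estimates from any real dataset.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate_society(value_of_education, n=5000, seed=1):
    """Hypothetical society in which log income rises with schooling only
    to the extent that the culture rewards it (value_of_education)."""
    rng = random.Random(seed)
    years = [rng.uniform(8, 20) for _ in range(n)]
    log_income = [10 + value_of_education * y + rng.gauss(0, 0.5)
                  for y in years]
    return years, log_income

# Two made-up cultural contexts: one rewards schooling, one does not.
slopes = {}
for name, value in [("culture_z1", 0.10), ("culture_z2", 0.0)]:
    years, log_income = simulate_society(value)
    slopes[name] = ols_slope(years, log_income)
    print(f"{name}: estimated slope of log income on schooling = "
          f"{slopes[name]:.3f}")
```

The same regression gives a different answer in each society. A slope estimated inside one of them is a statement about “x causes y, given culture z”, and nothing more.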
Boundary 2: Culture as a mediator
Let’s expand our boundary of causation. Now let’s admit that our ideas affect our actions. When we admit that culture mediates behavior, we no longer say that education ‘causes’ greater income. Now there’s an extra step: culture causes education to affect income.
But not only do we add an extra step to the causal chain, we also make the analysis more expansive. To study the effect of culture, we need to look at many different social settings (not just one). For instance, the returns to education vary greatly between countries. Why? What aspect of culture explains this variation?
This is a difficult question to untangle, which is why few social scientists include culture in the boundary of causation. (They may allude to culture in the discussion, but it’s rarely part of the regression table.)
Boundary 3: Culture is to be explained
Supposing that we understand how culture affects our behaviour, the next step is to explain culture itself. When we use this causal boundary, we no longer say that education ‘causes’ greater income. And we no longer say that culture causes education to affect income. Instead, we say that phenomenon x causes the culture that leads education to affect income. (You can see how tortuous this gets.)
This is obviously the most difficult causal boundary. Genetic or geographic determinism aside, there is no simple way to explain culture. In fact, it’s not even clear that culture has an external explanation. Culture can be its own cause! In other words, there is a feedback between ideas and behavior that is impossible to untangle into linear causation.
Causal boundaries and linear causation
My experience is that many social scientists are deeply wedded to a linear concept of causation. In causal notation, linear causation is a unidirectional arrow. A causes B:
A → B
Those trained in system dynamics, however, know that causation can involve feedback. A causes B, and B causes A:
A ↔ B
Here’s a general principle that I’m going to throw out there. Linear causation becomes less and less useful as we expand our boundary of causation.
To illustrate, let’s return to education and income. If we exclude culture, it’s fine to say that education causes greater income. This is linear causation at its simplest. But once we include culture in the boundary of causation, things become less simple. Ideas affect our behavior. But our behavior also affects our ideas. In the most expansive boundary for causation (when culture is to be explained) we admit that there is a complex feedback between ideas and behavior. And so linear notions of causation become almost useless.
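The difference between one-way causation and feedback can be sketched with a toy model. What follows is an assumption-laden illustration, not a model of any real society: ‘ideas’ and ‘behavior’ are reduced to two numbers, and each one is updated from the previous values of both using made-up coefficients.

```python
def simulate_feedback(ideas0, behavior0, steps=5,
                      ideas_to_behavior=0.5, behavior_to_ideas=0.5,
                      persistence=0.4):
    """Iterate a hypothetical two-variable linear feedback loop."""
    ideas, behavior = ideas0, behavior0
    history = [(ideas, behavior)]
    for _ in range(steps):
        # Both updates use the previous step's values (simultaneous update).
        ideas, behavior = (
            persistence * ideas + behavior_to_ideas * behavior,
            persistence * behavior + ideas_to_behavior * ideas,
        )
        history.append((ideas, behavior))
    return history

# With feedback, a shock to either variable spreads to the other.
shock_ideas = simulate_feedback(ideas0=1.0, behavior0=0.0)
shock_behavior = simulate_feedback(ideas0=0.0, behavior0=1.0)

# Linear causation (ideas -> behavior only): a shock to behavior
# never reaches ideas.
one_way = simulate_feedback(ideas0=0.0, behavior0=1.0,
                            behavior_to_ideas=0.0)

for label, run in [("shock to ideas", shock_ideas),
                   ("shock to behavior", shock_behavior),
                   ("one-way, shock to behavior", one_way)]:
    print(label, [(round(i, 3), round(b, 3)) for i, b in run])
```

In the one-way run, ‘ideas’ stays at zero no matter what happens to ‘behavior’, so a single arrow describes the system. In the two feedback runs, neither variable can be singled out as the cause of the other.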
Lessons?
I’ve learned a few lessons from this reflection on causation. First, the longer I think about causation, the more confused I become. Second, I think that training in the philosophy of causation needs to be a basic part of science education. And no, I don’t mean how to infer causation from regression tables. I mean that science education should tackle the big questions about causation. Judea Pearl tackles these questions, and I highly recommend you read his work.
OK, let’s conclude this reflection. Although I’ve confused myself while writing this post, I think there’s at least one cogent lesson. The next time a social scientist tells me that x causes y, I’m going to ask them: “what is your boundary of causation?”