17 March 2016

The (Lack of) Usefulness of Empirical Evidence in Economics

My argument with Jason Smith about the philosophy of economics has become rather spread out and hard to follow, so I thought I would compile it into one blog post.

First, let's start where I did with my previous post. Suppose there is some effect $y$ that is caused by $x,z,...$. This can be written as
$$y = f(x,z,...)$$
I hypothesize that $y = g(x,u)$. How do I test this hypothesis? I must hold every variable other than $x$ and $u$ fixed and then change one of the independent variables ($x$ or $u$) at a time. If I do this, then I will be able to determine whether $g(\bullet) = f(\bullet)$ and whether $x$ or $u$ causes $y$.
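To make the identification problem concrete, here is a minimal numerical sketch (the linear functional form, the coefficients, and the correlation between $x$ and $z$ are all illustrative assumptions, not taken from the argument above): when $z$ co-moves with $x$ and cannot be held fixed, the measured response of $y$ to $x$ is biased, while a closed system in which only $x$ varies recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Illustrative data-generating process: y = f(x, z) = 2*x + 3*z.
# The coefficients are made up for this sketch.
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)  # x and z move together: no closed system
y = 2 * x + 3 * z

# Open system: regress y on x alone, leaving z uncontrolled.
slope_open = np.polyfit(x, y, 1)[0]   # comes out well above the true 2

# Closed system: hold z fixed and vary only x.
x_closed = rng.normal(size=n)
y_closed = 2 * x_closed + 3 * 0.0
slope_closed = np.polyfit(x_closed, y_closed, 1)[0]  # recovers 2

print(slope_open, slope_closed)
```

In this sketch the naive slope comes out well above 2, and no amount of additional data fixes it without controlling $z$; that is the sense in which evidence from an open system cannot, by itself, pin down whether $x$ causes $y$.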

This can't easily be done in economics; it is rarely, if ever, possible to conduct a generalizable experiment in a closed system -- i.e., one in which only one independent variable changes at a time. Take my previous example of the minimum wage. There are two issues with concluding that the econ 101 partial equilibrium model is wrong in the face of apparently conflicting empirical evidence. First, there is no way to determine whether the minimum wage increase caused employment to be lower than it otherwise would have been, since there can be no controlled experiment. Second, if employment is $y$ and the minimum wage is $x$ in the example above, empirical evidence can only show that the theory is incomplete; i.e., that it doesn't capture all of the possible causes of $y$.

So, in this sense, it is possible to invalidate an economic model with data, but this doesn't help much; it doesn't even say whether the model is actually wrong or simply missing some of the causes that affect $y$. The econ 101 model may be completely right about what happens to employment given a minimum wage increase in a closed system, but we can never know. It is for this reason that empirical evidence is not very valuable in economics -- not because people have priors that are too strong (even though this is frequently the case).

This also means, as Jason rightly acknowledges, that empirical accuracy is not something worth praising very highly in economics:
"You'll never nail down correlation vs causation"

Due to external factors (and the lack of controlled experiments), this may be true. However, it is also a reason not to praise an empirically successful model too highly.
The problem is that the flip side of this is that you can never actually determine whether the supposed cause-and-effect relationships in models are correct. This is why Popperian rejection of hypotheses is rarely possible in economics, and why empirical evidence cannot "falsify economic models."

Jason also challenges my claim that DSGE models are (or should be) qualitative models:
John makes the case that it is the latter: DSGE models are qualitative models. I don't buy this. For one, they are way too complex to be a qualitative model.
I agree with this when it comes to, e.g., the NY Fed DSGE, but not for DSGE in general. Basic DSGE models, without all the bells and whistles that try to make them empirical, are indeed, in my opinion, qualitative. They are simply internally consistent ways of diagnosing a single problem in economics. If I want to come up with a theory about whether it is better to have PAYGO pensions or American-style social security, I don't bother modeling monopolistic competition or sticky prices, because all I want is a qualitative analysis.

I disapprove of using even the supposedly structural DSGE models of the economy, such as Smets-Wouters or the NY Fed DSGE, on the grounds that their assumptions plainly deviate far from reality; they are in no way structural, so expecting them to be accurate is almost ridiculous. Because of this, they should simply be thrown out for their complexity.

When it comes down to it, Jason seems to immediately associate DSGE with big, clunky models with tons of dubious assumptions that somehow approximate reality, whereas I associate DSGE with utility maximization, budget constraints, and rational expectations. Every DSGE model I bother using is, in my opinion, qualitative, and that's how it should be.

So, to make one of my previously misunderstood arguments slightly clearer: does the Great Recession invalidate the basic three-equation reduced-form New Keynesian model? No, it only shows that the model is incomplete, since there is no way to disprove any of the supposed causal relationships in the model, especially since rational expectations are unobservable. The lack of empirical support for New Keynesian DSGE (as I define it) does not inherently mean that any of the hypothesized cause-and-effect relationships in the model are wrong. It could mean that, or simply that the model lacks the complexity needed to explain the data.


  1. I think I am beginning to see less and less daylight between our two views.

    RE: DSGE

    I was mostly referring to models with on the order of 50 parameters, so simpler DSGE models would fall under my "less complex" heuristic.

    RE: Rejection

    I'd completely agree with you if we were just attempting to find the probability:

    P(model|data)

    The data may be uninformative, so you can't conclusively reject the model ... even if it is correct -- i.e. P(model) = 1. However, a scientific approach has an additional "pragmatic" hypothesis that is tested with every model: is the model useful?

    P(model + useful|data) ~ P(model|data) P(useful|data)

    Since usefulness is not necessarily independent of the correctness of the model -- i.e. P(model) -- that second term may be a poor approximation. But it lets us see that if a model is useless, you can reject it regardless of whether it is correct. What is usefulness? It's a complex function of other models, complexity, your purpose in building the model, etc.:

    useful = useful(complexity, better than other models, ...)

    This probability:

    P(model + useful|data)

    can be low even if this probability:

    P(model|data)

    is middling (or even high).
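    As a quick arithmetic sketch of this point (the numbers below are made up purely for illustration):

```python
# Hypothetical probabilities, chosen only to illustrate the point:
p_model_given_data = 0.5    # middling empirical support for the model
p_useful_given_data = 0.1   # very complex, dominated by other models

# Independence approximation from above:
# P(model + useful|data) ~ P(model|data) * P(useful|data)
p_joint = p_model_given_data * p_useful_given_data

print(p_joint)  # 0.05: low, even though P(model|data) is middling
```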

    Popper never really considered usefulness. That is why he thought the success of general relativity falsified Newtonian gravity. But Newtonian gravity remains useful; falsification is too binary.

    1. Jason, I agree with you about Popper. Perhaps that's why Sean Carroll listed "falsifiability" as something that needs to disappear from science. I admit I was surprised by that from him, but if he means that kind of strict Popperian falsifiability, then I can see where he has a point.

      Also, it seems to me your usefulness discussion here has a tie-in with Occam's razor. To use a famous example, I could hypothesize that I have an undetectable dragon living in my garage, that doesn't interact with any known fields, matter or energy in any way. I might (or might not) be correct, but that seems like a clear case of usefulness = 0.

    2. I agree that usefulness provides a way to reject models (hence "Because of this, [the NY DSGE and Smets-Wouters, etc.] should just be thrown out for their complexity").

      I still hold, though, that there is no pure way to reject a model solely with data unless there is a closed system.

    3. John, what about an impure way? If model B has significantly greater empirical success than model A, and is no more complex (and perhaps less complex) and no less useful (perhaps more useful), then why wouldn't we put A on the back burner and turn our attention to B?