Menzie Chinn writes,

the forecasts are generated using old-fashioned models in the spirit of the neoclassical synthesis (demand determined in short run, supply determined in the long run) with (as I understand it) backwards looking expectations rather than model-consistent expectations. I leave it to the readers whether these characteristics are the biggest sins of macro modelers in the run up to the latest crisis, and the ensuing Great Recession. (After all, one could reasonably argue that assuming perfect capital markets, or a unitary bond market, might be more problematic assumptions than adaptive expectations.)

He is discussing the macroeconometric models. These are models with hundreds of equations that are simulated with and without a fiscal stimulus in order to measure the effect of the stimulus. I want to explain why the models lack scientific merit. Some of this criticism was spelled out in historical perspective here. I am currently revising that paper.

Chinn is correct that the economics profession turned away from macroeconometric models for reasons having to do with their theoretical specification, in particular their inconsistency with rational expectations. If that were their only problem, I would still be a model jockey, as I was early in my professional career.

Instead, my criticism of macroeconometric models is that the degrees of freedom belong to the modeler, not to the data. In Bayesian terms, the weight of the modeler’s priors is very, very high, and the weight of the data is close to zero. The data are essentially there just to calibrate the model to the modeler’s priors.
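To see the arithmetic behind that claim, here is the textbook normal-normal updating formula (the notation is mine, not anything taken from the models themselves):

    \hat{\beta}_{\text{posterior}}
      = \frac{\tau_{\text{prior}}\,\beta_{\text{prior}} + \tau_{\text{data}}\,\hat{\beta}_{\text{data}}}
             {\tau_{\text{prior}} + \tau_{\text{data}}},
    \qquad
    \tau_{\text{data}} \approx 0 \;\Longrightarrow\; \hat{\beta}_{\text{posterior}} \approx \beta_{\text{prior}}.

The tau's are precision weights on the prior and on the data. When the weight on the data is close to zero, the "estimate" that comes out the back end of the exercise is essentially the prior that the modeler fed in at the front end.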

In nontechnical terms, this issue can be stated as follows. Consider two ways of getting a computer to print out that the stimulus created 1.6 million jobs. Method one is to set up an elaborate computer simulation that produces such a result. Method two is to type “the stimulus created 1.6 million jobs” into a word processor and hit the print key. The only difference between those two ways is the amount of computer processing time involved.

The scientific method is based on controlled experiments. In a controlled experiment, the experimenter creates an environment in which one factor changes and all other factors are held constant.

Economists cannot construct controlled experiments to test all of our interesting hypotheses. We have abundant data, but we did not create the circumstances that produced the data. In statistical jargon, we are making observational studies.

An observational study can be of scientific use if the conditions are right. One condition is that there are many observations relative to the number of factors that must be controlled for. In statistical jargon, this requirement is known as having enough degrees of freedom.

In macroeconomics, there are more factors to be controlled for than there are observations. There are negative degrees of freedom, which should cause your statistical software to give you an error message.
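To see what that means in practice, here is a toy example (random numbers, not actual macro data): with 100 candidate factors and only 80 observations, the residual degrees of freedom are negative and the least-squares normal equations cannot even be solved.

    # A minimal sketch, not any particular macro model: with more
    # regressors than observations, ordinary least squares breaks down.
    import numpy as np

    rng = np.random.default_rng(0)
    n_obs, n_factors = 80, 100          # e.g. 80 quarters, 100 candidate factors
    X = rng.normal(size=(n_obs, n_factors))
    y = rng.normal(size=n_obs)

    df = n_obs - n_factors              # residual degrees of freedom
    print("degrees of freedom:", df)    # -20: negative

    # The normal equations (X'X) b = X'y have a singular X'X,
    # so solving them raises an error instead of returning estimates.
    try:
        np.linalg.solve(X.T @ X, X.T @ y)
    except np.linalg.LinAlgError as e:
        print("error:", e)              # "Singular matrix"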

Instead, the modeler limits the way that factors enter the model. For example, the modeler probably will not control for changes in the educational attainment of the labor force over time. That is not because educational attainment does not matter. It is because the modeler does not want to put in so many factors that the computer spits out an error message.

There are thousands of ways to specify the “consumption function,” which is the equation that predicts consumer spending. Should durable goods spending be separated from spending on nondurable goods and services? Should previous periods’ income be used in addition to current income, and with what weight? Should a measure of anticipated future income be used? How should wealth enter the equation? Is there a way to account for the role of credit market conditions? How do tax considerations enter? Are there different propensities to consume out of wage income and out of transfer payments? How do consumers respond to changes in oil prices? How do they form expectations for oil prices in the future? What factors that are trending over time, such as population changes and shifts in the mix of consumption, need to be controlled for? Which time periods are affected by special factors, such as the recent snowstorms along the east coast?

If you have about 80 quarters of data to work with, and you have thousands of factors to control for, there is no conceivable way for the model’s specification to reflect the data. Instead, the specification depends on the opinion of the modeler.
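Here is a toy illustration of the problem (again with made-up numbers, not an actual consumption function): with 80 observations, piling on candidate regressors drives the in-sample fit toward perfection even when everything involved is pure noise.

    # A toy illustration, pure noise and no economics: with 80 observations,
    # adding enough candidate regressors makes the in-sample fit look
    # impressive even though nothing real is being explained.
    import numpy as np

    rng = np.random.default_rng(1)
    n_obs = 80
    y = rng.normal(size=n_obs)                      # "consumption" is just noise here

    for n_factors in (5, 20, 60, 79):
        X = rng.normal(size=(n_obs, n_factors))     # candidate factors, also noise
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        r2 = 1 - resid.var() / y.var()
        print(f"{n_factors:3d} factors: in-sample R^2 = {r2:.2f}")

With 79 noise regressors and 80 observations, the fit is nearly perfect. A modeler with thousands of plausible-sounding factors to choose from can always find a specification that fits the sample; the fit tells you about the modeler's choices, not about the economy.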

The conditions under which statistical techniques are scientifically valid are not satisfied with macroeconomic data. There is no reason to take model results as reflecting anything other than the opinion of the modeler.

What if the models performed well in out-of-sample forecasts? If that were the case, then I would have to concede that there might be some scientific validity to the models. However, that has never been the case. When I was a model jockey, the models were forever being tweaked with what were called “add factors” or “constant adjustments” in order to keep them on track with the most recent data. Formal studies of out-of-sample forecasts, by Stephen McNees of the Boston Fed and others, showed dismal performance. Even today, the models that are telling us how many jobs the stimulus saved are the same models that predicted that unemployment today would be close to 7 percent with the stimulus, when in reality it is 9.7 percent. So out-of-sample performance fails to boost one’s confidence in the scientific status of these models.
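For readers unfamiliar with the term, here is a stylized sketch of what an honest out-of-sample check looks like (a simple autoregression on synthetic data, not one of the actual models): estimate on the early sample, freeze the coefficients, forecast the holdout period, and compare against a naive no-change benchmark.

    # A sketch of an out-of-sample check, simplified and with no add factors:
    # fit on the early sample, hold the coefficients fixed, forecast the rest,
    # and compare to a naive "no change" benchmark.
    import numpy as np

    def ar1_fit(series):
        """OLS estimate of y_t = a + b*y_{t-1}."""
        y, ylag = series[1:], series[:-1]
        X = np.column_stack([np.ones_like(ylag), ylag])
        (a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
        return a, b

    rng = np.random.default_rng(2)
    data = np.cumsum(rng.normal(size=120))     # synthetic quarterly series

    train, test = data[:80], data[80:]
    a, b = ar1_fit(train)

    # one-step-ahead forecasts on the holdout, coefficients held fixed
    model_fc = a + b * data[79:-1]
    naive_fc = data[79:-1]                     # "no change" benchmark

    rmse = lambda f: np.sqrt(np.mean((test - f) ** 2))
    print("model RMSE:", rmse(model_fc), " naive RMSE:", rmse(naive_fc))

The point of the naive benchmark is that a model which cannot beat "no change" out of sample is not telling you anything that the raw data do not already tell you. Constantly re-tweaking the model with add factors to stay close to the latest data is the opposite of this kind of test.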

Macroeconometric models satisfy a deep need to create the illusion that government can exercise precise control over output and employment. As long as people are determined to believe that such control is possible, the models will have a constituency. For better or worse.