Tyler Cowen recently challenged us to try to stump an AI named “o1”:
1. o1 is a very good economist. Try to stump it if you can.
I could not find o1, so I used ChatGPT 4o mini. I hope someone will try the following question with o1 and report the results in the comment section:
Update: Several commenters pointed out that o1 does indeed get the question correct, unlike the mini version I used. That’s good to see.
The thing I keep noticing with AIs is that they are like a super smart student who is capable of memorizing vast quantities of information, but lacks the ability to actually see the essence of a problem. In the example above I gave one hint after another, practically begging the AI to avoid reasoning from a price change. It had so little awareness of what it was doing that it ended up telling me that Qd fell while Qs rose and yet the market remained at equilibrium.
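To make the point concrete, here is a minimal sketch in Python. It assumes hypothetical linear demand and supply curves, and every coefficient is invented for illustration rather than taken from any oil market data. In equilibrium Qd equals Qs by definition, so the two cannot move in opposite directions. Whether the common quantity rises or falls along with the price depends on which curve shifted.

```python
def equilibrium(a, b, c, d):
    """Solve for the equilibrium of a hypothetical linear market.

    Demand: Qd = a - b * P
    Supply: Qs = c + d * P
    Returns (price, quantity) at the point where Qd == Qs.
    """
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity


# Illustrative coefficients only (assumptions, not estimates).
base = equilibrium(a=100, b=2, c=10, d=1)
demand_up = equilibrium(a=130, b=2, c=10, d=1)     # demand curve shifts right
supply_down = equilibrium(a=100, b=2, c=-20, d=1)  # supply curve shifts left

for label, (p, q) in [("baseline", base),
                      ("demand shifts right", demand_up),
                      ("supply shifts left", supply_down)]:
    print(f"{label}: price={p}, quantity consumed = quantity produced = {q}")
```

With these made-up numbers the price rises from 30 to 40 in both scenarios, yet the quantity goes from 40 to 50 in one case and from 40 to 20 in the other. The price change alone does not tell you which.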
Perhaps I used the wrong AI, or perhaps this question is unusually tricky. (To a college student, “tricky” means a question where you actually have to understand something, and cannot just regurgitate what you’ve read in your notes.)
So let’s take a question that Tyler asked the o1 economist:
Please write an essay on how current macroeconomists find inflation dynamics so very difficult to predict, and why that has made them reject various forms of monetarism, even as approximations of what is going on behind price level behavior.
The answer is very long, so I’ll just focus on a few key portions of the response:
Several factors have contributed to the increasing complexity and unpredictability of inflation. First, the financial innovations and regulatory changes of the late twentieth and early twenty-first centuries dramatically altered the relationship between money and economic activity. Monetary aggregates—like M1 or M2—that once served as dependable indicators of policy stance and future inflation now behave erratically due to shifts in the velocity of money, the proliferation of shadow banking, and the globalization of financial flows. Simply put, where money resides and how quickly it moves through the economy has become too fluid and too complex for older monetarist simple rules to capture.
Second, the nature of central banking and fiscal policymaking has changed. Central banks now intervene in a host of unconventional ways, from massive purchases of financial assets to the forward guidance of policy expectations. These tools are not well-explained by the classic monetarist perspective, which centered on controlling a particular measure of the money supply. The recent experience following the Global Financial Crisis vividly illustrates this: The Federal Reserve and other central banks undertook unprecedented quantitative easing programs, dramatically expanding their balance sheets. According to traditional monetarist logic, this rapid increase in the monetary base should have led to substantial inflation. Yet inflation remained persistently below target levels in many advanced economies for years, confounding those who relied on old monetary aggregates as a guide.
Third, the determination of prices today involves a bewildering interplay of global supply chains, technological advances, labor market transformations, and shifts in consumer behavior. Globalization means that prices for goods and services are influenced not just by domestic monetary conditions, but also by distant supply shocks, currency fluctuations, and international competition. Technological change increases productivity and can compress prices in certain sectors, while leaving other parts of the economy less affected. Labor markets have also evolved, with changes in union power, demographic shifts, and altered labor-force participation patterns influencing wage formation and cost pressures. These micro-level frictions and structural changes make the older macro-level equations linking money supply growth to inflation too coarse and imprecise.
Sorry, but this is mostly a mix of empty clichés and irrelevant comments. Again, it’s the sort of thing that I’d expect a student to put down if they didn’t know the answer. I’ll explain why in a moment, but first let me clarify an important point. I am pretty sure that o1 is more intelligent than I am, at least in the way that most people define “intelligence”. In other words, o1 knows way more than I do. It is far more intelligent than I am in roughly the sense that the New York Times is far more intelligent than I am. And that’s really, really impressive.
But o1 is not a good economist. Sorry, it just isn’t.
Let’s first dispose of the final paragraph, which is utter nonsense. Technology has been advancing for more than 100 years, and the pace of productivity growth is not accelerating. So that’s not a factor making inflation harder to forecast. Union strength goes up and down, and that fact does not make inflation harder to forecast. Shifts in consumer behavior? Now the student is really getting desperate. It’s an “I need to fill up the page with something” sort of comment. Global supply chains? Again, it doesn’t make inflation harder to predict. Not at all. Global commodity prices had a bigger impact on US inflation 100 years ago than today. None of those four factors have made it any more difficult to predict inflation. Indeed I’m not even sure it is becoming more difficult to predict inflation. It seems to me that inflation was far harder to predict when I was young than it is today.
The first sentence in the middle paragraph is a bit less bad. Central banking really has changed in one respect. And the relationship between the base and the aggregates really has gotten looser. Unfortunately, the AI seems to have no idea why the relationship has gotten looser. It mentions a bunch of irrelevant stuff like QE, and misses the key point that the payment of interest on reserves and the zero lower bound problem have made the money multiplier far more unstable. The AI also says that fiscal policymaking has changed. That was probably just a wild guess. I cannot think of any changes in fiscal policy that make inflation harder to predict. Given that the AI is silent on the issue, I suspect that it also has no idea what has changed about fiscal policy, or why those changes would make inflation harder to predict.
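As a rough illustration of what a looser relationship between the base and the aggregates means, here is a small Python sketch. It uses only the definitional ratio of a broad aggregate to the monetary base, and all figures are hypothetical rather than actual Fed data. The scenario assumes that banks simply hold most of the newly created reserves, which is the kind of behavior that interest on reserves or a zero lower bound could encourage.

```python
def money_multiplier(aggregate, base):
    """Definitional ratio of a monetary aggregate (e.g. M2) to the monetary base."""
    return aggregate / base


# Hypothetical figures, for illustration only.
before = money_multiplier(aggregate=8_000, base=800)
# Assumed scenario: the base nearly quadruples, but banks sit on the
# extra reserves, so the aggregate grows only modestly.
after = money_multiplier(aggregate=9_000, base=3_000)

print(f"multiplier before: {before}")
print(f"multiplier after:  {after}")
```

In this invented example the base rises by 275 percent while the aggregate rises by 12.5 percent, so the multiplier falls from 10 to 3. A forecast that assumed a stable multiplier would have badly overpredicted the aggregate.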
In the first paragraph the AI says:
Monetary aggregates—like M1 or M2—that once served as dependable indicators of policy stance and future inflation now behave erratically due to shifts in the velocity of money, the proliferation of shadow banking, and the globalization of financial flows.
False. The aggregates were never dependable indicators of inflation. Money velocity has always been unstable. But don’t the textbooks say that money velocity used to be stable? Yes, many do. That’s probably where the AI got its misinformation. It might be more unstable now, but it’s never been stable enough for the aggregates to be good predictors.
Also notice the odd list, which includes velocity, shadow banking and financial flows. That’s kind of silly. An AI that understood what it was talking about would have put a period after velocity, and then had a separate sentence explaining some reasons why velocity changed. Right now, it’s a confusing jumble of concepts.
If you are a math-oriented person who is confused by economics, this equation might help you see my point:
V = PY/M
The AI is basically saying that the PY/M ratio is changing due to things like changing V, shadow banking and financial flows. The ratio is V!! The other variables help explain why V might change.
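The same accounting can be sketched in a few lines of Python. The figures are invented for illustration and are not actual US data. Velocity is computed as PY/M, so any change in that ratio is a change in V by definition. Things like shadow banking are candidate reasons for the change, not separate items sitting alongside it.

```python
def velocity(nominal_gdp, money_supply):
    """V = PY / M, where nominal_gdp is P * Y and money_supply is M."""
    return nominal_gdp / money_supply


# Hypothetical figures, for illustration only.
v_before = velocity(nominal_gdp=20_000, money_supply=10_000)
v_after = velocity(nominal_gdp=21_000, money_supply=14_000)

print(f"V before: {v_before}")
print(f"V after:  {v_after}")

# The identity M * V = P * Y holds by construction in both periods.
print(14_000 * v_after == 21_000)
```

Here the money supply grows 40 percent while nominal GDP grows 5 percent, and the fall in V from 2.0 to 1.5 is just another way of stating that fact.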
Even though the NYT is much smarter than I am, when I happen upon a NYT article in an area where I have some expertise, the paper suddenly seems much less smart. I won’t say “dumb”, because even at its worst the NYT employs talented reporters. And I cannot expect them to have devoted their entire life to studying monetary economics.
In my view, the o1 response provided by Tyler is modestly below the level of the NYT. At a minimum, the Times would have mentioned interest on bank reserves or the zero lower bound. Nonetheless, AIs are truly impressive when it comes to the quantity of information they have memorized. Even within monetary economics, an AI will often have more information at its fingertips than I do. An AI could write a better essay on all the new Fed policy tools than I could write. Where it falls down is in a certain type of understanding. I’m not sure I can even put this concept into words. Maybe understanding doesn’t really exist, and we just fool ourselves into thinking we have a deep understanding of something. Perhaps a super-intelligence would mock my writing in the same way that I mock the AIs that engage in reasoning from a price change.
All I know is that AIs have a long way to go before they impress me in the domain of understanding monetary economics. But outside my field, I’m already highly impressed.
Here’s how AI Overview defines Gell-Mann Amnesia:
Gell-Mann Amnesia (GMA) is a defense mechanism that occurs when someone reads a news article about a subject they know well and finds it to be full of errors, but then proceeds to read the rest of the newspaper as if it is more accurate. The term was coined by Michael Crichton after discussing it with Murray Gell-Mann
READER COMMENTS
Chris
Dec 9 2024 at 6:51pm
o1 does better. Question was exactly the same and it answered:
“If oil prices rise while the market remains in equilibrium, it means that either demand has increased or supply has decreased:
•If rising oil prices are driven by an increase in demand (the demand curve shifts right), then both the equilibrium price and the equilibrium quantity of oil will increase.
•If rising oil prices are driven by a decrease in supply (the supply curve shifts left), then the equilibrium price will rise while the equilibrium quantity of oil will decrease.
In other words, the effect on quantity depends on the underlying cause of the price increase.”
BC
Dec 9 2024 at 9:00pm
I got the same results as both Scott and Chris. o1-mini produced the same incorrect answer as Scott received (though not verbatim) after a couple of seconds. o1 produced the same correct answer as Chris received after 16 seconds.
From OpenAI [https://help.openai.com/en/articles/9824965-using-openai-o1-models-and-gpt-4o-models-on-chatgpt]: “The key difference [between o1 and o1-mini] is that o1-mini lacks access to broad world knowledge, making it more suitable for tasks where all necessary context is provided within the prompt.” o1-mini is faster, more cost-efficient, and designed for coding and agentic applications.
Charlie
Dec 9 2024 at 6:53pm
Prompt: “If oil prices rise and the oil market remains in equilibrium, what happens to the quantity of oil consumed and what happens to the quantity of oil produced?”
Here’s the o1-preview response:
Incidentally, it “thought for 8 seconds.” Here is its “thought process”:
Craig
Dec 9 2024 at 10:35pm
Short Answer: It depends on what caused the price increase. If the higher price is due to a decrease in supply, then both the quantity consumed and produced will decline. If the higher price is due to an increase in demand, then both the quantity consumed and produced will rise.
Detailed Explanation: In a competitive market, the equilibrium price and quantity are determined by the intersection of the supply curve (representing producers’ willingness to sell) and the demand curve (representing consumers’ willingness to buy).
If the Price Rises Due to a Decrease in Supply: Imagine that something causes the supply curve to shift to the left (for example, new regulations, input shortages, or a cut in production quotas). This means, at any given price, suppliers are willing and able to produce less oil. The new equilibrium occurs where the demand curve intersects this reduced supply. The result is:
Higher Equilibrium Price: Because there is now less oil available, consumers must bid against each other, driving the price upward.
Lower Equilibrium Quantity: Since producers are supplying less and higher prices discourage some consumers from buying as much, the equilibrium quantity of oil consumed and produced both fall.
If the Price Rises Due to an Increase in Demand: Now consider a scenario where the demand curve shifts to the right (for example, due to economic growth, increased travel, or cold weather increasing heating needs). At each price, consumers now want to buy more oil. This greater willingness to purchase drives the equilibrium price and quantity upward. The result is:
Higher Equilibrium Price: With more buyers or stronger demand, consumers compete more aggressively for each unit of oil, pushing prices up.
Higher Equilibrium Quantity: The higher price incentivizes producers to supply more, and since consumers still desire more even at a higher price, the new equilibrium quantity consumed and produced is larger.
In Summary:
A leftward (negative) supply shift leads to higher price, lower quantity.
A rightward (positive) demand shift leads to higher price, higher quantity.
Since the question only states that the oil price rises and equilibrium is maintained—but does not specify the reason—there is not a single universally correct outcome for the equilibrium quantity. The direction of the change in equilibrium quantity depends entirely on whether the price increase originated from a supply decrease or a demand increase.
——————–
If oil prices rise and the oil market remains in equilibrium, what happens to the quantity of oil consumed and what happens to the quantity of oil produced?
Chat GPT o1
Scott Sumner
Dec 10 2024 at 12:39am
Thanks everyone, I’ll add an update.
Vaidas Urba
Dec 10 2024 at 3:13am
I appended “Please avoid the fallacy of reasoning from price change” to the original question. The 4o model then gave a perfect answer three times out of four, and a good answer on the fourth attempt. So yes, the question is unusually tricky, as this fallacy is quite frequent in the training set. By the way, seven years ago you wrote a post “Is the battle against reasoning from a price change unwinnable?”.
Garrett
Dec 10 2024 at 9:56am
Yeah I asked Claude and it got it wrong too (just assumed it was a left shift in supply curve). But when I said “You’re reasoning from a price change” it corrected itself.
Vaidas Urba
Dec 10 2024 at 3:42am
Turning to the o1 model, Tyler’s question essentially asked it to write an editorial with a predetermined conclusion. I am more interested in understanding what the o1 model really thinks. Thus I asked it the following question:
You are a discussant preparing for macroeconomics seminar. The topic of the seminar is “Inflation and output dynamics are very difficult to predict, leading to the rejection of market monetarism as even an approximate explanation of price and output behavior.” Describe market monetarism, focusing on differences from the regular monetarism. Identify and explain the top 5 arguments in favor of this thesis. Identify and explain the top 5 arguments opposing this thesis. Based on the arguments, provide your ultimate verdict with a reasoned justification.
Here is the key excerpt from the reply:
Top 5 Arguments in Favor of the Thesis (Rejecting Market Monetarism)
Unpredictability of Key Variables Makes Expectation Management Difficult: The thesis states that inflation and output dynamics are notoriously hard to predict. In practice, forming stable market expectations around NGDP growth proves challenging when underlying structural relationships (e.g., velocity of money, productivity trends, and global shocks) are constantly shifting. This undermines the core principle of market monetarism that expectations can be effectively managed to ensure stable inflation and output.
Limited Empirical Evidence of Successful NGDP Targeting Regimes: Despite theoretical elegance, there are few real-world examples where a central bank has explicitly and consistently targeted NGDP and succeeded in smoothing out business cycles better than alternative frameworks. The lack of empirical track records weakens the claim that market monetarism can serve as even an approximate explanation of observed price and output behavior.
Difficulty in Establishing Credible Futures Markets for Policy Signals: One of the hallmark proposals of market monetarism is to set up NGDP futures markets to guide expectations. Yet, these markets either do not exist or are extremely thin, making it hard for a central bank to read and influence expectations in real-time. Without these market signals, policymakers revert to model-based estimates, losing the defining advantage of market monetarism.
Structural Shifts and Regime Changes Undermine Stability: Economic dynamics are not static. Changes in technology, demographics, financial regulations, and global trade patterns shift the relationship between monetary policy actions and nominal aggregates. In such a fluid environment, expecting a stable link between policy levers and outcomes—an assumption underlying market monetarism—is unrealistic, thereby justifying a rejection of its predictive power.
Complexity of Translating Monetary Policy into Real Outcomes: Even if a central bank could perfectly manage nominal expectations, translating that into actual wage-price setting behavior and real output stabilization is uncertain. Credit frictions, heterogeneous agents, and global financial interdependencies mean that even well-managed expectations may fail to deliver stable inflation and output, contradicting the simplified mechanisms proposed by market monetarists.
Top 5 Arguments Opposing the Thesis (Defending Market Monetarism)
Expectations Management is a Strength, Not a Weakness: While predicting economic variables is hard, market monetarists argue that the central bank’s job is to shape expectations rather than just predict outcomes. By credibly committing to a particular nominal target (like NGDP level targeting), the central bank can reduce uncertainty and guide market participants, making actual outcomes more predictable over time than under discretionary policy regimes.
Historical Episodes Support Expectations-Driven Stabilization: Though pure NGDP targeting regimes are rare, historical episodes (e.g., the Volcker disinflation, or actions by central banks post-Great Recession) demonstrate how decisive monetary commitments influence expectations and outcomes. When central banks clearly communicate targets and follow through, market monetarists assert that private sector actors align behavior with those targets, stabilizing the economy.
NGDP Targeting Handles Supply and Demand Shocks Better: Unlike inflation targeting (which can misinterpret supply shocks) or money supply rules (which ignore velocity shifts), an NGDP target automatically accommodates changes in productivity or aggregate supply. By keeping nominal income on track, real variables and inflation can remain more stable in the face of unpredictable shocks—an argument that market monetarists say is more robust, not less.
The Futility of Perfect Prediction Does Not Refute the Framework: Opponents set a high bar by claiming that difficulty in prediction invalidates the framework. Market monetarists counter that no macroeconomic framework can predict precisely. Instead, the goal is to provide a rule that yields more stable macroeconomic outcomes on average. By focusing on a stable nominal anchor (NGDP path), market monetarism provides a clearer, more transparent guideline than ad-hoc policy responses.
Monetary Policy Is Always and Everywhere About Coordination: Market monetarists emphasize that monetary policy’s chief role is to coordinate expectations among millions of actors. Perfect prediction is neither possible nor required. What is required is a stable and credible commitment that reduces the range of plausible future outcomes, effectively ‘herding’ expectations toward a nominal target and, thereby, improving macroeconomic stability relative to regimes without clear nominal anchors.
Ultimate Verdict and Reasoned Justification
After weighing these arguments, a nuanced conclusion emerges. The thesis contends that the unpredictability of inflation and output dynamics invalidates market monetarism as even an approximate explanation. While the complexity of real-world economies and the persistent forecasting failures of all macroeconomic frameworks cast doubt on any single model’s predictive prowess, this does not necessarily imply a wholesale rejection of the market monetarist framework.
Market monetarism’s strength lies not in guaranteeing perfect predictions but in offering a rule-based approach that leverages expectations as a policy tool. The core idea—stabilizing the nominal anchor (like NGDP)—is appealing because it attempts to align the private sector’s plans with the central bank’s intentions, potentially reducing macroeconomic volatility over time.
However, the absence of robust empirical evidence, practical implementation hurdles, and the still-theoretical nature of NGDP futures markets highlight that market monetarism remains more of a guiding principle than a proven regime. It should not be flatly rejected because it cannot ensure precise predictions of inflation and output—no framework can. Instead, it should be recognized as a valuable conceptual framework that could inform better policy design, especially when compared to discretion-based alternatives, even if it falls short of providing a foolproof predictive theory.
Verdict: Market monetarism should not be completely discarded merely because of predictive difficulties. While it may not serve as a comprehensive and flawless predictive model, it remains a potentially useful framework for structuring and guiding monetary policy decisions. Its reliance on expectations and nominal anchors could, in practice, reduce uncertainty and volatility over the long run, provided that policymakers can address implementation challenges and credibly commit to such rules.
Scott Sumner
Dec 10 2024 at 1:52pm
Thanks Vaidas, that’s very interesting. I could do much better on both the pros and cons, but that’s better than you’d get from many econ students.
Brent Buckner
Dec 10 2024 at 1:53pm
Dr. Sumner, re: your update
I’m now curious as to how well the full o1 model could explain your positions to Tyler Cowen.
Scott Sumner
Dec 10 2024 at 11:00pm
I think we are still a long way away from AIs being able to explain my position to another economist. There’s way too much nuance that would get missed.
Brent Buckner
Dec 13 2024 at 10:12am
@Dr. Sumner:
Thank you.
I’m not looking for you to reply to this, but after a day I have a bit to add.
Tyler Cowen recently suggested that we should interrogate o1 more (and so at the margin we would rely on primary material somewhat less). I’m now wondering whether or not o1 presents in a way that Dr. Cowen more readily takes in. Clearly he has read material from you, yet has failed to grasp nuances of your position. Perhaps interrogating o1 would help him get closer.
Ted Sanders
Dec 10 2024 at 7:54pm
Yeah, 4o-mini is essentially the dumbest model by OpenAI. The ranking goes 4o-mini < 4o < o1-mini < o1 < o1 pro, roughly. o1-mini is an oddball as it’s very good at math/science, but not necessarily in general.