Coleman Hughes recently interviewed Eliezer Yudkowsky, Gary Marcus and Scott Aaronson on the subject of AI risk. This comment on the difficulty of spotting flaws in GPT-4 caught my eye:
GARY: Yeah, part of the problem with doing the science here is that — I think, you [Scott] would know better since you work part-time, or whatever, at OpenAI — but my sense is that a lot of the examples that get posted on Twitter, particularly by the likes of me and other critics, or other skeptics I should say, is that the system gets trained on those. Almost everything that people write about it, I think, is in the training set. So it’s hard to do the science when the system’s constantly being trained, especially in the RLHF side of things. And we don’t actually know what’s in GPT-4, so we don’t even know if there are regular expressions and, you know, simple rules or such things. So we can’t do the kind of science we used to be able to do.
This is a bit similar to the problem faced by economic forecasters. They can analyze reams of data and make a recession call, or a prediction of high inflation. But the Fed will be looking at their forecasts, and will try to prevent any bad outcomes. Weather forecasters don’t face that problem.
Note that this “circularity problem” is different from the standard efficient markets critique of stock price forecasts. According to the efficient markets hypothesis, a prediction that a given stock is likely to do very well because of (publicly known) X, Y or Z will be ineffective, as X, Y and Z are already incorporated into stock prices.
In contrast, the circularity problem described above applies even if markets are not efficient. Because nominal wages are sticky, labor markets are not efficient in the sense that financial markets are. This means that if not for the Fed, it ought to be possible to predict movements in real output.
Before the Fed was created, it might have been possible to forecast the macroeconomy. Thus an announcement of a gold discovery in California could have led to forecasts of faster RGDP growth in 1849; there is no “monetary offset” under the gold standard. This suggests that moving to fiat money ought to have made economic forecasting less reliable than it was under the gold standard, as central bankers would begin trying to prove forecasters wrong.
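The point is easy to see in a toy simulation. Here is a minimal sketch (a stylized one-equation “economy” of my own invention, not a real macro model): under a passive gold-standard regime a forecastable demand shock shows up in output, while a central bank that acts on the same forecast leaves behind only the unforecastable noise.

```python
# Toy sketch: monetary offset destroys forecastability.
import numpy as np

rng = np.random.default_rng(0)
T = 10_000
shock = rng.normal(size=T)               # demand shock, visible in advance
noise = rng.normal(scale=0.5, size=T)    # the part no one can foresee

forecast = shock                          # the forecaster sees the shock coming

# Passive (gold standard) regime: the shock passes straight through to output.
output_passive = shock + noise

# Discretionary regime: the central bank reads the same forecast and offsets
# it, leaving only unforecastable noise in output.
output_offset = shock - forecast + noise

print("forecast/output corr, passive:",
      round(np.corrcoef(forecast, output_passive)[0, 1], 2))
print("forecast/output corr, offset: ",
      round(np.corrcoef(forecast, output_offset)[0, 1], 2))
```

The better the offset, the worse the forecasters look, even though nothing about their skill has changed.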
We tend to assume that fields progress over time, that we are smarter than our ancestors. But the logic of discretionary monetary policy implies that we should be worse at economic forecasting today than we were 120 years ago.
Recall this famous anecdote:
During a visit to the London School of Economics as the 2008 financial crisis was reaching its climax, Queen Elizabeth asked the question that no doubt was on the minds of many of her subjects: “Why did nobody see it coming?” The response, at least by the University of Chicago economist Robert Lucas, was blunt: Economics could not give useful service for the 2008 crisis because economic theory has established that it cannot predict such crises.¹ As John Kay writes, “Faced with such a response, a wise sovereign will seek counsel elsewhere.” And so might we all.
If Robert Lucas had successfully predicted the 2008 crisis, it would have meant that he did not deserve his Nobel Prize in Economics, which honored work implying that such crises cannot be predicted.
PS. I highly recommend the Coleman Hughes interview. It’s the best example I’ve seen of a discussion of AI safety that is pitched at my level. Most of what I read on AI is either too hard for me to understand, or too elementary.
PPS. The comment section is also interesting. Here a commenter draws an analogy between those who think an AI can only become more intelligent by adding data (as opposed to self-play) and people who believe a currency can only have value if “backed” by a valuable asset.
Yet another apparently prevalent way people think about the limitations of synthetic data is to treat it like prompting, which can bring out abilities a model already had by biasing the output towards certain types of text from the rest of the training data. In other words, they are claiming that it never adds any fundamentally new capabilities to the picture. Imagine claiming that about a chess-playing system trained through self-play…
Many of these wrong ways of looking at synthetic data sort of remind me of people not grokking how “fiat currency” can have value. They think if it’s not backed by gold, say, then the whole house of cards will come crashing down. The value is in the capability it enables, the things it allows you to do, not in some tangible, external object like gold (or factual knowledge).
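The commenter’s chess example can be shrunk to something you can run. Here is a minimal sketch (tabular Q-learning on the game of Nim; the setup and all parameters are my own illustrative choices): the program is given no examples of good play at all, yet by generating its own games through self-play it discovers the optimal strategy.

```python
# Toy sketch: self-play generates skill from zero external data.
# Nim: players alternate taking 1-3 stones; taking the last stone wins.
import random

random.seed(0)
N = 21                                   # starting pile size
Q = {(s, a): 0.0 for s in range(1, N + 1) for a in (1, 2, 3) if a <= s}

def legal(s):
    return [a for a in (1, 2, 3) if a <= s]

def best(s):
    return max(legal(s), key=lambda a: Q[(s, a)])

for episode in range(50_000):
    s, history = N, []                   # record (state, action) per move
    while s > 0:
        a = random.choice(legal(s)) if random.random() < 0.2 else best(s)
        history.append((s, a))
        s -= a
    # The player who took the last stone won. Moves alternate players,
    # so credit alternates +1 / -1 walking back from the final move.
    r = 1.0
    for s, a in reversed(history):
        Q[(s, a)] += 0.1 * (r - Q[(s, a)])
        r = -r

# Optimal Nim play leaves the opponent a multiple of 4 stones.
print([(s, best(s)) for s in (5, 6, 7, 9, 10, 11)])
# should converge to taking 1, 2, 3, 1, 2, 3 respectively --
# a strategy never supplied by any training set.
```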
READER COMMENTS
spencer
Aug 17 2023 at 5:01pm
AI is no better than the conventional wisdom.
Kevin Dick
Aug 17 2023 at 5:23pm
While I agree with you on the monetary policy point, I think this particular comment from Gary is confused on the AI point, thus making the analogy shaky. Though it seems like he should know better, so perhaps I am misunderstanding his overall context.

First, there is a knowledge cutoff date for the base model. So if Gary comes up with some new objection, there’s literally no way this can be part of the base model.

Second, with regards to RLHF, so what? If a human got a bunch of answers wrong, learned from those mistakes, and improved, that would be the whole point of education, right? If it were straightforward to correct any LLM mistakes with a little RLHF, that would be a point in their favor!
But in actuality, just like with humans, there are tradeoffs in how much learning you can do. If you try to optimize along one dimension with RLHF, other dimensions suffer. See “catastrophic forgetting”. This is why people who evaluate the quality of LLMs do so on a large and broad set of tests. And we see that the latest LLMs have become quite good at a very broad set of language tasks. The real question is whether the Fed has actually improved consistently over time like LLMs have.
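For what it’s worth, the tradeoff shows up even in the smallest possible model. A cartoon sketch (a one-parameter “model” I made up for illustration, nothing like a real RLHF pipeline): train it on task A, then fine-tune it only on task B, and its task-A performance collapses.

```python
# Toy sketch of catastrophic forgetting: sequential training on task B
# overwrites what a shared parameter learned on task A.
import numpy as np

rng = np.random.default_rng(0)

def sgd(w, xs, ys, lr=0.05, steps=2000):
    """Plain SGD on squared error for the model y_hat = w * x."""
    for _ in range(steps):
        i = rng.integers(len(xs))
        w -= lr * (w * xs[i] - ys[i]) * xs[i]
    return w

xa = rng.uniform(0, 1, 200); ya = 2.0 * xa      # task A: y = 2x
xb = rng.uniform(0, 1, 200); yb = -1.0 * xb     # task B: y = -x

w = sgd(0.0, xa, ya)                             # learn task A
print(f"after A: w={w:.2f}, task-A loss={np.mean((w*xa - ya)**2):.3f}")
w = sgd(w, xb, yb)                               # then train only on B
print(f"after B: w={w:.2f}, task-A loss={np.mean((w*xa - ya)**2):.3f}")
# Task-A loss jumps from roughly 0 to roughly 3: optimizing along one
# dimension made the other suffer, because both tasks share the same
# parameter.
```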
J
Aug 18 2023 at 1:56am
“if a human got a bunch of answers wrong, learned from those mistakes, and improved, that would be the whole point of education, right?”
Gary would say it’s not getting any smarter; it’s just recalling a larger collection of facts and doesn’t have the ability to generalize any better than it did before, which I would say is more the point of education than just learning facts.
Kevin Dick
Aug 18 2023 at 2:57pm
If indeed Gary would say that, he is confused about how LLMs work.
First, the base model has a cutoff date. So any LLM issues Gary or others write about after the cutoff date would be impossible for the model to recall, if in fact that’s what it was doing.
Second, LLMs do not in fact “recall facts”. They attempt to predict subsequent strings of text: based on cases they’ve seen in the past, when faced with a new input, they try to predict the output. Extending patterns from previous data to predict relationships in new data is a very important kind of generalization. This characterization of how LLMs operate is not controversial; it is how transformer models are defined.
Finally, with RLHF, you are not giving them new facts. You are giving them targeted new cases of inputs and outputs and training them to do a better job of predicting that category of cases. You then test them on out-of-sample cases they’ve never seen to see how good a job your targeted training has done. Again, this is pretty much exactly how many human didactic processes are supposed to work.
The reason they can “hallucinate” is precisely because they are not recalling facts. They are making predictions about text. As it happens, some of those predictions are wrong. It “seems” to the LLM that there should be a paper with a title like that, in a journal like that, from authors whose names are like that, and who work at institutions like that.
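That mechanism can be shown in miniature. A toy sketch (a character-level bigram model over a made-up word list; no real LLM involved): it learns only which letter tends to follow which, then generates word-shaped strings, many of which never appeared in its training data.

```python
# Toy sketch: a bigram text predictor "hallucinates" plausible strings.
import random
from collections import defaultdict

random.seed(3)
words = ["monetary", "market", "money", "model", "motive",
         "margin", "mandate", "measure", "moment", "median"]

follows = defaultdict(list)              # char -> chars observed after it
for w in words:
    for a, b in zip("^" + w, w + "$"):   # ^ marks start, $ marks end
        follows[a].append(b)

def sample():
    out, c = "", "^"
    while True:
        c = random.choice(follows[c])    # predict the next character
        if c == "$":
            return out
        out += c

generated = {sample() for _ in range(30)}
print("not in training data:", sorted(generated - set(words)))
# Most outputs look word-like, yet many never appeared in training:
# the model predicts plausible continuations rather than recalling
# stored strings. The same failure mode, scaled up, yields confident
# citations to papers that do not exist.
```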
Kevin Dick
Aug 18 2023 at 3:51am
If that is indeed what Gary would say, then he is confused about how LLMs work, particularly RLHF.
This isn’t really a matter of opinion. LLMs make probabilistic predictions about word sequences. Logically, prediction is generalization.
With RLHF, you’re specifically training the LLM to generalize better for a certain class of inputs. You measure progress on how the LLM performs these generalizations by testing it on inputs you know it has never seen before.
Any characterization of LLMs as fact recallers is misguided. The reason they “hallucinate” is precisely because they are not fact recallers.
Guy
Aug 18 2023 at 1:18pm
I think the hype wave about AI has already jumped the shark. It is way too early to worry about AGI. The current LLM systems are more like a talking Wikipedia, drawing directly from the internet. Just because an LLM can regurgitate an interactive story narrative does not mean that it has a purpose, or an appreciation of meaning or beauty. I think LLMs will be more important and disruptive than Wikipedia, but they are not going to be an existential danger to humanity. LLM technology will be built into business, writing, and query software and websites; schools in particular will have to adapt, as they did with calculators and the internet.
Self-driving cars will kill people, and there will be accidents in the automation of military technology. There will be some friendly-fire incidents, but I don’t think people are going to automate the pressing of the nuclear button. Once robots start building other robots without human supervision, then we can start to evaluate real existential risk.
Which is not to say that humanity can’t use technology like bio-engineered viruses or nuclear/chemical weapons to create existential risks. AI could be a vector for spreading that kind of dangerous information. However, climate change could end up being a much bigger problem.
Thomas L Hutcheson
Aug 18 2023 at 6:47pm
An economic forecaster should make explicit his assumptions about what the Fed will do, along with any other relevant variables. He does have to assume that his own forecast will not affect either other market participants or the Fed.
Alexander Search
Aug 19 2023 at 4:04pm
Off-topic, but, still, I’m curious.
Economists, in their monetary models and explanations, mention stickiness a lot. That makes me wonder: Why are markets sticky? Or, rather, why aren’t markets less sticky than they are? What mechanisms (cultural, educational, financial, monetary, contractual, legal, regulatory, technological, etc.) have any effect (in any way) on stickiness?