A column in the Financial Times reminds us that data—and the more “big,” the better—are often seen as a sesame key to the door of knowledge. It is even imagined that small data are owned by the person who has chosen to share them on an open platform that belongs to somebody else. (See Benedict Evans, “There Is No Such Thing as ‘Data’,” Financial Times, May 27, 2022):
This is mostly nonsense. There is no such thing as “data”, it isn’t worth anything, and it doesn’t belong to you anyway. … “Data” does not exist—there are merely many sets of data. … Most of the meaning in “your” data is not in you but in all of the interactions with other people.
But this is not my topic, although it is related. My topic is the false idea that one can induct theory from data without first having a theory, formal or intuitive, explicit or implicit, to indicate which data are relevant. In economics, this idea has been lately associated with Harvard University economist Raj Chetty, who apparently aims to teach microeconomic principles by first looking at the data (see Don Boudreaux, “How Should Econ 101 Be Taught,” Econlib, January 6, 2020).
That this is not consistent with the scientific way of understanding the physical or social world has been well explained by Karl Popper, the famous philosopher of science, in a series of articles in Economica (“The Poverty of Historicism,” May 1944, August 1944, and May 1945):
I believe that theories are prior to observations as well as to experiments, in the sense that these are significant only in relation to theoretical problems. … Therefore, I do not believe in the “method of generalization”, that is to say, in the view that science begins with observations from which it derives its theories by some process of generalization or induction. (Part 2, pp. 134-135)
I believe that the prejudice that we proceed in this way is a kind of optical illusion, and that at no stage of scientific development do we begin without something in the nature of a theory, such as a hypothesis, or a prejudice, or a problem … which in some way guides our observations, and helps us select from the innumerable objects of observation those which may be of interest. (Part 3, p. 79)
Literature provides us with a fun example of another sort. In an 1841 letter to his sister, French novelist Gustave Flaubert wrote:
Since you are now studying geometry and trigonometry, I will give you a problem. A ship sails the ocean. It left Boston with a cargo of cotton. It grosses 200 tons. It is bound for Le Havre. The mainmast is broken, the cabin boy is on deck, there are 12 passengers aboard, the wind is blowing East-North-East, the clock points to a quarter past three in the afternoon. It is the month of May. How old is the captain?
Puisque tu fais de la géométrie et de la trigonométrie, je vais te donner un problème : Un navire est en mer, il est parti de Boston chargé de coton, il jauge 200 tonneaux. Il fait voile vers le Havre, le grand mât est cassé, il y a un mousse sur le gaillard d’avant, les passagers sont au nombre de douze, le vent souffle N.-E.-E., l’horloge marque 3 heures un quart d’après-midi, on est au mois de mai…. On demande l’âge du capitaine?
If you are looking for what determines a captain’s age or what is determined by it, most data in the universe are irrelevant. Of course, the exercise proposed by Flaubert could have been a mere cryptographic enigma, but solving it would still have shown nothing about induction as a way to derive scientific laws.
READER COMMENTS
nobody.really
Jun 1 2022 at 9:12pm
Walter Lippmann (1922), coining the term “stereotyping”
Albert Einstein, Autobiographical Notes (1949).
Thomas Kuhn, The Structure of Scientific Revolutions (1962), Chap. 10, “Revolutions as Changes of World-View”
Alasdair MacIntyre, After Virtue (1981), Chap. 7, “‘Fact,’ Explanation and Expertise” at 93-94 (emphasis added).
nobody.really
Jun 1 2022 at 9:36pm
https://freakonomics.com/podcast/abortion-and-crime-revisited-update/
Pierre Lemieux
Jun 2 2022 at 11:05am
Thanks for the interesting quotes. They are all consistent with my post except for Levitt and perhaps for some ambiguity in the Einstein quote.
nobody.really
Jun 2 2022 at 4:58pm
Does observation precede theory? Does theory precede observation? Or do they occur more or less simultaneously? Consider predictive coding (https://en.wikipedia.org/wiki/Predictive_coding#Origins):
Roger McKinney
Jun 2 2022 at 5:26pm
Mises had a lot to say about that topic, too. Boiled down, he wrote that history (data) is so vast that a person can find data to support any crackpot theory. Everyone must filter the data because there is too much of it. Theory provides the filter for what is relevant and what isn’t.
nobody.really
Jun 2 2022 at 9:25pm
James Anthony Froude (1818–1894), English historian
Jon Murphy
Jun 2 2022 at 9:45pm
Not only which data are relevant, but how to understand and interpret the data as well. One of the things we have seen over the past two years, with Fauci so much in the spotlight, is that failure to understand data leads to incorrect interpretations. He had no clue how to interpret the economic data he was being fed, so he had no way of knowing that his advice was making the pandemic worse along many margins (not the least of which were shortages and hoarding of PPE in 2020).
Facts are not “stubborn little things” nor do they “speak for themselves.”
Pierre Lemieux
Jun 3 2022 at 12:44pm
Jon: You write that a theory is necessary
It seems to me that the two processes are so related as to be considered only one, at least in a Popperian positivist perspective, for the purpose of finding the data is to see if our prior theory explains them. Just after the first quote of his in my post, Popper added:
Jon Murphy
Jun 3 2022 at 6:39pm
That makes sense. I don’t know Popper that well.
Michael Rulle
Jun 3 2022 at 11:16am
“My topic is the false idea that one can induct theory from data without first having a theory, formal or intuitive, explicit or implicit, to indicate which data are relevant.”
I think this statement is literally correct, but “one must first have a theory” is highly watered down, a point I also agree with, as it incorporates intuition and implicit insights.
If your main point were simply that one should not hunt for correlations by snooping across vast swaths of data, then that would be very clear and also true.
But I do not think of an a priori hypothesis as “intuition” or as implicit (unaware?) perception.
In other words, and maybe I am merely arguing semantics, just say “don’t data snoop” and you have said what needs to be said.
Pierre Lemieux
Jun 3 2022 at 12:27pm
Michael: You are raising an interesting and complex issue. I was (and am) following Popper, for whom what looks like induction may be an implicit, intuitive hypothesis or theory. Note, in the second Popper quote of my post:
And three sentences below, he explains:
nobody.really
Jun 3 2022 at 7:23pm
Depends upon what “scientifically relevant” means. Is it scientifically relevant if our biases lead us to ignore certain hypotheses?
I am under the impression that you can explain the motion of heavenly bodies on the basis of the motion of points of light attached to invisible spheres around the earth–or spheres rolling within spheres–or spheres within spheres within spheres, ad infinitum. The weakness of this theory was exposed not from all the data it failed to explain, but from the rise of a rival, simpler theory. If no one had ever contemplated a heliocentric hypothesis, how much longer would the geocentric model have remained?
Thomas Kuhn emphasized the idea that science is a SOCIAL process, and subject to all the dynamics of other social processes. Whether you draw a distinction between the colors blue and green depends upon the society in which you were raised. When asked to identify the midpoint between 1 and 9, you may say 5 (because you were socialized to emphasize counting numbers) or 3 (because you were socialized to emphasize ratios). Whether you regard homosexuality as a mental illness may depend upon whether you were socialized in a culture that disparages homosexuality. Whether you can acknowledge the properties of light may depend upon whether you were socialized to draw a distinction between phenomena that act like particles and phenomena that act like waves, or whether your mind was not impeded by that metaphor. Whether you’re willing to give credence to the work of Jewish scientists may be influenced by whether you’re socialized as a Nazi. Whether you choose to invest in producing Viagra or malaria drugs may be influenced by whether you were socialized to value maximizing financial return or human welfare. Etc.
Perhaps Popper would find none of these issues to be “scientifically relevant.” I do. The manner in which we devise hypotheses matters.
And therefore I’m sympathetic to the appeal of inviting people to review data unaided/unimpeded by someone else’s theory. I also share the doubts expressed here about how plausible such a process would be–but the desire to transcend socially-prescribed bias is laudable, even if the tools are lacking.
nobody.really
Jun 10 2022 at 3:49pm
“Scholar Jonathan Metzl provides a chilling take on this history in his book The Protest Psychosis: How Schizophrenia Became a Black Disease. He describes how, in the 1850s, white U.S. psychiatrists invented the mental disorders ‘drapetomania’—defined by Louisiana surgeon and psychologist Dr. Samuel A. Cartwright as “the disease causing slaves to run away”—and ‘dysaesthesia aethiopis,’ which manifested as ‘disrespect for the master’s property’ and could be ‘cured’ by brutal whippings. Writes Metzl: ‘Even at the turn of the twentieth century, leading academic psychiatrists shamefully claimed that Negroes were psychologically unfit for freedom.’
Metzl’s book … focuses on the racialization of schizophrenia diagnoses. The disease affects people of all ethnic backgrounds at similar rates, yet a 2005 study showed that Black men are diagnosed with it four times as often as white people. Metzl illustrates how this trend began in the 1960s and ’70s, as the very definition of the disease shifted in ways that painted Black men as ‘criminally insane.’ He uncovered hospital charts that ‘diagnosed’ Black men as schizophrenic in part because of their connections to the civil rights movement and listed symptoms that included being paranoid or delusional about being persecuted for being Black.”
Michael Rulle
Jun 5 2022 at 10:50am
Popper’s point that how you test your theory is what is scientifically relevant is the main one. This is very complex, however. If a theory can make accurate predictions, that is compelling.
In the social sciences, Popper devised the framework of situational analysis, which some say is a carve-out from his falsification framework and others say is consistent with it. I admit to being a rank amateur on this topic, but it implies that not all knowledge comes from the standard falsification framework. “Science” is difficult to define.
Grand Rapids Mike
Jun 3 2022 at 11:18am
Recently heard a comment on CNBC that is somewhat related to this discussion. To paraphrase, “Economic theory provides a general direction or understanding; econometrics provides answers that are specifically wrong.” The actual comment was much more eloquent.
nobody.really
Jun 3 2022 at 6:20pm
Ezra Solomon, Burmese-born American economist (1920–2002), The Bulletin (1984), Reader’s Digest 1985. (Often attributed to J. K. Galbraith following a humorous piece in U.S. News & World Report, March 7, 1988.)