The Chetty Teacher Study and the Hill Criteria
By Arnold Kling
This article discusses a study by Raj Chetty, John Friedman, and Jonah Rockoff on the effect of teacher quality on subsequent earnings. The study has been cited a lot recently, by Alex, among others, as evidence that teacher quality matters. Below, I examine it from the perspective of the Hill criteria for establishing causation, which were developed to argue that the evidence showed that smoking causes cancer. You may think that my grading is strict, but my reading of the study is that it clearly satisfies only 1 of these 9 criteria, while it clearly fails to satisfy several.
1. Temporal relationship. Did the teacher quality difference take place before the earnings were measured? Yes.
2. Strength. Did teacher quality predict earnings as strongly as smoking predicts lung cancer? I would say No.
3. Dose-response. Is there evidence that earnings are related to the amount of exposure to good teachers? No.
4. Consistency. Do other studies using different methods show a similar response? The results are consistent with the kindergarten teacher study, also by Chetty and others (but see Russ Roberts on that study). However, I would think that if teacher quality made such a strong difference, then we would see clearer evidence that attending a school with stronger teacher hiring policies affects subsequent earnings. As far as I know, there are no such studies. No.
5. Plausibility. Is there a theory of a causal mechanism? Yes, we think that it is plausible that teachers make a difference. However, we do not have a specific causal mechanism. As the article points out,
Moreover, while the new research may identify HVA [high value added] teachers, it’s still not clear what constitutes good teaching. Despite volumes of research, there are no criteria that enable schools to identify good prospects, nor are there a set of best practices to guide teachers in the classroom.
6. Consideration of Alternative Explanations. Can we rule out alternative explanations?
“Effectively, we identified experiments in the data when students come into contact with HVA teachers, such as when they change grades, leave or enter the school system,” Chetty says.
The “natural experiment” methodology is intended to address the possibility of alternative explanations. However, in my mind, there still might be parental variables involved in the decisions that created the “natural experiments.” Maybe.
7. Experiment. Can the condition be prevented or ameliorated, the way that lung cancer can be prevented or ameliorated by cessation of smoking? If the “condition” here is low earnings, then it cannot be prevented by even a host of good teachers. This is a rare study that even suggests a small amount of amelioration is possible. No.
8. Specificity. Can this be regarded as the causal mechanism? This criterion does not even hold for smoking and lung cancer, so some methodologists question whether it should be applied. But in this case, because there are so many other possible mechanisms affecting earnings, the answer would be No.
9. Coherence. Are the results compatible with existing theory and knowledge? The results are consistent with Hanushek’s work, which suggests that there are differences in “value added” among teachers. However, the results are inconsistent with much work that shows that the effects of education interventions of all sorts dissipate within a few years. No.