APS News

The Back Page

Inquiry Science Rocks:  Or Does It?

by David Klahr

Figure 1. Proportion of unconfounded experiments designed by children in each phase after having been taught by one of the three types of instruction. See (5) for statistical analyses.

Table 1. Features of each type of instruction

Although "inquiry teaching" has been a hot topic in science education for many years, it may be useful to reflect on some unresolved issues associated with it. The main point of this essay is that the relative effectiveness of different types of instructional "approaches" is not always investigated with the same rigor that permeates all strong scientific disciplines–clear definitions, well-defined empirical procedures, and data-driven conclusions. The second–and more contentious–point is that for many aspects of science instruction, "discovery learning" is often a less effective way to teach than a direct, didactic, and explicit type of instruction. Some in the physics education community may view this assertion as a foolhardy heresy, while for others it may be a dark secret that they have been reluctant to share with their colleagues. But heresies and secrets are hardly the way to discover and implement maximally effective instructional methods for teaching science.

I am not alone in suggesting that common practices in physics education may have scant empirical support. Several years ago Handelsman et al.1 asked: "… why do outstanding scientists who demand rigorous proof for scientific assertions in their research continue to use and, indeed, defend on the basis of their intuition alone, teaching methods that are not the most effective?" (p. 521). The specific lament in Handelsman et al. is the claim that much science education is based on a traditional form of didactic lecturing. However, one could just as well use that very same critique about the lack of "rigorous proof" to challenge the current enthusiasm for "inquiry approaches" to science education.

For example, an influential report from the NAS on inquiry approaches to science education2 states that "…studies of inquiry-oriented curriculum programs … demonstrated significant positive effects on various quantitative measures, including cognitive achievement, process skills, and attitudes toward science." This would seem to be clear evidence in support of inquiry approaches to science instruction, except that the report goes on to note, parenthetically, that "there was essentially no correlation between positive results and expert ratings of the degree of inquiry in the materials" (p. 125). Thus we have an argument for the benefits of a particular pedagogy, but no consensus from experts about the "dose response," i.e., the extent to which different "degrees of inquiry" lead to different types or amounts of learning.

One wonders about the evidential basis for the widespread enthusiasm for inquiry science, given the lack of operational definitions of what constitutes an "inquiry-based" lesson–or entire curriculum–and what specific features distinguish it from other types of instruction. There is a particular irony here in that the very field that has developed extraordinarily clear norms and conventions for talking about methods, theories, instrumentation, measurement, underlying mechanisms, etc., often abandons them when engaging in research on science education.

Although the NRC and AAAS continue to favor inquiry approaches to science instruction, many researchers in the emerging field of "Education Sciences" are not so sure. Controversy about the purported universal superiority of constructivist approaches to science teaching has been growing over the past decade, culminating in an entire volume of pro and con perspectives on the issue3. However, my aim here is not to resolve the issue, but rather to note that the evaluation of one approach versus another is all too often made, as Handelsman et al.4 put it, "on the basis of … intuition alone", rather than on the results of replicable experiments, designed around operational definitions of instructional methods being investigated.

I will illustrate with examples from my research on different ways of teaching a topic in elementary school science known as the "control-of-variables strategy" (CVS). The procedural content of CVS instruction constitutes a method for creating experiments in which a single contrast is made between experimental conditions while "controlling" for other potential causal factors. The conceptual content includes an understanding of the inherent indeterminacy of confounded experiments. CVS is the basic procedure that enables children to design unconfounded experiments from which they can make valid causal inferences, and it is invariably included in high-stakes science assessments such as TIMSS and NAEP.

Three Types of Instruction: Operational Definitions

Our goal is to teach CVS. But the experimental variable in our research is the method of instruction. In our first CVS study5, we compared the relative effectiveness of three different types of instruction for teaching CVS to 3rd to 5th grade students. We used simple physical materials (such as balls on ramps, springs and weights, pendulums, or objects sinking in water).

The three types of instruction ranged from explicit, teacher-directed instruction to more open-ended learner-directed discovery. Note that in the previous sentence, I have used the kind of terminology ("teacher-directed", "learner-directed") that I criticized earlier for its inherent ambiguity. However, the solution to this problem is to be extremely explicit about the features of specific instructional procedures. Furthermore, one can remove the baggage-laden terms, and describe the three different instructional methods simply as Type A, B, and C.

The essential aspects of each of the three types of instruction are depicted in Table 1, where each column corresponds to one of the instructional procedures, and each row describes a particular feature. (In our full scientific report on this study, of course, each of the cell entries in the table was augmented by a detailed "script" for how that component of the instruction was actually implemented, so that it could be replicated in other labs.)

For all three types of instruction, children dealt with the same materials. For example, we used a pair of identical adjustable ramps that had four binary-valued features (height, surface, length, and ball type). In all cases, (a) children were presented with the same goal: to design a "good experiment" (e.g., "Can you set up the ramps to find out for sure whether the height of the ramp makes a difference in how far the ball rolls?"), (b) this goal was provided by the teacher, not generated by the student, and (c) we used "hands-on" instruction, as children manipulated the materials.
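What counts as a "good experiment" with these ramps can be made concrete. The following is a minimal sketch, not the study's own procedure: the four feature names come from the text, while the level names (such as "high" and "low") and the function names are illustrative assumptions. A comparison lets a child "find out for sure" about a focal variable only when that variable is the sole difference between the two ramps.

```python
from typing import Dict, List

# The four binary-valued ramp features named in the text.
# The level names used below ("high", "smooth", ...) are assumptions.
FEATURES = ("height", "surface", "length", "ball")

Ramp = Dict[str, str]


def differing_features(ramp_a: Ramp, ramp_b: Ramp) -> List[str]:
    """Return the features on which the two ramp setups differ."""
    return [f for f in FEATURES if ramp_a[f] != ramp_b[f]]


def is_unconfounded(ramp_a: Ramp, ramp_b: Ramp, focal: str) -> bool:
    """True only when the focal feature is the single difference."""
    return differing_features(ramp_a, ramp_b) == [focal]


ramp_1 = {"height": "high", "surface": "smooth", "length": "long", "ball": "rubber"}
# Differs from ramp_1 in height only: a controlled comparison.
ramp_2 = {"height": "low", "surface": "smooth", "length": "long", "ball": "rubber"}
# Differs in height AND surface: the two effects cannot be separated.
ramp_3 = {"height": "low", "surface": "rough", "length": "long", "ball": "rubber"}

print(is_unconfounded(ramp_1, ramp_2, "height"))  # True
print(is_unconfounded(ramp_1, ramp_3, "height"))  # False
```

Two identical ramps also fail the check, since a setup with no contrast on the focal variable cannot show whether it makes a difference.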

At this point, the different types of instruction summarized in Table 1 begin to diverge. In Type A instruction, the teacher presented explicit instruction regarding CVS (i.e., how to design an unconfounded experiment by varying the "focal variable," such as the surface of the ramp, while making sure that all the other variables, such as ramp height, type of ball, and length of the run, were held constant on each ramp). In contrast, in Type B and C instruction, the student, not the teacher, designed the experiment. Next, in Type A and Type B instruction, students were presented with probe questions: "Is this a smart way to find out whether the surface of the ramp makes a difference?" "Can you 'tell for sure' from this experiment whether <the variable being tested> makes a difference in the outcome?" "Why are you sure or not sure?" In Type C instruction there was no corresponding probe question. Other crucial features, and their presence or absence in each particular type of instruction, are indicated in the remaining rows. Note that this description is substantially condensed from the descriptions and details in our paper. But the point is clear: each column in the table, and the associated elaboration of what its contents mean, provides an operational definition of the three types of instruction being contrasted in this study.
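One way to see what an operational definition buys us is to write the three types down as plain feature lists and compare them mechanically. The sketch below is only an illustration under stated assumptions: it encodes just the features mentioned in this essay, not the remaining rows of Table 1 or the detailed scripts; the field names are hypothetical; and the absence of explicit CVS instruction in Types B and C is inferred from the contrast drawn above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InstructionType:
    # Field names are illustrative; only features named in the essay appear.
    goal_set_by_teacher: bool
    hands_on_materials: bool
    explicit_cvs_instruction: bool  # for B and C, inferred from the text
    experiment_designed_by: str     # "teacher" or "student"
    probe_questions: bool


TYPES = {
    "A": InstructionType(True, True, True, "teacher", True),
    "B": InstructionType(True, True, False, "student", True),
    "C": InstructionType(True, True, False, "student", False),
}


def contrast(first: str, second: str) -> dict:
    """Return the features on which two instruction types differ."""
    a, b = asdict(TYPES[first]), asdict(TYPES[second])
    return {name: (a[name], b[name]) for name in a if a[name] != b[name]}


print(contrast("B", "C"))  # {'probe_questions': (True, False)}
print(contrast("A", "B"))
```

With the conditions specified this way, a disagreement about two of them becomes a question about which listed features differ, with no label involved.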

The results of this training experiment (Figure 1) revealed that (a) only Type A instruction led to immediate gains in children's mastery of CVS, and (b) when tested on different physical materials several days later (such that children initially trained with ramps were now asked to design experiments with springs, and so on), children were able to transfer their CVS knowledge to materials with completely different physical dimensions. Other studies like this one showed that children presented with Type A instruction remembered and used what they learned about CVS in substantially different contexts (i.e., they transferred their CVS knowledge), and they retained it for several months, and even several years, after their instruction.

What's in a Name?

One important part of any operational definition is the name given to the construct being defined. And in sciences that are still in the process of developing unambiguous operational definitions, the name may carry unintended baggage beyond the specifics of the operational definition. Moreover, to the extent that the terms may be widely used in everyday language, they may be interpreted in different ways by different people.

To avoid this possible terminological confusion, in our first report on our three types of training (5), we used somewhat inelegant phrasing. We dubbed Type A, B, and C instruction "Training–Probe", "No-training–Probe", and "No-training–No-probe", respectively. However, in subsequent studies, we began to call Type A "Direct Instruction" and Type C "Discovery Learning". The consistent finding was that the Direct Instruction condition produced substantially more learning and transfer than did the Discovery Learning condition.

For example, in one study, after a brief training session, 75% of the students in the Direct Instruction condition mastered CVS, whereas only 25% of the students in the Discovery condition did so. We also found that when challenged a few weeks later to judge science fair posters involving simple experiments created by other children, the children who had mastered CVS in the training phase were much better judges than those who had not mastered CVS, regardless of how they had been instructed. That is, the many children who learned CVS via direct instruction performed as well as those few children who discovered the method on their own. There was no long-term advantage to having "discovered" CVS rather than having been "directly instructed" about it6.

Nevertheless, although these results seemed to indicate that we had identified an effective instructional procedure for teaching young children how to master CVS, the everyday labels we had begun to use led to substantial disagreement within the field about which of our conditions was "really" Direct Instruction, which was "really" Discovery Learning, and whether one or the other was a parody of the corresponding method. The problem, of course, is that those arguments were about vague labels, rather than about the relative effectiveness of well-defined instructional procedures.

Approach Avoidance

The terminological proliferation in the area of science education is daunting. It includes such "approaches" as: constructivism, explicit instruction, Piagetian approach, inquiry science, direct instruction, adaptive instruction, student-centered instruction, authentic instruction, hands-on instruction, didactic instruction, drill and kill, minds-on instruction, etc. But these imprecise slogans convey little of substance because they are so loosely defined and interpreted. Specifying a "Newtonian approach" doesn't get you very far on the journey to Mars. Only a determined and consistent effort to better define and evaluate our instructional methods will ensure coherent discourse about educational experiments and, ultimately, lead to improved physics education.

David Klahr is the Walter van Dyke Bingham Professor of Cognitive Development and Education Sciences in the Department of Psychology at Carnegie Mellon University. He is a member of the National Academy of Education.

  • 1 Handelsman, J., Ebert-May, D., Beichner, R., Bruns, P., Chang, A., DeHaan, R., Gentile, J., Lauffer, S., Stewart, J., Tilghman, S.M., & Wood, W.B. (2004) Science 304(5670), 521-522.
  • 2 Inquiry and the National Science Education Standards: A Guide for Teaching and Learning (2000) Board on Science Education (BOSE).
  • 3 Tobias, S. & Duffy, T. M. (Eds.) (2009) Constructivist Theory Applied to Instruction: Success or Failure? Taylor and Francis.
  • 4 Handelsman et al., Science 304.
  • 5 Chen, Z. & Klahr, D. (1999) All Other Things being Equal: Children's Acquisition of the Control of Variables Strategy. Child Development, 70, 1098-1120.
  • 6 Klahr, D. & Nigam, M. (2004) The equivalence of learning paths in early science instruction: effects of direct instruction and discovery learning. Psychological Science, 15, 661-667.

APS encourages the redistribution of the materials included in this newspaper provided that attribution to the source is noted and the materials are not truncated or changed.

Editor: Alan Chodos