The Aikido of Assessment: Redirect Rather than Fight

By Charles Henderson

As faculty members and chairs, many of us are under pressure to become more involved in course and program assessment. This trend in higher education can seem like an unwelcome burden on our time that does not contribute to our own intellectual advancement or our department’s well-being. But, as you can learn from any good martial arts movie, often the best way to defend against an unwanted advance is not to try to stop it directly, but to redirect your opponent’s momentum and energy to your advantage. If we treat assessment as busy work, then that is what it becomes, and everyone’s time is wasted. Instead, that same time and energy can be used to work towards something that is meaningful for us, our students, and our departments.

Know your opponent
Successful redirection of assessment pressures requires that we understand their origins. Much of the current emphasis on assessment is channeled through the regional accreditation process. The United States does not rely on a government body, such as a ministry of education, to certify the quality of higher education institutions. Instead, institutions are accredited by one of six nonprofit regional accreditors. Accreditation is not legally required, but it is necessary for participation in federal financial aid programs (such as Pell Grants).

Over the years, accreditation standards have shifted from a focus on inputs and resources (e.g., student-to-faculty ratio), which are easy to document at the institution level with little input from faculty, to a focus on documenting student learning outcomes, to the current focus on continuous improvement. Both of these newer foci depend on faculty and department engagement. The rigor of the accreditation process has also increased significantly as a result of growing political pressure on higher education institutions to demonstrate their value, both to individual students and to the national economy.

A key component of the accreditation process is a self-study carried out by the institution. A self-study includes comprehensive documentation of educational processes and measures of student outcomes. A common way for institutions to prepare the accreditation self-study is through ongoing program review. This is where faculty typically are recruited (or obliged) to participate.

Figure 1. Ideal relationships among the four core assessment processes. The four assessment processes should inform one another, but typically do not.

Figure 2. Data used for course-level assessment. Instructors use almost completely different data sources for course improvement than institutions use for judging teaching performance.

Analyze the Situation
One big problem with the program review process is that the data used for program review are completely different from the data used to make personnel decisions, such as tenure and promotion. In fact, as I show in Figure 1, there are four different assessment processes operating at once. The arrows indicate how information should flow among these processes, but in reality they operate independently. Ideally, instructors would collect course-level data for the purpose of course improvement. These data would be summarized for personnel decisions (with an emphasis on instructor performance and development) and program improvement (with an emphasis on student performance and development). The information from multiple courses that departments use for program improvement would be summarized for program review. Finally, the bi-directional arrow between course and program improvement indicates that individual courses may need to be adjusted based on program-level data.

We can see the problems and possible solutions in play at the course level. In a recent study, Melissa Dancy, Chandra Turpen, and I interviewed 72 physics faculty at four- and two-year institutions [1]. We asked them what data sources they use for course improvement and what data sources their institution uses to judge their teaching performance. Unfortunately, as the results in Figure 2 show, faculty and institutions use almost completely different sources of data. The situation is actually even worse than the figure suggests. Of those instructors who say that they use student evaluations of teaching, most indicated that they really value only the written student comments, while the institution values only the numerical ratings.

This difference between instructors valuing written student comments and institutions valuing numbers is indicative of most mismatches between the four assessment processes. The problem, of course, is that an instructor cares about and can make use of very detailed contextual information. Administrators, even department chairs, are not familiar with the context of each class and need quantitative summaries of big-picture issues. A big part of an administrator’s job is to compare faculty. Using average ratings on a few key items (e.g., overall course quality, overall instructor quality) is a very easy way to do this.

This same mismatch occurs at the program level. Program review criteria are based on broad student outcomes that are relevant to many disciplines, such as critical thinking, quantitative reasoning, and written communication. These are difficult to measure. Lacking explicit guidance, most instructors and departments do the easiest thing they can: use a test they have given as evidence that students had to think critically, solve a quantitative problem, and/or express themselves in writing. This kind of documentation feels like busy work for everyone involved, does not result in high-quality program review data, and does little to support program improvement.

The situation doesn’t have to be like this. One way to improve assessment is to develop metrics that are quantifiable, but of interest to both faculty and administrators. For example, it might help to ask students how much they think they learned, rather than how good they think the instructor is. Even better would be for educators to use research-based assessment tools that have been developed by physics education researchers over the last few decades. The most user-friendly of these are carefully developed and validated multiple-choice conceptual tests, such as the 30-item Force Concept Inventory (FCI). These tests have been very useful for individual faculty to understand the level of student learning in their courses. Administrators do not usually ask for these scores because similar tests do not exist in many disciplines. However, my experience has been that a summary score of class performance that puts the scores in context by comparing them to other similar classes elsewhere is seen favorably by administrators. PhysPort, a site developed by the American Association of Physics Teachers to help physics faculty find and use resources based on physics education research, is in the process of developing online assessment resources to help faculty do this more effectively.

Individual instructors can aggregate assessment data as discussed above. But real progress in assessment comes when groups of faculty work together on course and program improvement. For example, the physics department at the University of Colorado Boulder has written about its program improvement process [2]. One of the starting points was the upper-level E&M course. The first step was to identify course goals. This differs from typical faculty discussions, which focus on topics (e.g., magnetostatics, electromagnetic waves) and rarely on what students should know or be able to do. When assessment folks talk about course goals, they usually emphasize starting from detailed measurable outcomes for each topic. In my experience, this approach is also not very useful.

In contrast, Boulder focused on course goals by asking the core questions “What is Junior E&M I about? How is it different from the introductory E&M course?” This framed the discussion in a way that was easy for instructors to understand and value. A total of 13 instructors met seven times to set course goals based on these questions. They were supported by a postdoc who helped to keep things on track. The final set of specific goals for Junior E&M I was associated with eight broad learning goals (e.g., math/physics connection). Using these broad goals, each of the course topics could then be operationalized into a small number of topic-specific learning goals (e.g., “students should be able to write down, and explain in words and pictures, the full set of Maxwell’s Equations”). These broad goals also provided the scaffolding for development of goals for other upper-level courses. (See CU's Physics Learning Goals web page for more details.)

The Boulder faculty then developed a diagnostic test to assess student progress towards the goals [3]. This test was open-ended, but could be scored relatively easily. Faculty felt that the questions were meaningful and, thus, valued the results. As with the FCI, test results could be summarized to compare outcomes when the course was taught in different ways, as well as to report on student learning as part of program review. Further, the eight broad goals facilitated discussion among faculty about upper-level instruction, promoted program development, and enabled the department to track student progress through its program.

Similar department-level improvements were achieved by the physics department at the University of California, Merced. They developed a set of five broad program learning objectives (e.g., mathematical expertise) that are assessed for each student throughout the undergraduate physics program. In keeping with Figure 1, as much of the data as possible is collected within individual courses (e.g., as part of a final exam) and summarized for program improvement, then summarized further for program review. Scoring tools (i.e., rubrics) guide this process of summarizing student performance on each objective, based on predetermined criteria.
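For departments that keep rubric scores in a spreadsheet or script, the two levels of summary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Merced department's actual procedure: the 0-4 rubric scale, the proficiency cutoff of 3, the course and student labels, the "experimental skills" objective, and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rubric scores on an assumed 0-4 scale, collected inside
# individual courses (e.g., from a final exam question).
# Each record is (course, student, objective, score); all values are invented.
RECORDS = [
    ("Course A", "s1", "mathematical expertise", 3),
    ("Course A", "s2", "mathematical expertise", 2),
    ("Course B", "s1", "mathematical expertise", 4),
    ("Course B", "s2", "mathematical expertise", 3),
    ("Course B", "s1", "experimental skills", 2),
    ("Course B", "s2", "experimental skills", 4),
]


def summarize_for_improvement(records):
    """Program-improvement view: mean rubric score per (course, objective)."""
    buckets = defaultdict(list)
    for course, _student, objective, score in records:
        buckets[(course, objective)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def summarize_for_review(records, proficient=3):
    """Program-review view: fraction of scores at or above the cutoff, per objective."""
    buckets = defaultdict(list)
    for _course, _student, objective, score in records:
        buckets[objective].append(score >= proficient)
    return {objective: sum(flags) / len(flags) for objective, flags in buckets.items()}


if __name__ == "__main__":
    for key, value in summarize_for_improvement(RECORDS).items():
        print(key, value)
    for objective, fraction in summarize_for_review(RECORDS).items():
        print(objective, fraction)
```

The point of the design is that both summaries are computed from the same course-level records, so no one has to collect separate data for program review.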

Start from where you are
It is neither necessary nor advisable to immediately conduct a comprehensive overhaul of your assessment processes. Instead, I suggest that you eliminate wasted work by aligning some of the data collection that you are already doing to serve the multiple assessment processes shown in Figure 1. Accomplish the easy alignments (e.g., more use of standardized measures of content understanding, such as the FCI) and then tackle the harder areas in which tools are less well developed. Physics has been a leader in the development of innovative teaching strategies. We now have the opportunity to be leaders in working with the assessment movement to redirect currently wasted energy towards goals that we all value: improving education for students in our courses and programs.

Charles Henderson is a professor at Western Michigan University with a joint appointment between the Department of Physics and the Mallinson Institute for Science Education. He is the Senior Editor of the journal Physical Review Special Topics – Physics Education Research.

  1. Henderson, C., Turpen, C., Dancy, M., & Chapman, T. (2014). Assessment of teaching effectiveness: Lack of alignment between instructors, institutions, and research recommendations. Physical Review Special Topics - Physics Education Research, 10(1), 010106.
  2. Chasteen, S. V., Perkins, K. K., Beale, P. D., Pollock, S. J., & Wieman, C. E. (2011). A Thoughtful Approach to Instruction: Course Transformation for the Rest of Us. Journal of College Science Teaching, 40(4), 70–76.
  3. Chasteen, S. V., Pepper, R. E., Caballero, M. D., Pollock, S. J., & Perkins, K. K. (2012). Colorado Upper-Division Electrostatics diagnostic: A conceptual assessment for the junior level. Physical Review Special Topics - Physics Education Research, 8(2), 020108. doi:10.1103/PhysRevSTPER.8.020108

APS encourages the redistribution of the materials included in this newspaper provided that attribution to the source is noted and the materials are not truncated or changed.

Editor: David Voss
Staff Science Writer: Michael Lucibella
Art Director and Special Publications Manager: Kerry G. Johnson
Publication Designer and Production: Nancy Bennett-Karasik

October 2014 (Volume 23, Number 9)
