The Physics Curriculum Needs More Data Science, and One Team Is Making It Easier Than Ever to Integrate It
With support from the APS Innovation Fund, the DSECOP team is getting data science into more undergraduate physics classrooms.
Most physics professors today agree that their students should learn coding and computational thinking — but what about more specialized skills, like those in data science?
The movement to incorporate data science into the undergraduate physics curriculum is gaining momentum, bolstered by a 2021 APS Innovation Fund award to a team who met through the APS Topical Group on Data Science (GDS), launched in 2020. The group — Alexis Knaub of the American Association of Physics Teachers, Marilena Longobardi of the University of Basel in Switzerland, William Ratcliff of NIST and the University of Maryland, and Wolfgang Losert of the University of Maryland — saw the Innovation Fund as a chance to respond to a call for help: Physics faculty across the country were asking GDS members to recommend data science textbooks or resources for use in undergraduate classrooms.
“This was something that just needed to be done,” says Ratcliff. “If not us, who?”
After receiving a $200,000 Innovation Fund award for the project, which they called the Data Science Education Community of Practice (DSECOP), the team moved quickly. They hired graduate students and postdoctoral researchers as project fellows to develop small data science teaching modules that faculty at any college could easily add to their physics courses. Then, through the GDS, they reached out to colleagues across the country and asked for help piloting the modules with current physics majors.
In addition to the co-PIs, Mohammad Soltanieh-ha of Boston University coordinates the fellows’ activities, Jacob Hale of DePauw University reviews the fellows’ output, and Anil Zenginoglu of the University of Maryland helps manage the community the project has built.
Last June, the DSECOP team hosted a multi-day workshop on the University of Maryland campus, where about 30 graduate students and faculty tinkered with and refined the new modules. Fellow Julie Butler, a doctoral candidate in physics at Michigan State University, says the workshop was a great opportunity to network with other data science-minded people.
So far, Butler has developed two modules for DSECOP, including one on using neural networks, a type of machine learning algorithm. “I show them how to build a neural network from scratch,” she says. The students compare their results with those of a hand calculation and a numerical differential equation solver. Beyond just applying a neural network, Butler aims to help students learn how to analyze a problem to determine whether a machine learning approach would be an appropriate solution, and if so, what type.
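To give a sense of what such an exercise can look like, here is a minimal sketch in Python. It is not taken from the DSECOP modules. The decay equation dy/dt = -k y, the network size, the learning rate, and every function name are assumptions chosen for illustration. The sketch builds a one-hidden-layer network with NumPy, trains it by plain gradient descent, and compares it with the hand-calculated solution and a forward Euler solver.

```python
import numpy as np


def analytic(t, y0=1.0, k=1.0):
    """Hand-calculated solution of dy/dt = -k*y (illustrative problem)."""
    return y0 * np.exp(-k * t)


def euler(t, y0=1.0, k=1.0):
    """Forward Euler solver for the same equation on the grid t."""
    y = np.empty_like(t)
    y[0] = y0
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        y[i] = y[i - 1] + dt * (-k * y[i - 1])
    return y


def train_network(t, y, hidden=16, lr=0.02, epochs=5000, seed=0):
    """Fit y(t) with a one-hidden-layer tanh network written from scratch.

    Hyperparameters are illustrative guesses, not tuned values.
    Returns a function that evaluates the trained network.
    """
    rng = np.random.default_rng(seed)
    x = t.reshape(-1, 1)
    target = y.reshape(-1, 1)
    n = len(x)

    # Randomly initialized weights, zero biases.
    W1 = rng.normal(0.0, 1.0, (1, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
    b2 = np.zeros(1)

    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(x @ W1 + b1)
        pred = h @ W2 + b2

        # Backward pass: gradients of the mean squared error.
        g_pred = 2.0 * (pred - target) / n
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h**2)
        g_W1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)

        # Gradient descent update.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(t_new):
        x_new = np.asarray(t_new, dtype=float).reshape(-1, 1)
        return (np.tanh(x_new @ W1 + b1) @ W2 + b2).ravel()

    return predict


t = np.linspace(0.0, 2.0, 50)
y_exact = analytic(t)
y_euler = euler(t)
y_net = train_network(t, y_exact)(t)

print("max Euler error:  ", np.max(np.abs(y_euler - y_exact)))
print("max network error:", np.max(np.abs(y_net - y_exact)))
```

For a problem this simple, the hand calculation and the Euler solver are far cheaper than training a network, which is the kind of judgment about when machine learning is the right tool that Butler describes.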
In the classroom, “it’s been pretty well received,” says Butler, who has gotten student feedback to improve the module before it’s distributed more widely.
Data science is a comparatively young field, largely enabled by advances in computing power. “I graduated from undergrad in 2006 with a degree in physics, and we weren’t taught data science then,” says Knaub.
But today, physics graduates are taking jobs in industries that are thirsty for opportunities to streamline decision-making processes and innovate quickly. Data science offers a robust way to do this. In industry, “they’re using machine learning as one of the tools in their research toolbox to accelerate the pace of their science,” says Ratcliff.
Despite the demand for physicists with data science skills, the physics community has not responded quickly, says Ratcliff. “Whenever somebody wants to develop new content [for a course], this takes time,” he says.
Knaub adds that “one of the tensions we’re facing with any curricular change is, if we add stuff, does that mean we’re taking things out?” And adding data science content can be a daunting challenge at smaller schools, says Ratcliff, where a department might want to overhaul its curriculum but lacks a data science expert in its ranks.
The DSECOP team has designed the data science modules to fill this gap. Each module that Butler has built, for example, consists of only three Python notebooks and can be completed in just one class period. The whole DSECOP project uses Python because “it’s one of the easier languages to learn,” says Butler, which should lower the barrier for faculty adopters.
With the project’s Innovation Fund support ending later this year, the DSECOP team is planning its next steps. “We need to seek additional funding to start really working on the deployment of this material,” says Ratcliff. “It does us no good, even if it’s tested material, if it just sits on GitHub.” He says that with additional funding, the team will also be able to accept contributions directly from the physics community.
Ratcliff expects that APS and AAPT meetings, as well as future DSECOP workshops, will play a critical role in getting the DSECOP modules into the hands of physics instructors across the country. Connecting with other faculty “at least gives us the chance to influence or to educate those who are going to be most passionate that these resources exist,” he says. “And that’s often how you start movements.”
Liz Boatman is a science writer based in Minnesota.