ChatGPT’s errors may also have been a contributing factor. The chatbot answered the math problems correctly only half of the time. Its arithmetic computations were wrong 8% of the time, but the bigger problem was that its step-by-step approach to solving a problem was wrong 42% of the time. The tutoring version of ChatGPT was fed the correct solutions directly, which minimized these errors.
A draft paper about the experiment was posted on the website of SSRN, formerly known as the Social Science Research Network, in July 2024. The paper has not yet been published in a peer-reviewed journal and could still be revised.
This is just one experiment in another country, and more studies will be needed to confirm its findings. But this experiment was a large one, involving nearly a thousand students in grades nine through 11 during the fall of 2023. Teachers first reviewed a previously taught lesson with the whole classroom, and then their classrooms were randomly assigned to practice the math in one of three ways: with access to ChatGPT, with access to an AI tutor powered by ChatGPT or with no high-tech aids at all. Students in each grade were assigned the same practice problems, whether or not they had access to AI. Afterwards, they took a test to see how well they had learned the concept. Researchers conducted four cycles of this, giving students four 90-minute practice sessions on four different math topics, to understand whether AI tends to help, harm or do nothing.
ChatGPT also seems to produce overconfidence. In surveys that accompanied the experiment, students said they did not think that ChatGPT caused them to learn less even though it had. Students with the AI tutor thought they had done significantly better on the test even though they had not. (It’s another good reminder to all of us that our perceptions of how much we’ve learned are often wrong.)
The authors likened the problem of learning with ChatGPT to autopilot. They recounted how an overreliance on autopilot led the Federal Aviation Administration to recommend that pilots minimize their use of this technology. Regulators wanted to make sure that pilots still know how to fly when autopilot fails to function correctly.
ChatGPT is not the first technology to present a tradeoff in education. Typewriters and computers reduce the need for handwriting. Calculators reduce the need for arithmetic. When students have access to ChatGPT, they might answer more problems correctly, but learn less. Getting the right result to one problem won’t help them with the next one.
This story about using ChatGPT to practice math was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.