Tracking Claude Sonnet 4’s Answers Drift when Explaining Beginner SQL Concepts
DOI:
https://doi.org/10.13021/jssr2025.5167
Abstract
The recent rise of AI assistants has substantially changed education. Part of what makes AI so effective is its ability to monitor itself and adapt, but this strength also introduces a flaw. Although large language models (LLMs) have proven to be beneficial tools, there is still little research on the implications of AI drift for students' learning. AI drift occurs when an LLM's behavior or performance changes over time. This project focuses on quantifying an LLM's drift in its responses to beginner SQL queries. A previous study by Chen et al. (2024) found that ChatGPT drifted over time. We extend the methodology of that longitudinal study to a new test case, beginner SQL queries, first using a screening step to select the LLM. A list of six candidate LLMs was created using the criteria that they be conversational, able to read text from images, free, and easily accessible: ChatGPT 4o, DeepSeek, Claude Sonnet 4, Gemini, Meta Llama 3.1, and Copilot. Each LLM was then asked 6 questions, and its responses were graded by 4 high schoolers with limited SQL knowledge using a rubric with four categories: accuracy, clarity, wordiness, and pedagogical value. Claude Sonnet 4 received the highest-scoring responses (18.25/20), so it was used to conduct the study. Claude was presented with the same 10 questions every 3 days for 2 weeks. Although the test duration was limited, there was a very slight (1.78%) increase in response quality across all categories. However, the AI assistant made the same errors in each trial, indicating a possible lack of change in problem-solving ability despite the improved understandability of its answers. Future work would involve a longer time frame and more questions to further confirm these preliminary results.
Chen, L., Zaharia, M., & Zou, J. (2024). How Is ChatGPT’s Behavior Changing Over Time? Harvard Data Science Review, 6(2). https://doi.org/10.1162/99608f92.5317da47
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.