Multi-agent virtual student simulation method and system for teacher Q&A training
A multi-agent virtual student simulation method uses a control agent and functional agents to simulate a student's problem-solving process. This addresses the problem that large language models cannot realistically reproduce students' errors and cognitive evolution, and thereby improves the authenticity and effectiveness of teachers' Q&A training.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: HUAZHONG NORMAL UNIV
- Filing Date: 2025-09-18
- Publication Date: 2026-06-30
Description
Technical Field
[0001] This invention relates to the field of intelligent education technology, and in particular to a multi-agent virtual student simulation method and system for teacher Q&A training.
Background Art
[0002] The ability to answer questions is one of the core competencies of teachers in the teaching process, especially crucial in mathematics and science subjects where logical chains are long and the thinking is complex. In mathematics and science teaching scenarios, student questions typically exhibit three main characteristics: ambiguity, leaps in thought, and deviations. Teachers need to provide precise intervention to address students' broken reasoning chains and confused concepts. However, teacher trainees and young teachers lack frequently reproducible question-answering training scenarios that closely resemble real student learning situations, thus limiting their development in teaching practice. Therefore, constructing a question-answering training environment that supports repeated practice by teachers and closely reflects real student behavior (i.e., constructing virtual students) is a crucial issue that urgently needs to be addressed in the current development of teacher education.
[0003] With the rapid development of Large Language Models (LLMs), their role-playing and complex reasoning abilities have been explored. Many academic fields have attempted to use LLMs to simulate typical individuals to support professional training. For example, in psychology, LLMs are used to simulate patients with mental disorders, thereby improving the clinical interviewing abilities of counselors; in education, LLMs are used to simulate teachers, providing students with answers to their questions anytime, anywhere, thus enabling personalized guidance. However, due to the complexity of student behavior and technological limitations, controllable modeling of the "student role" has not been addressed, making it impossible to provide a reverse training environment for teachers.
[0004] Existing research has attempted to use large language models to simulate students, but it primarily relies on single-round or multi-round prompts that directly drive the model to play the role of a "poor student," "average student," or "high-achieving student." A typical approach simply assigns a persona in the prompt and then has the LLM generate an answer to a given exercise. For example, if a teacher has a large language model play the role of a poor student in order to practice answering that student's questions, the prompt could be: "I am a math teacher, and I want you to play my student. You are a student with poor math skills, have some cognitive biases, and are occasionally careless. I will directly guide you, and in the process, I will practice my teaching skills. Please answer the following question: A truck needs to transport a batch of goods to a city. It must pass through two checkpoints along the way, each of which will detain 20% of the goods on the truck (the detained goods will not be returned). If the truck finally arrives in the city with 96 tons of goods remaining, then: How many tons of goods did the truck initially carry? Note that your answer should be consistent with your knowledge level (poor student)."
[0005] It is important to note that, when given such prompts, large language models often output the standard answer directly; or, despite feigning confusion, still produce a perfectly accurate response; or, even when they occasionally give an incorrect answer, reach the correct one immediately after a single hint from the teacher, so that the Q&A session ends too quickly. Struggling students, by contrast, make various mistakes during problem-solving, such as misreading the question, calculation errors, or careless slips like misreading numbers. Large language models do not naturally exhibit these behaviors, so they are difficult to simulate with prompts alone. Furthermore, when students genuinely make mistakes, teachers guide them step by step toward understanding the problem. This is a gradual shift from not knowing to knowing, not a rushed Q&A session.
[0006] It is evident that this method of using prompt-driven models to simulate student-teacher interactions at specific ability levels has significant shortcomings. First, it struggles to realistically reproduce typical student errors, especially when simulating lower-ability students, and fails to capture factual errors and logical fallacies. Second, the interaction with the teacher lacks a cognitive evolutionary process from "not understanding" to "gradual mastery." Existing large language models can quickly provide correct answers after receiving only a few hints from the teacher, resulting in stiff interactions and a lack of training value. This poses a significant challenge for prompt-driven large language models when simulating different levels of "not yet mastered" states, making it impossible to provide teachers with truly effective training in answering questions.
Summary of the Invention
[0007] To address the aforementioned problems, this invention provides a multi-agent virtual student simulation method and system for teacher Q&A training. It simulates the question-answering and thinking processes of students at different ability levels in terms of knowledge representation, creating a realistic teaching scenario. This addresses the technical problem that current large language models cannot realistically reproduce students' errors in problem-solving and the gradual evolution of their cognition. Through highly realistic interactive dialogue with teachers, it helps teachers repeatedly train their teaching strategies and Q&A skills, thereby improving their practical teaching abilities.
[0008] In a first aspect, the present invention provides a multi-agent virtual student simulation method for teacher Q&A training, the method comprising:
[0009] S100: The controller agent analyzes the historical dialogue between the virtual student and the teacher, dynamically infers the knowledge state of the virtual student at the current moment, and generates the expected response state;
[0010] S200: A preset error pattern library is constructed and the teacher's current-round utterance is received; based on the expected response state and the current-round utterance, the controller agent determines whether to inject errors from the error pattern library into the virtual student's current-round response, and determines the expected error result;
[0011] S300: The controller agent generates an ordered scheduling result for multiple functional agents based on the expected response state and the expected error result;
[0012] S400: The multiple functional agents are scheduled to work according to the ordered scheduling result; the controller agent aggregates the responses of the multiple functional agents to generate the virtual student's current-round response and feeds it back to the teacher, while updating the historical dialogue.
[0013] Furthermore, the expected response state is:
[0014] $$S_t = F(S_{t-1}, l, H);$$
[0015] where $S_t$ and $S_{t-1}$ denote the expected response states of the virtual student in the $t$-th and $(t-1)$-th rounds of dialogue; $F$ denotes the function that evaluates changes in the virtual student's response state, executed by the controller agent; $l$ denotes the virtual student's ability level; $H = \{(r_i, u_i)\}$ denotes the historical dialogue, where $r_i$ and $u_i$ denote the virtual student's response and the teacher's utterance in the $i$-th round of dialogue.
[0016] Furthermore, methods for constructing the preset error pattern library include an expert experience method and/or an instantiation extension method, wherein:
[0017] The expert experience method is as follows: by utilizing relevant literature in the current subject education field, combined with teachers' teaching experience and the actual answers of students, typical error types are summarized.
[0018] The instantiation extension method is as follows: based on the typical error types, each type of typical error is further subdivided into fine-grained categories, and corresponding error solution examples are constructed for each type of error.
[0019] Furthermore, the expected error result is:
[0020] $$\varepsilon_t = M(E, u_t, S_t);$$
[0021] where $\varepsilon_t$ denotes the expected error result set by the controller agent for the virtual student in the $t$-th round; $M$ denotes the error measure function; $E$ denotes the preset error pattern library; $u_t$ denotes the teacher's utterance in the $t$-th round of dialogue; $S_t$ denotes the virtual student's response state in the $t$-th round of dialogue.
[0022] Furthermore, the multiple functional agents include at least:
[0023] A reading agent, used to understand the content of the question and identify key information and the conditions given in the question;
[0024] A thinking agent, used to design problem-solving strategies and generate potential solutions;
[0025] A computational agent, used to perform numerical calculations and generate the final answer;
[0026] A checking agent, used to reflect on and verify the problem-solving process and the final answer.
[0027] Furthermore, the ordered scheduling result is:
[0028] $$P_t = [(A_1, e_1), (A_2, e_2), \dots, (A_n, e_n)], \quad A_i \in \{A_R, A_T, A_S, A_C\}, \quad e_i \subseteq \varepsilon_t;$$
[0029] where $A_i$ and $e_i$ denote the $i$-th scheduled functional agent and the errors that this functional agent is required to exhibit; $A_R$ denotes the reading agent, $A_T$ the thinking agent, $A_S$ the computational agent, and $A_C$ the checking agent; $\varepsilon_t$ denotes the expected error result; $n$ denotes the number of functional agents scheduled on demand.
[0030] Furthermore, scheduling the work of multiple functional agents includes:
[0031] when the functional agent $A_i$ is scheduled, $o_i = A_i(e_i)$;
[0032] where $A_i(\cdot)$ denotes the functional agent $A_i$ acting as the executor, $e_i$ denotes the errors it is required to exhibit, and $o_i$ denotes the response generated by functional agent $A_i$.
[0033] Furthermore, the controller agent aggregates the responses of the multiple functional agents to generate the virtual student's current-round response $r_t$ as:
[0034] $$r_t = G(o_1, o_2, \dots, o_n);$$
[0035] where $o_1, \dots, o_n$ denote the responses of the scheduled functional agents and $G$ denotes the function that aggregates them.
[0036] Furthermore, the aggregation is as follows: the controller agent uses prompts to concatenate the sub-responses, summarize them, or resolve conflicts among them, so as to form a natural and coherent student utterance.
[0037] In a second aspect, the present invention provides a multi-agent virtual student simulation system for teacher Q&A training, comprising a memory, a processor, and a computer program stored in the memory, wherein the processor executes the computer program to implement the steps of any of the methods described above.
[0038] In summary, this invention provides a multi-agent virtual student simulation method and system for teacher Q&A training. Compared with existing technologies, the technical solution conceived in this invention can achieve the following beneficial effects:
[0039] This invention constructs a virtual student simulation engine comprising a controller agent and multiple functional agents, decoupling the student's problem-solving process into four observable and intervenable sub-stages: reading, thinking, calculating, and checking. It also injects typical, pedagogically relevant errors as needed, with the error type and severity dynamically adjusted by the controller agent based on the historical dialogue. This allows the simulation of the typical errors and cognitive biases exhibited by students of different ability levels during problem-solving, and shows a natural cognitive evolution over multiple rounds of interaction with the teacher. Within a large language model system, this invention enables "trackable cognitive states, programmable error behavior, and controllable gradual processes." It avoids the "cognitive rigidity and instant correctness" found in traditional prompt-based solutions and improves overall dialogue quality. This solves the technical problem that current large language models cannot realistically reproduce students' errors and the gradual evolution of cognition during problem-solving, provides a new path for training and evaluating teachers' question-answering abilities, and expands the application boundaries of large language models in teacher education.
Brief Description of the Drawings
[0040] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0041] Figure 1 is a schematic diagram of the method steps of a multi-agent virtual student simulation method and system for teacher Q&A training provided by the present invention;
[0042] Figure 2 is a schematic diagram illustrating the principle of a multi-agent virtual student simulation method and system for teacher Q&A training provided by the present invention.
Detailed Description of the Embodiments
[0043] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings and embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.
[0044] It should be noted that, in the description of the embodiments of the present invention, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a method, step, or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the method, step, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of additional identical elements in the method, step, or apparatus that includes the element.
[0045] To address the technical problem that current large language models cannot realistically reproduce students' errors in problem-solving and the gradual evolution of their cognition, this invention provides a multi-agent virtual student simulation method and system for teacher Q&A training. In terms of knowledge representation, it simulates the problem-solving and thinking process of students at different ability levels, creating a realistic teaching scenario. This allows for highly realistic interactive dialogue with teachers, helping them to repeatedly train their teaching strategies and Q&A skills, and improve their practical teaching abilities.
[0046] It should be noted that the Student Simulation Engine (SSE) proposed in this invention is based on multi-agent collaboration driven by a large language model, aiming to generate natural language responses with multi-layered human student characteristics. SSE divides the student's problem-solving process into four sub-steps: reading the question, thinking, solving the question, and checking. By analyzing historical dialogues, it dynamically infers the cognitive level that the virtual student should demonstrate in the Q&A interaction and the possible human-like errors.
[0047] In the task of constructing virtual students to enhance teachers' Q&A skills, a Q&A task can be formally defined as a two-tuple $(q, l)$, where $q$ is a specific problem and $l$ is the virtual student's ability level. During the Q&A session, the virtual student and the teacher conduct $T$ rounds of dialogue around the problem, denoted as $H = \{(r_1, u_1), (r_2, u_2), \dots, (r_T, u_T)\}$, where $r_i$ and $u_i$ denote the virtual student's response and the teacher's utterance in the $i$-th round of dialogue. In this invention, the virtual student is simulated by a multi-agent system based on a large language model and can interact with the teacher over multiple rounds around the problem, so as to train the teacher's Q&A ability and improve the overall quality of the Q&A.
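As an illustration only, the task formalization above can be held in two small data structures. This is a minimal sketch: the class names `Turn` and `QATask`, their field names, and the ability-level label "weak" are assumptions introduced here and do not appear in the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    """One dialogue round: the virtual student's response and the teacher's utterance."""
    student_response: str
    teacher_utterance: str


@dataclass
class QATask:
    """A Q&A task: a specific problem plus the virtual student's ability level."""
    problem: str
    ability_level: str  # hypothetical label, e.g. "weak"
    history: List[Turn] = field(default_factory=list)

    def add_turn(self, student_response: str, teacher_utterance: str) -> None:
        # Append one completed round to the historical dialogue.
        self.history.append(Turn(student_response, teacher_utterance))


task = QATask(problem="How many tons did the truck carry when it set off?",
              ability_level="weak")
task.add_turn("I think 40% was detained in total.",
              "Is the second 20% taken from the original load?")
```

Keeping the history on the task object means every later step (state inference, error selection, scheduling) can read the same record of rounds.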
[0048] Specifically, as shown in Figure 1, the method includes:
[0049] S100: The control agent analyzes the historical dialogue between the virtual student and the teacher, dynamically infers the knowledge state of the virtual student at the current moment, and generates the expected response state.
[0050] The controller agent analyzes the dialogue context, tracks the virtual student's knowledge state, and plans the type and degree of error to be exposed in each sub-step, so that the virtual student's performance matches its ability setting while the dialogue remains natural and cognitively progressive. By performing semantic analysis on the historical dialogue and dynamically inferring the knowledge state through the controller agent, the virtual student can track its own boundary between what it knows and what it does not know in real time. This solves the role collapse caused by "cognitive rigidity" in prompt-only schemes and improves role fit and dialogue realism.
[0051] The main purpose of analyzing the dialogue history is to determine the appropriate response state of the virtual student at the current moment, thereby deciding how the virtual student should react to the teacher's questions. For example, it determines whether the virtual student remains confused about "not yet mastered" or demonstrates an epiphany of "understanding."
[0052] As an example, the expected response state is:
[0053] $$S_t = F(S_{t-1}, l, H);$$
[0054] where $S_t$ and $S_{t-1}$ denote the expected response states of the virtual student in the $t$-th and $(t-1)$-th rounds of dialogue; $F$ denotes the function that evaluates changes in the virtual student's response state, executed by the controller agent; $l$ denotes the virtual student's ability level; $H = \{(r_i, u_i)\}$ denotes the historical dialogue, where $r_i$ and $u_i$ denote the virtual student's response and the teacher's utterance in the $i$-th round of dialogue.
[0055] It should be noted that the rule for "evaluating the changes in the virtual student's response state" can be defined in the form of prompts and executed by the controller agent. For example, in-context instructions written in Chinese can be used; these are compatible with any open-source large language model (such as QwQ-32B, DeepSeek-R1, etc.) and can be executed zero-shot at runtime without additional fine-tuning. This is a relatively mature existing technology and will not be elaborated on here.
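A prompt-defined state evaluation of this kind might be sketched as follows. The state labels, the prompt wording, and the `llm` callable are all assumptions made for illustration; the source does not specify them, and the stub below stands in for a real model call.

```python
from typing import Callable, List, Tuple

# Hypothetical state labels; the source does not enumerate them.
STATES = ("confused", "partially_understood", "mastered")


def update_state(prev_state: str, ability_level: str,
                 history: List[Tuple[str, str]],
                 llm: Callable[[str], str]) -> str:
    """Ask the controller model for the next expected response state."""
    transcript = "\n".join(f"Student: {r}\nTeacher: {u}" for r, u in history)
    prompt = (
        f"Ability level: {ability_level}\n"
        f"Previous state: {prev_state}\n"
        f"Dialogue so far:\n{transcript}\n"
        f"Answer with exactly one of: {', '.join(STATES)}."
    )
    answer = llm(prompt).strip().lower()
    # Fall back to the previous state if the model replies off-format.
    return answer if answer in STATES else prev_state


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; always reports continued confusion."""
    return "Confused\n"


history = [("20% + 20% = 40%, so 60% is left.",
            "Is the second 20% taken from the full load?")]
state = update_state("confused", "weak", history, stub_llm)
```

Passing the model in as a callable keeps the sketch independent of any particular inference API, and the fallback avoids an abrupt state change when the reply cannot be parsed.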
[0056] S200: A preset error pattern library is constructed and the teacher's current-round utterance is received; based on the expected response state and the current-round utterance, the controller agent determines whether to inject errors from the error pattern library into the virtual student's current-round response, and determines the expected error result.
[0057] Given that different subjects, grade levels, and knowledge topics have their own characteristics, the common mistakes made by virtual students also vary. Existing research mostly focuses on the macro-level classification of error types and lacks detailed descriptions for specific subject tasks, which to some extent limits the realism of virtual student simulations.
[0058] To compensate for this deficiency, as an example, the method for constructing the preset error pattern library according to the present invention includes an expert experience method and/or an instantiation extension method, wherein:
[0059] The expert experience method is as follows: typical error types are summarized from relevant literature in the education field of the current subject, combined with teachers' teaching experience and students' actual answers;
[0060] The instantiation extension method is as follows: based on typical error types, each typical error is further subdivided into fine-grained categories, and corresponding error solution examples are constructed for each type of error to enhance the operability and coverage of error patterns.
[0061] This invention combines the expert experience method and the instantiation extension method to form the preset error pattern library that supports the SSE. The library is provided to the controller agent as a reference in the form of prompts. Based on the student's expected response state in the current dialogue and the teacher's questions or feedback, the controller agent then determines whether to inject an error and, if so, which type and severity of error to select.
[0062] It should be noted that "in the form of prompts" here does not mean stuffing the entire preset error pattern library into the prompt. Instead, at runtime, a template of "dynamic sampling + structured description + few examples" is used to assemble, on the fly, a natural language instruction from a small batch of typical errors that may be used at that moment, which the controller agent then invokes zero-shot.
[0063] Specifically, the process can be as follows. First, a table is created along four dimensions: subject, knowledge point, ability level, and error result. Each record includes information such as the error name, typical manifestation, triggering keywords, applicable ability level, and teachability score. Then, when the controller agent queries the table, it uses dual recall (keyword matching plus vector similarity) to select the top-k (k≤5) most relevant error records from the preset error pattern library, forming a temporary sub-library E′. Next, the temporary sub-library E′ is filled into the prompt template to obtain a prompt, which is sent to the large language model.
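The recall-then-fill procedure just described can be sketched as below. This is an illustrative approximation, not the patented implementation: a bag-of-words cosine stands in for a real embedding similarity, the scoring rule (keyword hits plus cosine) is an assumption, and `ErrorRecord`, `recall`, `build_prompt`, and the template wording are hypothetical names.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class ErrorRecord:
    name: str
    manifestation: str
    keywords: List[str]
    ability_levels: List[str]
    teachability: float


def _bow(text: str) -> Counter:
    # Bag-of-words vector; a stand-in for a learned embedding.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recall(query: str, level: str, library: List[ErrorRecord],
           k: int = 5) -> List[ErrorRecord]:
    """Select up to k (capped at 5) records by keyword hits plus similarity."""
    k = min(k, 5)
    q_vec = _bow(query)
    scored = []
    for rec in library:
        if level not in rec.ability_levels:
            continue
        hits = sum(1 for w in rec.keywords if w.lower() in query.lower())
        sim = _cosine(q_vec, _bow(rec.manifestation))
        if hits or sim > 0:
            scored.append((hits + sim, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:k]]


TEMPLATE = "Candidate errors for this turn:\n{items}\nChoose at most one, or none."


def build_prompt(records: List[ErrorRecord]) -> str:
    """Fill the temporary sub-library into the prompt template."""
    items = "\n".join(f"- {r.name}: {r.manifestation}" for r in records)
    return TEMPLATE.format(items=items)


library = [
    ErrorRecord("percent accumulation",
                "adds two successive 20% deductions into 40%",
                ["20%", "checkpoint"], ["weak"], 0.9),
    ErrorRecord("sign error",
                "drops a minus sign when moving terms",
                ["negative"], ["weak"], 0.6),
]
sub_library = recall("each checkpoint detains 20% of the goods", "weak", library)
prompt = build_prompt(sub_library)
```

Only the small sub-library reaches the prompt, which matches the stated intent of not stuffing the whole library into the model's context.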
[0064] As an example, the expected error result is:
[0065] $$\varepsilon_t = M(E, u_t, S_t);$$
[0066] where $\varepsilon_t$ denotes the expected error result set by the controller agent for the virtual student in the $t$-th round; $M$ denotes the error measure function; $E$ denotes the preset error pattern library; $u_t$ denotes the teacher's utterance in the $t$-th round of dialogue; $S_t$ denotes the virtual student's response state in the $t$-th round of dialogue.
[0067] In other words, if the controller agent determines that the student has fully understood the knowledge point, it decides that no error from the error pattern library needs to be injected into the virtual student's current-round response. If the controller agent determines that the student has not fully understood the knowledge point, it decides that an error from the error pattern library needs to be injected into the virtual student's current-round response.
[0068] It should be noted that the rules for the "error measure" can be defined in the form of prompts and executed by the controller agent. For example, in-context instructions written in Chinese can be used; these are compatible with any open-source large language model (such as QwQ-32B, DeepSeek-R1, etc.) and can be executed zero-shot at runtime without additional fine-tuning. This is a relatively mature existing technology and will not be elaborated on here.
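The inject-or-not decision can be sketched as a thin wrapper around whatever selection rule the controller agent applies. The function names, the "mastered" label, and the `choose` callable are assumptions for illustration; in the described system the selection itself would be carried out by the model via prompts.

```python
from typing import Callable, List


def expected_errors(state: str, teacher_utterance: str, candidates: List[str],
                    choose: Callable[[str, str, List[str]], List[str]]) -> List[str]:
    """Return the errors to inject this round, or an empty list for none."""
    if state == "mastered":
        # Fully understood: nothing from the library is injected.
        return []
    chosen = choose(state, teacher_utterance, candidates)
    # Keep only errors that really come from the candidate set, so the
    # selector cannot introduce errors outside the preset library.
    return [e for e in chosen if e in candidates]


def pick_first(state: str, teacher_utterance: str, candidates: List[str]) -> List[str]:
    """Stand-in selector; a real system would query the controller model."""
    return candidates[:1]


candidates = ["adds the two 20% deductions into 40%",
              "misses the number of checkpoints"]
errors_now = expected_errors("confused", "Try the problem again.",
                             candidates, pick_first)
errors_later = expected_errors("mastered", "Try the problem again.",
                               candidates, pick_first)
```

Filtering the selector's output against the candidates reflects the stated goal of avoiding randomly fabricated errors.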
[0069] S300: The controller agent generates an ordered scheduling result for multiple functional agents based on the expected response state and the expected error result.
[0070] To achieve realistic simulation of students at different ability levels, the responses generated by SSE need to appropriately incorporate human-like errors and be gradually corrected under the guidance of teachers. Therefore, inspired by the IDEAL problem-solving theory, this invention divides the question-answering task into problem identification, definition, exploration, action, and reflection, emphasizing the entire process from problem identification to result reflection. During the question-answering process, the virtual student's reaction status is tracked in real time, and the student's answering process is broken down into multiple sub-tasks, each executed by a different agent. Simultaneously, human-like errors are strategically introduced in several key stages such as reading the question, thinking, calculating, and checking to enhance the rationality and realism of the virtual student.
[0071] Multiple functional agents correspond to the four stages of a virtual student's problem-solving process: reading, thinking, calculation, and checking. Each functional agent completes its own task under the scheduling of the controller agent and injects appropriate human-like errors at designated stages, thereby achieving a realistic simulation of students with different cognitive levels.
[0072] As an example, the multiple functional agents include at least a reading agent, a thinking agent, a computational agent, and a checking agent.
[0073] The reading agent (Reader), corresponding to "identification and definition" in the IDEAL theory, is used to understand the content of the question and identify key information and the given conditions. Since the assignments and tests that students encounter in their studies are usually well structured and have clear task boundaries, the SSE merges problem identification and problem definition into a single reading-and-understanding step, performed by the Reader.
[0074] The thinking agent (Thinker), corresponding to "exploration" in the IDEAL theory, is used to design problem-solving strategies and generate potential solutions.
[0075] The computational agent (Solver), corresponding to "action" in the IDEAL theory, is used to perform numerical calculations and generate the final answer.
[0076] The checking agent (Checker), corresponding to "reflection" in the IDEAL theory, is used to reflect on and verify the problem-solving process and the final answer.
[0077] It should be noted that, in addition to the aforementioned functional intelligent agents, more functional intelligent agents can be added or subdivided.
[0078] The generation of the ordered scheduling result aims to produce a schedule for the multiple functional agents based on the expected response state and the expected error result. This invention uses the preset error pattern library and a matching mechanism against the expected response state to inject typical, teachable errors that a teacher can intervene on into the four stages of reading, thinking, calculation, and checking. This avoids random fabrication and enhances the "pedagogical value" and "error rationality" of the errors. At the same time, ordered scheduling breaks the problem-solving process down into an observable and intervenable four-step pipeline, enabling teachers to locate the stage at which a student's thinking goes wrong and provide guidance in sequence, which improves the realism of the training scenario.
[0079] Furthermore, the ordered scheduling result is:
[0080] $$P_t = [(A_1, e_1), (A_2, e_2), \dots, (A_n, e_n)], \quad A_i \in \{A_R, A_T, A_S, A_C\}, \quad e_i \subseteq \varepsilon_t;$$
[0081] where $A_i$ and $e_i$ denote the $i$-th scheduled functional agent and the errors that this functional agent is required to exhibit; $A_R$ denotes the reading agent, $A_T$ the thinking agent, $A_S$ the computational agent, and $A_C$ the checking agent; $\varepsilon_t$ denotes the expected error result; $n$ denotes the number of functional agents scheduled on demand.
[0082] It should be noted that the number of scheduled functional agents, $n$, is a non-negative integer, which means that only some of the functional agents may be scheduled rather than all of them.
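An ordered scheduling result of this shape can be represented as a list of (agent, error) pairs. The sketch below is one possible encoding under stated assumptions: the stage names, the `build_schedule` helper, and the use of `None` for "no injected error" are illustrative choices, not the source's notation.

```python
from typing import Dict, List, Optional, Set, Tuple

# Fixed pipeline order; stage names are hypothetical identifiers.
PIPELINE = ("reader", "thinker", "solver", "checker")

Schedule = List[Tuple[str, Optional[str]]]


def build_schedule(errors_by_stage: Dict[str, str], skip: Set[str]) -> Schedule:
    """Pair each scheduled agent with the error it must exhibit (None = no error)."""
    return [(stage, errors_by_stage.get(stage))
            for stage in PIPELINE
            if stage not in skip]


# A weak student: the thinker carries the error, and the checker is skipped.
schedule = build_schedule(
    errors_by_stage={"thinker": "adds the two 20% deductions into 40%"},
    skip={"checker"},
)
```

Here the schedule has three entries, so fewer agents are scheduled than exist in the pipeline, and the order of the remaining stages is preserved.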
[0083] S400: The multiple functional agents are scheduled to work according to the ordered scheduling result; the controller agent aggregates the responses of the multiple functional agents to generate the virtual student's current-round response and feeds it back to the teacher, while updating the historical dialogue.
[0084] In other words, the subordinate functional agents are scheduled to work step by step according to the ordered scheduling results generated by the controller agent; as an example, scheduling multiple functional agents to work includes:
[0085] when the functional agent $A_i$ is scheduled, $o_i = A_i(e_i)$;
[0086] where $A_i(\cdot)$ denotes the functional agent $A_i$ acting as the executor, $e_i$ denotes the errors it is required to exhibit, and $o_i$ denotes the response generated by functional agent $A_i$.
[0087] Once all responses from the functional agents to be scheduled have been generated, the controller agent summarizes and aggregates these responses to form the final response to the teacher's question.
[0088] As an example, the virtual student's generated response in this round, $r_t$, is:
[0089] $$r_t = G(o_1, o_2, \dots, o_n);$$
[0090] where $o_1, \dots, o_n$ denote the responses of the scheduled functional agents and $G$ denotes the function that aggregates them.
[0091] It should be noted that the aggregation refers to the controller agent using prompts to concatenate the sub-responses, summarize them, or resolve conflicts among them, so as to form a natural and coherent student utterance. For example, in-context instructions written in Chinese can be used; these are compatible with any open-source large language model (such as QwQ-32B, DeepSeek-R1, etc.) and can be executed zero-shot at runtime without additional fine-tuning. This is a relatively mature existing technology and will not be elaborated further here.
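The scheduling and aggregation step might look roughly like the following. This is a sketch under stated assumptions: the agent signature (question, earlier outputs, required error) is hypothetical, the stub agents return canned text instead of calling a model, and plain concatenation stands in for the prompt-based aggregation described above.

```python
from typing import Callable, Dict, List, Optional, Tuple

Agent = Callable[[str, List[str], Optional[str]], str]


def run_schedule(schedule: List[Tuple[str, Optional[str]]],
                 agents: Dict[str, Agent], question: str) -> List[str]:
    """Call each scheduled agent in order, passing along earlier outputs."""
    outputs: List[str] = []
    for name, error in schedule:
        outputs.append(agents[name](question, list(outputs), error))
    return outputs


def aggregate(outputs: List[str]) -> str:
    """Simplest aggregation: concatenate the non-empty sub-responses."""
    return " ".join(o.strip() for o in outputs if o.strip())


def reader(question: str, previous: List[str], error: Optional[str]) -> str:
    return "There are two checkpoints, each keeps 20%, and 96 tons are left."


def thinker(question: str, previous: List[str], error: Optional[str]) -> str:
    if error:
        return "Two checkpoints keep 20% + 20% = 40%, so 60% is left."
    return "Each checkpoint leaves 80%, so 0.8 * 0.8 = 64% is left."


def solver(question: str, previous: List[str], error: Optional[str]) -> str:
    # Computes correctly from whatever the thinker concluded upstream.
    if "60% is left" in previous[-1]:
        return "0.6x = 96, so x = 160 tons."
    return "0.64x = 96, so x = 150 tons."


agents = {"reader": reader, "thinker": thinker, "solver": solver}
schedule = [("reader", None), ("thinker", "adds 20% + 20%"), ("solver", None)]
outputs = run_schedule(schedule, agents, "How many tons at departure?")
reply = aggregate(outputs)
```

Because the solver only consumes the thinker's conclusion, an error injected upstream propagates to the final answer without the solver itself computing anything incorrectly.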
[0092] It should be noted that, as shown in Figure 2, the SSE comprises a control module and an execution module that work together, used respectively to track the student's state and to generate human-like behavior. Specifically, the control module is the core component of the SSE and is primarily responsible for dynamically modeling the student's knowledge state and generating response strategies that match the student's cognitive level. Its main functions include dialogue history analysis, typical error generation, and scheduling planning. The control module includes a controller agent responsible for analyzing the historical dialogue, inferring the student's current knowledge state, and determining the student's behavior in sub-stages such as reading the question, thinking, calculating, and checking, including whether errors should occur, the type of error, and its severity. The execution module, based on the control module's decisions, schedules its subordinate functional agents to complete tasks such as question reading, problem-solving strategy design, numerical calculation, and result checking, thereby generating natural language responses that conform to the virtual student's settings.
[0093] As a specific embodiment,
[0094] Task Input
[0095] The teacher sends the following to the Student Simulation Engine (SSE) via the interactive interface:
[0096] "Xiaoming, please try this problem again: A truck is transporting a batch of goods to city A. It must pass through two checkpoints along the way, and at each checkpoint, 20% of the goods on the truck will be detained (the detained goods will not be returned). If the truck finally arrives with 96 tons of goods remaining, how many tons of goods did the truck carry when it set off?"
[0097] The controller agent, running on a QwQ-32B large language model instance, executes the state-evaluation function by means of prompts. It identifies that in the first two rounds the virtual student showed a reading / comprehension bias by "adding two consecutive 20% deductions into 40%", and therefore infers that the current round should remain in the "not yet mastered" state, with the expected response state being "Confused - Partial Error". This indicates that the virtual student's current level is relatively weak: based on the analysis of the historical dialogue, this question is likely to be quite difficult for him, so errors still need to be introduced in the upcoming response.
[0098] The controller agent queries the preset error pattern library, executes the matching function, and outputs the expected error result for this round: {"Oversight in reading the question: missed the number of checkpoints"; "Confusion in understanding: directly adding the two 20% values together to get 40%"}. The error level is set to "Medium" to maintain teachability.
[0099] The controller agent generates the ordered scheduling result, which in this round specifies the following: the reading agent injects no error, so that the information in the question is fully parsed; the thinking agent injects the "adds up to 40%" error; the calculation agent performs the arithmetic normally; and the checking agent is skipped, because a "weak" student often neglects to reflect.
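The error pattern library lookup described above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the names `ErrorPattern`, `LIBRARY`, and `select_expected_errors` are hypothetical, the third library entry is invented for illustration, and in the described system the choice of categories is made by the LLM-driven controller agent rather than by a fixed filter.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class ErrorPattern:
    category: str  # coarse error type
    detail: str    # fine-grained instance of that type


# Hypothetical library; the first two entries come from the worked example.
LIBRARY: List[ErrorPattern] = [
    ErrorPattern("Oversight in reading the question",
                 "Missed the number of checkpoints"),
    ErrorPattern("Confusion in understanding",
                 "Directly adding the two 20% values together to get 40%"),
    ErrorPattern("Calculation slip",
                 "Arithmetic mistake when dividing"),  # invented entry
]


def select_expected_errors(library: List[ErrorPattern],
                           categories: Set[str],
                           severity: str) -> Dict[str, object]:
    """Return the patterns in the requested categories plus an error level."""
    chosen = [p for p in library if p.category in categories]
    return {"errors": chosen, "severity": severity}


expected = select_expected_errors(
    LIBRARY,
    {"Oversight in reading the question", "Confusion in understanding"},
    "Medium",
)
for pattern in expected["errors"]:
    print(f"{pattern.category}: {pattern.detail}")
```

Keeping the coarse category separate from the fine-grained detail mirrors the two-level organization of error types, so a controller could request a whole category or a single instance.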
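One way to picture the ordered scheduling result is as an ordered list of (functional agent, required error) pairs. The sketch below is an illustration only: `ScheduledStep` and `build_schedule` are hypothetical names, and the fixed agent order is an assumption based on the reading, thinking, calculating, and checking sub-stages.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

AGENT_ORDER = ["reading", "thinking", "calculation", "checking"]


@dataclass(frozen=True)
class ScheduledStep:
    agent: str            # which functional agent runs at this position
    error: Optional[str]  # error the agent must exhibit, or None for no error


def build_schedule(errors_by_agent: Dict[str, str],
                   skip: Set[str]) -> List[ScheduledStep]:
    """Build the ordered schedule, omitting any agents that are skipped."""
    return [ScheduledStep(name, errors_by_agent.get(name))
            for name in AGENT_ORDER if name not in skip]


# The round from the worked example: only the thinking agent injects an error,
# and the checking agent is skipped.
schedule = build_schedule(
    {"thinking": "adds two 20% deductions into 40%"},
    skip={"checking"},
)
for step in schedule:
    print(step.agent, "->", step.error or "no error")
```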
[0100] The work of multiple functional agents is scheduled based on the ordered scheduling result, specifically as follows:
[0101] Reading agent:
[0102] Input: the question stem;
[0103] Output: the key information "Two checkpoints; 20% deduction each time; 96 tons remaining." (No error is injected; the statement is correct.)
[0104] Thinking agent:
[0105] Input: ;
[0106] Under the guidance of the prompt "You are a weaker student and may mistakenly believe that two 20% deductions total 40%", the output is: "I first thought: a total of 20% + 20% = 40% was deducted, so 60% is left. Let x tons be the initial weight, then x - 0.4x = 0.6x = 96." (The target error is exhibited successfully, and the reasoning is internally consistent.)
[0107] Calculation agent:
[0108] Input: ;
[0109] The agent solves the equation algebraically and outputs: "x = 96 ÷ 0.6 = 160 tons" (the calculation itself is correct, but the result differs from the standard answer because of the error in the upstream approach).
[0110] Checking agent:
[0111] In accordance with the ordered scheduling result, this agent is not invoked and no self-checking takes place, which is consistent with the behavioral characteristics of a "weak" student.
[0112] Next, the controller agent aggregates and splices the sub-responses, and then smooths the wording by means of a prompt to generate the final virtual student response: "Okay, teacher, I'll try. Let's assume the truck loaded x tons of cargo when it departed. The first checkpoint detains 20% of the cargo, and the second checkpoint detains another 20%, which means a total of 40% of the cargo is detained. So the remaining cargo is x - 0.4x = 0.6x. When the truck finally arrives in the city, it has 96 tons of cargo left. Therefore, 0.6x = 96, and solving for x gives x = 96 ÷ 0.6 = 160 tons."
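For reference, the gap between the virtual student's answer and the sequential reading of the problem can be checked with a few lines of arithmetic. This is only a verification sketch; the variable names are illustrative, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

remaining = Fraction(96)   # tons left on arrival
rate = Fraction(20, 100)   # share detained at each checkpoint
checkpoints = 2

# Virtual student's misconception: the deductions add up (20% + 20% = 40%).
student_answer = remaining / (1 - checkpoints * rate)

# Sequential reading: each checkpoint keeps 80% of what is on the truck.
sequential_answer = remaining / (1 - rate) ** checkpoints

print(student_answer)     # 160
print(sequential_answer)  # 150
```

Under the sequential reading, 150 tons become 120 tons after the first checkpoint and 96 tons after the second, so the injected "40%" error shifts the answer from 150 to 160 tons while every arithmetic step remains valid.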
[0113] Finally, the virtual student's reply is sent to the teacher, who can immediately ask follow-up questions targeting the "40%" misconception (such as "Are the deductions at the checkpoints made simultaneously or one after another?"). The system then enters the next cycle of S100 to S400, gradually guiding the student to correct the misunderstanding. Each functional agent independently undertakes one sub-task and makes mistakes in a controlled manner, and the sub-responses are then aggregated into the final response. This maintains overall language coherence while reflecting the gradual correction process at a fine-grained level, achieving a gradual trajectory of "error → partial correction → complete mastery" and overcoming the leapfrog defect of traditional solutions, in which a single prompt leads directly to a fully correct answer.
[0114] The following discussion shows, from an experimental perspective, that SSE outperforms the comparison methods and is suitable for training teachers' question-answering skills.
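The S100 to S400 cycle can be sketched as a single function that is called once per teacher utterance. Everything below is a hedged illustration: `run_round`, `StubController`, and the lambda agents are hypothetical stand-ins with hard-coded behavior, whereas in the described system each of these steps is carried out by an LLM-driven agent through prompts. The argument list passed to each functional agent is likewise an assumption.

```python
from typing import List, Optional, Tuple

History = List[Tuple[str, str]]             # (teacher utterance, student reply)
Schedule = List[Tuple[str, Optional[str]]]  # (agent name, error or None)


def run_round(history: History, utterance: str, controller, agents) -> str:
    state = controller.infer_state(history)              # S100
    errors = controller.select_errors(state, utterance)  # S200
    schedule: Schedule = controller.plan(state, errors)  # S300
    subs: List[str] = []
    for name, error in schedule:                         # S400
        subs.append(agents[name](utterance, subs, error))
    reply = controller.aggregate(subs)
    history.append((utterance, reply))  # update the historical dialogue
    return reply


class StubController:
    """Hard-coded stand-in for the LLM-driven controller agent."""

    def infer_state(self, history):
        return "Confused - Partial Error"

    def select_errors(self, state, utterance):
        if state.startswith("Confused"):
            return ["adds two 20% deductions into 40%"]
        return []

    def plan(self, state, errors):
        thinking_error = errors[0] if errors else None
        # The checking agent is left out, as in the worked example.
        return [("reading", None), ("thinking", thinking_error),
                ("calculation", None)]

    def aggregate(self, subs):
        return " ".join(subs)  # the real system smooths wording via a prompt


agents = {
    "reading": lambda q, prev, err:
        "Two checkpoints; 20% detained each time; 96 tons remain.",
    "thinking": lambda q, prev, err:
        ("20% + 20% = 40% is detained, so 0.6x = 96." if err
         else "Each checkpoint keeps 80%, so 0.8 * 0.8 * x = 96."),
    "calculation": lambda q, prev, err:
        ("x = 96 / 0.6 = 160 tons." if "0.6x" in prev[-1]
         else "x = 96 / 0.64 = 150 tons."),
}

history: History = []
reply = run_round(history, "How many tons did the truck carry at departure?",
                  StubController(), agents)
print(reply)
```

Because the calculation stub works from the thinking stub's output, the injected error propagates downstream without the calculation step itself being wrong, which is the behavior the worked example describes.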
[0115] First, in the experiment, this invention divides virtual students into three ability levels: weak, average, and excellent. The ability level, knowledge state, and error tendency are set explicitly in the prompts, guiding the model to simulate virtual students with different cognitive characteristics. In addition, this invention selects the classic Socratic method as the primary strategy for teacher practice. This method, which replaces direct instruction with questioning and guided thinking, is widely used in various teaching scenarios. This invention uses it as a typical strategy to test whether virtual students can cooperate with the teacher's step-by-step guidance and show a reasonable process of cognitive change.
[0116] In the actual implementation of this invention, all agents involved in SSE are driven by the QwQ-32B model. Furthermore, to efficiently generate large-scale question-and-answer dialogues and improve the reproducibility of experimental results, this invention uses a large language model as the teacher, specifically employing the Qwen2.5-72B-Instruct model to simulate the teacher.
[0117] An ideal virtual student should stay faithful to its assigned persona, display the corresponding ability level in interactions with the teacher, and show a cognitive evolution from misunderstanding to understanding as the Q&A progresses. Therefore, this invention evaluates the virtual student simulation effect of SSE in two categories: problem-solving performance and the quality of multi-round Q&A dialogues. Problem-solving performance refers to whether the virtual student gives answers commensurate with its ability level when solving problems assigned by the teacher; the quality of multi-round Q&A dialogues refers to the effectiveness of the teacher-student dialogue around the problem to be solved.
[0118] Because GSM8K is a high-quality dataset of grade-school-level math word problems, it was chosen as the material for teacher-student dialogues in this invention. SSE and the comparison models each act as virtual students of different ability levels. The virtual students solve math problems from GSM8K according to their ability level, with the emphasis on simulating that ability level rather than on treating the correct answer as the sole criterion. They then engage in dialogue with a teacher around the problem, during which the teacher applies teaching strategies to answer the virtual students' questions and resolve their doubts.
[0119] This invention selects a total of 6 comparison models, covering multiple parameter scales such as 7B, 14B, and 32B, specifically: DeepSeek-R1-Distill-Qwen-32B, DeepSeek-R1-Distill-Llama-8B, QwQ-32B, Qwen2.5-7B-Instruct, Qwen2.5-14B-Instruct, and Qwen2.5-32B-Instruct.
[0120] First, compared with the comparison methods, the answers given by virtual students simulated by SSE have higher "error reasonableness".
[0121] This invention selects incorrect responses from various models on the GSM8K dataset and calculates the average score of the reasonableness of these incorrect responses. This experiment does not involve interaction with teachers; it only involves virtual students answering questions.
[0122] This invention proposes an error reasonableness index to evaluate the model's problem-solving performance. The index determines whether the errors a model makes in a single round of problem-solving have the characteristics of common errors made by real students, and thus measures whether the errors are teachable and typical, so as to exclude arbitrary or wholly illogical errors.
[0123] This invention uses the LLM-as-a-Judge paradigm to score the subjective metric of error reasonableness on a scale of 1-10. A higher score indicates better performance of the model in virtual student simulation. Specifically, Qwen2.5-72B-Instruct was used as the judge in the experiment.
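The scoring procedure for error reasonableness (keep only incorrect responses, have a judge score each on a 1-10 scale, then average) can be sketched as follows. The function name `mean_error_reasonableness`, the record fields, and the stub judge are hypothetical; the toy scores are placeholders for illustration and are not experimental data. In the described experiment the judge is an LLM (Qwen2.5-72B-Instruct) rather than a lookup table.

```python
from statistics import mean
from typing import Callable, Iterable, Optional


def mean_error_reasonableness(records: Iterable[dict],
                              judge: Callable[[str, str], int]
                              ) -> Optional[float]:
    """Average the judge's 1-10 scores over incorrect responses only."""
    scores = []
    for rec in records:
        if rec["is_correct"]:
            continue  # correct answers are not part of this metric
        score = judge(rec["question"], rec["response"])
        if not 1 <= score <= 10:
            raise ValueError(f"judge score out of range: {score}")
        scores.append(score)
    return mean(scores) if scores else None


# Toy records and a stub judge, for illustration only.
records = [
    {"question": "q1", "response": "r1", "is_correct": False},
    {"question": "q2", "response": "r2", "is_correct": True},
    {"question": "q3", "response": "r3", "is_correct": False},
]
stub_scores = {"r1": 7, "r3": 5}
average = mean_error_reasonableness(records, lambda q, r: stub_scores[r])
print(average)
```

Returning `None` when a model has no incorrect responses keeps an empty sample from being reported as a score.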
[0124] Table 1. Comparison of the reasonableness of errors in SSE and various comparative models.
[0125]
[0126] Table 1 compares the error reasonableness of SSE with that of the comparison models. SSE achieves the highest error reasonableness at every ability level. This is attributable to SSE's decomposition of the virtual student's problem-solving process into multiple sub-steps based on the IDEAL theory. Because of this step-by-step decomposition, SSE can inject errors at individual steps, thereby avoiding meaningless errors and producing more human-like problem-solving results.
[0127] Second, compared with the comparison methods, the multi-round Q&A dialogues of virtual students simulated by SSE show higher "role fit", "dialogue coherence", and "gradual cognitive change".
[0128] Specifically, the Q&A quality is expressed by a dialogue-quality function that takes into account multiple dimensions such as role fit, dialogue coherence, and gradual cognitive change. It can be evaluated manually or by using a large language model as the judge.
[0129] This invention selects 100 questions from the GSM8K dataset as dialogue material. The teacher and the virtual students engage in question-and-answer dialogues around these questions. A dialogue ends when the teacher deems that the virtual student has fully understood the question. Specifically, this invention uses the Qwen2.5-72B-Instruct model to act as the teacher, with prompts guiding it through the question-answering process. Role fit, dialogue coherence, and gradual cognitive change are evaluated for each dialogue between a model and the teacher.
[0130] Role fit: Evaluate whether the virtual student responses generated by the model in the dialogue are consistent with its set ability level and whether they can reflect the expected knowledge gaps or erroneous logic.
[0131] Dialogue coherence: Assess whether the simulated dialogue is coherent, whether the virtual student's remarks closely respond to the teacher's questions or the preceding content, and whether there are any jumps in the answers.
[0132] Gradual cognitive change: Evaluate whether the dialogue reflects the dynamic changes in the virtual student's state, gradually transitioning from "not understanding" to "partially mastering" or "completely mastering", rather than "knowing it as soon as asked" or "never knowing it".
[0133] This invention again uses the LLM-as-a-Judge paradigm to score these subjective indicators, with a score range of 1-10.
[0134] Table 2 Evaluation index results of different models
[0135]
[0136] Table 2 shows the scores of each model on the three indicators. SSE achieves the highest role fit, dialogue coherence, and gradual cognitive change in all cases, which further confirms the advantages of SSE.
[0137] In another aspect, the present invention also provides a multi-agent virtual student simulation system for teacher Q&A training, including a memory, a processor, and a computer program stored in the memory, wherein the processor executes the computer program to implement the steps of any of the above methods.
[0138] In summary, this invention can simulate the typical errors and cognitive biases of students at different ability levels during problem-solving. It demonstrates a natural cognitive evolution process in multiple rounds of Q&A interaction with teachers. For the first time, it achieves a technological breakthrough in the large language model system, enabling "trackable cognitive states, programmable error behaviors, and controllable gradual processes." This avoids the phenomenon of "cognitive rigidity and instant correctness with just one prompt" in traditional prompt word schemes, greatly improving the overall dialogue quality. It solves the technical problem that current large language models cannot realistically reproduce students' error performance and the gradual evolution of cognition during problem-solving. It provides a new path for the training and evaluation of teachers' Q&A abilities and expands the application boundaries of large language models in the field of teacher education.
[0139] It should be noted that, for the sake of simplicity, the foregoing embodiments are all described as a series of actions. However, those skilled in the art should understand that the present invention is not limited to the described order of actions, as some steps can be performed in other orders or simultaneously according to the present invention. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to the present invention. In the above embodiments, the descriptions of each embodiment have their own emphasis; for parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0140] In the several embodiments provided by this invention, it should be understood that the disclosed methods or systems can be implemented in other ways. For example, the embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
[0141] The foregoing description is merely an exemplary embodiment of this disclosure and should not be construed as limiting the scope of this disclosure. Any equivalent changes and modifications made in accordance with the teachings of this disclosure shall still fall within the scope of this disclosure. Those skilled in the art will readily conceive of embodiments of this disclosure upon considering the specification and practicing the disclosure herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not described herein. The specification and embodiments are to be considered exemplary only, and the scope and spirit of this disclosure are defined by the claims.
[0142] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0143] Those skilled in the art will readily understand that the above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A multi-agent virtual student simulation method for teacher Q&A training, characterized in that the method includes:
S100: a control agent analyzes the historical dialogue between the virtual student and the teacher, dynamically infers the knowledge state of the virtual student at the current moment, and generates an expected reaction state;
S200: a preset error pattern library is constructed and the teacher's current-round utterance is received; based on the expected reaction state and the current-round utterance, the control agent determines whether to inject errors from the error pattern library into the virtual student's current-round response, and determines the expected error result;
S300: the control agent generates an ordered scheduling result of multiple functional agents based on the expected reaction state and the expected error result; the multiple functional agents include at least: a reading agent, used to understand the content of the question and identify key information and the given conditions; a thinking agent, used to design problem-solving strategies and generate potential solutions; a computational agent, used to perform numerical calculations and generate the final answer; and a checking agent, used to reflect on and verify the problem-solving process and the final answer; the ordered scheduling result is an ordered sequence of pairs, each pair consisting of the functional agent scheduled at that position and the error that this functional agent is required to exhibit, wherein each scheduled functional agent is one of the reading agent, the thinking agent, the computational agent, and the checking agent, each required error belongs to the expected error result, and the length of the sequence is the number of functional agents scheduled on demand;
S400: the multiple functional agents are scheduled to work based on the ordered scheduling result, and the control agent aggregates the responses of the multiple functional agents to generate the current-round response of the virtual student and feeds it back to the teacher, while updating the historical dialogue; scheduling the multiple functional agents to work includes: in the scheduling, each scheduled functional agent acts as the executing subject and generates its own response.
2. The multi-agent virtual student simulation method for teacher Q&A training according to claim 1, characterized in that the expected reaction state of the virtual student in a given round of dialogue is obtained by applying a function that evaluates changes in the virtual student's reaction state, executed by the control agent, to the expected reaction state of the adjacent round, the virtual student's ability level, and the historical dialogue, wherein the historical dialogue consists of the virtual student's responses and the teacher's utterances in each round of dialogue.
3. The multi-agent virtual student simulation method for teacher Q&A training according to claim 1, characterized in that the methods for constructing the preset error pattern library include an expert experience method and / or an instantiation extension method, wherein: the expert experience method summarizes typical error types by drawing on relevant literature in the education field of the current subject, combined with teachers' teaching experience and students' actual answers; and the instantiation extension method, based on the typical error types, further subdivides each typical error type into fine-grained categories and constructs corresponding erroneous solution examples for each type of error.
4. The multi-agent virtual student simulation method for teacher Q&A training according to claim 1, characterized in that the expected error result set by the control agent for the virtual student in a given round of dialogue is obtained by applying an error matching function to the preset error pattern library, the teacher's utterance in that round of dialogue, and the virtual student's expected reaction state in that round of dialogue.
5. The multi-agent virtual student simulation method for teacher Q&A training according to claim 1, characterized in that the control agent generates the current-round response of the virtual student by applying an aggregation function to the responses of the multiple functional agents.
6. The multi-agent virtual student simulation method for teacher Q&A training according to claim 5, characterized in that the aggregation is as follows: the control agent, by means of prompts, splices, summarizes, or resolves conflicts among the sub-responses to form a natural and coherent student utterance.
7. A multi-agent virtual student simulation system for teacher Q&A training, comprising a memory, a processor, and a computer program stored in the memory, characterized in that, The processor executes the computer program to implement the steps of the method according to any one of claims 1 to 6.