A method and device for fine-tuning a large-scale model of heuristic medical education based on clinical thinking.

By constructing a large-scale model of heuristic medical education based on clinical thinking, and utilizing medical electronic medical record data and intelligent agent architecture, the problem of insufficient clinical thinking modeling in existing systems is solved, enabling personalized and interactive teaching support, and improving students' independent thinking ability and teaching effectiveness.

CN121279360BActive Publication Date: 2026-06-30ZHEJIANG UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ZHEJIANG UNIV OF TECH
Filing Date
2025-09-09
Publication Date
2026-06-30

Smart Images

  • Figure CN121279360B_ABST
    Figure CN121279360B_ABST
Patent Text Reader

Abstract

This invention discloses a method and apparatus for fine-tuning a large-scale medical education model based on clinical thinking, comprising constructing a clinical thinking framework based on clinical teaching data and expert opinions. This method and apparatus forms a structured clinical teaching corpus through the clinical thinking framework, and constructs a three-agent collaborative architecture—teacher agent, expert agent, and student agent—based on the corpus. The final dataset is generated through this collaborative architecture, and fine-tuned using the final dataset to obtain the final large-scale medical education model. This significantly improves the interactivity, adaptability, and teaching guidance capabilities of the final large-scale medical education model. The final model can gradually guide students to reason and explore based on the clinical thinking framework, effectively stimulating users' independent thinking ability while maintaining the logical flow of teaching, providing appropriate guidance and feedback, thus meeting the actual needs of clinical teaching.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This invention belongs to the fields of artificial intelligence and medical education technology, specifically involving a method for fine-tuning a large-scale model of heuristic medical education based on clinical thinking. Background Technology

[0002] Clinical reasoning is a systematic and logical way of thinking that medical professionals develop during diagnosis, analysis, and treatment. It is a core competency for ensuring medical quality and patient safety. However, in traditional medical education, the cultivation of clinical reasoning often relies on face-to-face teacher-student interaction and case discussions. Constrained by factors such as faculty resources, teaching resources, case diversity, and time and space, it is difficult to provide learners with continuous, systematic, and personalized training support.

[0003] In recent years, artificial intelligence technology has developed rapidly in medical education, especially intelligent question-answering and teaching-assisted systems based on large language models, which have been applied to medical knowledge retrieval, case analysis, and learning assessment. However, existing technologies generally have the following shortcomings:

[0004] 1) Lack of clinical reasoning process modeling: Most systems focus on static knowledge output as the core and fail to effectively simulate the real clinical reasoning chain.

[0005] 2) Lack of heuristic teaching mechanism: Existing intelligent teaching models are mostly based on learners asking questions and models responding passively, with little guidance for learners to think and explore actively, which is not conducive to the formation of independent diagnostic and decision-making abilities.

[0006] 3) Insufficient personalization and interactivity: When generating content, the model often ignores the learner's current cognitive level and clinical background, making it difficult to dynamically adjust teaching strategies. Summary of the Invention

[0007] The purpose of this invention is to address the problems raised in the background art by proposing a method for fine-tuning a large-scale medical education model based on clinical thinking.

[0008] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0009] This invention proposes a method for fine-tuning a large-scale medical education model based on clinical thinking, which includes constructing a clinical thinking framework based on clinical teaching data and expert opinions.

[0010] We collected different types of electronic medical record data as the initial first clinical corpus, and then performed objective information extraction, subjective information generation, data filtering and enhancement processing on the first clinical corpus based on the clinical thinking framework to obtain a structured clinical teaching corpus.

[0011] Based on the clinical thinking framework and clinical teaching corpus, a collaborative architecture is constructed that includes teacher agents, expert agents, and student agents. Based on the collaborative architecture, a final dataset of teaching dialogues between teacher agents, expert agents, and student agents is generated. In this dataset, the teacher agent automatically generates a teaching guidance plan, the expert agent detects and corrects the content generated by the teacher agent, and the student agent responds based on the results generated by the expert agent and the teacher agent.

[0012] The first pre-defined large model is selected as the first basic model. The first basic model is then fine-tuned under supervision using the final dataset to obtain the second basic model. Finally, a portion of the samples in the final dataset are used to perform reinforcement learning on the second basic model to obtain the final large medical education model.

[0013] Preferably, the clinical thinking framework includes seven steps: Symptom focus: identifying symptoms based on the patient's medical history; Preliminary diagnosis: making a preliminary diagnosis based on the identified symptoms; Auxiliary examination: conducting auxiliary examinations to verify the preliminary diagnosis; Diagnosis confirmation: confirming the diagnosis based on the results of the auxiliary examinations; Treatment plan: determining the treatment plan based on the confirmed diagnosis; Humanistic care: paying attention to and communicating with the patient regarding their illness and mental health; and Analysis and summary: analyzing and summarizing the treatment process.

[0014] Preferably, the process of obtaining the clinical teaching corpus includes:

[0015] The collected electronic medical record data is converted to a preset text format, and the converted electronic medical record data is used as the initial first clinical corpus.

[0016] The second pre-defined model was used to extract objective information and generate subjective information from the first clinical corpus, resulting in a structured second clinical corpus. The extracted objective information included: the patient's medical history background, auxiliary examination items, auxiliary examination results, diagnostic confirmation results, and treatment plan. The generated subjective information included: identifying symptoms that affect the patient's disease diagnosis, generating preliminary diagnostic results based on the identified symptoms, generating communication suggestions for the patient based on the identified symptoms, and analyzing and summarizing based on the treatment plan.

[0017] Remove patient identity information from the second clinical corpus;

[0018] The second clinical corpus, after being removed, underwent content correction and deletion, incomplete fields were supplemented, and the one-to-one correspondence between auxiliary examination items, auxiliary examination results, diagnostic confirmation results, and treatment plans was verified. Finally, a structured clinical teaching corpus was obtained through manual review.

[0019] Preferably, in the process of constructing the collaborative architecture, for the teacher agent, prompt words are designed so that the teacher agent can automatically generate a teaching guidance plan based on the clinical teaching corpus, clinical thinking framework and historical teaching dialogue. The teaching guidance plan includes the construction of a hierarchical question chain based on the clinical thinking framework, a question-driven heuristic questioning mechanism and an adaptive teaching response to the differences in the preset cognitive state of the student agent. The preset cognitive states include five types: comprehension deviation, knowledge error, expression omission, standard response and active exploration.

[0020] For expert agents, prompt words are designed to enable them to detect and correct the content generated by teacher agents. If the expert agent detects that the content generated by the teacher agent contains knowledge errors, reveals answers, or violates the clinical reasoning framework, the content generated by the teacher agent will be revised. If no problems are detected, no revision is required.

[0021] For student agents, prompts are designed so that student agents can respond based on the content generated by teacher agents and expert agents, as well as the preset cognitive state of the student agent.

[0022] The teaching dialogue between agents is set up in the first loop order: teacher agent, expert agent, student agent, and then back to teacher agent.

[0023] Preferably, in the process of generating the final dataset, the initial dataset of teaching dialogues is generated in the order of the first cycle, and the initial dataset contains multiple first samples. Each first sample contains multiple rounds of first teaching dialogues. The teaching dialogue from the teacher agent to the expert agent and then to the student agent is taken as a round of first teaching dialogue. The teacher agent in the next round generates a teaching guidance plan based on the historical teaching.

[0024] For each round of the first teaching dialogue in each first sample, if the expert agent revises the content generated by the teacher agent, then the content generated by the expert agent and the content answered by the student agent in the first round of the first teaching dialogue are considered as the second round of the second teaching dialogue. If the expert agent does not revise the content generated by the teacher agent, then the content generated by the teacher agent and the content answered by the student agent in the first round of the first teaching dialogue are considered as the second round of the second teaching dialogue. Multiple rounds of second teaching dialogues constitute the second sample, and multiple second samples constitute the final dataset.

[0025] Preferably, the step of using a subset of samples from the final dataset to perform reinforcement learning on the second base model to obtain the final large-scale medical education model includes:

[0026] A portion of the teaching dialogues is randomly sampled from each sample in the final dataset to form a third sample;

[0027] For each third sample, corresponding prompt words are designed based on the cognitive state of different student agents in the third sample. The pre-set second model is used to generate positive samples of teacher agent-generated content that conforms to the three dimensions of clinical thinking framework, heuristic teaching characteristics, and personalized feedback ability. Negative samples of teacher agent-generated content that are insufficient in each of the three dimensions are also generated in turn. Conforming to the clinical thinking framework means executing the seven-step operation of the clinical thinking framework in sequence. Conforming to the heuristic teaching characteristics means teaching by asking questions and avoiding revealing the answers. Conforming to the personalized feedback ability means optimizing the teacher agent-generated content according to the five cognitive states of the student agents.

[0028] The third sample is combined with the corresponding positive and negative samples to form a preference triple;

[0029] By using preference triples to perform direct preference optimization training on the third basic model, the final large-scale medical education model is obtained.

[0030] A fine-tuning device for a large-scale medical education model based on clinical thinking includes a processor and a memory storing a number of computer instructions. The device is characterized in that the computer instructions, when executed by the processor, implement the steps of a fine-tuning method for a large-scale medical education model based on clinical thinking.

[0031] Compared with the prior art, the beneficial effects of the present invention are as follows:

[0032] This method and device for fine-tuning a large-scale medical education model based on clinical thinking forms a structured clinical teaching corpus through a clinical thinking framework. Based on this corpus, a three-agent collaborative architecture—teacher agent, expert agent, and student agent—is constructed. The final dataset is generated through this collaborative architecture and then fine-tuned using the final dataset to obtain the final large-scale medical education model. This significantly improves the model's interactivity, adaptability, and teaching guidance capabilities. The final large-scale medical education model can guide students to reason and explore step-by-step based on the clinical thinking framework. While maintaining the logical flow of teaching, it effectively stimulates users' independent thinking ability and provides appropriate guidance and feedback, thus better meeting the actual needs of clinical teaching. Attached Figure Description

[0033] Figure 1 This is a flowchart illustrating the fine-tuning method of the large-scale medical education model based on clinical thinking, as described in this invention. Detailed Implementation

[0034] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0035] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention.

[0036] like Figure 1 As shown, a method for fine-tuning a large-scale medical education model based on clinical thinking is provided, including:

[0037] Step 1: Construct a clinical thinking framework based on clinical teaching data and expert opinions (to serve as the core guiding principle for subsequent teaching design and model construction).

[0038] Step 1.1: Clinical teaching data includes clinical teaching syllabus, clinical teaching materials and clinical teaching training programs. Extract key links and core concepts that can reflect the diagnostic reasoning process, such as disease attention, auxiliary examinations, examination confirmation, treatment plan, etc.

[0039] Step 1.2, Expert Opinions, including feedback from seminars and questionnaires with clinicians and medical education experts, supplemented elements not fully covered in the teaching data, such as preliminary diagnosis, humanistic care, analysis and summary, and optimized existing elements to better meet the needs of front-line clinical practice.

[0040] Step 1.3: Following the general logical sequence of clinical diagnosis and treatment, arrange the final determined clinical thinking steps sequentially to obtain the seven-step operation of the clinical thinking framework (and describe them using unified and standardized terminology): Symptom Focus: Identify the symptoms based on the patient's medical history (identify characteristic symptoms with pathophysiological significance based on the collected medical history information); Preliminary Diagnosis: Make a preliminary diagnosis based on the identified symptoms (establish a differential diagnosis list based on symptom combinations and disease patterns, ranking probabilities); Ancillary Examinations: Conduct ancillary examinations to verify the preliminary diagnosis (customize ancillary examination items corresponding to the disease for verification based on the preliminary diagnosis); Diagnosis Confirmation: Confirm the diagnosis based on the results of the ancillary examinations (combining the results of the ancillary examinations with clinical standards to clarify the disease and its severity); Treatment Plan: Determine the treatment plan based on the diagnosis confirmation (develop guideline-guided individualized treatment, balancing efficacy and safety); Humanistic Care: Communicate with the patient regarding their disease and psychology (use empathic communication techniques to simultaneously address the impact of the disease and the patient's psychological needs); and Analysis and Summary: Analyze and summarize the treatment process (review and reflect on the diagnosis and treatment process to optimize clinical thinking patterns and decision-making quality).

[0041] Step 2: Collect different types of medical electronic medical record data as the initial first clinical corpus, and perform objective information extraction, subjective information generation, data filtering and enhancement processing on the first clinical corpus according to the clinical thinking framework to obtain a structured clinical teaching corpus.

[0042] Step 2.1: Convert the collected electronic medical record data into a preset text format (such as JSON text format), and use the converted electronic medical record data as the initial first clinical corpus.

[0043] It should be noted that during the data collection process, electronic medical record data covering different departments (such as internal medicine, surgery, pediatrics, obstetrics and gynecology, etc.) and various disease types can be collected from medical institutions and uniformly converted into a processable JSON text format to form the initial first clinical corpus. Electronic medical records typically include information such as the patient's chief complaint, present illness, past medical history, physical examination, auxiliary examination results, diagnostic conclusions, and treatment process.

[0044] The following is a sample of the collected raw electronic medical record data:

[0045] 1. Patient's name: Female, 69 years old. History of peptic ulcer disease for 1 year, history of hypertension for 4 years, highest blood pressure 180 / 90 mmHg. Denies history of heart disease and diabetes. 2. The patient experienced sudden chest, lower back, and back pain 7 days prior to admission. The pain worsened when bending over or standing up from a lying position. She received intravenous treatment at a local hospital (specific medication unknown), but her symptoms did not improve. Seeking further diagnosis and treatment, the patient and her family came to our hospital. An outpatient examination of the thoracic and lumbar spine (anteroposterior and lateral views) revealed a compression fracture of the T12 vertebra. The patient and her family requested hospitalization. She was admitted to the hospital with a diagnosis of "osteoporosis, T12 vertebral compression fracture." 3. Since admission, the patient has had no headache, dizziness, vomiting, diarrhea, chest tightness, shortness of breath, or dyspnea. Her appetite is good, and bowel movements are normal. 4. Physical examination: T 36.7°C; P 58 bpm; R 18 bpm; BP 125 / 75 mmHg. The patient is alert but mentally unwell, with normal development and moderate nutrition. Speech is fluent. The patient cooperated during the examination and entered the ward. Both pupils are large, round, and respond briskly to light. Cardiopulmonary and abdominal examinations were unremarkable. Bowel sounds were normal. No abnormalities were found in the anus or external genitalia. There were no deformities of the spine or limbs. Lumbar spine mobility was limited, with positive tenderness and percussion pain in the thoracolumbar region. There was no radiating pain or edema in the lower extremities. Sensation, muscle strength, and muscle tone were normal in all four extremities. Limb movement was good, with no impairment of toe movement and normal dorsiflexion. The dorsalis pedis artery was palpable, and peripheral blood supply was normal. Bilateral biceps and triceps brachii arteries... The patient's head and knee reflexes were normal, and Babinski's sign and Kernig's sign were negative bilaterally. 5. Auxiliary examinations: The anteroposterior and lateral views of the thoracolumbar spine showed that the T12 vertebral body was wedge-shaped and compressed by approximately one-third, with a vertebral cortex fracture. 6. Diagnosis and treatment: After admission, the patient underwent relevant examinations and tests, confirming the diagnosis. With no contraindications for surgery, the patient underwent percutaneous posterior vertebroplasty under local anesthesia on December 12, 2016. The procedure was successful. Postoperatively, the patient received anti-inflammatory and anti-osteoporosis medications for symptomatic treatment. The surgical incision was dressed on schedule. The patient and family requested discharge on December 20, 2016.

[0046] Step 2.2: Using a pre-defined second model, objective information is extracted and subjective information is generated from the first clinical corpus (and the content is uniformly converted into a standardized JSON data format) to obtain a structured second clinical corpus. The extracted objective information includes: the patient's medical history background (i.e., case background, covering the patient's chief complaint, present illness, past medical history, and physical examination results); auxiliary examination items (i.e., auxiliary examinations, including laboratory tests, imaging examinations, and other auxiliary diagnostic items); auxiliary examination results (i.e., examination results, corresponding to the specific examination results and data of the auxiliary examination); and diagnostic confirmation results (i.e., diagnostic results, including a clear diagnosis of the disease and supporting evidence). Treatment plans (including specific plans for drug therapy, surgical treatment, and other interventions) generate subjective information including: identifying symptoms that affect the patient's disease diagnosis (i.e., symptom focus, identifying and focusing on core symptoms and signs that directly affect disease diagnosis and differential diagnosis); generating preliminary diagnostic results based on the identified symptoms (i.e., preliminary diagnosis); generating communication suggestions for the patient based on the identified symptoms (i.e., humanistic care, generating empathetic and targeted communication suggestions and care strategies based on the patient's condition and individual characteristics); and analyzing and summarizing the treatment plan (i.e., analysis and summary, summarizing and reflecting on the diagnosis and treatment process, generating a summary description that helps optimize clinical decision-making and teaching effectiveness).

[0047] It should be noted that the second preset model extracts objective information and generates subjective information using prompt words, and the template for these prompt words is as follows:

[0048] "Your task is to transform the input content of the first clinical corpus into a structured, standardized format. The specific steps should strictly follow the nine steps below for objective information extraction and subjective information generation:"

[0049] 1. Case background: Extract key information from the chief complaint, present illness, past medical history, and physical examination to generate a concise case summary;

[0050] 2. Symptom Focus: Identify and extract core symptoms and signs that are directly related to disease diagnosis, based on the case background;

[0051] 3. Preliminary diagnosis: Based on the case background and the content of the symptoms of concern, generate a list of possible differential diagnoses;

[0052] 4. Ancillary examinations: Extract diagnostic-related ancillary examination items from the original electronic clinical medical records;

[0053] 5. Inspection Results: Extract the inspection results and parameter data corresponding to the auxiliary inspections;

[0054] 6. Diagnosis Confirmation: Extract established diagnostic conclusions from electronic medical records and link them to supporting evidence;

[0055] 7. Treatment plan: Extract targeted treatment measures that match the confirmed diagnosis;

[0056] 8. Humanistic care: Based on the patient's condition and individual characteristics, please generate communication suggestions and care strategies that are both compassionate and targeted.

[0057] 9. Analysis and Summary: Systematically review the entire diagnosis, treatment, and care process, extract key information, and generate a summary description that helps optimize clinical decision-making.

[0058] Input: {First Clinical Corpus};

[0059] Output: {a structured second clinical corpus}.

[0060] Step 2.3, Privacy Protection: Remove patient identity information from the second clinical corpus (patient identity information mainly includes the name of the hospital where treatment was received, the patient's name, contact information, etc.); this step is processed using the pre-set second major model;

[0061] Step 2.4, Reasoning Enhancement: First, content verification and deletion: (using the pre-set second major model) perform content correction and deletion on the removed second clinical corpus (identify and delete redundant, conflicting, or illogical content in the objective information). Then, information completion: supplement incomplete fields (supplement missing key data or fields with incomplete descriptions, such as missing examination indicators, diagnostic criteria, or treatment details). Finally, association verification: verify the one-to-one correspondence between auxiliary examination items, auxiliary examination results, diagnostic confirmation results, and treatment plans.

[0062] It should be noted that the preset second major model is obtained through prompt words during processing in steps 2.3-2.4, and the template for these prompt words is as follows:

[0063] "Your task is to perform privacy protection and inference enhancement processing on the input second clinical corpus to ensure that the output data conforms to clinical logic while protecting sensitive information. Specific implementation requirements are as follows:"

[0064] (1) Privacy protection: Identify and remove sensitive information involving personal identity or institutional identification from the text, including but not limited to patient name, ID number, contact information, home address, name of hospital, department name and doctor's name.

[0065] (2) Reasoning enhancement: Objective information extraction and subjective information generation content verification and deletion: Detect and delete redundant information, logically conflicting content or records that do not conform to clinical norms; 2. Information completion: Supplement missing key clinical elements, including but not limited to examination indicators, diagnostic basis, and treatment measures; 3. Correlation verification: Verify and strengthen the causal and correspondence between auxiliary examinations and examination results, treatment plans and diagnostic confirmation results, to ensure the integrity and rigor of the clinical data chain.

[0066] While deleting information, it is important to ensure the integrity and logical coherence of the clinical content, so as not to affect the expression of clinical information.

[0067] Input: {Completed structured clinical corpus};

[0068] Output: {Desensitized and enhanced structured clinical predictions};

[0069] The following is an example of the second clinical corpus after processing for objective information extraction, subjective information generation, privacy protection, and reasoning enhancement:

[0070] "[Case Background: The patient is a 69-year-old female who presented with sudden, severe chest, lumbar, and back pain 7 days prior, which worsened with bending over and rising from a supine position. She sought medical attention after local intravenous treatment proved ineffective. She had a 1-year history of peptic ulcer disease and a 4-year history of hypertension (highest blood pressure 180 / 90 mmHg). She denied any history of heart disease or diabetes. Physical examination revealed poor mental status, limited lumbar spine mobility, and positive tenderness and percussion pain in the thoracolumbar region."]

[0071] "Symptom Focus": "The patient's main symptoms are chest, lumbar and back pain and limited mobility. Physical examination results include tenderness and percussion pain in the thoracolumbar region, suggesting possible problems related to the spinal skeletal system."

[0072] Preliminary diagnosis: "(1) Compression fracture of the thoracic vertebra (2) Osteoporosis";

[0073] “Auxiliary examinations”: “(1) X-ray examination of the thoracic and lumbar spine (anteroposterior and lateral views) (2) Bone density examination”;

[0074] "Examination results": "(1) The T12 vertebral body was wedge-shaped and compressed by about 1 / 3, and the vertebral cortex was fractured. (2) Bone mineral density test showed a T value of -3.0, which meets the diagnostic criteria for osteoporosis."

[0075] "Diagnosis confirmed": "(1) Osteoporosis (2) Compression fracture of the T12 vertebra";

[0076] "Treatment plan": "(1) For the compression fracture of the T12 vertebra, perform percutaneous posterior vertebroplasty; (2) For osteoporosis, administer anti-inflammatory and anti-osteoporosis drugs for symptomatic treatment."

[0077] "Humanistic Care": "If a patient feels fear or is worried about postoperative recovery after vertebroplasty, I would communicate like this: Your concerns about the surgery are perfectly normal; anyone would be nervous facing spinal surgery. Please rest assured that we will ensure the entire process is safe and controllable, and explain each step in detail. Vertebroplasty is a minimally invasive technique with minimal trauma and quick recovery. We will monitor your condition throughout the process and provide detailed rehabilitation guidance after surgery. If you have any doubts or discomfort, please tell us at any time. We will listen patiently and adjust the plan to help you recover smoothly."

[0078] "Analysis and Summary": "The patient was a 69-year-old female who presented with chest, lower back, and back pain. Based on the pain characteristics, abnormal physical examination, and imaging results (compression of the T12 vertebra), she was diagnosed with osteoporosis and a T12 vertebral compression fracture. She underwent percutaneous vertebroplasty and anti-osteoporosis medication. Humanistic care emphasized surgical safety and building trust to enhance the patient's confidence."

[0079] Step 2.5: Finally, a structured clinical teaching corpus is obtained through manual review.

[0080] The second major preset model includes, but is not limited to, ChatGPT, GPT-4, GPT-4o, DeepSeek-R1, Qwen3-235B-A22B, Gemini-2.5-Pro, etc. In this embodiment, DeepSeek-R1 is used.

[0081] Step 3: Based on the clinical thinking framework and clinical teaching corpus, construct a collaborative architecture that includes teacher agents, expert agents, and student agents. Based on the collaborative architecture, generate the final dataset of teaching dialogues between teacher agents, expert agents, and student agents. The teacher agent automatically generates a teaching guidance plan, the expert agent detects and corrects the content generated by the teacher agent, and the student agent responds based on the results generated by the expert agent and the teacher agent.

[0082] Step 3.1: For the teacher agent, design prompts for the teacher agent so that it can automatically generate a teaching guidance plan based on the clinical teaching corpus, clinical thinking framework, and historical teaching dialogues. The teaching guidance plan includes a hierarchical question chain construction based on the clinical thinking framework (used to guide the student agent to reason step by step according to the clinical thinking framework), a question-driven heuristic questioning mechanism (used to promote students' independent thinking through progressive questioning), and an adaptive teaching response based on the differences in the student agent's preset cognitive state (used to dynamically adjust the teaching strategy according to the student agent's preset cognitive state). The types of preset cognitive states of the student agent include five types: comprehension deviation, knowledge error, omission of expression, standard response, and active exploration.

[0083] Among these, comprehension bias refers to a student's inability to correctly understand the teacher's intention in asking a question, resulting in an answer that is irrelevant to the question or off-topic. Typical manifestations include: giving irrelevant answers, conceptual confusion, and requesting repetition of the question;

[0084] Knowledge errors refer to situations where a student understands the general direction of a question but makes factual mistakes, logical inconsistencies, or confuses similar concepts in their answer. Typical manifestations include: incorrect terminology, reversed cause-and-effect relationships, and misattribution.

[0085] Omissions in expression refer to students' basic understanding of a question, but their answers are incomplete, lacking details or omitting key points. Typical manifestations include: incoherent expression, incomplete content, lack of structure in the answer, and inability to integrate related knowledge;

[0086] Standardized responses refer to students' ability to accurately understand questions, apply knowledge appropriately, and express themselves in a standardized, correct, and structurally complete manner. Typical characteristics include: professional language skills, comprehensive content, rigorous logic, and alignment with teaching objectives.

[0087] Active exploration refers to students' ability to propose hypotheses or extend questions based on reasonable answers, demonstrating strong proactive thinking skills. Typical manifestations include: posing counter-questions, extending reasoning, and offering unique insights.

[0088] The following are examples of prompt words for the teacher agent:

[0089] "You are a medical teacher with extensive clinical teaching experience, skilled in conducting heuristic teaching based on real cases. Your task is to engage in phased, progressive teaching responses with clinical interns, using a given 'clinical teaching corpus' and 'historical teaching dialogues,' gradually guiding them to complete their clinical reasoning learning. Your teaching process must adhere to the following principles:"

[0090] (1) Teaching Framework: Your teaching should strictly follow the seven steps of the "Clinical Thinking Framework". Each round of dialogue must be labeled with the current step. Skipping steps, merging steps, or prematurely revealing information about subsequent steps is prohibited. The teaching steps and objectives are as follows:

[0091] 1. Symptom Focus: At the beginning of the lesson, explain the case background and guide students to focus on the patient's symptoms or abnormal manifestations to identify the entry point for diagnosis;

[0092] 2. Preliminary diagnosis: Encourage students to propose possible diagnoses and demonstrate their reasoning ability regarding the causes of the disease;

[0093] 3. Ancillary examinations: Guide students to propose reasonable examination items based on the preliminary diagnosis;

[0094] 4. Diagnosis Confirmation: Based on the relevant examination results, guide students to integrate information and clarify the diagnostic conclusion;

[0095] 5. Treatment Plan: Based on the diagnosis results, guide students to propose a reasonable treatment plan and explain the basis for it;

[0096] 6. Humanistic Care: Exploring humanistic factors such as doctor-patient communication, psychological support, ethical issues, and patient dignity during the diagnosis and treatment process;

[0097] 7. Analysis and Summary: Guide students to systematically review the entire case and organize their clinical reasoning.

[0098] Once the “<Analysis and Summary>” step is completed, use “<End>” to mark the end of the teaching.

[0099] The standard answers for each step can be found in the corresponding content of the "Clinical Teaching Corpus". Student answers do not need to be exactly the same, but they must be basically reasonable, accurate, and logically clear.

[0100] (2) Heuristic teaching: Knowledge explanation or error clarification may only be given when students actively raise questions about medical concepts or when their answers contain obvious knowledge errors. In other cases, it is strictly forbidden to directly give conclusions containing the standard answer of the current step or to provide answer-oriented prompts. Instead, guiding questions, contextual promotion, and verbal encouragement should always be used to stimulate students' active thinking and reasoning, and to avoid rote learning.

[0101] (3) Teaching focus: The response of each round of teaching must be closely related to the teaching objectives of the current step. It is forbidden to guide across steps or introduce irrelevant questions to ensure that teaching is orderly, focused and has clear objectives.

[0102] (4) Flexible guidance: Students may exhibit the following five cognitive states. Please first identify their cognitive states and then adopt corresponding guidance strategies:

[0103] 1. Misunderstanding: Please briefly clarify the purpose of the question to guide the reader to refocus on the answer.

[0104] 2. Knowledge error; please clearly point out the error and briefly explain the reason, guiding the student to correct their answer based on understanding.

[0105] 3. Omissions in the description; please guide them to fill in the missing content through targeted follow-up questions.

[0106] 4. Provide standardized responses; please give positive feedback and continue to advance the teaching.

[0107] 5. Encourage active exploration; please encourage exploration, provide brief answers and guidance (without deviating from the main theme), and continue to advance the teaching in an orderly manner.

[0108] (5) Language style: The expression style should be natural, vivid and friendly, reflecting the professionalism and friendliness of clinical teachers; encouragement language can be used appropriately in the response to stimulate students' participation and thinking enthusiasm; the answer should be close to the real clinical situation, based on case evidence, and avoid empty theories or discussions that are divorced from reality.

[0109] Input: {Clinical Teaching Corpus}, {Historical Context Dialogue};

[0110] Output: {Teacher Response}. The teacher response is the content generated by the teacher agent.

[0111] Step 3.2: For the expert agent, design prompts for the expert agent so that the expert agent can detect and correct the content generated by the teacher agent. If the expert agent detects that the content generated by the teacher agent has knowledge errors, reveals answers, or violates the clinical reasoning framework, it will revise the content generated by the teacher agent. If no problems are detected, no revision is required.

[0112] Examples of prompts from expert agents are as follows:

[0113] "You are an expert physician with profound medical knowledge and rich teaching experience, responsible for supervising and guiding teachers in conducting high-quality clinical thinking teaching. Your task is to evaluate whether the teachers' latest teaching responses meet the following three criteria based on the 'Clinical Teaching Corpus' and 'Historical Teaching Dialogues,' and make adjustments as necessary:"

[0114] (1) Accuracy of teaching content: The medical knowledge in the teacher's response must be scientific, reasonable and error-free.

[0115] (2) Rigor of the teaching process: Teachers must strictly follow the seven steps of clinical thinking in sequence and must not skip steps, combine steps or carry out teaching guidance for subsequent steps in advance.

[0116] (3) Heuristic teaching principle: Teachers should encourage students to think actively through questioning, situational guidance and encouraging language, and should not give answers directly.

[0117] The seven steps of clinical reasoning are defined as follows:

[0118] 1. Symptom Focus: Explain the patient's case background and guide students to identify the main symptoms or abnormal manifestations;

[0119] 2. Preliminary diagnosis: Based on the patient's medical history and main symptoms, students are encouraged to propose reasonable possibilities for the disease;

[0120] 3. Ancillary examinations: Guide students to select appropriate examination items based on the preliminary diagnosis;

[0121] 4. Diagnosis Confirmation: Based on the responses of auxiliary examinations and by combining medical history and symptoms, guide students to arrive at a clear diagnosis.

[0122] 5. Treatment plan: Based on the diagnosis, explore possible treatment plans and their rationale;

[0123] 6. Humanistic care: Exploring humanistic factors such as doctor-patient communication, psychological support, and patient dignity in the diagnosis and treatment process;

[0124] 7. Analysis and summary: Guide students to systematically review the entire case and strengthen their knowledge structure.

[0125] Output rules:

[0126] 1. If the teacher's response fully complies with the above three criteria, the output will be: <No criterion violated>.

[0127] 2. If it does not meet the criteria, the specific error must be pointed out, and a teacher response that meets the criteria must be regenerated.

[0128] Input: {Clinical Teaching Corpus}, {History Teaching Dialogue};

[0129] Output: {Expert Response}. The expert response is the content generated by the expert agent.

[0130] Step 3.3: For the student agent, design prompts for the student agent so that the student agent can respond based on the content generated by the teacher agent and the expert agent, as well as the preset cognitive state of the student agent.

[0131] Before each round of teaching dialogue, one of the five cognitive states is randomly selected as the current cognitive state of the student agent, and the student responds accordingly based on that cognitive state.

[0132] Examples of prompts for the student agent are as follows:

[0133] "You are a medical student receiving clinical practice instruction. Please first randomly select one of the following five cognitive states, and then, in conjunction with the current teaching context, generate a student response that matches the characteristics of that cognitive state."

[0134] The five cognitive states are defined as follows:

[0135] 1. Misunderstanding: Students fail to correctly understand the teacher's intention in asking the question, which manifests as giving irrelevant answers, deviating from the topic, or making irrelevant responses or requesting the question to be repeated due to confusion.

[0136] 2. Conceptual Errors: Students generally grasp the direction of the teacher's questions, but their answers contain obvious factual errors, confused causal relationships, or improper reasoning, leading to biased judgments or incorrect expressions. This often manifests as confidence coupled with incorrect content and logical contradictions.

[0137] 3. Omissions in Expression: Students' thinking is correct, but the content is incomplete, with missing key information, one-sided arguments, or insufficient details, affecting the comprehensiveness and depth of the answer. It is important to avoid simply repeating confirmed content without making further progress.

[0138] 4. Standardized Response: Students accurately understand the question and their answers are scientific, standardized, clearly structured, and complete in key points, meeting the teaching standards. However, they do not proactively raise further questions or expand upon the topic.

[0139] 5. Active Exploration: Based on standardized responses, students actively propose a hypothesis, question, inference, or independent insight that is closely related to the current teaching steps, demonstrating their spirit of exploration and clinical associative ability.

[0140] Your response must follow these guidelines:

[0141] 1. Ensure that each answer reflects only one cognitive state, and the content must closely match the expression style, cognitive level, and logical characteristics of that cognitive state.

[0142] 2. Ensure that the styles are clear and easily distinguishable between different cognitive states.

[0143] 3. The language used in the response must be natural and authentic, taking into account the context, and avoiding a written tone, piling up of technical terms, or meaningless repetition.

[0144] Input: {History Teaching Dialogue};

[0145] Output: {Student Response}. The student response is the content generated by the student agent.

[0146] Step 3.4: Set up the teaching dialogue between agents in the following order: teacher agent, expert agent, student agent, and then back to the teacher agent in the first loop (i.e., teacher agent). Expert Intelligent Agent Student Intelligent Agent (The loop call logic of the teacher agent).

[0147] Step 3.5: Generate the initial dataset of teaching dialogues according to the first cycle sequence. The initial dataset contains multiple first samples, and each first sample contains multiple rounds of first teaching dialogues. The teaching dialogue from the teacher agent to the expert agent and then to the student agent constitutes one round of first teaching dialogue. In the next round, the teacher agent generates a teaching guidance plan based on historical teaching. The initial dataset is represented as follows: , ,in, ,in The total number of the first sample. The total number of rounds for the first teaching dialogue is The The first sample, , and The numbers represent the order of the numbers. In the first sample, the first The content generated by the teacher agent, the content generated by the expert agent, and the content answered by the student agent in the first round of teaching dialogue.

[0148] Here are some examples of generating multi-turn dialogues:

[0149] "[Teacher: "<Clinical Focus> Your observation is very keen, noting the importance of positive tenderness and percussion pain! However, there's a concept that needs clarification: In spinal examinations, positive local tenderness and percussion pain are usually typical manifestations of bony lesions (such as fractures, osteoporosis), rather than a direct indication of nerve root compression; nerve root compression is more often manifested as radiating pain, sensory abnormalities, or motor dysfunction. Based on the patient's pain characteristics (sudden onset during activity, aggravated by bending over and standing up from a supine position) and these signs, what type of problem do you think they are more likely to point to?"

[0150] "Expert": "No violation of guidelines";

[0151] "Student": "<Cognitive status[5]> I think these pain characteristics and signs are more likely to point to bone lesions, such as vertebral compression fractures or osteoporosis, because sudden pain during activity, aggravation by position, plus positive local tenderness and percussion pain are all typical manifestations. But I am wondering if the patient's history of peptic ulcer disease will affect the diagnosis or treatment options for osteoporosis, such as whether calcium supplements need to be adjusted to avoid gastrointestinal irritation?"

[0152] "Teacher, <Preliminary Diagnosis> Great! You've correctly pointed the symptoms and signs to a bone condition and proposed two reasonable preliminary diagnoses: vertebral compression fracture and osteoporosis. Regarding your question about whether a history of peptic ulcers affects the diagnosis, I can briefly explain: a peptic ulcer itself doesn't change the diagnostic criteria for osteoporosis (such as pain characteristics and signs), but it does require consideration of medication choices in subsequent treatment to avoid exacerbating the stomach problem. Now, based on this preliminary judgment, what tests do you think we need to arrange to further verify the possibility of these diseases?"

[0153] "Expert": "<Violation of guidelines [2][3], directly confirming the student's diagnosis and introducing subsequent treatment steps> Correction: <Preliminary Diagnosis> Very good, you have proposed vertebral compression fracture and osteoporosis as possible diagnoses! Regarding whether the history of gastric ulcers affects the diagnosis, gastric ulcers themselves do not change the diagnostic basis for osteoporosis. Now, based on these preliminary judgments, what examinations do you think we need to arrange to further verify the possibility of these diseases?";

[0154] “Student”: “<Cognitive state[1]> Hmm, the patient’s mental state is quite poor. Should we do a psychological assessment first?”}”

[0155] Step 3.6: For each round of the first teaching dialogue in each first sample, if the expert agent revises the content generated by the teacher agent, then the content generated by the expert agent and the response from the student agent in that round of the first teaching dialogue constitute a round of the second teaching dialogue. If the expert agent does not revise the content generated by the teacher agent, then the content generated by the teacher agent and the response from the student agent in that round of the first teaching dialogue constitute a round of the second teaching dialogue. Multiple rounds of the second teaching dialogue constitute the second sample, and multiple second samples constitute the final dataset, where the final dataset is represented as follows: = The total number of the second sample is The number of rounds of the second teaching dialogue in each second sample is: , For the first A second sample, Indicates the first In the second sample, the first The content generated by the teacher agent or the content revised by the expert agent in the second round of teaching dialogue. Indicates the first In the second sample, the first The content of the student's agent's answers in the second round of teaching dialogue.

[0156] The teacher agent, expert agent, and student agent can all be implemented using open-source or closed-source general-purpose large language models, including but not limited to ChatGPT, GPT-4, GPT-4o, DeepSeek-R1, Qwen3-235B-A22B, Gemini-2.5-Pro, etc. In this embodiment, DeepSeek-R1 is used for implementation.

[0157] Step 4: Select a pre-defined first large model (this first large model is a suitable open-source general-purpose large language model as the base model, including but not limited to: Qwen series, ChatGLM series, Llama series, Huatuo series, Baichuan series, etc. This example uses Qwen-3) as the first base model, and use the final dataset to perform supervised fine-tuning on the first base model to obtain the second base model; finally, use a portion of the samples in the final dataset to perform reinforcement learning on the second base model to obtain the final medical education large model;

[0158] Step 4.1: Process the final dataset into ShareGPT format, which can be used as input for training the first large model;

[0159] Step 4.2: Using the final dataset processed in Step 4.1, perform supervised fine-tuning on the first base model (supervised fine-tuning methods include, but are not limited to, LoRA, QLoRA, P-Tuning, and P-Tuning V2; this example uses LoRA, and the supervised fine-tuning training process can be implemented on the LlaMA-Factory platform) to obtain the second base model. The formula for supervised fine-tuning of the first base model is as follows:

[0160] ;

[0161] in, To use the second sample The set of parameters introduced into the first basic model The result of training (on a set of trainable parameters). For parameters Below, the first basic model is in the first... In the second sample, the first Second round of teaching dialogue At the same time, predict the output of the teacher's intelligent agent. The conditional probability; where the supervised fine-tuned parameters are used express;

[0162] Step 4.3: Use a subset of samples from the final dataset to perform reinforcement learning on the second basic model to obtain the final large-scale medical education model, including:

[0163] A portion of the teaching dialogue is randomly sampled from each sample in the final dataset to form a third sample. The set of all third samples is represented as follows: = , Indicates the first A third sample, This indicates the second teaching dialogue round in the third sample. This represents the number of third samples in the third sample set, and , ;

[0164] For each third sample, corresponding prompts are designed based on the cognitive state of different student agents within the third sample. A pre-defined second model is then used to generate positive samples of content generated by the teacher agent, conforming to three dimensions: clinical thinking framework, heuristic teaching characteristics, and personalized feedback ability (denoted as...). ), and sequentially generate negative samples of teacher agent-generated content that are deficient in each of the three dimensions (represented as follows). , and The criteria include: conforming to the clinical thinking framework (i.e., following the seven-step operation sequence of the clinical thinking framework); conforming to the characteristics of heuristic teaching (i.e., teaching through questioning to avoid revealing answers); and conforming to personalized feedback capabilities (i.e., optimizing the content generated by the teacher's agent based on the five cognitive states of the student's agent). The second pre-set model generates positive and negative samples through prompt words, and an example of such prompt words is as follows:

[0165] When the student agent understands a state deviating from its cognitive state, the prompt words for generating positive and negative samples are as follows:

[0166] "You are a medical 'teacher' with rich clinical teaching experience, skilled in conducting heuristic teaching based on real cases. Your task is to generate three incorrect teaching demonstrations based on the latest student responses in the 'teaching dialogue context' (current student cognitive state: comprehension deviation: unable to correctly understand the teacher's intention in asking the question, and the content of the answer deviates from the focus or topic): 1. Incorrect step-following, 2. Incorrect heuristic questioning, 3. Incorrect adaptive teaching, and one 'perfectly correct' heuristic teaching demonstration."

[0167] Requirements for generation: Correct demonstrations should encourage and inspire students, guiding them to focus on thinking; incorrect demonstrations should accurately reflect the corresponding errors.

[0168] Teaching steps and objectives (seven-step clinical thinking framework):

[0169] 1. Symptom Focus: Explain the case background and guide students to focus on the patient's symptoms or abnormal manifestations to identify the entry point for diagnosis.

[0170] 2. Preliminary diagnosis: Encourage students to propose possible diagnoses and demonstrate their ability to reason about the causes of the disease.

[0171] 3. Auxiliary examinations: Guide students to propose reasonable examination items based on the preliminary diagnosis.

[0172] 4. Diagnosis Confirmation: Based on the examination results, guide students to integrate information and clarify the diagnostic conclusion.

[0173] 5. Treatment Plan: Based on the diagnosis results, guide students to propose a reasonable treatment plan and explain the basis for it.

[0174] 6. Humanistic Care: Exploring humanistic factors such as doctor-patient communication, psychological support, ethical issues, and patient dignity during the diagnosis and treatment process;

[0175] 7. Analysis and summary: Guide students to systematically review the entire case and sort out the clinical thinking process.

[0176] The perfect and correct teaching demonstration standard: clearly and concisely clarify the purpose of the question and guide students back to the current teaching topic; the teaching should conform to the objectives of the current teaching steps; use open-ended and thought-provoking questions to guide thinking and avoid giving or obviously implying answers; actively encourage students and use friendly language; and ask specific questions based on the context of real cases.

[0177] Three incorrect teaching demonstration standards:

[0178] 1. Incorrect Steps: Students' answers deviate significantly from the focus of the question, and teachers deviate from the teaching objectives of the current step; or multiple teaching steps are mixed in the same question, causing confusion in thinking.

[0179] 2. Errors in heuristic questioning: indirectly revealing part of the answer or obviously hinting at the answer (e.g., using terms such as "hint" or "hint").

[0180] 3. Errors in adaptive teaching: When students' answers deviate from the intended meaning of the question, the teacher does not clarify the question's intent but simply affirms and repeats the previous question; the teacher misjudges the student's deviated answer as active exploration and responds irrelevantly.

[0181] Input: {teaching dialogue context};

[0182] Output: {One perfect positive sample, three negative samples that are flawed in their respective dimensions}.

[0183] The third sample is combined with the corresponding positive and negative samples to form a preference triplet; the form of the preference triplet is ( ), Indicates the third sample , Indicates a positive sample. Indicates negative samples (for) , or For each third sample, three preference triples are formed: ;

[0184] The second basic model is trained using Direct Preference Optimization (DPO) with preference triples to obtain the final large-scale medical education model, which can be expressed by the following formula:

[0185] ;

[0186] in, To directly optimize the training results, for Activation function For the model to be optimized (the intermediate model in the direct preference optimization training process) given input When, output The conditional probability, For the model to be optimized under given input When, output The conditional probability, For the second base model given input Below, output The conditional probability, For the second base model given input Below, output The conditional probability, This is a temperature coefficient used to adjust the intensity of preference.

[0187] The final medical education model can be deployed to the target teaching platform to provide online heuristic clinical teaching services. Specifically: Deployment environment configuration: Select a suitable operating environment and encapsulate the final medical education model into a callable service interface. This can be configured based on a cloud server, a hospital's local server, or a hybrid deployment architecture to ensure sufficient computing resources and response efficiency. Platform integration: Connect the final medical education model interface to the target teaching platform, such as a medical teaching management system, a clinical simulation training platform, or an online teaching system, and implement dialogue calls and interactive feedback through standard APIs. Teaching service application: Use structured clinical teaching data as input, and the final medical education model generates heuristic teaching dialogues. The platform outputs personalized feedback and interactive results to learners, achieving a traceable and assessable clinical thinking training process.

[0188] Experimental comparison:

[0189] The final medical education model of this application is compared with the teaching example of the general model GPT-4.1, as shown below:

[0190] Prompt: {You are a medical "teacher" with rich clinical teaching experience. Your task is to cultivate the "students'" clinical thinking ability through questioning, based on the provided "structured clinical teaching materials." The teaching must strictly follow these seven steps: "Patient observation, preliminary diagnosis, auxiliary examinations, diagnosis confirmation, treatment plan, humanistic care, and analysis and summary." Answers to each step can be found in the corresponding content within the "structured clinical teaching materials."}

[0191] Input: {Clinical teaching corpus: [Case background]: The patient is a 68-year-old female farmer who complained of weakness in her right limbs and slurred speech for more than one day. Her present medical history showed that her symptoms worsened within one day, without accompanying symptoms such as dizziness, nausea, or vomiting. She had a history of intermittent hypertension and headaches. Physical examination revealed dysarthria, a shallow right nasolabial fold, leftward deviation of the corner of the mouth, rightward deviation of the tongue, right upper limb muscle strength grade 4-, right lower limb muscle strength grade 3+, right-sided sensory loss, and a positive double gracilis sign.}

[0192] [Symptoms to watch out for]: The main symptoms include weakness in the right limbs and slurred speech. Physical signs such as right facial paralysis, muscle weakness and sensory abnormalities suggest neurological dysfunction.

[0193] [Preliminary diagnosis]: Initially considered to be an acute cerebrovascular event, such as cerebral infarction.

[0194] [Auxiliary examinations]: (1) Head CT; (2) Head MRI / MRA (including DWI).

[0195] [Examination results]: (1) Head CT showed multiple small patchy low-density shadows in the left basal ganglia, ventricle enlargement, and widening of sulci; (2) Head MRI showed DWI high signal and ADC low signal in the left basal ganglia, and multiple punctate high signal in the periventricular white matter and subcortical area of ​​the frontoparietal lobes; MRA showed multiple stenosis of cerebral arteries.

[0196] [Diagnosis Confirmed]: Acute cerebral infarction (cerebral arteriosclerosis type).

[0197] [Treatment plan]: (1) Acute phase treatment includes improving microcirculation, protecting brain tissue and antiplatelet aggregation; (2) Post-discharge management includes low-salt and low-fat diet, continued oral medication, rehabilitation therapy and regular follow-up examinations.

[0198] [Humanistic Care]: If a patient expresses concerns about the recovery process, I would communicate like this: 'We completely understand your concerns about recovery after a stroke. It's natural to feel anxious after such a diagnosis. Please rest assured that we will accompany you step by step in managing your condition, including medication, rehabilitation training, and lifestyle adjustments. If you have any questions or discomfort during treatment, please feel free to tell us. We will listen patiently and fully support you.'

[0199] [Analysis Summary]: The patient, a 68-year-old female, presented with acute right-sided limb weakness and slurred speech. Imaging revealed signs of acute infarction in the left basal ganglia, leading to a diagnosis of acute ischemic stroke. She was treated with antiplatelet therapy and discharged after symptom relief. Treatment strategies included acute intervention and long-term management. Humanistic care emphasized patient listening and psychological support to enhance the patient's confidence in recovery.

[0200] General GPT-4.1 instruction:

[0201] GPT-4.1: Now, I will be playing the role of an inspirational clinical instructor. Please answer the questions by following these steps:

[0202] 1. Symptom Focus: Which body system is most likely to be dysfunctional due to weakness in the right limbs and slurred speech?

[0203] 2. Initial diagnosis: Symptoms worsen rapidly within a day. In which disease is this pattern of symptom change most common?

[0204] 3. Ancillary examinations: What does the appearance of multiple low-density areas on a CT scan mean? Do these findings support your initial diagnosis?

[0205] 4. Diagnosis Confirmation: Based on the combined symptoms and examination results, how would you confirm the diagnosis?

[0206] 5. Treatment plan: What basic treatment measures should be taken during the acute phase?

[0207] 6. Humanistic care: If a patient feels anxious about the recovery process, how would you comfort and encourage her?

[0208] 7. Analysis and Summary: Please briefly summarize the diagnostic approach and key treatment points for this case.

[0209] The final medical education model taught in this application:

[0210] The Ultimate Model for Medical Education: Welcome to today's clinical teaching! We have a 68-year-old female farmer patient who has presented with right-sided limb weakness and slurred speech for over a day, with symptoms worsening during the day. She denies dizziness, nausea, or vomiting. Physical examination reveals dysarthria, right-sided central facial paralysis, leftward deviation of the mouth, right upper limb muscle strength grade 4, right lower limb muscle strength grade 3+, right-sided sensory loss, and bilateral positive pathological reflexes. Based on this description, which symptoms or abnormal signs do you think should be our primary clinical focus? Please share your initial thoughts.

[0211] User: The key focus should be on right-sided limb weakness, speech impairment, dysarthria, and positive pathological reflexes, as these all indicate the presence of neurological dysfunction.

[0212] Final Medical Education Model: Excellent! You accurately identified right-sided limb weakness, speech impairment, dysarthria, and positive pathological reflexes as key signs strongly suggesting neurological involvement. Now, let's move on to the next step: Based on these symptoms and signs, what do you think is the most likely preliminary diagnosis? Please explain your reasoning process.

[0213] User: The preliminary diagnosis is acute cerebral infarction in the left cerebral hemisphere. The reasoning is based on the right hemiplegia and central facial paralysis suggesting a lesion in the contralateral hemisphere, and the speech impairment supporting involvement of the dominant hemisphere.

[0214] The final medical education model: Fantastic! Your reasoning is clear and professional—accurately locating right-sided hemiplegia and central facial palsy to the left cerebral hemisphere, and emphasizing the involvement of the dominant hemisphere to explain the speech disorder, demonstrating solid neuroanatomical knowledge. Now, let's move forward: Based on this preliminary diagnosis, what auxiliary examinations do you think we should arrange to confirm the diagnosis and assess the condition? Please share your thoughts and reasons.

[0215] Based on the above experimental comparisons, it can be seen that the general-purpose large-scale model GPT-4.1 often outputs all questions at once, lacking the gradual progression of clinical thinking chains, thus making it difficult to achieve truly heuristic teaching. In contrast, the final medical education large-scale model proposed in this application can guide students to reason and explore step by step based on a clinical thinking framework. While maintaining the logical consistency of teaching, it effectively stimulates students' independent thinking ability and provides appropriate guidance and feedback, thus better meeting the actual needs of clinical teaching.

[0216] In another embodiment, the present invention also provides a fine-tuning device for a large-scale medical education model based on clinical thinking, including a processor and a memory storing a plurality of computer instructions, wherein the computer instructions, when executed by the processor, implement the steps of the automatic response method integrating dual word segmentation and iterative feedback. One such fine-tuning device for a large-scale medical education model based on clinical thinking includes:

[0217] The first module is used to construct a clinical thinking framework based on clinical teaching data and expert opinions;

[0218] The second module is used to collect different types of medical electronic medical record data as the initial first clinical corpus, and to perform objective information extraction, subjective information generation, data filtering and enhancement processing on the first clinical corpus according to the clinical thinking framework to obtain a structured clinical teaching corpus.

[0219] The third module is used to construct a collaborative architecture containing teacher agents, expert agents, and student agents based on the clinical thinking framework and clinical teaching corpus. Based on the collaborative architecture, the final dataset of teaching dialogues between teacher agents, expert agents, and student agents is generated. In this dataset, the teacher agent automatically generates a teaching guidance plan, the expert agent detects and corrects the content generated by the teacher agent, and the student agent responds based on the results generated by the expert agent and the teacher agent.

[0220] The fourth module is used to select a preset first large model as the first basic model, use the final dataset to perform supervised fine-tuning of the first basic model to obtain the second basic model, and finally use a portion of the samples in the final dataset to perform reinforcement learning on the second basic model to obtain the final large medical education model.

[0221] For specific limitations regarding the fine-tuning device for the large-scale model of heuristic medical education based on clinical thinking, please refer to the limitations on the fine-tuning method of the large-scale model of heuristic medical education based on clinical thinking mentioned above, which will not be repeated here.

[0222] The memory and processor are electrically connected directly or indirectly to enable data transmission or interaction. For example, these components can be electrically connected to each other via one or more communication buses or signal lines. The memory stores a computer program that can run on the processor, and the processor implements the method of the present invention by running the computer program stored in the memory.

[0223] The memory may be, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), etc. The memory is used to store programs, which the processor executes upon receiving execution instructions.

[0224] The processor may be an integrated circuit chip with data processing capabilities. The aforementioned processor can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this invention. The general-purpose processor can be a microprocessor or any conventional processor.

[0225] This method and device for fine-tuning a large-scale medical education model based on clinical thinking forms a structured clinical teaching corpus through a clinical thinking framework. Based on this corpus, a three-agent collaborative architecture—teacher agent, expert agent, and student agent—is constructed. The final dataset is generated through this collaborative architecture and then fine-tuned using the final dataset to obtain the final large-scale medical education model. This significantly improves the model's interactivity, adaptability, and teaching guidance capabilities. The final large-scale medical education model can guide students to reason and explore step-by-step based on the clinical thinking framework. While maintaining the logical flow of teaching, it effectively stimulates users' independent thinking ability and provides appropriate guidance and feedback, thus better meeting the actual needs of clinical teaching.

[0226] It should be understood that, although Figure 1 The steps in the flowchart are shown sequentially as indicated by the arrows, but these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise specified herein, there is no strict order in which these steps are executed, and they can be performed in other orders. Figure 1 At least some of the steps in the process may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential, but can be executed in turn or alternately with other steps or at least some of the sub-steps or stages of other steps.

[0227] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0228] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these modifications and improvements all fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the appended claims.

Claims

1. A method for fine-tuning a large-scale medical education model based on clinical thinking, characterized by: The fine-tuning method for the large-scale model of heuristic medical education based on clinical thinking includes: Construct a clinical thinking framework based on clinical teaching data and expert opinions; We collected different types of electronic medical record data as the initial first clinical corpus, and then performed objective information extraction, subjective information generation, data filtering and enhancement processing on the first clinical corpus based on the clinical thinking framework to obtain a structured clinical teaching corpus. Based on the clinical thinking framework and clinical teaching corpus, a collaborative architecture is constructed that includes teacher agents, expert agents, and student agents. Based on the collaborative architecture, a final dataset of teaching dialogues between teacher agents, expert agents, and student agents is generated. In this dataset, the teacher agent automatically generates a teaching guidance plan, the expert agent detects and corrects the content generated by the teacher agent, and the student agent responds based on the results generated by the expert agent and the teacher agent. The first pre-defined large model is selected as the first basic model. The first basic model is then fine-tuned under supervision using the final dataset to obtain the second basic model. Finally, a portion of the samples in the final dataset are used to perform reinforcement learning on the second basic model to obtain the final large medical education model. The clinical thinking framework includes seven steps: Symptom Focus: Identifying symptoms based on the patient's medical history; Preliminary Diagnosis: Making a preliminary diagnosis based on the identified symptoms; Auxiliary Examinations: Conducting auxiliary examinations to verify the preliminary diagnosis; Diagnosis Confirmation: Confirming the diagnosis based on the results of the auxiliary examinations; Treatment Plan: Determining the treatment plan based on the confirmed diagnosis; Humanistic Care: Paying attention to and communicating with the patient regarding their illness and mental well-being; and Analysis and Summary: Analyzing and summarizing the treatment process.

2. The method for fine-tuning a large-scale model of heuristic medical education based on clinical thinking as described in claim 1, characterized in that: The process of obtaining the clinical teaching corpus includes: The collected electronic medical record data is converted to a preset text format, and the converted electronic medical record data is used as the initial first clinical corpus. The second pre-defined model was used to extract objective information and generate subjective information from the first clinical corpus, resulting in a structured second clinical corpus. The extracted objective information included: the patient's medical history background, auxiliary examination items, auxiliary examination results, diagnostic confirmation results, and treatment plan. The generated subjective information included: identifying symptoms that affect the patient's disease diagnosis, generating preliminary diagnostic results based on the identified symptoms, generating communication suggestions for the patient based on the identified symptoms, and analyzing and summarizing based on the treatment plan. Remove patient identity information from the second clinical corpus; The second clinical corpus, after being removed, underwent content correction and deletion, incomplete fields were supplemented, and the one-to-one correspondence between auxiliary examination items, auxiliary examination results, diagnostic confirmation results, and treatment plans was verified. Finally, a structured clinical teaching corpus was obtained through manual review.

3. The method for fine-tuning a large-scale model of heuristic medical education based on clinical thinking as described in claim 1, characterized in that: In the process of building the collaborative architecture, prompt words are designed for the teacher agent, enabling the teacher agent to automatically generate a teaching guidance plan based on the clinical teaching corpus, clinical thinking framework and historical teaching dialogue. The teaching guidance plan includes the construction of a hierarchical question chain based on the clinical thinking framework, a question-driven heuristic questioning mechanism and an adaptive teaching response based on the differences in the preset cognitive state of the student agent. The preset cognitive states include five types: comprehension deviation, knowledge error, omission of expression, standard response and active exploration. For expert agents, prompt words are designed to enable them to detect and correct the content generated by teacher agents. If the expert agent detects that the content generated by the teacher agent contains knowledge errors, reveals answers, or violates the clinical reasoning framework, the content generated by the teacher agent will be revised. If no problems are detected, no revision is required. For student agents, prompts are designed so that student agents can respond based on the content generated by teacher agents and expert agents, as well as the preset cognitive state of the student agent. The teaching dialogue between agents is set up in the first loop order: teacher agent, expert agent, student agent, and then back to teacher agent.

4. The method for fine-tuning a large-scale model of heuristic medical education based on clinical thinking as described in claim 3, characterized in that: In the process of generating the final dataset, the initial dataset of teaching dialogues is generated in the order of the first cycle. The initial dataset contains multiple first samples, and each first sample contains multiple rounds of first teaching dialogues. The teaching dialogues from the teacher agent to the expert agent and then to the student agent will be used as a round of first teaching dialogues. In the next round, the teacher agent will generate a teaching guidance plan based on the historical teaching. For each round of the first teaching dialogue in each first sample, if the expert agent revises the content generated by the teacher agent, then the content generated by the expert agent and the content answered by the student agent in the first round of the first teaching dialogue are considered as the second round of the second teaching dialogue. If the expert agent does not revise the content generated by the teacher agent, then the content generated by the teacher agent and the content answered by the student agent in the first round of the first teaching dialogue are considered as the second round of the second teaching dialogue. Multiple rounds of second teaching dialogues constitute the second sample, and multiple second samples constitute the final dataset.

5. The method for fine-tuning a large-scale model of heuristic medical education based on clinical thinking as described in claim 1, characterized in that: The process involves using a subset of samples from the final dataset to perform reinforcement learning on the second base model, resulting in the final large-scale medical education model, which includes: A portion of the teaching dialogues is randomly sampled from each sample in the final dataset to form a third sample; For each third sample, corresponding prompt words are designed based on the cognitive state of different student agents in the third sample. The pre-set second model is used to generate positive samples of teacher agent-generated content that conforms to the three dimensions of clinical thinking framework, heuristic teaching characteristics, and personalized feedback ability. Negative samples of teacher agent-generated content that are insufficient in each of the three dimensions are also generated in turn. Conforming to the clinical thinking framework means executing the seven-step operation of the clinical thinking framework in sequence. Conforming to the heuristic teaching characteristics means teaching by asking questions and avoiding revealing the answers. Conforming to the personalized feedback ability means optimizing the teacher agent-generated content according to the five cognitive states of the student agents. The third sample is combined with the corresponding positive and negative samples to form a preference triple; By using preference triples to perform direct preference optimization training on the third basic model, the final large-scale medical education model is obtained.

6. A fine-tuning device for a large-scale medical education model based on clinical thinking, comprising a processor and a memory storing a number of computer instructions, characterized in that, When the computer instructions are executed by the processor, they implement the steps of the method according to any one of claims 1 to 5.