Local physical experiment question and answer model training method and system
By using a closed-loop mechanism for edge-cloud collaboration, the system generates and corrects the answers of the local physical experiment question-answering model using a large language model in the cloud, which solves the problems of data scarcity and reduced reasoning ability in vertical domains and achieves efficient and accurate training of the local question-answering model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TSINGHUA UNIVERSITY
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies lack high-quality vertical domain fine-tuning datasets for physics experiment teaching, manual annotation is costly, and local models suffer from reduced inference capabilities due to quantization compression, making it difficult to meet the needs of model iteration.
By constructing an edge-cloud collaborative closed-loop mechanism of "cloud-based question generation - local answering - cloud-based polishing - local training", questions are generated by a large language model in the cloud, and the answers from the local language model are corrected to form a training dataset. The local model is then fine-tuned on this dataset and, combined with GPU memory optimization, runs on a consumer-grade graphics card.
It enables the efficient generation of high-quality training data with limited hardware resources, corrects logical biases caused by quantization, improves the accuracy and robustness of local question answering models, and provides a low-cost, transferable localized solution.
Figure CN122242750A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, specifically to a method and system for training a local physics experiment question-answering model.
Background Technology
[0002] With the rapid development of large language models, their application in vertical fields such as educational assistance and intelligent Q&A has become a research hotspot. In physics experimental teaching scenarios, such as precision experiments like "steady-state method for measuring thermal conductivity," artificial intelligence assistants are usually required to understand complex experimental principles, equipment operation procedures, and data analysis logic. This places extremely high demands on the model's domain expertise and reasoning ability.
[0003] Related technologies typically combine pre-trained large models with parameter-efficient fine-tuning techniques to construct vertical domain question-answering models. However, this approach faces the following technical bottlenecks in practical applications. First, high-quality professional datasets are difficult to obtain. Physics experiments are a typical "long-tail knowledge" domain, and existing public datasets are mostly general corpora that lack standardized question-answer pairs covering experimental steps, error analysis, and equipment operation. Relying on manual annotation by experts is both costly and inefficient, making it difficult to meet the needs of model iteration. Second, the knowledge iteration efficiency of local models is low. To be deployed on consumer-grade graphics cards, models typically need to be quantized and compressed (e.g., 4-bit or 8-bit quantization), which reduces the model's reasoning ability and makes it prone to logical hallucinations. Furthermore, relying on a single model to generate data in a self-loop tends to accumulate logical biases over long-term iteration, making it difficult to guarantee the physical rigor of the generated data.
[0004] Therefore, how to efficiently and cost-effectively generate high-quality vertical domain fine-tuning datasets to build vertical domain question-answering models under limited hardware resources is an urgent technical problem to be solved.
Summary of the Invention
[0005] In view of the above problems, embodiments of this application provide a local physical experiment question-answering model training method and system to overcome or at least partially solve the above problems.
[0006] To solve the above-mentioned technical problems, this application is implemented as follows:
A first aspect of this application discloses a method for training a local physics experiment question-answering model, the method comprising:
Loading a pre-trained local language model and performing GPU memory optimization on the local language model to obtain a quantized local language model;
Invoking a cloud-based large language model to generate multiple questions related to a physics experiment based on background knowledge of the physics experiment, where the background knowledge is obtained by parsing a physics experiment principle document;
Inputting each question into the quantized local language model and generating an original answer to the question through model inference;
Sending the question-answer pair containing the question and the original answer to the cloud-based large language model, which performs logical evaluation and correction on the original answer based on the background knowledge of the physics experiment and generates a standard question-answer pair that is personalized to the performance of the local language model and conforms to the instruction fine-tuning format;
Collecting and storing multiple standard question-answer pairs to form a training dataset, and using the training dataset to fine-tune the quantized local language model to obtain the local physics experiment question-answering model.
[0007] Optionally, performing GPU memory optimization on the local language model includes: quantizing the weights of the local language model to 4 bits or 8 bits, so that the original 16-bit floating-point weights are converted into low-bit integers; and loading a low-rank adapter into the quantized local language model to construct a parameter-efficient fine-tuning structure. Fine-tuning the quantized local language model using the training dataset includes: updating the parameters of the low-rank adapter using the training dataset.
[0008] Optionally, invoking the cloud-based large language model to generate multiple questions related to the physics experiment based on the background knowledge of the physics experiment includes: maintaining a list of generated questions; and, each time the cloud-based large language model is invoked, using the list of generated questions as context input to guide the cloud-based large language model to generate new questions that do not overlap with the questions already in the list.
[0009] Optionally, the cloud-based large language model performing logical evaluation and correction on the original answer based on the background knowledge of the physics experiment includes: the cloud-based large language model judging, based on the background knowledge of the physics experiment, whether there are logical flaws in the original answer's description of the experimental principles, experimental steps, and experimental phenomena; and, if there are logical flaws, the cloud-based large language model correcting the original answer based on the background knowledge of the physics experiment so that it conforms to the inherent logic of the physics experiment.
[0010] Optionally, generating standard question-answer pairs that are personalized to the performance of the local language model and conform to the instruction fine-tuning format includes: the cloud-based large language model scoring multiple different original answers generated for the same question, where the scoring is based on the accuracy of the answer, its completeness, and its conformity with the experimental principles; and selecting the original answers whose scores meet a preset threshold as preferred answers and combining them with the question to form the standard question-answer pairs.
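The threshold-based selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the `ScoredAnswer` type, the 0-to-1 score scale, the equal weighting of the three criteria, and the 0.8 threshold are all hypothetical, and in practice the scores would be returned by the cloud-based large language model.

```python
from dataclasses import dataclass


@dataclass
class ScoredAnswer:
    """One candidate answer with hypothetical cloud-assigned scores in [0, 1]."""
    text: str
    accuracy: float
    completeness: float
    conformity: float  # conformity with the experimental principles

    @property
    def overall(self) -> float:
        # Assumption: the three criteria are weighted equally.
        return (self.accuracy + self.completeness + self.conformity) / 3.0


def select_preferred(question, candidates, threshold=0.8):
    """Keep candidates whose overall score meets the threshold, best first."""
    kept = [c for c in candidates if c.overall >= threshold]
    kept.sort(key=lambda c: c.overall, reverse=True)
    return [{"question": question, "answer": c.text} for c in kept]


candidates = [
    ScoredAnswer("Answer A", 0.9, 0.9, 0.9),
    ScoredAnswer("Answer B", 0.5, 0.6, 0.4),
]
pairs = select_preferred("How is steady state judged?", candidates)
```

Here only "Answer A" passes the assumed threshold and is paired with the question.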
[0011] Optionally, before inputting the question into the quantized local language model, the method further includes: slicing the background knowledge of the physics experiment and storing the slices in a vector database; retrieving the relevant knowledge slices from the vector database based on the question; and using the relevant knowledge slices as part of the input context, inputting them together with the question into the quantized local language model to assist it in generating the original answer.
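The slice, store, and retrieve flow above might look like the sketch below. It is deliberately simplified: a bag-of-words counter stands in for a real embedding model, an in-memory list stands in for the vector database, and all names (`slice_text`, `TinyVectorStore`, `build_input`) as well as the fixed-length slicing rule are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def slice_text(text, max_words=40):
    """Split text into slices of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (slice text, vector)

    def add(self, slices):
        for s in slices:
            self.items.append((s, embed(s)))

    def search(self, query, top_k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [s for s, _ in ranked[:top_k]]


def build_input(question, slices):
    """Prepend the retrieved slices to the question as input context."""
    context = "\n".join(slices)
    return f"Reference material:\n{context}\n\nQuestion: {question}"


document = (
    "Steady state is judged from the temperature change of the heat sink. "
    "The digital thermocouple display must be zeroed before measurement."
)
store = TinyVectorStore()
store.add(slice_text(document, max_words=12))
question = "How is steady state judged?"
hits = store.search(question, top_k=1)
local_input = build_input(question, hits)
```

The resulting `local_input` string is what would be passed to the quantized local language model in place of the bare question.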
[0012] Optionally, after completing a round of data generation tasks, the process also includes: releasing the GPU memory resources used during the generation process to prevent memory fragmentation from accumulating due to prolonged operation.
[0013] Optionally, the background knowledge of the physical experiment also includes real-time data acquired from the physical experiment equipment; the real-time data is read directly from digital sensors connected to the physical experiment equipment via a serial communication interface.
[0014] Optionally, the physical experiment is a steady-state method for measuring thermal conductivity, and the content of the physical experiment principle document includes at least one of the following: physical formula for measuring thermal conductivity, method for judging steady-state conditions, steps for measuring the cooling rate of the heat sink, and operating procedures for digital display thermocouples.
[0015] A second aspect of this application discloses a local physics experiment question-answering model training system, used to execute the local physics experiment question-answering model training method described in the first aspect of this application, including:
Local computing unit: contains at least one consumer-grade GPU, used to load and run the local language model after GPU memory optimization and to perform model inference to generate the original answer;
Experimental knowledge base: stores the physics experiment principle documents;
Cloud interface module: used to communicate with the cloud-based large language model, send question generation requests and question-answer pair correction requests, and receive the returned results;
Data generation and control module: connected to the local computing unit, the experimental knowledge base, and the cloud interface module respectively, and used to control the following processes:
Reading the physics experiment principle document from the experimental knowledge base, and driving the cloud-based large language model through the cloud interface module to generate multiple questions;
Sending each question to the local computing unit to obtain the generated original answer;
Sending the question-answer pair containing the question and the original answer to the cloud-based large language model via the cloud interface module for logical correction;
Formatting and storing the corrected answers and the corresponding questions to form a training dataset;
Fine-tuning the local language model using the training dataset to obtain the local physics experiment question-answering model.
[0016] A third aspect of this application discloses an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the local physical experiment question-answering model training method described in the first aspect of this application.
[0017] A fourth aspect of this application discloses a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the local physical experiment question-answering model training method described in the first aspect of this application.
[0018] A fifth aspect of this application discloses a computer program product, including a computer program that, when executed by a processor, implements the steps of the local physical experiment question-answering model training method described in the first aspect of this application.
[0019] The embodiments of this application have the following advantages: In this embodiment, by calling a cloud-based large language model, multiple questions related to physics experiments are generated based on the background knowledge of physics experiments, and the answers of the local language model are corrected. A large amount of diverse training data can be mined from the experimental principle documents without relying on manual annotation, which effectively solves the technical problems of the lack of high-quality training data and the high cost of manual annotation in vertical fields such as physics experiments.
[0020] To address the tendency of the local model's reasoning ability to decline after GPU memory optimization, this application constructs an "edge-cloud collaboration" correction mechanism. By using the cloud-based large language model to logically evaluate and correct the original answers, it identifies and corrects logical biases and factual errors caused by quantization compression, so that the strong reasoning capabilities of the cloud feed knowledge back to the local language model and the physical rigor of the training data is ensured.
[0021] Furthermore, by optimizing the local language model for video memory, it can run on consumer-grade graphics cards with limited video memory. Combined with the invocation of a large cloud-based language model, a fully automated closed loop is achieved, from document parsing, question generation, and answer correction to dataset formation, providing a transferable and low-cost localized solution for physics experiment teaching. In addition, by collecting standard question-answer pairs corrected in the cloud and forming a training dataset, the local language model is fine-tuned, enabling the final local physics experiment question-answering model to better inherit the logical reasoning capabilities of the cloud model, resulting in higher question-answering accuracy and robustness in the specific scenario of physics experiments.
[0022] In summary, this method achieves automated generation of fine-tuning datasets for the vertical domain of physics experiments and automated training of local physics experiment question-answering models by constructing an edge-cloud collaborative closed-loop mechanism of "cloud-based question generation - local answering - cloud-based polishing - local training".
Attached Figure Description
[0023] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0024] Figure 1 is a flowchart of the steps of a local physics experiment question-answering model training method provided in an embodiment of this application;
Figure 2 is an overall architecture diagram of a local physics experiment question-answering model training method provided in an embodiment of this application;
Figure 3 is a schematic diagram of the structure of a local physics experiment question-answering model training system provided in an embodiment of this application;
Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0025] To make the above-mentioned objectives, features, and advantages of this application more apparent and understandable, the technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0026] This application addresses the technical problems of scarce high-quality fine-tuning data in the vertical field of physics experiments, the high cost of manual annotation, and the decreased reasoning ability of local lightweight models caused by quantization compression. It proposes a training method and system for a local physics experiment question-answering model. The core concept is to construct an edge-cloud collaborative closed-loop mechanism: "cloud-based question generation - local answering - cloud-based polishing - local training." First, a cloud-based large language model automatically generates diverse questions from experimental principle documents. Then, the questions are answered by a local language model that has undergone GPU memory optimization, yielding the original answers. Next, the cloud-based large language model is invoked again to logically evaluate and correct the local answers against the original experimental documents, correcting reasoning biases caused by quantization. Finally, the corrected high-quality question-answer pairs are used as training data to fine-tune the local language model. This concept transfers the strong logical reasoning of the cloud-based large language model to the local language model, which solves the problem of sourcing training data, improves the accuracy of the quantized model's answers, and enables fully automated operation on consumer-grade GPUs.
[0027] The technical solutions provided in this application will be described in detail below with reference to the accompanying drawings, through specific embodiments and application scenarios.
[0028] Referring to Figure 1, which is a flowchart of the steps of a local physics experiment question-answering model training method provided in an embodiment of this application, the method may include steps S110 to S150:
Step S110: Load the pre-trained local language model and perform GPU memory optimization on the local language model to obtain a quantized local language model.
[0029] In this step, the pre-trained local language model can be an open-source base large language model (e.g., 7B parameter scale), which typically stores weights in 16-bit floating-point (FP16) format, requires a large amount of video memory, and is difficult to run directly on consumer-grade graphics cards.
[0030] To address this, the local language model undergoes memory optimization, including at least model quantization, which converts weights from high-bit representation (such as FP16) to low-bit representation (such as 4-bit or 8-bit integers). This significantly reduces the model file size and runtime memory usage, enabling it to be loaded and run on consumer-grade graphics cards with limited memory (such as the NVIDIA RTX 3090, with 24GB of VRAM).
[0031] Step S120: Call the cloud-based large language model to generate multiple questions related to physics experiments based on the background knowledge of physics experiments; wherein, the background knowledge of physics experiments is obtained by parsing the physics experiment principle document.
[0032] In this step, the physics experiment principle document stored locally, such as the "Experimental Guide to Measuring Thermal Conductivity using the Steady-State Method", is first read. The text content is then extracted from the document using a document parsing tool (such as Python's python-docx library) and used as the background knowledge of the physics experiment.
[0033] Then, the application programming interface (API) of the cloud-based large language model is invoked. The cloud-based large language model refers to a large model deployed on a cloud server with an extremely large parameter scale (e.g., hundreds of billions) and powerful logical reasoning capabilities, such as Qwen-Max, GPT-4, Qwen-3.5, and Deepseek-V3.2. During the invocation, the background knowledge of the physics experiment and a preset prompt template (e.g., "Based on the following experimental principles, please provide five questions about the experimental steps, understanding of the principles, or error analysis") are sent to the cloud-based large language model. Relying on its semantic understanding and generation capabilities, the cloud-based large language model outputs a series of diverse, professional questions related to the experiment.
[0034] Thus, this step enables the automated transformation of static physical experiment principle documents into dynamic questions, eliminating the need for manual question design and writing.
[0035] Step S130: Input the question into the quantized local language model, and generate the original answer to the question through model reasoning.
[0036] In this step, the question generated by the cloud-based large language model in step S120 is input into the quantized local language model loaded in step S110. Model inference is then performed to generate the corresponding answer. Because the local language model undergoes quantization and compression, its reasoning ability may be reduced, and the generated answer may contain inaccurate expressions, flawed logic, or even misleading content. Therefore, the answer generated in this step is called the original answer, which represents the knowledge level of the local language model in its current state.
[0037] During inference, the quality of the responses can be controlled by setting generation parameters. For example, a temperature coefficient can be set to balance accuracy and diversity, and a repetition penalty coefficient can be set to prevent the model from generating duplicate content.
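To make the effect of these two generation parameters concrete, the sketch below applies a repetition penalty and a temperature to a small list of logits before converting them to sampling probabilities. It follows one common convention (positive logits of already generated tokens are divided by the penalty, negative ones multiplied by it); the function name, the default values, and the example logits are assumptions for illustration, and the application does not specify particular values.

```python
import math


def adjust_logits(logits, generated_ids, temperature=0.7, repetition_penalty=1.2):
    """Return sampling probabilities after repetition penalty and temperature."""
    adjusted = list(logits)
    for i in set(generated_ids):
        # Make tokens that were already generated less likely.
        if adjusted[i] > 0:
            adjusted[i] /= repetition_penalty
        else:
            adjusted[i] *= repetition_penalty
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = [x / temperature for x in adjusted]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]
plain = adjust_logits(logits, generated_ids=[], temperature=1.0, repetition_penalty=1.0)
sharp = adjust_logits(logits, generated_ids=[], temperature=0.5, repetition_penalty=1.0)
penalized = adjust_logits(logits, generated_ids=[0], temperature=1.0, repetition_penalty=1.5)
```

Compared with `plain`, the top token gains probability in `sharp` (lower temperature favors the most likely token) and loses probability in `penalized` (token 0 was marked as already generated).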
[0038] Step S140: Send the question-answer pair containing the question and the original answer to the cloud-based large language model. The cloud-based large language model performs logical evaluation and correction on the original answer based on the background knowledge of the physical experiment, and generates a personalized standard question-answer pair based on the performance of the local language model and conforming to the instruction fine-tuning format.
[0039] In this step, the question and the original answer are combined into a question-answer pair, and the cloud-based large language model is invoked again. Upon receiving this information, the cloud-based large language model performs two tasks. First, it performs a logical evaluation: based on the background knowledge of the physics experiment, it judges whether there are errors or flaws in the original answer's descriptions of experimental principles, operating procedures, physical formulas, and phenomena. For example, in an experiment measuring thermal conductivity using the steady-state method, it judges whether the original answer correctly describes the criteria for determining steady state and whether it uses the thermal conductivity formula accurately. Second, it performs correction: if logical flaws are found, it rewrites or refines the relevant statements based on the background knowledge of the physics experiment so that they conform to the inherent logic of the physics experiment. Finally, the corrected answer is combined with the original question and output in the required format (such as JSONL format, including instruction, input, and output fields), yielding a standard question-answer pair that is personalized to the performance of the local language model and conforms to the instruction fine-tuning format.
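A record in the JSONL format mentioned above could be produced roughly as follows. The three field names come from the text; the mapping of the question to `instruction`, the empty default for `input`, and the helper names `to_record` and `append_jsonl` are assumptions made for this sketch.

```python
import json


def to_record(question, corrected_answer, context=""):
    """Build one instruction fine-tuning record with the three fields named above."""
    return {"instruction": question, "input": context, "output": corrected_answer}


def append_jsonl(path, record):
    """Append one record to a JSONL file as a single line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


record = to_record(
    "How do you determine whether the system has reached steady state?",
    "Compare whether the temperature change of the heat sink is less than 0.1℃.",
)
# ensure_ascii=False keeps characters such as ℃ readable in the file.
line = json.dumps(record, ensure_ascii=False)
```

Calling `append_jsonl("train.jsonl", record)` (the file name is hypothetical) once per corrected pair builds up the training dataset one line at a time.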
[0040] Step S150: Collect and store multiple standard question-answer pairs to form a training dataset, and use the training dataset to fine-tune the quantized local language model to obtain a local physical experiment question-answering model.
[0041] Repeat steps S120 to S140 until a preset number of standard question-answer pairs are generated, forming a training dataset. Use this dataset to fine-tune the quantized local language model from step S110. Since the training data has already undergone logical calibration by the cloud-based large language model, the fine-tuning process essentially "transfers" the logical reasoning capabilities of the cloud-based large language model to the local language model. After fine-tuning, a local physics experiment question-answering model specifically designed for physics experiment question-answering tasks is obtained. This model can run offline locally, providing users with accurate and professional answering services.
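The repetition of steps S120 to S140 can be sketched as a simple control loop. The three callables below are stand-ins for the cloud question generator, the quantized local model, and the cloud corrector; their names and signatures are assumptions, and the stub implementations exist only so the sketch runs without any model or network access.

```python
def build_dataset(background, generate_questions, answer_locally, correct_in_cloud, target_size):
    """Repeat S120 to S140 until target_size standard pairs are collected."""
    dataset = []
    asked = []
    while len(dataset) < target_size:
        questions = generate_questions(background, asked)      # S120 (cloud)
        if not questions:
            break  # stop instead of looping forever if nothing new is returned
        for q in questions:
            asked.append(q)
            raw = answer_locally(q)                            # S130 (local)
            corrected = correct_in_cloud(background, q, raw)   # S140 (cloud)
            dataset.append({"instruction": q, "input": "", "output": corrected})
            if len(dataset) >= target_size:
                break
    return dataset


# Stub callables (hypothetical) in place of real model calls.
def fake_generate(background, asked):
    start = len(asked)
    return [f"Question {start + i + 1}" for i in range(2)]


def fake_answer(question):
    return f"raw answer to {question}"


def fake_correct(background, question, raw):
    return raw.replace("raw", "corrected")


dataset = build_dataset("background text", fake_generate, fake_answer, fake_correct, target_size=3)
```

The returned list is the training dataset that step S150 would then use for fine-tuning.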
[0042] The technical solution adopted in this embodiment realizes the automated generation of fine-tuned datasets in the field of physics experiments through a closed-loop mechanism of "cloud-based question generation - local answering - cloud-based polishing - local training", which significantly reduces the cost of data construction. The local answers are corrected by using a large language model in the cloud, which effectively corrects the logical deviation caused by quantization. The optimization of video memory enables the entire process to run on consumer-grade graphics cards, providing a low-threshold and transferable local intelligent agent construction solution for physics experiment teaching.
[0043] In an optional embodiment, "performing GPU memory optimization on the local language model" in step S110 above specifically includes the following sub-steps S110-1 to S110-2:
Step S110-1: Perform 4-bit or 8-bit quantization on the weights of the local language model, so that the original 16-bit floating-point weights are converted into low-bit integers.
[0044] In this step, quantization reduces the numerical precision with which the model weights are represented. Relative to FP16 storage (2 bytes per parameter), 8-bit quantization compresses each parameter to 1 byte, reducing weight memory usage by approximately 50%; 4-bit quantization compresses it to 0.5 bytes, reducing weight memory usage by approximately 75%. Taking a 7B-parameter model as an example, after 4-bit quantization the weights require only about 3.5GB of memory, and even with inference overhead the model can run smoothly on an RTX 3090. Quantization methods can include symmetric quantization and asymmetric quantization. The quantized weights are stored as low-bit integers and are dequantized to floating-point numbers for calculation during inference, balancing memory usage against computational precision.
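The memory arithmetic and the quantize/dequantize round trip described above can be checked with a short sketch. It uses symmetric per-tensor quantization with a single scale, which is only one of the options the text mentions; the function names and the example weights are assumptions, and real quantization schemes add per-group scales and other metadata that this sketch ignores.

```python
def weight_memory_gb(num_params, bits):
    """Weight storage in GB (10**9 bytes), ignoring quantization metadata."""
    return num_params * bits / 8 / 1e9


def quantize_symmetric(weights, bits=4):
    """Symmetric per-tensor quantization to signed integers in [-qmax, qmax]."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point weights for computation."""
    return [v * scale for v in q]


fp16_gb = weight_memory_gb(7e9, 16)  # 14.0
int8_gb = weight_memory_gb(7e9, 8)   # 7.0
int4_gb = weight_memory_gb(7e9, 4)   # 3.5

weights = [0.7, -0.35, 0.1, 0.0]
q, scale = quantize_symmetric(weights, bits=4)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the precision given up in exchange for the smaller footprint.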
[0045] Step S110-2: Load a low-rank adapter into the quantized local language model to construct a parameter-efficient fine-tuning structure.
[0046] In this step, the low-rank adapter (LoRA) is a parameter-efficient fine-tuning technique. Its core idea is to inject trainable low-rank decomposition matrices into specific layers of the model (such as the projection matrices of the attention layers) while freezing the weights of the original pre-trained model. The loading process keeps the quantized model weights unchanged (frozen) and inserts LoRA modules into the specified layers, forming a complete structure that can be fine-tuned. Quantization and LoRA are compatible: the quantized weights are stored at low bit width, while the LoRA parameters are saved and updated at higher precision (such as FP16), balancing memory optimization with fine-tuning flexibility.
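The structure described above, a frozen weight matrix plus a trainable low-rank update, can be illustrated in plain Python. This is a toy sketch rather than the application's implementation: the class name, the rank, the `alpha` scaling value, and the 4x4 identity weight are assumptions, while the zero initialization of one factor and the `alpha / rank` scaling follow the usual LoRA formulation.

```python
import random


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, w, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(w), len(w[0])
        self.w = w  # frozen: never updated during fine-tuning
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(d_out)]  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        base = matvec(self.w, x)
        update = matvec(self.b, matvec(self.a, x))
        return [y + self.scaling * u for y, u in zip(base, update)]

    def trainable_params(self):
        """Number of adapter parameters (A and B only)."""
        return len(self.a) * len(self.a[0]) + len(self.b) * len(self.b[0])


w = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
layer = LoRALinear(w, rank=1)
x = [1.0, 2.0, 3.0, 4.0]
y = layer.forward(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer's output exactly, and with rank 1 only 8 adapter parameters are trainable against 16 in the full 4x4 matrix; the saving grows with layer size.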
[0047] Furthermore, the step S150 above, "fine-tuning the quantized local language model using the training dataset", specifically includes: updating the parameters of the low-rank adapter using the training dataset.
[0048] In this embodiment, a parameter-efficient fine-tuning strategy is employed for fine-tuning training. Specifically, during training, the original weights of the local language model remain frozen and are not updated; only the low-rank adapter parameters are updated. After fine-tuning, the learned low-rank adapter parameters can be merged with the quantized local language model weights, or the low-rank adapter parameters can be saved separately for dynamic loading.
[0049] The technical solution adopted in this embodiment significantly reduces the memory usage through quantization technology and achieves efficient parameter fine-tuning by combining LoRA. It completes the closed loop from data generation to model fine-tuning under limited hardware resources, which is conducive to rapid iteration.
[0050] In an optional embodiment, step S120 above, "calling the cloud-based large language model to generate multiple questions related to physics experiments based on physics experiment background knowledge," may include the following steps S120-1 to S120-2:
Step S120-1: Maintain a list of generated questions.
[0051] In this step, the list records the question text generated each time the cloud model is called, accumulating continuously as the generation process continues. For example, if the first call generates 5 questions, 5 new records will be added to the list; if the second call generates another 5 questions, the list will expand to 10 records, and so on. This list provides "historical memory" for subsequent generation, avoiding the generation of a large number of duplicate or similar questions from multiple calls, thus ensuring dataset diversity.
[0052] In practical implementations, the list can store the original text of the question or its vector representation (Embedding) for semantic-level duplicate detection. This application does not impose any limitations on this.
[0053] Step S120-2: Each time the cloud-based large language model is invoked, the list of generated questions is used as context input to guide the cloud-based large language model to generate new questions that do not overlap with existing questions in the list of generated questions.
[0054] In this step, when a request is sent to the cloud-based large language model, the list of generated questions and the background knowledge of the physics experiment are both provided as context, and the prompt instructs the model to generate questions that do not overlap with the list. Based on semantic understanding, the cloud-based large language model analyzes what the list already covers and generates new questions in other dimensions (such as error analysis or equipment operation).
[0055] For example, if the list already includes questions about "experimental principles" and "steady-state judgment," the cloud-based large language model might then generate questions about other dimensions such as "error analysis," "equipment operation," "data processing," or "phenomenon interpretation." This mechanism ensures that the question set comprehensively covers all aspects of experimental knowledge.
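One way to carry the generated-question list into each request is sketched below. The prompt wording extends the template quoted earlier in this description, but the exact phrasing, the helper names, and the extra local check on normalized text are assumptions; the application leaves open whether the list stores raw text or embeddings.

```python
def build_question_prompt(background, generated, n_new=5):
    """Compose a prompt that carries the generated-question list as context."""
    history = "\n".join(f"- {q}" for q in generated) or "(none yet)"
    return (
        f"Based on the following experimental principles, please provide {n_new} questions "
        "about the experimental steps, understanding of the principles, or error analysis.\n"
        "Do not repeat or paraphrase any question already listed.\n\n"
        f"Experimental principles:\n{background}\n\n"
        f"Questions already generated:\n{history}\n"
    )


def normalize(q):
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    return " ".join(q.lower().split()).rstrip("?.!")


def add_new_questions(generated, candidates):
    """Append only candidates whose normalized text is not already in the list."""
    seen = {normalize(q) for q in generated}
    added = []
    for q in candidates:
        key = normalize(q)
        if key not in seen:
            seen.add(key)
            generated.append(q)
            added.append(q)
    return added


generated = ["How is steady state judged?"]
prompt = build_question_prompt("background text", generated)
added = add_new_questions(
    generated,
    ["how is steady state judged", "What are the main error sources?"],
)
```

The text-level check only catches near-verbatim repeats; semantic duplicates would need the embedding-based comparison that the description mentions as an alternative.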
[0056] The technical solution in this embodiment, by maintaining a list of generated questions and using it as context input, dynamically guides the cloud-based model generation process, effectively avoiding the repeated generation of questions and improving the diversity and coverage of the dataset. Diverse training data helps the fine-tuned model learn richer knowledge associations, preventing the model from overfitting to specific question patterns due to limited training data, thereby improving the model's generalization ability and robustness.
[0057] In an optional embodiment, step S140 above, "the cloud-based large language model performs logical evaluation and correction on the original answer based on the background knowledge of the physical experiment," may include the following steps S140-1 to S140-2:
Step S140-1: Based on the background knowledge of the physics experiment, the cloud-based large language model determines whether there are logical flaws in the original answer's description of the experimental principles, experimental steps, and experimental phenomena.
[0058] In this step, after receiving the original answer generated by the local language model, the cloud-based large language model first performs a logical evaluation task. This evaluation process uses the original physics experimental background knowledge as a benchmark and basis. Specifically, the cloud-based large language model performs multi-dimensional analysis on the content of the original answer to determine whether it contains logical flaws.
[0059] Logical flaws may manifest as misunderstandings of principles, reversed order of steps, inaccurate descriptions of phenomena, or confused causal logic. The cloud-based large language model, using its built-in scientific knowledge and the provided experimental documentation as benchmarks, can identify logical deviations in the original answers. For example, regarding the question "How do you determine if a system has reached steady state?", if the local answer only mentions "the heating plate temperature no longer changes," while the documentation explicitly requires checking whether the temperature change of the heat sink is less than 0.1℃, then this is considered a logical flaw.
[0060] Step S140-2: If there is a logical flaw, the cloud-based large language model corrects the original answer based on the background knowledge of the physical experiment so that it conforms to the inherent logic of the physical experiment.
[0061] Specifically, if logical flaws are identified in step S140-1, the cloud-based large language model initiates a correction process. This correction process is also strictly based on the background knowledge of the physics experiment to ensure the accuracy and rigor of the corrected content. The correction operations include content supplementation, error correction, expression optimization, and structural reorganization, aiming to fix flaws accurately while preserving the original framework and language style of the local answer. For example, the above answer could be corrected to: "When the system reaches steady state, the temperatures of both the heating plate and the heat sink remain stable, which can be determined by checking whether the temperature change of the heat sink within five minutes is less than 0.1℃." The corrected answer is both accurate and natural.
[0062] It should be understood that the correction process does not simply replace the local answer with the original document text. Instead, it preserves the original framework and language style of the local answer while fixing only the flawed parts. This reflects the polishing capability of the cloud-based large language model.
[0063] The technical solution of this embodiment, through the logical evaluation of the original answer by the cloud-based large language model, can identify various logical flaws and factual errors caused by the degradation of the local language model, ensuring that only verified content is included in the training dataset. The correction process of the cloud-based large language model essentially injects its built-in logical reasoning capabilities and the standard knowledge provided by the document into the original response of the local language model, so that the expression of the local language model is accurately aligned with the standard physical knowledge system.
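Steps S140-1 and S140-2 can be illustrated with the following sketch. It assumes, purely for illustration, that the cloud model is asked to reply in JSON with the hypothetical fields `has_flaw` and `corrected_answer`, that `ask_cloud` is a stand-in for the cloud client, and that the instruction fine-tuning format is the common `instruction` / `input` / `output` record layout; the application does not specify any of these.

```python
import json
from typing import Callable, Dict


def build_review_prompt(background: str, question: str, original_answer: str) -> str:
    """Ask for a flaw verdict and, if needed, a minimally edited correction."""
    return (
        "You are reviewing an answer about a physics experiment.\n"
        f"Reference document:\n{background}\n\n"
        f"Question: {question}\n"
        f"Original answer: {original_answer}\n\n"
        "Check the principles, steps, and phenomena described in the answer "
        "against the reference document. Reply with JSON of the form "
        '{"has_flaw": true or false, "corrected_answer": "..."}. '
        "When correcting, keep the wording and structure of the original answer "
        "and change only the flawed parts."
    )


def review_answer(
    background: str,
    question: str,
    original_answer: str,
    ask_cloud: Callable[[str], str],  # hypothetical cloud client
) -> Dict[str, str]:
    """Return a standard question-answer pair, corrected only when flawed."""
    reply = ask_cloud(build_review_prompt(background, question, original_answer))
    verdict = json.loads(reply)  # assumed reply format; real replies may need cleanup
    if verdict.get("has_flaw") and verdict.get("corrected_answer"):
        final_answer = verdict["corrected_answer"]
    else:
        final_answer = original_answer  # no flaw found: keep the local wording
    return {"instruction": question, "input": "", "output": final_answer}
```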
[0064] In an optional embodiment, step S140 above, "generating a personalized standard question-and-answer pair based on the local language model's performance and conforming to the instruction fine-tuning format," may include the following steps S140-3 to S140-4:
Step S140-3: The cloud-based large language model scores multiple different original answers generated for the same question; wherein the scoring is based on the accuracy, completeness, and conformity of the answers to the experimental principles.
[0065] In this step, multiple original answers are generated for the same question by calling the local language model multiple times (using a temperature parameter to control randomness). These answers are then sent to the cloud-based large language model, which performs quantitative scoring (e.g., on a 100-point scale) based on three dimensions: accuracy, completeness, and conformity with the experimental principles.
[0066] Accuracy refers to assessing whether the physical concepts, formulas, and data involved in the answer are correct. For example, in the thermal conductivity measurement experiment, it involves judging whether the answer correctly states the formula for calculating thermal conductivity and whether it correctly describes the criteria for judging steady state.
[0067] Completeness refers to assessing whether the answer comprehensively covers all aspects of the question and whether any key information is omitted. For example, for the question "How to measure the cooling rate of a heat sink", a complete answer should include key steps such as "recording temperature-time data", "plotting the cooling curve", and "calculating the slope". Missing any step will affect the completeness score.
[0068] Conformity with the experimental principles refers to assessing whether the answer strictly follows the standard methods and logic described in the physics experiment principle document. This dimension is used to ensure that the answer is not only accurate and complete, but also consistent with the specific experimental teaching requirements, avoiding situations where the principle is correct but the answer does not conform to the specific experimental method.
[0069] For example, regarding the question "What is the basic principle of measuring thermal conductivity using the steady-state method?", the local language model generated three answers: Answer A, concise but omitting the heat sink parameter; Answer B, correct formula but vague description of steady-state conditions; and Answer C, complete formula, clear steps, and accurate steady-state judgment. The cloud model scores the three answers, with possible results: Answer A scores 75 points (high accuracy, insufficient completeness), Answer B scores 80 points (decent completeness, slightly lower consistency with the principle), and Answer C scores 95 points (excellent performance in all three dimensions).
[0070] Step S140-4: Select the original answers whose scores meet the preset threshold, take them as the preferred answers, and combine them with the question to form the standard question-answer pair.
[0071] In this step, after the scores of all answers are obtained, the answers are filtered against a preset threshold. The threshold can be an absolute score, such as "90 points or above"; or it can be a relative standard, such as "selecting the answers whose scores rank in the top 10%" or "selecting the top-scoring answer".
[0072] Only original answers that meet the preset threshold are retained and used as preferred answers in the subsequent construction of the training dataset. These preferred answers represent the best answers that the local language model produced for the question in the current generation round. Subsequently, the original question is combined with the selected preferred answers and formatted according to the instruction fine-tuning format to generate personalized standard question-answer pairs based on the performance of the local language model.
[0073] It should be noted that, in this embodiment, the cloud-based large language model does not directly modify the answers; instead, it acts as a reviewer, using a scoring mechanism to select the best answers from multiple candidates. Although the selected answers still originate from the local language model, the selection process raises their overall quality relative to the unfiltered candidates.
[0074] The technical solution employed in this embodiment uses a cloud-based large language model to score the responses and select high-quality question-answer pairs. The multi-dimensional scoring mechanism ensures that the selected best answers meet high standards in terms of accuracy, completeness, and conformity to experimental principles, thereby guaranteeing the overall quality of the training dataset. This quantitative evaluation method is more objective and reproducible than subjective judgment.
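The threshold filtering of step S140-4 can be sketched as below, reusing the illustrative scores of Answers A, B, and C from the example above. The function names and the `instruction` / `input` / `output` record layout are assumptions for illustration; the scores themselves would come from the cloud-based large language model.

```python
from typing import Dict, List, Optional


def select_preferred(
    scored: Dict[str, float],
    min_score: Optional[float] = None,  # absolute threshold, e.g. 90 points
    top_k: Optional[int] = None,        # relative threshold, e.g. best answer only
) -> List[str]:
    """Return the answers that pass the absolute and/or relative threshold."""
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    if min_score is not None:
        ranked = [(answer, score) for answer, score in ranked if score >= min_score]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [answer for answer, _ in ranked]


def to_records(question: str, preferred: List[str]) -> List[Dict[str, str]]:
    """Pair the question with each preferred answer in an assumed record layout."""
    return [{"instruction": question, "input": "", "output": a} for a in preferred]


scores = {"Answer A": 75, "Answer B": 80, "Answer C": 95}
best = select_preferred(scores, min_score=90)
records = to_records("What is the basic principle of the steady-state method?", best)
```

With a threshold of 90 points only Answer C is kept; passing `top_k=1` instead selects the top-scoring answer regardless of its absolute score.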
[0075] In an optional embodiment, before step S120 "inputting the question into the quantized local language model", the following steps A1 to A3 are further included:
Step A1: Slice the background knowledge of the physical experiment and store it in a vector database.
[0076] In this step, before generating answers using a local language model, the background knowledge of the physics experiment is preprocessed to construct a searchable knowledge base. This preprocessing includes two key steps: slicing and vectorized storage. The experimental principle document is segmented into several semantically complete text fragments (knowledge slices), converted into vector representations using an embedding model, stored in a vector database, and indexed. In this way, the original static document is transformed into a dynamically searchable structured knowledge base, laying the foundation for subsequent accurate knowledge retrieval.
[0077] Step A2: Based on the question, retrieve relevant knowledge slices from the vector database.
[0078] In this step, after a question has been generated and before it is input into the local language model in step S120, the question is vectorized. The question text is converted into a question vector using the same embedding model as in step A1.
[0079] Then, using the question vector as the query condition, a similarity search is performed in the vector database. The database calculates the similarity between the question vector and all knowledge slice vectors (usually using cosine similarity or Euclidean distance), and returns the top K knowledge slices with the highest similarity (K can be set according to actual needs, such as 3-5). The retrieved knowledge slices are the document fragments that are semantically closest to the question and are very likely to contain the key information needed to answer it. For example, if the question is "How to determine whether the system has reached steady state?", the retrieved knowledge slices are likely to contain descriptive paragraphs about the criteria for determining steady state.
[0080] In this way, the retrieval method achieves precise knowledge location, avoids the waste of resources and information redundancy caused by inputting the entire document into the model, and ensures that the local model can obtain the most relevant and accurate references.
[0081] Step A3: The relevant knowledge slices are used as part of the input context and input together with the question into the quantized local language model to help it generate the original answer.
[0082] In this step, after the relevant knowledge slices are obtained, they are combined with the question to construct an enhanced input prompt, which is then fed into the quantized local language model. Upon receiving this enhanced input, the local language model can generate an answer using two information sources simultaneously: first, the general knowledge stored in the model's own parameters; and second, the precise reference materials provided in the input context. Because the reference materials are highly relevant to the question and originate from the original experimental documents, the accuracy and reliability of the original answer generated by the model are improved.
[0083] By employing the technical solution of this embodiment, the local language model is provided with knowledge slices that are precisely related to the question as a reference, enabling it to answer questions in an "open-book" state. The generated original answers are significantly improved in terms of accuracy and completeness, reducing the probability of logical flaws and factual errors.
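Steps A1 to A3 can be sketched in a self-contained way as follows. The `embed` function here is a toy bag-of-words stand-in, chosen only so the example runs without external dependencies; the embodiment would use a real embedding model and a vector database, neither of which is named in the text. The in-memory list and the prompt wording are likewise assumptions.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    for mark in "?.,":
        text = text.replace(mark, " ")
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_slices(question: str, slices: List[str], k: int = 3) -> List[str]:
    """Step A2: return the k knowledge slices most similar to the question."""
    query = embed(question)
    scored = [(cosine(query, embed(s)), s) for s in slices]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


def build_augmented_prompt(question: str, slices: List[str], k: int = 3) -> str:
    """Step A3: prepend the retrieved slices to the question as context."""
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(top_k_slices(question, slices, k)))
    return f"Reference material:\n{context}\n\nQuestion: {question}\nAnswer:"


# Step A1 (simplified): the sliced document is held in a plain list.
knowledge_slices = [
    "The system has reached steady state when the heat sink temperature changes by less than 0.1 degrees.",
    "Plot the cooling curve and calculate the slope.",
    "The thermocouple is connected to the digital display.",
]
prompt = build_augmented_prompt(
    "How to determine whether the system has reached steady state?", knowledge_slices, k=1
)
```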
[0084] In one optional embodiment, after completing a round of data generation tasks, the method further includes: releasing the GPU memory resources occupied during the generation task to prevent memory fragmentation caused by long-term operation.
[0085] Specifically, during the data generation process, the local language model needs to repeatedly load input data, calculate intermediate activation values, and generate outputs. These temporary tensors continuously occupy GPU memory. As the number of generation rounds increases, a large amount of fragmentation gradually accumulates in GPU memory. Even if a single tensor has been released, the GPU memory space may still be insufficient to meet the allocation needs of subsequent large tensors due to fragmentation, ultimately leading to a "CUDA out of memory" error and interrupting the generation process.
[0086] To avoid the aforementioned problems, this embodiment performs a memory reclamation operation after each complete generation task (e.g., after generating a batch of 1000 question-answer pairs). Specifically, this can be achieved through a combination of methods: deleting temporary variables using `del`, clearing the cache by calling `torch.cuda.empty_cache()`, and manually triggering `gc.collect()`.
[0087] By performing the above operations, the GPU memory occupied by the task can be released and defragmented at the end of each generation task cycle, providing contiguous GPU memory space for the next generation task and thus ensuring the stable operation of long-term, large-scale data generation tasks. It should be noted that the GPU memory reclamation operation is not performed after the inference for each individual question, but rather at the granularity of a complete generation task cycle. This avoids the performance overhead of frequent reclamation while still preventing the continuous accumulation of GPU memory fragmentation.
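A minimal sketch of batch-granularity reclamation is shown below. The helper names, the `answer_fn` callable, and the batch loop are assumptions for illustration; only `del`, `gc.collect()`, and `torch.cuda.empty_cache()` come from the text. PyTorch is imported optionally so the sketch also runs on a machine without it. The sketch calls `gc.collect()` before `torch.cuda.empty_cache()`, a design choice made so that tensors freed by garbage collection are returned to the allocator before its unused cache is released.

```python
import gc
from typing import Callable, List, Tuple

try:
    import torch  # optional: the sketch degrades gracefully without PyTorch
except ImportError:
    torch = None


def release_gpu_memory() -> None:
    """Reclaim unreachable Python objects, then release cached GPU blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def run_generation(
    questions: List[str],
    answer_fn: Callable[[str], str],  # hypothetical wrapper around the local model
    batch_size: int = 1000,
) -> Tuple[List[Tuple[str, str]], int]:
    """Answer questions batch by batch, reclaiming memory once per batch."""
    results: List[Tuple[str, str]] = []
    cleanups = 0
    for start in range(0, len(questions), batch_size):
        batch = questions[start:start + batch_size]
        outputs = [answer_fn(q) for q in batch]
        results.extend(zip(batch, outputs))
        del outputs            # drop the temporary reference explicitly
        release_gpu_memory()   # once per batch, not once per question
        cleanups += 1
    return results, cleanups
```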
[0088] In one alternative embodiment, the background knowledge of the physical experiment further includes real-time data acquired from the physical experiment equipment; the real-time data is read directly from digital sensors connected to the physical experiment equipment via a serial communication interface.
[0089] In this embodiment, real-time data is used as part of the background knowledge to generate questions related to real experimental phenomena (such as "Based on the temperature data, has the system reached steady state?") or to correct local answers (such as supplementing judgment criteria based on actual data). Introducing real-time data enables the trained model to better cope with real experimental scenarios, achieving "virtual-real integration".
[0090] In this way, by combining real experimental data with theoretical documents, the generated question-and-answer pairs contain both theoretical principles and practical data analysis, and the trained model can better cope with problems in real experimental scenarios.
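As an illustration of how such real-time data might be turned into background knowledge, the sketch below parses sensor lines and applies the steady-state criterion quoted earlier (heat sink temperature change within five minutes below 0.1℃). The line format `elapsed_seconds,heating_plate_C,heat_sink_C` is a hypothetical assumption; the application does not specify the serial protocol, and reading from the serial port itself is omitted so that the example stays self-contained.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, float, float]  # (elapsed seconds, heating plate, heat sink)


def parse_readings(lines: Iterable[str]) -> List[Reading]:
    """Parse 'seconds,heating_plate_C,heat_sink_C' lines, skipping malformed ones."""
    readings: List[Reading] = []
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue
        try:
            elapsed, hot, sink = (float(p) for p in parts)
        except ValueError:
            continue
        readings.append((elapsed, hot, sink))
    return readings


def is_steady(readings: List[Reading], window_s: float = 300.0, tol_c: float = 0.1) -> bool:
    """True if the heat sink varied by less than tol_c over the last window_s seconds."""
    if not readings:
        return False
    t_end = readings[-1][0]
    if t_end - readings[0][0] < window_s:
        return False  # not enough history to cover the full window
    recent = [sink for elapsed, _, sink in readings if elapsed >= t_end - window_s]
    return max(recent) - min(recent) < tol_c


def describe(readings: List[Reading]) -> str:
    """Summarize the readings as a sentence that can be added to the context."""
    state = "has" if is_steady(readings) else "has not"
    return f"According to the latest sensor data, the system {state} reached steady state."
```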
[0091] It is understood that the technical solution of this application can be applied to a variety of different physical experiment scenarios. The following example uses the application of the technical solution of this application to the steady-state method for measuring thermal conductivity to explain in detail the specific contents contained in the experimental principle document.
[0092] In one optional embodiment, the physical experiment is a steady-state method for measuring thermal conductivity, and the content of the physical experiment principle document includes at least one of the following: physical formula for measuring thermal conductivity, method for judging steady-state conditions, steps for measuring the cooling rate of the heat sink, and operating procedures for digital display thermocouples.
[0093] The steady-state method for measuring thermal conductivity is a classic fundamental physics experiment in thermodynamics, based on Fourier's law of heat conduction. The thermal conductivity can be calculated using the following formula:
[0094] where the quantities in the formula are the diameter of the sample, the thickness of the sample, the temperature difference between the upper and lower surfaces of the sample, the heat dissipation rate of the heat sink in steady state, the radius of the heat sink, and the thickness of the heat sink.
[0095] The experimental principle document may include the physical formula for thermal conductivity, methods for determining steady-state conditions, steps for measuring the cooling rate of the heat sink, and operating procedures for digital thermocouples. This experiment involves complex concepts and precise operations, making it highly suitable as a demonstrative application scenario for this application. By specifying the document's content, question-and-answer pairs covering multiple dimensions such as theory, operation, and equipment can be generated, providing a reference for the structuring of knowledge in other physics experiments.
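For orientation, the relation between the quantities listed above can be written from Fourier's law for a disc-shaped sample. The symbols below are assumptions introduced for illustration ($d$ and $h$ for the sample diameter and thickness, $\Delta T$ for the temperature difference across the sample, $\mathrm{d}Q/\mathrm{d}t$ for the steady-state heat dissipation rate of the heat sink, $R_p$ and $h_p$ for the heat sink radius and thickness), and the second line is the estimate commonly given in textbook treatments, where $m$ and $c$ (mass and specific heat capacity of the heat sink) and the cooling rate at the steady-state heat sink temperature $T_2$ are additional assumed quantities. The form used in the experimental principle document may differ.

```latex
% Fourier's law across a disc of cross-section \pi d^2 / 4 and thickness h
\frac{\mathrm{d}Q}{\mathrm{d}t} = \lambda \,\frac{\pi d^{2}}{4}\,\frac{\Delta T}{h}
\quad\Longrightarrow\quad
\lambda = \frac{4h}{\pi d^{2}\,\Delta T}\,\frac{\mathrm{d}Q}{\mathrm{d}t}

% Common textbook estimate of the steady-state dissipation rate from the
% free-cooling rate, corrected for the surface covered by the sample
\frac{\mathrm{d}Q}{\mathrm{d}t}
  = m c \left.\frac{\mathrm{d}T}{\mathrm{d}t}\right|_{T=T_{2}}
    \frac{R_p + 2h_p}{2R_p + 2h_p}
```

The correction factor is the ratio of the heat sink surface exposed in steady state (bottom face and side, $\pi R_p^{2} + 2\pi R_p h_p$) to the surface exposed during free cooling (both faces and side, $2\pi R_p^{2} + 2\pi R_p h_p$).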
[0096] Figure 2 is an overall architecture diagram of a local physics experiment question-answering model training method provided in this application embodiment. As shown in Figure 2, firstly, a pre-trained local language model is loaded, and the local language model is optimized for GPU memory to obtain a quantized local language model. Simultaneously, a cloud-based large language model is invoked. The cloud-based large language model can parse the physics experiment principle document to obtain physics experiment background knowledge, thereby generating multiple questions related to the physics experiment based on the physics experiment background knowledge. Next, the questions are input into the quantized local language model, and the model infers and generates original answers to the questions. The question-answer pair containing the question and the original answer is sent to the cloud-based large language model. The cloud-based large language model, based on the physics experiment background knowledge, judges whether there are logical flaws in the original answer's description of the physics experiment principles, experimental steps, and experimental phenomena. If logical flaws exist, the cloud-based large language model corrects the original answer based on the physics experiment background knowledge to conform to the inherent logic of the physics experiment, ultimately generating standard question-answer pairs conforming to the instruction fine-tuning format. By collecting and storing multiple standard question-answer pairs, a training dataset is formed, and the quantized local language model is fine-tuned using the training dataset to obtain a local physics experiment question-answering model.
[0097] Thus, this method achieves automated generation of fine-tuned datasets for vertical domain physics experiments and automated training of local physics experiment question-answering models by constructing an edge-cloud collaborative closed-loop mechanism of "cloud-based question generation - local question answering - cloud-based polishing - local training".
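The data-generation part of the closed loop ("cloud-based question generation - local question answering - cloud-based polishing") can be sketched as a small orchestration function. All three callables are hypothetical stand-ins for the cloud-based and local models, and the JSON Lines output with `instruction` / `input` / `output` fields is an assumed instruction fine-tuning layout. The final "local training" stage is omitted here because it depends on the chosen fine-tuning framework.

```python
import json
from typing import Callable, Dict, List


def build_dataset(
    background: str,
    generate_questions: Callable[[str], List[str]],  # cloud: background -> questions
    local_answer: Callable[[str], str],              # local model: question -> answer
    cloud_polish: Callable[[str, str, str], str],    # cloud: (background, q, a) -> answer
) -> List[Dict[str, str]]:
    """Run one pass of question generation, local answering, and cloud polishing."""
    dataset: List[Dict[str, str]] = []
    for question in generate_questions(background):
        original = local_answer(question)
        polished = cloud_polish(background, question, original)
        dataset.append({"instruction": question, "input": "", "output": polished})
    return dataset


def save_jsonl(records: List[Dict[str, str]], path: str) -> None:
    """Store the standard question-answer pairs, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Keeping the three model roles behind plain callables lets the same loop be exercised with stubs before any real model is attached.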
[0098] This application also provides a local physics experiment question-answering model training system for executing the local physics experiment question-answering model training method described in the above embodiments. Figure 3 is a schematic diagram of a local physics experiment question-answering model training system provided in an embodiment of this application. As shown in Figure 3, the system includes:
Local computing unit 310: contains at least one consumer-grade GPU for loading and running a local language model optimized for video memory, and performing model inference to generate the original answer;
Experimental knowledge base 320: stores documents on the principles of physical experiments;
Cloud interface module 330: used to communicate with the cloud-based large language model, send question generation requests and question-answer pair correction requests, and receive the returned results;
Data generation and control module 340: connected to the local computing unit, the experimental knowledge base, and the cloud interface module respectively, and used to control the following processes:
Step 1: Read the physical experiment principle document from the experimental knowledge base, and drive the cloud-based large language model to generate multiple questions through the cloud interface module;
Step 2: Send the question to the local computing unit to obtain the generated original answer;
Step 3: Send the question-answer pair containing the question and the original answer to the cloud-based large language model through the cloud interface module for logical correction;
Step 4: Format and store the corrected answers and corresponding questions to form a training dataset;
Step 5: Fine-tune the local language model using the training dataset to obtain a local physical experiment question-answering model.
[0099] In this embodiment, the data generation and control module 340 is the control core of the entire system, connecting and coordinating the work of the other three modules to achieve full-process automation. The connection between the data generation and control module 340 and the experimental knowledge base 320 is used to read the content of physical experiment principle documents and obtain knowledge sources. The connection between the data generation and control module 340 and the cloud interface module 330 involves sending requests for question generation and question-answer pair correction, and receiving results returned by the cloud model. The connection between the data generation and control module 340 and the local computing unit 310 involves sending questions to the local language model for reasoning, and receiving the generated original answers; during the fine-tuning phase, it is also responsible for transmitting training data to the local computing unit for model updates.
[0100] It is understood that the local physical experiment question-answering model training system in this application embodiment can implement the local physical experiment question-answering model training method in the above embodiment. The local physical experiment question-answering model training system and the above local physical experiment question-answering model training method have the same advantages over the prior art, and will not be repeated here.
[0101] This application also provides an electronic device. Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. As shown in Figure 4, the electronic device 400 includes a memory 410 and a processor 420. The memory 410 and the processor 420 are connected via a bus for communication. The memory 410 stores a computer program that can run on the processor 420 to implement the steps of the local physical experiment question-and-answer model training method described in the embodiments of this application.
[0102] This application also provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the local physical experiment question-answering model training method described in this application.
[0103] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the local physical experiment question-answering model training method described in this application.
[0104] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.
[0105] This application describes embodiments of methods and systems according to embodiments of this application with reference to flowchart illustrations and/or block diagrams. It should be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0106] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device, which implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0107] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, causing a series of operational steps to be performed on the computer or other programmable terminal equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0108] Although preferred embodiments of the present application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present application.
[0109] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.
[0110] The above provides a detailed description of the local physical experiment question-answering model training method and system provided in this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A method for training a local physical experiment question-and-answer model, characterized in that it comprises:
loading a pre-trained local language model and performing GPU memory optimization on the local language model to obtain a quantized local language model;
invoking a cloud-based large language model to generate multiple questions related to physics experiments based on background knowledge of physics experiments, wherein the background knowledge of physics experiments is obtained by parsing documents on the principles of physics experiments;
inputting the question into the quantized local language model, and generating the original answer to the question through model inference;
sending the question-and-answer pair containing the question and the original answer to the cloud-based large language model, wherein the cloud-based large language model performs logical evaluation and correction on the original answer based on the background knowledge of the physics experiment, and generates a personalized standard question-and-answer pair based on the performance of the local language model and conforming to the instruction fine-tuning format;
collecting and storing multiple standard question-answer pairs to form a training dataset, and using the training dataset to fine-tune the quantized local language model to obtain a local physical experiment question-answering model.
2. The method according to claim 1, characterized in that the GPU memory optimization of the local language model includes: quantizing the weights of the local language model using a 4-bit or 8-bit method to convert the original floating-point weights into low-bit integers; and loading a low-rank adapter into the quantized local language model to construct a parameter-efficient fine-tuning structure. The fine-tuning of the quantized local language model using the training dataset includes updating the parameters of the low-rank adapter using the training dataset.
3. The method according to claim 1, characterized in that invoking a cloud-based large language model to generate multiple questions related to physics experiments based on background knowledge of physics experiments includes: maintaining a list of generated questions; and, each time the cloud-based large language model is invoked, using the list of generated questions as context input to guide the cloud-based large language model in generating new questions that do not overlap with existing questions in the list of generated questions.
4. The method according to claim 1, characterized in that the cloud-based large language model performing logical evaluation and correction on the original answer based on the background knowledge of the physics experiment includes: based on the background knowledge of the physical experiment, the cloud-based large language model judges whether there are logical flaws in the original answer's description of the physical experiment's principles, steps, and phenomena; and, if there are logical flaws, the cloud-based large language model corrects the original answer based on the background knowledge of the physics experiment to conform to the inherent logic of the physics experiment.
5. The method according to claim 1, characterized in that generating personalized standard question-answer pairs based on the performance of the local language model and conforming to the instruction fine-tuning format includes: the cloud-based large language model scores multiple different original answers generated for the same question, wherein the scoring is based on the accuracy, completeness, and conformity with the experimental principles of the answer; and the original answers whose scores meet the preset threshold are selected as the preferred answers and combined with the question to form the standard question-answer pair.
6. The method according to any one of claims 1-5, characterized in that, before inputting the question into the quantized local language model, the method further includes: slicing the background knowledge of the physical experiment and storing it in a vector database; retrieving relevant knowledge slices from the vector database based on the question; and using the relevant knowledge slices as part of the input context and inputting them along with the question into the quantized local language model to help it generate the original answer.
7. The method according to any one of claims 1-5, characterized in that, After completing one round of data generation tasks, the following is also included: Release GPU memory resources used during this generation task to prevent memory fragmentation from accumulating due to prolonged operation.
8. The method according to any one of claims 1-5, characterized in that the background knowledge of the physical experiment further includes real-time data acquired from the physical experiment equipment, the real-time data being read directly, via a serial communication interface, from digital sensors connected to the physical experiment equipment.
9. The method according to any one of claims 1-5, characterized in that the physical experiment is the measurement of thermal conductivity by the steady-state method, and the physical experiment principle document includes at least one of the following: the physical formula for measuring thermal conductivity, the method for judging steady-state conditions, the steps for measuring the cooling rate of the heat sink, and the operating procedures for digital-display thermocouples.
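For orientation, the physical formula referred to in claim 9 is commonly written in the textbook form below. This form is an assumption: the claims do not state the formula, and the symbols are introduced here for illustration only. Here $\lambda$ is the thermal conductivity of the sample, $m$ and $c$ are the mass and specific heat capacity of the heat sink, $R_P$ and $h_P$ are its radius and thickness, $R_B$ and $h_B$ are the radius and thickness of the sample, $T_1$ and $T_2$ are the steady-state temperatures of the sample's hot and cold faces, and the derivative is the magnitude of the heat sink's cooling rate evaluated at $T_2$.

```latex
\lambda \;=\; m\,c \,\left|\frac{dT}{dt}\right|_{T=T_2}
\cdot \frac{R_P + 2h_P}{2R_P + 2h_P}
\cdot \frac{h_B}{T_1 - T_2}
\cdot \frac{1}{\pi R_B^{2}}
```

The factor $(R_P + 2h_P)/(2R_P + 2h_P)$ corrects for the difference in exposed surface area: the heat sink is fully exposed during the cooling-rate measurement, whereas its top face is covered by the sample in the steady state.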
10. A local physical experiment question-answering model training system, characterized in that the system is used to implement the method for training a local physical experiment question-answering model according to any one of claims 1-8, and comprises: a local computing unit, containing at least one consumer-grade GPU, for loading and running a local language model optimized for GPU memory usage and performing model inference to generate the original answer; an experimental knowledge base, for storing physical experiment principle documents; a cloud interface module, for communicating with the cloud-based large language model, sending question generation requests and question-answer pair correction requests, and receiving the returned results; and a data generation and control module, connected to the local computing unit, the experimental knowledge base, and the cloud interface module respectively, for controlling the following process: reading the physical experiment principle document from the experimental knowledge base, and driving the cloud-based large language model through the cloud interface module to generate multiple questions; sending each question to the local computing unit to obtain the generated original answer; sending the question-answer pair containing the question and the original answer to the cloud-based large language model through the cloud interface module for logical correction; formatting and storing the corrected answers and the corresponding questions to form a training dataset; and fine-tuning the local language model using the training dataset to obtain the local physical experiment question-answering model.
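The control flow of the data generation and control module in claim 10 can be sketched as a single loop. This is a sketch under stated assumptions: the three callables stand in for the cloud interface module and the local computing unit, their names and signatures are hypothetical, the record layout is an assumed instruction fine-tuning format, and the final fine-tuning step is left out because the claims do not specify a training framework.

```python
from typing import Callable, Dict, List


def generate_training_dataset(
    principle_doc: str,
    generate_questions: Callable[[str], List[str]],    # cloud: document -> questions
    answer_locally: Callable[[str], str],              # local: question -> original answer
    correct_answer: Callable[[str, str, str], str],    # cloud: (doc, question, answer) -> corrected
) -> List[Dict[str, str]]:
    """Run one round of cloud question generation, local answering, and cloud correction."""
    dataset: List[Dict[str, str]] = []
    for question in generate_questions(principle_doc):
        # The quantized local model produces the original answer.
        original = answer_locally(question)
        # The cloud model checks the answer against the document and corrects it.
        corrected = correct_answer(principle_doc, question, original)
        # Store the pair in an assumed instruction fine-tuning layout.
        dataset.append({"instruction": question, "input": "", "output": corrected})
    return dataset
```

Passing the cloud and local calls in as parameters keeps the loop independent of any particular API client or inference runtime, so each can be replaced by a stub when the loop is exercised offline.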