A knowledge updating and relearning system and method based on an insurance contract interpretation large model
By using a contrastive learning method guided by an insurance knowledge graph, the invention addresses two problems that large models face when interpreting insurance contracts: distinguishing old knowledge from new, and managing multiple knowledge versions. This enables efficient, low-cost knowledge updates with quality monitoring, and improves the logical accuracy and reliability of the model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SOUTHEAST UNIV
- Filing Date
- 2026-03-10
- Publication Date
- 2026-06-19
AI Technical Summary
Existing general-purpose models struggle to distinguish between old and new knowledge logic in insurance contract interpretation scenarios, lack management of different versions of knowledge, and lack closed-loop verification and cost control in the update process, leading to logical confusion and high update costs.
A conflict knowledge retrieval and localization module based on an insurance knowledge graph is combined with a contrastive decoupling and incremental update module and a closed-loop verification and feedback optimization module. Parameters are fine-tuned through contrastive learning and low-rank adaptation (LoRA), so that new and old knowledge are accurately differentiated and multiple knowledge versions can be managed. Quality monitoring and adaptive optimization are carried out with a multi-dimensional validation set.
It achieves accurate differentiation between new and old knowledge, ensures that the model maintains conceptual distance in the semantic space, reduces update costs, improves logical accuracy and system reliability, and meets the complex management needs of insurance business.
Figure CN122242677A_ABST
Description
Technical Field
[0001] This invention relates to artificial intelligence and natural language processing technologies in the insurance industry, specifically to a knowledge update and relearning system and method based on a large insurance contract interpretation model.
Background Technology
[0002] With the development of fintech, large-scale insurance contract interpretation models have become core tools for improving underwriting efficiency and optimizing claims processes. Knowledge updating and relearning are key technologies to ensure these models adapt to iterative insurance terms and changes in regulatory rules. However, existing general-purpose large-scale model fine-tuning or incremental learning methods face the following three main technical challenges when applied to rigorous insurance contract interpretation scenarios:
[0003] Difficulty in distinguishing between old and new knowledge leads to confused reasoning: Existing incremental fine-tuning methods focus on making the model memorize new data, but lack a precise mechanism for identifying logical conflicts between old and new knowledge. When insurance terms change (for example, the underwriting conclusion changes from "accept with additional premium" to "rejection"), the model tends to confuse old and new concepts in the parameter space. Its answers then contain logical contradictions or factual hallucinations, and it cannot accurately distinguish "memorized old terms" from "memorized new terms".
[0004] Lack of refined management of the coexistence of different knowledge versions: Insurance business has a unique time attribute, and policies from different periods may be subject to different versions of terms. Existing fine-tuning or model editing technologies are usually overwrite updates, that is, "erasing" old knowledge with new knowledge. This makes it impossible for the model to retain the ability to consult old regulations while mastering new regulations, making it difficult to meet the needs of managing the coexistence of multiple versions of knowledge in the insurance scenario where "new policies use new regulations, and old policies use old regulations".
[0005] Relearning methods lack closed-loop validation and cost control: Existing large-model update processes are typically "black box" operations, lacking real-time quantitative evaluation of update effects and adaptive feedback mechanisms. In the insurance sector, erroneous knowledge updates can lead to serious compliance risks. Furthermore, full-parameter retraining is costly, while fine-tuning without validation struggles to guarantee quality, so neither achieves low-cost, highly reliable, and traceable full-lifecycle management of knowledge iteration.
Summary of the Invention
[0006] Purpose of the invention: The purpose of this invention is to address the shortcomings of existing technologies, such as the confusion between old and new knowledge logic, lack of management of different knowledge versions, and high update and verification costs, when large language models face dynamic iterations of knowledge in professional fields such as insurance. The invention provides a knowledge update and relearning method based on a large model for interpreting insurance contracts, which can solve the problem of updating related clauses and related knowledge in insurance contract updates.
[0007] Technical solution:
[0008] Firstly, this invention provides a knowledge update and relearning system based on a large insurance contract interpretation model, including a conflict knowledge retrieval and localization module, a contrastive decoupling and incremental update module, and a closed-loop verification and feedback optimization module. The conflict knowledge retrieval and localization module addresses the problem of distinguishing old knowledge from new. It has a built-in insurance knowledge graph; when it receives newly added or changed knowledge, it queries the graph, identifies existing knowledge that logically conflicts with the new knowledge, and generates conflict knowledge pairs containing both new and old knowledge according to the conflict logic, giving the model a factual basis for distinguishing different versions of knowledge. The contrastive decoupling and incremental update module addresses the coexistence and management of different knowledge versions. Using a base large language model as its carrier, it constructs a contrastive learning task from the conflict knowledge pairs output by the conflict knowledge retrieval and localization module, uses a contrastive loss function to decouple new and old knowledge in the semantic space, and updates the model parameters through parameter-efficient fine-tuning, achieving stable reconstruction of new knowledge and accurate retention of old knowledge. The closed-loop verification and feedback optimization module addresses quality monitoring and cost control in the relearning process. It constructs a dedicated validation set containing multi-dimensional test subsets, quantitatively evaluates the updated model, and dynamically adjusts the key hyperparameters of the contrastive decoupling and incremental update module based on the evaluation results, achieving adaptive closed-loop optimization.
[0009] Secondly, this invention provides a knowledge update and relearning method based on a large-scale insurance contract interpretation model, including the following steps:
[0010] Step S1: Memory differentiation based on conflict retrieval
[0011] The conflict knowledge retrieval and localization module executes a logical conflict detection algorithm. Taking a new or changed piece of knowledge as input, it queries a preset insurance knowledge graph, locates existing knowledge that has a logical conflict with the new knowledge, and generates conflict knowledge pairs containing the new and old knowledge based on the insurance clause knowledge conflict pair generation algorithm.
[0012] Step S2: Knowledge management and updating based on contrastive learning
[0013] The contrastive decoupling and incremental update module executes a parameter-efficient incremental update algorithm. Based on the conflict knowledge pairs generated in step S1, it constructs a training batch containing positive and negative samples and fine-tunes the model parameters by minimizing an overall objective function that includes the contrastive loss.
[0014] Step S3: Method verification and optimization based on closed-loop feedback
[0015] The closed-loop verification and feedback optimization module executes a multi-dimensional performance evaluation algorithm. Using a dedicated validation set as input, it calculates the new knowledge accuracy and the old knowledge retention rate to measure the performance of the updated insurance contract interpretation model. It then executes an adaptive tuning algorithm that compares the measured performance with preset thresholds and generates feedback tuning instructions for adjusting the key hyperparameters in step S2.
[0016] Furthermore, the specific steps of "memory differentiation based on conflict retrieval" in step S1 are as follows:
[0017] S1.1 New knowledge parsing and entity linking: When the system receives a newly added or modified knowledge text, i.e., new knowledge $K_{new}$, the conflict knowledge retrieval and localization module first performs natural language processing on $K_{new}$ to extract its core subject entity $e_s$, relation/attribute $r$, and object/attribute value $o_{new}$, and identifies the logical polarity of the text (such as negation, affirmation, or conditional change). Entity linking is then used to link the entity $e_s$ in the text to its unique corresponding node in the insurance knowledge graph $G$.
[0018] S1.2 Knowledge graph conflict detection: In the insurance knowledge graph $G$, the conflict knowledge retrieval and localization module starts from the linked entity $e_s$ and queries the existing object node $o_{old}$ connected to it through relation $r$. The module then applies the conflict detection logic: if $o_{old} \neq o_{new}$, the old knowledge $K_{old}$ represented by the existing triple $(e_s, r, o_{old})$ is determined to be in direct logical conflict with the new knowledge $K_{new}$; if $o_{old} = o_{new}$, the module further compares $K_{old}$ and $K_{new}$, and returns a conflict detection failure if and only if they are completely identical.
[0019] S1.3 Conflict knowledge pair generation: After locating a conflict, the conflict knowledge retrieval and localization module combines the new knowledge $K_{new}$ and its corresponding conflicting old knowledge $K_{old}$ into an initial conflict knowledge pair $(K_{new}, K_{old})$, which is passed as output to the contrastive decoupling and incremental update module in step S2.
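The retrieval logic of steps S1.1 to S1.3 can be sketched as a lookup on a toy knowledge graph. This is a minimal illustration under stated assumptions, not the filing's algorithm: the graph is a plain dictionary keyed by (subject, relation), the function name `detect_conflict` and the sample clause texts are hypothetical, entity extraction and linking are assumed to have already produced the subject, relation, and object strings, and returning `None` when the graph holds nothing for the entity is an added assumption.

```python
from typing import Dict, Optional, Tuple

# Toy graph: (subject entity, relation) -> (object value, source clause text).
# All entries are illustrative assumptions.
KnowledgeGraph = Dict[Tuple[str, str], Tuple[str, str]]


def detect_conflict(
    graph: KnowledgeGraph,
    subject: str,
    relation: str,
    new_object: str,
    new_text: str,
) -> Optional[Tuple[str, str]]:
    """Return the pair (new_text, old_text) if a conflict is located, else None."""
    existing = graph.get((subject, relation))
    if existing is None:
        # Assumption: nothing stored for this entity and relation, so no conflict.
        return None
    old_object, old_text = existing
    if old_object != new_object:
        # Different object values for the same (subject, relation): direct conflict.
        return (new_text, old_text)
    # Same object value: no conflict only if the texts are completely identical.
    if old_text == new_text:
        return None
    return (new_text, old_text)


toy_graph: KnowledgeGraph = {
    ("grade 3 hypertension", "underwriting conclusion"): (
        "accept with additional premium",
        "Applicants with grade 3 hypertension may be insured with an additional premium.",
    )
}
pair = detect_conflict(
    toy_graph,
    "grade 3 hypertension",
    "underwriting conclusion",
    "rejection",
    "For applicants with grade 3 hypertension, the underwriting conclusion is 'rejection'.",
)
print(pair)
```

The returned tuple plays the role of the initial conflict knowledge pair handed to step S2; a production system would replace the dictionary with queries against a real graph store.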
[0020] Furthermore, the specific steps of "knowledge management and updating based on contrastive learning" in step S2 are as follows:
[0021] S2.1 Knowledge graph-based training sample expansion: After receiving the initial conflict knowledge pair $(K_{new}, K_{old})$, the contrastive decoupling and incremental update module executes a sample expansion strategy to construct a training batch $\mathcal{B}$ containing $N$ training samples. The sample expansion strategy includes: (1) Core conflict sample: one core training triplet is constructed from the initial conflict knowledge pair; (2) Same-category conflict sample expansion: in the knowledge graph $G$, other entities that follow the same logical pattern as the initial conflict are found (e.g., all other diseases downgraded in the same batch), $N_{sim}$ same-category conflict knowledge pairs are generated for them automatically, and $N_{sim}$ training triplets are constructed from these pairs; (3) Relational sample expansion: based on the topological structure of the knowledge graph $G$, additional positive samples (such as hypernyms and sibling concepts) and negative samples (such as easily confused neighboring concepts) are found for the new knowledge $K_{new}$ to construct $N_{rel}$ training triplets with richer contextual relationships; (4) Memory replay samples: $N_{rep}$ knowledge items irrelevant to the current update are randomly selected from a pre-stored base of important historical knowledge and used to construct training samples designed to reinforce memory. The final training batch is $\mathcal{B} = \{(a_i, p_i, n_i)\}_{i=1}^{N}$, where $N = 1 + N_{sim} + N_{rel} + N_{rep}$.
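The batch assembly in S2.1 can be sketched as follows. The `Triplet` dataclass, the `build_batch` function, and every sample string are hypothetical names introduced for illustration. The text does not say how the positive sample is chosen for each anchor, so the sketch accepts ready-made triplets from each strategy and only shows how the four sources are merged and how replay items are drawn at random.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Triplet:
    anchor: str
    positive: str
    negative: str
    source: str  # which expansion strategy produced this triplet


def build_batch(
    core: Triplet,
    same_category: Sequence[Triplet],
    relational: Sequence[Triplet],
    replay_pool: Sequence[Triplet],
    n_replay: int,
    seed: int = 0,
) -> List[Triplet]:
    """Merge the four sample sources into one training batch."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    count = min(n_replay, len(replay_pool))
    replay = rng.sample(list(replay_pool), count)
    return [core, *same_category, *relational, *replay]


core = Triplet("new rule A", "paraphrase of A", "old rule A", "core")
same = [Triplet(f"new rule {c}", f"paraphrase of {c}", f"old rule {c}", "same") for c in "BC"]
rel = [Triplet("new rule A", "parent concept of A", "neighbor concept of A", "relational")]
pool = [Triplet(f"fact {i}", f"paraphrase of fact {i}", f"distractor {i}", "replay") for i in range(5)]

batch = build_batch(core, same, rel, pool, n_replay=2)
print(len(batch))
```

With one core triplet, two same-category triplets, one relational triplet, and two replay items, the batch holds six samples, matching the idea that the batch size is the sum of the contributions of the four strategies.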
[0022] S2.2 Definition of the optimization objective: the incremental update minimizes the following objective function:
[0023] $\min \mathcal{L}_{total}(\mathcal{B})$
[0024] For a training batch $\mathcal{B}$, the total loss function $\mathcal{L}_{total}$ is a weighted combination of the task loss $\mathcal{L}_{task}$ and the contrastive loss $\mathcal{L}_{contrast}$, calculated using the following formula:
[0025] $\mathcal{L}_{total} = \mathcal{L}_{task} + \lambda \cdot \mathcal{L}_{contrast}$
[0026] where $\lambda$ is a hyperparameter used to balance the two losses.
[0027] The task loss $\mathcal{L}_{task}$ is the standard cross-entropy loss, calculated over the entire batch using the following formula: $\mathcal{L}_{task} = -\frac{1}{N}\sum_{i=1}^{N} \log P(y_i \mid a_i)$,
[0028] where $y_i$ is the ground-truth label corresponding to anchor $a_i$, and $P(y_i \mid a_i)$ is the probability predicted by the model.
[0029] The contrastive loss $\mathcal{L}_{contrast}$ is the triplet margin loss, calculated over the entire batch using the following formula: $\mathcal{L}_{contrast} = \frac{1}{N}\sum_{i=1}^{N} \max\big(0,\; D(f(a_i), f(p_i)) - D(f(a_i), f(n_i)) + m\big)$,
[0030] where $f(\cdot)$ is the encoder part of the base large language model, which maps a sample in the input text space to an $h$-dimensional semantic representation vector; $D(\cdot,\cdot)$ is the distance metric, here the cosine distance, defined as $D(u, v) = 1 - \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$, where $u$ and $v$ are two semantic vectors; and $m$ is a preset positive margin, a hyperparameter that enforces a gap between the anchor-negative distance and the anchor-positive distance.
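The loss terms of S2.2 can be checked numerically with a small standard-library sketch. It assumes the embeddings have already been produced by the encoder and are passed in as plain lists of floats; the function names are hypothetical, and combining the two terms as task loss plus a single weight times the contrastive loss is an assumption about the exact form of the weighting.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    """Cosine distance: 1 minus cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_margin_loss(
    anchors: Sequence[Vector],
    positives: Sequence[Vector],
    negatives: Sequence[Vector],
    margin: float,
) -> float:
    """Batch mean of max(0, D(a, p) - D(a, n) + margin)."""
    losses = [
        max(0.0, cosine_distance(a, p) - cosine_distance(a, n) + margin)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)


def cross_entropy(true_label_probs: Sequence[float]) -> float:
    """Batch mean of the negative log-probability assigned to the true label."""
    return -sum(math.log(p) for p in true_label_probs) / len(true_label_probs)


def total_loss(task: float, contrast: float, lam: float) -> float:
    """Assumed weighting: task loss plus lam times contrastive loss."""
    return task + lam * contrast


# Sample 1 is already well separated; sample 2 has positive and negative swapped.
anchors = [[1.0, 0.0], [1.0, 0.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[0.0, 1.0], [1.0, 0.0]]
contrast = triplet_margin_loss(anchors, positives, negatives, margin=0.5)
print(contrast)
```

The first triplet contributes zero because the negative is already farther from the anchor than the positive by more than the margin, while the second contributes 1.5, so only badly placed representations generate gradient signal.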
[0031] S2.3 Parameter-efficient fine-tuning: The contrastive decoupling and incremental update module uses the low-rank adaptation (LoRA) technique to perform gradient-descent optimization on the total loss function $\mathcal{L}_{total}$. The core idea of this technique is to keep the original weight parameters of the pre-trained model unchanged during fine-tuning and to learn incremental changes in knowledge by injecting trainable low-rank decomposition matrices.
[0032] Specifically, for a pre-trained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ in the base large language model, its update during fine-tuning can be written as $W = W_0 + \Delta W$, where $\Delta W$ denotes the learned weight change. LoRA assumes that $\Delta W$ has a low-rank structure and decomposes it into the product of two smaller, low-rank matrices: $\Delta W = BA$, where matrix $B \in \mathbb{R}^{d \times r}$, matrix $A \in \mathbb{R}^{r \times k}$, and the rank $r$ is a hyperparameter much smaller than $\min(d, k)$. In LoRA fine-tuning, the original weight matrix $W_0$ is first completely frozen, so its parameters do not participate in any gradient update; the system then initializes and trains the two low-rank "adapter" matrices $A$ and $B$, which reduces the number of trainable parameters from $d \times k$ to $r \times (d + k)$. Finally, during forward propagation, the output vector $h$ for an input vector $x$ is calculated as $h = W_0 x + \Delta W x = W_0 x + BAx$. This approach enables efficient, low-cost updates with minimal impact on the model's existing knowledge.
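The LoRA forward pass and its parameter saving can be illustrated with plain Python lists. This is a sketch of the arithmetic only: the helper names are hypothetical, no gradient computation is shown, and the 4096 by 4096 layer size with rank 8 is an illustrative assumption rather than a figure from the filing.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]


def lora_forward(w0: Matrix, b: Matrix, a: Matrix, x: Vector) -> Vector:
    """Frozen base path plus low-rank adapter path: W0 x + B (A x)."""
    base = matvec(w0, x)
    low_rank = matvec(b, matvec(a, x))  # A x is computed first, so cost scales with r
    return [u + v for u, v in zip(base, low_rank)]


def full_params(d: int, k: int) -> int:
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    return r * (d + k)


w0 = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight (d = 2, k = 2)
b = [[1.0], [2.0]]             # 2x1 adapter matrix B (r = 1)
a = [[1.0, 1.0]]               # 1x2 adapter matrix A
h = lora_forward(w0, b, a, [3.0, 4.0])
print(h)
print(full_params(4096, 4096), lora_params(4096, 4096, 8))
```

For the assumed 4096 by 4096 layer, the adapter trains 65,536 parameters instead of 16,777,216, a reduction by a factor of 256, which is where the low update cost comes from.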
[0033] Furthermore, the specific steps of "method verification and optimization based on closed-loop feedback" in step S3 are as follows:
[0034] S3.1 Construction and preparation of the dedicated validation set: To evaluate the relearning effect comprehensively and accurately, this invention uses a purpose-built dedicated validation set $V$. Taking the "update of underwriting rules for grade 3 hypertension" as an example, the validation set is constructed before the relearning task begins and consists of the following four test subsets: (1) New knowledge validation subset $V_{new}$: contains questions such as "I have grade 3 hypertension, can I buy this critical illness insurance?", with the expected answer "According to the new regulations, no"; (2) Old knowledge conflict validation subset $V_{conflict}$: contains the same questions as $V_{new}$, but is used to detect whether the model incorrectly outputs an outdated answer such as "Yes, but an additional premium may be required"; (3) Unrelated old knowledge validation subset $V_{unrelated}$: contains questions unrelated to the topic of hypertension, such as "What are the underwriting rules for diabetes?" or "What is the waiting period for 'Ping An Fu 2019 Edition'?"; (4) Version-specific validation subset $V_{version}$: contains questions such as "How is grade 3 hypertension handled for policies signed in 2022 that are subject to the old regulations?", with the expected answer "Insurance may be offered with an additional premium." The validation set $V$ is the union of these subsets, that is, $V = V_{new} \cup V_{conflict} \cup V_{unrelated} \cup V_{version}$.
[0035] S3.2 Multi-dimensional performance evaluation: The closed-loop verification and feedback optimization module uses the dedicated validation set $V$ to evaluate the performance of the updated model $M'$. It calculates the following two core metrics. (1) New Knowledge Accuracy (NKA), which measures the model's mastery of new knowledge, is calculated as follows:
[0036] $NKA = \frac{1}{|V_{new}|}\sum_{(x_i, y_i) \in V_{new}} \mathbb{I}\big(M'(x_i) = y_i\big)$, where $|V_{new}|$ is the total number of samples in the new knowledge validation subset; $(x_i, y_i)$ is a test sample, in which $x_i$ is the input text and $y_i$ is its ground-truth label; $M'(x_i)$ is the model's predicted output for input $x_i$; and $\mathbb{I}(\cdot)$ is the indicator function, whose value is 1 when the condition in parentheses is true and 0 otherwise.
[0037] (2) Old Knowledge Retention (OKR), which measures the model's resistance to catastrophic forgetting, is calculated as follows: $OKR = \frac{Acc_{M'}(V_{unrelated})}{Acc_{M}(V_{unrelated})}$, where $Acc_{M'}(V_{unrelated})$ is the accuracy of the updated model on the unrelated old knowledge validation subset $V_{unrelated}$, and $Acc_{M}(V_{unrelated})$ is the baseline accuracy of the model on the same subset, calculated and stored before the relearning task is performed. The closer the OKR value is to 1, the better the model retains old knowledge.
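The two metrics can be computed with a few lines of Python. In this sketch a model is any callable from question text to answer text, correctness is judged by exact string match (an assumption, since the text does not say how predictions are compared with labels), and the question and answer strings are placeholders.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[str, str]  # (input text, expected answer)
Model = Callable[[str], str]


def accuracy(model: Model, samples: Sequence[Sample]) -> float:
    """Fraction of samples whose prediction exactly matches the label."""
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)


def new_knowledge_accuracy(updated: Model, v_new: Sequence[Sample]) -> float:
    return accuracy(updated, v_new)


def old_knowledge_retention(
    updated: Model, baseline_accuracy: float, v_unrelated: Sequence[Sample]
) -> float:
    """Updated-model accuracy on unrelated old knowledge, relative to the stored baseline."""
    return accuracy(updated, v_unrelated) / baseline_accuracy


v_new = [("q1", "no"), ("q2", "no")]
v_unrelated = [("q3", "a"), ("q4", "b")]

# Stand-in for an updated model: a lookup table with a default answer.
answers = {"q1": "no", "q2": "yes", "q3": "a", "q4": "wrong"}
updated_model: Model = lambda x: answers.get(x, "")

nka = new_knowledge_accuracy(updated_model, v_new)
okr = old_knowledge_retention(updated_model, baseline_accuracy=1.0, v_unrelated=v_unrelated)
print(nka, okr)
```

Storing the baseline accuracy before relearning, as the text describes, means the retention ratio can be recomputed after every training round without re-running the original model.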
[0038] S3.3 Performance threshold judgment and feedback tuning: The closed-loop verification and feedback optimization module compares the calculated metrics $NKA$ and $OKR$ with the system's preset performance thresholds $\tau_{NKA}$ and $\tau_{OKR}$. If $NKA \geq \tau_{NKA}$ and $OKR \geq \tau_{OKR}$, the relearning is successful and the process ends. If at least one metric fails to reach its threshold, the module executes an adaptive tuning algorithm: the system automatically increases the weight hyperparameter $\lambda$ of the total loss function in step S2.2 to strengthen the protection of old knowledge in the next round of training. After receiving the instruction, the contrastive decoupling and incremental update module re-executes step S2 with the adjusted hyperparameter, forming an adaptive closed-loop optimization process.
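The closed loop of S3.3 can be sketched as a retry loop around a train-and-evaluate callback. The function name `closed_loop`, the fixed additive step for the weight, and the `max_rounds` cap are all assumptions added so the sketch terminates; the text only states that the weight is increased and step S2 is repeated. The stand-in `fake_train_and_eval` replaces real training with a hard-coded rule.

```python
from typing import Callable, Tuple


def closed_loop(
    train_and_eval: Callable[[float], Tuple[float, float]],
    lam: float,
    nka_threshold: float,
    okr_threshold: float,
    step: float,
    max_rounds: int,
) -> Tuple[float, bool, int]:
    """Repeat training with a larger loss weight until both metrics pass.

    Returns (final weight, success flag, rounds used).
    """
    for round_index in range(1, max_rounds + 1):
        nka, okr = train_and_eval(lam)
        if nka >= nka_threshold and okr >= okr_threshold:
            return lam, True, round_index
        lam += step  # feedback instruction: strengthen the weighted loss term
    return lam, False, max_rounds


def fake_train_and_eval(lam: float) -> Tuple[float, float]:
    # Hypothetical stand-in: retention only passes once the weight reaches 1.5.
    return 0.95, (1.0 if lam >= 1.5 else 0.8)


final_lam, ok, rounds = closed_loop(
    fake_train_and_eval, lam=0.5, nka_threshold=0.9, okr_threshold=0.95,
    step=0.5, max_rounds=5,
)
print(final_lam, ok, rounds)
```

Logging each round's weight and metric values inside the loop would give the audit trail that the closed-loop design is meant to support.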
[0039] Beneficial effects: Compared with the prior art, the present invention has the following advantages:
[0040] (1) This invention solves the problem of confusion between old and new knowledge and achieves accurate logical distinction. Through the conflict knowledge retrieval and localization module, it exploits the structured nature of the insurance knowledge graph to automatically discover the specific logical conflicts that arise when clauses are revised. This gives the large model explicit conflict "targets", avoiding the blind overwriting of traditional fine-tuning, and teaches the model to distinguish "old clause memory" from "new clause memory", which reduces model hallucination and improves the logical accuracy of its answers.
[0041] (2) Effective coexistence and refined management of multiple versions of knowledge are achieved. This invention creatively applies a contrastive learning paradigm in the contrast decoupling and incremental update modules. Guided by the knowledge graph, strong negative samples are constructed to force the model to distance the old and new concepts in the semantic space. This method not only effectively suppresses catastrophic forgetting, but also enables the model to retain the ability to understand different versions of clauses, meeting the complex management needs of "new policies with new rules and old policies with old rules" in insurance business.
[0042] (3) This invention designs a closed-loop verification and feedback optimization module that changes the "black box" nature of previous large-model update processes. Through a dedicated multi-dimensional validation set and an adaptive feedback mechanism, the system can quantitatively monitor update quality in real time and automatically adjust parameters when performance falls short. This mechanism significantly reduces the cost of trial and error and records the key evaluation process, providing credible evidence for later audit and traceability and improving the reliability and security of the system.
Description of the Drawings
[0043] Figure 1 is a system functional module architecture diagram of the present invention.
[0044] Figure 2 is a schematic diagram illustrating the contrastive learning principle of the present invention.
[0045] Figure 3 is a flowchart of the relearning method of the present invention.
Detailed Implementation
[0046] The technical solutions of the present invention will be described in detail below with reference to the accompanying drawings of the embodiments of the present invention. However, the scope of protection of the present invention is not limited to the embodiments described.
[0047] As shown in Figure 1, this invention provides a knowledge update and relearning system based on a large insurance contract interpretation model. Its logical architecture mainly consists of three core modules: a conflict knowledge retrieval and localization module, a contrastive decoupling and incremental update module, and a closed-loop verification and feedback optimization module. The conflict knowledge retrieval and localization module is the entry point for distinguishing knowledge memories. Internally, it integrates (or connects to) an insurance knowledge graph G, a structured knowledge base that stores insurance terms, product information, and their logical relationships. The main function of this module is as follows: when new knowledge is received, it uses graph G as the factual and logical basis for identifying knowledge conflicts, locates existing conflicting knowledge through queries, and generates conflict knowledge pairs. The contrastive decoupling and incremental update module is the core computing unit for knowledge management and reconstruction. Its carrier is a base large language model, a pre-trained network with general understanding capabilities. This module receives the conflict pairs output by the retrieval module, constructs a contrastive learning task, and updates the model parameters using parameter-efficient fine-tuning techniques (such as LoRA), thereby injecting new knowledge and decoupling it from old knowledge. The closed-loop verification and feedback optimization module is a monitoring unit that ensures the quality of relearning. 
After the update, it is responsible for quantitatively evaluating the model performance using a pre-built dedicated validation set (calculating NKA and OKR metrics), and adjusting the hyperparameters of the aforementioned incremental update module based on the evaluation results, forming an adaptive optimization closed loop.
[0048] The knowledge update and relearning method based on the large insurance contract interpretation model in this embodiment, taking the update of the underwriting rules for "grade 3 hypertension" as an example, specifically includes the following steps:
[0049] Step 1 (Memory Differentiation Based on Conflict Retrieval):
[0050] 1.1 New knowledge parsing and entity linking: When the system receives a newly added or modified knowledge text $K_{new}$, for example, "For applicants diagnosed with grade 3 hypertension, the underwriting conclusion is adjusted to 'rejection'", the conflict knowledge retrieval and localization module first performs natural language processing on $K_{new}$ to extract the core entity $e_s$ ("grade 3 hypertension"), the relation/attribute $r$ ("underwriting conclusion"), and the new object/attribute value $o_{new}$ ("rejection"), and links them to the corresponding nodes and relation types in the insurance knowledge graph $G$.
[0051] 1.2 Knowledge graph conflict detection: In the knowledge graph $G$, the conflict knowledge retrieval and localization module starts from the linked entity node $e_s$ and queries the existing object/attribute value $o_{old}$ ("insurance available with additional premium") connected to it through relation $r$ ("underwriting conclusion"). When it finds that $o_{old} \neq o_{new}$, a logical conflict is detected, and the system locates the triple containing the old knowledge, $(e_s, r, o_{old})$, and its corresponding text $K_{old}$ as the conflicting knowledge. If it finds that $o_{old} = o_{new}$, it further compares $K_{new}$ and $K_{old}$, and returns a conflict detection failure if and only if they are completely identical.
[0052] 1.3 Conflict knowledge pair generation: The conflict knowledge retrieval and localization module combines the new and old knowledge texts into an initial conflict knowledge pair $(K_{new}, K_{old})$ and uses it as the output.
[0053] Step 2 (Knowledge Management and Updating Based on Contrastive Learning):
[0054] 2.1 Knowledge graph-based training sample expansion: After receiving the initial conflict knowledge pair $(K_{new}, K_{old})$, the contrastive decoupling and incremental update module executes a sample expansion strategy to construct a training batch $\mathcal{B}$ containing $N$ training samples. The sample expansion strategy includes: (1) Core conflict sample: one core training triplet is constructed from the initial conflict knowledge pair; (2) Same-category conflict sample expansion: in the knowledge graph $G$, other entities that follow the same logical pattern as the initial conflict are found, $N_{sim}$ same-category conflict knowledge pairs are generated for them automatically, and $N_{sim}$ training triplets are constructed from these pairs; (3) Relational sample expansion: based on the topological structure of the knowledge graph $G$, additional positive and negative samples are found for the new knowledge $K_{new}$ to construct $N_{rel}$ training triplets with richer contextual relationships; (4) Memory replay samples: $N_{rep}$ knowledge items irrelevant to the current update are randomly selected from a pre-stored base of important historical knowledge and used to construct training samples designed to reinforce memory. The final training batch is $\mathcal{B} = \{(a_i, p_i, n_i)\}_{i=1}^{N}$, where $N = 1 + N_{sim} + N_{rel} + N_{rep}$.
[0055] 2.2 Definition of the optimization objective: The contrastive decoupling and incremental update module fine-tunes the large model with $\min \mathcal{L}_{total}$ as the optimization objective. The total loss function $\mathcal{L}_{total}$ is a weighted combination of the task loss $\mathcal{L}_{task}$ and the contrastive loss $\mathcal{L}_{contrast}$: $\mathcal{L}_{total} = \mathcal{L}_{task} + \lambda \cdot \mathcal{L}_{contrast}$,
[0056] where $\lambda$ is a hyperparameter used to balance the two losses.
[0057] The task loss $\mathcal{L}_{task}$ is the standard cross-entropy loss, calculated over the entire batch using the following formula: $\mathcal{L}_{task} = -\frac{1}{N}\sum_{i=1}^{N} \log P(y_i \mid a_i)$,
[0058] where $y_i$ is the ground-truth label corresponding to anchor $a_i$, and $P(y_i \mid a_i)$ is the probability predicted by the model.
[0059] The contrastive loss $\mathcal{L}_{contrast}$ is the triplet margin loss, calculated over the entire batch using the following formula: $\mathcal{L}_{contrast} = \frac{1}{N}\sum_{i=1}^{N} \max\big(0,\; D(f(a_i), f(p_i)) - D(f(a_i), f(n_i)) + m\big)$,
[0060] where $f(\cdot)$ is the encoder part of the base large language model, which maps a sample in the input text space to an $h$-dimensional semantic representation vector; $D(\cdot,\cdot)$ is the distance metric function, here the cosine distance, defined as $D(u, v) = 1 - \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$, where $u$ and $v$ are two semantic vectors; and $m$ is a preset positive margin, a hyperparameter that enforces a gap between the anchor-negative distance and the anchor-positive distance. As shown in Figure 2, the role of this loss function is to push apart, in the model's semantic representation space, the vector representations of the new knowledge (anchor $a$) and the conflicting old knowledge (negative sample $n$), thereby achieving knowledge decoupling.
[0061] 2.3 Parameter-efficient fine-tuning: The contrastive decoupling and incremental update module uses the low-rank adaptation (LoRA) technique to perform gradient-descent optimization. For a pre-trained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ in the model, LoRA decomposes the weight change $\Delta W$ that occurs during fine-tuning into the product of two smaller matrices: $\Delta W = BA$, where matrix $B \in \mathbb{R}^{d \times r}$, matrix $A \in \mathbb{R}^{r \times k}$, and the rank $r$ is a hyperparameter much smaller than $\min(d, k)$. During forward propagation, the output vector $h$ for an input vector $x$ is calculated as $h = W_0 x + BAx$. This enables efficient, "minimally invasive" updates.
[0062] Step 3 (Verification and Optimization of Closed-Loop Feedback-Based Methods):
[0063] 3.1 Construction and preparation of the dedicated validation set: To evaluate the relearning effect comprehensively and accurately, this invention uses a purpose-built dedicated validation set $V$. Taking the "update of underwriting rules for grade 3 hypertension" as an example, the validation set is constructed before the relearning task begins and consists of the following four test subsets: (1) New knowledge validation subset $V_{new}$: contains questions such as "I have grade 3 hypertension, can I buy this critical illness insurance?", with the expected answer "According to the new regulations, no"; (2) Old knowledge conflict validation subset $V_{conflict}$: contains the same questions as $V_{new}$, but is used to detect whether the model incorrectly outputs an outdated answer such as "Yes, but an additional premium may be required"; (3) Unrelated old knowledge validation subset $V_{unrelated}$: contains questions unrelated to the topic of hypertension, such as "What are the underwriting rules for diabetes?" or "What is the waiting period for 'Ping An Fu 2019 Edition'?"; (4) Version-specific validation subset $V_{version}$: contains questions such as "How is grade 3 hypertension handled for policies signed in 2022 that are subject to the old regulations?", with the expected answer "Insurance may be offered with an additional premium." The validation set $V$ is the union of these subsets, that is, $V = V_{new} \cup V_{conflict} \cup V_{unrelated} \cup V_{version}$.
[0064] 3.2 Multi-dimensional performance evaluation: The closed-loop verification and feedback optimization module uses the dedicated validation set $V$ to evaluate the performance of the updated model $M'$, calculating the New Knowledge Accuracy (NKA) and the Old Knowledge Retention (OKR). The New Knowledge Accuracy is calculated as $NKA = \frac{1}{|V_{new}|}\sum_{(x_i, y_i) \in V_{new}} \mathbb{I}\big(M'(x_i) = y_i\big)$, where $|V_{new}|$ is the total number of samples in the new knowledge validation subset and $\mathbb{I}(\cdot)$ is the indicator function, whose value is 1 when the condition in parentheses is true and 0 otherwise. The Old Knowledge Retention is calculated as $OKR = \frac{Acc_{M'}(V_{unrelated})}{Acc_{M}(V_{unrelated})}$, where $Acc_{M'}(V_{unrelated})$ is the accuracy of the updated model on the unrelated old knowledge validation subset $V_{unrelated}$, and $Acc_{M}(V_{unrelated})$ is the baseline accuracy of the model on the same subset, calculated and stored before the relearning task is performed. The closer the OKR value is to 1, the better the model retains old knowledge.
[0065] 3.3 Adaptive feedback tuning: The closed-loop verification and feedback optimization module compares the evaluation metrics with the preset thresholds. If any metric falls below its threshold, the module automatically generates a feedback instruction to adjust the weight hyperparameter (for example, increasing the value of $\lambda$), and the contrastive decoupling and incremental update module retrains with the adjusted value until all metrics meet their targets, forming a closed-loop optimization process.
[0066] As shown in Figure 3, steps 1, 2, and 3 constitute the core learning process of the method of this invention.
[0067] As can be seen from the above embodiments, by using a knowledge graph-guided comparative relearning method, this invention can ensure the learning accuracy and memory stability of large language models when facing knowledge iteration. This enables the model to effectively decouple conflicting knowledge from new knowledge, and allows for refined differentiation and management of different versions of knowledge. This ensures the model's decision-making accuracy in complex business scenarios and effectively improves the efficiency of knowledge updates while reducing computational costs. Furthermore, this invention designs a parameter-efficient, closed-loop relearning system architecture. The collaborative workflow, consisting of modules such as knowledge retrieval, incremental updates, and verification feedback, provides an automated and monitorable environment for continuous model learning. It not only allows for reliable quality assessment and adaptive optimization of the relearning effect through quantitative performance indicators, but also records key update and verification processes, providing credible evidence for the management and accountability of the model throughout its entire lifecycle.
Claims
1. A knowledge update and relearning system based on a large-scale insurance contract interpretation model, characterized in that it includes a conflict knowledge retrieval and localization module, a comparison decoupling and incremental update module, and a closed-loop verification and feedback optimization module; the conflict knowledge retrieval and localization module is used to solve the problem of distinguishing between old and new knowledge; this module has a built-in insurance knowledge graph and, when it receives newly added or changed knowledge data, queries the insurance knowledge graph, identifies existing knowledge that logically conflicts with the new knowledge, and generates conflict knowledge pairs containing both new and old knowledge based on the conflict logic, providing a factual basis for the model to distinguish between memories of different knowledge versions; the comparison decoupling and incremental update module is used to solve the problem of coexistence and management of different knowledge versions; this module uses a base large language model as its carrier, constructs a contrastive learning task from the conflict knowledge pairs output by the conflict knowledge retrieval and localization module, uses a contrastive loss function to decouple new and old knowledge in the semantic space, and updates the model parameters through parameter-efficient fine-tuning to achieve stable reconstruction of new knowledge and accurate retention of old knowledge; the closed-loop verification and feedback optimization module is used to solve the problems of quality monitoring and cost control in the relearning process; this module constructs a dedicated verification set containing multi-dimensional test subsets, performs quantitative performance evaluation of the updated model, and dynamically adjusts the key hyperparameters in the comparison decoupling and incremental update module based on the evaluation results to achieve adaptive closed-loop optimization.
2. A knowledge update and relearning method based on a large-scale insurance contract interpretation model, applied to the system as described in claim 1, characterized in that it includes the following steps:
Step S1: Memory differentiation based on conflict retrieval. The conflict knowledge retrieval and localization module executes a logical conflict detection algorithm: taking a new or changed piece of knowledge as input, it queries a preset insurance knowledge graph, locates existing knowledge that logically conflicts with the new knowledge, and generates conflict knowledge pairs containing the new and old knowledge using the insurance clause knowledge conflict pair generation algorithm.
Step S2: Knowledge management and updating based on contrastive learning. The comparison decoupling and incremental update module executes a parameter-efficient incremental update algorithm: based on the conflict knowledge pairs generated in step S1, it constructs a training batch containing positive and negative samples and fine-tunes the model parameters by minimizing an overall objective function that includes the contrastive loss.
Step S3: Closed-loop feedback verification and optimization. The closed-loop verification and feedback optimization module executes a multi-dimensional performance evaluation algorithm, using a dedicated verification set as input, to calculate the accuracy of new knowledge and the retention rate of old knowledge and thereby measure the performance of the updated insurance contract interpretation model. It then executes an adaptive tuning algorithm that compares the performance metrics with preset thresholds and generates feedback tuning instructions for adjusting the key hyperparameters in step S2.
3. The knowledge update and relearning method based on the large-scale insurance contract interpretation model according to claim 2, characterized in that the specific steps of memory differentiation based on conflict retrieval in step S1 are as follows:
S1.1 New knowledge parsing and entity linking: When the system receives a newly added or modified knowledge text, i.e., new knowledge $K_{\text{new}}$, the conflict knowledge retrieval and localization module first performs natural language processing on $K_{\text{new}}$ to extract its core entity, its relation or attribute, and its object or attribute value, and identifies its logical polarity; entity linking is then used to match the entity in the text to its unique node in the insurance knowledge graph $\mathcal{G}$;
S1.2 Knowledge graph conflict detection: In the insurance knowledge graph $\mathcal{G}$, the conflict knowledge retrieval and localization module starts from the linked entity and queries the existing relations between it and existing entity nodes, then applies the conflict detection logic to determine whether a conflict exists: if a conflict is found, the existing triple is determined to represent old knowledge $K_{\text{old}}$ that directly conflicts with the new knowledge $K_{\text{new}}$; if the query finds ..., the two are further compared, and a conflict detection failure is returned if and only if they are completely identical;
S1.3 Conflict knowledge pair generation: After locating a conflict, the conflict knowledge retrieval and localization module combines the new knowledge $K_{\text{new}}$ and its corresponding conflicting prior knowledge $K_{\text{old}}$ into an initial conflict knowledge pair $(K_{\text{new}}, K_{\text{old}})$, which is passed as output to the comparison decoupling and incremental update module in step S2.
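For illustration only, the conflict retrieval of step S1 might be sketched as below. The precise conflict conditions are not recoverable from the claim text, so this sketch substitutes a simple assumed rule: an existing fact conflicts with the new one when it shares the same entity and relation but differs in value or polarity, and nothing is returned when the two are completely identical. The `Fact` type, the function `find_conflict_pairs`, and the example underwriting values are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Fact(NamedTuple):
    subject: str    # core entity, assumed already linked to a graph node
    relation: str   # relation or attribute name
    value: str      # object or attribute value
    affirmed: bool  # logical polarity: True = asserted, False = negated


def find_conflict_pairs(graph: List[Fact], new: Fact) -> List[Tuple[Fact, Fact]]:
    """Return (new, old) pairs for every existing fact that conflicts with `new`.

    Assumed rule: same subject and relation, but not completely identical.
    An empty list means no conflict was detected.
    """
    pairs = []
    for old in graph:
        same_slot = old.subject == new.subject and old.relation == new.relation
        if same_slot and old != new:
            pairs.append((new, old))
    return pairs


# Hypothetical graph content and update, for demonstration only.
graph = [
    Fact("level 3 hypertension", "underwriting_decision", "decline", True),
    Fact("level 1 hypertension", "underwriting_decision", "standard acceptance", True),
]
update = Fact("level 3 hypertension", "underwriting_decision",
              "accept with extra premium", True)

for new_fact, old_fact in find_conflict_pairs(graph, update):
    print("conflict:", old_fact.value, "->", new_fact.value)
```

Treating every relation as single-valued is a simplification; a real knowledge graph would likely need per-relation rules to decide which differences count as logical conflicts.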
4. The knowledge update and relearning method based on the large-scale insurance contract interpretation model according to claim 2, characterized in that the detailed process of step S2 is as follows:
S2.1 Knowledge graph-based training sample expansion: The comparison decoupling and incremental update module receives the initial conflict knowledge pair $(K_{\text{new}}, K_{\text{old}})$ and then executes the sample expansion strategy to construct a training batch $\mathcal{B}$ containing $N$ training samples; the sample expansion strategy includes:
(1) Core conflict sample: a core training triplet is constructed from the initial conflict knowledge pair;
(2) Same-category conflict sample expansion: other entities with the same logical pattern as the initial conflict are found in the knowledge graph $\mathcal{G}$, conflict knowledge pairs of the same kind are automatically generated for them, and the corresponding training triplets are constructed;
(3) Relationship sample expansion: based on the topological structure of the knowledge graph $\mathcal{G}$, further positive and negative samples are found for the new knowledge $K_{\text{new}}$ to build training triplets with richer contextual relationships;
(4) Memory replay samples: knowledge points are randomly selected from a pre-stored base of important historical knowledge that is irrelevant to the current update and are used to construct training samples designed to reinforce memory;
finally, the training batch $\mathcal{B}$ is formed as the union of the samples produced by the four strategies above;
S2.2 Definition of the optimization objective function: The incremental update minimizes the following objective. For a training batch $\mathcal{B}$, the total loss $\mathcal{L}_{\text{total}}$ is a weighted combination of a task loss $\mathcal{L}_{\text{task}}$ and a contrastive loss $\mathcal{L}_{\text{con}}$: $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{task}} + \lambda \mathcal{L}_{\text{con}}$, where $\lambda$ is a hyperparameter used to balance the two losses; the task loss $\mathcal{L}_{\text{task}}$ is the standard cross-entropy loss, calculated over the entire batch as $\mathcal{L}_{\text{task}} = -\frac{1}{|\mathcal{B}|} \sum_{i} y_i \log \hat{y}_i$, where $y_i$ is the true label corresponding to anchor $a_i$ and $\hat{y}_i$ is the probability predicted by the model; the contrastive loss $\mathcal{L}_{\text{con}}$ is the triplet loss with margin, calculated over the entire batch as $\mathcal{L}_{\text{con}} = \frac{1}{|\mathcal{B}|} \sum_{(a_i, p_i, n_i) \in \mathcal{B}} \max\left(0,\ D(f(a_i), f(p_i)) - D(f(a_i), f(n_i)) + m\right)$, where $(a_i, p_i, n_i)$ is a training triplet of anchor, positive sample, and negative sample, and $f(\cdot)$ denotes the encoder part of the base large language model, which maps a sample in the text input space to a $d_e$-dimensional semantic representation vector; the distance metric $D(\cdot, \cdot)$ is the cosine distance, defined as $D(u, v) = 1 - \frac{u \cdot v}{\|u\| \, \|v\|}$, where $u$ and $v$ are two semantic vectors; $m$ is a preset positive margin, a hyperparameter that enforces the distance gap between negative sample pairs and positive sample pairs;
S2.3 Parameter-efficient fine-tuning: The comparison decoupling and incremental update module uses the low-rank adaptation (LoRA) technique to perform gradient descent optimization; for a pre-trained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ in the model, LoRA decomposes the weight change $\Delta W$ that occurs during fine-tuning into the product of two smaller matrices, $\Delta W = BA$, where matrix $B \in \mathbb{R}^{d \times r}$, matrix $A \in \mathbb{R}^{r \times k}$, and the rank $r$ is a hyperparameter much smaller than $\min(d, k)$; during forward propagation of the model, for an input vector $x$, the output vector $h$ is calculated as $h = W_0 x + \Delta W x = W_0 x + BAx$.
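As a hedged illustration of steps S2.2 and S2.3, the sketch below implements a cosine-distance triplet loss with margin, a weighted total loss, and a low-rank forward pass in plain Python. It assumes the standard forms of these techniques: the batch loss is averaged, the total loss is the task loss plus a weight times the contrastive loss, and the LoRA output is the frozen projection plus the low-rank update with no extra scaling factor. All function names and numeric values are hypothetical, and the embedding vectors are assumed to be non-zero.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
Triplet = Tuple[Vector, Vector, Vector]  # (anchor, positive, negative) embeddings


def cosine_distance(u: Vector, v: Vector) -> float:
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_margin_loss(triplets: Sequence[Triplet], margin: float) -> float:
    """Mean over the batch of max(0, D(a, p) - D(a, n) + margin)."""
    losses = [
        max(0.0, cosine_distance(a, p) - cosine_distance(a, n) + margin)
        for a, p, n in triplets
    ]
    return sum(losses) / len(losses)


def total_loss(task_loss: float, contrast_loss: float, weight: float) -> float:
    """Weighted combination of the task loss and the contrastive loss."""
    return task_loss + weight * contrast_loss


def matvec(matrix: Matrix, x: Vector) -> List[float]:
    """Matrix-vector product for row-major nested sequences."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def lora_forward(w0: Matrix, b: Matrix, a: Matrix, x: Vector) -> List[float]:
    """Frozen projection W0 x plus the low-rank update B (A x)."""
    base = matvec(w0, x)
    update = matvec(b, matvec(a, x))
    return [h0 + dh for h0, dh in zip(base, update)]


# Toy embeddings: the first triplet is already separated, the second is not.
batch = [
    ([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]),
]
contrast = triplet_margin_loss(batch, margin=0.5)
print("contrastive loss:", contrast)
print("total loss:", total_loss(task_loss=2.0, contrast_loss=contrast, weight=2.0))

w0 = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 pre-trained weight
b = [[1.0], [0.0]]             # 2x1 low-rank factor
a = [[1.0, 1.0]]               # 1x2 low-rank factor, rank r = 1
print("LoRA output:", lora_forward(w0, b, a, [1.0, 2.0]))
```

Computing `B (A x)` rather than first forming the full product `BA` keeps the extra cost proportional to the small rank, which is the practical reason low-rank updates are cheap to train and store.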
5. The knowledge update and relearning method based on the large-scale insurance contract interpretation model according to claim 2, characterized in that the detailed process of step S3 is as follows:
S3.1 Construction and preparation of the dedicated verification set: A dedicated verification set $\mathcal{V}$ is constructed strategically. For the update of the underwriting rules for level 3 hypertension, the verification set is constructed before the relearning task starts and consists of the following four types of test subsets: (1) a new knowledge verification subset $\mathcal{V}_{\text{new}}$; (2) a conflicting prior knowledge verification subset $\mathcal{V}_{\text{conflict}}$; (3) an irrelevant prior knowledge verification subset $\mathcal{V}_{\text{irr}}$; (4) a version differentiation verification subset $\mathcal{V}_{\text{ver}}$; the verification set is the union of these subsets, that is, $\mathcal{V} = \mathcal{V}_{\text{new}} \cup \mathcal{V}_{\text{conflict}} \cup \mathcal{V}_{\text{irr}} \cup \mathcal{V}_{\text{ver}}$;
S3.2 Multi-dimensional performance evaluation: The closed-loop verification and feedback optimization module uses the aforementioned dedicated verification set to evaluate the performance of the updated model with the following two core metrics:
(1) The New Knowledge Accuracy (NKA) measures the model's mastery of new knowledge and is calculated as $\mathrm{NKA} = \frac{1}{N_{\text{new}}} \sum_{i=1}^{N_{\text{new}}} \mathbb{I}(\hat{y}_i = y_i)$, where $N_{\text{new}}$ is the total number of samples in the new knowledge verification subset; $(x_i, y_i)$ is a test sample, in which $x_i$ is the input text and $y_i$ is its corresponding true label; $\hat{y}_i$ is the model's predicted output for the input $x_i$; and $\mathbb{I}(\cdot)$ is an indicator function whose value is 1 when the condition inside the parentheses is true and 0 otherwise;
(2) The Old Knowledge Retention Rate (OKR) measures the model's ability to resist catastrophic forgetting and is calculated as $\mathrm{OKR} = \mathrm{Acc}_{\text{current}} / \mathrm{Acc}_{\text{baseline}}$, where $\mathrm{Acc}_{\text{current}}$ is the accuracy of the current updated model on the irrelevant prior knowledge verification subset, and $\mathrm{Acc}_{\text{baseline}}$ is the baseline accuracy of the model on the same subset before the relearning task, calculated and stored in advance; the closer the OKR value is to 1, the better the model retains old knowledge;
S3.3 Performance threshold judgment and feedback tuning: The closed-loop verification and feedback optimization module compares the calculated metrics NKA and OKR with the system's preset performance thresholds $\tau_{\text{NKA}}$ and $\tau_{\text{OKR}}$; if $\mathrm{NKA} \geq \tau_{\text{NKA}}$ and $\mathrm{OKR} \geq \tau_{\text{OKR}}$, the relearning is successful and the process ends; if at least one metric fails to reach its threshold, the closed-loop verification and feedback optimization module executes the adaptive optimization algorithm, that is, the system automatically increases the value of the weight hyperparameter $\lambda$ in the total loss function of step S2 to strengthen the protection of old knowledge in the next round of training. After receiving the instruction, the comparison decoupling and incremental update module re-executes step S2 with the adjusted hyperparameter, forming an adaptive closed-loop optimization process.