Method and related device for few-shot entity relation extraction based on active meta-learning
By decoupling the triple extraction task into relation extraction and named entity recognition subtasks through an active meta-learning architecture, and by using location identifiers and adaptive training mechanisms, the invention addresses the problem of achieving high-accuracy, low-cost cross-domain entity relation extraction in professional fields where labeled data is scarce, and enables efficient knowledge graph construction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAZHONG AGRI UNIV
- Filing Date
- 2025-07-10
- Publication Date
- 2026-06-19
AI Technical Summary
In professional fields such as biomedicine and agricultural science, labeled data is scarce and expensive to obtain. As a result, traditional triple extraction methods struggle to achieve high-precision cross-domain entity relation extraction, owing to the scarcity of labeled data, a high entity localization error rate, and insufficient handling of the semantic coupling between entities and relations.
By adopting an active meta-learning architecture, the triple extraction task is decoupled into two sub-tasks: relation extraction and named entity recognition. Cross-domain complexity is reduced by using location identifiers and semantic constraints. The model is optimized by using adaptive training and dynamic feedback mechanisms to form a semantic processing pipeline, thereby achieving high-precision and low-cost cross-domain entity relation extraction.
It significantly reduces the dependence of cross-domain entity relationship extraction on labeled resources, improves the model's generalization ability and accuracy under conditions of few samples, and achieves efficient knowledge graph construction.
Figure CN120873200B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of natural language processing, and in particular to a method and related equipment for few-shot entity relation extraction based on active meta-learning.
Background Technology
[0002] Traditional triple extraction methods rely primarily on supervised learning frameworks, which assume that abundant high-quality labeled data is available for model training. Knowledge extraction is performed either through end-to-end joint learning or through step-by-step named entity recognition and relation extraction. However, in specialized fields such as biomedicine and agricultural science, labeled data is sparse and costly to acquire, and labeling depends heavily on expert knowledge. Traditional methods therefore face a threefold technical dilemma. First, the scarcity of labeled data limits how thoroughly a model can be trained; in cross-domain transfer scenarios in particular, insufficient labeled samples result in entity localization errors exceeding 30%. Second, existing models handle the semantic coupling between entities and relations poorly; when relation types are complex (e.g., "clinical differential diagnosis" versus "pathological stage judgment"), the entity mismatch rate increases significantly. Third, although large language models have zero-shot potential, they are insensitive to entity location and consume enormous computational resources, making it difficult to meet the precise extraction requirements of specialized scenarios. These shortcomings severely hinder the practical application of knowledge graphs in specialized fields. A new entity relation extraction method that achieves high-precision cross-domain transfer with few labeled resources is therefore urgently needed.
Summary of the Invention
[0003] In view of the above problems, the present invention provides a method and related equipment for extracting entity relations based on active meta-learning with few samples. The main purpose is to solve the technical problem of how to achieve high-precision and low-cost cross-domain entity relation extraction through task decoupling and adaptive training mechanisms in professional fields where labeled data is scarce.
[0004] To address at least one of the aforementioned technical problems, in a first aspect, the present invention provides a method for extracting few-shot entity relations based on active meta-learning, the method comprising:
[0005] Design an active meta-learning architecture;
[0006] Relation extraction and named entity recognition are performed based on the aforementioned active meta-learning architecture;
[0007] The results of relation extraction and named entity recognition are combined into a triple.
[0008] Optionally, the design of the active meta-learning architecture includes:
[0009] Obtain the unstructured text dataset to be extracted;
[0010] The triple extraction task of the unstructured text dataset is decoupled into a relation extraction subtask and a named entity recognition subtask.
[0011] Optionally, the relation extraction and named entity recognition based on the active meta-learning architecture includes:
[0012] Execute the relation extraction subtask to extract relations from the statement and generate a relation description containing head entity location identifiers and tail entity location identifiers;
[0013] Perform the named entity recognition subtask to extract head and tail entities from the statement based on the relation description.
[0014] Optionally, the method further includes:
[0015] The matching loss between location prototypes and entity prototypes is calculated in a unified semantic space;
[0016] If the matching loss is less than or equal to a preset threshold, the relationship description, the head entity, and the tail entity are output.
[0017] If the matching loss is greater than the preset threshold, an active learning mechanism is triggered to adjust the training weights.
[0018] The step of triggering an active learning mechanism to adjust training weights when the matching loss is greater than the preset threshold includes:
[0019] Determine the initial values for the training weights, wherein the initial values are less than 1;
[0020] When the number of times the matching loss exceeds the preset threshold accumulates to a preset count, the training weight is increased by 0.1.
[0021] The step of increasing the training weight by 0.1 is performed iteratively until the training weight reaches its maximum value.
[0022] Optionally, merging the results of relation extraction and named entity recognition into a triple includes:
[0023] Based on the relationship description, the head entity and the tail entity are merged into a triple.
[0024] Optionally, the method further includes:
[0025] The relation identifier, entity identifier, and tag identifier are mapped to a 1024-dimensional vector space based on a linear transformation layer to determine the unified semantic space, wherein the weight matrix of the linear transformation layer has a dimension of 512×1024.
[0026] Optionally, the method further includes:
[0027] The triples are imported into a graph database to construct a knowledge graph.
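As a non-limiting illustration, the import step might be sketched as follows. The specification only refers to "a graph database"; the use of Cypher `MERGE` statements, the `Entity` label, the `name` property, and the helper `to_cypher` are assumptions introduced here for illustration and are not part of the described method.

```python
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build one parameterized Cypher statement for a (head, relation, tail) triple."""
    # The relationship type is inlined into the query text in this sketch,
    # so it is reduced to letters, digits and underscores first.
    rel_type = "".join(c if c.isalnum() else "_" for c in triple.relation).upper()
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    # Entity names are passed as parameters rather than spliced into the text.
    return query, {"head": triple.head, "tail": triple.tail}


query, params = to_cypher(Triple("penicillin", "treats", "pneumonia"))
```

Using `MERGE` rather than `CREATE` keeps repeated imports of the same triple from duplicating nodes or edges, which is one plausible choice when triples arrive incrementally.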
[0028] In a second aspect, embodiments of the present invention also provide a few-shot entity relation extraction device based on active meta-learning, comprising:
[0029] A design unit, used to design the active meta-learning architecture;
[0030] The execution unit is used to perform relation extraction and named entity recognition based on the active meta-learning architecture;
[0031] The merging unit is used to merge the results of relation extraction and named entity recognition into a triple.
[0032] To achieve the above objectives, according to a third aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium comprising a stored program, wherein, when the program is executed by a processor, the steps of the above-described method for extracting few-shot entity relations based on active meta-learning are implemented.
[0033] To achieve the above objectives, according to a fourth aspect of the present invention, an electronic device is provided, comprising at least one processor and at least one memory connected to the processor; wherein the processor is configured to invoke program instructions in the memory to execute the steps of the above-described method for extracting few-sample entity relations based on active meta-learning.
[0034] By employing the above technical solutions, the present invention provides a method and related equipment for extracting entity relations from a small number of samples based on active meta-learning. Addressing the technical problem of achieving high-precision, low-cost cross-domain entity relation extraction through task decoupling and adaptive training mechanisms in specialized fields where labeled data is scarce, the present invention designs an active meta-learning architecture; performs relation extraction and named entity recognition based on the active meta-learning architecture; and merges the results of relation extraction and named entity recognition into triples. In the above scheme, the dynamic collaborative mechanism of the task decoupling architecture reduces the dependence of cross-domain entity relation extraction on labeled resources. The end-to-end triple extraction is split into a two-stage task of relation extraction and named entity recognition, forming a semantic processing pipeline. Finally, the relation types, head entities, and tail entities output from the two stages are merged using structured rules. Thus, high-dimensional joint probability modeling is unnecessary, and the decoupling architecture enables the model to converge faster with a small number of samples, achieving "one-time training, multi-domain fine-tuning." High-precision, low-cost cross-domain entity relation extraction is achieved through task decoupling and adaptive training mechanisms.
[0035] Correspondingly, the few-sample entity relation extraction device, equipment, and computer-readable storage medium based on active meta-learning provided in the embodiments of the present invention also have the above-mentioned technical effects.
[0036] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent and understandable, specific embodiments of the present invention are described below.
Attached Figure Description
[0037] Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of preferred embodiments. The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0038] Figure 1 illustrates a flowchart of a few-shot entity relation extraction method based on active meta-learning provided by an embodiment of the present invention.
[0039] Figure 2 illustrates the composition of a few-shot entity relation extraction device based on active meta-learning provided in an embodiment of the present invention.
[0040] Figure 3 illustrates the composition of an electronic device for few-shot entity relation extraction based on active meta-learning, as provided in an embodiment of the present invention.
Detailed Implementation
[0041] Exemplary embodiments of the invention will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this invention will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
[0042] To address the technical challenge of achieving high-precision, low-cost cross-domain entity relation extraction through task decoupling and adaptive training mechanisms in specialized fields where labeled data is scarce, this invention provides a few-shot entity relation extraction method based on active meta-learning. As shown in Figure 1, the method includes:
[0043] S101, Design an active meta-learning architecture;
[0044] In one embodiment, the design of the active meta-learning architecture includes:
[0045] Obtain the unstructured text dataset to be extracted;
[0046] The triple extraction task of the unstructured text dataset is decoupled into a relation extraction subtask and a named entity recognition subtask.
[0047] For example, an unstructured text dataset refers to a collection of raw texts without human annotation (such as medical records or agricultural research reports) that contains potentially extractable triple information but lacks structured labels. The triple extraction task refers to the goal of extracting structured knowledge in the form of (head entity, relation, tail entity). The relation extraction subtask identifies semantic relation types (e.g., "clinical manifestations") from sentences and generates relation description templates containing location identifiers ([HEAD] / [TAIL]) (e.g., "head entity [HEAD] triggers tail entity [TAIL] symptoms"). The named entity recognition subtask locates the specific head and tail entities in a sentence based on the semantic constraints in the relation description template (e.g., [HEAD] must be a disease name).
[0048] The task decoupling of this application reduces the complexity of cross-domain extraction through semantic isolation and dynamic feedback. The relation extraction subtask outputs a relation description template (containing [HEAD] / [TAIL] placeholders) that provides domain prior knowledge for entity recognition (e.g., in medical scenarios, [HEAD] must be a pathological term), narrowing the entity search scope from the global scope to the semantic space defined by the relation. When the named entity recognition subtask detects a conflict between entity type and relation semantics (e.g., in agricultural text, [HEAD] is located as "hospital" but the relation template requires "crop gene"), a feedback signal triggers iterative correction of the relation description template, forming a bidirectional calibration loop. Traditional end-to-end models must jointly learn a high-dimensional entity-relation interaction distribution; the decoupled architecture decomposes this into two low-dimensional subtasks, which makes it significantly easier for the model to converge with a small number of samples. As domain-independent location identifiers, [HEAD] and [TAIL] turn abstract relation semantics into a coordinate reference system for entity positioning (e.g., [HEAD] points to a disease in the medical field and to a crop in the agricultural field). Generalized terms in the relation description template (such as "trigger" and "influence") serve as replaceable parameter slots. When migrating across domains, only the slot contents need to be refilled; the model does not need to be rebuilt, and invariant features such as positional encoding are preserved.
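As a non-limiting illustration, the idea of refilling template slots during cross-domain migration while keeping the location identifiers fixed can be sketched as follows. The class `RelationTemplate`, its field names, the rendering format, and the "trait" slot value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RelationTemplate:
    head_type: str  # semantic constraint attached to [HEAD]
    verb: str       # generalized relation term, a replaceable slot
    tail_type: str  # semantic constraint attached to [TAIL]

    def render(self) -> str:
        # [HEAD] and [TAIL] are domain-independent and never change.
        return f"{self.head_type} [HEAD] {self.verb} {self.tail_type} [TAIL]"


# Medical-domain template.
medical = RelationTemplate(head_type="disease", verb="triggers", tail_type="symptom")

# Migration to the agricultural domain only refills the slots;
# the template structure itself is reused unchanged.
agricultural = replace(
    medical, head_type="crop gene", verb="influences", tail_type="trait"
)
```

Here `medical.render()` yields `disease [HEAD] triggers symptom [TAIL]`, and the agricultural variant differs only in its slot contents.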
[0049] By employing the aforementioned technical solution, the unstructured text dataset to be extracted is acquired, and the triple extraction task is decoupled into a relation extraction subtask and a named entity recognition subtask. This significantly reduces the dependence of cross-domain entity relation extraction on labeled resources, while improving the model's generalization ability and accuracy under few-shot conditions. Because the unstructured text dataset that serves as the input source (such as medical records or agricultural reports) is unlabeled, the model must be adaptive. The task decoupling architecture breaks end-to-end triple extraction down into two independent subtasks: relation extraction focuses on identifying the semantic relation types in sentences and generating relation description templates containing location identifiers, while named entity recognition locates specific entities based on the semantic constraints of these templates. This decoupling mechanism reduces the coupling complexity between entities and relations through semantic isolation, avoiding the convergence difficulties caused by high-dimensional interaction distributions in traditional joint learning models, and allows the model to adapt quickly in specialized domains where labeled data is scarce. For example, location identifiers (such as [HEAD]) in the relation description template provide domain prior knowledge for entity recognition, narrowing the search space from the global scope to the range defined by the relation and thus reducing entity location bias and mismatch risks. At the same time, the feedback mechanism allows entity recognition errors to trigger template correction, forming a dynamic calibration loop that enhances robustness across domains. Ultimately, this architecture enables the model to learn domain-specific patterns efficiently from a small number of samples, supporting low-cost construction of the foundation of a knowledge graph.
[0050] S102. Perform relation extraction and named entity recognition based on the active meta-learning architecture;
[0051] In one embodiment, the relation extraction and named entity recognition based on the active meta-learning architecture includes:
[0052] Execute the relation extraction subtask to extract relations from the statement and generate a relation description containing head entity location identifiers and tail entity location identifiers;
[0053] Perform the named entity recognition subtask to extract head and tail entities from the statement based on the relation description.
[0054] For example, the relation extraction subtask identifies semantic relation types from the input statement and generates relation description text containing predefined location identifiers (such as [HEAD] and [TAIL]). For instance, for the medical text "Penicillin treats pneumonia", the output is "head entity [HEAD] treats tail entity [TAIL] disease". The location identifiers ([HEAD] / [TAIL]) serve as abstract coordinate anchors for entity localization, turning relational semantics into an operable constraint framework for entity location. The named entity recognition subtask, guided by the semantic framework in the relation description (such as "treat disease") and by the location identifiers, accurately locates the head entity ("penicillin") and the tail entity ("pneumonia") in the statement.
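As a non-limiting illustration, the two-stage flow for the sentence "Penicillin treats pneumonia" can be sketched with toy stand-ins. The trigger lexicon `TRIGGERS`, the regular-expression matching, and all function names are assumptions made for this sketch; the described method relies on trained models rather than keyword rules.

```python
import re
from typing import NamedTuple, Optional, Tuple


class RelationDescription(NamedTuple):
    relation: str  # semantic relation type
    template: str  # description containing [HEAD] and [TAIL]


# Hypothetical trigger lexicon; a trained relation extractor would replace it.
TRIGGERS = {"treats": "treatment"}


def extract_relation(sentence: str) -> Optional[RelationDescription]:
    """Stage 1: identify the relation and emit a description with location identifiers."""
    for trigger, relation in TRIGGERS.items():
        if re.search(rf"\b{re.escape(trigger)}\b", sentence, flags=re.IGNORECASE):
            return RelationDescription(relation, f"[HEAD] {trigger} [TAIL]")
    return None


def extract_entities(sentence: str, desc: RelationDescription) -> Optional[Tuple[str, str]]:
    """Stage 2: locate head and tail entities at the positions the description marks."""
    # Escape the template, then turn each location identifier into a capture group.
    pattern = re.escape(desc.template)
    pattern = pattern.replace(re.escape("[HEAD]"), r"(?P<head>.+?)")
    pattern = pattern.replace(re.escape("[TAIL]"), r"(?P<tail>.+)")
    match = re.fullmatch(pattern, sentence.strip().rstrip("."), flags=re.IGNORECASE)
    return (match["head"], match["tail"]) if match else None


def extract_triple(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Merge the outputs of both stages into (head, relation, tail)."""
    desc = extract_relation(sentence)
    if desc is None:
        return None
    entities = extract_entities(sentence, desc)
    if entities is None:
        return None
    head, tail = entities
    return (head, desc.relation, tail)


triple = extract_triple("Penicillin treats pneumonia")
```

The point of the sketch is the data flow: the second stage never searches the sentence freely, it only fills the positions that the first stage's description exposes.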
[0055] The abstract semantic space established by the location identifiers in the relation description of this application (e.g., [HEAD] must be a drug entity, [TAIL] must be a disease entity) significantly narrows the entity search range, allowing the model to avoid irrelevant semantic interference. When entity recognition finds that the localization result deviates from the relation description constraints (e.g., the head entity is located to a non-drug term), a semantic conflict signal is transmitted in reverse to trigger iterative optimization of the relation description, forming a closed-loop calibration. Traditional triple joint extraction requires simultaneous modeling of high-dimensional interactions between entities and relations, while this scheme isolates the task through location identifiers, decomposing it into two low-dimensional sub-problems: relation classification and conditional entity localization, reducing the difficulty of model training and convergence. The location identifiers serve as a domain-invariant coordinate reference system: when migrating from the medical field to the agricultural field, the entity type of the same location identifier [HEAD] changes from "drug" to "pesticide," but the location encoding logic carried by the identifiers maintains topological consistency, requiring only minor adjustments to the domain keywords in the relation description (e.g., "treatment" → "prevention").
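As a non-limiting illustration, the closed-loop calibration can be sketched as a type check whose failure is fed back to the relation stage. The type lexicon `ENTITY_TYPES`, the ranked candidate list, and the strategy of moving to the next candidate description are simplifying assumptions; the specification does not prescribe how the relation description is re-optimized.

```python
from typing import List, NamedTuple, Optional, Sequence


class Constraint(NamedTuple):
    relation: str
    head_type: str  # type required at [HEAD]
    tail_type: str  # type required at [TAIL]


# Hypothetical type lexicon standing in for a learned entity typer.
ENTITY_TYPES = {
    "penicillin": "drug",
    "pneumonia": "disease",
    "hospital": "institution",
}


def conflicts(c: Constraint, head: str, tail: str) -> List[str]:
    """Return the location identifiers whose entity violates the constraint."""
    issues = []
    if ENTITY_TYPES.get(head.lower()) != c.head_type:
        issues.append("[HEAD]")
    if ENTITY_TYPES.get(tail.lower()) != c.tail_type:
        issues.append("[TAIL]")
    return issues


def calibrate(candidates: Sequence[Constraint], head: str, tail: str) -> Optional[Constraint]:
    """Walk ranked relation descriptions until one is consistent with the entities."""
    for candidate in candidates:
        if not conflicts(candidate, head, tail):
            return candidate
        # A non-empty conflict list acts as the reverse feedback signal:
        # the current description is rejected and the next one is tried.
    return None


ranked = [
    Constraint("located_in", "institution", "place"),
    Constraint("treatment", "drug", "disease"),
]
chosen = calibrate(ranked, "penicillin", "pneumonia")
```

In this toy run the first description is rejected because "penicillin" is not an institution, and the loop settles on the "treatment" description.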
[0056] By employing the above technical solution, the complexity and localization bias of cross-domain entity relation extraction are significantly reduced through a location identifier system and a dual-task collaborative mechanism. When the relation extraction subtask is executed, a relation description containing head entity location identifiers and tail entity location identifiers is generated (e.g., "head entity [HEAD] treats tail entity [TAIL] disease"). The location identifiers serve as domain-independent semantic anchors that establish an abstract coordinate framework for entity localization, enabling the named entity recognition subtask to locate entities accurately within the constraints of this framework (e.g., in the medical field, [HEAD] must be a drug-related noun). This decomposes the high-dimensional entity-relation joint learning problem of traditional end-to-end models into two low-dimensional optimization stages: relation extraction focuses on semantic relation classification and outputs location identifiers to establish a constraint space, while entity recognition searches for entity locations within that constraint space. This decoupled architecture reduces the risk of interference between tasks through semantic isolation. At the same time, the location identifier system provides a unified reference system for cross-domain transfer (e.g., in the agricultural domain, only the entity type of [HEAD] needs to be adjusted from "drug" to "pesticide", without rebuilding the model). Furthermore, semantic closed-loop calibration is achieved through a dynamic feedback mechanism: when entity recognition detects a conflict between the localization result and the relation description (e.g., the head entity is located as a non-drug noun), it triggers iterative correction of the relation description in reverse, forming a bidirectional optimization path that enhances domain adaptability and robustness. Finally, high-precision triple extraction is achieved in professional domains where annotations are scarce.
[0057] In one embodiment, the method further includes:
[0058] The matching loss between location prototypes and entity prototypes is calculated in a unified semantic space;
[0059] If the matching loss is less than or equal to a preset threshold, the relationship description, the head entity, and the tail entity are output.
[0060] If the matching loss is greater than the preset threshold, an active learning mechanism is triggered to adjust the training weights.
[0061] The step of triggering an active learning mechanism to adjust training weights when the matching loss is greater than the preset threshold includes:
[0062] Determine the initial values for the training weights, wherein the initial values are less than 1;
[0063] When the number of times the matching loss exceeds the preset threshold accumulates to a preset count, the training weight is increased by 0.1.
[0064] The step of increasing the training weight by 0.1 is performed iteratively until the training weight reaches its maximum value.
[0065] For example, a unified semantic space maps location identifiers (e.g., [HEAD]) in relation descriptions to the same vector dimension system, making abstract location constraints and concrete entity features quantifiable and comparable. Location prototype: A vectorized representation of the location identifiers generated based on the relation description, carrying the theoretical location features of the entity (e.g., the semantic coordinates of the drug name expected to be pointed to by [HEAD] in medical text). Entity prototype: A vectorized representation of the head / tail entities output by the named entity recognition subtask, reflecting the semantic features of the actually located entity (e.g., the embedding vector of "penicillin" in a sentence). Matching loss: A measure of the difference between the location prototype and the entity prototype in the unified semantic space, reflecting the degree of deviation between entity location and relation description. Preset threshold: A critical value for judging whether the matching loss is acceptable; below the threshold indicates reliable location, above the threshold requires intervention. Training weights: Parameters that dynamically adjust the learning intensity of samples, with a low initial value (less than 1), gradually increasing as difficult samples accumulate.
[0066] The location prototype in this application implies the entity type rules expected by the relation description (e.g., [HEAD] must be a drug), while the entity prototype carries the actual extracted entity features (e.g., the semantic vector of "penicillin"). The similarity calculation between the two in a unified space essentially verifies whether the actual entity conforms to the semantic constraints of the relation description. Low-loss paths (the matching loss is less than or equal to a preset threshold): indicate that the entity location highly matches the relation description (e.g., "penicillin" conforms to the expected drug class), and the result is directly output to ensure efficiency; high-loss paths (the matching loss is greater than the preset threshold): indicate semantic deviation (e.g., the entity location is "hospital" but the expected value is a drug), triggering the active learning mechanism. All samples are given a basic learning strength (initial value < 1) to avoid premature overfitting. When high-loss samples accumulate to a preset number of times (e.g., multiple location deviations), the weight is increased by a fixed small amount (e.g., 0.1), gradually increasing the model's attention to this type of sample; when the weight reaches its maximum value (e.g., 1.0), the strengthening stops to prevent interference from noisy samples.
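As a non-limiting illustration, the threshold decision on the matching loss can be sketched as follows. The specification does not name the distance measure; cosine distance, the threshold value 0.3, and the function names are assumptions used only to make the two paths concrete.

```python
import math
from typing import Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the two prototypes point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("prototypes must be non-zero vectors")
    return 1.0 - dot / norm


def route(
    location_prototype: Sequence[float],
    entity_prototype: Sequence[float],
    threshold: float = 0.3,  # hypothetical preset threshold
) -> Tuple[str, float]:
    """Pick the low-loss path (output) or the high-loss path (active learning)."""
    loss = cosine_distance(location_prototype, entity_prototype)
    decision = "output" if loss <= threshold else "active_learning"
    return decision, loss


# Identical prototypes: the entity matches what the relation description expects.
print(route([1.0, 0.0], [1.0, 0.0]))
# Orthogonal prototypes: semantic deviation, so active learning is triggered.
print(route([1.0, 0.0], [0.0, 1.0]))
```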
[0067] Specifically, the aforementioned active learning mechanism intervenes, using the matching loss between the location prototype (the theoretical entity location described by the relation) and the entity prototype (the actual extracted entity semantics) in the unified semantic space as the decision-making basis. When the loss consistently exceeds a threshold, it indicates the existence of semantic topological bias (e.g., in medical text, the [HEAD] identifier is expected to point to a drug-related entity, but it is actually located to an institution name). A progressive reinforcement strategy is adopted: the initial weight is set to a value less than 1 (e.g., 0.5) to keep the model cautious in learning early samples; when high loss states accumulate continuously to a preset number (e.g., 3 times), the weight is increased by a fixed small amount (+0.1); iterative adjustments are made until the weight reaches the upper limit (e.g., 1.0), forming a step-by-step reinforcement path. After the weight adjustment, the model parameters are updated, focusing on enhancing the feature extraction ability of the current difficult samples. For example, in biomedical text, when the entity of the "pathological stage judgment" relation is repeatedly mislocated, increasing the weight of this sample can enable the model to deeply learn the boundary features of pathological terms. The number of consecutive high losses is used as the trigger condition (not a single loss) to effectively distinguish between systematic bias and random noise. For example, persistent positioning errors caused by terminology variations in agricultural science texts will be identified, while individual spelling errors will be ignored.
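As a non-limiting illustration, the progressive reinforcement strategy can be sketched with the example values given above (initial weight 0.5, three consecutive high-loss events, step 0.1, upper limit 1.0). The class name `SampleWeight` and the choice to reset the consecutive-loss counter after each increase are assumptions; the specification does not state what happens to the counter after an increase.

```python
from dataclasses import dataclass


@dataclass
class SampleWeight:
    weight: float = 0.5      # initial value, less than 1
    step: float = 0.1        # fixed increment
    max_weight: float = 1.0  # upper limit
    patience: int = 3        # consecutive high-loss count that triggers an increase
    streak: int = 0          # current run of consecutive high losses

    def update(self, loss: float, threshold: float) -> float:
        """Record one matching loss and return the (possibly increased) weight."""
        if loss > threshold:
            self.streak += 1
            if self.streak >= self.patience:
                # Rounding only keeps floating-point drift out of the weight.
                raised = round(self.weight + self.step, 10)
                self.weight = min(self.max_weight, raised)
                self.streak = 0  # assumption: start counting again after an increase
        else:
            # A single acceptable loss breaks the run, so isolated noise is ignored.
            self.streak = 0
        return self.weight


w = SampleWeight()
history = [w.update(loss=0.9, threshold=0.3) for _ in range(3)]
```

After three consecutive high losses the weight moves from 0.5 to 0.6, while an intervening low loss resets the count without changing the weight.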
[0068] By employing the above technical solution, the dynamic control mechanism of matching loss significantly reduces the training instability and output error risk in scenarios with few samples: the matching loss between the location prototype (the theoretical entity location in the relation description) and the entity prototype (the actual extracted entity semantics) is calculated in a unified semantic space. When the loss is lower than or equal to a preset threshold, triples are directly output to ensure efficient flow of reliable results. When the loss is higher than the threshold, a progressive weight adjustment mechanism is triggered—based on an initial training weight value of less than 1, the weight value is increased stepwise (by 0.1 each time) according to the number of consecutive accumulations of high-loss samples until the weight upper limit is reached. This process guides the model to focus on difficult samples rather than occasional noise through multiple micro-enhancements, while the upper limit of the weight is controlled to avoid the risk of overfitting. The matching loss serves as a quantitative benchmark for the consistency between entity localization and relation description. Its threshold decision mechanism forms a semantic closed-loop verification (e.g., in a medical scenario, if the entity prototype deviates from the "drug category" constraint of the location prototype, intervention is triggered). The cumulative triggering condition of weight adjustment (not a single loss response) further optimizes the efficiency of annotation resource allocation, ultimately achieving a balance between annotation cost and model accuracy while ensuring output reliability.
[0069] In one embodiment, the method further includes:
[0070] The relation identifier, entity identifier, and tag identifier are mapped to a 1024-dimensional vector space based on a linear transformation layer to determine the unified semantic space, wherein the weight matrix of the linear transformation layer has a dimension of 512×1024.
[0071] For example, relation identifiers: semantic relation encodings output from relation extraction subtasks (e.g., vector representations of "clinical manifestations"), carrying abstract features of relation types. Entity identifiers: head / tail entity encodings output from named entity recognition subtasks (e.g., semantic vectors of "penicillin"), reflecting the boundary and category features of specific entities. Label identifiers: predefined entity type label encodings (e.g., "drugs," "diseases"), representing domain knowledge constraints. Linear transformation layer: a fully connected neural network structure that transforms the feature space through a weight matrix. Weight matrix dimension 512×1024: the input dimension of 512 corresponds to the feature capacity of the original identifier, and the output dimension of 1024 is the representation depth of the target semantic space.
[0072] This application first inputs relation identifiers (feature vectors representing semantic relation types), entity identifiers (feature vectors representing head / tail entity semantic features), and label identifiers (feature vectors representing entity category labels) as 512-dimensional input vectors to a linear transformation layer. This transformation layer performs feature space transformation using a 512×1024-dimensional weight matrix, where the 512 rows of the weight matrix correspond to each dimension of the input features, and the 1024 columns correspond to each basis of the output space. During the transformation, the weight matrix performs weighted combination operations on each dimension of the input vectors. For example, it performs cross-modal fusion and recombination on the "action association" feature of the relation identifier, the "boundary sensitivity" feature of the entity identifier, and the "category discrimination" feature of the label identifier to generate a 1024-dimensional output vector containing composite semantics. This transformation essentially achieves feature dimensionality enhancement and interaction through matrix multiplication, transforming the original 512-dimensional feature vector into a linear transformation layer. Discrete semantic features in the 2D input space (such as the action vector of the "treatment" relation, the chemical attribute vector of the "penicillin" entity, and the boundary definition vector of the "drug" label) are projected onto a unified 1024-dimensional continuous vector space. 
In this space, the features of the three types of identifiers are re-encoded into compatible representations: the first 256 dimensions mainly encode the action constraint features of relations and entities (such as the compatibility between the "treatment" relation and the "drug" entity), the middle 512 dimensions focus on the type matching features of entities and labels (such as the association strength between the "penicillin" entity and the "drug" label), and the last 256 dimensions carry the rule verification features of relations and labels (such as the logical adaptation between the "clinical manifestation" relation and the "disease" label). Finally, a standardized semantic coordinate system is formed in the 1024-dimensional space, which allows the similarity calculation of vectors of different types of identifiers, providing a quantitative basis for the matching loss between the location prototype and the entity prototype.
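The mapping described above can be illustrated with a minimal sketch. It assumes the linear transformation layer is a plain matrix product without bias, uses random numbers in place of trained weights and real identifier encodings, and uses cosine similarity as one possible similarity measure; none of these details are specified in the description, and the function names are hypothetical.

```python
import math
import random

IN_DIM, OUT_DIM = 512, 1024  # dimensions stated in the description


def make_weight_matrix(seed=0):
    """Random 512x1024 matrix standing in for trained weights (assumption)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(IN_DIM)
    return [[rng.gauss(0.0, scale) for _ in range(OUT_DIM)] for _ in range(IN_DIM)]


def project(vec, weights):
    """Compute vec @ weights: a 512-dim identifier becomes a 1024-dim vector."""
    if len(vec) != IN_DIM:
        raise ValueError(f"expected {IN_DIM} dimensions, got {len(vec)}")
    out = [0.0] * OUT_DIM
    for x, row in zip(vec, weights):
        for j, w in enumerate(row):
            out[j] += x * w
    return out


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(1)
W = make_weight_matrix()
# Placeholder identifier encodings; real ones would come from the two subtasks.
relation_id = [rng.gauss(0.0, 1.0) for _ in range(IN_DIM)]
entity_id = [rng.gauss(0.0, 1.0) for _ in range(IN_DIM)]

z_rel = project(relation_id, W)
z_ent = project(entity_id, W)
print(len(z_rel), cosine(z_rel, z_ent))
```

Because all identifier types pass through the same matrix, their outputs live in one 1024-dimensional space, which is what makes a direct similarity computation between them meaningful.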
[0073] Using the above technical solution, relation identifiers, entity identifiers, and label identifiers are mapped to a 1024-dimensional unified semantic space through a linear transformation layer. The 512×1024 dimension of the weight matrix balances feature capacity and computational efficiency. This mapping realizes the standardization process of heterogeneous semantics: the abstract association features of the original relation identifiers (such as the "treatment" action), the specific instance attributes of the entity identifiers (such as the "penicillin" chemical structure), and the classification rules of the label identifiers (such as the "drug class" boundary) are reorganized into compatible representations in the unified space. The first 256 dimensions focus on the action constraints of relation-entity, the middle 512 dimensions strengthen the type matching of entity-label, and the last 256 dimensions verify the rule compatibility of relation-label. This structured encoding makes the difference between the position prototype (the theoretical entity position described by the relation) and the entity prototype (the actual extracted entity) quantifiable, providing a stable measurement basis for matching loss. At the same time, the fixed weight part (70%) of the weight matrix retains cross-domain invariant features (such as entity boundary awareness), while the fine-tunable part (30%) adapts to domain rule transfer (medical drugs → agricultural pesticides), significantly reducing the semantic alignment difficulty during domain adaptation.
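The split between fixed and fine-tunable weights can be sketched as a masked update. The description does not say which entries of the matrix are fixed, so the column-wise split below is an assumption, as are the plain gradient step, the learning rate, and the function names; a small 4×10 matrix stands in for the 512×1024 one.

```python
import random

FROZEN_FRACTION = 0.7  # share of fixed weights given in the description


def build_tunable_mask(n_rows, n_cols, frozen_fraction=FROZEN_FRACTION):
    """Mark the last ~30% of columns as tunable (column split is an assumption)."""
    first_tunable = round(n_cols * frozen_fraction)
    return [[j >= first_tunable for j in range(n_cols)] for _ in range(n_rows)]


def masked_sgd_step(weights, grads, mask, lr=0.01):
    """Apply a gradient step only where mask is True; frozen entries are copied."""
    return [
        [w - lr * g if m else w for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


# Small stand-in for the 512x1024 weight matrix, to keep the example readable.
rng = random.Random(0)
W = [[rng.uniform(-1.0, 1.0) for _ in range(10)] for _ in range(4)]
G = [[1.0] * 10 for _ in range(4)]  # dummy gradients from a target-domain batch
mask = build_tunable_mask(4, 10)
W_new = masked_sgd_step(W, G, mask)

changed = sum(a != b for old, new in zip(W, W_new) for a, b in zip(old, new))
print(changed, "of", 4 * 10, "entries updated")
```

Only the tunable entries move during domain adaptation, so whatever the frozen part has learned in the source domain is carried over unchanged.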
[0074] S103. The results of relation extraction and named entity recognition are combined into a triple.
[0075] In one embodiment, merging the results of relation extraction and named entity recognition into a triple includes:
[0076] Based on the relationship description, the head entity and the tail entity are merged into a triple.
[0077] This application reduces the risk of cross-domain triple combination errors through dynamic compatibility verification of semantic frameworks: based on the relation description (containing a semantic framework of predefined position identifiers such as [HEAD] / [TAIL]) generated by the relation extraction subtask, real-time semantic rule verification is performed on the head entity and tail entity output by the named entity recognition subtask. For example, when the relation description is "head entity_[HEAD] treatment tail entity [TAIL]_disease", the merging stage first parses the type constraints implicit in the position identifiers (such as [HEAD] must be a drug entity and [TAIL] must be a disease entity), and then verifies whether the head entity (such as "penicillin") conforms to the drug category characteristics and whether the tail entity (such as "pneumonia") conforms to the disease category characteristics. If the entity type is consistent with the framework constraints (such as the drug name matching the drug category in a medical scenario), it is automatically assembled into a triple (penicillin, treatment, pneumonia). If a type conflict is detected (such as [HEAD] being located as "soil" in agricultural text but the relation description requires "crop gene"), the merging process is interrupted and feedback is sent to the upstream module to trigger semantic correction. This process assigns type validation labels to entities through structured rules in relation descriptions (such as domain ontology constraints carried by identifiers), upgrading the traditional triple concatenation to a semantically driven dynamic adaptation mechanism, significantly reducing the risk of entity-relation mismatch caused by domain migration.
[0078] By employing the above technical solution, the dynamic compatibility verification mechanism based on relation descriptions, head entities, and tail entities significantly reduces the risk of cross-domain triple combination errors. During the merging process, the position identifiers in the relation description (e.g., "head entity_[HEAD] treatment tail entity [TAIL]_disease") are parsed into semantic constraint rules (e.g., [HEAD] must be a drug entity), and real-time type matching verification is performed on the head and tail entities output by named entity recognition. When the entity features conform to the domain rules implicit in the identifiers (e.g., in a medical scenario, "penicillin" conforms to drug features and "pneumonia" conforms to disease features), the triple is automatically assembled; if a type conflict is detected (e.g., in agricultural text, the tail entity is located as "soil" but the relation description requires "disease object"), the merging process is interrupted and a correction signal is fed back to the upstream module. This semantically driven mechanism upgrades the traditional splicing operation into a structured adaptation process. It uses the ontology constraints carried by the identifiers (e.g., [HEAD] is bound to the drug type in the medical field and the pesticide type in the agricultural field) to establish domain-adaptive verification labels. This not only prevents triples with mismatched entities and relations from polluting the knowledge base, but also guides the iterative optimization of entity location and relation description through a feedback loop, ultimately maintaining the semantic consistency of triples in cross-domain migration scenarios.
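The type-checked merging step can be sketched as follows. The `RelationDescription` class, the `TypeConflict` exception, the `merge_triple` function, and the small type lexicon are all hypothetical names introduced for illustration; in the described method the entity types would come from the named entity recognition subtask, not from a lookup table.

```python
from dataclasses import dataclass


class TypeConflict(Exception):
    """Raised when an entity violates the type constraint of its slot."""


@dataclass(frozen=True)
class RelationDescription:
    relation: str
    head_type: str  # type constraint bound to [HEAD]
    tail_type: str  # type constraint bound to [TAIL]


# Toy lexicon standing in for the entity types produced by NER (assumption).
ENTITY_TYPES = {"penicillin": "drug", "pneumonia": "disease", "soil": "environment"}


def merge_triple(desc, head, tail, entity_types=ENTITY_TYPES):
    """Assemble (head, relation, tail) only if both slots pass the type check."""
    slots = (("[HEAD]", head, desc.head_type), ("[TAIL]", tail, desc.tail_type))
    for slot, entity, required in slots:
        actual = entity_types.get(entity)
        if actual != required:
            raise TypeConflict(f"{slot} expects {required}, got {entity!r} ({actual})")
    return (head, desc.relation, tail)


treatment = RelationDescription("treatment", head_type="drug", tail_type="disease")
print(merge_triple(treatment, "penicillin", "pneumonia"))

try:
    merge_triple(treatment, "penicillin", "soil")
except TypeConflict as err:
    # In the described method this signal would go back to the upstream module.
    print("merge interrupted:", err)
```

Raising an exception is one simple way to model the interruption; a production pipeline might instead return a structured correction request to the relation extraction or entity recognition stage.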
[0079] In one embodiment, the method further includes:
[0080] The triples are imported into a graph database to construct a knowledge graph.
[0081] By employing the above technical solution, a knowledge graph is constructed by importing semantically validated triples into a graph database. The topological representation capability of the graph structure reduces the conversion loss from unstructured knowledge to structured storage. The triples are structured data units (e.g., (penicillin, treatment, pneumonia) in the medical field) generated through dynamic compatibility validation. The graph database uses a node-edge-attribute topological model for knowledge storage. During the import process, the head and tail entities are mapped to graph nodes with attached type attributes (e.g., the "penicillin" node is labeled with the type: drug). Relationships are mapped to directed edges connecting nodes and carry semantic descriptions (e.g., the "treatment" edge is associated with symptom intensity parameters). Attribute sets are automatically extracted from the constraint rules of the relation description (e.g., the "treatment" edge inherits the action mechanism in the relation description: antibacterial attribute). This mapping mechanism transforms the discrete semantics of triples into a continuous topological expression of the graph network. Multi-hop connections between nodes (e.g., "penicillin → treatment → pneumonia → complications → pleural effusion") form cross-entity semantic reasoning chains, significantly weakening the limitations of the traditional database's matrix structure on complex relation modeling.
[0082] By employing the above technical solution, the lossless mapping from triples to graph structures significantly reduces the difficulty of storing and reasoning about complex knowledge. First, the semantically verified triples are decomposed into entity nodes and relation edges, where nodes carry entity type attributes (e.g., the medical entity "pneumonia" is bound to a disease category label) and relation edges inherit the action constraints in the description (e.g., the "treatment" edge carries the route of administration parameter). Then, the topological association capability of the graph database is used to construct multi-hop semantic chains (e.g., "penicillin → treatment → pneumonia → complications → pleural effusion"), transforming discrete triples into a knowledge network that supports reasoning. Finally, the knowledge base is self-optimized through dynamic attribute expansion (e.g., adding ICD codes based on the medical ontology) and conflict detection mechanisms (e.g., interrupting an import when a pharmacological conflict is detected), greatly reducing the structural loss that nonlinear knowledge suffers in traditional relational databases.
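The node, edge, and attribute mapping can be illustrated with a small in-memory stand-in for a graph database. The `TinyGraph` class and its methods are hypothetical and exist only for this sketch; an actual deployment would use the import and query facilities of the chosen graph database, and the "route" attribute value is an invented placeholder.

```python
from collections import defaultdict


class TinyGraph:
    """In-memory stand-in for a graph database (illustration only)."""

    def __init__(self):
        self.nodes = {}                 # entity name -> attribute dict
        self.edges = defaultdict(list)  # head -> [(relation, tail, attributes)]

    def add_triple(self, head, relation, tail, head_type, tail_type, **edge_attrs):
        """Map entities to typed nodes and the relation to a directed edge."""
        self.nodes.setdefault(head, {})["type"] = head_type
        self.nodes.setdefault(tail, {})["type"] = tail_type
        self.edges[head].append((relation, tail, edge_attrs))

    def chains(self, start, max_hops):
        """Return every node/relation chain of 1..max_hops edges from start."""
        results = []
        stack = [(start, [start])]
        while stack:
            node, chain = stack.pop()
            if (len(chain) - 1) // 2 == max_hops:
                continue
            for relation, tail, _attrs in self.edges.get(node, []):
                if tail in chain[::2]:  # even positions hold nodes; skip cycles
                    continue
                extended = chain + [relation, tail]
                results.append(extended)
                stack.append((tail, extended))
        return results


graph = TinyGraph()
graph.add_triple("penicillin", "treatment", "pneumonia", "drug", "disease", route="placeholder")
graph.add_triple("pneumonia", "complications", "pleural effusion", "disease", "disease")

for chain in graph.chains("penicillin", max_hops=2):
    print(" → ".join(chain))
```

The two-hop chain printed here corresponds to the multi-hop example in the text; the traversal works because each triple became a directed edge between shared nodes rather than an isolated row.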
[0083] In summary, the technical solution provided in this application systematically addresses the three technical contradictions identified in the background technology (scarcity of labeled data, difficulty in cross-domain transfer, and excessive semantic coupling) through the combination of a task decoupling architecture and a dynamic training mechanism. At the task decoupling level, end-to-end triple extraction is split into sequentially executed relation extraction and named entity recognition subtasks. Relation extraction generates relation description templates containing positional identifiers such as [HEAD] / [TAIL] (e.g., "head entity_[HEAD] treatment tail entity [TAIL]_disease"). This template serves as a semantic scaffold, providing domain-prior constraints for entity recognition. This reduces the named entity recognition subtask from a global search to a search within a relation-defined semantic space (e.g., [HEAD] is restricted to drug-related terms). The high-dimensional entity-relation interaction distribution that traditional joint learning must model is decomposed into two low-dimensional sub-problems, significantly reducing the convergence difficulty of the model with a small number of samples. At the dynamic training optimization level (corresponding to weights 4-6), the relation identifier, entity identifier, and label identifier are mapped to a 1024-dimensional unified semantic space based on the linear transformation layer (the weight matrix is 512×1024 to balance feature capacity and computational efficiency). In this space, the matching loss between the position prototype (the theoretical entity position described by the relation) and the entity prototype (the actual extracted entity) is calculated. When the loss remains higher than the threshold, a progressive weight adjustment mechanism is triggered. The initial weight is less than 1 to avoid overfitting.
The weight value is increased stepwise according to the cumulative number of high-loss samples (e.g., +0.1 for every 10 samples), guiding the model to learn domain features in stages. This mechanism screens difficult samples through loss quantification, directing limited labeling resources towards high-value samples. At the cross-domain adaptation level, the location identifier serves as a domain-invariant coordinate reference system (e.g., [HEAD] points to drugs in the medical field and pesticides in the agricultural field). Only the domain keywords in the relation description need to be fine-tuned (e.g., "treatment" → "prevention"), without reconstructing the model. Combined with the semantic verification in the triple merging stage, it blocks entity-relation type mismatches (e.g., when the tail entity "soil" in agricultural text conflicts with the "disease object" constraint, the output is interrupted), forming a low-resource adaptation system that covers the entire process from feature learning to result verification.
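The progressive weight adjustment can be sketched as a small stateful scheduler. The step of 0.1 per 10 high-loss samples follows the example in the text; the initial weight of 0.5, the maximum of 1.0, the threshold value, and the class name `ProgressiveWeight` are assumptions chosen for illustration, since the description only requires an initial value below 1 and an unspecified maximum.

```python
class ProgressiveWeight:
    """Raise the training weight stepwise as high-loss samples accumulate."""

    def __init__(self, threshold, initial=0.5, step=0.1, every=10, maximum=1.0):
        if not initial < 1:
            raise ValueError("initial training weight must be less than 1")
        self.threshold = threshold
        self.weight = initial
        self.step = step
        self.every = every        # high-loss samples needed per increment
        self.maximum = maximum    # assumed cap; the source only says "maximum value"
        self.high_loss_count = 0

    def observe(self, matching_loss):
        """Return True if the sample passes (loss <= threshold), else update state."""
        if matching_loss <= self.threshold:
            return True
        self.high_loss_count += 1
        if self.high_loss_count % self.every == 0:
            # round() keeps repeated 0.1 increments free of float drift
            self.weight = min(self.maximum, round(self.weight + self.step, 10))
        return False


scheduler = ProgressiveWeight(threshold=0.3)
for loss in [0.8] * 25:           # 25 consecutive high-loss samples
    scheduler.observe(loss)
passed = scheduler.observe(0.1)   # a low-loss sample is output directly

print(scheduler.high_loss_count, scheduler.weight, passed)
```

Samples that return `False` are the ones the active learning mechanism would prioritize, which is how the loss threshold doubles as a filter for difficult samples.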
[0084] Furthermore, as an implementation of the method shown in Figure 1 above, this embodiment of the invention also provides a few-shot entity relation extraction device based on active meta-learning, which is used to implement the method shown in Figure 1. This device embodiment corresponds to the foregoing method embodiment. For ease of reading, this device embodiment will not repeat the details of the foregoing method embodiment, but it should be clear that the device in this embodiment can implement all the contents of the foregoing method embodiment. As shown in Figure 2, the device includes a design unit 21, an execution unit 22, and a merging unit 23, wherein:
[0085] Design unit 21 is used to design the active meta-learning architecture;
[0086] Execution unit 22 is used to perform relation extraction and named entity recognition based on the active meta-learning architecture;
[0087] Merging unit 23 is used to merge the results of relation extraction and named entity recognition into a triple.
[0088] The processor contains a kernel, which retrieves the corresponding program units from memory. One or more kernels can be configured, and by adjusting kernel parameters, a few-shot entity relation extraction method based on active meta-learning can be implemented. This method addresses the technical challenge of achieving high-precision, low-cost cross-domain entity relation extraction in specialized fields where labeled data is scarce, through task decoupling and adaptive training mechanisms.
[0089] This invention provides a computer-readable storage medium including a stored program that, when executed by a processor, implements the few-shot entity relation extraction method based on active meta-learning.
[0090] This invention provides a processor for running a program, wherein the program executes the few-shot entity relation extraction method based on active meta-learning.
[0091] This invention provides an electronic device, which includes at least one processor and at least one memory connected to the processor; wherein the processor is used to call program instructions in the memory to execute the few-shot entity relation extraction method based on active meta-learning as described above.
[0092] This invention provides an electronic device 30. As shown in Figure 3, the electronic device includes at least one processor 301, and at least one memory 302 and a bus 303 connected to the processor; wherein the processor 301 and the memory 302 communicate with each other through the bus 303; the processor 301 is used to call program instructions in the memory to execute the above-mentioned few-shot entity relation extraction method based on active meta-learning.
[0093] The smart electronic devices mentioned in this article can be PCs, tablets, mobile phones, etc.
[0094] This application also provides a computer program product that, when executed on a data processing electronic device, is adapted to execute a program initialized with the steps of the above-described few-shot entity relation extraction method based on active meta-learning.
[0095] It should be noted that the descriptions of each embodiment in the above embodiments have different focuses. For parts that are not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0096] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0097] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0098] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0099] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0100] This application also provides a computer program product, which includes computer software instructions that, when executed on a processing device, cause the processing device to execute the flow of the embodiment corresponding to Figure 1.
[0101] A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can store or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).
[0102] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0103] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces, or indirect coupling or communication connection between apparatuses or units, and may be electrical, mechanical, or other forms.
[0104] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0105] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0106] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0107] The above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit it. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A method for extracting few-shot entity relations based on active meta-learning, characterized in that it includes: designing an active meta-learning architecture; performing relation extraction and named entity recognition based on the active meta-learning architecture; and merging the results of relation extraction and named entity recognition into a triple; wherein designing the active meta-learning architecture includes: obtaining the unstructured text dataset to be extracted; and decoupling the triple extraction task of the unstructured text dataset into a relation extraction subtask and a named entity recognition subtask; wherein performing relation extraction and named entity recognition based on the active meta-learning architecture includes: executing the relation extraction subtask to extract relations from the statement and generate a relation description containing head entity location identifiers and tail entity location identifiers; executing the named entity recognition subtask to extract head and tail entities from the statement based on the relation description; calculating the matching loss between location prototypes and entity prototypes in a unified semantic space; if the matching loss is less than or equal to a preset threshold, outputting the relation description, the head entity, and the tail entity; and if the matching loss is greater than the preset threshold, triggering an active learning mechanism to adjust the training weights; wherein triggering the active learning mechanism to adjust the training weights when the matching loss is greater than the preset threshold includes: determining an initial value for the training weight, wherein the initial value is less than 1; as the number of times the matching loss exceeds the preset threshold accumulates, increasing the training weight by 0.1; and iteratively performing the step of increasing the training weight by 0.1 until the training weight reaches its maximum value.
2. The method according to claim 1, characterized in that the step of merging the results of relation extraction and named entity recognition into a triple includes: merging the head entity and the tail entity into a triple based on the relation description.
3. The method according to claim 1, characterized in that it further includes: mapping the relation identifier, entity identifier, and label identifier to a 1024-dimensional vector space based on a linear transformation layer to determine the unified semantic space, wherein the weight matrix of the linear transformation layer has a dimension of 512×1024.
4. The method according to claim 1, characterized in that it further includes: importing the triples into a graph database to construct a knowledge graph.
5. A few-shot entity relation extraction device based on active meta-learning, characterized in that it includes: a design unit, used to design an active meta-learning architecture; an execution unit, used to perform relation extraction and named entity recognition based on the active meta-learning architecture; and a merging unit, used to merge the results of relation extraction and named entity recognition into a triple; wherein designing the active meta-learning architecture includes: obtaining the unstructured text dataset to be extracted; and decoupling the triple extraction task of the unstructured text dataset into a relation extraction subtask and a named entity recognition subtask; wherein performing relation extraction and named entity recognition based on the active meta-learning architecture includes: executing the relation extraction subtask to extract relations from the statement and generate a relation description containing head entity location identifiers and tail entity location identifiers; executing the named entity recognition subtask to extract head and tail entities from the statement based on the relation description; calculating the matching loss between location prototypes and entity prototypes in a unified semantic space; if the matching loss is less than or equal to a preset threshold, outputting the relation description, the head entity, and the tail entity; and if the matching loss is greater than the preset threshold, triggering an active learning mechanism to adjust the training weights; wherein triggering the active learning mechanism to adjust the training weights when the matching loss is greater than the preset threshold includes: determining an initial value for the training weight, wherein the initial value is less than 1; as the number of times the matching loss exceeds the preset threshold accumulates, increasing the training weight by 0.1; and iteratively performing the step of increasing the training weight by 0.1 until the training weight reaches its maximum value.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored program, wherein, when the program is executed by a processor, it implements the steps of the few-shot entity relation extraction method based on active meta-learning as described in any one of claims 1 to 4.
7. An electronic device, characterized in that the electronic device includes at least one processor and at least one memory connected to the processor; wherein the processor is configured to invoke program instructions in the memory to execute the steps of the few-shot entity relation extraction method based on active meta-learning as described in any one of claims 1 to 4.