Open-domain relation inference method, system, device and medium based on boundary-aware prototype

By using a boundary-aware prototype method, relation boundaries are dynamically generated and optimized using a loss function. This solves the limitations of fixed thresholds and boundary ambiguity in open-domain relation inference, achieving high accuracy and robust relation recognition.

CN122242733A · Pending · Publication Date: 2026-06-19 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Filing Date
2026-03-13
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing open-domain relation inference methods, fixed thresholds cannot adapt to the multimodal distribution differences of different relation categories. This leads to compact classes being prone to misjudging noise, sparse classes being prone to missing positive samples, and the boundary definition being vague, making it difficult to effectively reject unknown relations.

Method used

A boundary-aware prototype-based approach is adopted, which dynamically generates geometric boundaries for each relation category through a residual boundary generator and optimizes the boundary positions using a collaborative training strategy. By combining boundary calibration loss and boundary repulsion loss, accurate relation inference is achieved.

Benefits of technology

It achieves high accuracy and robustness in open-domain relation inference, and can adaptively establish a dedicated threshold for each relation, significantly improving the ability to reject unknown noise.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses an open-domain relation inference method, system, electronic device, and storage medium based on boundary-aware prototypes, belonging to the field of natural language processing technology. The method includes: extracting instance feature vectors and relation prototype vectors using a Sentence-BERT model; inputting the relation prototype vectors into a boundary generator, which generates dynamic relation boundary vectors around the prototypes through a residual connection mechanism; constructing a boundary calibration loss function and a boundary repulsion loss function, co-training the model, and optimizing the feature space distribution; and, in the inference stage, calculating the similarity between the instance and the prototype and the similarity between the prototype and its boundary vectors, and using the latter as a dynamic threshold to determine the relation category. By learning a dedicated geometric boundary for each relation category, this invention solves the problem that fixed thresholds cannot adapt to complex distributions, significantly improving the accuracy and robustness of relation inference in open-domain environments.

Description

Technical Field

[0001] This invention belongs to the field of artificial intelligence and natural language processing technology, and specifically relates to an open-domain relation inference method, system, electronic device and storage medium based on a boundary-aware prototype, suitable for high-accuracy relation inference in open application scenarios that contain noise from unknown categories.

Background Technology

[0002] Relation inference aims to identify semantic relationships between entity pairs from unstructured natural language text and is a core technology for constructing knowledge graphs. Traditional supervised learning methods rely on closed, predefined sets of relations, which cannot handle the constantly emerging new relations or irrelevant noise in the real world.

[0003] Existing open-domain relation inference methods typically employ a prototype-based metric learning framework, using the Sentence-BERT model for vector generation and identifying relations by calculating the similarity between instance feature vectors and relation prototype vectors. However, existing technologies suffer from the following drawbacks:

[0004] 1. Limitations of Fixed Thresholds: Existing methods typically set a globally fixed similarity threshold to reject unknown relationships. However, the semantic distribution density varies significantly across different relationship categories. For example, the expressions for "place of birth" relationships are relatively fixed (e.g., "born in", "hometown"), resulting in a compact distribution of samples in the feature space; while the expressions for "through..." relationships are highly diverse, leading to a sparse distribution of samples. Using a uniform fixed threshold can cause compact classes (requiring high thresholds) to be prone to misclassifying noise, while sparse classes (requiring low thresholds) are prone to missing positive samples, making it difficult to cover all categories.

[0005] 2. Vague boundary definition: Traditional prototype networks focus only on intra-class compactness and do not explicitly model the geometric boundaries between classes. As a result, the model is overconfident when faced with semantically similar unknown relationships and lacks an effective rejection ability.

Summary of the Invention

[0006] The main objective of this invention is to overcome the problem that existing technologies using fixed thresholds cannot adapt to differences in multimodal distributions, and to provide an open-domain relation inference method, system, electronic device, and storage medium based on a boundary-aware prototype. This invention introduces a residual boundary generator to dynamically generate geometric boundaries that adapt to the semantic distribution of each relation category, and utilizes a collaborative training strategy to optimize the boundary positions, thereby achieving accurate open-domain relation inference.

[0007] To achieve the above objectives, the present invention adopts the following technical solution:

[0008] In a first aspect, the present invention provides an open-domain relation inference method based on a boundary-aware prototype, the open-domain relation inference method comprising the following steps:

[0009] S1. Feature encoding step: natural language text containing a head entity and a tail entity is input into the Sentence-BERT model to extract an instance feature vector; at the same time, the descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and a weighted average of the extracted relation feature vectors is calculated to obtain the relation prototype vector;

[0010] S2. Dynamic boundary generation step: a boundary generator based on a residual network structure is constructed; the relation prototype vector is input into the boundary generator, which generates feature offsets relative to the relation prototype vector through a nonlinear feature transformation; the feature offsets are superimposed on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation;

[0011] S3. Co-training step: a boundary calibration loss function and a boundary repulsion loss function are constructed, and the Sentence-BERT model and the boundary generator are trained so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels;

[0012] S4. Dynamic inference step: steps S1 and S2 are performed on the natural language text to be identified and a relation label to obtain the instance feature vector, the relation prototype vector and the relation boundary vectors; the entity-relation similarity between the instance feature vector and the relation prototype vector is calculated, together with the relation-boundary similarity between the relation prototype vector and the relation boundary vectors of the same relation label; if the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.

[0013] Further, step S1 specifically includes:

[0014] S101. In the natural language text, head entity markers are inserted before and after the head entity, and tail entity markers are inserted before and after the tail entity. The marked text is then input into the Sentence-BERT model to extract a head entity feature vector and a tail entity feature vector, and the instance feature vector is obtained by their weighted summation. This step addresses the problem in traditional methods where relational features are overwhelmed by irrelevant contextual noise: inserting entity markers aggregates the contextual information surrounding each entity, thereby enhancing the expressive power of the feature vector.

[0015] The Sentence-BERT model is existing technology, proposed by Reimers N and Gurevych I in the paper "Sentence-BERT: Sentence embeddings using siamese BERT-networks", presented at the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982-3992. The model takes natural language text as input, encodes it through tokenization, vector transformation, and contextual semantic fusion, and outputs a feature vector.

[0016] S102. The descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and the relation prototype vector is calculated by weighted averaging. This step addresses the problem of insufficient descriptive information in relation representations: introducing detailed descriptive text and aliases, and aggregating them semantically by weighted averaging, allows the relation prototype vector to express the semantics of the relation more accurately.
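Steps S101 and S102 can be illustrated with a minimal sketch. It is not the patented implementation: the marker strings `[H]`, `[/H]`, `[T]`, `[/T]`, the helper names, and the weights are all hypothetical, and short toy vectors stand in for the Sentence-BERT outputs, which this text does not specify beyond their dimension.

```python
def mark_entities(text, head, tail,
                  head_marker=("[H]", "[/H]"), tail_marker=("[T]", "[/T]")):
    """Wrap the first occurrence of the head and tail entity in marker tokens.

    The marker strings are placeholders; the source does not name them.
    """
    text = text.replace(head, f"{head_marker[0]} {head} {head_marker[1]}", 1)
    text = text.replace(tail, f"{tail_marker[0]} {tail} {tail_marker[1]}", 1)
    return text


def weighted_sum(vectors, weights):
    """Element-wise weighted sum of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]


def weighted_average(vectors, weights):
    """Weighted sum normalised by the total weight."""
    total = sum(weights)
    return [x / total for x in weighted_sum(vectors, weights)]


marked = mark_entities("Steve Jobs was the founder of Apple", "Steve Jobs", "Apple")

# Toy 2-dimensional stand-ins for encoder outputs (assumed values).
head_vec, tail_vec = [1.0, 0.0], [0.0, 1.0]
instance_vec = weighted_sum([head_vec, tail_vec], [0.5, 0.5])

# Prototype from a description vector and one alias vector (assumed weights 3:1).
description_vec, alias_vec = [2.0, 0.0], [0.0, 2.0]
prototype_vec = weighted_average([description_vec, alias_vec], [3, 1])
```

In a real pipeline the toy vectors would be replaced by encoder outputs for the marked text and for each description or alias string; how the weights are chosen is left open here because the source does not state it.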

[0017] Further, in step S2, the boundary generator includes a first linear layer, a ReLU activation function, a second linear layer, and a residual connection layer, and the relation boundary vectors are generated as follows:

[0018] S201. The relation prototype vector, of dimension d, is input into the first linear layer for a linear transformation, followed by ReLU activation to extract high-dimensional semantic features. This step is the starting point of boundary generation: the nonlinear transformation maps the original prototype vector into a higher-dimensional semantic space and uncovers the deep semantics implicit in the relation prototype vector, providing a rich feature basis for the subsequent generation of accurate geometric boundaries.

[0019] S202. The high-dimensional semantic features are input into the second linear layer and mapped to a set of offset vectors, each of dimension d, so that the output dimension of the second linear layer is the number of offset vectors multiplied by d. The core design of this step is to learn offsets relative to the relation prototype vector rather than absolute positions: semantic features are transformed into offsets in geometric space through linear mapping, which allows the boundary generator to flexibly adjust the size and direction of the boundary.

[0020] S203. The relation prototype vector and the set of offset vectors are input into the residual connection layer and summed, yielding one relation boundary vector per offset vector. The residual connection layer exploits the ease of optimizing residuals to keep the generated boundary vectors semantically consistent with the prototype. Generating dedicated geometric boundaries for each relation prototype vector addresses the problem that fixed thresholds cannot adapt to complex distributions and significantly improves the model's ability to reject unknown categories.
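The structure described in S201 to S203 can be sketched in plain Python as follows. This is an illustrative forward pass only, under stated assumptions: the class name, the hidden size, the number of boundary vectors, and the random initialisation are hypothetical, and no training logic is included.

```python
import random


def linear(x, weight, bias):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) + b
            for row, b in zip(weight, bias)]


class BoundaryGenerator:
    """Linear -> ReLU -> Linear, followed by a residual add onto the prototype."""

    def __init__(self, dim, hidden_dim, num_boundaries, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.dim = dim
        self.num_boundaries = num_boundaries
        self.w1, self.b1 = init(hidden_dim, dim), [0.0] * hidden_dim
        # The second layer emits num_boundaries * dim values: one offset per boundary.
        self.w2 = init(num_boundaries * dim, hidden_dim)
        self.b2 = [0.0] * (num_boundaries * dim)

    def __call__(self, prototype):
        # S201: first linear layer followed by ReLU.
        hidden = [max(0.0, h) for h in linear(prototype, self.w1, self.b1)]
        # S202: second linear layer, reshaped into a set of offset vectors.
        flat = linear(hidden, self.w2, self.b2)
        offsets = [flat[k * self.dim:(k + 1) * self.dim]
                   for k in range(self.num_boundaries)]
        # S203: residual connection, boundary = prototype + offset.
        return [[p + o for p, o in zip(prototype, offset)] for offset in offsets]


generator = BoundaryGenerator(dim=4, hidden_dim=8, num_boundaries=3)
boundaries = generator([0.1, 0.2, 0.3, 0.4])
```

Because the generator outputs offsets rather than absolute positions, a generator whose second layer is all zeros returns boundary vectors identical to the prototype, which is the property the residual design relies on at the start of training.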

[0021] Further, the boundary calibration loss function represents the weighted average difference between the entity-relation similarity and the relation-boundary similarity, and is calculated using the following formulas:

[0022] ,

[0023] ,

[0024] where the positive samples are the instance feature vectors that have the same relation label as the relation prototype vector; similarity is computed as cosine similarity; a preset positive-sample boundary margin is applied; and ReLU denotes the rectified linear unit activation function, ReLU(z) = max(0, z), where z is an arbitrary real input. By applying geometric constraints to positive samples in the feature space, the boundary calibration loss function forces the similarity of same-class samples to be higher than the boundary similarity. This addresses the problem of loosely distributed intra-class samples, so that instance feature vectors sharing a relation label with the relation prototype vector are highly compact in the feature space.

[0025] The boundary repulsion loss represents the weighted average difference between the entity-relation similarity of negative samples and the relation-boundary similarity, and is calculated using the following formula:

[0026] ,

[0027] where the negative samples are the instance feature vectors whose relation labels differ from that of the relation prototype vector, and a preset negative-sample boundary margin is applied. The boundary repulsion loss establishes an explicit exclusion mechanism that forces the entity-relation similarity of negative samples to fall below the relation-boundary similarity. This addresses the problem of unknown noise easily entering the decision boundary in open-domain scenarios and effectively isolates the noise.
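Since the loss formulas themselves do not survive in this text, the sketch below shows only one hinge-style reading of the description: positives are penalised when their similarity to the prototype does not exceed the boundary similarity by a margin, and negatives are penalised when their similarity is not below it by a margin. The exact form, the plain (unweighted) averaging, and the function names are assumptions; the default margins 0.05 and 0.1 are the values given in Embodiment 2.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def boundary_similarity(prototype, boundaries):
    """Mean cosine similarity between a prototype and its boundary vectors."""
    return sum(cosine(prototype, b) for b in boundaries) / len(boundaries)


def calibration_loss(positives, prototype, boundaries, margin=0.05):
    """Assumed form: mean of ReLU(boundary_sim - sim(x, prototype) + margin)."""
    s_b = boundary_similarity(prototype, boundaries)
    return sum(max(0.0, s_b - cosine(x, prototype) + margin)
               for x in positives) / len(positives)


def repulsion_loss(negatives, prototype, boundaries, margin=0.1):
    """Assumed form: mean of ReLU(sim(x, prototype) - boundary_sim + margin)."""
    s_b = boundary_similarity(prototype, boundaries)
    return sum(max(0.0, cosine(x, prototype) - s_b + margin)
               for x in negatives) / len(negatives)


prototype = [1.0, 0.0]
boundaries = [[1.0, 1.0]]            # boundary similarity is 1 / sqrt(2)
inside = calibration_loss([[1.0, 0.0]], prototype, boundaries)
outside = calibration_loss([[0.0, 1.0]], prototype, boundaries)
```

Under this reading, a positive sample that already lies well inside the boundary contributes zero loss, while one outside it contributes a loss proportional to how far its similarity falls short of the boundary similarity plus the margin.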

[0028] Furthermore, step S4 is as follows:

[0029] S401. Calculate the cosine similarity between the instance feature vector of the natural language text to be recognized and each known relation prototype vector;

[0030] S402. For each relation, calculate the mean cosine similarity between the relation prototype vector and the set of relation boundary vectors generated from it, and use this mean as the dynamic threshold for that relation;

[0031] S403. Traverse all relation categories. If the cosine similarity for a relation exceeds that relation's dynamic threshold, the sample is assigned to the qualifying known relation label with the largest cosine similarity; if no relation satisfies this condition, the sample is determined to be an unknown relation.

[0032] Through this traversal, open-domain samples are screened accurately: strictly comparing the instance cosine similarity with the dynamic threshold both identifies known relationships and rejects noisy samples, thereby achieving highly robust open-domain relation inference.
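The decision procedure of S401 to S403 can be sketched as below. The function and variable names are hypothetical, the vectors are toy values, and resolving several qualifying relations by the largest cosine similarity is an interpretation of the garbled wording of S403 rather than a confirmed detail.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def infer_relation(instance, prototypes, boundaries):
    """Return the best qualifying relation label, or None for an unknown relation.

    prototypes: dict mapping label -> prototype vector
    boundaries: dict mapping label -> list of boundary vectors for that label
    """
    best_label, best_sim = None, float("-inf")
    for label, proto in prototypes.items():
        sim = cosine(instance, proto)                                   # S401
        threshold = (sum(cosine(proto, b) for b in boundaries[label])
                     / len(boundaries[label]))                          # S402
        if sim > threshold and sim > best_sim:                          # S403
            best_label, best_sim = label, sim
    return best_label


prototypes = {"founder": [1.0, 0.0], "acquisition": [0.0, 1.0]}
boundaries = {
    "founder": [[1.0, 0.5], [1.0, -0.5]],
    "acquisition": [[0.5, 1.0], [-0.5, 1.0]],
}
known = infer_relation([1.0, 0.1], prototypes, boundaries)
unknown = infer_relation([1.0, 1.0], prototypes, boundaries)
```

Each relation carries its own threshold here, so a relation with tightly clustered boundary vectors demands a higher similarity than one whose boundary vectors lie far from the prototype.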

[0033] In a second aspect, the present invention provides an open-domain relation inference system based on a boundary-aware prototype, used to execute the above-described open-domain relation inference method based on a boundary-aware prototype, wherein the open-domain relation inference system includes:

[0034] a feature encoding module, which inputs natural language text containing a head entity and a tail entity into the Sentence-BERT model to extract an instance feature vector and, at the same time, inputs the descriptive text and aliases of each relation label into the Sentence-BERT model to extract relation feature vectors, a weighted average of which is calculated to obtain the relation prototype vector;

[0035] a dynamic boundary generation module, which constructs a boundary generator based on a residual network structure, inputs the relation prototype vector into the boundary generator to generate feature offsets relative to the relation prototype vector through a nonlinear feature transformation, and superimposes the feature offsets on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation;

[0036] a collaborative training module, which constructs a boundary calibration loss function and a boundary repulsion loss function and trains the Sentence-BERT model and the boundary generator so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels;

[0037] a dynamic inference module, which performs steps S1 and S2 on the natural language text to be recognized and a relation label to obtain the instance feature vector, the relation prototype vector and the relation boundary vectors, and calculates the entity-relation similarity between the instance feature vector and the relation prototype vector together with the relation-boundary similarity between the relation prototype vector and the relation boundary vectors of the same relation label; if the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.

[0038] In a third aspect, the present invention provides an electronic device, including a processor and a memory for storing a processor-executable program, wherein when the processor executes the program stored in the memory, it implements the above-described open-domain relation inference method based on a boundary-aware prototype.

[0039] In a fourth aspect, the present invention provides a storage medium storing a program which, when executed by a processor, implements the above-described open-domain relation inference method based on a boundary-aware prototype.

[0040] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0041] (1) The present invention provides an open domain relation inference method based on boundary-aware prototypes, which uses a pre-trained language model to extract features, dynamically generates geometric boundary vectors for each relation prototype through a residual boundary generator, and performs collaborative training of the model by combining boundary calibration loss and boundary repulsion loss, and finally determines the relation based on the dynamically generated boundary threshold.

[0042] (2) The residual boundary generation mechanism in this invention constructs the boundary by learning feature offsets relative to the prototype. This exploits the fact that residual learning is easy to optimize, which helps generate a geometric boundary that is semantically coherent and close to the relation prototype, thereby accurately defining the semantic range of the relation. The boundary calibration and boundary repulsion loss functions force positive samples to be distributed inside the boundary and negative samples outside it through geometric distance constraints. This explicit spatial constraint matches the fact that different relation categories have different distribution densities, enabling the model to adaptively establish a dedicated threshold for each relation. It thereby addresses the difficulty fixed thresholds have in balancing compact and sparse categories and greatly improves the ability to reject unknown noise.

[0043] (3) The present invention is based on an adaptive iteration mechanism with dynamic boundaries, which enables the model to continuously refine the feature space using high-confidence samples. In summary, this method achieves high-accuracy, highly robust and adaptive open-domain relation inference.

Description of the Drawings

[0044] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0045] Figure 1 is a flowchart of the open-domain relation inference method based on a boundary-aware prototype disclosed in an embodiment of this application;

[0046] Figure 2 is a model structure diagram of the boundary generator provided in an embodiment of this application;

[0047] Figure 3 is a structural block diagram of the open-domain relation inference method based on a boundary-aware prototype provided in an embodiment of this application;

[0048] Figure 4 is a structural block diagram of the open-domain relation inference system based on a boundary-aware prototype in Embodiment 3 of the present invention;

[0049] Figure 5 is a structural block diagram of the electronic device in Embodiment 4 of the present invention.

Detailed Implementation

[0050] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of the present application, and not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort are within the scope of protection of the present application.

[0051] In this application, a reference to an "embodiment" means that a specific feature, structure, or characteristic described in connection with that embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, explicitly and implicitly, that the embodiments described in this application can be combined with other embodiments.

[0052] Embodiment 1

[0053] This embodiment discloses an open-domain relation inference method based on a boundary-aware prototype, characterized in that the method includes the following steps:

[0054] S1. Feature encoding step: for the input natural language text "Steve Jobs was the founder of Apple", from which the relationship between "Steve Jobs" and "Apple" is to be extracted, head entity markers are first inserted before and after the head entity "Steve Jobs", and tail entity markers are inserted before and after the tail entity "Apple". The marked text is input into the Sentence-BERT model to extract the head entity feature vector and the tail entity feature vector, and the instance feature vector is obtained by their weighted summation. At the same time, the descriptive text and aliases of the relation label "founder" are input into the Sentence-BERT model to extract relation feature vectors, which are then weighted and averaged to obtain the relation prototype vector (dimension d = 768);

[0055] S2. Dynamic boundary generation step: a boundary generator based on a residual network structure is constructed, composed of a first linear layer, a nonlinear activation function, and a second linear layer connected in sequence. The relation prototype vector is input into the first linear layer, where a linear transformation followed by the ReLU activation function extracts high-dimensional semantic features; these features are then input into the second linear layer and mapped to a set of 10 offset vectors, each of dimension d. A residual connection then superimposes the offset vectors, as geometric adjustments, on the relation prototype vector, generating a set of relation boundary vectors that define the semantic scope of the relation;

[0056] S3. Co-training step: a boundary calibration loss function and a boundary repulsion loss function are constructed. The boundary calibration loss function is calculated as:

[0057] ,

[0058] ,

[0059] where the positive samples are the instance feature vectors carrying the "founder" relation label, and ReLU denotes the rectified linear unit activation function, ReLU(z) = max(0, z), where z is an arbitrary real input;

[0060] The boundary repulsion loss is calculated as: ,

[0061] where the negative samples are the instance feature vectors that do not carry the "founder" relation label.

[0062] The Sentence-BERT model and the boundary generator are trained so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and outside the scope defined by the relation boundary vectors of different relation labels;

[0063] S4. Dynamic inference step: for the natural language text to be identified, "Bill Gates founded Microsoft", and the relation label "acquisition", steps S1 and S2 are executed to obtain the instance feature vector, the relation prototype vector and the relation boundary vectors. The entity-relation similarity between the instance feature vector and the relation prototype vector is calculated, together with the relation-boundary similarity between the relation prototype vector and the relation boundary vectors of the same relation label. If the entity-relation similarity is less than the relation-boundary similarity, the natural language text is determined not to belong to the relation label.

[0064] Embodiment 2

[0065] Referring to steps S1 to S4 of the open domain relation inference method based on boundary-aware prototype disclosed in Embodiment 1, this embodiment conducts a 5-fold cross-validation experiment on the FewRel dataset, where each fold contains 80 training relations and 15 unseen OOD (Out-of-Distribution) relations to simulate a real open domain scenario.

[0066] Model parameter settings: the Sentence-BERT model is used, with feature dimension d = 768; the boundary generator includes 2 linear layers; the positive-sample boundary margin is set to 0.05 and the negative-sample boundary margin to 0.1; the learning rate is 2e-5 and the number of training epochs is 10.
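For reference, the settings of this embodiment can be gathered into a single configuration object. This is only a convenience sketch: the class and field names are hypothetical, and the values are those read from paragraphs [0065] and [0066].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as read from Embodiment 2; field names are illustrative."""
    feature_dim: int = 768                # Sentence-BERT feature dimension d
    generator_linear_layers: int = 2      # linear layers in the boundary generator
    positive_margin: float = 0.05         # positive-sample boundary margin
    negative_margin: float = 0.1          # negative-sample boundary margin
    learning_rate: float = 2e-5
    epochs: int = 10
    num_folds: int = 5                    # 5-fold cross-validation on FewRel
    train_relations_per_fold: int = 80
    ood_relations_per_fold: int = 15


config = TrainingConfig()
```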

[0067] Table 1. Experimental results comparing the model with the baseline model (AlignRE)

[0068]

[0069] As can be seen from the comparison with the baseline model in Table 1 above, the baseline model, which uses a fixed threshold, achieved an F1 score of 0.7794 on the test set, but its ability to reject out-of-distribution (OOD) samples was weak (an OOD rejection F1 score of only 0.3606), indicating a limited ability to filter unknown noise. In contrast, the boundary-aware prototype-based method disclosed in this embodiment establishes a dedicated threshold for each relation through the dynamic boundary mechanism, raising the F1 score on the test set to 0.8202 and the OOD rejection F1 score to 0.5359. This shows that, while maintaining high accuracy on known relations, the method substantially enhances robustness to open-domain noise and addresses the technical challenge that a fixed threshold cannot accommodate both compact and sparse categories.
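The size of the reported improvement can be checked with simple arithmetic on the four F1 scores quoted in paragraph [0069]. The variable and function names below are illustrative only.

```python
# F1 scores quoted in paragraph [0069].
baseline_test_f1, proposed_test_f1 = 0.7794, 0.8202
baseline_ood_f1, proposed_ood_f1 = 0.3606, 0.5359


def gains(baseline, proposed):
    """Return (absolute gain, relative gain in percent) over the baseline."""
    absolute = proposed - baseline
    return absolute, 100.0 * absolute / baseline


test_abs, test_rel = gains(baseline_test_f1, proposed_test_f1)
ood_abs, ood_rel = gains(baseline_ood_f1, proposed_ood_f1)

print(f"test F1:          +{test_abs:.4f} ({test_rel:.1f}% relative)")
print(f"OOD rejection F1: +{ood_abs:.4f} ({ood_rel:.1f}% relative)")
```

The absolute gains are 0.0408 on the test set and 0.1753 on OOD rejection, so the relative improvement in OOD rejection (about 48.6%) is far larger than the improvement on known relations (about 5.2%).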

[0070] Embodiment 3

[0071] Referring to Figure 4, this embodiment provides an open-domain relation inference system based on a boundary-aware prototype, used to execute the open-domain relation inference method based on a boundary-aware prototype disclosed in the above embodiment. The system includes a feature encoding module 401, a dynamic boundary generation module 402, a collaborative training module 403, and a dynamic inference module 404 connected sequentially, wherein:

[0072] The feature encoding module 401 inputs natural language text containing a head entity and a tail entity into the Sentence-BERT model to extract an instance feature vector; at the same time, it inputs the descriptive text and aliases of each relation label into the Sentence-BERT model to extract relation feature vectors, and calculates a weighted average of the extracted relation feature vectors to obtain the relation prototype vector;

[0073] The dynamic boundary generation module 402 constructs a boundary generator based on a residual network structure, inputs the relation prototype vector into the boundary generator to generate feature offsets relative to the relation prototype vector through a nonlinear feature transformation, and superimposes the feature offsets on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation;

[0074] The collaborative training module 403 constructs a boundary calibration loss function and a boundary repulsion loss function, and trains the Sentence-BERT model and the boundary generator so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels;

[0075] The dynamic inference module 404 performs steps S1 and S2 on the natural language text to be recognized and a relation label to obtain the instance feature vector, the relation prototype vector and the relation boundary vectors, and calculates the entity-relation similarity between the instance feature vector and the relation prototype vector together with the relation-boundary similarity between the relation prototype vector and the relation boundary vectors of the same relation label; if the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.

[0076] Embodiment 4

[0077] This embodiment provides an electronic device, which can be a computer. As shown in Figure 5, it includes a processor 502, a memory, an input device 503, a display 504, and a network interface 505 connected via a system bus 501. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium 506 and an internal memory 507. The non-volatile storage medium 506 stores the operating system, computer programs, and a database. The internal memory 507 provides an environment for running the operating system and computer programs stored in the non-volatile storage medium. When the processor 502 executes the computer program stored in the memory, it implements the open-domain relation inference method based on a boundary-aware prototype proposed in Embodiment 1, which includes the following steps:

[0078] S1. Feature encoding step: natural language text containing a head entity and a tail entity is input into the Sentence-BERT model to extract an instance feature vector; at the same time, the descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and a weighted average of the extracted relation feature vectors is calculated to obtain the relation prototype vector;

[0079] S2. Dynamic boundary generation step: a boundary generator based on a residual network structure is constructed; the relation prototype vector is input into the boundary generator, which generates feature offsets relative to the relation prototype vector through a nonlinear feature transformation; the feature offsets are superimposed on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation;

[0080] S3. Co-training step: a boundary calibration loss function and a boundary repulsion loss function are constructed, and the Sentence-BERT model and the boundary generator are trained so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels;

[0081] S4. Dynamic inference step: steps S1 and S2 are performed on the natural language text to be identified and a relation label to obtain the instance feature vector, the relation prototype vector and the relation boundary vectors; the entity-relation similarity between the instance feature vector and the relation prototype vector is calculated, together with the relation-boundary similarity between the relation prototype vector and the relation boundary vectors of the same relation label; if the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.

[0082] Example 5

[0083] This embodiment provides a storage medium, which is a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the open-domain relation inference method based on a boundary-aware prototype proposed in Embodiment 1 above. The open-domain relation inference method includes the following steps:

[0084] S1. Feature encoding step: the natural language text containing a head entity and a tail entity is input into the Sentence-BERT model to extract an instance feature vector. At the same time, the descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and a weighted average of the extracted relation feature vectors is calculated to obtain the relation prototype vector;

[0085] S2. Dynamic boundary generation step: a boundary generator based on a residual network structure is constructed. The relation prototype vector is input into the boundary generator, which generates feature offsets relative to the relation prototype vector through a nonlinear feature transformation; the feature offsets are superimposed on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation;

[0086] S3. Co-training step: a boundary calibration loss function and a boundary repulsion loss function are constructed, and the Sentence-BERT model and the boundary generator are trained so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels;

[0087] S4. Dynamic inference step: steps S1 and S2 are performed on the natural language text to be identified and the relation labels to obtain the instance feature vector, the relation prototype vectors, and the relation boundary vectors. The entity-relation similarity between the instance feature vector and a relation prototype vector is calculated, as is the relation-boundary similarity between that relation prototype vector and the relation boundary vectors of the same relation label. If the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.
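The exact loss formulas of step S3 are not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes both losses are margin-based hinge terms built from ReLU: the calibration term pushes a positive sample's entity-relation similarity above the relation-boundary similarity by a positive margin, and the repulsion term pushes a negative sample's similarity below it by a negative margin. The uniform averaging, the default margin values, and all function and parameter names are hypothetical.

```python
def relu(z):
    """Rectified linear unit: max(0, z) for any real z."""
    return max(0.0, z)

def boundary_calibration_loss(entity_sims, boundary_sims, pos_margin=0.1):
    """Assumed form: mean of ReLU(boundary_sim - entity_sim + pos_margin).

    entity_sims[i] is the cosine similarity between a positive instance and
    its relation prototype; boundary_sims[i] is that relation's boundary
    similarity. The term is zero once the instance sits inside the boundary
    by at least the margin.
    """
    terms = [relu(b - s + pos_margin) for s, b in zip(entity_sims, boundary_sims)]
    return sum(terms) / len(terms)

def boundary_repulsion_loss(neg_sims, boundary_sims, neg_margin=0.1):
    """Assumed form: mean of ReLU(neg_sim - boundary_sim + neg_margin).

    neg_sims[i] is the cosine similarity between an instance and a prototype
    of a different relation label. The term is zero once the instance sits
    outside that relation's boundary by at least the margin.
    """
    terms = [relu(s - b + neg_margin) for s, b in zip(neg_sims, boundary_sims)]
    return sum(terms) / len(terms)

def total_loss(pos_sims, pos_bounds, neg_sims, neg_bounds):
    """Unweighted sum of the two terms; the actual weighting is not given here."""
    return (boundary_calibration_loss(pos_sims, pos_bounds)
            + boundary_repulsion_loss(neg_sims, neg_bounds))
```

Under these assumptions, a positive sample already well inside its boundary and a negative sample already well outside contribute nothing, so gradient updates would concentrate on samples near the boundary.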

[0088] It should be noted that, for the sake of simplicity, the aforementioned method embodiments are all described as a series of actions. However, those skilled in the art should understand that the present invention is not limited to the described order of actions, because according to the present invention, some steps can be performed in other orders or simultaneously.

[0089] The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the above embodiments. Any changes, modifications, substitutions, combinations, or simplifications made without departing from the spirit and principle of the present invention shall be considered equivalent substitutions and shall be included within the protection scope of the present invention.

Claims

1. An open-domain relation inference method based on a boundary-aware prototype, characterized in that the open-domain relation inference method includes the following steps: S1. Feature encoding step: the natural language text containing a head entity and a tail entity is input into the Sentence-BERT model to extract an instance feature vector; at the same time, the descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and a weighted average is calculated to obtain the relation prototype vector; S2. Dynamic boundary generation step: a boundary generator based on a residual network structure is constructed; the relation prototype vector is input into the boundary generator, which generates feature offsets relative to the relation prototype vector through a nonlinear feature transformation, and the feature offsets are superimposed on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation; S3. Co-training step: a boundary calibration loss function and a boundary repulsion loss function are constructed, and the Sentence-BERT model and the boundary generator are trained so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels; S4. Dynamic inference step: steps S1 and S2 are performed on the natural language text to be identified and the relation labels to obtain the instance feature vector, the relation prototype vectors, and the relation boundary vectors; the entity-relation similarity between the instance feature vector and a relation prototype vector is calculated, as is the relation-boundary similarity between that relation prototype vector and the relation boundary vectors of the same relation label; if the entity-relation similarity is greater than the relation-boundary similarity, the natural language text is determined to belong to the relation label; otherwise, it is determined not to belong to the relation label.

2. The open-domain relation inference method based on a boundary-aware prototype according to claim 1, characterized in that step S1 specifically includes: S101. in the natural language text, head entity markers are inserted before and after the head entity and tail entity markers are inserted before and after the tail entity; the marked text is then input into the Sentence-BERT model to extract a head entity feature vector and a tail entity feature vector, and the instance feature vector is obtained by their weighted summation; S102. the descriptive text and aliases of each relation label are input into the Sentence-BERT model to extract relation feature vectors, and the relation prototype vector is calculated by weighted averaging.

3. The open-domain relation inference method based on a boundary-aware prototype according to claim 1, characterized in that, in step S2, the boundary generator includes a first linear layer, a ReLU activation function, a second linear layer, and a residual connection layer, and the relation boundary vectors are generated as follows: S201. the relation prototype vector, which has a given dimension, is input into the first linear layer for a linear transformation and is then processed by the ReLU activation function to extract high-dimensional semantic features; S202. the high-dimensional semantic features are input into the second linear layer and mapped to a set of offset vectors, the size of the set being fixed by the output dimension of the second linear layer and each element of the set being one offset; S203. the relation prototype vector and the set of offset vectors are input into the residual connection layer and summed to obtain a set of relation boundary vectors, each relation boundary vector being the sum of the relation prototype vector and the corresponding offset.

4. The open-domain relation inference method based on a boundary-aware prototype according to claim 1, characterized in that the boundary calibration loss function represents the weighted average difference between the entity-relation similarity and the relation-boundary similarity. Its formula is defined in terms of the following quantities: the instance feature vectors that have the same relation label as the relation prototype vector; the cosine similarity; the preset positive-sample boundary margin; and the rectified linear unit (ReLU) activation function, ReLU(z) = max(0, z), where z is any real number.

5. The open-domain relation inference method based on a boundary-aware prototype according to claim 4, characterized in that the boundary repulsion loss represents the weighted average difference between the entity-relation similarity of negative samples and the relation-boundary similarity. Its formula is defined in terms of the instance feature vectors whose relation labels differ from that of the relation prototype vector and the preset negative-sample boundary margin.

6. The open-domain relation inference method based on a boundary-aware prototype according to claim 1, characterized in that step S4 proceeds as follows: S401. the cosine similarity between the instance feature vector of the natural language text to be recognized and every known relation prototype vector is calculated; S402. for each relation, the mean cosine similarity between its relation prototype vector and the set of relation boundary vectors generated from that prototype is calculated and used as the dynamic threshold for that relation; S403. all relation categories are traversed; if a relation's similarity exceeds its dynamic threshold, the sample is classified as the known relation label R with the largest similarity; if no relation satisfies this condition, the sample is judged to be an unknown relation.

7. An open-domain relation inference system based on a boundary-aware prototype, used to execute the open-domain relation inference method based on a boundary-aware prototype as described in any one of claims 1 to 6, characterized in that the open-domain relation inference system includes: a feature encoding module, which inputs the natural language text containing a head entity and a tail entity into the Sentence-BERT model to extract an instance feature vector and, at the same time, inputs the descriptive text and aliases of each relation label into the Sentence-BERT model to extract relation feature vectors and calculates a weighted average to obtain the relation prototype vector; a dynamic boundary generation module, which constructs a boundary generator based on a residual network structure, inputs the relation prototype vector into the boundary generator to generate feature offsets relative to the relation prototype vector through a nonlinear feature transformation, and superimposes the feature offsets on the relation prototype vector to generate a set of relation boundary vectors that define the semantic scope of the relation; a collaborative training module, which constructs a boundary calibration loss function and a boundary repulsion loss function and trains the Sentence-BERT model and the boundary generator so that instance feature vectors are distributed within the scope defined by the relation boundary vectors of the same relation label and, at the same time, outside the scope defined by the relation boundary vectors of different relation labels; and a dynamic inference module, which performs steps S1 and S2 on the natural language text to be recognized and the relation labels to obtain the instance feature vector, the relation prototype vectors, and the relation boundary vectors, calculates the entity-relation similarity between the instance feature vector and a relation prototype vector and the relation-boundary similarity between that relation prototype vector and the relation boundary vectors of the same relation label, and, if the entity-relation similarity is greater than the relation-boundary similarity, determines that the natural language text belongs to the relation label and otherwise determines that it does not belong to the relation label.

8. An electronic device comprising a processor and a memory for storing a processor-executable program, characterized in that, when the processor executes the program stored in the memory, it implements the open-domain relation inference method based on a boundary-aware prototype as described in any one of claims 1 to 6.

9. A storage medium storing a program, characterized in that, when the program is executed by a processor, it implements the open-domain relation inference method based on a boundary-aware prototype as described in any one of claims 1 to 6.
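For readers who want a concrete picture of the multi-relation decision rule in claim 6 (S401 to S403), a minimal Python sketch follows. It assumes that, among the relations whose threshold is exceeded, the label with the largest instance-prototype cosine similarity is chosen. The function name, the dictionary layout, the "UNKNOWN" label, and the example relation names are hypothetical and do not come from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def infer_relation(instance, relations, unknown_label="UNKNOWN"):
    """Return the best accepted relation label, or unknown_label if none passes.

    relations maps a label to a pair (prototype, list of boundary vectors).
    """
    best_label, best_sim = unknown_label, -math.inf
    for label, (prototype, boundaries) in relations.items():
        # S401: similarity between the instance and this relation's prototype.
        sim = cosine(instance, prototype)
        # S402: dynamic threshold = mean prototype-to-boundary similarity.
        threshold = sum(cosine(prototype, b) for b in boundaries) / len(boundaries)
        # S403: keep the accepted relation with the largest similarity.
        if sim > threshold and sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

if __name__ == "__main__":
    # Toy two-dimensional prototypes and boundaries, for illustration only.
    toy = {
        "founded_by": ([1.0, 0.0], [[1.0, 1.0]]),
        "born_in": ([0.0, 1.0], [[1.0, 1.0]]),
    }
    print(infer_relation([1.0, 0.1], toy))
    print(infer_relation([-1.0, -1.0], toy))
```

An instance that clears no relation's threshold falls through to the unknown label, which is how the rule rejects relations outside the known set.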