Text entity relationship joint extraction method and system based on multi-task learning and long-distance dependence
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: CHONGQING UNIV OF POSTS & TELECOMM
- Filing Date: 2026-03-13
- Publication Date: 2026-06-19
Figure CN122242507A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of natural language processing and artificial intelligence, specifically to a method and system for joint extraction of text entity relations based on multi-task learning and long-distance dependency.
Background Technology
[0002] With the rapid development of internet finance, the volume of financial data (such as research reports, announcements, and news) is growing explosively. Constructing financial knowledge graphs is key to extracting value from this unstructured data. However, financial texts are highly technical, with complex sentence structures and many long, nested, and difficult sentences. Existing technologies mainly suffer from the following shortcomings:
1. Weak ability to capture long-distance dependencies: In financial texts, the two entities constituting a relationship are often far apart, with numerous modifiers in between. Traditional CNN or ordinary RNN / LSTM models struggle to capture the semantic relationships between entities over long distances, leading to "entity pair loss."
2. Error propagation in pipelined models: Traditional methods separate named entity recognition (NER) and relation extraction (RE) into two steps. Errors in NER directly cause RE failures, and relation information cannot be fed back to improve entity recognition.
3. Feature entanglement under complex overlapping relationships: Financial and other professional texts exhibit dense single entity overlap (SEO) and entity pair overlap (EPO) phenomena. Traditional joint extraction models often employ a single shared feature pool and a single-channel classifier. Different semantic relations are highly entangled in the same feature space, which makes the model prone to classifier mutual exclusion collapse in complex contexts where the same entity participates in multiple relations, so that many valid relations are missed.
Summary of the Invention
[0003] To address the problems existing in the prior art, this invention proposes a text relation joint extraction method based on multi-task learning and feature decoupling. The method includes: acquiring unstructured text data to be processed, cleaning and preprocessing the text data; and inputting the preprocessed text data into a trained text entity relation extraction model to obtain text entity relations.
[0004] Training the text entity relation extraction model includes: acquiring a training dataset containing unstructured text data and entity and relation labels; cleaning and preprocessing the data in the training dataset; inputting the preprocessed text data into a shared encoding layer, which includes a pre-trained BERT module and a CA-ON-LSTM module; obtaining word-level contextual dynamic semantic vectors and global semantic classification label vectors through the BERT module; inputting the contextual semantic vectors and global semantic classification label vectors into the CA-ON-LSTM module; and unsupervisedly inducing a hierarchical dependency syntax tree of the text by introducing a master gating mechanism with a global context bias term to generate a hierarchical syntactic feature encoding sequence that filters redundant modifiers; inputting the hierarchical syntactic feature encoding sequence into a multi-task output layer, which simultaneously performs entity boundary recognition based on an entity boundary awareness mechanism and feature decoupling relation classification based on the orthogonal fusion of global multi-head attention and local multi-scale convolution to obtain triples containing entities and text relations; constructing a model loss function based on text relations; adjusting the model parameters; and completing model training when the loss function converges.
[0005] This invention also provides a text relation joint extraction system based on multi-task learning and feature decoupling. The system includes: a data acquisition and preprocessing unit, a semantic and syntactic joint encoding unit, a multi-task feature decoupling inference unit, and a knowledge graph construction unit.
[0006] The data acquisition and preprocessing unit is used to acquire multi-source texts of news, research reports, and announcements and perform standardized processing.
[0007] The semantic and syntactic joint encoding unit incorporates BERT and CA-ON-LSTM models to simultaneously extract the contextual semantic features and hierarchical syntactic structure features of the text.
[0008] The multi-task feature decoupling inference unit has a built-in dual-channel network architecture, which is used to simultaneously perform boundary-aware entity recognition and feature decoupling-based entity overlap relationship classification, and output structured triples.
[0009] The knowledge graph construction unit is used to perform entity alignment and knowledge fusion on the extracted results, and update the knowledge graph of a specific domain.
[0010] The beneficial effects of this invention are:
[0011] First, this invention introduces a context-adaptive ordered long short-term memory network (CA-ON-LSTM) in the shared coding layer, utilizing global semantic priors extracted by BERT to guide the ordering and gating mechanism of neurons. This mechanism breaks the limitations of traditional flattened sequences, simulating the tree structure of natural language, automatically filtering redundant local modification noise in long and complex sentences, and locking core entities in high-dimensional neurons for lossless long-range transmission. This effectively captures long-distance dependencies between entities with large spans, significantly reducing the problem of entity pair loss in long text scenarios.
[0012] Second, this invention constructs a dual-channel network architecture based on multi-view feature decoupling at the private layer decoding end, completely breaking through the single-dimensional feature matching bottleneck of traditional classifiers. By fusing the macroscopic independent subspace projection capability of multi-head attention with the microscopic local trigger word capture capability of multi-scale convolutional neural networks, orthogonal fusion of heterogeneous features is achieved. This design perfectly dismantles the semantic entanglement phenomenon that occurs in the latent space of highly homologous entity pairs, enabling the model to accurately peel off and extract highly dense single entity overlap (SEO) and entity pair overlap (EPO) relationships.
[0013] Third, this invention employs a multi-task learning framework to jointly optimize entity recognition and relation extraction, utilizing shared underlying features to enable the model to simultaneously learn the interaction information of entities and relations. Simultaneously, by combining Conditional Random Fields (CRF) and a joint loss function, the gradient backpropagation of the relation extraction task can feed back into the entity boundary localization, effectively avoiding the cascading error propagation problem of traditional pipeline models and greatly improving the feature extraction accuracy and generalization ability of the model in complex vertical domains (such as the financial field).
Attached Figure Description
[0014] Figure 1 is a schematic diagram of the overall process of the method of the present invention;
[0015] Figure 2 is a diagram of the deep neural network architecture of the joint extraction model in this invention.
Detailed Implementation
[0016] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0017] As shown in Figure 1 and Figure 2, a method for joint extraction of text entity relations based on multi-task learning and long-distance dependency includes: acquiring unstructured text data to be processed, and cleaning and preprocessing the text data; and inputting the preprocessed text data into a trained text entity relation extraction model to obtain text relations. Training the text entity relation extraction model includes: acquiring a training dataset, wherein the training dataset contains unstructured text data together with entity labels and relation labels; cleaning and preprocessing the data in the training dataset; inputting the preprocessed text data into a shared encoding layer to obtain an encoding sequence; inputting the encoding sequence into a multi-task output layer to obtain text relations; constructing a model loss function based on the text relations; adjusting the model parameters; and completing the model training when the loss function converges.
[0018] To address the issues of lost entity pair dependencies due to excessively long sentences and numerous modifiers in text, and classifier collapse caused by overlap of single entities and entity pairs in complex contexts, this embodiment employs a multi-task learning framework of "CA-ON-LSTM shared layer + MV-FDD dual-channel private layer," simultaneously outputting entity labels and relation categories. Specifically, it includes:
[0019] Step 1: Data acquisition and preprocessing.
[0020] Data Acquisition: Obtain unstructured text data from terminals or web crawlers, including company announcements, financial news, research reports, etc. Data Cleaning: Remove HTML tags, garbled characters, and non-text symbols. Sequence Labeling: Use the BIO (Begin, Inside, Outside) or BIOES labeling system to label the boundaries and types of entities in the text. Relationship Labeling: Label the relationship categories of entity pairs in the text (e.g., "investment," "belongs to," "competition," "no relationship"). Dataset Construction: Divide the cleaned text sequence S and its corresponding entity label sequence and relationship label sequence into training set, validation set, and test set.
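The cleaning and sequence labeling steps of Step 1 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `clean_text` and `bio_tags`, the regular expressions, and the representation of entities as token spans are all assumptions made for illustration.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")
# Control characters and the Unicode replacement character (typical garbled debris).
DEBRIS_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffd]")
WS_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip HTML tags, decode HTML entities, drop debris, collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    text = DEBRIS_RE.sub("", text)
    return WS_RE.sub(" ", text).strip()


def bio_tags(tokens, spans):
    """Build a BIO tag sequence.

    spans: list of (start, end_exclusive, entity_type) token spans (assumed format).
    """
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


if __name__ == "__main__":
    sentence = clean_text("<p>Acme&nbsp;Corp invests in <b>Beta Ltd</b>\ufffd</p>")
    tokens = sentence.split()
    print(tokens)
    print(bio_tags(tokens, [(0, 2, "ORG"), (4, 6, "ORG")]))
```

Whitespace tokenization is used here only to keep the sketch short; a real pipeline for the BERT layer would use the tokenizer that matches the pre-trained model.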
[0021] Step 2: Construct a joint extraction model based on multi-task learning and feature decoupling. The model architecture in this embodiment includes the following four layers:
[0022] (1) BERT Embedding Layer
[0023] The input text sequence is fed into a pre-trained BERT model. BERT utilizes a multi-head self-attention mechanism to obtain word-level contextual dynamic semantic representations, outputting one feature vector for each position of the input sequence. Simultaneously, the classification tag vector ([CLS] token vector), which condenses the macro-level semantics of the entire text, is extracted from the top-level output of BERT. This vector serves as the global context bias term in the subsequent layer.
[0024] (2) CA-ON-LSTM Syntax-Aware Shared Layer
[0025] To address the problem of long-distance dependencies, this invention integrates a Context-Adaptive ON-LSTM (CA-ON-LSTM) network after BERT. The principle is as follows. Traditional LSTM networks are structurally flat: each neuron forgets and writes information independently, with no ordering among neurons. This mechanism cannot distinguish between core information (such as subject, verb, and object) and subordinate information (such as modifiers and adverbs). When processing long texts containing many nested clauses, high-level semantics are therefore overwritten by low-level local noise, and long-distance dependencies break down. To overcome this shortcoming, the ON-LSTM network introduces a master forget gate and a master input gate. Through a cumulative softmax function (cumax), neurons are given an explicit hierarchical meaning: neurons with smaller indices update more frequently and record short-term local modifiers, while neurons with larger indices update less frequently and retain long-distance core information, so that the network implicitly learns the tree-like grammatical structure of the text. However, conventional ON-LSTM relies solely on historical states within a local sliding window to generate gating signals and lacks a global perspective. Therefore, this invention further proposes CA-ON-LSTM, which constructs a top-down global guidance mechanism. This mechanism injects the global semantic vector extracted by BERT into the computation of the master forget gate and the master input gate. The network computes a joint linear transformation of the current input, the historical state, and the global prior, so that the transformation incorporates a global context bias. The result of the joint transformation is then activated using the cumax function to generate master gate signals with strict hierarchical constraints. Through the above mechanism, a hierarchical syntactic feature encoding sequence that filters redundant modifiers is generated. This layer serves as a shared foundation for multi-task learning, providing clean semantic and syntactic features for both the NER and RE tasks.
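The gating mechanism described above can be sketched in plain Python. The `cumax` function and the cell update follow the published ON-LSTM formulation; the way the global vector `g` enters the gates (an extra projection `V g` added to the usual affine map) is an assumption, as are all names (`affine`, `master_gates`, `cell_update`) and the list-of-rows matrix layout. The original equations of this paragraph are not available, so this is one plausible reading rather than the patent's exact formulas.

```python
import math


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def cumax(z):
    """Cumulative sum of softmax: non-decreasing values that end at 1."""
    out, total = [], 0.0
    for p in softmax(z):
        total += p
        out.append(total)
    return out


def affine(W, U, V, b, x, h_prev, g):
    """W x + U h_prev + V g + b, with matrices stored as lists of rows.

    The V g term is the assumed form of the global context bias.
    """
    dot = lambda row, vec: sum(r * v for r, v in zip(row, vec))
    return [dot(W[k], x) + dot(U[k], h_prev) + dot(V[k], g) + b[k]
            for k in range(len(b))]


def master_gates(params_f, params_i, x, h_prev, g):
    """Return (master forget gate, master input gate, overlap mask)."""
    f_master = cumax(affine(*params_f, x, h_prev, g))                     # increasing
    i_master = [1.0 - v for v in cumax(affine(*params_i, x, h_prev, g))]  # decreasing
    overlap = [f * i for f, i in zip(f_master, i_master)]
    return f_master, i_master, overlap


def cell_update(c_prev, c_hat, f, i, f_master, i_master, overlap):
    """ON-LSTM cell update: standard gates f, i act only inside the overlap."""
    out = []
    for k in range(len(c_prev)):
        f_hat = f[k] * overlap[k] + (f_master[k] - overlap[k])
        i_hat = i[k] * overlap[k] + (i_master[k] - overlap[k])
        out.append(f_hat * c_prev[k] + i_hat * c_hat[k])
    return out
```

Where the master forget gate is close to 1 and the master input gate close to 0, the previous cell state is copied through unchanged, which corresponds to the high-level neurons that carry long-distance information; the opposite setting forces a rewrite, and the overlap region falls back to an ordinary LSTM update.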
[0026] (3) Named Entity Recognition (NER) Private Layer
[0027] This branch is used to identify the boundaries and type of an entity.
[0028] Boundary-aware sequence feature enhancement: The output of the shared layer is fed into a bidirectional long short-term memory (BiLSTM) network to further enrich the contextual representation at each position. To capture entity boundaries more accurately, for the hidden state at each time step, its difference vectors with respect to the hidden states of the previous and next time steps are computed. The original features are concatenated with the two difference vectors to explicitly encode boundary change information. The enhanced features are mapped to the label space and fed into a Conditional Random Field (CRF) layer, which learns the transition rules between labels (e.g., "I-ORG" cannot directly follow "B-PER"), and finally the optimal entity label sequence is output.
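The boundary-aware enhancement can be sketched as follows. The BiLSTM hidden states are taken as given (a list of vectors). The sign convention of the differences (current minus neighbour) and the zero difference used at the two sequence ends are assumptions, and `boundary_features` is a hypothetical name.

```python
def boundary_features(hidden):
    """Concatenate each hidden state with its differences to both neighbours.

    hidden: list of T vectors (lists of floats) of equal width d.
    Returns T vectors of width 3 * d: [h_t, h_t - h_{t-1}, h_t - h_{t+1}].
    """
    T = len(hidden)
    out = []
    for t in range(T):
        h = hidden[t]
        # At the sequence ends the missing neighbour is replaced by h itself,
        # which yields a zero difference vector (an assumed padding choice).
        prev_h = hidden[t - 1] if t > 0 else h
        next_h = hidden[t + 1] if t < T - 1 else h
        d_prev = [a - b for a, b in zip(h, prev_h)]
        d_next = [a - b for a, b in zip(h, next_h)]
        out.append(h + d_prev + d_next)
    return out


if __name__ == "__main__":
    states = [[1.0, 2.0], [4.0, 6.0], [4.0, 6.0]]
    for row in boundary_features(states):
        print(row)
```

A large difference vector marks a position where the representation changes sharply, which is the signal the subsequent label projection and CRF layer can use to place entity boundaries.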
[0029] (4) Relation Extraction (RE) Private Layer
[0030] This branch is used to determine the relationships between entities. To address the challenge of feature entanglement caused by overlapping relationships, this invention designs a dual-channel network for multi-view feature decoupling (MV-FDD):
Channel 1 (left, global channel): Global semantic decoupling based on a multi-head attention mechanism. A multi-head self-attention mechanism maps highly homogeneous entity features into multiple independent attention subspaces. Each attention head focuses on a different relation-triggering pattern, thereby decoupling entangled relationships into independent subspaces.
Channel 2 (right, local channel): Local trigger word capture based on a multi-scale CNN. Multiple one-dimensional convolutional kernels of different sizes (e.g., window sizes of 1, 3, and 5) slide over the input sequence in parallel. After max pooling, local relation trigger word features of different granularities (e.g., single characters, three-character phrases, five-character phrases) around the entity are extracted.
Orthogonal fusion and joint decoding: The global features output by the left channel and the local features output by the right channel are orthogonally fused by element-wise addition. For any two candidate entities in the sequence, the first-token and last-token features of the head entity and of the tail entity are obtained, together with the context pooling features between the two entities. These features are concatenated and input into a multilayer perceptron (MLP) for relation classification, which outputs the probability distribution over relation categories.
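The local channel, the fusion step, and the assembly of entity-pair features can be sketched in plain Python. Several simplifications are assumptions: a uniform averaging kernel stands in for learned convolution weights, the max pooling is taken element-wise across the window sizes so that the result keeps the sequence shape needed for element-wise addition, the context between entities is max pooled, and the global channel output is taken as given. All function names are hypothetical.

```python
def conv1d_same(seq, window):
    """Same-padded 1D convolution over time with a uniform averaging kernel.

    The uniform kernel is a stand-in for learned convolution weights.
    """
    T, d = len(seq), len(seq[0])
    half = window // 2
    out = []
    for t in range(T):
        acc = [0.0] * d
        for j in range(t - half, t + half + 1):
            if 0 <= j < T:  # positions outside the sequence act as zero padding
                for c in range(d):
                    acc[c] += seq[j][c] / window
        out.append(acc)
    return out


def local_channel(seq, windows=(1, 3, 5)):
    """Multi-scale local features, max pooled element-wise across scales."""
    per_scale = [conv1d_same(seq, w) for w in windows]
    T, d = len(seq), len(seq[0])
    return [[max(scale[t][c] for scale in per_scale) for c in range(d)]
            for t in range(T)]


def fuse(global_feats, local_feats):
    """Element-wise addition of the two channels (same T x d shape)."""
    return [[g + l for g, l in zip(gv, lv)]
            for gv, lv in zip(global_feats, local_feats)]


def pair_features(feats, head, tail):
    """Concatenate boundary-token features of both entities and pooled context.

    head, tail: (first_token, last_token) inclusive spans, head before tail.
    """
    d = len(feats[0])
    between = feats[head[1] + 1:tail[0]]
    context = [max(v[c] for v in between) for c in range(d)] if between else [0.0] * d
    return feats[head[0]] + feats[head[1]] + feats[tail[0]] + feats[tail[1]] + context
```

The vector returned by `pair_features` is what would be passed to the MLP classifier. Because every candidate pair gets its own feature vector, the same entity can take part in several pairs with different predicted relations, which is how overlapping (SEO and EPO) cases are handled in this reading.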
[0031] Step 3: Multi-task joint training
[0032] The total loss function of the model is defined as a combination of the loss of the named entity recognition task and the loss of the relation extraction task.
[0033] Here, a hyperparameter λ is used to balance the weights of the two tasks. For the named entity recognition task, the negative log-likelihood loss is used. It is computed from the input sequence, the sequence of true entity labels, the set of all possible label sequences in the conditional random field, and the path score, which is composed of the emission scores and the transition scores. For the relation extraction task, the cross-entropy loss function is adopted. It is computed over the set of all candidate positive entity pairs in the text that have a true relationship and is normalized by the number of entity pairs in that set. Each entity pair consists of a head entity and a tail entity and has a true predefined relation type; in the multi-task output layer, the relation extraction branch predicts the probability that the entity pair belongs to this true relation type. The AdamW optimizer is used for end-to-end joint updates of the model parameters. During training, the entity recognition task helps the model locate boundaries more accurately, while the relation extraction task prompts the shared layers to learn deeper semantic dependencies; the two tasks complement each other.
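The two loss terms can be made concrete with a small Python sketch. The CRF negative log-likelihood is computed by brute-force enumeration of all label paths, which is only practical for tiny examples but makes the definition explicit (real implementations use the forward algorithm). The combined form `l_ner + lam * l_re` is an assumption, since the original formula is not available; other weightings are equally compatible with the description. All function names are hypothetical.

```python
import itertools
import math


def path_score(emissions, transitions, labels):
    """Sum of emission scores along the path plus transition scores between steps."""
    score = sum(emissions[t][y] for t, y in enumerate(labels))
    score += sum(transitions[a][b] for a, b in zip(labels, labels[1:]))
    return score


def crf_nll(emissions, transitions, gold):
    """-log P(gold | input) = log Z - score(gold), with Z over all label paths."""
    n_labels = len(transitions)
    scores = [path_score(emissions, transitions, p)
              for p in itertools.product(range(n_labels), repeat=len(emissions))]
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - path_score(emissions, transitions, gold)


def relation_ce(gold_probs):
    """Mean cross-entropy over positive entity pairs.

    gold_probs: predicted probability of the true relation type for each pair.
    """
    return -sum(math.log(p) for p in gold_probs) / len(gold_probs)


def joint_loss(l_ner, l_re, lam):
    """Assumed combination: NER loss plus lambda-weighted RE loss."""
    return l_ner + lam * l_re


if __name__ == "__main__":
    emissions = [[2.0, 0.0], [0.0, 1.0]]    # T=2 positions, 2 labels
    transitions = [[0.0, 0.5], [-1.0, 0.0]]
    l_ner = crf_nll(emissions, transitions, gold=(0, 1))
    l_re = relation_ce([0.9, 0.6])
    print(joint_loss(l_ner, l_re, lam=0.5))
```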
[0034] Step 4: Anomaly and conflict detection and knowledge base update.
[0035] Input the financial text to be detected into the trained model. Output the entities and triples (e.g., <Party A, Supplier, Party B>). Conflict detection: If the NER identifies an entity as a "person's name," but the RE identifies it as the subject of a "wholly-owned subsidiary" relationship, rule verification is triggered, and the one with higher confidence is used. Store the final cleaned triples in a graph database (e.g., Neo4j) to complete the construction of a domain-specific knowledge graph.
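The conflict check between the NER and RE outputs can be sketched as a simple rule pass. The rule table `ALLOWED_HEAD_TYPES`, the `Triple` data class, and the tie-breaking choice (NER wins when confidences are equal) are hypothetical; the description only states that rule verification is triggered and the higher-confidence result is used.

```python
from dataclasses import dataclass


@dataclass
class Triple:
    head: str
    relation: str
    tail: str
    confidence: float  # confidence of the RE branch for this triple


# Hypothetical rule table: relation -> entity types allowed for the head entity.
ALLOWED_HEAD_TYPES = {"wholly-owned subsidiary": {"ORG"}}


def resolve_conflicts(triples, entity_types):
    """Drop triples that violate a type rule when NER is at least as confident.

    entity_types: entity name -> (ner_type, ner_confidence).
    """
    kept = []
    for t in triples:
        allowed = ALLOWED_HEAD_TYPES.get(t.relation)
        etype, ner_conf = entity_types.get(t.head, (None, 0.0))
        conflict = allowed is not None and etype is not None and etype not in allowed
        if conflict and ner_conf >= t.confidence:
            continue  # trust the NER type and discard the conflicting triple
        kept.append(t)
    return kept


if __name__ == "__main__":
    triples = [
        Triple("Zhang San", "wholly-owned subsidiary", "Beta Ltd", 0.60),
        Triple("Acme", "wholly-owned subsidiary", "Beta Ltd", 0.90),
    ]
    types = {"Zhang San": ("PER", 0.95), "Acme": ("ORG", 0.90)}
    for t in resolve_conflicts(triples, types):
        print(t)
```

When the RE branch is the more confident one, this sketch keeps the triple as is; a fuller version would also revise the stored entity type before writing to the graph database.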
[0036] A text relation joint extraction system based on multi-task learning and feature decoupling is disclosed. The system comprises: a data acquisition and preprocessing unit, a semantic and syntactic joint encoding unit, a multi-task feature decoupling inference unit, and a knowledge graph construction unit. The data acquisition and preprocessing unit acquires multi-source texts such as news articles, research reports, and announcements and performs standardization processing. The semantic and syntactic joint encoding unit incorporates the BERT and CA-ON-LSTM models to simultaneously extract contextual dynamic semantic features and hierarchical syntactic structure features that filter out redundant modifiers. The multi-task feature decoupling inference unit incorporates a dual-channel network architecture to simultaneously perform boundary-aware entity recognition and feature-decoupled classification of overlapping entity relations based on the orthogonal fusion of global attention and local convolution, outputting structured triples. The knowledge graph construction unit performs entity alignment and knowledge fusion on the extracted results to update the domain-specific knowledge graph.
[0037] In this embodiment, the system further includes a long-tail relation enhancement module, which is used to perform weighted learning on specific relation categories with sparse samples in relation extraction tasks.
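One common way to realize weighted learning for sparse relation categories is inverse-frequency class weighting of the cross-entropy loss. The sketch below uses that scheme as an assumption; the description does not specify how the weights are computed, and the function names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each relation by n / (k * count), so rare relations weigh more.

    n is the number of samples and k the number of distinct relations.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {rel: n / (k * c) for rel, c in counts.items()}


def weighted_ce(gold_labels, gold_probs, weights):
    """Class-weighted cross-entropy, normalized by the total weight.

    gold_probs: predicted probability of the true relation for each sample.
    """
    total_w = sum(weights[y] for y in gold_labels)
    return -sum(weights[y] * math.log(p)
                for y, p in zip(gold_labels, gold_probs)) / total_w


if __name__ == "__main__":
    labels = ["investment"] * 3 + ["competition"]
    w = inverse_frequency_weights(labels)
    print(w)
    print(weighted_ce(labels, [0.9, 0.8, 0.9, 0.4], w))
```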
[0038] The specific implementation of the system of the present invention is the same as that of the method.
[0039] The above-described embodiments further illustrate the purpose, technical solution, and advantages of the present invention. It should be understood that the above-described embodiments are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made to the present invention within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for joint extraction of text entity relations based on multi-task learning and long-distance dependency, characterized in that the method includes: obtaining the unstructured text data to be processed, and cleaning and preprocessing the text data; and inputting the preprocessed text data into a trained text entity relation extraction model to obtain the text entity relations. Training the text entity relation extraction model includes: acquiring a training dataset containing unstructured text data and entity and relation labels; cleaning and preprocessing the data in the training dataset; inputting the preprocessed text data into a shared encoding layer, which includes a pre-trained BERT module and a CA-ON-LSTM module; obtaining word-level contextual dynamic semantic vectors and global semantic classification label vectors through the BERT module; inputting the contextual semantic vectors and global semantic classification label vectors into the CA-ON-LSTM module; inducing, without supervision, a hierarchical dependency syntax tree of the text by introducing a master gating mechanism with a global context bias term, so as to generate a hierarchical syntactic feature encoding sequence that filters redundant modifiers; inputting the hierarchical syntactic feature encoding sequence into a multi-task output layer, which simultaneously performs entity boundary recognition based on an entity boundary awareness mechanism and feature decoupling relation classification based on the orthogonal fusion of global multi-head attention and local multi-scale convolution to obtain triples containing entities and text relations; constructing a model loss function based on text relations; adjusting the model parameters; and completing model training when the loss function converges.
2. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 1, characterized in that text data cleaning includes: removing HTML tags, garbled characters, and non-text symbols from the text data; and text data preprocessing includes: labeling entities in the text using the BIO annotation system; labeling the relation categories of entity pairs in the text; and dividing the cleaned text sequence and its corresponding entity label sequence and relation labels into a training set and a validation set.
3. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 1, characterized in that the CA-ON-LSTM module uses the control mechanisms of the main forget gate and the main input gate to make the neuron state updates follow hierarchical constraints, thereby implicitly learning the tree-like grammatical structure of the text in the sequence model to capture long-distance dependencies between entities with large spans.
4. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 3, characterized in that, The neuron update specifically includes: extracting the global semantic classification label vector output from the top layer of the BERT module as a global context bias term; performing a linear joint transformation on the current input vector, the hidden state vector from the previous time step, and the global context bias term, and activating them through the cumulative Softmax activation function to generate a main forgetting gate vector with strictly monotonically increasing characteristics and a main input gate vector with monotonically decreasing characteristics, respectively; calculating an overlap mask term using the intersection of the main forgetting gate vector and the main input gate vector as a buffer for smoothly transitioning between high and low level boundaries in the physical space; and decomposing the cell state tensor update process into a high-level forced preservation region, a low-level forced rewriting region, and a standard update transition region based on the main forgetting gate vector, the main input gate vector, and the overlap mask term, so that high-level neurons maintain long-term stability to transmit the main semantics across paragraphs, and low-level neurons are rewritten frequently to absorb short-distance local modification features, thereby generating an encoding sequence containing hierarchical grammatical features in the hidden state space.
5. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 1, characterized in that, The multi-task output layer includes a named entity recognition branch and a relation extraction branch.
6. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 5, characterized in that, The data processing steps of the Named Entity Recognition branch include: inputting the hierarchical syntax feature encoding sequence output by the shared encoding layer into a bidirectional long short-term memory network to further extract sequence features; for the hidden state at each time step, calculating the difference vector between it and the hidden state at the previous and next time steps respectively, concatenating the original hidden state with the two difference vectors to explicitly encode entity boundary information; mapping the concatenated features to the label space, learning the transition rules between labels through a conditional random field layer, and decoding to output the optimal entity label sequence.
7. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 5, characterized in that, The process of relation extraction branch processing data includes: constructing a multi-view feature decoupling network containing a left global channel and a right local channel; the left global channel uses a multi-head self-attention mechanism to map homogeneous entity features to multiple independent attention subspaces to extract global semantic dependencies between entities; the right local channel uses multiple one-dimensional convolutional kernels with different window sizes to slide on the input sequence in parallel, and after max pooling, they are concatenated to extract local phrase trigger word features of different granularities around the entities; the global feature tensor output by the left channel and the local feature tensor output by the right channel are orthogonally fused by element-wise addition, and then the first and last features of candidate entity pairs and the contextual pooling features between entities are extracted, concatenated, and input into a multilayer perceptron to output the probability distribution of relation categories.
8. The text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in claim 1, characterized in that the model's loss function is a weighted combination of the loss function for the named entity recognition task and the loss function for the relation extraction task, where λ is a hyperparameter used to balance the weights of the two tasks.
9. A text entity relation joint extraction system based on multi-task learning and long-distance dependency, the system being used to execute the text entity relation joint extraction method based on multi-task learning and long-distance dependency as described in any one of claims 1 to 8, characterized in that, The system includes: a data acquisition and preprocessing unit, a semantic and syntactic joint encoding unit, a multi-task joint reasoning unit, and a knowledge graph construction unit; The data acquisition and preprocessing unit is used to acquire multi-source texts of news, research reports, and announcements and perform standardized processing. The semantic and grammatical joint encoding unit incorporates BERT and a context-adaptive ordered neuron long short-term memory network model, which is used to simultaneously extract the contextual dynamic semantic features of the text and the hierarchical grammatical structure features that filter out redundant modifications. The multi-task joint reasoning unit is used to simultaneously perform entity boundary recognition and entity relationship classification, and output structured triples; The knowledge graph construction unit is used to perform entity alignment and knowledge fusion on the extracted results and update the financial knowledge graph.
10. A text entity relation joint extraction system based on multi-task learning and long-distance dependency as described in claim 9, characterized in that, The system also includes a long-tail relationship enhancement module, which is used to perform weighted learning on financial-specific relationship categories with sparse samples in the relationship extraction task.