A process standard entity relationship extraction method fusing deep learning and dependency syntax
By combining the MacBERT-BiGRU-IDCNN-CRF model with dependency syntax, the problem of ambiguous entity boundaries in the field of process standards was solved, enabling accurate extraction of entity relationships and efficient construction of knowledge graphs, thereby improving the granularity of knowledge representation and analytical capabilities.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- KUNMING UNIV OF SCI & TECH
- Filing Date
- 2024-09-30
- Publication Date
- 2026-06-26
AI Technical Summary
In the construction of knowledge graphs in the field of process standards, entity boundaries are fuzzy and ontology engineering is complex, making it difficult for existing technologies to accurately extract entity relationships.
By employing the MacBERT-BiGRU-IDCNN-CRF model combined with dependency parsing, and through annotation, word segmentation, dependency parsing, and triplet construction rules, we can achieve accurate entity boundary segmentation and relation extraction.
It improves the accuracy of entity boundary delineation and the granularity of semantic representation of knowledge graphs, thereby enhancing the richness and analytical capabilities of knowledge graphs.
Figure CN119166754B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to a method for extracting process standard entity relations that integrates deep learning and dependency syntax, and belongs to the field of knowledge graph technology.
Background Technology
[0002] Process knowledge has a significant impact on product development, quality control, innovation enhancement, and overall competitiveness in modern enterprises, serving as a crucial intellectual resource. In the product manufacturing process of process manufacturing companies, a wealth of data on process requirements, operating procedures, and process standards is accumulated, playing a vital role in modern industrial production and directly impacting product quality and the competitiveness of the manufacturing enterprise. Therefore, constructing a knowledge graph of production process standards is of great significance for achieving digital production and controlling product quality in enterprises.
[0003] One of the key technologies for constructing knowledge graphs is extracting triple structures from unstructured data to represent semantic relationships between entities. Currently, automated construction of knowledge graphs in vertical domains still faces many challenges. On the one hand, the knowledge structure of data in vertical domains is more complex, often containing intricate ontology engineering and rule-based knowledge; on the other hand, the quality requirements for knowledge extraction in vertical domains are higher, relying heavily on joint extraction from structured and unstructured data within enterprises. Furthermore, vertical domain knowledge is highly specialized, with different domains having different proprietary vocabulary and concepts that require specialized processing and extraction. To address these issues, some scholars have proposed constructing triple relationships by combining entity dependency relationships within sentences. However, because technical standard texts are highly specialized and their entity boundaries are fuzzy, traditional word segmentation tools struggle to delineate entity boundaries accurately.
Summary of the Invention
[0004] To address the challenges posed by ambiguous entity boundaries and complex ontology engineering in the field of production process standards, this invention provides a method for extracting entity relations from process standards that integrates deep learning and dependency syntax. The method solves the problem that natural language processing tools cannot accurately delineate entity boundaries in such text. Furthermore, a triplet construction rule is proposed that is adapted to the textual characteristics of the Chinese process standards domain.
[0005] The technical solution of this invention is:
[0006] Firstly, a method for extracting entity relations of process production standards by integrating deep learning and dependency syntax is provided, including:
Step 1: Collecting unstructured text of process production standards and labeling the entities to be extracted to establish an entity relation dataset;
Step 2: Building a MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting hyperparameters, and dividing the labeled entity relation dataset into training and testing sets for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model;
Step 3: Extracting entities from the unstructured text of process production standards to be extracted using the trained deep learning model;
Step 4: Importing the extracted entities into a natural language processing tool in the form of a dictionary; performing word segmentation on the unstructured text of process production standards to be extracted; performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of the sentences;
Step 5: Based on the dependency relations of the sentences, performing hyponymy / hypernymy division on each component of the sentence to obtain hyponymy / hypernymy relations; converting the sentences into triple structures based on the triple construction rules built on the hyponymy / hypernymy relations and importing them into a graph database.
[0007] Furthermore, using text annotation tools, the BMEO annotation strategy is adopted to annotate the text, classifying the entity relationship types contained in the text into several categories: equipment, process parameters, standards, processes, methods, materials, and functions. Based on these seven categories, the unstructured text of the process production technology is annotated.
[0008] Furthermore, the MacBERT-BiGRU-IDCNN-CRF entity extraction model is constructed from a MacBERT layer, a BiGRU layer, an IDCNN layer, a feature fusion layer, a fully connected layer, and a CRF model. The MacBERT layer is a pre-trained deep bidirectional Transformer model that divides the input labeled data into sentences and characters and converts them into sentence features and character features respectively. The BiGRU layer uses two GRU layers, and the character features produced by the MacBERT layer are input into the BiGRU layer for processing. The sentence features output by the MacBERT layer are concatenated with the character features to achieve dimensionality enhancement of the sentence features, and the concatenated character features are then input into the IDCNN layer, which uses iterative dilated convolution. The character features output by IDCNN, which contain syntactic and sentence structure information, and the character features output by BiGRU, which contain contextual semantic information, are fused in the feature fusion layer using a weighted average. The fused character features are then input into the fully connected layer, which converts them into the dimension required for the output. Finally, the CRF model outputs the predicted label for each character.
[0009] Furthermore, the triplet construction rules are as follows: First, determine whether a sentence can independently construct a triplet based on its hierarchical relationship. Specifically, calculate the difference between the maximum and minimum values of hops. If the difference is greater than or equal to 2, the sentence can construct a triplet independently. In this case, the head entity and tail entity are determined based on the hierarchical relationship of the entities, and the predicate is used as the corresponding relation to construct the triplet. If the difference is less than 2, the sentence components cannot independently construct a triplet. In this case, the name of the parent item to which the sentence belongs is introduced as the head entity of the triplet, the predicate in the sentence is used as the relation, and the remaining components are used as the tail entity. Here, hops represents the distance from any word element to the root word Root.
[0010] Secondly, a system for extracting entity relations for process standards, integrating deep learning and dependency syntax, is provided. This system includes: an annotation module for performing step 1: collecting unstructured text of process standards and annotating the entities to be extracted to establish an entity relation dataset; a building module for performing step 2: building a MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting hyperparameters, and dividing the labeled entity relation dataset into training and testing sets for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model; an extraction module for performing step 3: extracting entities from the unstructured text of the process production standard to be extracted using the trained deep learning model; a module for performing step 4: importing the extracted entities into a natural language processing tool in the form of a dictionary, performing word segmentation on the unstructured text of the process production standard to be extracted, and performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of the sentences; and a transformation module for performing step 5: dividing the components of the sentences into hypernyms and hyponyms based on the dependency relations to obtain the hypernym / hyponym relations, transforming the sentences into triple structures based on the triple construction rules built on those relations, and importing them into a graph database.
[0011] Thirdly, a terminal is provided, including a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor being configured to execute the process-standard entity relation extraction method that integrates deep learning and dependency syntax as described in any one of the above.
[0012] The beneficial effects of this invention are as follows: Addressing the problem of ambiguous entity boundaries in process standard knowledge texts, this invention proposes a deep learning model, MacBERT-BiGRU-IDCNN-CRF. By training and fusing the contextual features and sentence structure information of the text, the model's ability to extract entities is enhanced, thereby improving the accuracy of entity boundary segmentation in dependency parsing tasks. Furthermore, based on the characteristics of process standard texts, a triplet construction rule is proposed. Entity relationships extracted using this method are more flexible, which can, to some extent, solve the problem of complex ontology engineering in domain knowledge graphs. Moreover, the knowledge graph constructed using this method has more refined knowledge representation, containing richer semantic information, which is helpful for subsequent descriptive data analysis and data analysis based on logical reasoning.
Attached Figure Description
[0013] Figure 1 is a flowchart of the present invention;
[0014] Figure 2 is a schematic diagram of the entity relation extraction process that integrates deep learning and dependency syntax;
[0015] Figure 3 is an example of labeling using the Label Studio annotation tool;
[0016] Figure 4 is an example of the dependency analysis results;
[0017] Figure 5 shows a partial set of triples for cigarette manufacturing process standards.
Detailed Implementation
[0018] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention. It should be noted that, unless otherwise specified, the embodiments and features in the embodiments of this application can be arbitrarily combined with each other.
[0019] Example 1: As shown in Figures 1-5, according to a first aspect of the present invention, a method for extracting entity relations of process standards by integrating deep learning and dependency syntax is provided, comprising: Step 1: collecting unstructured text of process production standards and labeling the entities to be extracted to establish an entity relation dataset; Step 2: building a MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting hyperparameters, and dividing the labeled entity relation dataset into training and testing sets for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model; Step 3: extracting entities from the unstructured text of the process production standard to be extracted using the trained deep learning model; Step 4: importing the extracted entities into a natural language processing tool in the form of a dictionary, performing word segmentation on the unstructured text of the process production standard to be extracted, and performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of the sentences; Step 5: based on the dependency relations of the sentences, performing hypernym / hyponym division on each component of the sentences to obtain the hypernym / hyponym relations, transforming the sentences into triple structures based on the triple construction rules built on the hypernym / hyponym relations, and importing them into a graph database.
[0020] Furthermore, using the Label Studio text annotation tool, the BMEO annotation strategy is adopted to annotate the text, classifying the entity relationship types contained in the text into several categories: equipment, process parameters, standards, processes, methods, materials, and functions. Based on these seven categories, the text related to process production technology is annotated.
[0021] Furthermore, the MacBERT-BiGRU-IDCNN-CRF entity extraction model is constructed from a MacBERT layer, a BiGRU layer, an IDCNN layer, a feature fusion layer, a fully connected layer, and a CRF model. The MacBERT layer is a pre-trained deep bidirectional Transformer model that divides the input labeled data into sentences and characters and converts them into sentence features and character features respectively. The BiGRU layer uses two GRU layers, and the character features produced by the MacBERT layer are input into the BiGRU layer for processing. To apply the sentence features to an entity recognition task that is based on character-level annotation, the sentence features require dimensionality enhancement: the sentence features output by the MacBERT layer are concatenated with the character features. The concatenated character features are then input into the IDCNN layer, which uses iterative dilated convolution. The character features output by IDCNN, which contain syntactic and sentence structure information, are fused with the character features output by BiGRU, which contain contextual semantic information, using a weighted average in the feature fusion layer. The fused character features are then input into a fully connected layer, which converts them into the dimension required for the output. Finally, the CRF model outputs the predicted label for each character; the label encodes both the entity boundary and the entity category.
[0022] Specifically, the main components of the MacBERT-BiGRU-IDCNN-CRF entity extraction model are described below:
[0023] The MacBERT layer described is a pre-trained deep bidirectional Transformer model. To adapt to the characteristics of Chinese text, MacBERT improves its masking strategy by using similar words instead of [MASK] for masking, thus enhancing the model's ability to represent words and sentences. Furthermore, MacBERT is a general-purpose language representation model pre-trained on large-scale text corpora. This transfer learning capability allows the model to perform well even in entity recognition tasks with smaller datasets or in specific domains. In this model, the MacBERT encoder layer is used to transform the input sentence into a sequence of character features and sentence features.
[0024] The BiGRU layer is an important variant of the Recurrent Neural Network (RNN). BiGRU uses two GRU layers to process the forward and backward information of the sequence, respectively. This structure allows it to capture richer contextual information when processing sequences, making it particularly suitable for tasks such as natural language processing, time series prediction, and speech recognition. Structurally, GRU is a simplified version of LSTM. While each LSTM unit has three gates (input gate, forget gate, and output gate), GRU has only two (update gate and reset gate). GRU therefore has a simpler structure, fewer parameters, and higher computational efficiency, while its gating still mitigates the vanishing and exploding gradient problems that occur in plain RNN structures. The specific formulas involved in BiGRU are as follows:
[0025] $r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
[0026] $z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
[0027] $\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])$, $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
[0028] In the formulas, the reset gate $r_t$ is computed from the input $x_t$ at the current position and the hidden-layer output $h_{t-1}$ of the previous position by a linear transformation followed by the sigmoid activation function $\sigma$; $W_r$ and $W_z$ are the weights of the reset gate and the update gate, respectively. Because of $\sigma$, the gate output lies between 0 and 1 and selects how much information to retain. The update gate $z_t$ is computed in the same way. The candidate state $\tilde{h}_t$ is jointly determined by the candidate-state weight $W$, the reset gate $r_t$, the previous output $h_{t-1}$, and the current input $x_t$. The final output $h_t$ at the current position is jointly determined by the candidate state $\tilde{h}_t$, the update gate $z_t$, and the previous output $h_{t-1}$. The forward and backward GRUs are computed in the same way, and the final output of BiGRU is the concatenation of the forward and backward hidden states.
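The gate computations described above can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the function names `gru_step` and `bigru`, the candidate-state weight name `W_h`, the use of tanh for the candidate state, and the convention that `z` weights the candidate state are all assumptions taken from the commonly used GRU formulation; bias terms are omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU step (assumed convention: z weights the candidate state)."""
    concat = h_prev + x_t                                  # [h_{t-1}, x_t]
    r = [sigmoid(a) for a in matvec(W_r, concat)]          # reset gate
    z = [sigmoid(a) for a in matvec(W_z, concat)]          # update gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t   # [r * h_{t-1}, x_t]
    h_cand = [math.tanh(a) for a in matvec(W_h, gated)]    # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


def bigru(xs, h0, fwd_weights, bwd_weights):
    """Run a forward and a backward GRU and concatenate their hidden states."""
    forward, h = [], h0
    for x in xs:
        h = gru_step(x, h, *fwd_weights)
        forward.append(h)
    backward, h = [], h0
    for x in reversed(xs):
        h = gru_step(x, h, *bwd_weights)
        backward.append(h)
    backward.reverse()
    return [f + b for f, b in zip(forward, backward)]


# Toy check with all-zero weights: both gates equal 0.5 and the candidate is 0,
# so each step simply halves the previous hidden state.
ZERO = [[0.0, 0.0]]  # hidden size 1, input size 1
states = bigru([[0.0], [0.0]], [1.0], (ZERO, ZERO, ZERO), (ZERO, ZERO, ZERO))
print(states)
```

Each position ends up with a vector twice the hidden size, which is the concatenation of forward and backward states described in the text.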
[0029] In entity recognition tasks, the label of each character is not only related to the character features above it, but also closely related to the features below it. Therefore, the model of this invention chooses bidirectional GRU, which can take into account the text features of the context while training character features. Moreover, GRU optimizes the model structure compared to LSTM, which can improve the training efficiency of the model while ensuring the model's learning effect on text features.
[0030] The IDCNN layer uses an iterative dilated convolution structure. Compared with a traditional convolution, a dilated convolution inserts gaps (dilations) into the kernel, which enlarges the receptive field of the convolution without increasing the number of parameters. Iterative dilated convolution further enlarges the receptive field by stacking multiple dilated layers, thereby capturing longer-range contextual information. Compared with traditional convolutional neural networks, IDCNN can therefore model long-range dependencies better without increasing the number of parameters, which improves performance in text sequence processing tasks. The formula for calculating the receptive field of IDCNN dilated convolution is as follows:
[0031] $F_{i+1} = (2^{i+1} - 1) \times (2^{i+1} - 1)$
[0032] In the formula, i is the step size. By adding an iterative convolutional neural network and taking advantage of its parallel computing characteristics, the feature vector that combines sentence features is trained, so that the trained features contain feature representations of different scales. This helps the network to understand the input data more comprehensively and improve its ability to represent the data. Feature fusion with the character feature vector trained by BiGRU can make these abstract feature representations more discriminative and generalizable, which helps the subsequent CRF layer to perform classification and labeling tasks more accurately.
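As a small numerical illustration of how the receptive field grows, the sketch below evaluates the side length given by the formula above and compares it with the width covered by stacked one-dimensional dilated convolutions. The kernel size of 3 and the dilation schedule 1, 2, 4 are assumptions chosen for illustration; the source does not state the dilation rates used.

```python
def formula_side(i):
    """Side length of the receptive field F_{i+1} as given in the formula."""
    return 2 ** (i + 1) - 1


def stacked_width(dilations, kernel_size=3):
    """Receptive-field width after each stacked 1-D dilated convolution."""
    width, widths = 1, []
    for d in dilations:
        width += (kernel_size - 1) * d  # each layer adds (k - 1) * dilation
        widths.append(width)
    return widths


# Assumed dilation schedule 1, 2, 4 with kernel size 3.
print(stacked_width([1, 2, 4]))
print([formula_side(i) for i in (1, 2, 3)])
```

Under these assumptions the two sequences coincide, while the number of kernel weights per layer stays fixed at the kernel size.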
[0033] The feature fusion layer fuses the feature vectors trained with contextual features from BiGRU with the character feature vectors trained with IDCNN that include syntactic and sentence structure features using a weighted average method. The formula for the weighted average is as follows:
[0034] $h_i = w \cdot x_i + (1 - w) \cdot y_i$
[0035] In the formula, $h_i$ represents the output hidden-layer feature vector, $x_i$ represents the vector input from the BiGRU layer, $y_i$ represents the vector input from the IDCNN layer, and $w$ is a weight factor representing the weight of the BiGRU input.
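The weighted-average fusion can be written directly from the formula. The sketch below is illustrative only: the function name `fuse` and the sample vectors are hypothetical, and the default weight of 0.5 is the weight factor reported among the hyperparameters in Example 2.

```python
def fuse(bigru_vec, idcnn_vec, w=0.5):
    """Weighted average h_i = w * x_i + (1 - w) * y_i of two equal-length vectors."""
    if len(bigru_vec) != len(idcnn_vec):
        raise ValueError("BiGRU and IDCNN features must have the same dimension")
    return [w * x + (1 - w) * y for x, y in zip(bigru_vec, idcnn_vec)]


# Hypothetical 2-dimensional character features.
print(fuse([1.0, 2.0], [3.0, 4.0]))
```

Because the fusion is element-wise, the BiGRU and IDCNN outputs must have the same dimension for each character.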
[0036] The CRF (Conditional Random Field) model is a log-linear model for sequence labeling, designed to constrain the dependencies between labels in the context. As a conditional probability distribution model, CRF can be used to represent linear chain conditional random fields, with the core principle formula as follows:
[0037] $P(y \mid x) = \frac{1}{Z(x)} \exp\left(\sum_{k} \omega_k f_k(y, x)\right)$, $Z(x) = \sum_{y'} \exp\left(\sum_{k} \omega_k f_k(y', x)\right)$
[0038] Where $x$ is the input variable, representing the observation sequence to be labeled; $y$ is the output sequence, representing the label sequence corresponding one-to-one with $x$; $f_k$ represents the feature functions; $\omega_k$ represents the weights corresponding to the feature functions; and $Z(x)$ is the normalization factor, summed over all candidate label sequences. The training set is used to obtain a conditional probability model through maximum likelihood estimation, and then the Viterbi algorithm is used to output the label sequence $y$ with the highest conditional probability for the given observation sequence.
[0039] After the feature vectors undergo dimensionality transformation in a fully connected layer, the CRF layer is responsible for decoding the predicted label sequence, realizing the transformation from features to probabilities. In this process, it considers the transition scores between labels and uses them to correct the emission scores given by the feature extraction layer. This means it considers not only the probability of individual labels but also the overall structure of the label sequence and the transitions between different labels. This allows the model to capture the correlations and constraints between labels, such as whether a certain entity category is more likely to appear after another entity category in a specific context. CRF helps ensure that the entire label sequence has a reasonable structure, thereby improving model performance.
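To make the interaction between emission scores and transition scores concrete, here is a minimal Viterbi decoding sketch. The three-label set, the score values, and the function name `viterbi` are hypothetical and chosen only to show how transition scores can overrule a locally preferred label; they are not values from the patent.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    n = len(labels)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[j] for j in path]


labels = ["O", "B", "E"]
BAD = -10.0  # hypothetical penalty for structurally invalid transitions
transitions = [
    [0.0, 0.0, BAD],   # O -> O, O -> B, O -> E
    [BAD, BAD, 0.0],   # B -> O, B -> B, B -> E
    [0.0, 0.0, BAD],   # E -> O, E -> B, E -> E
]
emissions = [
    [0.1, 2.0, 0.0],
    [1.0, 0.0, 0.8],   # "O" is locally preferred here
    [2.0, 0.0, 0.1],
]
greedy = [labels[max(range(3), key=lambda j: row[j])] for row in emissions]
print(greedy)
print(viterbi(emissions, transitions, labels))
```

The per-position choice would open an entity with B and never close it, whereas the decoded sequence closes it with E because the B to O transition is penalized.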
[0040] Furthermore, precision (P), recall (R), and F1 score were used as evaluation metrics to comprehensively evaluate the experimental results. The calculation formulas are as follows:
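[0041] $P = \frac{N_{\text{correct}}}{N_{\text{predicted}}}$, $R = \frac{N_{\text{correct}}}{N_{\text{gold}}}$, $F1 = \frac{2 \times P \times R}{P + R}$, where $N_{\text{correct}}$ is the number of correctly identified entities, $N_{\text{predicted}}$ is the number of entities identified by the model, and $N_{\text{gold}}$ is the number of annotated entities.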
[0042] It should be noted that the number of correctly identified entities in the above evaluation indicators is calculated based on "the successful prediction of both the entity type and the start and end boundaries".
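The strict matching criterion can be sketched as follows. Entities are represented here as hypothetical `(start, end, category)` tuples, so that an entity counts as correct only when the type and both boundaries match; the function name `prf1` and the sample spans are illustrative.

```python
def prf1(gold, predicted):
    """Entity-level precision, recall, and F1 with exact span-and-type matching."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    p = correct / len(pred_set) if pred_set else 0.0
    r = correct / len(gold_set) if gold_set else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical spans: the second prediction has the right type but a wrong end boundary.
gold = [(0, 4, "process parameter"), (6, 10, "equipment")]
predicted = [(0, 4, "process parameter"), (6, 9, "equipment")]
print(prf1(gold, predicted))
```

A prediction with the correct category but a boundary that is off by one character contributes nothing to the correct count, which is why boundary accuracy dominates these metrics.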
[0043] Furthermore, in step 4, regarding the dependency relationships in the sentence, every word element has a direct or indirect dependency relationship with the root word Root. Root is obtained through the interface of the natural language processing tool LTP, and the hypernym / hyponym relationships are reconstructed with Root as the origin. A single-step dependency relationship between two words is counted as one hop, and the distance from any word element to Root is denoted hops, which serves as the basis for determining the hypernym / hyponym relationships. The hops of the core predicate of the sentence is set to 0. A component before the predicate that forms a subject-predicate relationship with it has hops - 1, a component after the predicate that forms a verb-object relationship with it has hops + 1, and so on. A modifying component has hops one lower than the component it modifies, and components in a parallel (coordinate) relationship have the same hops. This makes the hypernym / hyponym relationship of each component in the sentence clear.
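The hops rule can be sketched as a walk from each word to its head. The sample parse uses the words, heads, relations, and hops values listed in Example 2 of this document; the token indices 7, 9, 11, and 12 are not stated in the source and are assumed here, the zero offset for the RAD relation is inferred from that example, and relations not covered by the text default to no change. The names `HOP_DELTA` and `compute_hops` are hypothetical.

```python
# Change in hops relative to the head word, per dependency label.
# SBV, VOB, ATT, COO follow the rule in the text; RAD = 0 is inferred (assumption).
HOP_DELTA = {"SBV": -1, "VOB": +1, "ATT": -1, "COO": 0, "RAD": 0}


def compute_hops(parse):
    """parse maps token index -> (word, head index, relation); head 0 is Root."""
    hops = {}

    def resolve(idx):
        if idx not in hops:
            _, head, rel = parse[idx]
            if rel == "HED":                 # core predicate of the sentence
                hops[idx] = 0
            else:                            # unknown relations: no change (assumption)
                hops[idx] = resolve(head) + HOP_DELTA.get(rel, 0)
        return hops[idx]

    for idx in parse:
        resolve(idx)
    return hops


parse = {
    1: ("drum-type rehumidification equipment", 2, "ATT"),
    2: ("water flow rate", 3, "SBV"),
    3: ("adjustable and controllable", 0, "HED"),
    5: ("steam pressure", 13, "SBV"),
    7: ("hot air velocity", 5, "COO"),
    9: ("drum rotation speed", 5, "COO"),
    11: ("exhaust damper opening", 5, "COO"),
    12: ("etc.", 11, "RAD"),
    13: ("adjustable", 3, "COO"),
}
hops = compute_hops(parse)
for idx, (word, _, _) in parse.items():
    print(word, hops[idx])
```

With these offsets the sketch reproduces the hops values reported for the example sentence, including -2 for the attributive that modifies the subject.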
[0044] Furthermore, the rules for constructing the triples are as follows:
[0045] First, determine whether a sentence can independently form a triple based on its hierarchical relationship. Specifically, calculate the difference between the maximum and minimum values of hops. If the difference is greater than or equal to 2, the sentence can form a triple independently. In this case, the head and tail entities are determined based on their hierarchical relationships, and the predicate is used as the corresponding relation to construct the triple. If the difference is less than 2, the sentence components cannot independently form a triple. In this case, the parent name of the sentence is introduced as the head entity of the triple, the predicate in the sentence is used as the relation, and the remaining components are used as the tail entity. Here, hops represents the distance from any word element to the root word Root.
[0046] It should be noted that traditional methods for constructing triples based on dependency syntax include subject-verb-object relations (Zhang San, eat, apple), possessive relations (book, author, Zhang San), and prepositional phrase relations (Zhang San, at, Beijing). However, in specific unstructured texts of technical standards, sentences may have missing components, such as missing subjects or objects. Constructing triples based on hierarchical relations requires that sentence components be divided into at least three levels to independently extract triple relations from the sentence. In the above technical solution, difference judgment is performed based on hierarchical relations, forming two different triple construction rules. These two rules work together to effectively compensate for the shortcomings of methods that rely solely on hierarchical relations to determine the head and tail entities, which cannot construct triples due to missing components. Furthermore, given the strong structure of unstructured texts of technical standards due to their textual characteristics, this invention introduces the parent name of the sentence as the head entity of the triple, which is clearly more in line with the structural characteristics of unstructured texts of technical standards. For example, the tobacco processing step includes the process of heating tobacco sheets, which in turn includes process tasks, incoming material standards, and other process information. Below that is specific unstructured text knowledge. Therefore, a sentence is a description and supplement to the process information it belongs to. Thus, for sentences with missing components, the name of the process information to which it belongs can be used as the head entity to construct a triplet relationship.
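The two construction rules can be sketched as a single function. This is one possible reading under stated assumptions, not the patent's exact procedure: when the hops spread is at least 2, the sketch takes the lowest-hops component as the head entity and the highest-hops remaining component as the tail entity, which matches the worked example in Example 2 but is otherwise an assumption. The function name `build_triple` and the parent name "equipment performance" are hypothetical.

```python
def build_triple(components, predicate, parent_name):
    """Build a (head, relation, tail) triple from sentence components.

    components  : list of (text, hops) for the non-predicate parts
    predicate   : (text, hops) of the core predicate
    parent_name : name of the parent item (heading) the sentence belongs to
    """
    all_hops = [h for _, h in components] + [predicate[1]]
    spread = max(all_hops) - min(all_hops)
    if spread >= 2 and len(components) >= 2:
        # Rule 1: the sentence has enough levels to form a triple on its own.
        ordered = sorted(components, key=lambda c: c[1])
        head, tail = ordered[0][0], ordered[-1][0]
    else:
        # Rule 2: a component is missing, so the parent name becomes the head.
        head = parent_name
        tail = ", ".join(text for text, _ in components)
    return (head, predicate[0], tail)


full = build_triple(
    [("drum-type rehumidification equipment", -2), ("water flow rate", -1)],
    ("adjustable and controllable", 0),
    "equipment performance",
)
partial = build_triple(
    [("steam pressure", -1)],
    ("adjustable", 0),
    "equipment performance",
)
print(full)
print(partial)
```

The second call shows the fallback: with only two levels in the sentence, the parent name supplies the missing head entity.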
[0047] According to a second aspect of the present invention, a process standard entity relation extraction system integrating deep learning and dependency syntax is provided, comprising: an annotation module, configured to perform step 1: collecting unstructured text of process production standards and annotating the entities to be extracted to establish an entity relation dataset; a building module, configured to perform step 2: building a MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting hyperparameters, and dividing the labeled entity relation dataset into training and testing sets for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model; an extraction module, configured to perform step 3: extracting entities from the unstructured text of the process production standard to be extracted using the trained deep learning model; a module configured to perform step 4: importing the extracted entities into a natural language processing tool in the form of a dictionary, performing word segmentation on the unstructured text of the process production standard to be extracted, and performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of the sentences; and a transformation module, configured to perform step 5: dividing the components of the sentences into hypernyms and hyponyms based on the dependency relations to obtain the hypernym / hyponym relations, transforming the sentences into triple structures based on the triple construction rules built on those relations, and importing them into a graph database.
[0048] According to a third aspect of the present invention, a terminal is provided, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor is configured to execute the process-standard entity relation extraction method integrating deep learning and dependency syntax as described in any one of the preceding embodiments. In an exemplary embodiment, the terminal may further include a transmission device and an input / output device, wherein the transmission device is connected to the processor, and the input / output device is connected to the processor.
[0049] Example 2: As shown in Figures 1-5, this invention provides a method for extracting process standard entity relations by integrating deep learning and dependency syntax. The specific steps are shown in Figures 1 and 2 and include:
[0050] Step (1): Collect the unstructured text of the cigarette manufacturing process standard, taken here as an example, and annotate the entities to be extracted in order to establish an entity relation dataset;
[0051] For example, the entities to be extracted from knowledge text related to cigarette manufacturing process standards fall into seven categories: equipment, process parameters, standards, processes, methods, materials, and functions. Based on these seven categories, the knowledge related to the cigarette manufacturing process is annotated using the Label Studio annotation tool. To present the results more clearly, the annotation results are displayed in multiple columns instead of a single column, as shown in Figure 3. Taking the first column as an example, B-process parameter marks the starting boundary of an entity in the process parameter category, and E-process parameter marks the ending boundary of an entity in the process parameter category.
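The character-level BMEO labels can be generated from annotated spans roughly as follows. The sample string is an assumed Chinese rendering of "steam pressure is adjustable" used only for illustration, the span format `(start, end, category)` with an exclusive end is hypothetical, and tagging a single-character entity with only a B- label is an assumption, since the source does not say how such entities are handled.

```python
def bmeo_tags(text, entities):
    """Return one BMEO tag per character of `text`.

    entities: list of (start, end, category) spans, `end` exclusive.
    """
    tags = ["O"] * len(text)
    for start, end, category in entities:
        tags[start] = f"B-{category}"            # beginning boundary
        for i in range(start + 1, end - 1):
            tags[i] = f"M-{category}"            # middle characters
        if end - start > 1:
            tags[end - 1] = f"E-{category}"      # ending boundary
    return tags


# Assumed sample: the first four characters form a process-parameter entity.
sample = "蒸汽压力可调"
tags = bmeo_tags(sample, [(0, 4, "process parameter")])
for char, tag in zip(sample, tags):
    print(char, tag)
```

Each character receives exactly one label, so the label sequence has the same length as the input text, which is the form the CRF layer predicts.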
[0052] Step (2): Divide the labeled entity relation dataset into training and test sets to train the MacBERT-BiGRU-IDCNN-CRF entity extraction model (MBIC entity extraction model). Based on the characteristics of the experimental corpus and multiple experiments, the specific hyperparameters are as follows: max_len = 30, batch_size = 20, embedding_num = 768, hidden_num = 128, epoch = 100, number of convolutional kernels = 3, weight factor = 0.5, dropout = 0.15, and the dimension of the output prediction result = 24.
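For reference, the reported hyperparameters can be collected in a single configuration object. The values below are those listed above; the class name `MBICConfig` and the field names `num_conv_kernels`, `fusion_weight`, and `num_labels` are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBICConfig:
    max_len: int = 30            # maximum sequence length in characters
    batch_size: int = 20
    embedding_num: int = 768     # MacBERT embedding dimension
    hidden_num: int = 128        # hidden size of the recurrent layer
    epoch: int = 100
    num_conv_kernels: int = 3    # "number of convolutional kernels" in the text
    fusion_weight: float = 0.5   # weight factor w of the feature fusion layer
    dropout: float = 0.15
    num_labels: int = 24         # dimension of the output prediction result


config = MBICConfig()
print(config)
```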
[0053] Step (3): Extract entities from the unstructured text of the process production standard to be extracted using a trained deep learning model;
[0054] For example, taking "the water flow rate of the drum-type rehumidification equipment is adjustable and controllable, and the steam pressure, hot air velocity, drum rotation speed, and exhaust damper opening are also adjustable" as an example, entity extraction was performed using MacBERT-BiGRU-IDCNN-CRF. The extraction results are "drum-type rehumidification equipment", "water flow rate", "steam pressure", "hot air velocity", "drum rotation speed", and "exhaust damper opening".
[0055] Step (4): Import the extracted entities into the Chinese Language Technology Platform LTP4.2.0 in dictionary form, and perform dependency analysis on the original sentence. The results of the dependency analysis are shown in Figure 4.
[0056] Step (5): Based on the dependency analysis results in Figure 4, the components of the sentence can be preliminarily classified into hypernyms and hyponyms. The results are as follows:
[0057] Drum-type rehumidification equipment: Dependency: (2, 'ATT'), hops: -2;
[0058] Water flow rate: Dependency: (3, 'SBV'), hops: -1;
[0059] Adjustable and controllable: Dependency: (0, 'HED'), hops: 0;
[0060] Steam pressure: Dependency: (13, 'SBV'), hops: -1;
[0061] Hot air velocity: Dependency: (5, 'COO'), hops: -1;
[0062] Drum rotation speed: Dependency: (5, 'COO'), hops: -1;
[0063] Exhaust damper opening: Dependency: (5, 'COO'), hops: -1;
[0064] Etc. (等): Dependency: (11, 'RAD'), hops: -1;
[0065] Adjustable: Dependency: (3, 'COO'), hops: 0.
[0066] The results above list the dependency relations produced by LTP4.2.0 after entity extraction with the MacBERT-BiGRU-IDCNN-CRF entity extraction model, together with the hypernym/hyponym (superordinate/subordinate) levels derived from those relations. In each dependency entry, the number in parentheses is the index of the current word's superordinate word, and the letter code is the relation between the current word and that superordinate: ATT is the attributive-head relation, SBV the subject-verb relation, COO the coordinate relation, and HED the core relation, which usually marks the main predicate. First, the sentence is judged able to form a complete triple on its own because the difference between the maximum and minimum hops values is greater than or equal to 2. Next, the head entity and tail entity are determined from the hypernym/hyponym levels of the entities, and the predicate is used as the relation of the triple. Judging from the dependency relations, "drum-type rehumidification equipment" is the attributive modifying the subject "water flow rate" and is set as the head entity (Subject). "Water flow rate" is the modified entity and is set as the tail entity (Object); its predicate "adjustable and controllable", with a hops value of 0, is set as the relation pointing to it. "Adjustable" is coordinate with the core predicate, so its hops value is also 0 and it is likewise set as a predicate.
According to the hops hierarchy rules, "steam pressure", "hot air velocity", "drum rotation speed", and "exhaust damper opening" are all coordinate with "water flow rate" and are therefore also modified by "drum-type rehumidification equipment". They are all set as tail entities (Objects), linked to the head entity "drum-type rehumidification equipment" by the second predicate "adjustable". The resulting triples are imported into the Neo4j graph database as a graph, as shown in Figure 5.
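The worked example can be reproduced with a short sketch. Several things here are assumptions rather than the patent's implementation: the token indices (inferred from the head numbers in the listing, with punctuation tokens omitted), the rule that COO and RAD words stay on their head's level while every other relation goes one level down (chosen because it reproduces the listed hops values), and all names (`PARSE`, `ENTITIES`, `hops`, `build_triples`). The `superordinate` argument corresponds to the fallback in claim 3, where a sentence whose hops range is below 2 borrows its head entity from outside the sentence.

```python
from typing import Dict, List, Optional, Set, Tuple

Parse = Dict[int, Tuple[str, int, str]]  # index -> (word, head index, relation)

# Assumed token indices; index 12 is the RAD particle from the listing.
PARSE: Parse = {
    1: ("drum-type rehumidification equipment", 2, "ATT"),
    2: ("water flow rate", 3, "SBV"),
    3: ("adjustable and controllable", 0, "HED"),
    5: ("steam pressure", 13, "SBV"),
    7: ("hot air velocity", 5, "COO"),
    9: ("drum rotation speed", 5, "COO"),
    11: ("exhaust damper opening", 5, "COO"),
    12: ("etc.", 11, "RAD"),
    13: ("adjustable", 3, "COO"),
}
ENTITIES: Set[int] = {1, 2, 5, 7, 9, 11}  # tokens found by the entity extractor
SAME_LEVEL = {"COO", "RAD"}               # inferred: these keep the head's level


def hops(idx: int, parse: Parse) -> int:
    """Level of a word relative to the root (0 for the core predicate)."""
    _, head, rel = parse[idx]
    if head == 0:
        return 0
    parent = hops(head, parse)
    return parent if rel in SAME_LEVEL else parent - 1


def group_head(idx: int, parse: Parse) -> int:
    """Follow COO links back to the first member of a coordinate group."""
    while parse[idx][2] == "COO":
        idx = parse[idx][1]
    return idx


def build_triples(parse: Parse, entities: Set[int],
                  superordinate: Optional[str] = None) -> List[Tuple[str, str, str]]:
    level = {i: hops(i, parse) for i in parse}
    if max(level.values()) - min(level.values()) >= 2:
        # The sentence can form triples on its own: deepest entity is the head.
        deepest = min(level[i] for i in entities)
        head = next(parse[i][0] for i in sorted(entities) if level[i] == deepest)
    else:
        if superordinate is None:
            raise ValueError("a superordinate name is needed for this sentence")
        head = superordinate
    triples = []
    for i in sorted(entities):
        subject = group_head(i, parse)
        if parse[i][0] == head or parse[subject][2] != "SBV":
            continue
        predicate = parse[parse[subject][1]][0]  # verb governing the subject
        triples.append((head, predicate, parse[i][0]))
    return triples


LEVELS = {PARSE[i][0]: hops(i, PARSE) for i in PARSE}
TRIPLES = build_triples(PARSE, ENTITIES)
```

Under these assumptions the sketch yields one triple with the predicate "adjustable and controllable" for the water flow rate and four triples with the predicate "adjustable" for the remaining parameters, all headed by the equipment entity.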
[0067] To verify the effectiveness of the combined model of the present invention, it was compared with four other existing entity extraction models. The results of the comparative experiment are shown in Table 1.
[0068] Table 1 Overall entity recognition results for each model
[0069]
| Model | Training time (s) | Precision | Recall | F1 |
| --- | --- | --- | --- | --- |
| BERT-BiLSTM-CRF | 47.23 | 0.902 | 0.920 | 0.911 |
| BERT-BiGRU-CRF | 42.55 | 0.917 | 0.880 | 0.898 |
| BERT-IDCNN-CRF | 41.87 | 0.846 | 0.880 | 0.863 |
| BERT-IDCNN-BiGRU-CRF | 43.97 | 0.938 | 0.920 | 0.918 |
| MacBERT-BiGRU-IDCNN-CRF | 45.52 | 0.959 | 0.940 | 0.949 |
[0070] Table 1 lists the training time, precision, recall, and F1 score of several existing models and of the model proposed in this invention. Entity recognition performance was first compared by combining the basic BERT model with BiLSTM-CRF, BiGRU-CRF, and IDCNN-CRF. Among these three single-network baselines, BERT-BiLSTM-CRF has the highest F1 score (0.911), but its training time (47.23 s) is clearly longer than that of BERT-BiGRU-CRF (42.55 s) and BERT-IDCNN-CRF (41.87 s); the LSTM network structure cannot deliver high efficiency while keeping extraction accuracy high. BERT-IDCNN-CRF has the shortest training time but the lowest F1 score (0.863), indicating that label prediction with only the IDCNN model, trained with sentence features, does not give optimal results. Combining BERT with both BiGRU and IDCNN balances training efficiency and prediction quality: its training time (43.97 s) is shorter than that of BERT-BiLSTM-CRF, and its precision (0.938) is higher.
[0071] For entity extraction tasks, precision is crucial, but it is equally important to improve the coverage of recognition while maintaining precision, so the F1 score is the most important evaluation metric. Replacing BERT with MacBERT in BERT-IDCNN-BiGRU-CRF raises the F1 score by 0.031 (from 0.918 to 0.949), indicating that the MacBERT pre-trained semantic model outperforms the traditional BERT model for encoding Chinese text. The MacBERT-BiGRU-IDCNN-CRF model also has the best F1 score of all the combined models, and it trains faster than the BiLSTM-based model (45.52 s versus 47.23 s). In addition, its precision (0.959) is slightly higher than its recall (0.940), indicating that the knowledge extraction of this invention is conservative and produces fewer erroneous extractions. This matches the practical needs of knowledge graph applications within enterprises, where it helps avoid errors and losses in production caused by wrongly extracted information.
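As a reminder of how the F1 score relates to the other two columns, the sketch below computes it as the harmonic mean of precision and recall. It assumes that the column translated as "accuracy" is precision, which is consistent with the reported values for the proposed model; the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Proposed model in Table 1: precision 0.959, recall 0.940.
proposed_f1 = f1_score(0.959, 0.940)
print(f"{proposed_f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 score by trading recall away for precision or the reverse.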
[0072] Furthermore, to verify the rationality and effectiveness of the MacBERT-BiGRU-IDCNN-CRF network structure design, ablation experiments were conducted on the GRU-CRF, BiGRU-CRF, IDCNN-CRF, IDCNN-BiGRU-CRF, and MacBERT-CRF models, and on the MacBERT-IDCNN-BiGRU-CRF model with and without sentence features. (The "MacBERT-IDCNN-BiGRU-CRF model without sentence features" is the model of this invention with the extracted sentence features discarded: the step that concatenates sentence features onto character features is removed, so the character features output by MacBERT are used directly as the input of IDCNN.) Each ablation variant uses the same hyperparameter settings as the experiments in Table 1. All models were trained under the same experimental conditions, and the results are shown in Table 2.
[0073] Table 2 Prediction results of ablation experiments
[0074]
| Model | Precision | Recall | F1 |
| --- | --- | --- | --- |
| GRU-CRF | 0.518 | 0.580 | 0.547 |
| BiGRU-CRF | 0.745 | 0.760 | 0.752 |
| IDCNN-CRF | 0.485 | 0.640 | 0.552 |
| IDCNN-BiGRU-CRF | 0.704 | 0.760 | 0.731 |
| MacBERT-CRF | 0.808 | 0.840 | 0.824 |
| MacBERT-IDCNN-BiGRU-CRF (without sentence features) | 0.885 | 0.920 | 0.902 |
| MacBERT-IDCNN-BiGRU-CRF (with sentence features) | 0.959 | 0.940 | 0.949 |
[0075] Table 2 compares the models on the three prediction metrics. BiGRU-CRF clearly outperforms GRU-CRF (F1 of 0.752 versus 0.547), indicating that the bidirectional GRU structure learns contextual information more effectively than the unidirectional GRU structure. Because general embedding tools cannot extract sentence features for encoding, both the model using IDCNN alone and the model combining it with BiGRU predict poorly, with precision below 80% (0.485 and 0.704). Table 2 also shows that MacBERT encoding greatly improves prediction: MacBERT-CRF alone reaches an F1 score of 0.824, well above the combined models without MacBERT encoding. Finally, the experiment compared the MacBERT-IDCNN-BiGRU-CRF model with and without sentence features: adding sentence features to model training raised the F1 score from 0.902 to 0.949, indicating that sentence features substantially improve entity extraction in production process standard scenarios. These experiments verify the advantage of the proposed MacBERT-IDCNN-BiGRU-CRF model in predicting labels for process standard text and the effectiveness of its network structure.
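The difference between the "with" and "without sentence features" variants, and the weighted-average fusion of the two branches, can be sketched with plain lists. This is a toy illustration only: the IDCNN and BiGRU layers themselves are omitted, the two-dimensional vectors stand in for the 768-dimensional MacBERT outputs, and the function names are hypothetical. Treating the listed "weight factor = 0.5" as the fusion weight, and applying it to the IDCNN branch, are assumptions.

```python
from typing import List

Vector = List[float]


def concat_sentence_feature(char_feats: List[Vector], sent_feat: Vector) -> List[Vector]:
    """Append the same sentence vector to every character vector (IDCNN input)."""
    return [char + sent_feat for char in char_feats]


def weighted_fusion(idcnn_out: List[Vector], bigru_out: List[Vector],
                    weight: float = 0.5) -> List[Vector]:
    """Element-wise weighted average of two equally shaped feature sequences."""
    if len(idcnn_out) != len(bigru_out):
        raise ValueError("both branches must cover the same characters")
    return [
        [weight * a + (1 - weight) * b for a, b in zip(x, y)]
        for x, y in zip(idcnn_out, bigru_out)
    ]


# Two characters with 2-dimensional features and one sentence vector.
char_feats = [[1.0, 2.0], [3.0, 4.0]]
sent_feat = [0.5, 0.5]
idcnn_input = concat_sentence_feature(char_feats, sent_feat)  # "with sentence features"
fused = weighted_fusion([[1.0, 3.0]], [[3.0, 5.0]])
```

In the ablation variant without sentence features, `char_feats` would be passed to the IDCNN branch unchanged.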
[0076] The specific embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the above embodiments. Within the scope of knowledge possessed by those skilled in the art, various changes can be made without departing from the spirit of the present invention.
Claims
1. A method for extracting process standard entity relations by integrating deep learning and dependency syntax, characterized in that it comprises:
Step 1: collecting unstructured text of process production standards and labeling the entities to be extracted, to establish an entity relation dataset;
Step 2: building the MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting the hyperparameters, and dividing the labeled entity relation dataset into a training set and a test set for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model;
Step 3: extracting entities from the unstructured text of the process production standard to be extracted using the trained deep learning model;
Step 4: importing the extracted entities into a natural language processing tool as a dictionary; segmenting the unstructured text of the process production standard to be extracted into words on a sentence-by-sentence basis; and performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of each sentence;
Step 5: based on the dependency relations of the sentence, dividing the components of the sentence into superordinate and subordinate levels to obtain the hypernym/hyponym relations; and, based on the triple construction rules built on the hypernym/hyponym relations, transforming the sentence into a triple structure and importing it into a graph database;
wherein the MacBERT-BiGRU-IDCNN-CRF entity extraction model is built from a MacBERT layer, a BiGRU layer, an IDCNN layer, a feature fusion layer, a fully connected layer, and a CRF model; the MacBERT layer is a pre-trained deep bidirectional Transformer model used to divide the input labeled data into sentences and characters and convert them into sentence features and character features respectively; the BiGRU layer uses two GRU layers, and the character features converted by the MacBERT layer are input into the BiGRU layer for processing; to achieve dimension enhancement with the sentence features, the sentence features output by the MacBERT layer are concatenated onto the character features, and the concatenated character features are input into the IDCNN layer, which uses iterated dilated convolution; the character features output by IDCNN, which contain syntactic and sentence structure information, and the character features output by BiGRU, which contain hierarchical semantic information, are fused in the feature fusion layer by a weighted average method; the fused character features are input into the fully connected layer to convert them into the dimension required for the output; and finally the CRF model outputs the predicted label of each character.
2. The method for extracting process standard entity relations by integrating deep learning and dependency syntax according to claim 1, characterized in that the text is annotated with a text annotation tool using the BMEO annotation strategy, and the entity relation types contained in the text are divided into seven categories: equipment, process parameters, standards, processes, methods, materials, and functions; based on these seven categories, the unstructured text of the process production standards is annotated.
3. The method for extracting process standard entity relations by integrating deep learning and dependency syntax according to claim 1, characterized in that the triple construction rules are as follows: first, whether a sentence can independently construct a triple is determined from its hierarchical relations, specifically by calculating the difference between the maximum and minimum values of `hops`; if the difference is greater than or equal to 2, the sentence is determined to be able to construct a triple independently, in which case the head entity and tail entity are determined from the hierarchical relations of the entities and the predicate is used as the corresponding relation to construct the triple; if the difference is less than 2, the sentence components cannot independently construct a triple, in which case the name of the superordinate of the sentence is introduced as the head entity of the triple, the predicate in the sentence is used as the relation, and the remaining components are used as the tail entity to construct the triple; wherein `hops` represents the distance from any word to the root word.
4. A system for extracting process standard entity relations that integrates deep learning and dependency syntax, characterized in that it comprises:
an annotation module, used to perform step 1: collecting unstructured text of process production standards and annotating the entities to be extracted, to establish an entity relation dataset;
a module used to perform step 2: building the MacBERT-BiGRU-IDCNN-CRF entity extraction model, adjusting the hyperparameters, and dividing the labeled entity relation dataset into a training set and a test set for training the MacBERT-BiGRU-IDCNN-CRF entity extraction model;
an extraction module, used to perform step 3: extracting entities from the unstructured text of the process production standard to be extracted using the trained deep learning model;
a module used to perform step 4: importing the extracted entities into a natural language processing tool as a dictionary; segmenting the unstructured text of the process production standard to be extracted into words on a sentence-by-sentence basis; and performing dependency analysis on each component based on the segmentation results to obtain the dependency relations of each sentence;
a transformation module, used to perform step 5: based on the dependency relations of the sentence, dividing the components of the sentence into superordinate and subordinate levels to obtain the hypernym/hyponym relations; and, based on the triple construction rules built on the hypernym/hyponym relations, transforming the sentence into a triple structure and importing it into a graph database;
wherein the MacBERT-BiGRU-IDCNN-CRF entity extraction model is built from a MacBERT layer, a BiGRU layer, an IDCNN layer, a feature fusion layer, a fully connected layer, and a CRF model; the MacBERT layer is a pre-trained deep bidirectional Transformer model used to divide the input labeled data into sentences and characters and convert them into sentence features and character features respectively; the BiGRU layer uses two GRU layers, and the character features converted by the MacBERT layer are input into the BiGRU layer for processing; to achieve dimension enhancement with the sentence features, the sentence features output by the MacBERT layer are concatenated onto the character features, and the concatenated character features are input into the IDCNN layer, which uses iterated dilated convolution; the character features output by IDCNN, which contain syntactic and sentence structure information, and the character features output by BiGRU, which contain hierarchical semantic information, are fused in the feature fusion layer by a weighted average method; the fused character features are input into the fully connected layer to convert them into the dimension required for the output; and finally the CRF model outputs the predicted label of each character.
5. A terminal, characterized in that it comprises a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor being configured to perform the process standard entity relation extraction method that integrates deep learning and dependency syntax as described in any one of claims 1-3.