Rapid construction method for knowledge graph of railway bridge design standards

By combining BERT and Bi-LSTM models with conditional random fields and graph neural networks, a knowledge graph of railway bridge design standards is constructed. The structure is optimized using a dynamic topology optimization algorithm, which solves the problems of low data integration efficiency and excessive redundant information, and enables rapid response to changes in design standards and efficient querying.

WO2026129760A1 · PCT designated stage · Publication Date: 2026-06-25 · CHINA RAILWAY DESIGN GRP CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: CHINA RAILWAY DESIGN GRP CO LTD
Filing Date: 2025-09-10
Publication Date: 2026-06-25

Application Information

Patent Timeline

10 Sep 2025: Application
25 Jun 2026: Publication (WO2026129760A1)

IPC: G06N5/022; G06F30/13

CPC: Y02D10/00

AI Tagging

Application Domain

Geometric CAD; Knowledge based models

Technology Topics

Conditional random field; Theoretical computer science

AI Technical Summary

Technical Problem

Existing technologies for constructing knowledge graphs of railway bridge design standards suffer from problems such as low data integration efficiency, excessive redundant information, and difficulty in quickly responding to changes in design standards.

Method used

We employ BERT pre-trained models and Bi-LSTM models for deep semantic analysis, combining conditional random fields and graph neural networks to identify entities and relationships, construct a knowledge graph, and optimize the graph structure through a dynamic topology optimization algorithm to reduce redundant nodes and edges.

Benefits of technology

It enables the rapid integration and optimization of railway bridge design standard data, improves the construction speed and query efficiency of knowledge graphs, and ensures the simplicity and reliability of the graphs.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

Disclosed in the present invention is a rapid construction method for a knowledge graph of railway bridge design standards, comprising: S1, acquiring data and preprocessing same; S2, using a BERT pre-training model and a Bi-LSTM model to convert a text sequence in the processed data into an annotated tag sequence, i.e., $\{y_1, y_2, \dots, y_n\}$; S3, using conditional random field (CRF) and graph neural network (GNN) technology to identify and annotate an entity, an attribute, and a relationship thereof to form a node and an edge of a knowledge graph, so as to obtain a complete knowledge graph (KG); and S4, using a dynamic topology optimization algorithm to adjust a structure of the complete knowledge graph (KG) in real time. The method rapidly integrates and optimizes design standard data, reduces data redundancy, and effectively processes data additions and changes, thereby improving knowledge graph construction speed and maintenance efficiency.

Description

A method for rapid construction of a knowledge graph of railway bridge design standards

Technical Field

[0001] This invention belongs to the field of engineering design information management, specifically involving a method for rapidly constructing a knowledge graph of railway bridge design standards.

Background Technology

[0002] With the continuous development of modern railway engineering, railway bridge design standards and specifications have become increasingly complex. These design standards involve various technical requirements and specifications, including design parameters, structural requirements, material properties, and construction techniques. Because these design standards originate from various specifications and technical documents, and their data formats and sources differ, data integration and management in the field of railway bridge design face significant challenges. While current knowledge graph construction methods can handle complex data to some extent, they still have many shortcomings in rapidly constructing and maintaining knowledge graphs for railway bridge design standards.

[0003] Traditional knowledge graph construction methods typically require significant manual intervention to process and integrate data, which makes construction and updating inefficient and unable to keep pace with rapidly changing design standards. In addition, knowledge graphs built with existing methods often contain duplicate records and redundant information drawn from standards and specifications, which increases the complexity of data management and reduces the query efficiency of the knowledge graph. Furthermore, railway bridge design standards are updated continuously, and existing methods often require rebuilding the knowledge graph or making complex adjustments to handle new and changed data, making it difficult to respond efficiently to new standards.

[0004] Therefore, there is an urgent need for an efficient method for rapidly constructing a knowledge graph of railway bridge design standards to solve the problems existing in current technologies.

Summary of the Invention

[0005] To address the problems existing in the prior art, this invention provides a method for rapidly constructing a knowledge graph of railway bridge design standards. This method quickly integrates and optimizes design standard data, reduces data redundancy, and effectively handles data additions and changes, thereby improving the construction speed and maintenance efficiency of the knowledge graph.

[0006] The technical solution of the present invention is as follows:

[0007] A method for rapidly constructing a knowledge graph of railway bridge design standards includes the following steps:

[0008] S1. Data Collection and Preprocessing: Collect data from standard documents related to railway bridge design; remove redundant and noisy data from the collected data using data cleaning techniques to obtain processed data, including:

[0009] The processed data contains multiple text sequences. A text sequence is denoted $S = \{w_1, w_2, \dots, w_n\}$, where $w_i$ is the $i$-th word in $S$ and $i = 1, 2, \dots, n$;

[0010] S2, using a BERT pre-trained model and a Bi-LSTM model, transforms the text sequence S into high-information-density structured data, including:

[0011] S2-1, Extracting the embedding vector: The text sequence $S$ is embedded using a BERT pre-trained model to obtain the embedding vector $X = \{x_1, x_2, \dots, x_n\}$ of $S$, where $x_i$ is the word vector of the $i$-th word $w_i$ in $S$. The word embedding vector represents the basic semantic information of $w_i$ and is generated by the word embedding matrix $W_s$ in the BERT pre-trained model; the sentence embedding vector represents the role of $w_i$ in $S$ and is generated by the multi-layer Transformer encoder in the BERT pre-trained model; the position embedding vector represents the position of $w_i$ in $S$ and is generated by the position embedding matrix $W_p$ in the BERT pre-trained model;

[0012] S2-2, Extract the contextual features corresponding to each word in the text sequence S based on the embedding vector X:

[0013] S2-2-1, Traverse the embedding vector $X = \{x_1, x_2, \dots, x_n\}$, using the word vector of each word as the input to a mapping function to calculate the initial context feature of each word in $S$, resulting in the initial context feature set $H = \{h_0(w_1), h_0(w_2), \dots, h_0(w_n)\}$;

[0014] Wherein, the initial context feature $h_0(w_i)$ of the $i$-th word $w_i$ in $S$ is calculated as: $h_0(w_i) = f(x_i)$

[0015] In the formula, $f$ is the mapping function used to process the embedding vector $x_i$ of word $w_i$, $i \in [1, n]$;

[0016] S2-2-2, Traverse the initial context feature set $H = \{h_0(w_1), h_0(w_2), \dots, h_0(w_n)\}$, inputting the initial context feature of each word in $H$ into the pre-trained BERT model. The $N$-layer Transformer encoder of the BERT model outputs the context feature of each word, forming the context feature set $H_S = \{h_N(w_1), h_N(w_2), \dots, h_N(w_n)\}$;

[0017] S2-3, Extracting final contextual features based on the Bi-LSTM model:

[0018] Traverse the set $H_S = \{h_N(w_1), h_N(w_2), \dots, h_N(w_n)\}$, using each context feature in $H_S$ as input to the Bi-LSTM model to capture the bidirectional contextual dependencies in $S$, thereby obtaining the final context feature of each word in $S$ and forming the final context feature sequence $C_S = \{c_1, c_2, \dots, c_n\}$;

[0019] S2-4, Calculate the average of the final context feature sequence $C_S$ to obtain the average vector $s_g$, which represents the overall semantic features of $S$: $s_g = \mathrm{avg}\{c_1, c_2, \dots, c_n\}$, where $s_g \in \mathbb{R}^{2 \times d_{he}}$ and $2 \times d_{he}$ is the dimension of $s_g$;
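The averaging in S2-4 can be sketched as a component-wise mean over the final context features. This is a minimal illustration in plain Python; the function name `mean_pool` and the toy two-dimensional vectors are assumptions for the example, not part of the patent.

```python
def mean_pool(features):
    """Component-wise average of equal-length vectors: s_g = avg{c_1, ..., c_n}."""
    if not features:
        raise ValueError("features must contain at least one vector")
    dim = len(features[0])
    if any(len(c) != dim for c in features):
        raise ValueError("all vectors must have the same dimension")
    n = len(features)
    return [sum(c[j] for c in features) / n for j in range(dim)]


# Toy final context features c_1..c_3 (hypothetical values, dimension 2).
C_S = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
s_g = mean_pool(C_S)
print(s_g)  # [3.0, 4.0]
```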

[0020] S2-5, Construct the relation set R:

[0021] While the pre-trained BERT model processes the text sequence, entities are identified from the text sequence, semantic relationships between different entities are inferred from contextual information, and each semantic relationship is embedded into a relation vector space to obtain a relation vector, which is added to the relation set $R$, resulting in $R = \{r_1, r_2, \dots, r_k, \dots, r_t\}$, where $t$ is the total number of relation vectors in $R$, the relation vector $r_k \in \mathbb{R}^{d_r}$ represents a specific relation type or semantic relationship in $S$, and $d_r$ is the dimension of the relation vector space;

[0022] S2-6, based on the final context feature sequence $C_S$ and the average vector $s_g$, an attention mechanism calculates, through linear mapping and nonlinear transformation, the attention alignment score of each word in $S$ with respect to the relation vector $r_k$, forming the attention alignment score set $\{e_{1k}, e_{2k}, \dots, e_{nk}\}$. The attention alignment score $e_{ik}$ measures the weight of the final context feature $c_i$ under the relation vector $r_k$;

[0023] S2-7, using the Softmax function to normalize the attention alignment score set $\{e_{1k}, e_{2k}, \dots, e_{nk}\}$, obtaining the attention weight of each word under the current target relation $r_k$ and forming the attention weight set $\{\alpha_{1k}, \alpha_{2k}, \dots, \alpha_{nk}\}$;

[0024] S2-8, using the attention weight set $\{\alpha_{1k}, \alpha_{2k}, \dots, \alpha_{nk}\}$ to perform a weighted summation of the final context feature sequence $C_S = \{c_1, c_2, \dots, c_n\}$, obtaining the feature vector $s_k$ of $S$ under the relation vector $r_k$: $s_k = \sum_{i=1}^{n} \alpha_{ik} c_i$

[0025] Wherein, the semantic information of $S$ that is specific to the relation vector $r_k$ is integrated into the feature vector $s_k$; $n$ is the total number of words in $S$, and $d$ is the feature dimension of each word;

[0026] S2-9, the feature vector $s_k$ is input into the relation gating unit, which computes the gating vector $G$ from $s_k$ through an activation function; $G$ is then used to filter $s_k$, giving the filtered feature vector. The gating vector $G$ filters out feature information irrelevant to the relation vector $r_k$ while retaining the features most beneficial to entity annotation; the activation function is Sigmoid.

[0027] S2-10, each element of the filtered feature vector is input into a hidden layer of a Bi-LSTM network, which outputs the hidden state corresponding to each element, i.e., the hidden state corresponding to each word in the text sequence, yielding the hidden state set.

[0028] S2-11, each hidden state in the hidden state set is input into the entity labeling layer of the Bi-LSTM network. The entity labeling layer outputs the labels of all words in $S$, forming the label sequence $\{y_1, y_2, \dots, y_n\}$;

[0029] S2-12, based on the label sequence $\{y_1, y_2, \dots, y_n\}$, determine the head and tail entities in $S$ that are related to the relation vector $r_k$, and annotate the label sequence $\{y_1, y_2, \dots, y_n\}$ with the determined head and tail entities. The head and tail entities are identified as follows:

[0030] To identify the head entity, scan the label sequence $\{y_1, y_2, \dots, y_n\}$ for the first occurrence of B-Head, which marks the start of the head entity, and continue through the subsequent I-Head labels until a non-entity label O or a tail-entity start label B-Tail is encountered. To identify the tail entity, scan the label sequence $\{y_1, y_2, \dots, y_n\}$ for the first occurrence of B-Tail, which marks the start of the tail entity, and continue through the subsequent I-Tail labels until a non-entity label O or B-Head is encountered.
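The head and tail scan can be sketched as follows. This is an illustrative reading of the rule, not the patent's implementation: the helper name `extract_span` and the sample sentence are assumptions, and the scan stops at any label other than the matching I- label, which for a well-formed label sequence coincides with stopping at O, B-Tail, or B-Head.

```python
def extract_span(labels, role):
    """Return (start, end) of the first entity of the given role, end exclusive, or None."""
    begin, inside = f"B-{role}", f"I-{role}"
    try:
        start = labels.index(begin)  # first B-Head / B-Tail
    except ValueError:
        return None
    end = start + 1
    # Absorb the I-Head / I-Tail labels that directly follow the B- label.
    while end < len(labels) and labels[end] == inside:
        end += 1
    return start, end


# Hypothetical sentence and label sequence for illustration.
words = ["The", "box", "girder", "uses", "C50", "concrete"]
labels = ["O", "B-Head", "I-Head", "O", "B-Tail", "I-Tail"]

head = extract_span(labels, "Head")
tail = extract_span(labels, "Tail")
head_words = words[head[0]:head[1]]
tail_words = words[tail[0]:tail[1]]
print(head_words, tail_words)  # ['box', 'girder'] ['C50', 'concrete']
```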

[0031] S3 utilizes Conditional Random Fields (CRF) and Graph Neural Networks (GNN) techniques to identify and label entities, attributes, and relationships in the design criteria, forming nodes and edges of the knowledge graph to construct a detailed knowledge graph. The specific steps are as follows:

[0032] S3-1, based on the label sequence $\{y_1, y_2, \dots, y_n\}$ in which the head and tail entities have been labeled, add all identified head and tail entities to the entity set, which contains all entities related to the relation vector $r_k$;

[0033] S3-2, Traverse the relation set $R = \{r_1, r_2, \dots, r_k, \dots, r_t\}$: each time, select one relation vector from $R$ and repeat steps S2-6 to S3-1, obtaining the entity set $E = \{E_1, E_2, \dots, E_u\}$ associated with the relation vectors in $R$, where $E_i$ is an entity and $u$ is the total number of entities;

[0034] S3-3, analyze the attributes of each entity in the entity set $E$ to obtain the attribute set of each entity, and traverse $E$ to obtain the attribute set $A = \{A_1, A_2, \dots, A_u\}$ of all entities, where $A_i = \{a_1, a_2, \dots, a_m\}$ is the attribute set of entity $E_i$;

[0035] The attribute set $A_i = \{a_1, a_2, \dots, a_m\}$ of entity $E_i$ is obtained as follows:

[0036] (1) Extract from the text sequence $S$ the two words before and the two words after entity $E_i$ to form the context word set $N_i = \{w_{i-2}, w_{i-1}, E_i, w_{i+1}, w_{i+2}\}$, where $w_{i-2}, w_{i-1}$ and $w_{i+1}, w_{i+2}$ are the two words preceding and the two words following $E_i$, respectively;

[0037] (2) Using a Conditional Random Field (CRF) model, compute the conditional probability distribution of the attribute set $A_i$ of entity $E_i$ given the context word set $N_i$, and use it to generate $A_i = \{a_1, a_2, \dots, a_m\}$, in which each attribute is a specific characteristic of $E_i$. The formula of the conditional random field model is: $P(A_i \mid N_i) = \frac{1}{Z(N_i)} \prod_{m} \psi_m(A_i, N_i)$

[0038] Wherein, $Z(N_i)$ is a normalization factor that makes the probabilities of all attributes sum to 1; $\psi_m(A_i, N_i)$ is a feature function that captures the interrelationship between the context $N_i$ and the attribute set $A_i$; and $\prod$ denotes the product over all feature functions;
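The context window and the normalized product of feature functions can be illustrated with a small sketch. It is a simplification under stated assumptions: it scores single candidate attributes rather than whole attribute sets, the two feature functions `psi_grade_token` and `psi_uniform` are hand-written stand-ins for what a trained CRF would learn, and the sentence and attribute names are hypothetical.

```python
def context_window(words, start, end, size=2):
    """Return (before, entity, after) for the entity span words[start:end]."""
    before = words[max(0, start - size):start]
    after = words[end:end + size]
    return before, words[start:end], after


def attribute_distribution(context, candidates, potentials):
    """P(a | N) = (1 / Z(N)) * prod_m psi_m(a, N), normalised over the candidates."""
    scores = {}
    for attr in candidates:
        score = 1.0
        for psi in potentials:
            score *= psi(attr, context)
        scores[attr] = score
    z = sum(scores.values())  # normalisation factor Z(N)
    return {attr: score / z for attr, score in scores.items()}


# Hypothetical feature functions; a trained CRF would learn these from data.
def psi_grade_token(attr, context):
    return 3.0 if attr == "material" and "C50" in context else 1.0


def psi_uniform(attr, context):
    return 1.0


words = ["The", "box", "girder", "uses", "C50", "concrete"]
before, entity, after = context_window(words, 1, 3)
probs = attribute_distribution(before + after, ["material", "geometry"],
                               [psi_grade_token, psi_uniform])
print(before, entity, after)  # ['The'] ['box', 'girder'] ['uses', 'C50']
print(probs)                  # {'material': 0.75, 'geometry': 0.25}
```

Near the start or end of a sentence the window is simply truncated, as with `before` above, which holds one word instead of two.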

[0039] S3-4, the entity set $E = \{E_1, E_2, \dots, E_u\}$ and the attribute set $A = \{A_1, A_2, \dots, A_u\}$ form the node set of the knowledge graph, $N = \{E_1, E_2, \dots, E_u, A_1, A_2, \dots, A_u\}$;

[0040] S3-5, traverse the node set N, add feature information to each node of the knowledge graph, and use the vectorization technology Word2Vec to convert each node in the node set N into a vector, to obtain the vector set V(N), which is convenient for efficient storage and retrieval in the graph database later;

[0041] Wherein, the vector set $V(N) = \{v(E_1), v(E_2), \dots, v(E_u), v(A_1), v(A_2), \dots, v(A_u)\}$, where $v(E_i)$ and $v(A_i)$ are the vector representations of entity $E_i$ and attribute $A_i$;

[0042] S3-6, based on the head and tail entities in the entity set $E = \{E_1, E_2, \dots, E_u\}$, construct edges between nodes. These edges describe the relationships between entities and attributes, as well as the relationships between entities themselves. Each relation vector in the relation set $R = \{r_1, r_2, \dots, r_t\}$ is used as the representation of an edge, and each edge is vectorized to obtain the set $V(R) = \{v(r_1), v(r_2), \dots, v(r_t)\}$. The vector set $V(N)$ and the set $V(R)$ form a complete knowledge graph $KG = (V(N), V(R))$.
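As a rough picture of how entities, attributes, and relations end up as nodes and edges, the sketch below assembles a tiny graph from (head, relation, tail) triples. It keeps symbolic names instead of the Word2Vec vectors $v(\cdot)$ described above, and the class `KnowledgeGraph`, the relation name `has_attribute`, and the sample entities are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (head, relation, tail) triples

    def add_relation(self, head, relation, tail):
        """Add an entity-entity edge; both endpoints become nodes."""
        self.nodes.update((head, tail))
        self.edges.add((head, relation, tail))

    def add_attribute(self, entity, attribute):
        """Add an entity-attribute edge using a hypothetical 'has_attribute' relation."""
        self.add_relation(entity, "has_attribute", attribute)

    def neighbours(self, node):
        """Nodes connected to `node` by an edge in either direction."""
        outgoing = {t for h, _, t in self.edges if h == node}
        incoming = {h for h, _, t in self.edges if t == node}
        return outgoing | incoming


kg = KnowledgeGraph()
kg.add_relation("box girder", "uses_material", "C50 concrete")
kg.add_attribute("C50 concrete", "compressive strength grade")
print(len(kg.nodes), len(kg.edges))  # 3 2
```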

[0043] S3-7, Perform S2 to S3-6 on all text sequences in the processed data described in S1 to obtain the complete knowledge graph KG;

[0044] S4. The complete knowledge graph (KG) structure is adjusted in real time using a dynamic topology optimization algorithm. The specific steps are as follows:

[0045] S4-1, Perform a preliminary analysis on the complete knowledge graph KG. By calculating the degree of each node, evaluate the importance of each node in the complete knowledge graph KG. Calculate the weight of each edge based on the degree of the node. The weight of the edge reflects the strength of the relationship between the nodes.

[0046] Wherein, the degree $d_i$ of node $E_i$ is calculated as: $d_i = \sum_{j} A_{i,j}$

[0047] In the formula, $A_{i,j}$ is an element of the adjacency matrix and indicates the connection between node $E_i$ and node $E_j$;

[0048] Wherein, the weight $w_{ij}$ of the edge $E_i \to E_j$ is calculated as: $w_{ij} = f(d_i, d_j)$

[0049] The weight $w_{ij}$ reflects the strength of the relationship between node $E_i$ and node $E_j$, and $f(d_i, d_j)$ is the weight calculation function;
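A short sketch of the degree and edge-weight calculation follows. The text does not specify the weight function $f$, so the ratio `min(d_i, d_j) / max(d_i, d_j)` used here is purely an assumption chosen to give values in (0, 1]; the function names and the toy adjacency matrix are likewise illustrative.

```python
def degrees(adjacency):
    """d_i = sum_j A[i][j] for an adjacency matrix given as a list of rows."""
    return [sum(row) for row in adjacency]


def degree_ratio(d_i, d_j):
    """Assumed weight function f(d_i, d_j); not specified in the source."""
    return min(d_i, d_j) / max(d_i, d_j)


def edge_weights(adjacency, f=degree_ratio):
    """w_ij = f(d_i, d_j) for every pair (i, j) with A[i][j] != 0."""
    d = degrees(adjacency)
    return {(i, j): f(d[i], d[j])
            for i, row in enumerate(adjacency)
            for j, a in enumerate(row) if a}


# Node 0 is connected to nodes 1 and 2.
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
d = degrees(A)
w = edge_weights(A)
print(d)          # [2, 1, 1]
print(w[(0, 1)])  # 0.5
```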

[0050] S4-2, Based on the degree of each node and the weight of each edge in the complete knowledge graph KG, formulate a dynamic topology adjustment strategy to optimize the structure of the knowledge graph:

[0051] In the complete knowledge graph KG, nodes that have the same or similar attributes and a degree less than $d_{min}$ are treated as redundant nodes. The redundant nodes are merged: each redundant node is deleted and its connections are transferred to the node with the higher degree, giving the entity set $E'$. The degree of each node in KG is then recalculated using the degree formula. Nodes with a degree of zero are treated as isolated nodes, giving the isolated node set $I$, and the isolated nodes in $I$ are removed from the knowledge graph.
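One way to picture the merge step is the sketch below. It makes several simplifying assumptions that are not in the source: the graph is undirected and stored as a dict of neighbour sets with no self-loops, "same or similar attributes" is reduced to exactly equal attribute sets, and the node names are invented.

```python
def merge_redundant(adj, attrs, d_min=5):
    """Merge nodes with identical attributes whose degree is below d_min into the
    highest-degree node of their group, transferring their connections."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    groups = {}
    for node in adj:
        groups.setdefault(attrs[node], []).append(node)
    for group in groups.values():
        if len(group) < 2:
            continue
        group.sort(key=lambda n: len(adj[n]), reverse=True)
        keeper = group[0]
        for node in group[1:]:
            if len(adj[node]) >= d_min:
                continue  # well-connected nodes are not treated as redundant
            for nb in adj.pop(node):
                adj[nb].discard(node)
                if nb != keeper:
                    adj[nb].add(keeper)
                    adj[keeper].add(nb)
    return adj


def remove_isolated(adj):
    """Drop nodes whose degree is zero."""
    return {n: nb for n, nb in adj.items() if nb}


adj = {
    "pier": {"bridge", "deck"},
    "bridge pier": {"foundation"},
    "bridge": {"pier"},
    "deck": {"pier"},
    "foundation": {"bridge pier"},
    "note": set(),
}
attrs = {
    "pier": frozenset({"substructure"}),
    "bridge pier": frozenset({"substructure"}),
    "bridge": frozenset({"structure"}),
    "deck": frozenset({"superstructure"}),
    "foundation": frozenset({"base"}),
    "note": frozenset({"comment"}),
}
graph = remove_isolated(merge_redundant(adj, attrs))
print(sorted(graph))  # ['bridge', 'deck', 'foundation', 'pier']
```

Here "bridge pier" is folded into "pier", which inherits the link to "foundation", and the unconnected "note" node is dropped.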

[0052] S4-3, Optimize the edges of the complete knowledge graph KG according to their weights: traverse the set $V(R)$ of all edges in KG and delete from KG the edges whose weights are less than or equal to the threshold. If deleting an edge disconnects some nodes in KG so that their degree becomes zero, these nodes are considered isolated nodes and are deleted. After the edge deletion operation, the connection relationships between the remaining nodes change, so the S4 optimization of the knowledge graph KG is performed again.
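The prune-and-repeat loop can be sketched as below. The degree-ratio weight function is the same kind of assumption as any other choice of $f$, since the source leaves $f$ unspecified; the default threshold of 0.2 follows the preferred value stated later in the text, and the edge list is hypothetical. Nodes are derived from the surviving edges, so nodes whose degree falls to zero disappear automatically.

```python
def degree_ratio(d_u, d_v):
    """Assumed edge weight f(d_u, d_v); the source does not define f."""
    return min(d_u, d_v) / max(d_u, d_v)


def prune_edges(edges, weight_fn=degree_ratio, threshold=0.2):
    """Repeatedly delete undirected edges with weight <= threshold, recomputing
    degrees after each pass, and return the surviving edges and nodes."""
    edges = set(edges)
    while True:
        degree = {}
        for u, v in edges:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        weak = {(u, v) for u, v in edges
                if weight_fn(degree[u], degree[v]) <= threshold}
        if not weak:
            return edges, set(degree)
        edges -= weak  # strictly shrinks the edge set, so the loop terminates


# A hub with five weakly attached leaves plus a short chain hub-mid-end.
edges = [("hub", f"leaf{i}") for i in range(1, 6)]
edges += [("hub", "mid"), ("mid", "end")]
kept_edges, kept_nodes = prune_edges(edges)
print(sorted(kept_edges))  # [('hub', 'mid'), ('mid', 'end')]
print(sorted(kept_nodes))  # ['end', 'hub', 'mid']
```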

[0053] In S1, the standard documents related to railway bridge design include structural design specifications, material standards, construction specifications, testing and maintenance standards, and environmental and sustainability standards. Data cleaning techniques include deleting duplicate records, processing missing values, correcting erroneous information, and standardizing data formats.

[0054] In S2-2-2, the context feature $h_N(w_i)$ of the $i$-th word $w_i$ in $S$ is obtained as follows: the initial context feature $h_0(w_i)$ is input into the first layer of the $N$-layer Transformer encoder. Each layer of the Transformer encoder performs self-attention calculation and nonlinear transformation on the output of the previous layer to obtain the context feature output by the current layer, and the context feature output by the $N$-th layer is denoted $h_N(w_i)$. The context feature output by the $l$-th layer of the Transformer encoder is $h_l(w_i)$: $h_l(w_i) = \mathrm{Trans}(h_{l-1}(w_i)),\ l \in [1, N]$

[0055] In the formula, $\mathrm{Trans}(\cdot)$ represents the Transformer encoding operation, including self-attention computation and nonlinear transformation, used to update contextual features; $h_{l-1}(w_i)$ represents the contextual feature output by the $(l-1)$-th layer of the Transformer encoder.

[0056] In S2-3, the Bi-LSTM model extracts the final context feature $c_i$ of word $w_i$ as follows: $h_N(w_i)$ is the input to the Bi-LSTM model. The forward LSTM and the backward LSTM of the Bi-LSTM model each compute the contextual dependencies of $h_N(w_i)$ at its position in $S$, giving the hidden state output by the forward LSTM and the hidden state output by the backward LSTM. The two hidden states are concatenated to give the final context feature $c_i$ of word $w_i$.

[0057] Here $i \in [1, n]$ and the final context feature $c_i \in \mathbb{R}^{2 \times d_{he}}$, where $\mathbb{R}$ is the set of all real numbers and $2 \times d_{he}$ is the dimension of $c_i$.
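The forward/backward concatenation can be illustrated without a deep-learning library. In the sketch below a toy element-wise tanh recurrence stands in for the LSTM cell, so it shows only the bidirectional pattern and the doubling of the feature dimension, not real LSTM gating; the function names, the shared weights `w_in` and `w_rec`, and the input values are assumptions.

```python
import math


def run_recurrence(inputs, w_in, w_rec):
    """Toy element-wise recurrent cell: h_t = tanh(w_in * x_t + w_rec * h_{t-1})."""
    h = [0.0] * len(inputs[0])
    states = []
    for x in inputs:
        h = [math.tanh(w_in * xj + w_rec * hj) for xj, hj in zip(x, h)]
        states.append(h)
    return states


def bidirectional_features(inputs, w_in=0.5, w_rec=0.5):
    """Concatenate forward and backward states per position: c_i has dimension 2 * d."""
    forward = run_recurrence(inputs, w_in, w_rec)
    backward = run_recurrence(inputs[::-1], w_in, w_rec)[::-1]
    return [f + b for f, b in zip(forward, backward)]  # list concatenation


# Three positions with feature dimension d = 2 (hypothetical values).
H_S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C_S = bidirectional_features(H_S)
print(len(C_S), len(C_S[0]))  # 3 4
```

A production system would more likely use a library Bi-LSTM layer; the point here is only that each $c_i$ combines a left-to-right state with a right-to-left state.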

[0058] In S2-6, the attention alignment score $e_{ik}$ is obtained as: $e_{ik} = v^{T} \tanh(W_r r_k + W_g s_g + W_h c_i)$

[0059] In the formula, $v$ is the attention vector, and $W_r$, $W_g$, and $W_h$ are linear mapping matrices: $W_r$ maps the relation vector $r_k$ to the attention space, $W_g$ maps the average vector $s_g$ to the attention space, and $W_h$ maps the final context feature $c_i$ to the attention space.
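The alignment score can be evaluated directly from its formula. The sketch below uses plain Python lists and identity matrices as placeholder parameters; in practice $v$, $W_r$, $W_g$, and $W_h$ would be learned, and all numeric values here are assumptions for illustration.

```python
import math


def matvec(matrix, vector):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def alignment_score(v, W_r, r_k, W_g, s_g, W_h, c_i):
    """e_ik = v^T tanh(W_r r_k + W_g s_g + W_h c_i)."""
    pre = [a + b + c for a, b, c in
           zip(matvec(W_r, r_k), matvec(W_g, s_g), matvec(W_h, c_i))]
    return sum(vj * math.tanh(p) for vj, p in zip(v, pre))


I2 = [[1.0, 0.0], [0.0, 1.0]]  # identity matrices as placeholder parameters
e_ik = alignment_score(v=[1.0, 1.0],
                       W_r=I2, r_k=[1.0, 0.0],
                       W_g=I2, s_g=[0.0, 1.0],
                       W_h=I2, c_i=[0.0, 0.0])
print(e_ik)  # equals 2 * tanh(1), roughly 1.523
```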

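[0060] In S2-7, the attention weight $\alpha_{ik}$ corresponding to word $w_i$ in $S$ is obtained as: $\alpha_{ik} = \frac{\exp(e_{ik})}{\sum_{j=1}^{n} \exp(e_{jk})}$

[0061] Wherein, the attention weight set $\{\alpha_{1k}, \alpha_{2k}, \dots, \alpha_{nk}\}$ satisfies $\sum_{i=1}^{n} \alpha_{ik} = 1$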
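The Softmax normalization of S2-7 and the weighted summation of S2-8 can be sketched together. The function names and the toy scores and vectors are assumptions; subtracting the maximum score before exponentiating is a common numerical-stability step that does not change the result.

```python
import math


def softmax(scores):
    """alpha_i = exp(e_i) / sum_j exp(e_j), computed stably."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    """s_k = sum_i alpha_ik * c_i, computed component-wise."""
    dim = len(vectors[0])
    return [sum(w * vec[j] for w, vec in zip(weights, vectors))
            for j in range(dim)]


# Equal alignment scores give equal attention weights.
alpha = softmax([0.0, 0.0])
s_k = weighted_sum(alpha, [[1.0, 2.0], [3.0, 4.0]])
print(alpha)  # [0.5, 0.5]
print(s_k)    # [2.0, 3.0]
```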

[0062] S2-9 includes the following steps:

[0063] S2-9-1, Obtain the gating vector $G$ as: $G = \sigma(W_g s_k + b_g)$

[0064] Wherein, $W_g$ is the weight matrix of the gating mechanism, used to map the feature vector $s_k$ to the gate space; $b_g$ is the bias term of the gating mechanism, used to control its offset. The weight matrix $W_g$ and the bias term $b_g$ apply a linear transformation to the feature vector $s_k$; $\sigma$ is the sigmoid activation function, which maps input values to the range $[0, 1]$. The gating vector $G$ determines how much of each element of $s_k$ passes through. Each element of $G$ lies in $[0, 1]$: the closer to 1, the more important the corresponding feature; the closer to 0, the lower its contribution.

[0065] S2-9-2, using the gating vector $G$ to filter the feature vector $s_k$ by element-wise multiplication, obtaining the filtered feature vector $G \odot s_k$.

[0066] Wherein, each element $G_i$ of the gating vector $G$ controls whether the corresponding element $s_{ki}$ of $s_k$ is preserved or suppressed; $\odot$ denotes element-wise multiplication of the corresponding elements of $G$ and $s_k$; and the filtered feature vector contains the information that is meaningful under the relation vector $r_k$.
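The gating step reduces to a sigmoid of a linear map followed by an element-wise product. A minimal sketch, with a zero weight matrix and zero bias chosen only so the gate values are easy to check by hand (all parameter values are assumptions):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relation_gate(s_k, W_g, b_g):
    """Return (G, filtered) with G = sigmoid(W_g s_k + b_g) and filtered = G * s_k element-wise."""
    pre = [sum(w * s for w, s in zip(row, s_k)) + b
           for row, b in zip(W_g, b_g)]
    G = [sigmoid(p) for p in pre]
    filtered = [g * s for g, s in zip(G, s_k)]
    return G, filtered


s_k = [2.0, -4.0]
W_g = [[0.0, 0.0], [0.0, 0.0]]  # placeholder weights: every gate sits at sigmoid(0)
b_g = [0.0, 0.0]
G, filtered = relation_gate(s_k, W_g, b_g)
print(G)         # [0.5, 0.5]
print(filtered)  # [1.0, -2.0]
```

With learned parameters, gate values near 1 pass a feature through almost unchanged and values near 0 suppress it.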

[0067] In S2-10, the hidden state corresponding to word $w_i$ in the text sequence $S$ is obtained from the hidden layer of the Bi-LSTM network.

[0068] This hidden state contains the context information of word $w_i$ in $S$ under the relation vector $r_k$. It lies in the hidden state space, whose dimension (the length of the hidden state) is determined by the parameters of the hidden layer of the Bi-LSTM network.

[0069] In S2-11, the label of the $i$-th word $w_i$ in the text sequence $S$ is obtained as follows:

[0070] In the entity annotation layer, the Softmax function transforms the hidden state into the probability distribution $[p_1, p_2, \dots, p_q]$ over the entity categories: $[p_1, p_2, \dots, p_q] = \mathrm{Softmax}(W_e h + b_e)$

[0071] In the formula, $h$ denotes the hidden state of word $w_i$ and $q$ is the number of entity categories. A linear transformation of the hidden state gives the raw score for each entity category, and the Softmax function converts the raw scores into the probability distribution $[p_1, p_2, \dots, p_q]$, in which each element is the probability that $w_i$ belongs to one entity category. $W_e$ is the weight matrix of the entity annotation layer, and $b_e$ is the bias term;

[0072] The entity category with the maximum probability in the distribution is taken as the entity category label $y_i$ of word $w_i$: $y_i = \arg\max(p_1, p_2, \dots, p_q)$

[0073] In the formula, $\arg\max(p_1, p_2, \dots, p_q)$ returns the label of the entity category corresponding to the maximum probability.
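The labeling layer can be sketched as a linear map, a Softmax, and an argmax. The label set below uses the tags named in the head and tail identification step; the weights, bias, and hidden state are hand-picked assumptions rather than trained parameters.

```python
import math

LABELS = ["O", "B-Head", "I-Head", "B-Tail", "I-Tail"]


def label_word(hidden, W_e, b_e, labels=LABELS):
    """Return (label, probabilities) with probabilities = Softmax(W_e h + b_e)."""
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W_e, b_e)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return labels[best], probs


hidden = [1.0, 0.0]               # hypothetical hidden state of one word
W_e = [[0.0, 0.0], [2.0, 0.0],    # one row of weights per entity category
       [0.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
b_e = [0.0] * 5
y_i, probs = label_word(hidden, W_e, b_e)
print(y_i)  # B-Head
```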

[0074] Preferably, in S4-2, $d_{min} = 5$; in S4-3, the threshold is 0.2.

[0075] The present invention has the following beneficial effects:

[0076] 1. This invention integrates railway bridge design data from various sources, including technical specifications, industry standards, and historical documents, to form a comprehensive knowledge graph. This graph construction method achieves seamless integration of data from different sources, resulting in more comprehensive information coverage and richer relationships within the graph, thus ensuring its comprehensiveness.

[0077] 2. This invention utilizes a dynamic topology optimization algorithm to adjust the structure of the knowledge graph, optimizes the layout of nodes and edges, effectively reduces data redundancy, improves the query efficiency of the graph, and enhances the overall quality of data management.

[0078] 3. This invention automatically identifies and deletes redundant nodes and duplicate information through topology optimization, selectively retaining nodes with high retention rates to ensure that the knowledge graph contains only information with practical significance and query value, thereby increasing data reliability and ensuring the simplicity and readability of the graph; by automatically adjusting the weights and directions of edges, the query path of the knowledge graph becomes simpler and more efficient.

Attached Figure Description

[0079] Figure 1 is a schematic diagram of the process of the present invention;

[0080] Figure 2 is a flowchart illustrating step S2 of the present invention;

[0081] Figure 3 is a schematic diagram of the knowledge graph KG of the present invention.

Detailed Implementation

[0082] The construction method of the present invention will be described in detail below with reference to the accompanying drawings and embodiments.

[0083] Referring to Figures 1-3, the present invention provides a method for rapidly constructing a knowledge graph of railway bridge design standards, which specifically includes the following steps:

[0084] S1. Data Collection and Preprocessing: Collect data from standard documents related to railway bridge design; remove redundant and noisy data from the collected data using data cleaning techniques to obtain processed data. Among these:

[0085] The processed data contains multiple text sequences, where a text sequence $S = \{w_1, w_2, \dots, w_n\}$ and $w_i$ is the $i$-th word in $S$, $i = 1, 2, \dots, n$. Data cleaning techniques improve the quality of the original data, ensuring its accuracy and consistency.

[0086] The standard documents related to railway bridge design include structural design specifications, material standards, construction specifications, testing and maintenance standards, and environmental and sustainability standards. Data cleaning techniques include deleting duplicate records, processing missing values, correcting errors, and standardizing data formats.
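Part of the cleaning described here (standardizing formats, dropping missing entries, and deleting duplicates) can be sketched with the standard library. The function name `clean_records` and the sample records are made up for illustration, and correcting erroneous information is out of scope for this sketch because it needs domain rules the text does not give.

```python
import re
import unicodedata


def clean_records(records):
    """Normalise text, drop missing or empty entries, and remove duplicates in order."""
    seen = set()
    cleaned = []
    for text in records:
        if text is None:  # missing value
            continue
        text = unicodedata.normalize("NFKC", text)  # e.g. full-width to half-width
        text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
        if not text:
            continue
        key = text.casefold()                       # case-insensitive duplicate check
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


# Hypothetical raw records, not taken from any real standard.
raw = [
    "Clause A:  Pier concrete grade C40",
    "clause a: pier concrete grade Ｃ４０ ",
    None,
    "   ",
    "Clause B: Bearing inspection interval",
]
records = clean_records(raw)
print(records)
# ['Clause A: Pier concrete grade C40', 'Clause B: Bearing inspection interval']
```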

[0087] S2, using a BERT pre-trained model and a Bi-LSTM model to perform deep semantic analysis on the processed data, extracting key elements, technical parameters, and interrelationships of the text sequence S in the processed data (design standard), transforming the processed data into structured data with high information density. This includes the following steps:

[0088] S2-1, Extracting the embedding vector: A text sequence $S$ from the processed data is embedded using a BERT pre-trained model to obtain the embedding vector $X = \{x_1, x_2, \dots, x_n\}$;

[0089] Wherein, $x_i$ is the word vector of the $i$-th word $w_i$ in $S$. The word embedding vector represents the basic semantic information of $w_i$ and is generated by the word embedding matrix $W_s$ in the BERT pre-trained model; the sentence embedding vector represents the role of $w_i$ in $S$ and is generated by the multi-layer Transformer encoder in the BERT pre-trained model; the position embedding vector represents the position of $w_i$ in $S$ and is generated by the position embedding matrix $W_p$ in the BERT pre-trained model;

[0090] S2-2, Extract the contextual features corresponding to each word in the text sequence S based on the embedding vector X:

[0091] S2-2-1, Traverse the embedding vector $X = \{x_1, x_2, \dots, x_n\}$, using the word vector of each word as the input to a mapping function to calculate the initial context feature of each word in $S$, resulting in the initial context feature set $H = \{h_0(w_1), h_0(w_2), \dots, h_0(w_n)\}$;

[0092] Wherein, the initial context feature $h_0(w_i)$ of the $i$-th word $w_i$ in $S$ is calculated as: $h_0(w_i) = f(x_i)$

[0093] In the formula, $f$ is the mapping function used to process the embedding vector $x_i$ of word $w_i$, $i \in [1, n]$.

[0094] S2-2-2, Traverse the initial context feature set $H = \{h_0(w_1), h_0(w_2), \dots, h_0(w_n)\}$, inputting the initial context feature of each word in $H$ into the pre-trained BERT model. The $N$-layer Transformer encoder of the BERT model outputs the context feature of each word, forming the context feature set $H_S = \{h_N(w_1), h_N(w_2), \dots, h_N(w_n)\}$.

[0095] The context feature $h_N(w_i)$ of the $i$-th word $w_i$ in $S$ is obtained as follows: the initial context feature $h_0(w_i)$ is input into the first layer of the $N$-layer Transformer encoder. Each layer of the Transformer encoder performs self-attention calculation and nonlinear transformation on the output of the previous layer to obtain the context feature output by the current layer, and the context feature output by the $N$-th layer is denoted $h_N(w_i)$. The context feature output by the $l$-th layer of the Transformer encoder is $h_l(w_i)$: $h_l(w_i) = \mathrm{Trans}(h_{l-1}(w_i)),\ l \in [1, N]$

[0096] In the formula, $\mathrm{Trans}(\cdot)$ represents the Transformer encoding operation, including self-attention computation and nonlinear transformation, used to update contextual features; $h_{l-1}(w_i)$ represents the contextual feature output by the $(l-1)$-th layer of the Transformer encoder.

[0097] S2-3, Extracting final contextual features based on the Bi-LSTM model:

[0098] Traverse the set H_S = {h_N(w_1), h_N(w_2), …, h_N(w_n)} and use each context feature in H_S as input to the Bi-LSTM model to capture the bidirectional contextual dependencies in the text sequence S, thereby obtaining the final context feature of each word in S and forming the final context feature sequence C_S = {c_1, c_2, …, c_n}.

[0099] The Bi-LSTM model extracts the final context feature c_i of the word w_i as follows: h_N(w_i) is the input to the Bi-LSTM model. The forward LSTM and the backward LSTM of the Bi-LSTM model each compute the contextual dependencies of h_N(w_i) at its position in the text sequence S, giving the hidden state h_i^fwd output by the forward LSTM and the hidden state h_i^bwd output by the backward LSTM. The two hidden states are concatenated to give the final context feature c_i of the word w_i: c_i = [h_i^fwd ; h_i^bwd]

[0100] In the formula, i ∈ [1, n] and c_i ∈ ℝ^(2×d_he), where ℝ is the set of all real numbers and 2×d_he is the dimension of the final context feature c_i.

[0101] S2-4, Calculate the average of the final context feature sequence C_S to obtain the average vector s_g, which represents the overall semantic features of the text sequence S: s_g = avg{c_1, c_2, …, c_n}, where 2×d_he is the dimension of the average vector s_g.

[0102] S2-5, Construct the relation set R:

[0103] While the BERT pre-trained model processes text sequences, entities are identified in each text sequence, semantic relationships between different entities are inferred from contextual information, and each semantic relationship is embedded into a relation vector space to obtain a relation vector, which is added to the relation set R. This gives the relation set R = {r_1, r_2, …, r_k, …, r_t}, where t is the total number of relation vectors in R, the relation vector r_k represents a specific relation type or semantic relationship in the text sequence S, and d_r is the dimension of the relation vector space.

[0104] The relation set R expands continuously as new relation vectors are discovered. Whenever a pre-trained BERT model discovers an unrecognized relation vector while processing text, it adds this new relation vector to the relation set R, thus maintaining the dynamic expansion of the knowledge graph.

[0105] S2-6, based on the final context feature sequence C_S and the average vector s_g, an attention mechanism calculates, through linear mapping and nonlinear transformation, the attention alignment score of each word in the text sequence S under the relation vector r_k, forming the attention alignment score set {e_1k, e_2k, …, e_nk}. The attention alignment score e_ik measures the weight of the final context feature c_i under the relation vector r_k and is obtained as: e_ik = v^T tanh(W_r r_k + W_g s_g + W_h c_i)

[0106] In the formula, v is the attention vector; the linear mapping matrix W_r maps the relation vector r_k to the attention space, W_g maps the average vector s_g to the attention space, and W_h maps the final context feature c_i to the attention space. The dimension of the attention vector v is d_att, the dimension of the attention space; W_g and W_h act on inputs of dimension 2×d_he, the dimension of the average vector s_g.

[0107] S2-7, the Softmax function normalizes the attention alignment score set {e_1k, e_2k, …, e_nk} to obtain the attention weight of each word under the current target relation r_k, forming the attention weight set {α_1k, α_2k, …, α_nk};

[0108] The attention weight α_ik of the word w_i in the text sequence S is obtained as: α_ik = exp(e_ik) / Σ_{j=1}^{n} exp(e_jk)

[0109] The attention weight set {α_1k, α_2k, …, α_nk} satisfies: Σ_{i=1}^{n} α_ik = 1

[0110] S2-8, the attention weight set {α_1k, α_2k, …, α_nk} is used to perform a weighted summation over the final context feature sequence C_S = {c_1, c_2, …, c_n}, giving the feature vector s_k of the text sequence S under the relation vector r_k: s_k = Σ_{i=1}^{n} α_ik c_i

[0111] The semantic information of the text sequence S that is specific to the relation vector r_k is thereby integrated into the feature vector s_k; n is the total number of words in the text sequence S, and d is the feature dimension of each word;
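Steps S2-4 and S2-6 to S2-8 can be illustrated with a minimal, standard-library sketch. The toy matrices, the helper names (`matvec`, `mean_vector`, `alignment_scores`, `relation_feature`) and all numeric values are assumptions chosen for illustration and do not come from the disclosure; only the formulas for s_g, e_ik, the Softmax normalization and the weighted sum follow the text.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean_vector(C):
    """S2-4: element-wise average s_g of the final context features."""
    n = len(C)
    return [sum(col) / n for col in zip(*C)]

def alignment_scores(C, r_k, s_g, W_r, W_g, W_h, v):
    """S2-6: e_ik = v^T tanh(W_r r_k + W_g s_g + W_h c_i) for every word i."""
    base = [a + b for a, b in zip(matvec(W_r, r_k), matvec(W_g, s_g))]
    scores = []
    for c_i in C:
        z = [math.tanh(b + h) for b, h in zip(base, matvec(W_h, c_i))]
        scores.append(sum(vj * zj for vj, zj in zip(v, z)))
    return scores

def softmax(scores):
    """S2-7: normalize scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in scores]
    total = sum(exps)
    return [x / total for x in exps]

def relation_feature(C, alpha):
    """S2-8: s_k = sum_i alpha_ik * c_i."""
    return [sum(a * c[j] for a, c in zip(alpha, C)) for j in range(len(C[0]))]

# Made-up toy data: 3 words, 2-dimensional features, attention space of size 2.
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
r_k = [0.5, -0.5]
W_r = [[1.0, 0.0], [0.0, 1.0]]
W_g = [[0.5, 0.0], [0.0, 0.5]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

s_g = mean_vector(C)
alpha = softmax(alignment_scores(C, r_k, s_g, W_r, W_g, W_h, v))
s_k = relation_feature(C, alpha)
```

In a trained model the matrices W_r, W_g, W_h and the vector v would be learned parameters; here they are fixed by hand so that the arithmetic can be followed.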

[0112] S2-9, the feature vector s_k is input into the relation gating unit, which computes the gating vector G from s_k through an activation function; the gating vector G is then used to filter the feature vector s_k, giving the filtered feature vector. The gating vector G filters out feature information irrelevant to the relation vector r_k and retains the features most useful for entity annotation. The activation function is Sigmoid. The process includes the following steps:

[0113] S2-9-1, Obtain the gating vector G as follows: G = σ(W_g s_k + b_g)

[0114] Here W_g is the weight matrix of the gating mechanism, which maps the feature vector s_k to the gate space, and b_g is the bias term of the gating mechanism, which controls its offset. Together, W_g and b_g apply the linear transformation of the gating mechanism to the feature vector s_k; σ is the Sigmoid activation function, which maps input values to the range [0,1]. The gating vector G determines how much of each element of the feature vector s_k passes through. Each element of G lies in [0,1]: the closer to 1, the more important the corresponding feature; the closer to 0, the lower its contribution.

[0115] S2-9-2, the gating vector G filters the feature vector s_k by element-wise multiplication, giving the filtered feature vector s̃_k: s̃_k = G ⊙ s_k

[0116] Each element G_i of the gating vector G controls whether the corresponding element s_ki of the feature vector s_k is preserved or suppressed; ⊙ denotes element-wise multiplication of the corresponding elements of G and s_k. The filtered feature vector s̃_k contains the information that is meaningful under the relation vector r_k.
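The relation gating of S2-9 can be sketched in a few lines. This is only an illustration under stated assumptions: the gate weight matrix is taken to be square so that G and s_k have the same length, and the numeric values of `W_g`, `b_g` and `s_k` are invented.

```python
import math

def sigmoid(x):
    """Logistic function; its output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def gate(s_k, W_g, b_g):
    """S2-9-1: G = sigmoid(W_g s_k + b_g), one gate value per feature."""
    return [sigmoid(sum(w * s for w, s in zip(row, s_k)) + b)
            for row, b in zip(W_g, b_g)]

def apply_gate(G, s_k):
    """S2-9-2: element-wise product G * s_k gives the filtered feature vector."""
    return [g * s for g, s in zip(G, s_k)]

# Made-up values: the first feature is passed almost unchanged,
# the second is halved, the third is almost fully suppressed.
s_k = [2.0, -1.0, 0.5]
W_g = [[4.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, -8.0]]
b_g = [0.0, 0.0, 0.0]

G = gate(s_k, W_g, b_g)
filtered = apply_gate(G, s_k)
```

Gate values near 1 let a feature through and values near 0 suppress it, which matches the role the text assigns to G.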

[0117] S2-10, each element of the filtered feature vector is input into a hidden layer of a Bi-LSTM network, which outputs the hidden state corresponding to each element, i.e., the hidden state corresponding to each word in the text sequence S, giving the hidden state set.

[0118] The hidden state corresponding to the word w_i in the text sequence S is obtained as follows:

[0119] In the formula, the hidden state contains the context information of the word w_i under the relation vector r_k in the text sequence S and lies in the hidden state space; `hidden` denotes the length of the hidden state, which is determined by the parameters of the hidden layer of the Bi-LSTM network.

[0120] S2-11, each hidden state in the hidden state set is input into the entity labeling layer of the Bi-LSTM network. The entity labeling layer outputs the labels of all words in the text sequence S, forming the label sequence {y_1, y_2, …, y_n}.

[0121] The label of the i-th word w_i in the text sequence S is obtained as follows:

[0122] In the entity labeling layer, the Softmax function transforms the hidden state h_i of the word w_i into a probability distribution [p_1, p_2, …, p_q] over the entity categories: [p_1, p_2, …, p_q] = Softmax(W_e h_i + b_e)

[0123] In the formula, q is the number of entity categories. A linear transformation of the hidden state h_i gives the raw score of each entity category, and the Softmax function converts these raw scores into the probability distribution [p_1, p_2, …, p_q], each element of which is the probability that the word w_i belongs to one entity category. W_e is the weight matrix of the entity labeling layer and b_e is its bias term.

[0124] The entity category with the maximum probability in the distribution is taken as the entity category label y_i of the word w_i: y_i = argmax(p_1, p_2, …, p_q)

[0125] In the formula, argmax(p_1, p_2, …, p_q) returns the label of the entity category with the maximum probability.
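The labeling rule of S2-11 (a linear layer, Softmax, then argmax) can be sketched as follows. The label inventory `LABELS` reuses the tag names that appear in S2-12, which is an assumption about how the q entity categories are named; the weights and the hidden state are invented.

```python
import math

# Assumed tag inventory (q = 5), taken from the tag names used in S2-12.
LABELS = ["O", "B-Head", "I-Head", "B-Tail", "I-Tail"]

def label_word(h_i, W_e, b_e):
    """Return (label, probabilities) for one hidden state h_i."""
    # Linear transformation: one raw score per entity category.
    scores = [sum(w * h for w, h in zip(row, h_i)) + b
              for row, b in zip(W_e, b_e)]
    # Softmax turns raw scores into a probability distribution.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # argmax picks the category with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Made-up hidden state and layer parameters.
h_i = [1.0, 0.0]
W_e = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, 0.0]]
b_e = [0.0, 0.0, 0.0, 0.0, 0.0]

label, probs = label_word(h_i, W_e, b_e)
```

With these values the raw scores are [0, 2, 0, -1, 0], so the second category receives the highest probability.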

[0126] S2-12, based on the label sequence {y_1, y_2, …, y_n}, determine the head and tail entities related to the relation vector r_k in the text sequence S, and annotate the label sequence {y_1, y_2, …, y_n} with the determined head and tail entities. The head and tail entities are identified as follows:

[0127] To identify the head entity, find the first occurrence of B-Head in the label sequence {y_1, y_2, …, y_n} as the starting position of the head entity and continue, including any subsequent I-Head labels (continuation of the head entity), until an O (non-entity) label or the tail entity's starting label B-Tail is encountered. To identify the tail entity, find the first B-Tail in the label sequence {y_1, y_2, …, y_n} as its starting position and continue, including any subsequent I-Tail labels (continuation of the tail entity), until an O (non-entity) label or a B-Head label is encountered.
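The head and tail identification rule of S2-12 can be sketched as a scan over the label sequence. The function names and the example sentence are illustrative assumptions; the sketch also ends a span at any label other than the matching I- tag, which is slightly broader than the stopping conditions listed in the text.

```python
def find_span(labels, begin, inside):
    """Return (start, end) of the first span opened by `begin`; end is exclusive."""
    try:
        start = labels.index(begin)
    except ValueError:
        return None  # no entity of this kind in the sequence
    end = start + 1
    # Extend the span over the continuation tags that follow the begin tag.
    while end < len(labels) and labels[end] == inside:
        end += 1
    return start, end

def head_and_tail(words, labels):
    """Extract the head and tail entity strings from a labeled sentence."""
    def pick(span):
        return " ".join(words[span[0]:span[1]]) if span else None
    head = find_span(labels, "B-Head", "I-Head")
    tail = find_span(labels, "B-Tail", "I-Tail")
    return pick(head), pick(tail)

# Invented example sentence and labels.
words = ["Pier", "spacing", "shall", "not", "exceed", "32", "m"]
labels = ["B-Head", "I-Head", "O", "O", "O", "B-Tail", "I-Tail"]

head, tail = head_and_tail(words, labels)
```

For this input the head entity is "Pier spacing" and the tail entity is "32 m".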

[0128] S3 utilizes Conditional Random Fields (CRF) and Graph Neural Networks (GNN) techniques to identify and label entities, attributes, and relationships within the design criteria, forming nodes and edges of the knowledge graph to construct a detailed knowledge graph. The specific steps are as follows:

[0129] S3-1, based on the label sequence {y_1, y_2, …, y_n} annotated with head and tail entities, add all identified head and tail entities to the entity set, which contains all entities related to the relation vector r_k.

[0130] S3-2, Traverse the relation set R = {r_1, r_2, …, r_k, …, r_t}, each time selecting one relation vector from R and repeating steps S2-6 to S3-1, to obtain the entity set E = {E_1, E_2, …, E_u} associated with the relation vectors in R, where E_i is an entity and u is the total number of entities.

[0131] S3-3, analyze the attributes of each entity in the entity set E to obtain the attribute set of each entity, and traverse the entity set E to obtain the attribute set A = {A_1, A_2, …, A_u} of all entities, where A_i = {a_1, a_2, …, a_m} is the attribute set of entity E_i.

[0132] The attribute set A_i = {a_1, a_2, …, a_m} of entity E_i is obtained as follows:

[0133] (1) Extract from the text sequence S the two words before and the two words after entity E_i to form the context word set N_i = {w_{i-2}, w_{i-1}, E_i, w_{i+1}, w_{i+2}}, where w_{i-2}, w_{i-1} are the two words preceding entity E_i and w_{i+1}, w_{i+2} are the two words following it;

[0134] (2) Using a Conditional Random Field (CRF) model, compute the conditional probability distribution of the attribute set A_i of entity E_i given the context word set N_i, and use it to generate the attribute set A_i = {a_1, a_2, …, a_m}. Each attribute in A_i is a specific characteristic of entity E_i. The conditional random field model is: P(A_i | N_i) = (1 / Z(N_i)) ∏_m ψ_m(A_i, N_i)

[0135] Here Z(N_i) is a normalization factor that makes the probabilities of all attributes sum to 1; ψ_m(A_i, N_i) is a feature function that captures the relationship between the context N_i and the attribute set A_i; ∏ denotes the product over all feature functions.
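The two sub-steps of S3-3 can be illustrated with a simplified sketch: a context window around the entity, followed by a normalized product of feature functions. This is an assumption-laden simplification in which each candidate attribute is scored on its own rather than jointly as a set, and the feature functions `psi_unit` and `psi_keyword`, the candidate list and the sentence are hypothetical.

```python
def context_window(words, i, size=2):
    """Step (1): the entity word plus up to `size` words on each side."""
    return words[max(0, i - size): i + size + 1]

def crf_distribution(candidates, context, feature_fns):
    """Step (2): P(a | context) = prod_m psi_m(a, context) / Z(context)."""
    unnormalized = {}
    for a in candidates:
        score = 1.0
        for psi in feature_fns:
            score *= psi(a, context)
        unnormalized[a] = score
    Z = sum(unnormalized.values())  # normalization factor
    return {a: s / Z for a, s in unnormalized.items()}

# Hypothetical positive feature functions (potentials).
def psi_unit(attribute, context):
    return 3.0 if attribute == "length" and "m" in context else 1.0

def psi_keyword(attribute, context):
    return 2.0 if attribute == "load" and "kN" in context else 1.0

# Invented sentence; the entity "span" sits at index 1.
words = ["main", "span", "32", "m", "long"]
ctx = context_window(words, 1)
dist = crf_distribution(["length", "load", "material"], ctx,
                        [psi_unit, psi_keyword])
```

Here the unit "m" in the window raises the potential for "length" to 3 against 1 for each other candidate, so its normalized probability is 3/5.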

[0136] S3-4, the entity set E = {E_1, E_2, …, E_u} and the attribute set A = {A_1, A_2, …, A_u} together form the node set of the knowledge graph, N = {E_1, E_2, …, E_u, A_1, A_2, …, A_u}.

[0137] S3-5: Traverse the node set N and add feature information (including node ID, type, and description) to each node of the knowledge graph to quickly locate and identify nodes during knowledge graph queries. Use the vectorization technique Word2Vec to transform each node in the node set N into a vector, obtaining a vector set V(N), which facilitates efficient storage and retrieval in the graph database later.

[0138] The vector set V(N) = {v(E_1), v(E_2), …, v(E_u), v(A_1), v(A_2), …, v(A_u)}, where v(E_i) and v(A_i) are the vector representations of entity E_i and attribute set A_i.

[0139] S3-6, based on the entity set E = {E_1, E_2, …, E_u} and the relation set R = {r_1, r_2, …, r_t}, edges are constructed between nodes for entity pairs (head entity and tail entity). These edges describe the relationships between entities and attributes as well as the relationships between entities themselves. Each relation vector in R is used as the representation of an edge, and each edge is vectorized to obtain the set V(R) = {v(r_1), v(r_2), …, v(r_t)}. The vector set V(N) and the set V(R) form the complete knowledge graph KG = (V(N), V(R)).
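The node and edge construction of S3-4 and S3-6 can be sketched with plain Python containers. Vectorization (Word2Vec) is left out, the relation name `has_attribute` is a hypothetical label for entity-to-attribute edges, and the sample triple and attribute are invented.

```python
def build_kg(triples, attributes):
    """Assemble a node set and an edge list.

    triples:    iterable of (head_entity, relation, tail_entity)
    attributes: dict mapping an entity to a list of its attributes
    """
    nodes = set()
    edges = []
    # Entity-to-entity edges come from the (head, relation, tail) triples.
    for head, relation, tail in triples:
        nodes.update((head, tail))
        edges.append((head, relation, tail))
    # Entity-to-attribute edges; "has_attribute" is an assumed relation name.
    for entity, attrs in attributes.items():
        nodes.add(entity)
        for attr in attrs:
            nodes.add(attr)
            edges.append((entity, "has_attribute", attr))
    return nodes, edges

# Invented sample data.
triples = [("pier", "supports", "girder")]
attributes = {"girder": ["span length"]}

kg_nodes, kg_edges = build_kg(triples, attributes)
```

A production system would store these nodes and edges, together with their vectors, in a graph database, as the description indicates.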

[0140] S3-7, Perform S2 to S3-6 on all text sequences in the processed data described in S1 to obtain the complete knowledge graph KG.

[0141] The construction of nodes in the complete knowledge graph (KG) ensures that all extracted information is included, while the construction of edges effectively connects this information, making the knowledge graph efficient and usable in querying and reasoning.

[0142] S4. The complete knowledge graph (KG) structure is adjusted in real time using a dynamic topology optimization algorithm. This algorithm optimizes the layout of nodes and edges, reduces data redundancy, and automatically handles the addition and modification of nodes in the complete knowledge graph KG, ensuring efficient querying and maintenance of the knowledge graph. The specific steps are as follows:

[0143] S4-1, Perform a preliminary analysis of the complete knowledge graph KG. By calculating the number of connections (degree) of each node, assess the importance of each node in the complete knowledge graph KG and identify potential structural problems. Calculate the weight of each edge based on the degree of the node. The weight of the edge reflects the strength of the relationship between nodes. The lower the weight of the edge, the lower the association strength between nodes in the complete knowledge graph KG.

[0144] The degree d_i of node E_i is calculated as: d_i = Σ_j A_{i,j}

[0145] In the formula, A_{i,j} is an element of the adjacency matrix and indicates the connection between node E_i and node E_j.

[0146] The weight w_ij of the edge E_i → E_j is calculated as: w_ij = f(d_i, d_j)

[0147] The weight w_ij reflects the strength of the relationship between node E_i and node E_j, and f(d_i, d_j) is the weight calculation function.
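The degree and edge-weight calculation of S4-1 can be sketched from an adjacency matrix. The disclosure does not specify the weight calculation function f, so `sym_norm` below (the reciprocal of the square root of the degree product) is purely a hypothetical choice; the graph is also assumed to be undirected, so each edge is listed once.

```python
import math

def degrees(adj):
    """d_i = sum_j A[i][j] for a 0/1 adjacency matrix."""
    return [sum(row) for row in adj]

def edge_weights(adj, f):
    """w_ij = f(d_i, d_j) for every edge i-j with i < j."""
    d = degrees(adj)
    return {(i, j): f(d[i], d[j])
            for i, row in enumerate(adj)
            for j, a in enumerate(row)
            if a and i < j}

def sym_norm(d_i, d_j):
    """Hypothetical weight function; the source only names f(d_i, d_j)."""
    return 1.0 / math.sqrt(d_i * d_j)

# Invented 4-node graph: a triangle 0-1-2 with node 3 attached to node 2.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]

d = degrees(adj)
w = edge_weights(adj, sym_norm)
```

Any other function of the two degrees can be passed as `f` without changing the rest of the sketch.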

[0148] S4-2, Based on the degree of each node and the weight of each edge in the complete knowledge graph KG, formulate a dynamic topology adjustment strategy to optimize the structure of the knowledge graph:

[0149] In the complete knowledge graph KG, nodes that have the same or similar attributes and a degree less than d_min, with d_min = 5, are regarded as redundant nodes. Redundant nodes in the complete knowledge graph KG are merged and deleted, and their connections are transferred to nodes with higher degrees, giving the entity set E′. The degree of each node in the complete knowledge graph KG is then recalculated with the node degree formula. Nodes with a degree of zero are regarded as isolated nodes, giving the isolated node set I, and the isolated nodes in I are removed from the knowledge graph to reduce the complexity of the complete knowledge graph KG.

[0150] Redundant nodes are nodes in the knowledge graph that have the same or similar attributes, that is, entities of the same type or entities carrying duplicate information.

[0151] S4-3, Optimize the edges of the complete knowledge graph KG based on their weights: traverse the set V(R) of all edges in the complete knowledge graph KG and delete the edges whose weight is less than or equal to the threshold of 0.2. If deleting an edge disconnects nodes so that their degree becomes zero, these nodes are regarded as isolated nodes and deleted. Because edge deletion changes the connections between the remaining nodes, the S4 optimization is then re-executed on the knowledge graph KG. Deleting low-weight edges and redundant nodes reduces noise and the complexity of the complete knowledge graph KG, thereby improving query efficiency.
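The edge pruning loop of S4-3, including isolated-node removal and re-execution until nothing changes, can be sketched as below. The merging of redundant nodes from S4-2 is not covered. The weight function `ratio` is a hypothetical stand-in for the unspecified f(d_i, d_j), every edge endpoint is assumed to be present in the node set, and the sample graph is invented.

```python
def optimise(nodes, edges, weight_fn, threshold=0.2):
    """Repeatedly drop weak edges and isolated nodes until the graph is stable."""
    nodes, edges = set(nodes), set(edges)
    changed = True
    while changed:
        # Recompute degrees for the current graph.
        deg = {n: 0 for n in nodes}
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        # Delete edges whose weight is at or below the threshold.
        weak = {(u, v) for (u, v) in edges
                if weight_fn(deg[u], deg[v]) <= threshold}
        edges -= weak
        # Nodes left without any edge are isolated and removed.
        connected = {n for e in edges for n in e}
        isolated = nodes - connected
        nodes -= isolated
        changed = bool(weak or isolated)
    return nodes, edges

def ratio(d_u, d_v):
    """Hypothetical weight function: smaller degree over larger degree."""
    return min(d_u, d_v) / max(d_u, d_v)

# Invented graph: triangle a-b-c, four leaves on c, and one isolated node x.
nodes = {"a", "b", "c", "l1", "l2", "l3", "l4", "x"}
edges = {("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "l1"), ("c", "l2"), ("c", "l3"), ("c", "l4")}

kept_nodes, kept_edges = optimise(nodes, edges, ratio)
```

In the first pass the four leaf edges have weight 1/6 and are deleted, which isolates the leaves; the second pass finds nothing further to remove, so the triangle remains.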

[0152] The performance of the optimized complete knowledge graph KG was verified by evaluating its query performance, data redundancy, and information aggregation.

[0153] Based on benchmark queries, the query response time of the complete knowledge graph KG is measured to verify its query efficiency and maintenance cost. The total time T_q for executing all benchmark queries is calculated as: T_q = Σ_{a=1}^{h} t_a

[0154] Here t_a is the response time of the a-th query and h is the total number of queries. The query performance requirements are: the response time of a simple query is within 1 second, the response time of a complex query does not exceed 5 seconds, the query throughput exceeds 1000 queries per second, and the query response time is stable, with a standard deviation of less than 0.5 seconds.
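The timing criteria can be checked with a short sketch. The response times below are invented, the function name `summarise` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is intended.

```python
import statistics

def summarise(times):
    """Total time T_q = sum of t_a, plus the spread of the response times."""
    return {
        "total": sum(times),
        "max": max(times),
        "stdev": statistics.pstdev(times),  # population standard deviation
    }

# Invented response times in seconds for h = 4 simple benchmark queries.
times = [0.2, 0.4, 0.3, 0.5]
report = summarise(times)

# Criteria quoted in the text for simple queries and for stability.
meets_simple_limit = report["max"] <= 1.0
is_stable = report["stdev"] < 0.5
```

Throughput and complex-query limits would be checked in the same way against their own measurements.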

[0155] The data redundancy of the complete knowledge graph (KG) is controlled as follows: node redundancy and relation redundancy are controlled below 5%, and attribute redundancy is below 10%, thereby reducing storage waste and improving query efficiency.

[0156] The information clustering degree of the complete knowledge graph KG is as follows: the average degree of each node reaches more than 10, the edge clustering degree and information coverage are high, the clustering coefficient is close to 1, and the information of the complete knowledge graph KG is tightly connected and effectively covered.

[0157] In summary, the complete knowledge graph KG meets the set quality standards and supports efficient query and maintenance operations.

Claims

1. A method for rapidly constructing a knowledge graph of railway bridge design standards, characterized in that it includes the following steps:

S1. Data collection and preprocessing: collect data from standard documents related to railway bridge design; remove redundant and noisy data from the collected data using data cleaning techniques to obtain processed data. The processed data contains multiple text sequences, the first text sequence being S = {w_1, w_2, …, w_n}, where w_i is the i-th word in the text sequence S, i = 1, 2, 3, …, n;

S2, using a BERT pre-trained model and a Bi-LSTM model, transform the text sequence S into structured data with high information density, including:

S2-1, Extracting the embedding vector: the text sequence S is embedded using the BERT pre-trained model to obtain the embedding vector X = {x_1, x_2, …, x_n} of the text sequence S, where x_i is the word vector of the i-th word w_i in the text sequence S;

S2-2, Extract the contextual features corresponding to each word in the text sequence S based on the embedding vector X:

S2-2-1, Traverse the embedding vector X = {x_1, x_2, …, x_n}, using the word vector of each word as the input to the mapping function to calculate the initial context feature of each word in the text sequence S, giving the initial context feature set H = {h_0(w_1), h_0(w_2), …, h_0(w_n)};

S2-2-2, Traverse the initial context feature set H = {h_0(w_1), h_0(w_2), …, h_0(w_n)} and input the initial context feature of each word in H into the BERT pre-trained model. The N-layer Transformer encoder of the BERT model outputs the context feature of each word, forming the context feature set H_S = {h_N(w_1), h_N(w_2), …, h_N(w_n)};

S2-3, Extracting final contextual features based on the Bi-LSTM model: traverse the set H_S = {h_N(w_1), h_N(w_2), …, h_N(w_n)} and use each context feature in H_S as input to the Bi-LSTM model to capture the bidirectional contextual dependencies in the text sequence S, thereby obtaining the final context feature of each word in S and forming the final context feature sequence C_S = {c_1, c_2, …, c_n};

S2-4, Calculate the average of the final context feature sequence C_S to obtain the average vector s_g, which represents the overall semantic features of the text sequence S: s_g = avg{c_1, c_2, …, c_n};

S2-5, Construct the relation set R: while the BERT pre-trained model processes text sequences, entities are identified in each text sequence, semantic relationships between different entities are inferred from contextual information, and each semantic relationship is embedded into a relation vector space to obtain a relation vector, which is added to the relation set R, giving the relation set R = {r_1, r_2, …, r_k, …, r_t}, where t is the total number of relation vectors in R and the relation vector r_k represents a specific relation type or semantic connection in the text sequence S;

S2-6, based on the final context feature sequence C_S and the average vector s_g, an attention mechanism calculates, through linear mapping and nonlinear transformation, the attention alignment score of each word in the text sequence S under the relation vector r_k, forming the attention alignment score set {e_1k, e_2k, …, e_nk}, where the attention alignment score e_ik measures the weight of the final context feature c_i under the relation vector r_k;

S2-7, the Softmax function normalizes the attention alignment score set {e_1k, e_2k, …, e_nk} to obtain the attention weight of each word under the current target relation r_k, forming the attention weight set {α_1k, α_2k, …, α_nk};

S2-8, the attention weight set {α_1k, α_2k, …, α_nk} is used to perform a weighted summation over the final context feature sequence C_S = {c_1, c_2, …, c_n}, giving the feature vector s_k of the text sequence S under the relation vector r_k: s_k = Σ_{i=1}^{n} α_ik c_i, wherein the semantic information of the text sequence S that is specific to the relation vector r_k is integrated into the feature vector s_k, n is the total number of words in the text sequence S, and d is the feature dimension of each word;

S2-9, the feature vector s_k is input into the relation gating unit, which computes the gating vector G from s_k through an activation function; the gating vector G is then used to filter the feature vector s_k, giving the filtered feature vector, wherein the gating vector G filters out feature information irrelevant to the relation vector r_k while retaining the features most useful for entity annotation, and the activation function is Sigmoid;

S2-10, each element of the filtered feature vector is input into a hidden layer of a Bi-LSTM network, which outputs the hidden state corresponding to each element, i.e., the hidden state corresponding to each word in the text sequence S, giving the hidden state set;

S2-11, each hidden state in the hidden state set is input into the entity labeling layer of the Bi-LSTM network, and the entity labeling layer outputs the labels of all words in the text sequence S, forming the label sequence {y_1, y_2, …, y_n};

S2-12, based on the label sequence {y_1, y_2, …, y_n}, determine the head and tail entities related to the relation vector r_k in the text sequence S, and annotate the label sequence {y_1, y_2, …, y_n} with the determined head and tail entities;

S3, use Conditional Random Field (CRF) and Graph Neural Network (GNN) techniques to identify and label entities, attributes, and relationships in the design standards, forming the nodes and edges of a knowledge graph to construct a complete knowledge graph KG;

S4, the structure of the complete knowledge graph KG is adjusted in real time using a dynamic topology optimization algorithm, with the following specific steps:

S4-1, Perform a preliminary analysis of the complete knowledge graph KG: calculate the degree of each node to evaluate the importance of each node in the complete knowledge graph KG, and calculate the weight of each edge based on the node degrees, where the weight of an edge reflects the strength of the relationship between its nodes;

S4-2, Based on the degree of each node and the weight of each edge in the complete knowledge graph KG, formulate a dynamic topology adjustment strategy to optimize the structure of the knowledge graph: in the complete knowledge graph KG, nodes that have the same or similar attributes and a degree less than d_min are regarded as redundant nodes; the redundant nodes are merged and deleted, and their connections are transferred to nodes with higher degrees, giving the entity set E′; the degree of each node in the complete knowledge graph KG is then recalculated with the node degree formula; nodes with a degree of zero are regarded as isolated nodes, giving the isolated node set I, and the isolated nodes in I are removed from the knowledge graph;

S4-3, Optimize the edges of the complete knowledge graph KG based on their weights: traverse the set V(R) of all edges in the complete knowledge graph KG and delete the edges whose weight is less than or equal to the threshold; if deleting an edge disconnects nodes so that their degree becomes zero, these nodes are regarded as isolated nodes and deleted; because edge deletion changes the connections between the remaining nodes, the S4 optimization is then performed again on the knowledge graph KG.

2. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: In S1, the standard documents related to railway bridge design include structural design specifications, material standards, construction specifications, testing and maintenance standards, and environmental and sustainability standards. Data cleaning techniques include deleting duplicate records, processing missing values, correcting erroneous information, and standardizing data formats.

3. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: In S2-2-2, the context feature h_N(w_i) of the i-th word w_i in the text sequence S is obtained as follows: the initial context feature h_0(w_i) is fed into the first layer of the N-layer Transformer encoder. Each layer of the Transformer encoder performs self-attention calculation and nonlinear transformation on the output of the previous layer to produce the context feature of the current layer, and the output of the N-th layer is denoted h_N(w_i). The context feature h_l(w_i) output by the l-th layer is: h_l(w_i) = Trans(h_{l-1}(w_i)), l ∈ [1, N]. In the formula, Trans() represents the Transformer encoding operation, including self-attention computation and nonlinear transformation, used to update contextual features; h_{l-1}(w_i) represents the contextual feature output by the (l-1)-th layer of the Transformer encoder.

4. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: In S2-3, the Bi-LSTM model extracts the final context feature c_i of the word w_i as follows: h_N(w_i) is the input to the Bi-LSTM model. The forward LSTM and the backward LSTM of the Bi-LSTM model each compute the contextual dependencies of h_N(w_i) at its position in the text sequence S, giving the hidden state h_i^fwd output by the forward LSTM and the hidden state h_i^bwd output by the backward LSTM. The two hidden states are concatenated to give the final context feature c_i of the word w_i: c_i = [h_i^fwd ; h_i^bwd]. In the formula, i ∈ [1, n] and c_i ∈ ℝ^(2×d_he), where ℝ is the set of all real numbers and 2×d_he is the dimension of the final context feature c_i.

5. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: In S2-6, the attention alignment score e_ik is obtained as: e_ik = v^T tanh(W_r r_k + W_g s_g + W_h c_i). In the formula, v is the attention vector; the linear mapping matrix W_r maps the relation vector r_k to the attention space; W_g maps the average vector s_g to the attention space; W_h maps the final context feature c_i to the attention space.

6. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: In S2-7, the attention weight α_ik of the word w_i in the text sequence S is obtained as: α_ik = exp(e_ik) / Σ_{j=1}^{n} exp(e_jk), where the attention weight set {α_1k, α_2k, …, α_nk} satisfies: Σ_{i=1}^{n} α_ik = 1.

7. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that S2-9 includes the following steps: S2-9-1, Obtain the gating vector G as follows: G = σ(W_g s_k + b_g). Here W_g is the weight matrix of the gating mechanism, which maps the feature vector s_k to the gate space, and b_g is the bias term of the gating mechanism, which controls its offset. Together, W_g and b_g apply the linear transformation of the gating mechanism to the feature vector s_k; σ is the Sigmoid activation function, which maps input values to the range [0,1]. The gating vector G determines how much of each element of the feature vector s_k passes through. Each element of G lies in [0,1]: the closer to 1, the more important the corresponding feature; the closer to 0, the lower its contribution. S2-9-2, the gating vector G filters the feature vector s_k by element-wise multiplication, giving the filtered feature vector s̃_k: s̃_k = G ⊙ s_k. Each element G_i of the gating vector G controls whether the corresponding element s_ki of the feature vector s_k is preserved or suppressed; ⊙ denotes element-wise multiplication of the corresponding elements of G and s_k. The filtered feature vector s̃_k contains the information that is meaningful under the relation vector r_k.

8. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: in S2-10, the hidden state corresponding to the word $w_i$ in the text sequence S is obtained as follows: [formula not reproduced in the source text]. In the formula, the hidden state contains the context information of the word $w_i$ in the text sequence S under the relation vector $r_k$, and it belongs to the hidden state space, where `hidden` denotes the length of the hidden state, which is determined by the hidden-layer parameters of the Bi-LSTM network.

9. The method for rapidly constructing a knowledge graph of railway bridge design standards according to claim 1, characterized in that: in S2-11, the label of the i-th word $w_i$ in the text sequence S is obtained as follows: in the entity annotation layer, the Softmax function is used to transform the hidden state into the probability distribution over the entity categories, $[p_1, p_2, \ldots, p_q]$. The calculation formula is as follows: [formula not reproduced in the source text]. In the formula, $q$ is the number of entity categories. A linear transformation is applied to the hidden state to obtain the raw score for each entity category, and the Softmax function then converts these raw scores into the probability distribution $[p_1, p_2, \ldots, p_q]$, in which each element is the probability that the word $w_i$ belongs to the corresponding entity category. $W_e$ is the weight matrix of the entity annotation layer; $b_e$ is the bias term. The entity category with the maximum probability in the distribution is taken as the entity category label $y_i$ of the word $w_i$. The calculation formula is as follows: $y_i = \operatorname{argmax}(p_1, p_2, \ldots, p_q)$. In the formula, $\operatorname{argmax}(p_1, p_2, \ldots, p_q)$ outputs the label of the entity category with the maximum probability.
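
The tagging step of claim 9 (linear transformation, Softmax, then argmax) can be sketched as follows. The function name `tag_word`, the symbol `h_i` for the hidden state, the label names and all numeric values are assumptions for illustration; the source does not list the entity categories or reproduce the Softmax formula.

```python
import math

def tag_word(W_e, b_e, h_i, labels):
    """Return (label, probs) for one word from its hidden state h_i.

    Raw scores are W_e h_i + b_e, Softmax turns them into probabilities
    [p_1, ..., p_q], and the label with the largest probability is chosen.
    """
    scores = [sum(w * h for w, h in zip(row, h_i)) + b
              for row, b in zip(W_e, b_e)]
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [x / total for x in exps]
    best = max(range(len(probs)), key=probs.__getitem__)   # argmax index
    return labels[best], probs

# Hypothetical entity categories (q = 3) and toy parameters
labels = ["O", "BRIDGE_COMPONENT", "DESIGN_PARAMETER"]
W_e = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
b_e = [0.0, 0.0, 0.0]
h_i = [2.0, 0.5]

label, probs = tag_word(W_e, b_e, h_i, labels)
```

Here the raw scores are 0, 2 and 0.5, so the second category receives the largest probability and is returned as the label.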

10. The method for rapid construction of a railway bridge design standard knowledge graph according to claim 1, characterized in that: in S4-2, $d_{min} = 5$; in S4-3, the threshold is 0.2.