A knowledge graph-based enterprise digital transformation demand identification method

The method constructs a multidimensional knowledge graph and combines it with multi-head self-attention and graph convolutional networks to address the superficial identification typical of traditional methods. It enables accurate identification of enterprise digital transformation needs and dynamic generation of transformation paths, and makes the identification results quantifiable.

CN122240807A | Pending | Publication Date: 2026-06-19 | NANJING LAIKE NETWORK TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NANJING LAIKE NETWORK TECHNOLOGY CO LTD
Filing Date
2026-03-16
Publication Date
2026-06-19


Abstract

This invention discloses a knowledge graph-based method for identifying enterprise digital transformation needs. It constructs a multi-dimensional business operation knowledge graph; extracts a semantic subgraph representing pain point obstacles in enterprise business flow; constructs a high-order semantic representation matrix of pain points; extracts a set of corresponding matching candidate digital transformation need nodes; obtains a comprehensive need feature graph; and generates a set of target digital transformation need paths with a quantified execution priority sequence. This invention ensures the establishment of a globally optimal mapping relationship with extremely high confidence between different modal features, significantly improving the fusion depth and lossless alignment capability of underlying business data.

Description

Technical Field

[0001] This invention relates to the field of enterprise digital technology, and in particular to a method for identifying enterprise digital transformation needs based on knowledge graphs.

Background Technology

[0002] With the rapid evolution of the digital economy and intensifying market competition, enterprise digital transformation has shifted from the localized automation of IT systems to a deep restructuring of global business logic. Accurately identifying and planning digital transformation needs has become a basic requirement for enterprises to achieve cost reduction, efficiency improvement, and innovative development.

[0003] In modern enterprise environments with complex business architectures and massive amounts of heterogeneous data, accurately and objectively identifying pain points in business processes and matching them with suitable digital technologies faces numerous challenges. Traditional methods for identifying business pain points rely heavily on subjective expert interviews, questionnaires, or shallow single-dimension statistics. In multimodal business environments involving massive amounts of unstructured text, production images, and equipment operation logs, these methods lead to significant information omissions and cognitive biases. They fail to break down underlying data silos and struggle to objectively and quantitatively capture deep-seated graph structure anomalies and business process obstructions in the complex global data flow.

[0004] As the complexity of digital technology solutions in various industries increases, traditional transformation demand matching mechanisms often rely on simple keyword searches or superficial expert experience rule mapping. These mechanisms struggle to deeply understand the graph topological isomorphism between the network of business bottlenecks and the network of technical solutions. Consequently, the extracted digital needs often remain superficial, failing to accurately address the underlying structural problems of the enterprise at a high-level semantic feature level. Traditional digital planning solutions typically only output static, fragmented technology procurement lists. They cannot consider the changes in resource status and the dependencies between technology implementation priorities during the long-term strategic evolution of the enterprise. They cannot limit the blind implementation of technology through time-series feature deduction, and they struggle to dynamically generate a systematic transformation evolution path with quantifiable execution priorities.

Summary of the Invention

[0005] One objective of this invention is to propose a knowledge graph-based method for identifying enterprise digital transformation needs. This invention ensures that a globally optimal mapping relationship with extremely high confidence can be established between different modal features, significantly improving the fusion depth and lossless alignment capability of underlying business data.

[0006] A method for identifying enterprise digital transformation needs based on knowledge graphs according to an embodiment of the present invention includes:
acquiring enterprise multimodal business operation data and performing joint entity-relationship extraction and cross-modal entity alignment to construct a multidimensional business operation knowledge graph;
performing graph topology anomaly detection and information-entropy-based graph structure pruning on the multidimensional business operation knowledge graph to extract a pain point semantic subgraph that represents the characteristics of business flow obstruction;
aggregating node neighborhood features of the pain point semantic subgraph with a graph convolutional network based on multi-head self-attention to construct a high-order semantic representation matrix of the pain points;
performing cross-graph joint spatial computation based on subgraph isomorphism matching between the pain point semantic subgraph and a preset industry digital technology ontology graph to extract the matching candidate digital demand node set;
concatenating the high-order semantic representation matrix of the pain points with the candidate digital demand node set along the tensor dimension at the feature level to obtain a comprehensive demand feature map, and inputting the comprehensive demand feature map into a preset transformation time series prediction model for joint attention optimization to generate a set of target digital transformation demand paths with a quantified execution priority sequence.

[0007] Optionally, the step of acquiring enterprise multimodal business operation data and performing joint entity-relationship extraction and cross-modal entity alignment to construct a multidimensional business operation knowledge graph includes:
extracting features from multimodal business data, including text, images, and time series data, and unifying the feature dimensions to generate a multimodal initial feature space;
performing, in the multimodal initial feature space, isomorphic aggregation based on distance clustering and association rule extraction to establish single-modal initial semantic subgraphs;
calculating, based on the single-modal initial semantic subgraphs, a cross-modal basic cost matrix and introducing marginal distribution constraints to construct a cross-modal joint distance matrix;
introducing information entropy regularization for the cross-modal joint distance matrix and iteratively solving for the optimal transport matrix to form a set of cross-modal entity mapping anchor points;
performing, based on the cross-modal entity mapping anchor point set, cross-modal node fusion and global feature standardization on the single-modal initial semantic subgraphs to construct the multidimensional business operation knowledge graph.

[0008] Optionally, the step of performing graph topology anomaly detection and information-entropy-based graph structure pruning on the multidimensional business operation knowledge graph to extract a pain point semantic subgraph representing the characteristics of business flow obstruction includes:
calculating the topological importance probability and random walk transition probability of the nodes in the multidimensional business operation knowledge graph, and generating the structural information entropy sequence of all nodes in the graph;
extracting the global mean and standard deviation from the structural information entropy sequence to identify suspected obstruction nodes, and combining them with the Mahalanobis distance to generate a local anomaly score distribution set;
segmenting the local anomaly score distribution set with Otsu's method to find the optimal segmentation point, which is set as the dynamic pruning threshold;
removing, in the multidimensional business operation knowledge graph, the normal node networks whose local anomaly scores are lower than the dynamic pruning threshold, retaining the core anomalous connected components, and extracting the pain point semantic subgraph representing the characteristics of business flow obstruction.
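The entropy and screening steps above can be illustrated with a minimal Python sketch. It assumes that the first-order structural entropy of a node is -p ln p, where p is the node's share of the total degree, and that a node is flagged when its entropy deviates from the global mean by more than three standard deviations. The function names and the star-shaped example graph are hypothetical; the second-order random-walk entropy and the Mahalanobis term are omitted.

```python
import math

def first_order_entropy(degree):
    """Per-node entropy -p*ln(p), with p = node degree / total degree."""
    total = sum(degree.values())
    entropy = {}
    for node, d in degree.items():
        p = d / total
        entropy[node] = -p * math.log(p) if p > 0 else 0.0
    return entropy

def suspected_nodes(entropy, k=3.0):
    """Nodes whose entropy deviates from the global mean by more than k standard deviations."""
    values = list(entropy.values())
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [n for n, h in entropy.items() if abs(h - mean) > k * std]

# Hypothetical star graph: one hub connected to 20 leaf nodes.
degree = {"hub": 20}
degree.update({f"leaf{i}": 1 for i in range(20)})
flagged = suspected_nodes(first_order_entropy(degree))
print(flagged)
```

In this toy graph only the hub is flagged, because its degree share dominates the entropy sequence.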

[0009] Optionally, the step of aggregating node neighborhood features of the pain point semantic subgraph with a graph convolutional network based on multi-head self-attention to construct a high-order semantic representation matrix of the pain points includes:
extracting the node initial vectors and topological connection relationships from the pain point semantic subgraph, and constructing a diagonally normalized adjacency matrix containing self-connection information;
projecting the node features with multiple sets of linear transformation weight matrices, and calculating the inner product and nonlinear activation in combination with the diagonally normalized adjacency matrix to form a multi-head attention coefficient distribution matrix;
performing weighted summation of the node features with the multi-head attention coefficient distribution matrix and fusing the result with residual connections to generate a high-order feature representation set for each independent head;
concatenating the high-order feature representations of all heads as tensors and passing them through a fully connected layer for dimensionality reduction and recombination to construct the high-order semantic representation matrix of the pain points.
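As an illustration of the first step, the sketch below builds an adjacency matrix with self-loops and normalizes it. The text only says "diagonal normalized", so the symmetric form D^(-1/2) (A + I) D^(-1/2) commonly used in graph convolutional networks is an assumption here, as are the function name and the three-node example.

```python
import math

def normalized_adjacency(n, edges):
    """Return D^(-1/2) (A + I) D^(-1/2) for an undirected graph with n nodes."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    for i in range(n):
        A[i][i] = 1.0  # self-loop, so each node keeps its own features
    deg = [sum(row) for row in A]  # degree including the self-loop
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical path graph 0 - 1 - 2.
A_hat = normalized_adjacency(3, [(0, 1), (1, 2)])
for row in A_hat:
    print([round(x, 3) for x in row])
```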

[0010] Optionally, the step of performing cross-graph joint spatial computation based on subgraph isomorphism matching between the pain point semantic subgraph and a preset industry digital technology ontology graph to extract the matching candidate digital demand node set includes:
mapping the pain point semantic subgraph and the pre-built industry digital technology ontology graph into the same space with a graph isomorphism network to construct a basic joint embedding metric space;
calculating, in the basic joint embedding metric space, the intra-graph topological cost and the cross-graph distance deviation, and linearly combining them to generate a global matching cost matrix;
updating the global matching cost matrix by projected gradient descent with the goal of minimizing the topological difference, and outputting the Gromov-Wasserstein discrepancy alignment matrix in the converged state;
analyzing the Gromov-Wasserstein discrepancy alignment matrix, extracting the optimal matching column indices that meet the lower confidence limit, and performing a reverse technical entity query with deduplication and summarization to extract the matching candidate digital demand node set.
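To make the matching objective concrete, the following sketch evaluates the Gromov-Wasserstein discrepancy of a given coupling matrix and combines it linearly with a cross-graph feature distance term. It is a plain evaluation of the objective under stated assumptions, not the projected gradient solver: the weight `alpha`, the function names, and the two three-node example graphs are hypothetical.

```python
def gw_discrepancy(C1, C2, T):
    """Sum over i,k,j,l of (C1[i][k] - C2[j][l])^2 * T[i][j] * T[k][l]."""
    n, m = len(C1), len(C2)
    total = 0.0
    for i in range(n):
        for k in range(n):
            for j in range(m):
                for l in range(m):
                    total += (C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
    return total

def fused_cost(C1, C2, M, T, alpha=0.5):
    """Linear combination of the topological term and the feature distance term."""
    feature_term = sum(M[i][j] * T[i][j]
                       for i in range(len(M)) for j in range(len(M[0])))
    return alpha * gw_discrepancy(C1, C2, T) + (1 - alpha) * feature_term

# Two identical path graphs, with shortest-path hop counts as intra-graph costs.
C = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
M = [[0.0] * 3 for _ in range(3)]  # no feature distance in this toy case
aligned = [[1 / 3 if i == j else 0.0 for j in range(3)] for i in range(3)]
uniform = [[1 / 9] * 3 for _ in range(3)]
print(fused_cost(C, C, M, aligned), fused_cost(C, C, M, uniform))
```

The structure-preserving coupling scores zero while the uniform coupling does not, which is the property the iterative update exploits.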

[0011] Optionally, the step of concatenating the high-order semantic representation matrix of the pain points with the candidate digital demand node set along the tensor dimension at the feature level to obtain a comprehensive demand feature map, and inputting the comprehensive demand feature map into a preset transformation time series prediction model for joint attention optimization to generate a set of target digital transformation demand paths with a quantified execution priority sequence, includes:
performing a tensor Kronecker product operation on the high-order semantic representation matrix of the pain points and the feature matrix corresponding to the candidate digital demand node set to generate a high-dimensional bilinear pooling feature tensor;
applying spatial feature compression and global attribute concatenation to the high-dimensional bilinear pooling feature tensor with a convolutional neural network to generate the comprehensive demand feature map;
inputting the comprehensive demand feature map into the time series prediction model to perform gated feature interaction and time step evolution simulation, and outputting the hidden-layer temporal evolution feature state;
calculating the alignment score of the hidden-layer temporal evolution feature state with a global spatiotemporal joint attention mechanism, and outputting a set of relative priority values that reflect the strategic urgency of each demand;
sorting the nodes in the candidate digital demand node set in descending order of the relative priority values and tracing the underlying topological dependencies in reverse to generate the set of target digital transformation demand paths with a quantified execution priority sequence.
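Two of these steps lend themselves to a short sketch: the Kronecker product used for bilinear pooling, and the final ordering. The ordering rule shown here is one possible reading of the last step, namely "highest priority first, but never before a prerequisite"; that interpretation, the function names, and the example demand nodes are assumptions rather than something the text spells out.

```python
import heapq

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def priority_path(priority, depends_on):
    """Order nodes by descending priority while placing prerequisites first."""
    indegree = {n: len(depends_on.get(n, ())) for n in priority}
    dependents = {n: [] for n in priority}
    for node, prerequisites in depends_on.items():
        for p in prerequisites:
            dependents[p].append(node)
    # Max-heap on priority, emulated by negating the value.
    heap = [(-priority[n], n) for n in priority if indegree[n] == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        _, node = heapq.heappop(heap)
        order.append(node)
        for d in dependents[node]:
            indegree[d] -= 1
            if indegree[d] == 0:
                heapq.heappush(heap, (-priority[d], d))
    return order

pooled = kron([[1, 2]], [[1, 0], [0, 1]])
priority = {"data platform": 0.4, "predictive maintenance": 0.9, "sensor retrofit": 0.6}
depends_on = {"predictive maintenance": ["data platform", "sensor retrofit"]}
path = priority_path(priority, depends_on)
print(pooled)
print(path)
```

Here the highest-priority demand is scheduled last because both of its prerequisites must be in place first.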

[0012] The beneficial effects of this invention are: (1) This invention adopts an improved optimal transport algorithm that incorporates information entropy regularization. By introducing a Sinkhorn iterative solution mechanism that combines the cross-modal basic cost matrix with marginal distribution constraints, it adaptively unifies the underlying feature space according to the distribution characteristics of the business data in each modality. This improves the completeness and global cognitive accuracy of the multidimensional business operation knowledge graph, and allows cross-modal entity mapping anchor points to be extracted accurately from massive unstructured text, production images, and time-series logs with serious semantic heterogeneity and dimensional gaps. By introducing a negative entropy penalty term on the transport probability matrix in the optimal transport framework to smooth the transport plan, it ensures that a globally optimal mapping relationship with extremely high confidence can be established between different modal features, which significantly improves the fusion depth and lossless alignment capability of the underlying business data.

[0013] (2) This invention combines dynamic graph topology anomaly detection based on structural information entropy with a multi-head self-attention mechanism. It generates local anomaly scores by calculating random walk transition probabilities and first- and second-order structural information entropies, and performs weighted feature aggregation on the pain point semantic subgraph. This enables the model to objectively quantify and focus on the real obstruction areas and their high-order features in the enterprise business flow network. It can adaptively weight the feature maps output by graph convolution and dynamically prune normal nodes, so that the network can amplify attention at key business flow breakpoints and more accurately capture the real structural lesions at the enterprise's bottom layer. This significantly improves the accuracy of hidden business pain point identification when processing complex business graphs with high-density correlations, and is particularly outstanding in the location of inefficient nodes in cross-departmental data flow.

Attached Figure Description

[0014] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings: Figure 1 is a flowchart of the knowledge graph-based method for identifying enterprise digital transformation needs proposed in this invention.

Detailed Implementation

[0015] Example 1: Referring to Figure 1, a knowledge graph-based method for identifying enterprise digital transformation needs includes the following steps.

Acquire enterprise multimodal business operation data and perform joint entity-relationship extraction and cross-modal entity alignment to construct a multidimensional business operation knowledge graph. In this embodiment, this step is implemented with an improved optimal transport algorithm architecture, specifically as follows.

Textual business specification data, image-based production quality inspection data, and time-series equipment operation log data of the enterprise within a preset historical time period are obtained. A pre-trained natural language processing model performs word embedding mapping on the textual business specification data to obtain text feature vectors. A deep residual convolutional neural network performs gridded feature extraction on the image-based production quality inspection data to obtain image feature vectors. A recurrent neural network with a long short-term memory gating mechanism extracts the hidden state of the time-series equipment operation log data to obtain time-series feature vectors. The three kinds of feature vectors are then brought to a common dimension to generate a multimodal initial feature space.

In the multimodal initial feature space, for data entities within a single modality, the squared Euclidean distance between any two entity feature vectors is calculated as a local cost function. Using a density-based spatial clustering algorithm, entities whose distance is lower than a preset isomorphism threshold are aggregated into synonymous entity clusters. Business logic association rules are extracted between the synonymous entity clusters using a preset dependency parse tree, thereby establishing single-modal initial semantic subgraphs with edge weights within the text, image, and time-series modalities respectively.

Any two subgraphs of different modalities are taken from the single-modal initial semantic subgraphs, for example a text semantic subgraph and an image semantic subgraph. All nodes in the text semantic subgraph are treated as the source distribution space and all nodes in the image semantic subgraph as the target distribution space. The cosine similarity between the feature vector of each text node in the source distribution space and the feature vector of each image node in the target distribution space is calculated, and the cross-modal basic cost matrix is constructed as one minus the cosine similarity. The cross-modal basic cost matrix represents the basic transport cost required to move the semantic mass of the text modality to the image modality. A feature-level prior probability distribution is introduced as a marginal distribution constraint, such that the total mass transported from all nodes in the source distribution space equals the total mass received in the target distribution space. Based on this marginal distribution constraint and the cross-modal basic cost matrix, a cross-modal joint distance matrix is constructed.

For the cross-modal joint distance matrix, a regularization penalty term based on information entropy is introduced to smooth the transport plan. Specifically, a negative entropy term on the transport probability matrix is added to the objective function of the basic transport cost. The Sinkhorn iterative algorithm is used to solve for the scaling multipliers in alternating directions: in each iteration, the rows and columns of the transport probability matrix are rescaled by a diagonal matrix scaling operation, which amounts to pure scalar multiplication, until the sum of the row and column marginal distribution errors is less than the preset convergence tolerance. At this point, the globally optimal transport probability matrix is output as the cross-modal optimal alignment transport matrix. The coordinates of the elements in this matrix whose probability values are greater than the preset alignment confidence are extracted. The nodes of different modalities corresponding to these coordinates are determined to be equivalent entities representing the same real-world business object, forming the cross-modal entity mapping anchor point set.

Based on the cross-modal entity mapping anchor point set, cross-modal node fusion and edge reorganization are performed on the single-modal initial semantic subgraphs. Nodes that are equivalent entities are physically merged, and the feature vector of the merged node is set to the element-wise arithmetic mean of their feature vectors. At the same time, relational edges that originally belonged to different modalities but pointed to equivalent entities are merged and deduplicated. Finally, the node features of the merged graph are standardized to a global normal distribution to generate a multidimensional business operation knowledge graph covering the enterprise's business objects across all modalities and their interaction relationships.
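The cross-modal alignment described in this embodiment can be sketched in a few lines of Python. The sketch assumes non-zero feature vectors, uniform marginals, and small hand-made vectors in place of real text and image embeddings; the parameter names `eps` (entropy regularization strength), `tol` (convergence tolerance), and `confidence` (alignment confidence) are hypothetical stand-ins for the preset values mentioned in the text.

```python
import math

def cosine_cost(src, dst):
    """Cost matrix C[i][j] = 1 - cosine similarity of src[i] and dst[j]."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))
    return [[1.0 - cos(u, v) for v in dst] for u in src]

def sinkhorn(cost, a, b, eps=0.05, tol=1e-9, max_iter=10000):
    """Entropy-regularized optimal transport via alternating row/column scaling."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(max_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
        row_err = sum(abs(sum(P[i]) - a[i]) for i in range(n))
        col_err = sum(abs(sum(P[i][j] for i in range(n)) - b[j]) for j in range(m))
        if row_err + col_err < tol:  # marginal errors below the tolerance
            break
    return P

def anchors(P, confidence):
    """Index pairs whose transport probability exceeds the alignment confidence."""
    return [(i, j) for i, row in enumerate(P) for j, p in enumerate(row) if p > confidence]

text_nodes = [[1.0, 0.0], [0.0, 1.0]]    # stand-ins for text feature vectors
image_nodes = [[0.9, 0.1], [0.1, 0.9]]   # stand-ins for image feature vectors
P = sinkhorn(cosine_cost(text_nodes, image_nodes), [0.5, 0.5], [0.5, 0.5])
pairs = anchors(P, confidence=0.25)
print(pairs)
```

Smaller `eps` values give a sharper transport plan but make the kernel `exp(-cost / eps)` more prone to underflow, so a log-domain variant may be preferable for real data.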

[0016] Perform graph topology anomaly detection and information-entropy-based graph structure pruning on the multidimensional business operation knowledge graph, and extract a pain point semantic subgraph that represents the characteristics of business flow obstruction. In this embodiment, this step adopts a graph topology information entropy evaluation mechanism combined with random walks, specifically as follows.

The multidimensional business operation knowledge graph is acquired, and the absolute degree of each node in the graph is calculated as the sum of the node's in-degree and out-degree. The absolute degree of each node is divided by the sum of the absolute degrees of all nodes in the graph to obtain the topological importance probability of that node. The negative of the product of the topological importance probability and its natural logarithm is used as the first-order structural information entropy of the node. Then, starting from each node, a backtracking random walk strategy with a preset step length is executed in the graph. The visit frequency of all adjacent nodes traversed on the random walk path is recorded and converted into a transition probability, and the second-order local neighborhood information entropy centered on the node is calculated from the transition probability. The first-order structural information entropy and the second-order local neighborhood information entropy are linearly combined according to preset weight coefficients to generate the structural information entropy sequence of all nodes in the graph.

Based on the structural information entropy sequence, a topological anomaly criterion for business flow obstruction is defined. In a normal enterprise business graph, information flow exhibits the power-law distribution characteristic of a scale-free network. When a business link experiences pain points such as data silos, approval blockages, or resource breaks, the structural information entropy of the corresponding node fluctuates abnormally. Therefore, the global mean and global standard deviation of the structural information entropy sequence are calculated. For each node, the absolute value of the difference between its structural information entropy and the global mean is calculated. If this absolute value is greater than three times the global standard deviation, the node is judged to be a suspected obstruction node. The initial feature vector of each suspected obstruction node is then extracted, and the Mahalanobis distance between its feature vector and those of all its one-hop neighbor nodes is calculated. The Mahalanobis distance and the absolute entropy difference are normalized and multiplied to obtain a local anomaly score distribution set that integrates network topological anomalies and semantic feature deviations.

All score values in the local anomaly score distribution set are sorted in descending order, and the maximum between-class variance method (Otsu's method) is used to find the segmentation threshold dynamically. Specifically, each sorted score value is used as a candidate segmentation point that divides the score set into a high-anomaly group and a low-anomaly group. The score variance within each group and the variance of the means between the two groups are calculated. The candidate segmentation point that maximizes the between-group variance is found, and the score value corresponding to that point is set as the dynamic pruning threshold.

In the multidimensional business operation knowledge graph, normal business flow nodes whose local anomaly scores are below the dynamic pruning threshold are deleted together with their directly connected relationship edges. Where the deletion breaks the graph into multiple isolated connected components, the sum of the anomaly scores of all remaining nodes within each connected component is calculated. Noise-fragment connected components whose summed anomaly scores are below the minimum threshold are filtered out, and the core connected component network with high-density anomaly aggregation characteristics is retained. The retained core networks, along with the anomalous topology and attribute features they carry, are collectively encapsulated into the pain point semantic subgraph that characterizes business flow obstruction.
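The thresholding and pruning steps can be sketched as follows. The code assumes each node already carries a local anomaly score, treats the graph as undirected when finding connected components, and uses hypothetical names (`otsu_threshold`, `pain_point_components`, `min_component_score`) and toy scores; it is meant only to show the mechanics of the between-class variance search and the component filter.

```python
def otsu_threshold(scores):
    """Score at the split that maximizes the between-class variance."""
    s = sorted(scores)
    best_t, best_var = s[0], -1.0
    for k in range(1, len(s)):
        low, high = s[:k], s[k:]
        w0, w1 = len(low) / len(s), len(high) / len(s)
        m0, m1 = sum(low) / len(low), sum(high) / len(high)
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, s[k]  # lowest score of the high group
    return best_t

def pain_point_components(edges, score, min_component_score):
    """Drop nodes below the threshold, then keep components with enough total score."""
    t = otsu_threshold(list(score.values()))
    kept = {n for n, s in score.items() if s >= t}
    adj = {n: set() for n in kept}
    for u, v in edges:
        if u in kept and v in kept:  # edges of deleted nodes disappear with them
            adj[u].add(v)
            adj[v].add(u)
    components, seen = [], set()
    for start in sorted(kept):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one connected component
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        if sum(score[n] for n in comp) >= min_component_score:
            components.append(comp)
    return components

score = {"a": 0.10, "b": 0.12, "c": 0.90, "d": 0.95, "e": 0.11, "f": 0.92}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]
core = pain_point_components(edges, score, min_component_score=1.0)
print(core)
```

In this toy case the isolated node `f` survives the threshold but is discarded as a noise fragment, while the connected pair `c`, `d` is retained.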

[0017] Based on the pain point semantic subgraph, a graph convolutional network based on multi-head self-attention is used to aggregate node neighborhood features and construct a high-order semantic representation matrix of the pain points. In this embodiment, this step specifically includes the following.

The pain point semantic subgraph is obtained. The initial vectors of all nodes containing pain point features in the subgraph are extracted as the zeroth-layer input feature matrix of the graph convolutional network, and the topological connection relationships of the subgraph are extracted. Self-loop edges are added to the nodes to construct a diagonally normalized adjacency matrix containing self-connection information.

For the zeroth-layer input feature matrix and the diagonally normalized adjacency matrix, multiple sets of independent linear transformation weight matrices are initialized to implement the multi-head attention mechanism. In the attention calculation of each independent head, the parameter matrix of the current head projects all node features into a linear space, and the projected features are decomposed into query feature vectors and key feature vectors. For any center node in the graph and each neighboring node connected to it by an edge, the inner product of the query feature vector of the center node and the key feature vector of the neighboring node is calculated. The inner product is divided by the square root of the feature dimension for numerical scaling to prevent vanishing gradients. The scaled inner product is passed through a leaky rectified linear unit (Leaky ReLU) activation function, which has a small non-zero slope on the negative half-axis, for nonlinear transformation. Finally, over all neighboring nodes of the center node, the relative attention weight is calculated with the exponential normalization (softmax) function: the exponential of the current nonlinear transformation value, with the natural constant as the base, is divided by the sum of the exponentials of the nonlinear transformation values of all neighbors. This forms the multi-head attention coefficient distribution matrix under the current independent head.

Using the multi-head attention coefficient distribution matrix, a weighted summation and aggregation operation is performed on the value feature vectors extracted from the zeroth-layer input feature matrix. That is, for each center node, the value feature vector of each neighboring node is multiplied by the corresponding attention weight coefficient and the products are accumulated to obtain the local aggregated feature representation of the current node under the current independent head. To prevent feature over-smoothing in deep graph convolution, a residual connection is introduced on the local aggregated feature representation: the original feature vector of the node, after processing by a multilayer perceptron, is fused with the current local aggregated feature representation by element-wise addition to obtain the high-order feature representation set of each independent head.

All independent attention heads are traversed and the high-order feature representation set output by each head is obtained. The representations of all heads are concatenated along the feature channel dimension, so that the length of the concatenated node feature vector equals the length of a single head's feature multiplied by the total number of heads. The concatenated feature vector is fed into a linear fully connected network with batch normalization for dimensionality reduction and recombination, which outputs a comprehensive node feature that accounts for both local pain point details and global obstruction propagation correlations. The feature vectors produced for all nodes in the pain point semantic subgraph by this multi-head aggregation, dimensionality reduction, and recombination are arranged row by row in node-number order to generate the high-order semantic representation matrix of the pain points.
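The per-head attention computation can be illustrated with a small pure-Python sketch. It makes several simplifying assumptions that are not in the text: identity matrices stand in for learned weights, the residual branch adds the raw node features instead of a multilayer perceptron output, the value dimension equals the input dimension, and the final fully connected layer with batch normalization is omitted. All function names are hypothetical.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z

def attention_head(X, neighbors, Wq, Wk, Wv):
    """One head: scaled dot-product scores over neighbors, Leaky ReLU, softmax, weighted sum."""
    Q = [matvec(Wq, x) for x in X]
    K = [matvec(Wk, x) for x in X]
    V = [matvec(Wv, x) for x in X]
    d = len(Q[0])
    outputs, weights = [], []
    for i, nbrs in enumerate(neighbors):
        e = [leaky_relu(sum(q * k for q, k in zip(Q[i], K[j])) / math.sqrt(d)) for j in nbrs]
        top = max(e)  # subtract the maximum for a numerically stable softmax
        ex = [math.exp(v - top) for v in e]
        alpha = [v / sum(ex) for v in ex]
        outputs.append([sum(a * V[j][c] for a, j in zip(alpha, nbrs))
                        for c in range(len(V[0]))])
        weights.append(alpha)
    return outputs, weights

def multi_head(X, neighbors, heads):
    """Add a residual to each head's output and concatenate heads along the feature axis."""
    per_head = [attention_head(X, neighbors, Wq, Wk, Wv)[0] for Wq, Wk, Wv in heads]
    fused = []
    for i, x in enumerate(X):
        row = []
        for head_out in per_head:
            row.extend(xc + hc for xc, hc in zip(x, head_out[i]))
        fused.append(row)
    return fused

I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [1, 0, 2], [2, 1]]  # each list includes the node's self-loop
_, alpha = attention_head(X, neighbors, I2, I2, I2)
H = multi_head(X, neighbors, [(I2, I2, I2), (I2, I2, I2)])
print(alpha[0], len(H[0]))
```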

[0018] The pain point semantic subgraph is combined with the pre-built industry digital technology ontology graph for cross-graph joint spatial computation based on subgraph isomorphic matching to extract the corresponding matching candidate digital demand node set. In this embodiment, this step specifically includes the following. The pain point semantic subgraph and the industry digital technology ontology graph pre-loaded from the external knowledge base are obtained. The industry digital technology ontology graph contains a standard graph structure composed of technology solution entities, product entities, and application scenario entities. Using a pre-trained graph isomorphism network, each node in the pain point semantic subgraph is mapped to a source point vector in a metric space, and each node in the industry digital technology ontology graph is mapped to a target point vector in the same metric space. The source point vectors and the target point vectors have the same feature dimension, which yields a basic joint embedding metric space for cross-graph comparison. In this space, the in-graph topological cost matrix of the pain point semantic subgraph and the in-graph topological cost matrix of the industry digital technology ontology graph are calculated independently. Each in-graph topological cost matrix is calculated as a weighted combination of the shortest-path hop count between any two nodes in the graph and the L2 norm distance between their feature vectors. 
A cross-graph joint node coupling matrix is initialized, and the fourth-order tensor distance deviation between the pain point in-graph topological cost matrix and the industry technology in-graph topological cost matrix under the current joint node coupling matrix is calculated. This deviation represents the topological inconsistency between the obstructed path structure of the pain point business flow and the solution path structure of a given digital technology. The fourth-order tensor distance deviation is linearly combined with the direct Euclidean distance matrix between cross-graph node features to generate a global matching cost matrix that integrates the dual constraints of semantic similarity and topological isomorphism. For the global matching cost matrix, the optimization objective is to minimize the Gromov-Wasserstein discrepancy, subject to the marginal constraints that the mass distribution over pain point nodes and the mass distribution over technology nodes are both uniform, and projected gradient descent is used to iteratively update the joint node coupling matrix. In each gradient descent step, the partial derivative matrix of the global matching cost with respect to the current coupling matrix is calculated, and the coupling matrix is updated with a fixed step size along the negative direction of the partial derivative. The updated matrix is then projected back, using the Lagrange multiplier method, into the feasible probability simplex space that satisfies the marginal constraints. This gradient update and projection operation is repeated until the Frobenius norm of the difference between the coupling matrices of two adjacent iterations is less than the preset minimum value. At this point, the converged Gromov-Wasserstein alignment matrix is output and then analyzed. 
The row indices of the Gromov-Wasserstein alignment matrix correspond to business obstruction nodes in the pain point semantic subgraph, while the column indices correspond to technical solution nodes in the industry digital technology ontology graph. The element values in the matrix represent the matching probability between pain points and technologies. For each row of the matrix, the few elements with the largest matching probability values are extracted and checked against a preset lower limit of matching confidence. The column indices of the elements that meet this condition are extracted, and the corresponding technical node entity objects are looked up by column index in the industry digital technology ontology graph. All retrieved technical node entity objects are globally deduplicated and categorized to form a set of candidate digital demand nodes that specifically address the current enterprise business pain point topology.
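As an illustrative sketch only, the iterative alignment described in this embodiment can be approximated in a few lines of Python. The function names (`gw_gradient`, `project_to_uniform_marginals`, `gw_align`), the square loss, the step size, and the weight `alpha` are assumptions introduced here for illustration and do not appear in the disclosure. The projection step is also a substitution: instead of the Lagrange multiplier projection described above, the sketch uses a multiplicative update followed by Sinkhorn-style row and column rescaling, which keeps the coupling matrix non-negative with uniform marginals.

```python
import math

def gw_gradient(C1, C2, T):
    """Gradient of the square-loss Gromov-Wasserstein objective w.r.t. T.
    Assumes C1 and C2 are symmetric in-graph cost matrices."""
    n, m = len(C1), len(C2)
    return [[2.0 * sum((C1[i][k] - C2[j][l]) ** 2 * T[k][l]
                       for k in range(n) for l in range(m))
             for j in range(m)] for i in range(n)]

def project_to_uniform_marginals(T, sweeps=50):
    """Rescale rows and columns so rows sum to 1/n and columns to 1/m
    (Sinkhorn-style scaling, used here in place of a Lagrange projection)."""
    n, m = len(T), len(T[0])
    T = [row[:] for row in T]
    for _ in range(sweeps):
        for i in range(n):
            s = sum(T[i]) or 1e-300
            T[i] = [v / (s * n) for v in T[i]]
        for j in range(m):
            s = sum(T[i][j] for i in range(n)) or 1e-300
            for i in range(n):
                T[i][j] /= s * m
    return T

def gw_align(C1, C2, M, alpha=0.5, step=1.0, tol=1e-4, max_iter=200):
    """Iterate gradient update + projection until the Frobenius norm of the
    change in the coupling matrix falls below tol. M is the cross-graph
    feature distance matrix; alpha is a hypothetical mixing weight."""
    n, m = len(C1), len(C2)
    T = [[1.0 / (n * m)] * m for _ in range(n)]
    for _ in range(max_iter):
        G = gw_gradient(C1, C2, T)
        new_T = [[T[i][j] * math.exp(-step * (G[i][j] + alpha * M[i][j]))
                  for j in range(m)] for i in range(n)]
        new_T = project_to_uniform_marginals(new_T)
        diff = math.sqrt(sum((new_T[i][j] - T[i][j]) ** 2
                             for i in range(n) for j in range(m)))
        T = new_T
        if diff < tol:
            break
    return T
```

On a toy pair of three-node path graphs whose nodes are listed in different orders, the largest entry in each row of the returned matrix identifies the structurally matching node in the other graph.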

[0019] The high-order semantic representation matrix of pain points and the set of candidate digital demand nodes are concatenated at the feature level tensor dimension to obtain a comprehensive demand feature map. The comprehensive demand feature map is then input into a preset transformation time series prediction model for joint attention optimization to generate a set of target digital transformation demand paths with a quantified execution priority sequence.

[0020] In this embodiment, the high-order semantic representation matrix of pain points and the set of candidate digital demand nodes are concatenated along the tensor dimension at the feature level to obtain a comprehensive demand feature map, and this map is then input into a preset transformation time-series prediction model for joint attention optimization, generating a set of target digital transformation demand paths with a quantified execution priority sequence. Specifically, this includes the following. The high-order semantic representation matrix of pain points is obtained, together with the technical demand feature matrix formed by arranging in order the feature vectors of all nodes in the candidate digital demand node set. Because the pain point features and the demand features differ in dimension in the semantic space, an independent nonlinear affine transformation is applied to each of the two matrices to map them into a new space with the same hidden layer dimension. A cross-dimensional tensor Kronecker product operation is then performed: the outer product of each row vector in the pain point matrix with each row vector in the technical demand matrix is calculated, generating a high-dimensional bilinear pooling feature tensor that captures all pairwise cross-combination feature information between each business pain point and each candidate technology. A two-dimensional convolutional neural network is applied to this tensor for spatial feature compression. Multiple convolution kernels with different receptive field sizes perform sliding-window convolution in the tensor space to extract local cross-correlation features, and each convolution is followed by a max pooling layer that removes redundant background information and retains the core matching features with strong activation responses. 
After several alternating convolution and pooling operations, the tensor is flattened into a one-dimensional high-dimensional vector. Global scalar features such as the enterprise's basic economic scale, industry attributes, and current capital budget are concatenated to the end of this vector to generate a comprehensive demand feature map that reflects both the enterprise's overall transformation capability and the degree of match with specific technologies. The comprehensive demand feature map serves as the input state of the initial time step and is fed into a time-series prediction model based on gated recurrent units, which contains an update gate and a reset gate. At each time step of the virtual evolution, the update gate determines how much of the previous stage's transformation result state is retained in the current stage, and the reset gate determines how the previous stage state interacts with the current candidate demand features to generate the candidate hidden state. Through successive time steps, the process by which the enterprise implements digital technology in stages is simulated, and the hidden layer temporal evolution feature state output at the end of each virtual time step is recorded, yielding the states for all virtual time steps. A global spatiotemporal joint attention mechanism is then introduced: a learnable context query vector is set, and the alignment score between this query vector and the hidden layer state of each time step is calculated. The alignment scores are used to generate attention weight allocation probabilities for the candidate demand nodes. Each allocation probability represents a comprehensive evaluation of the urgency and expected benefit of implementing the corresponding candidate technology under the long-term transformation strategy. 
The features of each candidate demand node are multiplied by the corresponding attention weight allocation probability to reconstruct the features, and the set of relative priority values of the demands at the current time is output through a fully connected classifier with a normalized exponential (softmax) function. Based on the priority value set, all items in the candidate digital demand node set are sorted in descending order. The preset number of top-ranked core demand nodes are mapped back to the industry digital technology ontology graph to extract their implementation dependency edges. If the graph shows that a high-priority technology node has as a prerequisite a lower-priority underlying technology node that must be executed first, the order is forcibly reversed and the underlying technology node is inserted before the high-priority node. After this topological dependency correction and quantified priority sorting, the target digital transformation demand path set is output, arranged in order of implementation with clear phased implementation steps.
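The gated update and the attention scoring described in this embodiment can be illustrated with a minimal sketch. The names `gru_step` and `attention_weights`, the parameter dictionary keys, and the omission of bias terms are assumptions made here for brevity; the update convention follows the text, in which the update gate controls how much of the previous state is retained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x, h, p):
    """One gated recurrent unit step without biases.
    p holds hypothetical weight matrices Wz, Uz, Wr, Ur, Wh, Uh."""
    # Update gate: share of the previous state that is retained.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    # Reset gate: how strongly the previous state enters the candidate state.
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, cand)]

def attention_weights(query, states):
    """Softmax over inner-product alignment scores between a context
    query vector and each recorded hidden state."""
    scores = [sum(q * s for q, s in zip(query, h)) for h in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `gru_step` repeatedly and collecting the returned states gives the sequence over which `attention_weights` is evaluated. In a trained model the weight matrices and the query vector would be learned rather than set by hand.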

[0021] Example 2: During a specific monthly operational settlement cycle, the knowledge graph-based enterprise digital transformation demand identification system of this invention was deployed on a private cloud server of a heavy machinery manufacturing enterprise with a highly automated production line. During the initial data acquisition phase after startup, the system simultaneously monitored and retrieved raw micro-data from the enterprise's three core business systems. Specifically, from the enterprise's document management server the system retrieved the "Assembly Process Specification for Type A High-Precision Gearbox", containing over 20,000 characters of plain text, together with over 300 after-sales maintenance work orders accumulated that month, whose short text records described bearing noise and excessively rapid lubricating oil temperature rise. The natural language processing model transformed these texts into a set of 768-dimensional text feature vectors. Simultaneously, an industrial camera deployed at the No. 3 high-frequency quenching station pushed a dataset of uncalibrated multi-view RGB images to the system at 60 frames per second. From the consecutive frames numbered IMG-8092 to IMG-8155, the system's underlying residual network extracted image feature vectors showing an abnormal 18% decrease in pixel contrast in a characteristic region and microscopic jagged textures at the edges. Furthermore, the spindle sensor of CNC machine tool No. 4 transmitted a time-series log in real time via industrial Ethernet. The system's model, equipped with a long short-term memory gating mechanism, analyzed a 45-second time segment in which the peak amplitude of the spindle's high-frequency vibration at 4800 revolutions per minute suddenly increased to 3.2g, significantly deviating from the historical baseline of 1.5g. 
This was accompanied by an abnormal rise in the internal temperature sensor curve from 45 degrees Celsius to 78 degrees Celsius within 12 minutes, and a time-series feature vector was generated accordingly. On this basis, the system used an improved optimal transport algorithm to calculate the cross-modal basic cost matrix in the multimodal initial feature space. The system found that the bearing noise node vector extracted from the maintenance work order text, the edge micro-jagged node vector extracted from the images, and the 3.2g amplitude node vector extracted from the sensor converged rapidly in the Sinkhorn iterative solution after the information entropy regularization penalty was introduced, reaching an alignment confidence of 92.5%. The system determined that these three heterogeneous data nodes represent the same physical entity object, and then performed arithmetic averaging of the features and global standardization, generating a comprehensive entity node in the underlying graph. This node was labeled as the Type A gearbox spindle wear heterogeneous fusion node N2048, and physically meaningful relationship edges were established with surrounding nodes such as production batches and quality inspectors. This completed the construction of a multidimensional business operation knowledge graph covering business objects of all modalities. Traditional graph construction methods that rely solely on keyword matching can only generate isolated maintenance text nodes and discard the image jagged-texture features and the 3.2g vibration time-series features, resulting in extremely low data correlation. This demonstrates the advantage of this invention in the alignment accuracy of underlying heterogeneous data.

[0022] The system initiated a graph topology anomaly detection process based on information entropy for the constructed multidimensional business operation knowledge graph. The system engine calculated the absolute degree of all nodes in the graph and found that the sum of the in-degree and out-degree of the previously generated node N2048 was as high as 145, far exceeding the graph's average absolute degree of 12. The system performed a random walk with a step size of 5 starting from this node and recorded the transition probabilities of its neighboring nodes. The calculated first-order structural information entropy of this node was 4.82, the second-order local neighborhood information entropy was 6.35, and the weighted sum of the two gave a structural information entropy value of 5.60. The system then compared this value with the global mean (1.85) and the global standard deviation (0.45) of the entire graph's structural information entropy sequence. Because the absolute difference from the mean (3.75) was much greater than three times the standard deviation (1.35), the system immediately triggered a topology anomaly alarm and determined that there was a serious business flow obstruction at this node and in its local network. The system further extracted the initial feature vector of the node and calculated its Mahalanobis distance to its surrounding one-hop neighbor nodes (such as approval node N2049 and rework node N2050), resulting in a local anomaly score of 89.7. The system then used Otsu's method to dynamically calculate the current pruning threshold of 62.5 points, pruned the normal inventory transfer and regular leave nodes in the surrounding local network, whose scores were only 15 to 25 points, and retained the high-density core connected component of the anomaly cluster, which it encapsulated and labeled as the gearbox quality control pain point semantic subgraph SG-07. 
In a simulated comparative test during the same time period, traditional expert diagnostic methods, which only checked financial statements and found a 2% increase in scrap rate this month, were completely unable to accurately locate this microscopic pain point topology in a graph with millions of nodes.
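The three-sigma check and the threshold search in this example can be sketched as below. The function names are hypothetical, and the score list is invented for illustration: it only mimics a cluster of normal nodes scoring 15 to 25 and a few anomalous nodes near 89.7, so the threshold it yields is not expected to reproduce the 62.5 points reported above.

```python
def is_entropy_outlier(value, mean, std, k=3.0):
    """Flag a node whose structural entropy deviates from the global
    mean by more than k standard deviations."""
    return abs(value - mean) > k * std

def otsu_threshold(scores):
    """Exhaustive Otsu search on a 1-D score list: pick the split that
    maximizes the between-class variance."""
    vals = sorted(set(scores))
    n = len(scores)
    best_t, best_var = vals[0], -1.0
    for lo, hi in zip(vals, vals[1:]):
        t = (lo + hi) / 2.0  # candidate split between adjacent distinct scores
        low = [s for s in scores if s <= t]
        high = [s for s in scores if s > t]
        w0, w1 = len(low) / n, len(high) / n
        mu0, mu1 = sum(low) / len(low), sum(high) / len(high)
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Figures from the example: entropy 5.60, global mean 1.85, std 0.45.
flagged = is_entropy_outlier(5.60, 1.85, 0.45)

# Hypothetical local anomaly scores for one neighborhood.
scores = [15, 18, 20, 22, 25, 85, 89.7, 92]
threshold = otsu_threshold(scores)
kept = [s for s in scores if s >= threshold]
```

Here `kept` corresponds to the nodes that survive pruning and would form the core anomaly component.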

[0023] After obtaining the pain point semantic subgraph SG-07, the system launched a graph convolutional network based on multi-head self-attention to analyze its complex internal relationships. The system extracted the initial vectors of the 24 core nodes contained in SG-07 and constructed a diagonal normalized adjacency matrix containing self-loop edges. In the calculation of the first independent attention head, the system used the parameter matrices to perform linear projection decomposition on the features of node N2048 (spindle wear) and computed the inner product of its query feature vector with the key feature vector of node N2050 (repeated rework). After activation by rectified linear units, the calculated attention weight coefficient was as high as 0.88, which the system interpreted as indicating that machine wear is the main cause of repeated rework on the production line. The system computed the weighted sum of the feature vectors of these 24 nodes according to their attention weights and fused the original features through residual connections. The outputs of the 8 independent heads were concatenated into a tensor with a channel dimension of 2048 and then reduced to 512 dimensions using fully connected layers. Finally, the system output the high-order semantic representation matrix of pain points, which reflects both the physical characteristics of spindle wear and the management characteristics of production flow obstruction.
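A toy version of the attention aggregation in this example is sketched below, using 3 nodes and 2 heads instead of 24 nodes and 8 heads. The function names, the softmax normalization of the ReLU-activated scores over each node's neighborhood, and the requirement that the value projection keep the input dimension (so the residual can be added) are assumptions of this sketch rather than details stated in the example.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def attention_head(X, adj, Wq, Wk, Wv):
    """One attention head restricted to graph neighbors plus a self-loop.
    Returns the residual-fused node features and the attention weights."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    n = len(X)
    out, weights = [], []
    for i in range(n):
        nbrs = [j for j in range(n) if j == i or adj[i][j]]
        # Inner product of query and key, followed by ReLU activation.
        scores = [max(0.0, sum(q * k for q, k in zip(Q[i], K[j]))) for j in nbrs]
        exps = [math.exp(s) for s in scores]
        total = sum(exps)
        alpha = [0.0] * n
        for j, e in zip(nbrs, exps):
            alpha[j] = e / total
        # Weighted sum of neighbor values plus the residual connection.
        agg = [sum(alpha[j] * V[j][d] for j in nbrs) + X[i][d]
               for d in range(len(X[i]))]
        out.append(agg)
        weights.append(alpha)
    return out, weights

def multi_head(X, adj, heads, Wo):
    """Concatenate the outputs of all heads and reduce with a linear layer Wo."""
    per_head = [attention_head(X, adj, *h)[0] for h in heads]
    concat = [[v for head in per_head for v in head[i]] for i in range(len(X))]
    return matmul(concat, Wo)
```

With 8 heads of width 256, `concat` would have 2048 columns and `Wo` would map them to 512, matching the dimensions quoted in the example.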

[0024] The system loaded a pre-built industry digital technology ontology graph containing over 1500 cutting-edge technologies and used a graph isomorphism network to map the pain point subgraph SG-07 and the technology ontology graph into the same joint embedding metric space. The system initialized a cross-graph joint node coupling matrix and updated it using projected gradient descent with the objective of minimizing the Gromov-Wasserstein discrepancy. After 120 iterations, the Frobenius norm difference of the coupling matrix between adjacent iterations fell to the preset minimum of 0.0001, indicating convergence of the alignment matrix. The system analyzed the row and column indices of the alignment matrix and found that the row corresponding to node N2048 (spindle wear) in SG-07 showed a matching probability of 94.2% at the technical node with column index T-305. A reverse lookup in the graph confirmed that node T-305 represents a predictive equipment maintenance algorithm based on multi-source sensor fusion. Likewise, the row corresponding to node N2050 (rework) in SG-07 showed a matching probability of 89.6% at the technical node with column index T-112, which represents a high-precision edge-side AI visual surface defect detection model. The system extracted the technical node entities whose probabilities exceeded the 85% confidence threshold and aggregated them into a customized set of candidate digital demand nodes. At this point, traditional retrieval methods often recommend quality management software (ERP) based only on quality-related keywords, which exposes their lack of cross-graph topological isomorphism computation.
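The extraction of candidate nodes from the alignment matrix can be sketched as below. The function name `extract_candidates`, the `top_k` value, and the assumption that each row has already been normalized so that its entries read as matching probabilities are illustrative. The two probabilities 0.942 and 0.896 and the 0.85 threshold come from the example; the remaining matrix entries are invented filler.

```python
def extract_candidates(align, col_ids, top_k=2, min_conf=0.85):
    """For each pain point row, take the top_k largest entries, keep those
    at or above min_conf, and return the deduplicated technology node ids."""
    selected = []
    for row in align:
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:top_k]
        for j in ranked:
            if row[j] >= min_conf and col_ids[j] not in selected:
                selected.append(col_ids[j])
    return selected

col_ids = ["T-305", "T-112", "T-055"]
align = [
    [0.942, 0.030, 0.028],  # row for N2048 (spindle wear)
    [0.050, 0.896, 0.054],  # row for N2050 (rework)
    [0.500, 0.300, 0.200],  # hypothetical row with no confident match
]
candidates = extract_candidates(align, col_ids)
```

Rows without any entry above the threshold contribute nothing, so a pain point with no confident technology match is simply left unaddressed by the candidate set.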

[0025] In the final planning stage, the system did not simply package the aforementioned candidate technologies and hand them to the enterprise. Instead, it performed a time-series evolution simulation. The system performed a tensor Kronecker product operation on the 512-dimensional high-order semantic representation matrix of pain points and the feature matrix of the candidate digital demand node set, generating a large cross-combined high-dimensional bilinear pooling feature tensor. After sliding-window compression with two-dimensional convolution kernels, the system appended two global scalar features to the end of the flattened tensor, namely the enterprise's currently available budget of 5 million and a technical personnel saturation level of 80%, and fused them to obtain a comprehensive demand feature map.
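The pairwise outer product at the heart of this step can be shown on a tiny example. The function name `bilinear_tensor` and the two-dimensional toy vectors are assumptions for illustration; the convolutional compression and the scalar concatenation that follow in the text are not shown.

```python
def bilinear_tensor(P, Q):
    """Outer product of every pain point row in P with every technology
    row in Q. Result shape: (len(P), len(Q), dim_P, dim_Q)."""
    return [[[[p_a * q_b for q_b in q] for p_a in p] for q in Q] for p in P]

# One pain point vector and two candidate technology vectors.
P = [[1.0, 2.0]]
Q = [[3.0, 4.0], [5.0, 6.0]]
tensor = bilinear_tensor(P, Q)
```

Each slice `tensor[i][j]` holds all pairwise feature products between pain point `i` and candidate `j`, which is why the tensor grows quickly with the feature dimension and needs compression before further use.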

[0026] The system input the feature map into a time-series prediction model based on gated recurrent units, with each time step set to one quarter of the year. In the simulated evolution time steps T1 (first quarter) and T2 (second quarter), the system's update and reset gates continuously calculated the information retention ratio. Through alignment score calculation using the global spatiotemporal joint attention mechanism, the system output a set of relative priority values: the urgency score for the predictive maintenance algorithm (T-305) was 0.92, and the score for the visual detection model (T-112) was 0.85. When mapping these high-priority nodes back to the underlying technology graph to extract dependencies, the system triggered the inversion correction mechanism. It found that deploying either T-305 or T-112 depends on a prerequisite technology node T-055 (Industrial IoT Edge Computing Data Acquisition Gateway) located at the bottom of the graph, although the gateway node's originally calculated priority score was only 0.45. The system therefore applied the topological dependency inversion and moved node T-055 to the first position in time step T1.
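The dependency correction in this example amounts to a priority-ordered traversal that places prerequisites first. A minimal sketch follows; the function name `order_with_prerequisites` and the dictionary layout are assumptions, while the node identifiers and scores are those given in the example.

```python
def order_with_prerequisites(priorities, prerequisites):
    """Sort nodes by descending priority, but insert every prerequisite
    ahead of the node that depends on it."""
    ranked = sorted(priorities, key=priorities.get, reverse=True)
    path, placed = [], set()

    def place(node, stack=()):
        if node in placed:
            return
        if node in stack:
            raise ValueError("cyclic dependency at " + node)
        for dep in prerequisites.get(node, []):
            place(dep, stack + (node,))
        placed.add(node)
        path.append(node)

    for node in ranked:
        place(node)
    return path

priorities = {"T-305": 0.92, "T-112": 0.85, "T-055": 0.45}
prerequisites = {"T-305": ["T-055"], "T-112": ["T-055"]}
path = order_with_prerequisites(priorities, prerequisites)
```

The gateway node T-055 is moved ahead of both higher-scoring nodes, and the remaining nodes keep their priority order.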

[0027] The system output to the enterprise's management console a set of target digital transformation demand paths with a strict temporal and causal order: Phase 1 (T1) fully procures and implements edge computing data acquisition gateways to establish data pathways; Phase 2 (T2) deploys the predictive equipment maintenance algorithm on the gateway data to eliminate wear at its source; Phase 3 (T3) deploys the visual defect detection model to improve end-of-line quality inspection efficiency. In system backtesting, this quantified, systematic evolution path, which avoids technological silos and funding mismatches, achieved a feasibility score of 96 points, whereas the static, unordered procurement lists generated by traditional methods over the same period showed a topological dependency conflict rate as high as 74%.

[0028] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

Claims

1. A method for identifying enterprise digital transformation needs based on knowledge graphs, characterized in that it comprises: acquiring enterprise multimodal business operation data and performing joint entity relationship extraction and cross-modal entity alignment to construct a multidimensional business operation knowledge graph; performing graph topology anomaly detection and graph structure pruning based on information entropy on the multidimensional business operation knowledge graph to extract pain point semantic subgraphs that represent the characteristics of business flow obstruction; based on the pain point semantic subgraph, using a graph convolutional network based on multi-head self-attention to aggregate node neighborhood features and construct a high-order semantic representation matrix of pain points; combining the pain point semantic subgraph with a pre-set industry digital technology ontology graph for cross-graph joint spatial computation based on subgraph isomorphic matching to extract the corresponding matching candidate digital demand node set; and concatenating the high-order semantic representation matrix of the pain points with the set of candidate digital demand nodes along the tensor dimension at the feature level to obtain a comprehensive demand feature map, and inputting the comprehensive demand feature map into a preset transformation time series prediction model for joint attention optimization to generate a set of target digital transformation demand paths with a quantified execution priority sequence.

2. The method for identifying enterprise digital transformation needs based on knowledge graphs according to claim 1, characterized in that the step of acquiring enterprise multimodal business operation data and performing joint entity relationship extraction and cross-modal entity alignment to construct a multidimensional business operation knowledge graph includes: extracting multimodal business data, including text, images, and time series data, and unifying their features to generate a multimodal initial feature space; performing, in the multimodal initial feature space, isomorphic aggregation based on distance clustering and association rule extraction to establish single-modal initial semantic subgraphs; calculating, based on the single-modal initial semantic subgraphs, the cross-modal basic cost matrix and introducing marginal distribution constraints to construct the cross-modal joint distance matrix; introducing information entropy regularization for the cross-modal joint distance matrix and iteratively solving for the optimal transport matrix to form a set of cross-modal entity mapping anchor points; and performing, based on the cross-modal entity mapping anchor point set, cross-modal node fusion and global feature standardization on the single-modal initial semantic subgraphs to construct the multidimensional business operation knowledge graph.

3. The method for identifying enterprise digital transformation needs based on knowledge graphs according to claim 1, characterized in that the step of performing graph topology anomaly detection and graph structure pruning based on information entropy on the multidimensional business operation knowledge graph to extract pain point semantic subgraphs representing the characteristics of business flow obstruction includes: calculating the topological importance probability and random walk transition probability of nodes in the multidimensional business operation knowledge graph, and generating the structural information entropy sequence of all nodes in the entire graph; extracting the global mean and standard deviation from the structural information entropy sequence to identify suspected blocking nodes, and combining the Mahalanobis distance to generate a set of local anomaly score distributions; segmenting the set of local anomaly score distributions using Otsu's method to find the optimal segmentation point, which is set as the dynamic pruning threshold; and removing, in the multidimensional business operation knowledge graph, normal node networks with local anomaly scores lower than the dynamic pruning threshold, retaining the core anomaly connected components, and extracting the pain point semantic subgraphs representing the characteristics of business flow obstruction.

4. The method for identifying enterprise digital transformation needs based on knowledge graphs according to claim 1, characterized in that the step of using, based on the pain point semantic subgraph, a graph convolutional network based on multi-head self-attention to aggregate node neighborhood features and construct a high-order semantic representation matrix of the pain points includes: extracting the node initial vectors and topological connection relationships from the pain point semantic subgraph, and constructing a diagonal normalized adjacency matrix containing self-connection information; projecting the node features using multiple sets of linear transformation weight matrices, and calculating the inner product and nonlinear activation in combination with the diagonal normalized adjacency matrix to form a multi-head attention coefficient distribution matrix; using the multi-head attention coefficient distribution matrix to compute a weighted sum of the node features and fuse it with residual connections, generating a set of high-order feature representations for each independent head; and tensor-concatenating the high-order feature representations of all heads and applying fully connected dimensionality reduction and recombination to construct the high-order semantic representation matrix of pain points.

5. The method for identifying enterprise digital transformation needs based on knowledge graphs according to claim 1, characterized in that the step of performing cross-graph joint spatial computation based on subgraph isomorphic matching between the pain point semantic subgraph and a pre-set industry digital technology ontology graph to extract a set of corresponding matching candidate digital demand nodes includes: using a graph isomorphism network to map the pain point semantic subgraph and the pre-built industry digital technology ontology graph into the same space, and constructing a basic joint embedding metric space; calculating, in the basic joint embedding metric space, the in-graph topological cost and the cross-graph distance deviation, and linearly combining them to generate a global matching cost matrix; performing projected gradient descent updates on the global matching cost matrix with the goal of minimizing the topological discrepancy, and outputting the converged Gromov-Wasserstein alignment matrix; and analyzing the Gromov-Wasserstein alignment matrix, extracting the optimal matching column indices that meet the lower confidence limit, and performing reverse technical entity lookup and deduplication to extract the corresponding matching candidate digital demand node set.

6. The method for identifying enterprise digital transformation needs based on knowledge graphs according to claim 1, characterized in that the step of concatenating the high-order semantic representation matrix of the pain points with the set of candidate digital demand nodes along the tensor dimension at the feature level to obtain a comprehensive demand feature map includes: performing a tensor Kronecker product operation on the high-order semantic representation matrix of the pain points and the feature matrix corresponding to the set of candidate digital demand nodes to generate a high-dimensional bilinear pooling feature tensor; subjecting the high-dimensional bilinear pooling feature tensor to spatial feature compression using a convolutional neural network and to global attribute concatenation to generate the comprehensive demand feature map; inputting the comprehensive demand feature map into the time series prediction model to perform gated feature interaction and time step evolution simulation, and outputting the hidden layer temporal evolution feature states; calculating the alignment scores of the hidden layer temporal evolution feature states based on the global spatiotemporal joint attention mechanism, and outputting the set of relative priority values of the demands reflecting strategic urgency; and, based on the relative priority value set, sorting the nodes in the candidate digital demand node set in descending order and applying the underlying topological dependency inversion correction to generate the target digital transformation demand path set with a quantified execution priority sequence.