Method and system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion

A scientific and technological achievement evaluation model that integrates multi-source heterogeneous data through a knowledge graph and cross-modal attention fusion addresses the poor objectivity and repeatability of existing evaluation models, makes the evaluation more accurate and more sensitive to market changes, and raises the level of automation of the evaluation.

CN122242276A · Pending · Publication Date: 2026-06-19 · JIANGSU ENTRY-EXIT INSPECTION & QUARANTINE BUREAU IND PROD TESTING CENT +3


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: JIANGSU ENTRY-EXIT INSPECTION & QUARANTINE BUREAU IND PROD TESTING CENT
Filing Date: 2026-05-14
Publication Date: 2026-06-19


IPC: G06F30/27; G06F18/213; G06F18/22; G06F18/25; G06N3/042; G06N3/0499; G06N3/08; G06N5/022

AI Tagging

Application Domain

Design optimisation/simulation; Neural learning methods


Abstract

This invention proposes a method and system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion, belonging to the technical field of achievement evaluation models. Addressing the problems of existing technologies in scientific and technological achievement evaluation, such as reliance on a single data source, lack of effective fusion of multi-source heterogeneous data, insufficient sensitivity to market changes, and low evaluation accuracy, this invention constructs a knowledge graph for scientific and technological achievement evaluation and uses entity recognition to uniformly identify the same entity. It extracts market potential evaluation feature vectors, as well as technical value features, economic value features, and risk features. A cross-modal attention fusion module generates a comprehensive representation vector, ultimately outputting a comprehensive evaluation result. This invention improves the sensitivity and accuracy of the evaluation model to dynamic market changes, reduces evaluation bias caused by ignoring complex market interactions, and enhances the objectivity and repeatability of the evaluation results.


Description

Technical Field

[0001] This invention relates to the field of achievement evaluation model technology, and in particular to a method and system for constructing a scientific and technological achievement evaluation model based on the fusion of multi-source heterogeneous data.

Background Technology

[0002] Technology achievement evaluation is a crucial link connecting technological innovation with industrial application, playing a vital supporting role in optimizing resource allocation, promoting technology transfer, and guiding investment decisions. With the rapid development of next-generation information technologies such as big data, artificial intelligence, and knowledge graphs, the ability to acquire and integrate multi-source heterogeneous data has significantly improved, providing an unprecedented technological foundation for constructing intelligent, multi-dimensional technology achievement evaluation models. Technology achievement evaluation models based on multi-source heterogeneous data fusion can comprehensively mine information from multiple channels, including technical literature, patent data, market reports, policy documents, and industry dynamics, to achieve a systematic and quantitative assessment of the technological value, economic potential, market prospects, and transformation risks of scientific and technological achievements. This is expected to become an important technological tool for promoting the efficient transformation of scientific and technological achievements, reducing investment risks, and accelerating innovation-driven development, possessing broad application prospects and industrial value.

[0003] However, existing technologies have significant shortcomings. First, traditional assessments often rely on a single data source, making it difficult to comprehensively reflect the true state of scientific and technological achievements in multi-dimensional environments such as technology, market, and policy. Furthermore, assessment results are easily influenced by subjective factors, resulting in poor objectivity and repeatability. Second, existing methods lack effective mechanisms for integrating multi-source heterogeneous data. Data from different sources fails to undergo unified semantic alignment and entity identification, leading to prominent information silos and hindering the construction of global knowledge connections. Third, most assessment models fail to fully extract market potential characteristics, particularly key dimensions such as dynamic market supply and demand relationships, technological substitutability, and market maturity, which are often ignored or simplistically treated, resulting in insufficient sensitivity of assessment results to market changes. In addition, existing technologies lack effective means for cross-modal interactive integration of market potential characteristics with technological value, economic value, and risk characteristics. They cannot utilize complementary information between different features to generate highly representative comprehensive vectors, leading to insufficient assessment accuracy.

[0004] Therefore, there is an urgent need for a method and system for constructing a scientific and technological achievement evaluation model that integrates multi-source heterogeneous data with higher accuracy and greater sensitivity to market changes.

Summary of the Invention

[0005] To address these issues, this invention provides a method and system for constructing a scientific and technological achievement evaluation model based on the fusion of multi-source heterogeneous data. This method overcomes the problems in existing technologies, such as evaluation results being susceptible to subjective interference, poor objectivity and repeatability, prominent information silos, difficulty in constructing global knowledge connections, insufficient sensitivity to market changes, and insufficient evaluation accuracy.

[0006] To achieve the above objectives, this invention provides a method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion, comprising:
S1. Acquire multi-source heterogeneous data related to the target scientific and technological achievement.
S2. Based on the multi-source heterogeneous data, construct a knowledge graph for the evaluation of scientific and technological achievements, and use entity recognition to assign a unified identifier to the same entity appearing in different data sources.
S3. Extract a market potential assessment feature vector from the knowledge graph. The market potential assessment feature vector includes at least market supply-demand relationship features, market maturity features, substitutability features, profitability features, and transformation path adaptability features.
S4. Extract the technical value features, economic value features, and risk features of the target scientific and technological achievement from the knowledge graph.
S5. Input the market potential assessment feature vector, together with the technical value features, economic value features, and risk features, into the cross-modal attention fusion module to generate a comprehensive representation vector.
S6. Input the comprehensive representation vector into the comprehensive evaluation network for scientific and technological achievements. The network contains at least one fully connected hidden layer and an output layer. The output layer uses the Sigmoid activation function and outputs a comprehensive evaluation result between 0 and 1.
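The final evaluation step (S6) can be illustrated with a minimal sketch, assuming a single hidden layer with a ReLU activation. The hidden activation, the layer sizes, and all weight values are illustrative assumptions; the source only specifies at least one fully connected hidden layer and a Sigmoid output.

```python
import math

def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(vec, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def evaluate(fused, w_hidden, b_hidden, w_out, b_out):
    """Map a comprehensive representation vector to a score in (0, 1).

    ReLU in the hidden layer is an assumption, not taken from the source.
    """
    hidden = [max(0.0, h) for h in dense(fused, w_hidden, b_hidden)]
    (logit,) = dense(hidden, [w_out], [b_out])
    return sigmoid(logit)

# Hypothetical 3-dimensional fused vector and hand-picked weights.
score = evaluate(
    fused=[0.2, -0.1, 0.4],
    w_hidden=[[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]],
    b_hidden=[0.0, 0.1],
    w_out=[1.2, -0.7],
    b_out=0.05,
)
print(round(score, 4))
```

In a trained model the weights would be learned from labeled evaluation data rather than set by hand.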

[0007] Furthermore, the method for extracting the market supply-demand relationship features includes: Based on the technical field to which the target scientific and technological achievement belongs, a supply-side entity set and a demand-side entity set are retrieved from the knowledge graph for scientific and technological achievement evaluation, wherein the supply-side entities are the organization and product nodes that provide similar or alternative technologies, and the demand-side entities are the industries and user groups that apply the technology. The supply capacity attribute value of each supply-side entity and the demand scale attribute value of each demand-side entity are read. A supply-demand ratio index is calculated from the total supply capacity of all supply-side entities and the total demand scale of all demand-side entities, and a demand growth rate is calculated from the historical time-series data stored in the knowledge graph. A graph structure is constructed with the supply-side and demand-side entities as nodes and the competition, supply, or substitution relationships between entities as edges. A graph attention network performs attention-weighted aggregation over the features of each node and its neighboring nodes and iteratively updates the node representations, after which a supply-demand dynamic balance vector is obtained through global pooling. The supply-demand ratio index, the demand growth rate, and the supply-demand dynamic balance vector are fused to generate the market supply-demand relationship features.
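The two scalar indicators in this step can be sketched as follows. The source does not give formulas, so the compound per-period growth rate and the fusion by simple concatenation are assumptions, and the dynamic balance vector is passed in as a placeholder for the graph attention network output.

```python
def supply_demand_ratio(supply_capacities, demand_scales):
    """Total supply capacity divided by total demand scale."""
    total_demand = sum(demand_scales)
    if total_demand == 0:
        raise ValueError("total demand scale must be non-zero")
    return sum(supply_capacities) / total_demand

def demand_growth_rate(demand_history):
    """Compound growth rate per period (assumed definition)."""
    if len(demand_history) < 2 or demand_history[0] <= 0:
        raise ValueError("need at least two periods and a positive start value")
    periods = len(demand_history) - 1
    return (demand_history[-1] / demand_history[0]) ** (1 / periods) - 1

def fuse_supply_demand(ratio, growth, balance_vector):
    """Fuse by concatenation (assumption; the source only says 'fused')."""
    return [ratio, growth, *balance_vector]

# Hypothetical attribute values read from the knowledge graph.
ratio = supply_demand_ratio([30.0, 50.0], [100.0, 60.0])
growth = demand_growth_rate([100.0, 110.0, 121.0])
features = fuse_supply_demand(ratio, growth, [0.12, -0.40, 0.33])
print(ratio, round(growth, 4), features)
```

A ratio below 1 indicates that recorded demand exceeds recorded supply in the technical field.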

[0008] Furthermore, the method for extracting the market maturity features includes: From the knowledge graph for scientific and technological achievement evaluation, the set of entities related to the market in which the target scientific and technological achievement is located is extracted. The entity set includes market rule makers, standard issuing agencies, existing competitors, and upstream and downstream supporting entities. The preset attribute features of each of these entities are extracted. The attribute features include at least policy support level, standard completeness, competition concentration, and supporting completeness. Based on these attribute features, a market access index is calculated by weighted summation, a competition intensity index is calculated from the number of entities and the distribution of market shares, and an industrial chain completeness index is calculated from the standard completeness and the supporting completeness. A graph structure is constructed with the market-related entities as nodes and the regulatory, competitive, and collaborative relationships between entities as edges. A graph convolutional network performs multi-layer neighborhood aggregation on the initial feature vector of each node to update the node representations, and global average pooling then aggregates all node representations into a graph-level representation vector. The market access index, competition intensity index, industrial chain completeness index, and graph-level representation vector are weighted and fused to generate the market maturity features.
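The three scalar indices can be sketched as below. The source names the inputs but not the formulas, so everything specific here is an assumption: which attributes enter the access index and with what weights, the use of one minus the Herfindahl-Hirschman index as competition intensity, and the plain average for chain completeness.

```python
def market_access_index(attrs, weights):
    """Weighted sum of the attributes named in `weights`."""
    return sum(weights[name] * attrs[name] for name in weights)

def competition_intensity(market_shares):
    """1 - HHI over normalized shares: 0 for a monopoly, near 1 when fragmented."""
    total = sum(market_shares)
    if total == 0:
        return 0.0
    hhi = sum((share / total) ** 2 for share in market_shares)
    return 1.0 - hhi

def chain_completeness(standard_completeness, supporting_completeness):
    """Average of the two completeness attributes (assumed combination)."""
    return (standard_completeness + supporting_completeness) / 2

# Hypothetical attribute values on a 0..1 scale.
attrs = {"policy_support": 0.9, "standard_completeness": 0.8,
         "supporting_completeness": 0.6}
access = market_access_index(
    attrs, {"policy_support": 0.6, "standard_completeness": 0.4})
intensity = competition_intensity([0.5, 0.3, 0.2])
completeness = chain_completeness(attrs["standard_completeness"],
                                  attrs["supporting_completeness"])
print(round(access, 2), round(intensity, 2), round(completeness, 2))
```

These scalars would then be fused with the graph-level vector produced by the graph convolutional network, which is not shown here.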

[0009] Furthermore, the method for extracting the substitutability features includes: From the knowledge graph for scientific and technological achievement evaluation, the initial feature vector of the target scientific and technological achievement node is extracted, together with the initial feature vectors of all alternative scientific and technological achievement nodes whose functions are similar to those of the target. Starting from each alternative achievement node, a random walk with restart is performed with a preset step count and restart probability, and the sequence of nodes visited during the walk is recorded. A Skip-gram model is trained on the generated node sequences, mapping each node to a low-dimensional embedding vector so that nodes with a high co-occurrence probability in the graph lie closer together in the embedding space. The cosine similarity between the embedding vector of the target achievement and the embedding vector of each alternative achievement node is calculated to obtain a functional similarity score. A comprehensive threat score is calculated for each alternative achievement as a weighted sum of its functional similarity score, market share, and cost advantage. The comprehensive threat scores of all alternative nodes are sorted in descending order, and a preset number of top-ranked nodes are selected as the main substitution threats. The weighted average of the comprehensive threat scores of the main substitution threats is taken as the competitive threat intensity, and the standard deviation of those scores is taken as the substitution risk dispersion. The competitive threat intensity and the substitution risk dispersion are concatenated into a vector, which is then nonlinearly transformed through a fully connected layer to generate the substitutability features.
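The scoring part of this step, after the node embeddings already exist, can be sketched as follows. The weight triple, the value of `top_k`, the equal weighting inside the average, and the use of the population standard deviation are assumptions; the random walk, the Skip-gram training, and the final fully connected layer are omitted.

```python
import math
from statistics import pstdev

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def threat_statistics(target_emb, alternatives, weights=(0.5, 0.3, 0.2), top_k=3):
    """Return (competitive threat intensity, substitution risk dispersion)."""
    w_sim, w_share, w_cost = weights
    scores = [
        w_sim * cosine(target_emb, alt["embedding"])
        + w_share * alt["market_share"]
        + w_cost * alt["cost_advantage"]
        for alt in alternatives
    ]
    top = sorted(scores, reverse=True)[:top_k]
    if not top:
        return 0.0, 0.0
    intensity = sum(top) / len(top)   # equal weights assumed
    dispersion = pstdev(top)          # population standard deviation assumed
    return intensity, dispersion

# Hypothetical embeddings and attributes for two alternative technologies.
alternatives = [
    {"embedding": [1.0, 0.0], "market_share": 0.0, "cost_advantage": 0.0},
    {"embedding": [1.0, 0.0], "market_share": 0.0, "cost_advantage": 0.0},
]
intensity, dispersion = threat_statistics([1.0, 0.0], alternatives)
print(intensity, dispersion)
```

A high intensity with low dispersion suggests several comparably strong substitutes, while a high dispersion suggests that the threat is concentrated in a few of them.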

[0010] Furthermore, the methods for extracting the policy support level include: From the knowledge graph of scientific and technological achievements evaluation, retrieve a set of policy document entities related to the technical field of the target scientific and technological achievement. Each policy document entity includes policy type, issuing agency level, and policy text content attributes. The text content of each policy document is segmented into words, and each word is mapped to a policy text vector using a pre-trained word vector model. The cosine similarity between the policy text vector and the preset set of keyword vectors for scientific and technological achievements is calculated to obtain the text matching score. Basic weights are assigned based on policy type; level weights are assigned based on the level of the issuing organization. The support level score for each policy document is obtained by multiplying its text matching score, basic weight, and level weight together. The support scores of all policy documents are summed and normalized to generate the policy support level.
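The policy support calculation can be sketched as below, assuming each policy document has already been reduced to a single text vector. Taking the best match over the keyword vectors, clamping negative similarities to zero, and normalizing by the largest achievable sum are assumptions; the type and level weight tables are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def policy_support_level(policies, keyword_vectors, type_weights, level_weights):
    """Sum per-document scores (match * type weight * level weight), then normalize."""
    if not policies:
        return 0.0
    total = 0.0
    for policy in policies:
        match = max(cosine(policy["vector"], kw) for kw in keyword_vectors)
        match = max(match, 0.0)  # treat negative similarity as no match
        total += match * type_weights[policy["type"]] * level_weights[policy["level"]]
    upper_bound = (len(policies) * max(type_weights.values())
                   * max(level_weights.values()))
    return total / upper_bound if upper_bound else 0.0

# Hypothetical weight tables and two policy documents.
type_weights = {"plan": 1.0, "notice": 0.5}
level_weights = {"national": 1.0, "provincial": 0.6}
policies = [
    {"vector": [1.0, 0.0], "type": "plan", "level": "national"},
    {"vector": [0.0, 1.0], "type": "notice", "level": "provincial"},
]
support = policy_support_level(policies, [[1.0, 0.0]], type_weights, level_weights)
print(support)
```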

[0011] Furthermore, the processing procedure of the cross-modal attention fusion module includes: The market potential assessment feature vector, the technical value features, the economic value features, and the risk features are each mapped into the same embedding space through a linear transformation to obtain the corresponding feature embedding sequences. With the market potential assessment feature embedding as the query, and the technical value feature embedding, economic value feature embedding, and risk feature embedding as the keys and values, the multi-head cross-attention output is calculated. A residual connection and layer normalization are applied to the multi-head cross-attention output and the market potential assessment feature embedding sequence, and the comprehensive representation vector is then generated through a feedforward neural network layer.
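The attention pattern described here can be illustrated with a single-head sketch in plain Python. It is a simplification of the multi-head module in the text: the learned linear projections and the feedforward layer are omitted, so all four inputs are assumed to be already embedded in the same dimension.

```python
import math

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def cross_attention(query, keys_values):
    """Market embedding queries the technical, economic, and risk embeddings."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, kv)) / math.sqrt(d)
              for kv in keys_values]
    weights = softmax(scores)
    context = [sum(w * kv[i] for w, kv in zip(weights, keys_values))
               for i in range(d)]
    # Residual connection followed by layer normalization.
    fused = layer_norm([q + c for q, c in zip(query, context)])
    return fused, weights

# Hypothetical 4-dimensional embeddings.
market = [1.0, 0.0, -1.0, 2.0]
technical = [0.5, 0.5, 0.5, 0.5]
economic = [0.5, 0.5, 0.5, 0.5]
risk = [0.5, 0.5, 0.5, 0.5]
fused, weights = cross_attention(market, [technical, economic, risk])
print([round(w, 3) for w in weights])
```

Because the three key vectors are identical in this toy input, the attention weights come out equal; with distinct embeddings the weights show how much each of the three feature groups contributes to the fused vector.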

[0012] Furthermore, the entity recognition employs an entity alignment method based on graph neural networks, including: The semantic embedding vectors of each entity's name, attributes, and context information are extracted using a pre-trained language model; the Euclidean distance between the semantic embedding vectors of different entities is calculated as the semantic similarity. By combining the neighborhood structure of entities in the knowledge graph, the structural similarity between entities is calculated through a graph attention network; Semantic similarity and structural similarity are weighted and fused to construct an entity similarity matrix. A greedy matching algorithm is then used to merge entity nodes from different data sources that have similarity exceeding a preset threshold and assign them a unified identifier.

[0013] Furthermore, the technical value characteristics are used to characterize the intrinsic value of the target scientific and technological achievement at the technical level, including its degree of advancement relative to existing technologies in the same field.

[0014] Furthermore, the risk features are used to characterize the degree of adverse factors faced by the target scientific and technological achievement during transformation and application, including market acceptance risk, that is, the risk that users or the market resist the achievement.

[0015] This invention also provides a system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion. The system is used to implement any of the above methods for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion. The system includes:
The data acquisition module, used to acquire multi-source heterogeneous data related to the target scientific and technological achievement.
The knowledge graph construction module, connected to the data acquisition module, used to construct a knowledge graph for scientific and technological achievement evaluation based on the multi-source heterogeneous data and to assign a unified identifier to the same entity in different data sources through entity recognition.
The market potential feature extraction module, connected to the knowledge graph construction module, used to extract the market potential assessment feature vector from the knowledge graph. The market potential assessment feature vector includes at least market supply-demand relationship features, market maturity features, substitutability features, profitability features, and transformation path adaptability features.
The value and risk feature extraction module, connected to the market potential feature extraction module, used to extract the technical value features, economic value features, and risk features of the target scientific and technological achievement from the knowledge graph.
The cross-modal attention fusion module, connected to the value and risk feature extraction module, which receives the market potential assessment feature vector together with the technical value features, economic value features, and risk features and generates a comprehensive representation vector.
The comprehensive evaluation module for scientific and technological achievements, connected to the cross-modal attention fusion module, used to input the comprehensive representation vector into the comprehensive evaluation network for scientific and technological achievements. The network contains at least one fully connected hidden layer and an output layer. The output layer uses the Sigmoid activation function and outputs a comprehensive evaluation result between 0 and 1.

[0016] Compared with the prior art, the beneficial effects of the present invention are as follows:

First, this invention extracts a market potential assessment feature vector from the knowledge graph. The vector includes at least market supply-demand relationship features, market maturity features, substitutability features, profitability features, and transformation path adaptability features. Specifically, the market supply-demand relationship features use a graph attention network to model the dynamic balance among supply-side and demand-side entities and their competition, supply, and substitution relationships. The market maturity features use a graph convolutional network to build a graph-level representation of the regulatory, competitive, and collaborative relationships among market rule makers, standard issuing agencies, existing competitors, and upstream and downstream supporting entities. The substitutability features use a random walk with restart and a Skip-gram model to mine the functional similarity between alternative technologies and to calculate the competitive threat intensity and the substitution risk dispersion. This addresses the insufficient sensitivity of existing evaluation models to dynamic market changes and their difficulty in quantifying the impact of technological substitution threats and market ecosystem structures. It improves the responsiveness and predictive accuracy of the evaluation results with respect to market supply and demand fluctuations, the evolution of alternative technologies, and the maturity of the industrial chain, while reducing the evaluation bias caused by ignoring complex market interactions.

Second, this invention inputs the market potential assessment feature vector, together with the technical value features, economic value features, and risk features, into a cross-modal attention fusion module. With the market potential assessment feature embedding as the query and the technical value, economic value, and risk feature embeddings as the keys and values, a multi-head cross-attention output is calculated. This output is then processed through a residual connection, layer normalization, and a feedforward neural network to generate a comprehensive representation vector. This addresses the problem in traditional evaluation methods where simple concatenation or linear weighting of the features leads to ineffective cross-modal information interaction and neglects the complementary relationships between the market potential features and the technical, economic, and risk features. It improves the fusion depth and representational ability of the comprehensive representation vector across modalities, and reduces the information loss and decision-making bias caused by isolation between features.

Third, this invention constructs a knowledge graph for evaluating scientific and technological achievements from multi-source heterogeneous data and uses an entity alignment method based on graph neural networks to assign unified identifiers. It then systematically extracts the market potential, technical value, economic value, and risk features, and outputs a comprehensive evaluation result between 0 and 1 through cross-modal attention fusion and the comprehensive evaluation network. This achieves end-to-end automated evaluation from raw multi-source heterogeneous data to the final evaluation score. It addresses the reliance of existing technologies on a single data source, their lack of unified entity alignment, their high subjectivity, and their insufficient evaluation accuracy. It improves the objectivity, repeatability, automation, and multi-dimensional comprehensive evaluation accuracy for scientific and technological achievements, and reduces the cost of manual intervention and the subjective uncertainty of the evaluation results.

Attached Figure Description

[0017] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0018] Figure 1 is a flowchart of the method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion provided in the embodiments of the present invention. Figure 2 is a structural block diagram of the system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion provided in the embodiments of the present invention.

Detailed Implementation

[0019] To make the objectives and advantages of the present invention clearer, the present invention will be further described below with reference to embodiments; it should be understood that the specific embodiments described herein are merely for explaining the present invention and are not intended to limit the present invention.

[0020] Preferred embodiments of the present invention will now be described with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments are merely illustrative of the technical principles of the present invention and are not intended to limit the scope of protection of the present invention.

[0021] It should be noted that in the description of this invention, the terms "upper", "lower", "left", "right", "inner", "outer", etc., which indicate directions or positional relationships, are based on the directions or positional relationships shown in the accompanying drawings. This is only for the convenience of description and is not intended to indicate or imply that the device or element must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, it should not be construed as a limitation of this invention.

[0022] Furthermore, it should be noted that, in the description of this invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," and "linking" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.

[0023] Example 1. As shown in Figure 1, this invention provides a method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion, comprising: S1, acquiring multi-source heterogeneous data related to the target scientific and technological achievement; S2, based on the multi-source heterogeneous data, constructing a knowledge graph for the evaluation of scientific and technological achievements, and using entity recognition to assign a unified identifier to the same entity in different data sources. The entity recognition employs an entity alignment method based on graph neural networks, including: The semantic embedding vectors of each entity's name, attributes, and context information are extracted using a pre-trained language model, and the Euclidean distance between the semantic embedding vectors of different entities is calculated as the semantic similarity. By combining the neighborhood structure of the entities in the knowledge graph, the structural similarity between entities is calculated through a graph attention network. The semantic similarity and structural similarity are weighted and fused to construct an entity similarity matrix. A greedy matching algorithm is then used to merge entity nodes from different data sources whose similarity exceeds a preset threshold and to assign them a unified identifier.

[0024] In one possible implementation, taking a composite solid-state electrolyte technology for a solid-state lithium battery as an example, the first step is to obtain raw data related to the technology from multiple heterogeneous data sources. This includes acquiring patent documents related to "solid-state electrolyte," "composite electrolyte," "sulfide electrolyte," "oxide electrolyte," and "polymer electrolyte," including fields such as patent number, applicant, inventor, IPC classification number, and citation information; and obtaining time-series statistical data such as the scale of the lithium battery industry, the output of electrolyte materials, and the installed capacity of new energy vehicles from the websites of the National Bureau of Statistics and the Ministry of Industry and Information Technology, as structured data sources.

[0025] Journal and conference papers on solid-state lithium battery electrolytes published in the past five years are retrieved from academic databases such as Wanfang and CNKI, and the paper titles, abstracts, keywords, author affiliations, and citation frequencies are extracted. Market forecast data, the technology routes of mainstream enterprises, and the electrolyte cost composition are extracted from industry research reports. Together these serve as semi-structured data sources.

[0026] Policy documents and development plans related to "solid-state batteries," "new energy vehicles," and "power batteries" are crawled. News reports about solid-state electrolyte technology progress, corporate collaborations, and product launches are scraped from industry news websites. Evaluations and discussions of composite solid-state electrolyte technology by researchers and industry experts are obtained from social media. Together these serve as unstructured data sources.

[0027] A distributed crawler framework (Scrapy) is used to periodically crawl publicly available data from the above sources, while the patent database and statistical yearbook data are retrieved through API calls. The raw data obtained is stored in a distributed file system (HDFS) as JSON, CSV, and text files to form a multi-source heterogeneous data pool.

[0028] A BERT-BiLSTM-CRF-based named entity recognition model is used to extract technical entities, such as argyrodite-type sulfide electrolytes, PEO-based composite electrolytes, and LLZO oxide electrolytes, from patent titles, abstracts, and paper texts. Organization entities, such as "Tesla New Energy Technology Co., Ltd.", "Institute of Physics, Chinese Academy of Sciences", and "Tesla Motor Co., Ltd.", are extracted from patent applicant fields, paper author affiliation fields, and news texts. Inventors, paper authors, and policy drafters are extracted as personnel entities. Product entities are extracted from market reports and news articles. Policy entities are extracted from policy document titles and texts.

[0029] Attributes are then extracted for each entity type: for technology entities, attributes such as ionic conductivity, electrochemical window, interfacial impedance, and technology maturity level; for organization entities, attributes such as registered capital, number of patents, and the proportion of R&D investment related to solid-state batteries; for policy entities, attributes such as policy type, issuing agency level, and release time.

[0030] A combination of distant supervision and hand-written rules is used to extract relationships between entities, such as the Institute of Physics, Chinese Academy of Sciences (organization) developing LLZO electrolyte (technology), a certain company purchasing sulfide electrolytes (product), and the "New Energy Vehicle Development Plan" (policy) encouraging solid-state battery technology (technology). Relationship types include R&D, production, procurement, competition, substitution, citation, funding, and regulation. The extracted entities, attributes, and relationships are stored in a graph database (Neo4j) to form an initial knowledge graph, in which nodes represent entities, edges represent relationships, and both nodes and edges carry type and attribute labels.

[0031] Because the same entity is often described differently across multi-source data (e.g., an abbreviated name and the full name "Institute of Physics, Chinese Academy of Sciences" refer to the same organization), a unified identifier needs to be assigned through entity alignment. For each entity, an input text sequence is constructed: the entity name is used if the entity has a name attribute, and the entity's key attributes and a representative text from its context are appended. The constructed text sequence is then input into the BERT model, and the 768-dimensional last-layer output vector of the classification token ([CLS]) is taken as the semantic embedding vector for that entity.

[0032] A subgraph centered on each entity node is extracted from the initial knowledge graph; it includes the entity's one-hop and two-hop neighbor nodes and their connecting edges. Each node's initial feature vector is formed by concatenating a one-hot encoding of the node type with the pre-trained semantic vector. A two-layer graph attention network (GAT) is then constructed: the first layer takes the node's initial features as input and aggregates neighbor node information through multi-head attention (number of heads = 4) to update the node representation; the second layer further aggregates higher-order neighbor information. A graph reconstruction loss is used during GAT training to ensure that entities with similar structures are close in the embedding space. After the GAT, each entity obtains a structural embedding vector, and the structural similarity between two entities is calculated as the cosine similarity of their structural embedding vectors. For example, two differently worded mentions of the Institute of Physics, Chinese Academy of Sciences have similar neighbor structures, with a structural cosine similarity of 0.92.

[0033] Semantic similarity and structural similarity are weighted and fused to obtain a comprehensive similarity. A greedy matching algorithm is used, with a similarity threshold of 0.85. All entity pairs are traversed, and two entities with a comprehensive similarity greater than the similarity threshold are merged and assigned a unified globally unique identifier. During the merging process, entity names, attributes, and relationships from different data sources are all merged under the unified identifier to form an aligned knowledge graph.
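As a minimal illustrative sketch of the weighted fusion and threshold-based greedy merging described above, the following Python code may be considered. The equal fusion weights, the union-find bookkeeping, the UUID-based identifiers, and all entity names and similarity values in the example are assumptions made for illustration only; the text specifies only the weighted fusion, the greedy matching, and the 0.85 threshold.

```python
from itertools import combinations
import uuid


def align_entities(entities, sem_sim, struct_sim, w_sem=0.5, threshold=0.85):
    """Greedily merge entity pairs whose fused similarity exceeds the threshold.

    entities:   list of entity keys
    sem_sim, struct_sim: dicts mapping frozenset({a, b}) -> similarity in [0, 1]
    w_sem:      weight of the semantic similarity (assumed value)
    Returns a dict mapping every entity key to a unified identifier.
    """
    candidates = []
    for a, b in combinations(entities, 2):
        key = frozenset((a, b))
        fused = w_sem * sem_sim.get(key, 0.0) + (1 - w_sem) * struct_sim.get(key, 0.0)
        if fused > threshold:
            candidates.append((fused, a, b))
    # Greedy order: most similar pairs are merged first.
    candidates.sort(key=lambda item: item[0], reverse=True)

    parent = {e: e for e in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for _, a, b in candidates:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    unified = {}
    return {e: unified.setdefault(find(e), str(uuid.uuid4())) for e in entities}


# Hypothetical example: two mentions of one institute plus an unrelated company.
names = ["IOP CAS (patent source)", "Institute of Physics, CAS (paper source)", "Tao Energy"]
pair = frozenset(names[:2])
ids = align_entities(names, sem_sim={pair: 0.90}, struct_sim={pair: 0.92})
print(ids[names[0]] == ids[names[1]], ids[names[0]] == ids[names[2]])  # True False
```

In this sketch the two mentions have a fused similarity of 0.91, which exceeds 0.85, so they receive the same identifier, while the unrelated entity keeps its own.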

[0034] This invention addresses the challenge of automatically identifying and aligning entities in multi-source heterogeneous data by employing a pre-trained language model to extract semantic embedding vectors containing entity names, attributes, and contextual information, and calculating Euclidean distance as semantic similarity. This improves the recall and accuracy of entity alignment. Furthermore, by combining the neighborhood structure of entities in a knowledge graph and using a graph attention network to calculate structural similarity between entities, it solves the problem of incorrect matching due to insufficient or ambiguous textual descriptions when relying solely on semantic similarity. This enhances the robustness and anti-interference ability of entity alignment and reduces the misalignment rate. Finally, by using a greedy matching algorithm to merge entity nodes from different data sources with similarity exceeding a preset threshold and assigning them a unified identifier, this invention solves the problems of entity redundancy, duplicate storage, and broken cross-source associations in multi-source data. This improves the compactness and data consistency of the knowledge graph and reduces computational errors and evaluation biases caused by entity duplication or omission in subsequent feature extraction stages.

[0035] S3: extract market potential assessment feature vectors from the knowledge graph. The market potential assessment feature vectors include at least market supply-demand relationship features, market maturity features, substitutability features, profitability features, and transformation path adaptability features. The method for extracting the market supply-demand relationship features is as follows. First, based on the technical field to which the target scientific and technological achievement belongs, a supply-side entity set and a demand-side entity set are retrieved from the scientific and technological achievement evaluation knowledge graph; the supply-side entities are the organization and product nodes that provide similar or alternative technologies, and the demand-side entities are the industries and user groups that apply the technology. Second, the supply capacity attribute value of each supply-side entity and the demand scale attribute value of each demand-side entity are read; the supply-demand ratio index is calculated from the total supply capacity of all supply-side entities and the total demand scale of all demand-side entities, and the demand growth rate is calculated from the historical time-series data stored in the knowledge graph. Third, a graph structure is constructed with supply-side and demand-side entities as nodes and the competition, supply, or substitution relationships between entities as edges; a graph attention network performs attention-weighted aggregation on the features of each node and its neighboring nodes, iteratively updates the node representations, and then obtains the supply-demand dynamic balance vector through global pooling. Finally, the supply-demand ratio index, the demand growth rate, and the supply-demand dynamic balance vector are fused to generate the market supply-demand relationship features.

[0036] In one possible implementation, based on the technical field of the target technology, "solid-state electrolyte / composite electrolyte," a supply-side entity set and a demand-side entity set are retrieved and extracted from a knowledge graph. The supply-side entity set includes organizational nodes and product nodes that provide similar or alternative technologies. For example, a certain company's supply capacity attribute is 200 tons of sulfide electrolyte annual production capacity, a certain company's supply capacity is 80 tons of oxide electrolyte annual production capacity, and a certain company's supply capacity is 10 tons of sulfide electrolyte pilot line production capacity. The demand-side entity set includes industry nodes and user group nodes that apply the technology. For example, the demand scale attribute of the new energy vehicle industry is 20 GWh for solid-state batteries, the consumer electronics industry is 2 GWh, and the energy storage industry is 5 GWh.

[0037] The supply capacity attribute values of the supply-side entities are read and summed to obtain the total supply capacity, and the demand scale attribute values of the demand-side entities are read and summed to obtain the total demand scale; the supply-demand ratio index is then calculated from the two totals. In this embodiment, the supply-demand ratio index is 0.0093, indicating that the current supply capacity is far lower than the potential demand. The demand growth rate is calculated from the historical time-series data stored in the knowledge graph: demand was 8 GWh in 2021 and 32 GWh in 2025, giving a compound annual growth rate of 41.4%.
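The growth-rate arithmetic of this embodiment can be checked with a short Python sketch. The function names are hypothetical. The ratio function assumes that supply and demand have already been converted to a common unit; the text does not state how tons of electrolyte are converted to GWh, so the 0.0093 figure is not reproduced here.

```python
def supply_demand_ratio(supply_capacities, demand_scales):
    """Total supply divided by total demand (both expressed in the same unit)."""
    return sum(supply_capacities) / sum(demand_scales)


def compound_annual_growth_rate(start_value, end_value, years):
    """CAGR = (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Demand of 8 GWh in 2021 and 32 GWh in 2025, i.e. four yearly intervals.
growth = compound_annual_growth_rate(8.0, 32.0, 2025 - 2021)
print(f"{growth:.1%}")  # 41.4%
```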

[0038] A graph structure is constructed using supply-side and demand-side entities as nodes, and competition, supply, or substitution relationships between entities as edges. For example, there is a competition edge between a certain Deshidai and a certain Tao Energy, a supply edge between a certain company and the new energy vehicle industry, and a substitution edge between the sulfide electrolyte of a certain field and the target technology.

[0039] A feature vector is initialized for each node: supply-side node features include supply capacity, technological maturity, and cost; demand-side node features include demand size, demand growth rate, and price sensitivity. A two-layer graph attention network (GAT) is used, with four attention heads in each layer to perform attention-weighted aggregation of the features of each node and its neighboring nodes, iteratively updating the node representation. After three rounds of iteration, global average pooling is used to aggregate all node representations into a 128-dimensional supply-demand dynamic balance vector.
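A minimal NumPy sketch of attention-weighted aggregation followed by global average pooling is given below for illustration. It assumes the standard graph attention formulation (LeakyReLU-scored, softmax-normalized attention with concatenated heads), 32 dimensions per head so that four heads yield 128 dimensions, a ReLU between the two layers, and randomly initialized, untrained weights. The toy graph and feature values are hypothetical; a practical system would use a trained model from a graph learning library.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def gat_layer(h, adj, W, a):
    """One multi-head graph attention layer.

    h:   (N, F) node features
    adj: (N, N) 0/1 adjacency matrix that already includes self-loops
    W:   (K, F, D) one projection matrix per head
    a:   (K, 2 * D) one attention vector per head
    Returns an (N, K * D) matrix: the K head outputs concatenated.
    """
    heads = []
    for W_k, a_k in zip(W, a):
        z = h @ W_k                                # (N, D) projected features
        d = z.shape[1]
        # e[i, j] = LeakyReLU(a^T [z_i || z_j])
        e = leaky_relu((z @ a_k[:d])[:, None] + (z @ a_k[d:])[None, :])
        e = np.where(adj > 0, e, -np.inf)          # mask non-neighbors
        e = e - e.max(axis=1, keepdims=True)       # numerically stable softmax
        alpha = np.exp(e)
        alpha = alpha / alpha.sum(axis=1, keepdims=True)
        heads.append(alpha @ z)                    # attention-weighted aggregation
    return np.concatenate(heads, axis=1)


rng = np.random.default_rng(0)
# Toy graph: supply-side nodes 0 and 1, demand-side nodes 2 and 3.
# Edges: 0-1 (competition), 0-2 and 1-3 (supply); the diagonal holds self-loops.
adj = np.array([[1, 1, 1, 0],
                [1, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1]])
x = rng.random((4, 3))        # three initial features per node (hypothetical values)

K, D = 4, 32                  # 4 heads x 32 dims = 128 dims per layer
W1, a1 = rng.normal(size=(K, 3, D)), rng.normal(size=(K, 2 * D))
W2, a2 = rng.normal(size=(K, K * D, D)), rng.normal(size=(K, 2 * D))

h1 = np.maximum(gat_layer(x, adj, W1, a1), 0.0)    # ReLU between layers (assumed)
h2 = gat_layer(h1, adj, W2, a2)
balance_vector = h2.mean(axis=0)                   # global average pooling
print(balance_vector.shape)  # (128,)
```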

[0040] The supply-demand ratio index (0.0093), demand growth rate (0.414), and supply-demand dynamic balance vector are concatenated and then fused through a fully connected layer (input dimension 130, output dimension 64, activation function ReLU) to finally generate a 64-dimensional market supply and demand relationship feature vector. This feature vector comprehensively reflects the information that the current supply of the target technology is far less than the demand, the demand is growing rapidly, and the supply and demand network is dynamically adjusted, providing input for subsequent cross-modal fusion.

[0041] By retrieving supply-side and demand-side entity sets from the knowledge graph according to the technical field of the target achievement, this invention addresses the problem that traditional assessment methods rely solely on macro-level statistical data and cannot pinpoint the correspondence between supply-side and demand-side entities within a specific technological field. This improves the precision and specificity of supply-demand relationship quantification and reduces the assessment bias caused by neglecting supply-demand mismatches at the level of specific technologies. Furthermore, by calculating the demand growth rate from the historical time-series data stored in the knowledge graph and incorporating both the supply-demand ratio index and the demand growth rate into the market supply-demand relationship features, it overcomes the limitation that a static supply-demand ratio alone fails to reflect the dynamic trend of market demand changes, improving the ability to anticipate future market trends and reducing the lag in assessment results caused by demand fluctuations. Finally, by constructing a graph structure with supply-side and demand-side entities as nodes and competition, supply, or substitution relationships as edges, and using a graph attention network to perform attention-weighted aggregation and global pooling of node and neighborhood features to obtain a supply-demand dynamic balance vector, it solves the problem that traditional methods ignore the complex interactions between supply and demand entities and the impact of network structure on market balance. This improves the ability of the supply-demand relationship features to represent the inherent game mechanism of the market and reduces the risk of losing key information through simplified processing of the supply-demand network.

[0042] The method for extracting the market maturity features is as follows. First, the set of entities related to the market in which the target scientific and technological achievement is located is extracted from the scientific and technological achievement evaluation knowledge graph; the set includes market rule-makers, standard-issuing bodies, existing competitors, and upstream and downstream supporting entities. Second, the preset attribute features of each of these entities are extracted; the attribute features include at least policy support, standard completeness, competition concentration, and supporting completeness. Third, based on these attribute features, the market access index is calculated by weighted summation, the competition intensity index is calculated by analyzing the number of entities and the market share distribution, and the industry chain completeness index is calculated from standard completeness and supporting completeness. Fourth, a graph structure is constructed with the market-related entities as nodes and the regulatory, competitive, and collaborative relationships between entities as edges; a graph convolutional network performs multi-layer neighborhood aggregation on the initial feature vector of each node to update the node representations, and global average pooling then aggregates all node representations into a graph-level representation vector. Finally, the market access index, competition intensity index, industry chain completeness index, and graph-level representation vector are weighted and fused to generate the market maturity features.

[0043] The method for extracting policy support is as follows. A set of policy document entities related to the technical field of the target scientific and technological achievement is retrieved from the scientific and technological achievement evaluation knowledge graph; each policy document entity includes policy type, issuing agency level, and policy text content attributes. The text content of each policy document is segmented into words, each word is mapped to a word vector using a pre-trained word vector model, and the word vectors are combined into a policy text vector. The cosine similarity between the policy text vector and the preset set of scientific and technological achievement keyword vectors is calculated to obtain the text matching score. A basic weight is assigned based on policy type, and a level weight is assigned based on the level of the issuing agency. The support score of each policy document is obtained by multiplying its text matching score, basic weight, and level weight. The support scores of all policy documents are summed and normalized to generate the policy support level.

[0044] In one possible implementation, entities related to the market in which the target technology is located are retrieved from the knowledge graph, including: market rule-makers (the Ministry of Industry and Information Technology, the National Development and Reform Commission, and the State Administration for Market Regulation); standard-issuing bodies (the National Automotive Standardization Technical Committee and the China Electronics Technology Standardization Institute); existing competitors (Dede Times, Tao Energy, and Tian); and upstream and downstream supporting entities (upstream lithium mining companies, electrolyte raw material suppliers, downstream battery packaging companies, and end-user vehicle manufacturers).

[0045] The following attribute features are extracted for each entity. A policy support score is calculated for each entity; for example, the Ministry of Industry and Information Technology has a policy support score of 0.92, and the National Development and Reform Commission has a score of 0.88. Standard completeness is determined by retrieving from the knowledge graph the number and coverage of solid-state battery-related standards that the entity has participated in developing or publishing, and is quantified as a score from 0 to 1; for example, the National Automotive Standardization Technical Committee has a standard completeness score of 0.45, and the China Electronics Technology Standardization Institute has a score of 0.38. For competition concentration, the Herfindahl-Hirschman Index (HHI) is calculated from the market shares of existing competitors in the electrolyte market and then normalized. In this example, the market shares of the top four companies are 35%, 28%, 20%, and 12%, respectively, resulting in a competition concentration of 0.72 (relatively high).
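For reference, the raw Herfindahl-Hirschman Index for the four market shares listed above can be computed as in the following Python sketch. The function name is hypothetical. The normalization that maps the raw index to the competition concentration value reported in this embodiment is not specified in the text and is therefore not reproduced.

```python
def herfindahl_hirschman_index(shares):
    """Raw HHI: the sum of squared market shares, with shares given as fractions."""
    return sum(s * s for s in shares)


top_four_shares = [0.35, 0.28, 0.20, 0.12]
hhi = herfindahl_hirschman_index(top_four_shares)
print(round(hhi, 4))  # 0.2553
```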

[0046] The completeness of supporting facilities is assessed by comprehensively evaluating upstream and downstream supporting entities based on their technological capabilities, production capacity matching, and geographical distribution. For example, the completeness of supporting facilities for upstream lithium mining is 0.68, electrolyte raw material supply is 0.52, battery packaging is 0.71, and vehicle application is 0.63.

[0047] When extracting policy support attributes, taking the Ministry of Industry and Information Technology as an example, policy documents related to the technologies of "solid-state lithium battery" or "composite electrolyte" were retrieved from the knowledge graph, resulting in 10 policy documents. Each document includes the policy type (plan, guidance, standard), issuing agency level, and policy text content. Each policy document text was segmented, and a pre-trained Word2Vec model was used to map each word to a 300-dimensional vector. The average of these vectors was used to obtain the policy text vector. A pre-set set of scientific and technological achievement keyword vectors included the average vectors of 10 keywords such as "solid-state electrolyte," "composite electrolyte," "ionic conductivity," and "interface stability." Cosine similarity was calculated; for example, the text matching score for the "New Energy Vehicle Industry Development Plan" was 0.87.

[0048] Basic weights are assigned based on policy type: Planning 1.0, Guiding Opinions 0.8, and Standards 0.6. Level weights are assigned based on the issuing agency: National level 1.0, Ministerial level 0.7, and Provincial level 0.4. For the "Lithium Battery Industry Standard Conditions" issued by the Ministry of Industry and Information Technology, the basic weight is 0.6, and the level weight is 0.7.

[0049] The support strength score for each policy is calculated as follows: Text matching score × Base weight × Level weight. The total score is obtained by summing all policy support scores and then normalizing it, resulting in a policy support score of 0.46. If the cumulative policy strength of the Ministry of Industry and Information Technology (MIIT) as the policy issuing body is considered, the normalized score is 0.92.
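The scoring rule above (text matching score × basic weight × level weight, then summation and normalization) can be sketched in Python as follows. The weight tables follow the values given in this embodiment. The dictionary keys, the function names, the example text matching score of 0.80, and the choice to normalize by the number of documents are assumptions, since the text does not specify the normalization.

```python
import math

BASE_WEIGHT = {"plan": 1.0, "guiding_opinion": 0.8, "standard": 0.6}
LEVEL_WEIGHT = {"national": 1.0, "ministerial": 0.7, "provincial": 0.4}


def mean_vector(vectors):
    """Average a list of equal-length word vectors into one text vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def document_support(match_score, policy_type, issuer_level):
    """Support score of one policy document."""
    return match_score * BASE_WEIGHT[policy_type] * LEVEL_WEIGHT[issuer_level]


def policy_support(documents):
    """Sum the per-document scores and divide by the document count
    (assumed normalization; every per-document score is at most 1)."""
    return sum(document_support(*doc) for doc in documents) / len(documents)


# A standard-type document issued at ministerial level with a hypothetical
# text matching score of 0.80: 0.80 x 0.6 x 0.7.
score = document_support(0.80, "standard", "ministerial")
print(round(score, 3))  # 0.336
```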

[0050] The market access index is calculated as 0.62 (moderately high) by weighting and summing attributes such as policy support, standard completeness, and industry licensing requirements, with preset weights of 0.5, 0.3, and 0.2. Based on the number of competitors and their market share distribution, the modified HHI index is used to calculate the normalized competition intensity index, which is 0.78 (intense competition). The standard completeness and supporting infrastructure completeness (average of 0.64 for each link) are weighted and combined with weights of 0.4 and 0.6, respectively, resulting in an industry chain completeness index of 0.56 (under development and not yet perfect).
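The weighted combination for the industry chain completeness index can be illustrated with the following Python sketch. The function name is hypothetical. The four supporting completeness values are taken from this embodiment. The text does not state which standard completeness value enters the calculation, so the value 0.45 reported for the National Automotive Standardization Technical Committee is assumed here, which gives a result consistent with the stated 0.56.

```python
def weighted_sum(values, weights):
    """Weighted summation used for the index calculations."""
    return sum(v * w for v, w in zip(values, weights))


# Supporting completeness of the four links: lithium mining, electrolyte raw
# materials, battery packaging, vehicle application.
supporting = [0.68, 0.52, 0.71, 0.63]
supporting_avg = sum(supporting) / len(supporting)  # about 0.64

standard_completeness = 0.45  # assumed input value (see lead-in)
chain_index = weighted_sum([standard_completeness, supporting_avg], [0.4, 0.6])
print(round(chain_index, 2))  # 0.56
```

The same `weighted_sum` helper applies to the market access index with the preset weights 0.5, 0.3, and 0.2; the individual inputs behind the stated value of 0.62 are not all given in the text, so that figure is not recomputed here.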

[0051] A graph structure is constructed using all extracted market-related entities as nodes and the regulatory, competitive, and collaborative relationships between entities as edges. For example: Ministry of Industry and Information Technology → MDE Technology (regulatory relationship); a certain Deshidai → a certain Tao Energy (competitive relationship); National Automotive Standardization Technical Committee → China Electronics Technology Standardization Institute (collaborative relationship).

[0052] A feature vector is initialized for each node, containing the aforementioned attribute features and entity type encoding. A two-layer Graph Convolutional Network (GCN) is used. The first layer maps the input features (32 dimensions) to 64 dimensions, and the second layer maps them to 128 dimensions, updating the node representation through neighborhood aggregation. Finally, global average pooling aggregates all node representations into a 128-dimensional graph-level representation vector. The market access index (0.62), competition intensity index (0.78), and industry chain completeness index (0.56) are concatenated with the graph-level representation vector, and then weighted and fused through a fully connected layer (input dimension 131, output dimension 64, activation function ReLU) to generate a 64-dimensional market maturity feature vector. This feature comprehensively reflects the maturity of the policy environment, standards development, competitive landscape, and industry chain support in the market where the target technology is located.
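The layer dimensions above (32 → 64 → 128, then 3 + 128 = 131 → 64) can be traced with the following NumPy sketch. It assumes the commonly used symmetric normalization of the adjacency matrix with self-loops and a ReLU after each graph convolution; the weights are random and untrained, and the five-node toy graph and its features are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) (A + I) D^(-1/2)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, W):
    """Neighborhood aggregation followed by a linear map and ReLU."""
    return np.maximum(a_norm @ h @ W, 0.0)


rng = np.random.default_rng(0)
n_nodes = 5
adj = np.zeros((n_nodes, n_nodes))
# Toy edges: regulatory (0-1), competitive (1-2), collaborative (3-4), regulatory (0-3).
for i, j in [(0, 1), (1, 2), (3, 4), (0, 3)]:
    adj[i, j] = adj[j, i] = 1.0

x = rng.random((n_nodes, 32))                     # 32-dimensional initial node features
W1 = rng.normal(scale=0.1, size=(32, 64))
W2 = rng.normal(scale=0.1, size=(64, 128))

a_norm = normalize_adjacency(adj)
h = gcn_layer(a_norm, gcn_layer(a_norm, x, W1), W2)
graph_vector = h.mean(axis=0)                     # global average pooling -> 128 dims

indices = np.array([0.62, 0.78, 0.56])            # access, competition, chain completeness
fused_input = np.concatenate([indices, graph_vector])        # 3 + 128 = 131 dims
W_fc = rng.normal(scale=0.1, size=(64, 131))
b_fc = np.zeros(64)
maturity_feature = np.maximum(W_fc @ fused_input + b_fc, 0.0)  # fully connected + ReLU
print(fused_input.shape, maturity_feature.shape)  # (131,) (64,)
```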

[0053] This invention extracts from the knowledge graph a set of entities that includes market rule-makers, standard-issuing bodies, existing competitors, and upstream and downstream supporting entities, and then extracts policy support, standard completeness, competition concentration, and supporting completeness as attribute features. This addresses the problem that traditional assessment methods focus only on market size and growth rate while neglecting soft factors such as the market's institutional environment, standards system, and industry chain support. It improves the multi-dimensional comprehensiveness of market maturity assessment and reduces the risk of assessment results being out of touch with reality because institutional and supporting factors were neglected. Furthermore, the invention constructs a graph structure with market-related entities as nodes and regulatory, competitive, and collaborative relationships as edges, and uses a graph convolutional network to perform multi-layer neighborhood aggregation on the initial feature vectors of the nodes, followed by global average pooling, to obtain a graph-level representation vector. This addresses the problem that traditional methods neglect the impact of complex interactions between market participants on overall market maturity, improves the ability of feature extraction to capture the structure of the market ecosystem, and reduces the loss of key information caused by simplified processing of market relationships. Finally, by weighting and fusing the market access index, competition intensity index, and industry chain completeness index with the graph-level representation vector to generate the market maturity features, the invention solves the problem that a single statistical indicator or a single graph vector cannot capture macro-level indices and micro-level structural information at the same time. This improves the comprehensive representation ability and robustness of the market maturity features and reduces the probability of misjudgment in subsequent evaluation stages caused by one-sided feature information.

[0054] The method for extracting the substitutability features is as follows. First, the initial feature vector of the target scientific and technological achievement node and the initial feature vectors of all alternative scientific and technological achievement nodes with functions similar to the target are extracted from the scientific and technological achievement evaluation knowledge graph. Second, starting from each alternative node, a random walk with restart is performed with a preset step size and restart probability, and the sequence of nodes visited during each walk is recorded. Third, a Skip-gram model is trained on the generated node sequences to map each node to a low-dimensional embedding vector, so that nodes with high co-occurrence probability in the graph are closer in the embedding space. Fourth, the cosine similarity between the embedding vector of the target achievement and the embedding vector of each alternative node is calculated to obtain a functional similarity score, and a comprehensive threat score is calculated for each alternative as a weighted sum of its functional similarity score, market share, and cost advantage. Fifth, the comprehensive threat scores of all alternative nodes are sorted in descending order, and a preset number of top-ranked nodes are selected as the main substitution threats; the weighted average of their comprehensive threat scores is calculated as the competitive threat intensity, and the standard deviation of their comprehensive threat scores is calculated as the substitution risk dispersion. Finally, the competitive threat intensity and the substitution risk dispersion are concatenated into a vector and nonlinearly transformed through a fully connected layer to generate the substitutability features.

[0055] In one possible implementation, an initial feature vector of the target technology node is extracted from the knowledge graph, including an ionic conductivity of 1.2 × 10⁻³ S/cm, the electrochemical window, the interfacial impedance, and a cost of 150 yuan/kg. All alternative technological achievements with functions similar to the target technology are then retrieved, resulting in the following four alternative nodes: sulfide electrolyte, ionic conductivity 2.5 × 10⁻² S/cm, cost 80 yuan/kg; oxide electrolyte, ionic conductivity 5.0 × 10⁻⁴ S/cm, cost 200 yuan/kg; polymer electrolyte, ionic conductivity 1.0 × 10⁻⁴ S/cm, cost 50 yuan/kg; halide electrolyte, ionic conductivity 1.0 × 10⁻³ S/cm, cost 300 yuan/kg.

[0056] For each alternative technology node, a random walk with restart (Node2Vec strategy) is performed. The number of steps per walk L is set to 80 and the restart probability p to 0.15. Each walk starts from the alternative node; at each step it returns to the starting node with probability p, and with probability 1 − p it moves to a randomly selected neighboring node. Thirty walks are performed from each alternative node, and the sequence of nodes visited in each walk is recorded (30 × 80 = 2,400 node visits per alternative node). For example, starting from the sulfide electrolyte, a walk sequence might be sulfide electrolyte → a certain company → sulfide patent group → a certain era → sulfide electrolyte → ..., eventually yielding a set of walk sequences for all alternative nodes.
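The walk procedure of this paragraph can be sketched in plain Python as follows. The adjacency-list representation, the function names, and the three-node toy graph are hypothetical; a walk is taken here to contain 80 nodes including the starting node, which matches the count of 30 × 80 = 2,400 recorded nodes per alternative node.

```python
import random


def random_walk_with_restart(graph, start, length=80, restart_prob=0.15, rng=random):
    """Return a walk of `length` nodes that begins at `start`.

    graph: dict mapping each node to a list of its neighbor nodes.
    At every step the walker jumps back to `start` with probability
    `restart_prob`; otherwise it moves to a uniformly chosen neighbor.
    """
    walk = [start]
    current = start
    while len(walk) < length:
        neighbors = graph.get(current, [])
        if rng.random() < restart_prob or not neighbors:
            current = start
        else:
            current = rng.choice(neighbors)
        walk.append(current)
    return walk


def generate_walks(graph, start_nodes, walks_per_node=30, **kwargs):
    """Run `walks_per_node` walks from every start node."""
    return [random_walk_with_restart(graph, s, **kwargs)
            for s in start_nodes for _ in range(walks_per_node)]


toy_graph = {
    "sulfide electrolyte": ["company A", "sulfide patent group"],
    "company A": ["sulfide electrolyte", "sulfide patent group"],
    "sulfide patent group": ["sulfide electrolyte", "company A"],
}
walks = generate_walks(toy_graph, ["sulfide electrolyte"], rng=random.Random(0))
print(len(walks), sum(len(w) for w in walks))  # 30 2400
```

The resulting list of node sequences is the kind of corpus that a Skip-gram model would then be trained on.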

[0057] The node sequences generated from all alternative nodes are merged and input into the Skip-gram model for training, with embedding dimension d = 128, window size w = 5, number of negative samples k = 10, and 10 training epochs. After training, each node (including the target node and all alternative nodes) obtains a 128-dimensional embedding vector, such that nodes with high co-occurrence probabilities in the graph are close to each other in the embedding space.

[0058] The cosine similarity between the target technology embedding vector and the embedding vector of each alternative node is calculated to obtain the functional similarity score. In this embodiment, the scores are 0.91 for sulfide electrolyte, 0.72 for oxide electrolyte, 0.58 for polymer electrolyte, and 0.83 for halide electrolyte.

[0059] The market share and cost advantage of each alternative technology were retrieved from the knowledge graph. The weights for functional similarity (0.5), market share (0.3), and cost advantage (0.2) were set, and a comprehensive threat score was calculated for each alternative node. The calculated comprehensive threat scores were: 0.684 for sulfide electrolytes, 0.354 for oxide electrolytes, 0.454 for polymer electrolytes, and 0.230 for halide electrolytes.

[0060] The overall threat scores of all alternative nodes were sorted in descending order, and the top two were selected as the main alternative threats, namely sulfide electrolytes and polymer electrolytes. The weighted average of the overall threat scores of the main alternative threats (with market share as the weight) was calculated as the competitive threat intensity, with a result of 0.642. The standard deviation of the overall threat scores of the main alternative threats was calculated as the substitution risk dispersion, with a result of 0.115.
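The ranking and dispersion steps can be reproduced with the following Python sketch, which uses the comprehensive threat scores stated in this embodiment. The function names are hypothetical. The dispersion is computed as a population standard deviation, which matches the stated 0.115 for the two selected scores. The market shares used as weights for the competitive threat intensity are not given in the text, so the 0.642 figure is not recomputed.

```python
from statistics import pstdev


def comprehensive_threat(similarity, market_share, cost_advantage,
                         weights=(0.5, 0.3, 0.2)):
    """Weighted sum of functional similarity, market share, and cost advantage."""
    w_sim, w_share, w_cost = weights
    return w_sim * similarity + w_share * market_share + w_cost * cost_advantage


def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Comprehensive threat scores as stated in this embodiment.
threat_scores = {"sulfide": 0.684, "oxide": 0.354, "polymer": 0.454, "halide": 0.230}

# Sort in descending order and keep the top two as the main substitution threats.
main_threats = sorted(threat_scores, key=threat_scores.get, reverse=True)[:2]
main_scores = [threat_scores[name] for name in main_threats]

dispersion = pstdev(main_scores)   # population standard deviation
print(main_threats, round(dispersion, 3))  # ['sulfide', 'polymer'] 0.115
```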

[0061] The competitive threat intensity (0.642) and the substitution risk dispersion (0.115) are concatenated into a 2-dimensional vector and then nonlinearly transformed through a fully connected layer (2-dimensional input, 32-dimensional output, ReLU activation function) to obtain a 32-dimensional substitutability feature vector. This feature vector quantitatively reflects the intensity of the main substitution threats faced by the target technology (0.642) and the concentration or dispersion of the sources of those threats (0.115; a lower dispersion indicates that the threat mainly comes from a few alternative technologies), and it can be used for subsequent cross-modal fusion.

[0062] This invention extracts from the knowledge graph the initial feature vectors of the target scientific and technological achievement node and of all functionally similar alternative nodes, and uses a random walk with restart to generate node sequences. This solves the problem that traditional evaluation methods rely solely on subjective experience or simple comparisons of technical indicators to identify alternative technologies, which makes it difficult to cover potential sources of substitution comprehensively. It improves the systematicness and completeness of alternative technology identification and reduces the risk of evaluation bias caused by omitting key alternative technologies. Furthermore, by training a Skip-gram model on the node sequences, each node is mapped to a low-dimensional embedding vector such that nodes with high co-occurrence probabilities in the graph are closer together in the embedding space. This addresses the issue that traditional similarity calculation methods cannot utilize the rich structural information and contextual relationships in knowledge graphs, improves the semantic expressiveness and accuracy of functional similarity assessment, and reduces misjudgments caused by ignoring indirect connections between technologies. Finally, by calculating the cosine similarity between the target technology's embedding vector and the embedding vector of each alternative node to obtain the functional similarity score, and by combining it with market share and cost advantage in a weighted comprehensive threat score, the invention solves the problem that functional similarity alone cannot reflect the actual market competitiveness and economic viability of alternative technologies. This improves the multi-factor fusion capability of the comprehensive threat score and reduces the distortion of threat assessment caused by ignoring commercial factors.

[0063] S4: extract the technical value characteristics, economic value characteristics, and risk characteristics of the target scientific and technological achievement from the knowledge graph. The technical value characteristics are used to characterize the intrinsic value of the target scientific and technological achievement at the technological level, including its degree of advancement relative to existing technologies in the same field.

[0064] Risk characteristics are used to characterize the degree of adverse factors faced by the target scientific and technological achievements during the transformation and application process, including the market acceptance risk of users or the market resisting the scientific and technological achievements.

[0065] In one possible implementation, the technical value characteristics are used to characterize the intrinsic value of the target scientific and technological achievement at the technological level. This embodiment focuses on extracting the core indicator of "the degree of advancement relative to existing technologies in the same field." Specifically, all existing technology nodes belonging to the same "solid electrolyte / composite electrolyte" technology field as the target technology are retrieved from the knowledge graph. Excluding the target technology itself, 15 representative existing technology nodes are obtained, including sulfide electrolytes, oxide electrolytes, polymer electrolytes, halide electrolytes, and other composite electrolytes. The attribute values of the target technology and of each existing technology node on the following key technical indicators are read from the knowledge graph: room-temperature ionic conductivity, electrochemical stability window, interfacial impedance, lithium-ion transference number, and mechanical strength. The indicator values of the target technology are an ionic conductivity of 1.2 × 10⁻³ S/cm, an electrochemical window of 5.2 V, an interfacial impedance of 80 Ω·cm², a lithium-ion transference number of 0.68, and a mechanical strength of 12 MPa.

[0066] The optimal values among the existing technology nodes (the best performance across all nodes for each indicator) are: ionic conductivity, 2.5 × 10⁻² S/cm (sulfide electrolyte); electrochemical window, 6.0 V (halide electrolyte); interfacial impedance, 50 Ω·cm² (polymer electrolyte); lithium-ion transference number, 0.85 (oxide electrolyte); mechanical strength, 20 MPa (oxide electrolyte).

[0067] For each indicator, a normalized comparison method was used to calculate the advancement score of the target technology relative to the best existing technology. Indicator directions were defined (positive indicators are better the larger, negative indicators are better the smaller), with ionic conductivity, electrochemical window, lithium-ion transference number, and mechanical strength as positive indicators, and interfacial impedance as a negative indicator. Considering all indicators and weighting them (with preset weights of 0.4 for ionic conductivity, 0.2 for electrochemical window, 0.2 for interfacial impedance, 0.1 for lithium-ion transference number, and 0.1 for mechanical strength), the overall advancement score of the target technology was calculated to be 0.62 (out of 1.0), indicating that while the target technology is inferior to sulfide electrolytes in ionic conductivity, it possesses a moderately high level of advancement in overall performance, including interfacial impedance and mechanical strength.
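The weighted comparison described above can be sketched as follows. The text does not disclose the exact normalization formula, so this sketch assumes a simple ratio to the best existing value (target / best for positive indicators, best / target for negative indicators, capped at 1.0) and reads the conductivity values as 1.2 × 10⁻³ and 2.5 × 10⁻² S/cm. The names `INDICATORS`, `indicator_score`, and `advancement_score` are hypothetical.

```python
# (target value, best existing value, direction, preset weight) per indicator.
INDICATORS = {
    "ionic_conductivity_S_per_cm": (1.2e-3, 2.5e-2, "positive", 0.4),
    "electrochemical_window_V": (5.2, 6.0, "positive", 0.2),
    "interfacial_impedance_ohm_cm2": (80.0, 50.0, "negative", 0.2),
    "li_transference_number": (0.68, 0.85, "positive", 0.1),
    "mechanical_strength_MPa": (12.0, 20.0, "positive", 0.1),
}


def indicator_score(target, best, direction):
    """Ratio-to-best normalization capped at 1.0 (an assumed formula)."""
    ratio = target / best if direction == "positive" else best / target
    return min(ratio, 1.0)


def advancement_score(indicators):
    """Weighted sum of the per-indicator normalized scores."""
    return sum(w * indicator_score(t, b, d) for t, b, d, w in indicators.values())


per_indicator = {
    name: round(indicator_score(t, b, d), 3)
    for name, (t, b, d, _) in INDICATORS.items()
}
overall = advancement_score(INDICATORS)
print(per_indicator)
print(round(overall, 4))
```

Under this linear-ratio assumption the weighted score comes out near 0.46 rather than the 0.62 reported in the embodiment, because the order-of-magnitude gap in ionic conductivity carries the largest weight. The embodiment therefore appears to use a different normalization, which the text does not specify.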

[0068] In addition to the degree of advancement, the technology maturity level, number of patent families, and technology uniqueness are further extracted from the knowledge graph, and all technology value-related features are spliced together into a 64-dimensional technology value feature vector.

[0069] In one possible implementation, risk characteristics are used to characterize the degree of adverse factors faced by the target technological achievement during its transformation and application. This embodiment focuses on extracting the "market acceptance risk of users or the market resisting the technological achievement." Specifically, potential user groups related to the target technology and their feedback information on solid-state electrolyte technology are retrieved from a knowledge graph. Data sources include survey data on "acceptance intention of solid-state battery technology" in industry research reports, evaluation texts of composite electrolyte technology by researchers and industry experts, and technical docking feedback recorded in the enterprise cooperation intention database.

[0070] A pre-trained sentiment analysis model (a BERT-based model fine-tuned on Chinese sentiment data) was used to score the sentiment of each user feedback text. A total of 326 valid feedback samples were collected, with an average sentiment score of 0.23 (neutral to slightly positive, but still containing a significant number of negative opinions). Typical negative feedback examples: "The interfacial impedance of composite electrolytes is still too high, making it difficult to replace liquid electrolytes in the short term" (sentiment score -0.4); "The cost of solid-state batteries cannot be reduced, and automakers are not willing to use them" (sentiment score -0.3). Typical positive feedback example: "Composite electrolytes are a compromise solution and are expected to be mass-produced first" (sentiment score +0.7).

[0071] The user acceptance score is mapped to a market acceptance risk score. This is combined with historical conversion cases stored in the knowledge graph: among the top 5 technologies with a similarity exceeding 0.6 to the target technology, 3 failed to convert after entering the market due to low user acceptance, resulting in a historical conversion failure rate of 60%. Combining these two factors, a weighted fusion (user sentiment weight 0.6, historical failure rate weight 0.4) is applied to obtain a final market acceptance risk score of 0.702, which serves as a risk characteristic.
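The weighted fusion above can be checked with a short sketch. The text does not state how the average sentiment score is mapped to a risk value. The sketch assumes the mapping risk = 1 - average sentiment, which reproduces the reported 0.702. The function name `market_acceptance_risk` is hypothetical.

```python
def market_acceptance_risk(mean_sentiment, failed_cases, similar_cases,
                           sentiment_weight=0.6, history_weight=0.4):
    """Fuse sentiment-derived risk with the historical conversion failure rate."""
    # Assumed mapping: higher user acceptance means lower risk.
    sentiment_risk = 1.0 - mean_sentiment
    # 3 failures among the 5 most similar technologies gives 0.6.
    failure_rate = failed_cases / similar_cases
    return sentiment_weight * sentiment_risk + history_weight * failure_rate


risk = market_acceptance_risk(mean_sentiment=0.23, failed_cases=3, similar_cases=5)
print(round(risk, 3))  # 0.702
```

With these inputs the two terms are 0.6 × 0.77 = 0.462 and 0.4 × 0.6 = 0.24.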

[0072] This invention addresses the problems of traditional technology value assessment relying on subjective expert scoring and lacking objective comparable benchmarks by retrieving existing technology nodes from a knowledge graph and using a normalized comparison method to calculate the degree of advancement of the target scientific and technological achievement relative to the best existing technology. It improves the objectivity and repeatability of quantifying technological advancement and reduces inconsistencies in assessment caused by differences in expert preferences. Furthermore, by retrieving feedback information from potential user groups from the knowledge graph and using a pre-trained sentiment analysis model to score user evaluation texts and calculate the average acceptance score, it solves the problems of high cost, limited sample size, and difficulty in automatically updating user acceptance data in traditional market research methods. This improves the automation and timeliness of market acceptance risk extraction and reduces data lag and collection costs caused by reliance on manual research. Finally, by mapping user sentiment analysis scores to market acceptance risk scores and combining them with the failure rates of historical conversion cases stored in the knowledge graph for weighted fusion, it solves the problem that current user sentiment alone cannot reflect the historical conversion patterns of similar technologies. This improves the predictive ability and robustness of market acceptance risk assessment and reduces the probability of underestimating risk due to ignoring historical experience.

[0073] S5, input the market potential assessment feature vector, together with the technical value feature, economic value feature, and risk feature, into the cross-modal attention fusion module to generate a comprehensive representation vector. The processing steps of the cross-modal attention fusion module include: the market potential assessment feature vector, technical value feature, economic value feature, and risk feature are each mapped to the same embedding space through a linear transformation to obtain the corresponding feature embedding sequences; with the market potential assessment feature embedding as the query, and the technical value feature embedding, economic value feature embedding, and risk feature embedding as the keys and values, the multi-head cross-attention output is calculated; the multi-head cross-attention output and the market potential assessment feature embedding sequence are residually connected and layer normalized, and a comprehensive representation vector is then generated through a feedforward neural network layer.

[0074] In one possible implementation, the following four feature vectors are obtained from the previous steps. The market potential assessment feature vector has 64 dimensions and includes market supply and demand characteristics (supply-demand ratio index 0.0093, demand growth rate 0.414, supply-demand dynamic balance vector), market maturity characteristics (market access index 0.62, competition intensity index 0.78, industrial chain integrity index 0.56, graph-level representation vector), substitutability characteristics (competitive threat intensity 0.642, substitution risk dispersion 0.115), profitability characteristics (expected gross profit margin 35%, omitted in this example), and transformation path adaptability characteristics (technology-industry matching degree 0.58). The technical value feature vector has 64 dimensions and includes an advancement score of 0.62, a technology maturity level (TRL) of 4, 8 patent families, and a technology uniqueness score of 0.47. The economic value feature vector has 64 dimensions and includes the expected market size (32 billion yuan in 2030), return on investment (18%), net present value (1.2 billion yuan), and cost recovery period (5.2 years). The risk feature vector has 64 dimensions and includes market acceptance risk (0.702) and technological substitution risk (competitive threat intensity 0.642).

[0075] The four feature vectors are mapped to the same embedding space through independent linear transformation layers, with an embedding dimension of 128, resulting in corresponding feature embedding sequences. The market potential assessment feature embedding sequence is used as the query, with the technological value, economic value, and risk feature embeddings serving as both key and value. These three key-value pairs are stacked into a sequence. A multi-head attention mechanism is employed, calculating attention weights for each head, calculating a weighted output, and concatenating the outputs of all heads to obtain the multi-head cross-attention output. This calculation allows the market potential feature to proactively query relevant information from the technological, economic, and risk features, and then aggregate them in a weighted manner. For example, when market supply and demand are imbalanced (supply is far less than demand), the attention mechanism will increase the weight given to technological advancement (the ability to rapidly expand production) and risk (supply risk).
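The cross-attention step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the weights are random stand-ins for trained parameters, the input features are random placeholders rather than the embodiment's data, the number of heads (4) is not given in the text, and scaled dot-product attention is assumed. All function and variable names are hypothetical.

```python
import math
import random

random.seed(0)
FEATURE_DIM, EMBED_DIM, NUM_HEADS = 64, 128, 4  # NUM_HEADS is an assumption
HEAD_DIM = EMBED_DIM // NUM_HEADS


def rand_matrix(rows, cols):
    """Random stand-in for a trained (rows x cols) weight matrix."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(vec, weight):
    """Project vec (length rows) through a rows x cols matrix."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys_values, w_q, w_k, w_v, w_o):
    """One query token attends over the stacked key/value tokens."""
    q = linear(query, w_q)
    ks = [linear(x, w_k) for x in keys_values]
    vs = [linear(x, w_v) for x in keys_values]
    concat, head_weights = [], []
    for h in range(NUM_HEADS):
        sl = slice(h * HEAD_DIM, (h + 1) * HEAD_DIM)
        scores = [sum(a * b for a, b in zip(q[sl], k[sl])) / math.sqrt(HEAD_DIM)
                  for k in ks]
        weights = softmax(scores)  # one weight per key/value token
        head_weights.append(weights)
        head_vs = [v[sl] for v in vs]
        concat.extend(sum(w * hv[d] for w, hv in zip(weights, head_vs))
                      for d in range(HEAD_DIM))
    return linear(concat, w_o), head_weights  # concatenated heads, then projected


# Placeholder 64-dimensional inputs in the order: market potential,
# technical value, economic value, risk.
features = [[random.random() for _ in range(FEATURE_DIM)] for _ in range(4)]
embed_w = [rand_matrix(FEATURE_DIM, EMBED_DIM) for _ in range(4)]  # independent layers
market_emb, *kv_embs = [linear(x, w) for x, w in zip(features, embed_w)]

w_q, w_k, w_v, w_o = (rand_matrix(EMBED_DIM, EMBED_DIM) for _ in range(4))
attn_out, head_weights = cross_attention(market_emb, kv_embs, w_q, w_k, w_v, w_o)
print(len(attn_out), [[round(w, 3) for w in hw] for hw in head_weights])
```

Each entry of `head_weights` holds three weights that sum to 1, one each for the technical value, economic value, and risk embeddings. These are the quantities the paragraph describes as being raised or lowered depending on the market state.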

[0076] The multi-head cross-attention output is residually connected to the original market potential assessment feature embedding, followed by layer normalization. Layer normalization calculates the mean and variance along the feature dimensions to stabilize the training process. The normalized output is input into a two-layer fully connected feedforward network (FFN). The first layer maps 128 dimensions to 512 dimensions, and the second layer maps 512 dimensions back to 128 dimensions, using ReLU as the activation function. This FFN introduces a non-linear transformation to enhance the model's expressive power. Finally, a residual connection and layer normalization are applied again, resulting in a 1×128 comprehensive representation vector. In this embodiment, after processing by the cross-modal attention fusion module, the generated comprehensive representation vector of the target technology is [0.23, 0.45, 0.67, 0.12, 0.08, ..., 0.91] (128 dimensions in total). This comprehensive representation vector integrates market potential (supply and demand, maturity, substitutability, etc.), technical value, economic value, and risk characteristics. The market potential characteristics, as the query, dominate the fusion direction, ensuring that the final representation is highly sensitive to market dynamics, while complementary information from the technical, economic, and risk dimensions is absorbed through cross-attention. This vector is input into the comprehensive evaluation network of scientific and technological achievements in step S6, which outputs a comprehensive evaluation result between 0 and 1.
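The residual, layer normalization, and feedforward steps of this paragraph can be sketched as follows. The weights and inputs are random placeholders, the learnable scale and shift of layer normalization are omitted for brevity, and the names `layer_norm`, `dense`, and `fuse` are hypothetical.

```python
import random

random.seed(0)
EMBED_DIM, HIDDEN_DIM = 128, 512


def rand_matrix(rows, cols):
    """Random stand-in for a trained (rows x cols) weight matrix."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def layer_norm(vec, eps=1e-5):
    """Normalize along the feature dimension (no learnable scale or shift)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / (var + eps) ** 0.5 for v in vec]


def dense(vec, weight, bias):
    """Fully connected layer: vec (length rows) times a rows x cols matrix."""
    return [sum(v * w for v, w in zip(vec, col)) + b
            for col, b in zip(zip(*weight), bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def fuse(query_emb, attn_out, w1, b1, w2, b2):
    """Add and norm, FFN (128 -> 512 -> 128), then add and norm again."""
    x = layer_norm([q + a for q, a in zip(query_emb, attn_out)])
    ffn = dense(relu(dense(x, w1, b1)), w2, b2)
    return layer_norm([xi + fi for xi, fi in zip(x, ffn)])


# Placeholder inputs standing in for the market potential embedding and
# the multi-head cross-attention output.
market_emb = [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
attn_out = [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
w1, b1 = rand_matrix(EMBED_DIM, HIDDEN_DIM), [0.0] * HIDDEN_DIM
w2, b2 = rand_matrix(HIDDEN_DIM, EMBED_DIM), [0.0] * EMBED_DIM

fused = fuse(market_emb, attn_out, w1, b1, w2, b2)
print(len(fused))  # 128
```

Because the last operation is layer normalization without a learned scale or shift, the output of this sketch has approximately zero mean and unit variance across its 128 components.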

[0077] This invention maps market potential assessment feature vectors, technological value features, economic value features, and risk features to the same embedding space using linear transformations. It calculates a multi-head cross-attention output using the market potential assessment feature embedding as the query and the other three types of feature embeddings as keys and values. The output is then processed through residual connections, layer normalization, and a feedforward neural network to generate a comprehensive representation vector. This addresses the problems in traditional assessment methods where simple concatenation or linear weighting of features leads to ineffective cross-modal information interaction and neglects the complementary correlation between market potential features and technological / economic / risk features. It improves the fusion depth and representational ability of the comprehensive representation vector for different modalities, and reduces assessment information loss and decision bias caused by information isolation between features.

[0078] S6. Input the comprehensive representation vector into the comprehensive evaluation network of scientific and technological achievements. The network contains at least one fully connected hidden layer and an output layer. The output layer adopts the Sigmoid activation function and outputs the comprehensive evaluation result between 0 and 1.

[0079] In one possible implementation, the comprehensive evaluation network for scientific and technological achievements adopts a fully connected feedforward neural network structure, with the following specific configuration: The input layer receives a 128-dimensional comprehensive representation vector; The first fully connected hidden layer has 64 neurons and uses ReLU as the activation function to extract higher-order nonlinear features; The second fully connected hidden layer has 32 neurons and uses ReLU activation function to further compress the feature dimension. The output layer has one neuron and uses the Sigmoid activation function to map the output to the (0, 1) interval, representing the comprehensive evaluation score of the target scientific and technological achievement.
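The 128 → 64 → 32 → 1 configuration can be sketched as a forward pass in plain Python. The weights are randomly initialized stand-ins for trained parameters and the input is a random placeholder, so the resulting score carries no meaning. The names `init_layer` and `forward` are hypothetical.

```python
import math
import random

random.seed(0)
LAYER_SIZES = [128, 64, 32, 1]  # input, two hidden layers, one output neuron


def init_layer(n_in, n_out):
    """Random weights (n_out rows of n_in values) and zero biases."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(vec, layers):
    """ReLU on hidden layers, Sigmoid on the output layer."""
    for i, (weights, biases) in enumerate(layers):
        vec = [sum(x * w for x, w in zip(vec, row)) + b
               for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            vec = [max(0.0, v) for v in vec]
        else:
            vec = [1.0 / (1.0 + math.exp(-v)) for v in vec]
    return vec[0]


layers = [init_layer(a, b) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]
representation = [random.random() for _ in range(128)]  # placeholder input
score = forward(representation, layers)
print(0.0 < score < 1.0)  # True
```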

[0080] The final comprehensive evaluation result is calculated to be 0.775 (between 0 and 1) and is graded against the preset evaluation level classification criteria: [0.0, 0.4), low conversion value, investment not recommended; [0.4, 0.6), moderate conversion value, can be promoted cautiously; [0.6, 0.8), high conversion value, focused support recommended; [0.8, 1.0], extremely high conversion value, priority investment.
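The level classification can be expressed as a small lookup. The interval boundaries follow the criteria above, while the function name `grade` and the label strings are illustrative choices.

```python
def grade(score):
    """Map a comprehensive evaluation score in [0, 1] to its level."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < 0.4:
        return "low conversion value"
    if score < 0.6:
        return "moderate conversion value"
    if score < 0.8:
        return "high conversion value"
    return "extremely high conversion value"


print(grade(0.775))  # high conversion value
```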

[0081] The target technology scored 0.775, which is considered a high level of conversion value. This indicates that the solid-state lithium battery composite electrolyte technology has good market potential and techno-economic value, but the risks of market acceptance and the threat of substitutability still need to be considered.

[0082] Embodiment 2: As shown in Figure 2, the present invention also provides a system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion. The system is used to implement the method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion as described in Embodiment 1. The system includes: a data acquisition module, used to acquire multi-source heterogeneous data related to the target scientific and technological achievements; a knowledge graph construction module, connected to the data acquisition module and used to construct a knowledge graph for scientific and technological achievement evaluation based on the multi-source heterogeneous data, and to uniformly identify the same entity in different data sources through entity recognition; a market potential feature extraction module, connected to the knowledge graph construction module and used to extract market potential assessment feature vectors from the knowledge graph, the market potential assessment feature vectors including at least market supply and demand characteristics, market maturity characteristics, substitutability characteristics, profitability characteristics, and transformation path adaptability characteristics; a value and risk feature extraction module, connected to the market potential feature extraction module and used to extract the technical value features, economic value features, and risk features of the target scientific and technological achievements from the knowledge graph; a cross-modal attention fusion module, connected to the value and risk feature extraction module and used to receive the market potential assessment feature vector together with the technical value feature, economic value feature, and risk feature, and to generate a comprehensive representation vector; and a comprehensive evaluation module for scientific and technological achievements, connected to the cross-modal attention fusion module and used to input the comprehensive representation vector into the comprehensive evaluation network for scientific and technological achievements. The network contains at least one fully connected hidden layer and an output layer. The output layer uses the Sigmoid activation function and outputs a comprehensive evaluation result between 0 and 1.

[0083] The technical solution of the present invention has been described above with reference to the preferred embodiments shown in the accompanying drawings. However, it will be readily understood by those skilled in the art that the scope of protection of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after these changes or substitutions will all fall within the scope of protection of the present invention.

Claims

1. A method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion, characterized in that it comprises: S1, acquiring multi-source heterogeneous data related to the target scientific and technological achievements; S2, based on the multi-source heterogeneous data, constructing a knowledge graph for the evaluation of scientific and technological achievements, and using entity recognition to uniformly identify the same entity in different data sources; S3, extracting market potential assessment feature vectors from the knowledge graph, wherein the market potential assessment feature vectors include at least market supply and demand characteristics, market maturity characteristics, substitutability characteristics, profitability characteristics, and transformation path adaptability characteristics; S4, extracting the technical value characteristics, economic value characteristics, and risk characteristics of the target scientific and technological achievements from the knowledge graph; S5, inputting the market potential assessment feature vector, together with the technical value feature, economic value feature, and risk feature, into the cross-modal attention fusion module to generate a comprehensive representation vector; S6, inputting the comprehensive representation vector into the comprehensive evaluation network of scientific and technological achievements, wherein the network contains at least one fully connected hidden layer and an output layer, and the output layer adopts the Sigmoid activation function and outputs a comprehensive evaluation result between 0 and 1.

2. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the method for extracting the market supply and demand characteristics includes: from the knowledge graph for scientific and technological achievement evaluation, based on the technical field to which the target scientific and technological achievement belongs, retrieving a supply-side entity set and a demand-side entity set, wherein the supply-side entities are the organizations and product nodes that provide similar or alternative technologies, and the demand-side entities are the industries and user groups that apply the technology; reading the supply capacity attribute value of each supply-side entity and the demand scale attribute value of each demand-side entity; calculating the supply-demand ratio index from the total supply capacity of all supply-side entities and the total demand scale of all demand-side entities, and calculating the demand growth rate from the historical time-series data stored in the knowledge graph; constructing a graph structure with the supply-side entities and demand-side entities as nodes and the competition, supply, or substitution relationships between entities as edges; using a graph attention network to perform attention-weighted aggregation on the features of each node and its neighboring nodes, iteratively updating the node representations, and then obtaining the supply-demand dynamic balance vector through global pooling; and fusing the supply-demand ratio index, demand growth rate, and supply-demand dynamic balance vector to generate the market supply and demand characteristics.

3. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the method for extracting the market maturity characteristics includes: from the knowledge graph for scientific and technological achievement evaluation, extracting the set of entities related to the market where the target scientific and technological achievement is located, wherein the set of entities includes market rule makers, standard-issuing agencies, existing competitors, and upstream and downstream supporting entities; extracting the preset attribute features of each entity related to the market where the target scientific and technological achievement is located, wherein the attribute features include at least policy support level, standard completeness, competition concentration, and supporting facilities completeness; based on the attribute features, calculating the market access index by weighted summation, calculating the competition intensity index by analyzing the number of entities and the market share distribution, and calculating the industrial chain completeness index from the standard completeness and the supporting facilities completeness; constructing a graph structure with the entities related to the market where the target scientific and technological achievement is located as nodes and the regulatory, competitive, and collaborative relationships between entities as edges; using a graph convolutional network to perform multi-layer neighborhood aggregation on the initial feature vector of each node to update the node representations, and then using global average pooling to aggregate all node representations into a graph-level representation vector; and performing weighted fusion of the market access index, competition intensity index, industrial chain completeness index, and graph-level representation vector to generate the market maturity characteristics.

4. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the method for extracting the substitutability characteristics includes: from the knowledge graph for scientific and technological achievement evaluation, extracting the initial feature vector of the target scientific and technological achievement node and the initial feature vectors of all alternative scientific and technological achievement nodes with functions similar to those of the target scientific and technological achievement; starting from each alternative scientific and technological achievement node, performing a random walk with restart using a preset step size and restart probability, and recording the sequence of nodes visited during the walk; training a Skip-gram model on the generated node sequences to map each node to a low-dimensional embedding vector, so that nodes with a high co-occurrence probability in the graph are closer in the embedding space; calculating the cosine similarity between the embedding vector of the target scientific and technological achievement and the embedding vector of each alternative scientific and technological achievement node to obtain a functional similarity score; calculating a comprehensive threat score for each alternative scientific and technological achievement as a weighted sum of its functional similarity score, market share, and cost advantage; sorting the comprehensive threat scores of all alternative nodes in descending order and selecting a number of top-ranked nodes as the main substitution threats; calculating the weighted average of the comprehensive threat scores of the main substitution threats as the competitive threat intensity, and calculating the standard deviation of those scores as the substitution risk dispersion; and concatenating the competitive threat intensity and the substitution risk dispersion into a vector, which is nonlinearly transformed through a fully connected layer to generate the substitutability characteristics.

5. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 3, characterized in that the method for extracting the policy support level includes: from the knowledge graph for scientific and technological achievement evaluation, retrieving a set of policy document entities related to the technical field of the target scientific and technological achievement, wherein each policy document entity includes policy type, issuing agency level, and policy text content attributes; segmenting the text content of each policy document into words and mapping the words to a policy text vector using a pre-trained word vector model; calculating the cosine similarity between the policy text vector and a preset set of keyword vectors for the scientific and technological achievement to obtain a text matching score; assigning a basic weight based on the policy type and a level weight based on the level of the issuing agency; multiplying the text matching score, basic weight, and level weight of each policy document to obtain its support score; and summing and normalizing the support scores of all policy documents to generate the policy support level.

6. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the processing procedure of the cross-modal attention fusion module includes: mapping the market potential assessment feature vector, technical value feature, economic value feature, and risk feature each to the same embedding space through a linear transformation to obtain the corresponding feature embedding sequences; calculating the multi-head cross-attention output with the market potential assessment feature embedding as the query and the technical value feature embedding, economic value feature embedding, and risk feature embedding as the keys and values; and applying a residual connection and layer normalization to the multi-head cross-attention output and the market potential assessment feature embedding sequence, and then generating a comprehensive representation vector through a feedforward neural network layer.

7. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the entity recognition employs an entity alignment method based on graph neural networks, including: extracting semantic embedding vectors of each entity's name, attributes, and context information using a pre-trained language model; calculating the Euclidean distance between the semantic embedding vectors of different entities as the semantic similarity; calculating the structural similarity between entities through a graph attention network, in combination with the neighborhood structure of the entities in the knowledge graph; performing weighted fusion of the semantic similarity and structural similarity to construct an entity similarity matrix; and using a greedy matching algorithm to merge entity nodes from different data sources whose similarity exceeds a preset threshold and to assign them a unified identifier.

8. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the technical value characteristics are used to characterize the intrinsic value of the target scientific and technological achievement at the technical level, including its degree of advancement relative to existing technologies in the same field.

9. The method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion according to claim 1, characterized in that the risk characteristics are used to characterize the degree of adverse factors faced by the target scientific and technological achievements in the process of transformation and application, including the market acceptance risk of users or the market resisting the scientific and technological achievements.

10. A system for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion, characterized in that the system is used to implement the method for constructing a scientific and technological achievement evaluation model based on multi-source heterogeneous data fusion as described in any one of claims 1 to 9, and the system comprises: a data acquisition module, used to acquire multi-source heterogeneous data related to the target scientific and technological achievements; a knowledge graph construction module, connected to the data acquisition module and used to construct a knowledge graph for scientific and technological achievement evaluation based on the multi-source heterogeneous data, and to uniformly identify the same entity in different data sources through entity recognition; a market potential feature extraction module, connected to the knowledge graph construction module and used to extract market potential assessment feature vectors from the knowledge graph, wherein the market potential assessment feature vectors include at least market supply and demand characteristics, market maturity characteristics, substitutability characteristics, profitability characteristics, and transformation path adaptability characteristics; a value and risk feature extraction module, connected to the market potential feature extraction module and used to extract the technical value features, economic value features, and risk features of the target scientific and technological achievements from the knowledge graph; a cross-modal attention fusion module, connected to the value and risk feature extraction module and used to receive the market potential assessment feature vector together with the technical value feature, economic value feature, and risk feature, and to generate a comprehensive representation vector; and a comprehensive evaluation module for scientific and technological achievements, connected to the cross-modal attention fusion module and used to input the comprehensive representation vector into the comprehensive evaluation network for scientific and technological achievements, wherein the network contains at least one fully connected hidden layer and an output layer, and the output layer uses the Sigmoid activation function and outputs a comprehensive evaluation result between 0 and 1.