A knowledge recommendation method, system and device based on a research and development process
By constructing and updating knowledge graphs, and combining vector similarity and association path retrieval, the problem of not being able to identify deep logical connections in R&D documents in existing technologies has been solved, enabling accurate knowledge recommendation and cross-project experience sharing, thereby improving R&D efficiency and resource utilization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 成都天地直方发动机有限公司
- Filing Date
- 2026-02-03
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to identify and extract deep logical connections within R&D documents, resulting in limited applicability of knowledge recommendation systems in highly specialized R&D scenarios with complex logical relationships. This leads to low efficiency in knowledge reuse and an inability to effectively support cross-project technological innovation and experience transfer.
By acquiring historical R&D documents, extracting technical entities and their semantic relationships, forming structured semantic triples, constructing a knowledge graph and dynamically updating it, and combining vector similarity and knowledge graph association paths for retrieval, a recommended knowledge list is generated.
It enables precise delivery of relevant knowledge, reduces information retrieval time, improves R&D efficiency, and the dynamic update mechanism promotes experience sharing across projects, avoids redundant R&D, and saves resources.
Figure CN122240846A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of enterprise knowledge management and intelligent recommendation technology, and in particular to a knowledge recommendation method, system and device based on the R&D process.
Background Technology
[0002] In R&D-oriented companies, as projects accumulate, a large amount of unstructured text data is generated, including technical documents, design reports, experimental records, and problem solutions. These documents contain valuable R&D experience and technical knowledge. Currently, most internal knowledge recommendation systems in enterprises use technologies such as keyword matching, collaborative content filtering, or manually labeled tag classification. While these methods achieve information retrieval and delivery to a certain extent, their core limitation is that they can only process explicit information at the literal level and cannot deeply understand the rich semantics and deep logical connections contained in the document content.
[0003] Specifically, traditional methods struggle to identify and extract complex relationships between technical descriptions, such as the correspondence between "problems-phenomena-solutions," the interrelationships between "technical parameters-performance indicators," the causal chains between "technical routes-achievement effects," and structural knowledge like "core concepts-derived applications." This results in coarse-grained knowledge recommendations, weak correlations, and poor scenario adaptability. The system often only returns documents containing the same vocabulary, failing to intelligently and accurately recommend deep knowledge from other projects with similar logical relationships or solutions to the specific technical challenges or design tasks currently faced by R&D personnel. Therefore, existing technologies are not highly applicable to highly specialized and logically complex R&D scenarios, exhibiting low knowledge reuse efficiency and failing to effectively support cross-project technological innovation and experience transfer.
Summary of the Invention
[0004] This invention provides a knowledge recommendation method, system, and device based on the research and development process to overcome at least one of the aforementioned technical problems in the prior art.
[0005] To achieve the above objectives, the embodiments of the present invention adopt the following technical solutions: In a first aspect, the present invention provides a knowledge recommendation method based on the research and development process, comprising: obtaining historical R&D documents from different R&D projects, extracting technical entities and the semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples, and converting the historical R&D documents into semantic vectors and storing them in a vector database; constructing a knowledge graph using the structured semantic triples as relation edges, and dynamically updating the knowledge graph when there are incremental changes in the historical R&D documents; listening for user-triggered new R&D document writing events and obtaining the text content entered by the user in the new R&D document; extracting the semantic vector and technical entities to be retrieved from the text content, performing semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fusing and sorting the retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and pushing the recommended knowledge list.
[0006] In one possible implementation of the first aspect, the step of extracting technical entities and semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples includes: The historical R&D documents are identified using a natural language processing model fine-tuned based on professional R&D corpus to obtain the technical entities and the semantic relationships between them, and the semantic relationships between the technical entities are converted into structured semantic triples.
[0007] In one possible implementation of the first aspect, the step of dynamically updating the knowledge graph when incremental changes occur in the historical R&D documents includes: Extract incremental technical entities from incremental historical R&D documents, calculate the multi-dimensional comprehensive similarity between the incremental technical entities and existing nodes in the knowledge graph, and determine whether the multi-dimensional comprehensive similarity exceeds a preset similarity threshold. If so, the incremental technical entity is merged with the existing node, and the attributes of the merged node are managed in a versioned manner. If not, create a new node.
[0008] In one possible implementation of the first aspect, the step of dynamically updating the knowledge graph when incremental changes occur in the historical R&D documents includes: calculating the initial weight of a newly added relation edge using the following formula: $W_{\text{new}} = \lambda_1 \times F_{\text{authority}} + \lambda_2 \times F_{\text{confidence}} + \lambda_3 \times F_{\text{consistency}} + \lambda_4 \times F_{\text{frequency}} + \lambda_5 \times F_{\text{timeliness}}$; where $\lambda_1$ to $\lambda_5$ represent weighting factors; $F_{\text{authority}}$ indicates the authority level of the source document; $F_{\text{confidence}}$ represents the output confidence of the information extraction model; $F_{\text{consistency}}$ indicates the degree of logical compatibility between the relation edge and existing adjacent relation edges in the knowledge graph; $F_{\text{frequency}}$ indicates the frequency with which the relation edge appears in different documents; and $F_{\text{timeliness}}$ is an exponential decay factor based on the timestamp of the source document. The weights of existing relation edges are updated using the exponential moving average method.
[0009] In one possible implementation of the first aspect, the fusion and sorting of the several search results obtained from the semantic retrieval and the graph retrieval includes: calculating the overall score for each search result using the following formula: $\text{Score} = \alpha \times S_{\text{vector}} + \beta \times S_{\text{graph}}$; where Score represents the overall score, $\alpha$ and $\beta$ represent the fusion coefficients, $S_{\text{vector}}$ represents the score based on vector cosine similarity, and $S_{\text{graph}}$ represents the relevance score calculated based on the weights of the knowledge graph's association paths and the number of hops.
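The weighted fusion described in this paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names (`cosine_similarity`, `path_score`, `fuse_and_rank`), the default coefficients, and in particular the form of the graph score (product of edge weights, damped by a per-hop decay factor) are hypothetical. The source states only that the graph score depends on path weights and hop count, not how they are combined.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def path_score(edge_weights: Sequence[float], hop_decay: float = 0.5) -> float:
    """Assumed graph score: product of edge weights, damped for each extra hop."""
    if not edge_weights:
        return 0.0
    score = 1.0
    for weight in edge_weights:
        score *= weight
    return score * hop_decay ** (len(edge_weights) - 1)


def fuse_and_rank(
    vector_scores: Dict[str, float],
    graph_scores: Dict[str, float],
    alpha: float = 0.6,  # assumed fusion coefficient for the vector score
    beta: float = 0.4,   # assumed fusion coefficient for the graph score
) -> List[Tuple[str, float]]:
    """Compute alpha * S_vector + beta * S_graph per result, best first."""
    doc_ids = set(vector_scores) | set(graph_scores)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + beta * graph_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

A result found by only one of the two retrievers is scored with 0.0 for the missing component, which is one possible design choice; an implementation could instead drop such results or impute a default.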
[0010] In one possible implementation of the first aspect, after pushing the recommended knowledge list, it further includes: The system acquires user interaction behavior based on the recommended knowledge list and optimizes the structure of the knowledge graph and the fusion ranking strategy based on the interaction behavior.
[0011] In one possible implementation of the first aspect, optimizing the structure of the knowledge graph based on the interaction behavior includes: Based on the frequency of adoption of knowledge in the recommended knowledge list, adjust the centrality metric of the corresponding node or the weight of the corresponding relation edge in the knowledge graph.
[0012] In one possible implementation of the first aspect, optimizing the fusion sorting strategy based on the interaction behavior includes: Using the interaction behavior as training samples, the fusion coefficient in the fusion ranking is adjusted periodically.
[0013] Compared with the prior art, the present invention has at least the following beneficial effects: This invention provides a knowledge recommendation method based on the R&D process. By sensing the developer's context, it proactively and accurately pushes related knowledge, which greatly reduces the time consumption of information retrieval and improves R&D efficiency. The knowledge graph adopts a dynamic update mechanism, which gives knowledge strength, timeliness and credibility. It can evolve dynamically, automatically link cross-project experience, break down R&D information silos, effectively promote the reuse of technical achievements, avoid redundant R&D, and save R&D resources and costs.
[0014] In a second aspect, the present invention provides a knowledge recommendation system based on the research and development process, comprising: a document structuring module, used to obtain historical R&D documents from different R&D projects, extract technical entities and the semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples, and convert the historical R&D documents into semantic vectors and store them in a vector database; a knowledge graph construction and dynamic update module, used to construct a knowledge graph using the structured semantic triples as relation edges, and to dynamically update the knowledge graph when there are incremental changes in the historical R&D documents; a listening module, used to listen for new R&D document writing events triggered by users and obtain the text content entered by the user in the new R&D document; a multi-source fusion knowledge retrieval module, used to extract the semantic vector and technical entities to be retrieved from the text content, perform semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fuse and sort the retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and a push module, used to push the recommended knowledge list.
[0015] In a third aspect, the present invention provides an electronic device comprising: at least one processor and at least one memory, wherein the memory stores computer-readable instructions; the computer-readable instructions are executed by one or more of the processors, causing the electronic device to implement a knowledge recommendation method based on the research and development process as in any implementation of the first aspect.
[0016] In a fourth aspect, the present invention provides a storage medium having a computer-executable program stored thereon, the computer-executable program being used to cause a computer to execute a knowledge recommendation method based on the research and development process as in any implementation of the first aspect.
[0017] It can be understood that, for the beneficial effects achieved by the system of the second aspect, the electronic device of the third aspect, and the storage medium of the fourth aspect provided above, reference may be made to the beneficial effects of the first aspect and any of its possible implementations, which are not repeated here.
Attached Figure Description
[0018] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0019] Figure 1 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present invention; Figure 2 is a flowchart of a knowledge recommendation method based on the R&D process provided in an embodiment of the present invention; Figure 3 is a structural block diagram of a knowledge recommendation system based on the R&D process provided in an embodiment of the present invention.
Detailed Implementation
[0020] The technical solutions of the embodiments of the present invention will be described below with reference to the accompanying drawings. In the description of the present invention, unless otherwise stated, " / " indicates that the objects before and after are in an "or" relationship. For example, A / B can represent A or B. The term "and/or" in the present invention merely describes the relationship between the related objects and indicates that three relationships can exist. For example, A and/or B can represent: A alone, A and B simultaneously, and B alone. A and B can be singular or plural. Furthermore, in the description of the present invention, unless otherwise stated, "multiple" refers to two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items.
[0021] Furthermore, to facilitate a clear description of the technical solutions of the embodiments of the present invention, the terms "first" and "second" are used in the embodiments of the present invention to distinguish identical or similar items with essentially the same function and effect. Those skilled in the art will understand that the terms "first" and "second" do not limit the quantity or execution order, and that items labeled "first" and "second" are not necessarily different.
[0022] In this embodiment of the invention, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design described as "exemplary" or "for example" in this embodiment of the invention should not be construed as superior or more advantageous than other embodiments or designs. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a concrete manner for ease of understanding.
[0023] In R&D-oriented companies, as projects accumulate, a large amount of unstructured text data is generated, including technical documents, design reports, experimental records, and problem solutions. These documents contain valuable R&D experience and technical knowledge. Currently, most internal knowledge recommendation systems in enterprises use technologies such as keyword matching, collaborative content filtering, or manually labeled tag classification. While these methods achieve information retrieval and delivery to a certain extent, their core limitation is that they can only process explicit information at the literal level and cannot deeply understand the rich semantics and deep logical connections contained in the document content.
[0024] Specifically, traditional methods struggle to identify and extract complex relationships between technical descriptions, such as the correspondence between "problems-phenomena-solutions," the interrelationships between "technical parameters-performance indicators," the causal chains between "technical routes-achievement effects," and structural knowledge like "core concepts-derived applications." This results in coarse-grained knowledge recommendations, weak correlations, and poor scenario adaptability. The system often only returns documents containing the same vocabulary, failing to intelligently and accurately recommend deep knowledge from other projects with similar logical relationships or solutions to the specific technical challenges or design tasks currently faced by R&D personnel. Therefore, existing technologies are not highly applicable to highly specialized and logically complex R&D scenarios, exhibiting low knowledge reuse efficiency and failing to effectively support cross-project technological innovation and experience transfer.
[0025] In view of this, on the one hand, embodiments of the present invention provide a knowledge recommendation method based on the R&D process, including: acquiring historical R&D documents of different R&D projects, extracting technical entities and semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples, and converting the historical R&D documents into semantic vectors and storing them in a vector database; constructing a knowledge graph using the structured semantic triples as relation edges, and dynamically updating the knowledge graph when there are incremental changes in the historical R&D documents; listening to new R&D document writing events triggered by users, and acquiring the text content entered by the user in the new R&D document; extracting the semantic vectors and technical entities to be retrieved from the text content, performing semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fusing and sorting the several retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and pushing the recommended knowledge list.
[0026] The knowledge recommendation method based on the R&D process provided by this invention actively and accurately pushes related knowledge by sensing the developer's context, which greatly reduces the time consumption of information retrieval and improves R&D efficiency. The knowledge graph adopts a dynamic update mechanism, which makes the knowledge have strength, timeliness and credibility, can evolve dynamically, automatically associate cross-project experience, break down R&D information silos, effectively promote the reuse of technical achievements, avoid redundant R&D, and save R&D resources and costs.
[0027] In some embodiments, the knowledge recommendation method based on the R&D process provided by the present invention can be executed by any electronic device 20 with data processing capabilities, such as a general-purpose computer, personal computer, laptop computer, switch, or tablet computer, etc. The specific implementation of the electronic device 20 is not limited here.
[0028] Figure 1 shows a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present invention. The electronic device 20 includes a processor 210, a memory 220, and a communication interface 230.
[0029] Processor 210 may include one or more processing cores. Processor 210 connects to various parts within electronic device 20 using various interfaces and lines, and performs the various functions of electronic device 20 and processes its data by running or executing instructions, programs, code sets, or instruction sets stored in memory 220, and by calling data stored in memory 220. Optionally, processor 210 may be implemented using at least one of the following hardware forms: Central Processing Unit (CPU), Graphics Processing Unit (GPU), Digital Signal Processor (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA).
[0030] The memory 220 may include random access memory (RAM) or read-only memory (ROM). Optionally, the memory 220 may include a non-transitory computer-readable storage medium. The memory 220 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 220 may include a program storage area. This program storage area may store instructions for implementing an operating system, instructions for implementing at least one function, instructions for implementing the various method embodiments described above, etc.
[0031] Communication interface 230 is used to communicate with other devices or communication networks, such as data storage devices, image processing devices, Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
[0032] In terms of physical implementation, the aforementioned components (such as processor 210, memory 220, and communication interface 230) can all be components within the same device (such as a laptop computer). Alternatively, at least two of these components can be located in different devices, similar to the deployment of devices or components in a distributed system.
[0033] It is understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 20. In other embodiments of the present invention, the electronic device 20 may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
[0034] The following description, in conjunction with the accompanying drawings, illustrates a knowledge recommendation method based on the research and development process provided by an embodiment of the present invention.
[0035] As shown in Figure 2, this embodiment of the invention provides a knowledge recommendation method based on the R&D process, which may include, but is not limited to, the following steps.
S1: Obtain historical R&D documents from different R&D projects, extract the technical entities and semantic relationships between the technical entities from the historical R&D documents, form structured semantic triples, and convert the historical R&D documents into semantic vectors and store them in a vector database.
[0036] In specific implementation, the embodiments of the present invention can obtain historical R&D documents either through manual upload by users or by actively accessing the enterprise document library; no limitation is imposed here. The historical R&D documents mentioned in the embodiments of the present invention may include, but are not limited to, experimental reports, design documents, fault records and other R&D-related documents and materials; no limitation is imposed here either.
[0037] In one feasible implementation, the extraction of technical entities and semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples in this embodiment of the invention may include, but is not limited to: The historical R&D documents are identified using a natural language processing model fine-tuned based on professional R&D corpus to obtain the technical entities and the semantic relationships between them, and the semantic relationships between the technical entities are converted into structured semantic triples.
[0038] In specific implementation, the natural language processing model in this embodiment of the invention can adopt a pre-trained model based on the Transformer architecture, such as BERT, RoBERTa, or ERNIE, as the base model. Professional R&D corpora, such as patent documents, technical standards, experimental reports, design documents, and technical papers, are then collected to form a large, high-quality professional corpus. The natural language processing model is further pre-trained on this corpus or fine-tuned on it for downstream tasks, so that its internal representation is closer to the language style and knowledge distribution of the professional field, thereby improving its accuracy in recognizing R&D documents.
[0039] In specific implementation, embodiments of the present invention can define a technical entity tagging scheme based on the characteristics of the research and development field, for example:
- MATERIAL: such as "GH4169 alloy", "carbon fiber composite material";
- PROCESS (process / method): such as "laser cladding", "annealing treatment";
- PROBLEM (problems / faults): such as "coating cracking", "signal distortion", "overfitting";
- PARAMETER (parameters): such as "coefficient of thermal expansion", "learning rate", "tensile strength";
- EQUIPMENT (equipment / tools): such as "scanning electron microscope", "TensorFlow framework";
- PROJECT: such as "Project A", "Development of Model XX".
[0040] Embodiments of the present invention can likewise define the semantic relationships that may exist between technical entities, for example:
- Causal relationship: <problem, cause, problem>, <parameter, impact, performance>;
- Application relationship: <method, applied to, scenario>, <material, used for, component>;
- Attribute relationship: <device, has parameter, value>, <method, contains step, step>;
- Similarity / substitution relationship: <Option A, similar to, Option B>.
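One way to make the entity tags and relationship categories above machine-checkable is to encode them as enumerations. The sketch below is illustrative only; the class names (`EntityType`, `RelationCategory`, `Entity`, `Relation`) and field names are assumptions rather than anything specified in the source.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """The six entity tags listed in the tagging scheme."""
    MATERIAL = "MATERIAL"
    PROCESS = "PROCESS"
    PROBLEM = "PROBLEM"
    PARAMETER = "PARAMETER"
    EQUIPMENT = "EQUIPMENT"
    PROJECT = "PROJECT"


class RelationCategory(Enum):
    """The four relationship categories given as examples."""
    CAUSAL = "causal"
    APPLICATION = "application"
    ATTRIBUTE = "attribute"
    SIMILARITY = "similarity_or_substitution"


@dataclass(frozen=True)
class Entity:
    name: str
    type: EntityType


@dataclass(frozen=True)
class Relation:
    head: Entity
    predicate: str  # e.g. "cause", "used for", "similar to"
    tail: Entity
    category: RelationCategory


# Example instance of the <problem, cause, problem> pattern.
cracking = Entity("coating cracking", EntityType.PROBLEM)
mismatch = Entity("coefficient of thermal expansion mismatch", EntityType.PROBLEM)
relation = Relation(cracking, "cause", mismatch, RelationCategory.CAUSAL)
```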
[0041] Then, a batch of R&D documents are annotated manually or semi-automatically to form high-quality training data of triples (text, entity label sequence, relation list) to train the natural language processing model.
[0042] In specific implementation, the natural language processing model in this embodiment of the invention may be, but is not limited to, a pipelined model or a joint extraction model. The pipelined model first runs a fine-tuned NER (Named Entity Recognition) model to identify all technical entities in the text, and then inputs all possible technical entity pairs into a separate fine-tuned RE (Relation Extraction) model to determine the relationships between them. The joint extraction model uses a unified model to complete technical entity recognition and relation extraction simultaneously. Common methods include transforming the task into sequence labeling or table filling, for example, constructing an n×n table (where n is the sentence length), predicting entity types on the diagonal, and predicting in each off-diagonal cell the relationship between the technical entities to which the two corresponding tokens belong. Alternatively, a shared BERT encoder encodes the sentence once, and two decoders then predict the technical entities and the relationships respectively. No further limitations are specified here.
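To make the table-filling formulation concrete, the following Python sketch shows only the decoding step: turning an already-predicted n×n label table into entity spans and relation triples. The model that fills the table is out of scope here. The function names and the rule of merging adjacent tokens that share a diagonal label into one entity are simplifying assumptions; with this rule, two adjacent entities of the same type would be merged.

```python
from typing import Dict, List, Optional, Set, Tuple

Table = List[List[Optional[str]]]
Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def decode_entities(tokens: List[str], table: Table) -> List[Span]:
    """Group consecutive tokens that share a diagonal label into entity spans."""
    spans: List[Span] = []
    i, n = 0, len(tokens)
    while i < n:
        label = table[i][i]
        if label is None:
            i += 1
            continue
        j = i + 1
        while j < n and table[j][j] == label:
            j += 1
        spans.append((i, j, label))
        i = j
    return spans


def decode_relations(tokens: List[str], table: Table) -> Set[Tuple[str, str, str]]:
    """Map each labeled off-diagonal cell to a (head, relation, tail) triple."""
    token_to_span: Dict[int, Tuple[int, int]] = {}
    for start, end, _ in decode_entities(tokens, table):
        for k in range(start, end):
            token_to_span[k] = (start, end)

    triples: Set[Tuple[str, str, str]] = set()
    n = len(tokens)
    for i in range(n):
        for j in range(n):
            relation = table[i][j]
            if i == j or relation is None:
                continue
            head_span = token_to_span.get(i)
            tail_span = token_to_span.get(j)
            # Skip cells whose tokens are not part of two distinct entities.
            if head_span is None or tail_span is None or head_span == tail_span:
                continue
            head = " ".join(tokens[head_span[0]:head_span[1]])
            tail = " ".join(tokens[tail_span[0]:tail_span[1]])
            triples.add((head, relation, tail))
    return triples
```

Because several token pairs can point at the same pair of entities, the triples are collected in a set so that each relation is reported once.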
[0043] In specific implementation, after identifying and obtaining the original data of technical entities and the relationships between them, the embodiments of the present invention convert the original data into structured semantic triples. The methods for converting the original data into structured semantic triples may include, but are not limited to: Technical entity normalization: The identified entity fragments (such as "GH4169" or "GH4169 high temperature alloy") are mapped to a standardized entity name (such as "GH4169 alloy"), in preparation for subsequent entity alignment in the knowledge graph.
[0044] Relationship purification: Filter out relation prediction results with low confidence.
[0045] Triple assembly: Assemble each result in the uniform format (Entity 1, Relation Type, Entity 2), optionally with metadata such as source document, location, and confidence level. For example:
Input: "Creep tests show that the mismatch in the coefficient of thermal expansion is the main cause of coating cracking."
Output:
Entities: [Coating cracking: PROBLEM], [Coefficient of thermal expansion mismatch: PROBLEM], [Creep test: PROCESS];
Relations: (Coating cracking, cause, coefficient of thermal expansion mismatch), (Creep test, detected, coating cracking);
Structured semantic triples: <Coating cracking, cause, coefficient of thermal expansion mismatch> | Source: doc_id_001 | Confidence level: 0.92; <Creep test, detected, coating cracking> | Source: doc_id_001 | Confidence level: 0.87.
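The three post-processing steps (entity normalization, low-confidence filtering, triple assembly) can be sketched as follows. The alias table, the `Triple` fields, the function names and the 0.5 confidence cut-off are hypothetical choices made for illustration; the source does not specify a threshold or a data format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical alias table: surface form -> standardized entity name.
ALIASES: Dict[str, str] = {
    "GH4169": "GH4169 alloy",
    "GH4169 high temperature alloy": "GH4169 alloy",
}

RawRelation = Tuple[str, str, str, float]  # (head, relation, tail, confidence)


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str
    source: str
    confidence: float


def normalize(name: str) -> str:
    """Map an entity mention to its standardized name, if one is known."""
    cleaned = name.strip()
    return ALIASES.get(cleaned, cleaned)


def assemble_triples(
    raw: Iterable[RawRelation],
    source: str,
    min_confidence: float = 0.5,  # assumed cut-off for low-confidence relations
) -> List[Triple]:
    """Filter out low-confidence predictions and attach source metadata."""
    triples: List[Triple] = []
    for head, relation, tail, confidence in raw:
        if confidence < min_confidence:
            continue
        triples.append(
            Triple(normalize(head), relation, normalize(tail), source, confidence)
        )
    return triples
```

Applied to the example above with source `doc_id_001`, both relations (confidence 0.92 and 0.87) pass the assumed threshold and are emitted as triples.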
[0046] In specific implementation, the natural language processing model in this embodiment of the invention can simultaneously convert the historical R&D documents into semantic vectors while identifying and obtaining the technical entities and the relationships between them. It can be understood that the natural language processing model in this embodiment of the invention includes a multi-task output head, one of which is a semantic vector (sentence / paragraph embedding) head, used to convert the historical R&D documents into semantic vectors, which is not limited here.
[0047] S2: Construct a knowledge graph using the structured semantic triples as relation edges, and dynamically update the knowledge graph when incremental changes occur in the historical R&D documents.
[0048] It should be noted that the nodes of the knowledge graph constructed in the embodiments of the present invention represent entities such as technical concepts, methods, parameters, projects or personnel, and the edges represent semantic relationships such as "cause", "use", "belong to" or "similar to".
[0049] In one feasible implementation, when incremental changes occur in the historical R&D documents, the knowledge graph is dynamically updated, which may include, but is not limited to: Extract incremental technical entities from incremental historical R&D documents, calculate the multi-dimensional comprehensive similarity between the incremental technical entities and existing nodes in the knowledge graph, and determine whether the multi-dimensional comprehensive similarity exceeds a preset similarity threshold. If so, the incremental technical entity is merged with the existing node, and the attributes of the merged node are managed in a versioned manner. If not, create a new node.
[0050] In the specific implementation process, the multi-dimensional comprehensive similarity in the embodiments of the present invention includes at least name similarity, attribute similarity and context similarity. Each dimension is assigned a certain weight ratio (which can be the same or different). Then, it is calculated whether the comprehensive similarity exceeds a preset similarity threshold (e.g., 0.85). If it exceeds the preset similarity threshold, the nodes are merged; otherwise, a new node is created.
[0051] In practical implementation, the versioning management of node attributes in this embodiment of the invention can be understood as follows: for example, the "application field" attribute of the same material can retain multiple values from different documents, and record the source evidence of each value. Through versioning management, the traceability of knowledge can be effectively ensured.
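The merge-or-create decision and the versioned attributes can be sketched as below. This is a simplified illustration: the per-dimension similarity measures (string ratio for names, Jaccard overlap for attribute keys and context terms), the dimension weights, and the `Node` layout are assumptions. Only the three dimensions and the example threshold of 0.85 come from the text.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    name: str
    context: Set[str]
    # attribute name -> list of (value, source document), kept as versions
    attributes: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def comprehensive_similarity(
    a: Node, b: Node, weights: Tuple[float, float, float] = (0.5, 0.2, 0.3)
) -> float:
    """Weighted sum of name, attribute and context similarity (assumed measures)."""
    name_sim = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    attr_sim = jaccard(set(a.attributes), set(b.attributes))
    ctx_sim = jaccard(a.context, b.context)
    w_name, w_attr, w_ctx = weights
    return w_name * name_sim + w_attr * attr_sim + w_ctx * ctx_sim


def upsert(graph: List[Node], incoming: Node, threshold: float = 0.85) -> Node:
    """Merge into the most similar node above the threshold, else add a new node."""
    best: Optional[Node] = None
    best_sim = 0.0
    for node in graph:
        sim = comprehensive_similarity(node, incoming)
        if sim > best_sim:
            best, best_sim = node, sim
    if best is not None and best_sim > threshold:
        for key, versions in incoming.attributes.items():
            existing = best.attributes.setdefault(key, [])
            for version in versions:
                if version not in existing:
                    existing.append(version)  # keep every value with its source
        best.context |= incoming.context
        return best
    graph.append(incoming)
    return incoming
```

Storing each attribute value together with its source document is what keeps merged knowledge traceable: no value is overwritten, and each can be traced back to the document that asserted it.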
[0052] In one feasible implementation, when incremental changes occur in the historical R&D documents, the knowledge graph is dynamically updated, which may include, but is not limited to: calculating the initial weight of a newly added relation edge using the following formula: $W_{\text{new}} = \lambda_1 \times F_{\text{authority}} + \lambda_2 \times F_{\text{confidence}} + \lambda_3 \times F_{\text{consistency}} + \lambda_4 \times F_{\text{frequency}} + \lambda_5 \times F_{\text{timeliness}}$; where $\lambda_1$ to $\lambda_5$ represent weighting factors; $F_{\text{authority}}$ indicates the authority level of the source document; $F_{\text{confidence}}$ represents the output confidence of the information extraction model; $F_{\text{consistency}}$ indicates the degree of logical compatibility between the relation edge and existing adjacent relation edges in the knowledge graph; $F_{\text{frequency}}$ indicates the frequency with which the relation edge appears in different documents; and $F_{\text{timeliness}}$ is an exponential decay factor based on the timestamp of the source document. The weights of existing relation edges are updated using the exponential moving average method.
[0053] In specific implementation, λ1 to λ5 in the embodiments of the present invention can adopt fixed weights or adjustable weights, and no limitation is made here.
[0054] In the embodiments of the present invention, $F_{authority}$ can be determined by the number of times the source document is cited and the importance level of the project it belongs to; knowledge from highly cited, core projects is more reliable. $F_{confidence}$ can be taken directly from the output of the natural language recognition model and reflects the technical reliability of the relation edge extraction process. $F_{consistency}$ can be obtained by evaluating the logical compatibility of the relation edge with existing adjacent relation edges in the graph; relationships compatible with the existing knowledge network are more likely to be true. For example, if the graph already contains "material A is resistant to high temperatures," the newly extracted relation "material A melts at low temperatures" will receive a low score due to inconsistency, which gives the graph a preliminary logical verification capability. $F_{frequency}$ represents the frequency of a relation edge across different documents and can be logarithmically smoothed; it introduces statistical evidence, in that relations mentioned more frequently are more prevalent. $F_{timeliness}$ is an exponential decay factor based on document timestamps; it reflects the freshness of knowledge, so that the system prefers new knowledge and older knowledge decays naturally. This embodiment of the invention calculates the initial weights of newly added relation edges through multi-evidence fusion, ensuring the objectivity and dynamism of the initial weights.
[0055] In the specific implementation process, the embodiments of the present invention use the exponential moving average method to update the weights of existing relational edges, which can avoid sudden changes in weights, effectively balance historical accumulation and the latest discoveries, and ensure stable evolution.
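The weight formula and the exponential moving average update can be illustrated with a short sketch. The equal weighting factors, the one-year half-life, the frequency cap of 100 documents and the smoothing factor of 0.3 are assumptions made for this example; the text fixes none of these values.

```python
import math


def timeliness(age_days: float, half_life_days: float = 365.0) -> float:
    """Exponential decay over document age; 1.0 for a brand-new document."""
    return math.exp(-math.log(2) * age_days / half_life_days)


def frequency(doc_count: int, cap: int = 100) -> float:
    """Log-smoothed count of documents mentioning the edge, scaled into [0, 1]."""
    return min(1.0, math.log1p(doc_count) / math.log1p(cap))


def initial_weight(f_authority: float, f_confidence: float, f_consistency: float,
                   f_frequency: float, f_timeliness: float,
                   lambdas=(0.2, 0.2, 0.2, 0.2, 0.2)) -> float:
    """W_new as the lambda-weighted sum of the five evidence factors."""
    factors = (f_authority, f_confidence, f_consistency, f_frequency, f_timeliness)
    return sum(lam * f for lam, f in zip(lambdas, factors))


def ema_update(old_weight: float, observed_weight: float,
               smoothing: float = 0.3) -> float:
    """Blend a new observation into an existing edge weight without a sudden jump."""
    return smoothing * observed_weight + (1.0 - smoothing) * old_weight
```

A smaller smoothing factor favors accumulated history, while a larger one lets recent evidence move the weight faster.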
[0056] S3: Listen for the new R&D document writing event triggered by the user and obtain the text content entered by the user in the new R&D document.
[0057] In specific implementation, embodiments of the present invention can deploy plugins in VS Code, JetBrains, or web editors to listen for the writing event of a new R&D document and obtain the text content entered by the user in the new R&D document.
[0058] S4: Extract the semantic vector and technical entity to be retrieved from the text content, perform semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association path in parallel, and merge and sort the several retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list.
[0059] In specific implementation, the semantic retrieval based on vector similarity in this embodiment of the invention may include, but is not limited to, calculating the cosine similarity between the semantic vector to be retrieved and all historical document fragments in the vector database, and recalling the Top-K most similar knowledge fragments. The graph retrieval based on knowledge graph association paths in this embodiment of the invention may include, but is not limited to, performing multi-hop queries in the knowledge graph starting from the technical entity to be retrieved, discovering related nodes (such as different solutions to the same problem, different application cases of the same method, related parameter standards, etc.), and extracting knowledge fragments on the association paths. Then, the knowledge fragments obtained through semantic retrieval and the knowledge fragments obtained through graph retrieval are fused and sorted to generate a recommended knowledge list.
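The two retrieval channels can be sketched as follows. This is an illustration under stated assumptions: fragments are held in a plain dictionary rather than a real vector database, the graph is an adjacency dictionary, and the path weight is taken as the product of edge weights along the first (shortest) path found by breadth-first search. The function names are hypothetical.

```python
import math
from collections import deque


def cosine(u, v) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_top_k(query_vec, fragments: dict, k: int = 3) -> list:
    """Return the k (fragment id, score) pairs most similar to the query vector."""
    scored = [(fid, cosine(query_vec, vec)) for fid, vec in fragments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def multi_hop(graph: dict, start: str, max_hops: int = 2) -> dict:
    """Breadth-first multi-hop query.

    graph maps node -> list of (neighbor, edge weight).
    Returns node -> (hops, path weight) for every node within max_hops of start.
    """
    reached = {}
    seen = {start}
    queue = deque([(start, 0, 1.0)])
    while queue:
        node, hops, weight = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor, edge_weight in graph.get(node, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            reached[neighbor] = (hops + 1, weight * edge_weight)
            queue.append((neighbor, hops + 1, weight * edge_weight))
    return reached
```

The two functions are independent of each other, so they can be run in parallel and their results passed on to the fusion step.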
[0060] In one feasible implementation, the fusion and sorting of the several retrieval results obtained from the semantic retrieval and the graph retrieval in this embodiment of the invention may include, but is not limited to: calculating the overall score of each retrieval result using the following formula: $S_{core} = \alpha S_{vector} + \beta S_{graph}$; where $S_{core}$ represents the overall score, $\alpha$ and $\beta$ represent the fusion coefficients, $S_{vector}$ represents the score based on vector cosine similarity, and $S_{graph}$ represents the relevance score calculated based on the weights of the knowledge graph association paths and the number of hops.
[0061] The embodiments of the present invention can effectively balance the accuracy and coverage of recommended knowledge by fusing and sorting several search results obtained from semantic retrieval and graph retrieval.
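A minimal sketch of the fusion and sorting step follows. The text states only that the graph score depends on path weight and hop count, so the per-hop decay used here is an assumption, as are the example coefficients 0.6 and 0.4 and the choice to score a result missing from one channel as 0 in that channel.

```python
def graph_score(path_weight: float, hops: int, hop_decay: float = 0.5) -> float:
    """Relevance from path weight, discounted for every hop beyond the first."""
    return path_weight * hop_decay ** (hops - 1)


def fuse(vector_hits: dict, graph_hits: dict,
         alpha: float = 0.6, beta: float = 0.4) -> list:
    """Combine both channels into one list sorted by overall score, highest first.

    vector_hits maps result id -> S_vector; graph_hits maps result id -> S_graph.
    """
    ids = set(vector_hits) | set(graph_hits)
    scored = {
        rid: alpha * vector_hits.get(rid, 0.0) + beta * graph_hits.get(rid, 0.0)
        for rid in ids
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)
```

A result found by both channels accumulates score from each, which is how a fragment that is only moderately similar in text can still outrank a purely textual match.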
[0062] S5: Push the recommended knowledge list.
[0063] In specific implementation, embodiments of the present invention may include, but are not limited to, displaying a recommendation list in the sidebar or a floating window of the editor. The content of the recommendation list in embodiments of the present invention may include, but is not limited to: 1) core excerpts from the original text; 2) source projects / documents; 3) a brief description of the reason for the match (such as "similar to the problem you are currently describing"); 4) a matching percentage, etc., which are not limited here.
[0064] In specific implementation, embodiments of the present invention can also provide intuitive interactive operation buttons such as "one-click insertion", "view details", and "ignore", thereby minimizing the cost for users to acquire knowledge.
[0065] In one feasible implementation, after pushing the recommended knowledge list, the embodiments of the present invention may, but are not limited to, further include: The system acquires user interaction behavior based on the recommended knowledge list and optimizes the structure of the knowledge graph and the fusion ranking strategy based on the interaction behavior.
[0066] In specific implementation, the interactive behaviors in the embodiments of the present invention may include, but are not limited to, the user's implicit "accept", "view" or "ignore" behaviors. Collecting these implicit behaviors is more convenient than soliciting explicit ratings and yields a larger amount of data, thereby providing more data support for optimizing the structure of the knowledge graph and the fusion ranking strategy.
[0067] In one feasible implementation, optimizing the structure of the knowledge graph based on the interaction behavior in this embodiment of the invention may include, but is not limited to: Based on the frequency of adoption of knowledge in the recommended knowledge list, adjust the centrality metric of the corresponding node or the weight of the corresponding relation edge in the knowledge graph.
[0068] In the specific implementation process, if a piece of knowledge is frequently adopted, the degree centrality of its related entity nodes or the weight of its edges will be enhanced; if it is frequently ignored, the degree centrality will be weakened accordingly, thereby realizing the "use and disuse" principle of the graph structure. This makes widely adopted knowledge more prominent and easier to retrieve in the graph, improving retrieval efficiency and recommendation accuracy.
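One possible form of this adjustment for a relation edge weight is sketched below. The multiplicative update rule, the adjustment rate, and the floor and ceiling are assumptions for illustration; the text specifies only that frequent adoption strengthens and frequent ignoring weakens.

```python
def adjust_weight(weight: float, accepted: int, ignored: int,
                  rate: float = 0.1, floor: float = 0.05,
                  ceiling: float = 1.0) -> float:
    """Strengthen an edge that is mostly adopted, weaken one that is mostly ignored."""
    shown = accepted + ignored
    if shown == 0:
        return weight  # no feedback yet, leave the weight unchanged
    adoption = accepted / shown  # share of recommendations that were adopted
    # An adoption share above 0.5 raises the weight, below 0.5 lowers it.
    new_weight = weight * (1.0 + rate * (2.0 * adoption - 1.0))
    return max(floor, min(ceiling, new_weight))
```

Clamping to a floor keeps rarely used knowledge retrievable, so it can recover if later feedback turns positive.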
[0069] In one feasible implementation, the strategy for optimizing the fusion sorting based on the interaction behavior in this embodiment of the invention may include, but is not limited to: Using the interaction behavior as training samples, the fusion coefficient in the fusion ranking is adjusted periodically.
[0070] In the specific implementation process, the embodiments of the present invention use user interaction data as training data to continuously optimize the fusion coefficient (α, β), so that knowledge recommendation can adapt to the knowledge preferences and retrieval habits of a specific team or project.
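The periodic adjustment of the fusion coefficients could, for example, be a small grid search over logged interactions. The sketch below assumes the constraint β = 1 − α and uses pairwise ranking accuracy (how often an accepted result outscores an ignored one) as the objective; both choices are assumptions, since the text does not name a training method.

```python
def pairwise_accuracy(alpha: float, samples: list) -> float:
    """Share of (accepted, ignored) pairs in which the accepted result scores higher.

    samples is a list of (s_vector, s_graph, accepted) tuples from interaction logs.
    """
    beta = 1.0 - alpha
    accepted = [alpha * v + beta * g for v, g, ok in samples if ok]
    ignored = [alpha * v + beta * g for v, g, ok in samples if not ok]
    if not accepted or not ignored:
        return 0.0
    wins = sum(a > i for a in accepted for i in ignored)
    return wins / (len(accepted) * len(ignored))


def tune_alpha(samples: list, steps: int = 10) -> float:
    """Pick the alpha on a uniform grid over [0, 1] with the best pairwise accuracy."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda a: pairwise_accuracy(a, samples))
```

Running `tune_alpha` on each team's or project's own interaction log is one way the coefficients could adapt to local preferences.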
[0071] The knowledge recommendation method based on the R&D process provided in this embodiment of the invention actively and accurately pushes relevant knowledge by sensing the developer's context, which greatly reduces the time consumption of information retrieval and improves R&D efficiency.
[0072] Furthermore, the knowledge graph in this embodiment of the invention adopts a dynamic update mechanism, which enables knowledge to have strength, timeliness and credibility, and can evolve dynamically, automatically link cross-project experience, break down R&D information silos, effectively promote the reuse of technological achievements, avoid redundant R&D, and save R&D resources and costs.
[0073] Furthermore, this embodiment of the invention employs a dual-channel fusion retrieval (semantic vector + knowledge graph), which not only finds documents with "similar text" but also provides logically related knowledge, greatly enhancing the comprehensiveness and scientific nature of decision-making.
[0074] Furthermore, based on the user feedback loop, the system continuously optimizes the graph and the recommendation strategy, becoming more accurate with use and forming a positive cycle of "use-feedback-evolution", which ultimately consolidates scattered personal experience into evolvable intelligent assets of the organization.
[0075] Based on the knowledge recommendation method based on the R&D process provided in the first aspect, embodiments of the present invention provide a knowledge recommendation system based on the R&D process. As shown in Figure 3, the knowledge recommendation system based on the R&D process includes: the document structuring processing module 110, which is used to obtain historical R&D documents of different R&D projects, extract the technical entities and the semantic relationships between the technical entities in the historical R&D documents, form structured semantic triples, and convert the historical R&D documents into semantic vectors and store them in a vector database; the knowledge graph construction and dynamic update module 120, which is used to construct a knowledge graph using the structured semantic triples as relation edges, and to dynamically update the knowledge graph when there are incremental changes in the historical R&D documents; the listening module 130, which is used to listen for the new R&D document writing event triggered by the user and obtain the text content entered by the user in the new R&D document; the multi-source fusion knowledge retrieval module 140, which is used to extract the semantic vector and technical entity to be retrieved from the text content, perform semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fuse and sort the several retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and the push module 150, which is used to push the recommended knowledge list.
[0076] Based on the knowledge recommendation method based on the R&D process provided in the first aspect, this embodiment of the invention also provides a storage medium storing a computer-executable program. The computer-executable program is used to cause a computer to execute the knowledge recommendation method based on the R&D process as described in any implementation of the first aspect. Explanations of the relevant content and descriptions of the beneficial effects of any of the computer-readable storage media provided above can be found in the corresponding embodiments described above, and will not be repeated here.
[0077] Those skilled in the art will understand that all or part of the steps of the above embodiments can be implemented by a program instructing related hardware, and that the program can be stored in a computer-readable storage medium. The storage medium mentioned above can be a read-only memory, a random access memory, etc. The processing unit or processor mentioned above can be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a microprocessor, a digital signal processor (DSP), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof.
[0078] This invention also provides a computer program product containing instructions that, when executed on a computer, cause the computer to perform any of the methods described in the above embodiments. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this invention is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., SSD), etc.
[0079] It should be noted that the devices for storing computer instructions or computer programs provided in the embodiments of the present invention, such as, but not limited to, the aforementioned memory, computer-readable storage medium, and communication chip, are all non-transitory. Those skilled in the art should recognize that the functions described in the embodiments of the present invention in one or more of the above examples can be implemented using hardware, software, firmware, or any combination thereof. When implemented using software, these functions can be stored in a computer-readable storage medium or transmitted as one or more instructions or code on a computer-readable storage medium. Computer-readable storage media include computer storage media and communication media, wherein communication media include any medium that facilitates the transmission of computer programs from one place to another. Storage media can be any available medium accessible to general-purpose or special-purpose computers.
[0080] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A knowledge recommendation method based on the R&D process, characterized in that it comprises: obtaining historical R&D documents from different R&D projects, extracting the technical entities and the semantic relationships between the technical entities from the historical R&D documents, forming structured semantic triples, and converting the historical R&D documents into semantic vectors and storing them in a vector database; constructing a knowledge graph using the structured semantic triples as relation edges, and dynamically updating the knowledge graph when there are incremental changes in the historical R&D documents; listening for a new R&D document writing event triggered by the user and obtaining the text content entered by the user in the new R&D document; extracting the semantic vector and technical entity to be retrieved from the text content, performing semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fusing and sorting the several retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and pushing the recommended knowledge list.
2. The knowledge recommendation method based on the R&D process according to claim 1, characterized in that extracting the technical entities and the semantic relationships between the technical entities from the historical R&D documents to form structured semantic triples includes: identifying the historical R&D documents using a natural language processing model fine-tuned on a professional R&D corpus to obtain the technical entities and the semantic relationships between them, and converting the semantic relationships between the technical entities into structured semantic triples.
3. The knowledge recommendation method based on the R&D process according to claim 1, characterized in that dynamically updating the knowledge graph when incremental changes occur in the historical R&D documents includes: extracting incremental technical entities from the incremental historical R&D documents; calculating the multi-dimensional comprehensive similarity between each incremental technical entity and the existing nodes in the knowledge graph; and determining whether the multi-dimensional comprehensive similarity exceeds a preset similarity threshold; if so, merging the incremental technical entity with the existing node and managing the attributes of the merged node in a versioned manner; if not, creating a new node.
4. The knowledge recommendation method based on the R&D process according to claim 3, characterized in that dynamically updating the knowledge graph when incremental changes occur in the historical R&D documents includes: calculating the initial weight of a newly added relation edge using the following formula: $W_{new} = \lambda_1 F_{authority} + \lambda_2 F_{confidence} + \lambda_3 F_{consistency} + \lambda_4 F_{frequency} + \lambda_5 F_{timeliness}$; where $\lambda_1$ to $\lambda_5$ represent weighting factors; $F_{authority}$ indicates the authority level of the source document; $F_{confidence}$ represents the output confidence of the information extraction model; $F_{consistency}$ indicates the degree of logical compatibility between the relation edge and existing adjacent relation edges in the knowledge graph; $F_{frequency}$ indicates the frequency with which the relation edge appears in different documents; and $F_{timeliness}$ represents an exponential decay factor based on the timestamp of the source document; and updating the weights of existing relation edges using the exponential moving average method.
5. The knowledge recommendation method based on the R&D process according to claim 1, characterized in that fusing and sorting the several retrieval results obtained by the semantic retrieval and the graph retrieval includes: calculating the overall score of each retrieval result using the following formula: $S_{core} = \alpha S_{vector} + \beta S_{graph}$; where $S_{core}$ represents the overall score, $\alpha$ and $\beta$ represent the fusion coefficients, $S_{vector}$ represents the score based on vector cosine similarity, and $S_{graph}$ represents the relevance score calculated based on the weights of the knowledge graph association paths and the number of hops.
6. The knowledge recommendation method based on the R&D process according to claim 1, characterized in that, after pushing the recommended knowledge list, the method further includes: acquiring the user's interaction behavior with respect to the recommended knowledge list, and optimizing the structure of the knowledge graph and the fusion ranking strategy based on the interaction behavior.
7. The knowledge recommendation method based on the R&D process according to claim 6, characterized in that optimizing the structure of the knowledge graph based on the interaction behavior includes: adjusting, based on the frequency of adoption of knowledge in the recommended knowledge list, the centrality metric of the corresponding node or the weight of the corresponding relation edge in the knowledge graph.
8. The knowledge recommendation method based on the R&D process according to claim 6, characterized in that optimizing the fusion ranking strategy based on the interaction behavior includes: periodically adjusting the fusion coefficients in the fusion ranking using the interaction behavior as training samples.
9. A knowledge recommendation system based on the R&D process, characterized in that it comprises: a document structuring processing module, used to obtain historical R&D documents from different R&D projects, extract the technical entities and the semantic relationships between the technical entities from the historical R&D documents, form structured semantic triples, and convert the historical R&D documents into semantic vectors and store them in a vector database; a knowledge graph construction and dynamic update module, used to construct a knowledge graph using the structured semantic triples as relation edges, and to dynamically update the knowledge graph when there are incremental changes in the historical R&D documents; a listening module, used to listen for a new R&D document writing event triggered by the user and obtain the text content entered by the user in the new R&D document; a multi-source fusion knowledge retrieval module, used to extract the semantic vector and technical entity to be retrieved from the text content, perform semantic retrieval based on vector similarity and graph retrieval based on knowledge graph association paths in parallel, and fuse and sort the several retrieval results obtained by the semantic retrieval and the graph retrieval to generate a recommended knowledge list; and a push module, used to push the recommended knowledge list.
10. An electronic device, characterized in that it comprises: a memory and one or more processors, the memory being coupled to the processors; wherein the memory stores computer program code, the computer program code including computer instructions which, when executed by the processors, cause the electronic device to perform the knowledge recommendation method based on the R&D process as described in any one of claims 1 to 8.