Library reference consultation intelligent question and answer system and method based on knowledge graph reasoning

By constructing a fine-grained knowledge graph and multi-hop logical reasoning, the problem of insufficient knowledge representation in library reference services is solved, enabling accurate analysis and high-quality response to user inquiries, and improving the professionalism and user experience of intelligent library services.

CN122240786A · Pending · Publication Date: 2026-06-19 · ZHEJIANG TECH INST OF ECONOMY

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: ZHEJIANG TECH INST OF ECONOMY
Filing Date: 2026-03-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing intelligent question-answering technologies suffer from insufficient knowledge representation structure, weak reasoning mechanisms, and inadequate domain modeling, and therefore fail to meet the requirements of deep understanding and high-quality response to complex user intents in library reference consultation scenarios.

Method used

A fine-grained knowledge graph for the library domain is constructed and combined with multi-hop logical reasoning capabilities. Multi-hop association reasoning results are generated through entity recognition, relation extraction and semantic understanding, and natural language responses are provided by combining context and user role information.

Benefits of technology

It enables accurate analysis and high-quality response to library professional consultation services, enhances the professionalism and interactive experience of consultation services, and can handle complex user intents while maintaining the timeliness of knowledge graphs.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention relates to the field of artificial intelligence technology, specifically to an intelligent question-answering system and method for library reference consultation based on knowledge graph reasoning. It includes integrating multi-source heterogeneous data to construct a fine-grained library knowledge graph, parsing the user's complex intent through a pre-trained model and syntactic analysis, then performing multi-hop logical reasoning within the graph to filter legitimate paths based on this intent, and finally generating a professional and coherent natural language response text by combining the reasoning results with the context. This invention integrates multi-source library data to construct a fine-grained entity relationship graph, extracts core predicates and target constraints through semantic analysis to parse complex intents, and then performs path expansion and rule consistency verification within the graph to form a cross-domain reasoning chain. Subsequently, it reconstructs a logically coherent professional response by combining the dialogue context and user role, and finally generates differential triples based on business changes to maintain the continuous dynamic updating of the knowledge graph.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence technology, and in particular to an intelligent question-and-answer system and method for library reference consultation based on knowledge graph reasoning.

Background Technology

[0002] With the continuous development of artificial intelligence and knowledge service technologies, library reference and consultation systems are gradually evolving towards intelligence and personalization. Knowledge graph-based intelligent question-answering systems, due to their ability to achieve semantic understanding, associative reasoning, and structured knowledge representation, have become an important technological path to improve the efficiency and quality of library consultation services. Currently, there is an urgent need to build intelligent question-answering systems that can integrate multi-source heterogeneous knowledge resources, support complex semantic parsing and logical reasoning, and possess domain adaptability to meet users' needs for accurate, in-depth, and context-coherent reference and consultation services. However, existing intelligent question-answering technologies still have significant shortcomings in terms of knowledge organization granularity, reasoning mechanism design, and domain adaptability when facing specific library scenarios.

[0003] A search revealed a publicly available intelligent question-answering system based on business knowledge graph retrieval, with publication number CN106951470B. This system constructs a business knowledge graph, utilizes a graph retrieval engine to identify user input intent, and returns relevant business knowledge. While this approach introduces a knowledge graph structure to enhance question-answer relevance, it primarily relies on a static graph retrieval mechanism. This limitation makes it ineffective in handling questions involving multi-hop relationships or cross-topic associations. For example, it struggles to effectively develop associative reasoning for complex inquiries such as "recommend foreign language databases and access methods related to *Dream of the Red Chamber* research." Furthermore, the system does not specifically model entity types unique to libraries (such as collections, classification systems, subject matter experts, and open access policies), resulting in coarse-grained knowledge representation and limiting its applicability in professional reference consultations.

[0004] A search revealed a publicly available question-and-answer method for technical consultations in the agricultural field, with publication number CN116911311B. This method employs a two-layer similarity calculation mechanism, combining sentence-level semantic vectors with named entity word-level matching to improve the accuracy of question-and-answering in vertical domains. While this method achieves good semantic matching results in the agricultural field, its core remains based on vector similarity, failing to construct an explicit structured knowledge graph or integrate rule-based or path-based logical reasoning modules. In library reference consultation scenarios, user questions often involve resource location, explanation of borrowing rules, description of document delivery processes, or citation of academic norms. These typically require deduction based on factual chains; relying solely on semantic similarity matching may lead to incomplete or biased answers. Furthermore, this method relies on a large amount of labeled data to construct a domain-specific word tree. Given the frequent dynamic updates of library knowledge systems and the diverse and heterogeneous nature of data sources, the model maintenance cost is high, and its generalization ability is limited.

[0005] The aforementioned problems indicate that existing intelligent question-answering technologies, when applied to library reference consultation scenarios, generally suffer from deficiencies such as insufficient knowledge representation structure, weak reasoning mechanisms, and inadequate domain modeling refinement, thus failing to adequately support deep understanding and high-quality responses to complex user intentions. Therefore, this invention proposes an intelligent question-answering system for library reference consultation based on knowledge graph reasoning. This system aims to construct a fine-grained knowledge graph tailored to library operations, integrating semantic understanding and multi-hop logical reasoning capabilities to achieve accurate parsing, association expansion, and intelligent responses to user inquiries, thereby significantly improving the professionalism, accuracy, and interactive experience of intelligent library consultation services.

Summary of the Invention

[0006] The purpose of this invention is to address the shortcomings of existing technologies by proposing an intelligent question-and-answer system and method for library reference consultation based on knowledge graph reasoning.

[0007] To achieve the above objectives, the present invention adopts the following technical solution: an intelligent question-and-answer system for library reference services based on knowledge graph reasoning, the system comprising:

[0008] The fine-grained knowledge graph construction module acquires multi-source heterogeneous data streams from the library, performs entity recognition and relation extraction on the data streams, normalizes entity types according to library business semantic rules, and establishes a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain.

[0009] The composite intent parsing module receives natural language query input from users, generates semantic vector representations through a pre-trained language model, and extracts the core predicates, target resource types, and constraints in the question by combining named entity recognition results and dependency parsing tree. It then calls the entity index in the fine-grained knowledge graph for preliminary matching and generates a composite intent structured description containing the main intent and auxiliary constraints.

[0010] The multi-hop logic reasoning engine module, based on the target resource type and constraints in the composite intent structured description, performs path expansion operations in the fine-grained knowledge graph, sequentially traverses the adjacent nodes that are directly related to the starting entity, and performs legality verification on each candidate path according to the preset reasoning rule set, filters out the set of reasoning paths that meet all constraints and whose path length does not exceed the preset hop limit, and generates a multi-hop association reasoning result set.

[0011] The dynamic response generation module calls the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combines the contextual dialogue state and user role identifier, and fills in the content and reorganizes the sentences according to the preset response template library to generate natural language response text that conforms to the professional context of the library and has logical coherence.

[0012] As a further aspect of the present invention, the fine-grained knowledge graph includes a library resource entity subgraph, a borrowing rule entity subgraph, a subject expert entity subgraph, and an open access policy entity subgraph. The composite intent structured description includes a core resource query intent, a rule interpretation intent, an expert recommendation intent, and a policy applicability judgment intent. The multi-hop association reasoning result set includes a single resource location path, a cross-database retrieval path, a rule chain derivation path, and a policy-resource mapping path. The natural language response text includes resource acquisition guidance, rule clause citations, expert contact information, and open access compliance suggestions.

[0013] As a further aspect of the present invention, the fine-grained knowledge graph construction module includes:

[0014] The multi-source data fusion submodule acquires the library catalog data stream output by the library integrated management system, the expert directory data stream published by the subject service platform, the open access policy document stream exported by the institutional knowledge base, and the user interaction sequence recorded by the reference consultation log system. It performs format standardization processing on each type of data stream, converts unstructured text into documents in a unified markup language, and extracts the timestamps, source identifiers and update frequency metadata to generate a set of raw data units with spatiotemporal labels.

[0015] The entity type normalization submodule calls all text units in the original data unit set, performs type mapping on the identified named entities based on the entity type system defined by the library domain ontology, unifies the resource descriptions into the library collection resource entity type, unifies the rule descriptions into the borrowing rule entity type, unifies the personnel descriptions into the subject expert entity type, and unifies the policy descriptions into the open access policy entity type, and generates a type-normalized entity list.

[0016] The quadruple relation modeling submodule calls all entity pairs in the type-normalized entity list, extracts semantic relations between entities from the original text context according to the preset relation extraction rule set, and adds confidence scores and source evidence identifiers to each relation triple, constructing a quadruple structure consisting of four parts: subject, predicate, object, and evidence, and generating a fine-grained knowledge graph for the library domain.

[0017] As a further aspect of the present invention, the composite intent parsing module includes:

[0018] The semantic vector encoding submodule receives natural language consultation text input by the user, calls a pre-trained language model to perform word segmentation and embedding processing on the text, generates a sequence of word vectors with context awareness, and aggregates them into a global semantic vector through pooling operations. At the same time, it uses a named entity recognition model to annotate key entities and their type labels in the text, generating semantic vector representations with entity annotations.

[0019] The dependency structure parsing submodule calls the semantic vector representation with entity annotations to construct a dependency parsing tree, identifies the root node predicate, subject-object components and modification and limiting structures in the sentence, extracts the core action verb and target noun phrase, and combines entity type labels to determine the resource domain, rule domain or personnel domain involved in the question, and generates a preliminary intent framework.

[0020] The constraint extraction submodule calls the modification and limitation structure in the preliminary intent framework to identify constraint elements such as time range, geographical restrictions, language preferences, and access permissions, and aligns them with the corresponding attribute fields in the fine-grained knowledge graph to generate a composite intent structured description containing the main intent identifier and constraint key-value pairs.
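The composite intent structured description in [0018] to [0020] can be pictured as a small record holding a main intent identifier and constraint key-value pairs. The following is a minimal sketch, not the patented implementation: the class name `CompositeIntent`, the `CONSTRAINT_PHRASES` table and its field names are hypothetical, and a simple keyword lookup stands in for the dependency-tree analysis of modification and limiting structures.

```python
from dataclasses import dataclass, field


@dataclass
class CompositeIntent:
    """Main intent identifier plus constraint key-value pairs (hypothetical layout)."""
    main_intent: str
    target_resource_type: str
    constraints: dict = field(default_factory=dict)


# Hypothetical mapping from modifier phrases to knowledge-graph attribute fields.
CONSTRAINT_PHRASES = {
    "foreign language": ("language", "foreign"),
    "open access": ("access_permission", "open"),
    "off-campus": ("location", "off_campus"),
}


def extract_constraints(query: str) -> dict:
    """Keyword lookup standing in for dependency-based modifier analysis."""
    lowered = query.lower()
    return {
        key: value
        for phrase, (key, value) in CONSTRAINT_PHRASES.items()
        if phrase in lowered
    }


query = "Recommend foreign language databases related to Dream of the Red Chamber research"
intent = CompositeIntent(
    main_intent="core_resource_query",
    target_resource_type="database",
    constraints=extract_constraints(query),
)
print(intent)
```

In a fuller system the constraint keys would be aligned with the actual attribute fields of the graph rather than a hand-written table.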

[0021] As a further aspect of the present invention, the multi-hop logic reasoning engine module includes:

[0022] The starting node location submodule calls the main intent identifier and core entity name in the composite intent structured description, retrieves the matching set of starting nodes in the fine-grained knowledge graph, and performs preliminary filtering of the starting nodes according to the constraint key-value pairs to generate a list of valid starting nodes.

[0023] The path expansion and traversal submodule takes each node in the list of valid starting nodes as the starting point, executes a breadth-first search strategy, expands to adjacent nodes layer by layer, and applies a preset inference rule set during each layer expansion process to generate candidate paths for logical consistency verification.

[0024] The path validity determination submodule calls all candidate paths output by the path expansion traversal submodule, calculates the cumulative confidence score of each path, and removes paths whose length exceeds the preset hop count limit or contains conflicting relationships, retaining paths that meet all constraints and whose confidence score is higher than the preset threshold, and generating a multi-hop association reasoning result set.
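The three reasoning submodules above amount to a bounded breadth-first expansion followed by filtering. The sketch below illustrates this under stated assumptions: the toy `GRAPH`, its predicates and its confidence values are invented for illustration, and the cumulative confidence is taken as the product of edge confidences, which the source does not specify.

```python
from collections import deque

# Toy graph: node -> list of (predicate, neighbour, confidence). Illustrative data only.
GRAPH = {
    "Dream of the Red Chamber": [("studied_in", "Chinese Literature", 0.9)],
    "Chinese Literature": [("covered_by", "Foreign Language Database A", 0.8)],
    "Foreign Language Database A": [("accessed_via", "Campus VPN", 0.9)],
}


def expand_paths(graph, start, max_hops, min_confidence):
    """Breadth-first path expansion with a hop limit and a confidence threshold."""
    results = []
    queue = deque([([start], 1.0)])
    while queue:
        path, score = queue.popleft()
        if len(path) - 1 >= max_hops:  # hop limit reached, do not expand further
            continue
        for _predicate, neighbour, confidence in graph.get(path[-1], []):
            if neighbour in path:  # avoid cycles
                continue
            new_score = score * confidence  # assumed: cumulative score is a product
            if new_score <= min_confidence:  # keep only scores above the threshold
                continue
            new_path = path + [neighbour]
            results.append((new_path, new_score))
            queue.append((new_path, new_score))
    return results


for path, score in expand_paths(GRAPH, "Dream of the Red Chamber", 2, 0.5):
    print(" -> ".join(path), round(score, 3))
```

Because every edge confidence is at most 1, the product can only shrink as a path grows, so pruning during expansion discards nothing that a later filter would keep.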

[0025] As a further aspect of the present invention, the dynamic response generation module includes:

[0026] The template matching selection submodule calls the path type identifier in the multi-hop association reasoning result set, retrieves the matching template structure from the preset response template library, and loads the mapping relationship between placeholders in the template and knowledge graph attribute fields.

[0027] The context-aware fill submodule calls the mapping relationship, extracts the corresponding entity attribute values and relationship description text from the multi-hop association reasoning result set, and dynamically fills the template placeholders by combining the current dialogue round number, user identity category and historical interaction preferences to generate a preliminary response draft.

[0028] The sentence fluency optimization submodule performs grammatical correction and connector insertion operations on the preliminary response draft to ensure that the resulting response text is logically coherent and conforms to the library's professional consultation service standards in terms of style, and finally outputs the natural language response text.
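Template selection and placeholder filling as described in [0026] and [0027] can be sketched with Python's standard `string.Template`. This is only an illustration: the template text, the path type key `cross_database_retrieval`, the role prefixes and the attribute names are all hypothetical, and the fluency optimization step of [0028] is not modeled.

```python
from string import Template

# Hypothetical response template library keyed by path type identifier.
TEMPLATES = {
    "cross_database_retrieval": Template(
        "For research on $topic, you can use $database. Access method: $access."
    ),
}

# Hypothetical role-dependent openings.
ROLE_PREFIX = {"faculty": "Dear faculty member. ", "student": "Hello. "}


def generate_response(path_type, attributes, user_role):
    """Select a template by path type and fill placeholders from graph attributes."""
    body = TEMPLATES[path_type].substitute(attributes)
    return ROLE_PREFIX.get(user_role, "") + body


print(
    generate_response(
        "cross_database_retrieval",
        {
            "topic": "Dream of the Red Chamber",
            "database": "Foreign Language Database A",
            "access": "campus VPN",
        },
        "student",
    )
)
```

`Template.substitute` raises `KeyError` when a placeholder has no value, which surfaces an incomplete reasoning path instead of emitting a response with gaps.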

[0029] As a further aspect of the present invention, the system further includes:

[0030] The knowledge graph incremental update module monitors data change events in various library business systems. When it detects new additions to the collection, rule revisions, changes in expert information, or policy updates, it triggers the incremental extraction process, performs entity recognition and relationship comparison on the changed data, generates a set of difference triples, and merges them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph.

[0031] The set of difference triples includes triples for newly added entities, triples for updated attribute values, and triples for invalidated relations.

[0032] As a further aspect of the present invention, the knowledge graph incremental update module includes:

[0033] The change event capture submodule subscribes to data change notifications from the library integrated management system, subject service platform, and institutional knowledge base through the message queue interface, parses the change type, entity identifier, and new and old value fields in the notification message, and generates change event records with timestamps.

[0034] The difference triple generation submodule calls the entity identifier in the change event record, locates the corresponding entity node in the fine-grained knowledge graph, compares the old and new attribute values or relation adjacency table, generates difference triples that reflect the change content, and attaches change source and effective time tags.

[0035] The graph fusion writing submodule categorizes the set of difference triples according to the change type, performs graph insertion operation on newly added triples, performs attribute overwrite operation on updated triples, performs relation disconnection operation on invalid triples, and synchronously updates the cache index of related inference paths to complete the incremental update of the knowledge graph.
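The graph fusion step in [0035] dispatches on change type: insert for new triples, overwrite for updates, disconnect for invalidated relations. The sketch below shows this under simplifying assumptions: the graph is a plain dictionary holding one object per (subject, predicate) pair, the change type labels `add`, `update` and `invalidate` are hypothetical names, and the cache index refresh mentioned in the text is omitted.

```python
def apply_difference_triples(graph, diffs):
    """Merge difference triples into a graph stored as {(subject, predicate): object}."""
    for change_type, subject, predicate, obj in diffs:
        key = (subject, predicate)
        if change_type == "add":
            graph.setdefault(key, obj)  # insert only if the relation is new
        elif change_type == "update":
            graph[key] = obj  # overwrite the attribute value
        elif change_type == "invalidate":
            graph.pop(key, None)  # disconnect the relation
        else:
            raise ValueError(f"unknown change type: {change_type}")
    return graph


graph = {
    ("Professor Zhang", "responsible"): "Artificial Intelligence Database",
    ("Book B", "located_at"): "Floor 2",
}
diffs = [
    ("add", "Book A", "located_at", "Floor 3"),
    ("update", "Professor Zhang", "responsible", "Data Science Database"),
    ("invalidate", "Book B", "located_at", None),
]
print(apply_difference_triples(graph, diffs))
```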

[0036] A knowledge graph-based intelligent question-answering method for library reference services, which is executed based on the aforementioned knowledge graph-based intelligent question-answering system for library reference services, includes the following steps:

[0037] S1: Acquire multi-source heterogeneous data streams from the library, perform entity recognition and relation extraction on the data streams, normalize entity types according to library business semantic rules, and establish a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain.

[0038] S2: Receive natural language consultation input from the user, generate semantic vector representation through a pre-trained language model, and combine named entity recognition results with dependency parsing tree to extract the core predicates, target resource types and constraints in the question. Call the entity index in the fine-grained knowledge graph for preliminary matching to generate a composite intent structured description containing the main intent and auxiliary constraints.

[0039] S3: Based on the target resource type and constraints in the composite intent structured description, perform path expansion operation in the fine-grained knowledge graph, sequentially traverse the adjacent nodes that are directly related to the starting entity, and perform legality verification on each candidate path according to the preset inference rule set, filter out the inference path set that meets all constraints and whose path length does not exceed the preset hop limit, and generate a multi-hop association inference result set.

[0040] S4: Call the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combine the context dialogue state and user role identifier, fill in the content and reorganize the sentences according to the preset response template library, and generate natural language response text that conforms to the professional context of the library and has logical coherence.

[0041] S5: Monitor data change events in various library business systems. When new collection additions, rule revisions, changes in expert information, or policy updates are detected, trigger the incremental extraction process, perform entity recognition and relationship comparison on the changed data, generate a set of difference triples, and merge them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph.

[0042] Compared with the prior art, the advantages and positive effects of the present invention are as follows:

[0043] This invention integrates metadata from the library's collections, subject classification systems, expert resumes, open access policy texts, and historical consultation records to construct a fine-grained entity relationship structure. Entity recognition and relation extraction are used to form a unified semantic expression for resource entities, rule entities, personnel entities, and policy entities, enhancing the structural connections between various knowledge elements in the library. Semantic vector representation, named entity recognition, and dependency analysis are used to extract core predicates, target resource types, and constraints, constructing a composite semantic structure containing the main intent and auxiliary conditions. Path expansion and rule consistency verification are performed in the knowledge graph based on relational semantics, forming a reasoning chain across resource information, management rules, and expert information. Simultaneously, sentence organization and content reconstruction are combined with dialogue context and user role information to promote a coherent logical structure in consultation responses. Furthermore, differential triples are generated based on changes in business data and continuously incorporated into the knowledge structure, maintaining the ability to continuously update knowledge relationships.

Attached Figure Description

[0044] Figure 1 is a system flowchart of the present invention;

[0045] Figure 2 is a flowchart of the system modules of the present invention;

[0046] Figure 3 is a flowchart of the method of the present invention.

Detailed Implementation

[0047] The technical solution of the present invention will now be described with reference to the accompanying drawings.

[0048] In embodiments of the present invention, words such as "exemplarily," "for example," etc., are used to indicate that something is an example, illustration, or description. Any embodiment or design described as "exemplary" in the present invention should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of the word "exemplary" is intended to present the concept in a concrete manner. Furthermore, in embodiments of the present invention, the meaning expressed by "and / or" can be both, or either one.

[0049] In the embodiments of this invention, the terms "image" and "picture" may sometimes be used interchangeably. It should be noted that, when the distinction between them is not emphasized, they convey the same meaning. Similarly, the terms "of," "relevant," and "corresponding" may sometimes be used interchangeably, and when the distinction between them is not emphasized, they convey the same meaning.

[0050] In this embodiment of the invention, a subscripted form such as W₁ may sometimes be written in a non-subscript form such as W1. When the difference is not emphasized, the meaning they express is the same.

[0051] To make the technical problems, technical solutions and advantages of the present invention clearer, a detailed description will be given below in conjunction with the accompanying drawings and specific embodiments.

[0052] Referring to Figure 1, this invention provides a technical solution: an intelligent question-and-answer system for library reference services based on knowledge graph reasoning, the system comprising:

[0053] The fine-grained knowledge graph construction module acquires multi-source heterogeneous data streams from the library, including collection metadata, subject classification system, expert resume information, open access policy texts, and user historical consultation records. It performs entity recognition and relation extraction on the data streams, normalizes entity types according to library business semantic rules, and establishes a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain.

[0054] The composite intent parsing module receives natural language query input from users, generates semantic vector representations through a pre-trained language model, and extracts the core predicates, target resource types, and constraints from the question by combining named entity recognition results and dependency parsing trees. It then calls the entity index in the fine-grained knowledge graph for preliminary matching and generates a structured description of the composite intent containing the main intent and auxiliary constraints.

[0055] The multi-hop logic reasoning engine module, based on the target resource type and constraints in the composite intent structured description, performs path expansion operations in the fine-grained knowledge graph, sequentially traverses the adjacent nodes that are directly related to the starting entity, and performs legality verification on each candidate path according to the preset reasoning rule set, and selects the set of reasoning paths that meet all constraints and whose path length does not exceed the preset hop limit, and generates a multi-hop association reasoning result set.

[0056] The dynamic response generation module calls the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combines the contextual dialogue state and user role identifier, and fills in the content and reorganizes the sentences according to the preset response template library to generate natural language response text that conforms to the professional context of the library and has logical coherence.

[0057] The knowledge graph incremental update module monitors data change events in various library business systems. When it detects new additions to the collection, rule revisions, changes in expert information, or policy updates, it triggers an incremental extraction process. It performs entity recognition and relation comparison on the changed data, generates a set of difference triples, and merges them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph. The set of difference triples includes new entity triples, attribute value update triples, and relation invalidation triples.

[0058] The fine-grained knowledge graph includes entity subgraphs for library resources, borrowing rules, subject matter experts, and open access policies. The structured description of composite intents includes core resource query intents, rule interpretation intents, expert recommendation intents, and policy applicability judgment intents. The multi-hop associative reasoning result set includes single-resource location paths, cross-database retrieval paths, rule chain derivation paths, and policy-resource mapping paths. The natural language response text includes resource access guidance, rule clause citations, expert contact information, and open access compliance suggestions.

[0059] Referring to Figure 2, the fine-grained knowledge graph construction module includes:

[0060] The multi-source data fusion submodule acquires the library catalog data stream output by the library integrated management system, the expert directory data stream published by the subject service platform, the open access policy document stream exported by the institutional knowledge base, and the user interaction sequence recorded by the reference consultation log system. It performs format standardization processing on each type of data stream, converts unstructured text into documents in a unified markup language, and extracts the timestamps, source identifiers and update frequency metadata to generate a set of raw data units with spatiotemporal labels.

[0061] The system establishes a long-lived connection with the underlying database via a configured RESTful API interface, enabling real-time concurrent retrieval of catalog data streams from the library's integrated management system (MARC format), expert directories from the subject service platform (JSON format), open access policy documents from the institutional knowledge base (XML format), and text interaction sequences from the reference consultation system. The system performs field parsing and format standardization on the acquired raw data, extracting the "200$a" field from MARC and the "title" field from JSON into a unified title identifier, and encapsulating it into a markup language document according to the XML Schema specification. During the conversion process, the system reads millisecond-level timestamps provided by a high-precision time synchronization server to record the absolute time of data generation. To extract the update frequency metadata, the system calculates the time difference, in seconds, between two adjacent fetching tasks, that is, the current capture time minus the previous capture time. The system sets a frequency reference value, also in seconds, and compares the time difference with this reference value to determine whether the data source is a high-frequency update source. The system encapsulates the timestamp, source IP identifier, and update frequency level into the packet header, and integrates them through a memory buffer to generate a set of raw data units with spatiotemporal tags.
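The field unification and update-frequency check in [0061] can be sketched as two small functions. Several things here are assumptions: a MARC record is represented as a flat dictionary keyed by `"200$a"`, which a real MARC parser would not return; the capture times and the reference value of 60 seconds are made-up numbers, since the source values are not available; and a source is treated as high-frequency when the interval is shorter than the reference value.

```python
def unified_title(record, source_format):
    """Map the source-specific title field onto one unified title identifier."""
    if source_format == "marc":
        return record["200$a"]  # title field named in the text
    if source_format == "json":
        return record["title"]
    raise ValueError(f"unsupported source format: {source_format}")


def is_high_frequency(previous_capture_s, current_capture_s, reference_s):
    """Assumed rule: an interval shorter than the reference marks a high-frequency source."""
    interval_s = current_capture_s - previous_capture_s
    return interval_s < reference_s


marc_record = {"200$a": "Dream of the Red Chamber"}
json_record = {"title": "Artificial Intelligence Database"}
print(unified_title(marc_record, "marc"))
print(unified_title(json_record, "json"))
print(is_high_frequency(100.0, 130.0, 60.0))
```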

[0062] The entity type normalization submodule calls all text units in the original data unit set, performs type mapping on the identified named entities based on the entity type system defined by the library domain ontology, and unifies the descriptions of book, journal, and dissertation resources into the library collection resource entity type, unifies the descriptions of borrowing period, overdue fines, and document delivery process rules into the borrowing rule entity type, unifies the descriptions of professor, researcher, and subject librarian into the subject expert entity type, and unifies the descriptions of gold OA, green OA, and CC license policies into the open access policy entity type, generating a type-normalized entity list;

[0063] The system calls upon the original set of data units and performs entity category mapping on the text units based on the library domain ontology framework. It uses the Word2Vec model to transform the identified entity representations into 256-dimensional feature vectors and calculates their cosine similarity to the standard categories defined in the ontology. For expressions such as "senior librarian," "professor," and "researcher," the system calculates the similarity between their vectors and the standard category "subject matter expert."

[0064] Table 1: Entity Type Mapping Similarity and Interval Determination Table

[0065]

[0066] As shown in Table 1, the system sets the normalization threshold to 0.88. If the similarity between an entity vector and a standard category falls into the merge interval defined by this threshold, a forced merge operation is performed. For example, the system determines that the similarity of "professor" falls within this interval and uniformly updates its description to "subject matter expert". This result shows that, by numerically comparing ambiguous titles, the system eliminates redundant synonymous heterogeneous entities and generates a type-normalized entity list.
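
The threshold-based merge can be illustrated with a small cosine similarity sketch. The 0.88 threshold comes from the text; the three-dimensional toy vectors stand in for the 256-dimensional Word2Vec vectors, and treating the merge interval as "similarity at or above the threshold" is an assumption, since Table 1 is not available.

```python
import math

NORMALIZATION_THRESHOLD = 0.88  # threshold stated in the text


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_entity_type(mention_vec, category_vecs,
                          threshold=NORMALIZATION_THRESHOLD):
    """Return (category or None, best similarity) for one entity mention."""
    best_name, best_sim = None, -1.0
    for name, vec in category_vecs.items():
        sim = cosine_similarity(mention_vec, vec)
        if sim > best_sim:
            best_name, best_sim = name, sim
    # Assumption: merge only when the best similarity reaches the threshold.
    return (best_name if best_sim >= threshold else None), best_sim


# Toy vectors; real vectors would come from a trained embedding model.
categories = {
    "subject matter expert": [1.0, 0.0, 0.0],
    "borrowing rule": [0.0, 1.0, 0.0],
}
professor_vec = [0.9, 0.1, 0.0]
label, sim = normalize_entity_type(professor_vec, categories)
```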

[0067] The quadruple relation modeling submodule calls all entity pairs in the type-normalized entity list, extracts semantic relations between entities from the original text context according to the preset relation extraction rule set, including predicates of belonging, location, responsibility, application, and reference relations, and adds confidence scores and source evidence identifiers to each relation triple, constructs a quadruple structure consisting of four parts: subject, predicate, object, and evidence, and generates a fine-grained knowledge graph for the library domain;

[0068] The system iterates through the type-normalized entity list and extracts semantic relationships between entities based on a predefined set of relation extraction rules. For the entity pair "Professor Zhang" and "Artificial Intelligence Database," the system identifies the dependency path between them in the original text, extracts the relation predicate "responsible," and calculates the confidence score for this relation. The confidence score is the product of a text distance coefficient and a semantic co-occurrence weight. The text distance coefficient is a function of the character distance between the two entities, which is 2 characters in this example. The semantic co-occurrence weight is set from historical corpus frequencies: if the probability of this predicate appearing between similar entity pairs is 0.9, the weight is 0.9. The system sets the confidence threshold to 0.25; because the calculated confidence score exceeds this threshold, the system determines that the triple is valid. Subsequently, the system extracts the source evidence identifier "G_DB_001" from the original metadata and combines it with the subject "Professor Zhang", the predicate "responsible", and the object "Artificial Intelligence Database" to generate a quadruple structure, thus completing the fine-grained knowledge graph.
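
A minimal sketch of the quadruple construction follows. The product form of the confidence score, the 0.9 co-occurrence weight, the 0.25 threshold, and the evidence identifier come from the text. The function `distance_coefficient` is a stand-in: the source formula is lost, so the reciprocal decay 1 / (1 + d) used here is purely an assumption.

```python
from typing import NamedTuple, Optional

CONFIDENCE_THRESHOLD = 0.25  # threshold stated in the text


class Quadruple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    evidence: str
    confidence: float


def distance_coefficient(char_distance: int) -> float:
    # Assumption: the original formula is not recoverable; this stand-in
    # simply decays as the entities move further apart.
    return 1.0 / (1.0 + char_distance)


def build_quadruple(subject: str, predicate: str, obj: str, evidence: str,
                    char_distance: int,
                    cooccurrence_weight: float) -> Optional[Quadruple]:
    """Return a quadruple when the confidence clears the threshold, else None."""
    confidence = distance_coefficient(char_distance) * cooccurrence_weight
    if confidence <= CONFIDENCE_THRESHOLD:
        return None
    return Quadruple(subject, predicate, obj, evidence, confidence)


quad = build_quadruple(
    "Professor Zhang", "responsible", "Artificial Intelligence Database",
    "G_DB_001", char_distance=2, cooccurrence_weight=0.9,
)
```

With the assumed coefficient, a distance of 2 characters gives a confidence of about 0.3, which clears the 0.25 threshold; a different coefficient formula would give a different score.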

[0069] The composite intent parsing module includes:

[0070] The semantic vector encoding submodule receives natural language consultation text input by the user, calls a pre-trained language model to perform word segmentation and embedding processing on the text, generates a sequence of word vectors with context awareness, and aggregates them into a global semantic vector through pooling operations. At the same time, it uses a named entity recognition model to annotate key entities and their type labels in the text, generating semantic vector representations with entity annotations.

[0071] The system receives natural language inquiry text input by the user, such as "Does the database managed by Professor Zhang have an open access policy?". The system calls a word segmentation engine to divide the text into 14 terms and assigns an embedding vector to each term using a pre-trained model. The system then performs average pooling, summing the values of all word vectors in each dimension and dividing by 14 to generate a global semantic vector. During this process, the named entity recognition model scans the text with a sliding window, calculating the probability that each segment belongs to a specific entity type. When scanning "Professor Zhang," the model outputs a probability of 0.992 that the segment belongs to "subject matter expert." The system sets the annotation benchmark value to 0.95; because $0.992 > 0.95$, the system inserts a "subject matter expert" type label at the corresponding position in the global vector. Finally, the system outputs a semantic vector representation with entity annotations, which provides a composite data foundation containing semantic features and entity attributes for subsequent syntactic parsing.
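
The pooling and annotation steps can be sketched as below. The 0.95 benchmark and the 0.992 probability come from the text; the two-dimensional toy vectors and the second candidate span with probability 0.60 are illustrative assumptions, and a real system would obtain both from trained models.

```python
ANNOTATION_BENCHMARK = 0.95  # benchmark stated in the text


def average_pool(word_vectors):
    """Mean of each dimension across all token vectors."""
    if not word_vectors:
        raise ValueError("need at least one token vector")
    n = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(vec[i] for vec in word_vectors) / n for i in range(dim)]


def annotate(span_probabilities, benchmark=ANNOTATION_BENCHMARK):
    """Keep (span, type) pairs whose probability exceeds the benchmark."""
    return [(span, etype) for span, etype, p in span_probabilities
            if p > benchmark]


# Three toy token vectors instead of the 14 terms in the example sentence.
toy_vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
global_vector = average_pool(toy_vectors)

labels = annotate([
    ("Professor Zhang", "subject matter expert", 0.992),
    ("database", "collection resource", 0.60),  # hypothetical low-confidence span
])
```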

[0072] The dependency structure parsing submodule calls the semantic vector representation with entity annotation to construct a dependency parsing tree, identifies the root node predicate, subject-object components and modification and limiting structures in the sentence, extracts the core action verb and target noun phrase, and combines entity type labels to determine the resource domain, rule domain or personnel domain involved in the question, and generates a preliminary intent framework.

[0073] The system constructs a dependency parsing tree by invoking the semantic vector representation with entity annotations. It locates the root node predicate "responsible" by traversing the tree nodes and identifies its governing subject "Professor Zhang" and object "database". The system further extracts the modifier "open access policy" and performs domain weight allocation based on the entity labels. The resource domain score, the rule domain score, and the personnel domain score are all initialized to 0. The system recognizes "database" and adds 1 to the resource domain score; "open access policy" adds 1 to the rule domain score; and "Professor Zhang" adds 1 to the personnel domain score. The system thus obtains a score of 1 for each of the three domains. The system identifies the predicate "whether or not" as having query attribute characteristics and determines the core intent to be attribute confirmation. By identifying the subject and object components and their corresponding domain scores, the system generates a preliminary intent framework containing the logical chains "subject matter expert - responsible - collection resources" and "collection resources - applicable - open access policy".
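
The domain score counting can be sketched as a simple tally. The mapping `TYPE_TO_DOMAIN` from entity type label to domain is an assumption inferred from the example, and the label strings are illustrative.

```python
# Assumed mapping from entity type label to the domain it contributes to.
TYPE_TO_DOMAIN = {
    "collection resource": "resource",
    "open access policy": "rule",
    "subject matter expert": "personnel",
}


def domain_scores(entity_types):
    """Count one point per recognized entity in its mapped domain."""
    scores = {"resource": 0, "rule": 0, "personnel": 0}
    for etype in entity_types:
        domain = TYPE_TO_DOMAIN.get(etype)
        if domain is not None:
            scores[domain] += 1
    return scores


# "Professor Zhang", "database", "open access policy" from the example query.
scores = domain_scores(
    ["subject matter expert", "collection resource", "open access policy"]
)
```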

[0074] The constraint extraction submodule calls the modification and limitation structure in the initial intent framework to identify constraint elements such as time range, geographical restrictions, language preferences, and access permissions, and aligns them with the corresponding attribute fields in the fine-grained knowledge graph to generate a composite intent structured description containing the main intent identifier and constraint key-value pairs.

[0075] The system invokes the modification and limitation structures within the preliminary intent framework to identify constraint elements in the text. It identifies "open access policy" as an attribute constraint and uses an alignment formula to calculate its overlap with the "policy type" field in the knowledge graph. For example, the text constraint may be "Open Access" while the graph field is "Open Access Policy". The system sets a semantic expansion coefficient and uses it to calculate a corrected matching value from the overlap. The system sets the constraint alignment threshold to 0.90; because the corrected matching value meets this threshold, the system determines that the constraint aligns with the graph attribute "Policy_Type". It then encapsulates the backbone intent "QUERY_RELATION" with the constraint key-value pair {Policy_Type:OA} to generate a composite intent structured description. The advantage of this approach is that calculating the overlap of constraint terms and correcting it with a coefficient ensures that natural language constraints are accurately mapped to the graph's physical fields.

[0076] The multi-hop logic reasoning engine module includes:

[0077] The starting node location submodule calls the main intent identifier and core entity name in the composite intent structured description, retrieves the matching set of starting nodes in the fine-grained knowledge graph, and performs preliminary filtering of the starting nodes based on the constraint key-value pairs to generate a list of valid starting nodes.

[0078] The system receives the core entity name "Professor Zhang" from the composite intent structured description, performs a full-text search in the fine-grained knowledge graph, and obtains a set of candidate nodes. The system extracts the attribute requirements from the constraints and compares them with the attribute values of each node. In this example, the "professional title" attribute of one candidate node is "professor", while that of another candidate node is "associate professor". The system performs a logical comparison, matching and scoring each node's attributes against the "professor" label in the intent.

[0079] Table 2: Starting Node Filtering Scoring Table

[0080]

[0081] Referring to Table 2, the system sets the node validity baseline value to 0.80. Because the score of the node whose professional title matches "professor" exceeds this baseline, the system determines it to be a valid starting node. This result indicates that, through precise identification of attribute values, the system excludes nodes with duplicate names but mismatched attributes and generates a list of valid starting nodes.

[0082] The path expansion and traversal submodule starts with each node in the list of valid starting nodes, executes a breadth-first search strategy, expands layer by layer to adjacent nodes, and applies a set of preset reasoning rules during each layer expansion process, including business rules such as "if resource A belongs to database B and database B supports document delivery, then resource A can apply for document delivery" and "if expert C belongs to discipline D and discipline D is associated with journal E, then expert C may be a reviewer for journal E", to generate candidate paths for logical consistency verification.

[0083] Starting from each valid starting node, the system executes a breadth-first search strategy. In the first-layer expansion, the system identifies the relationship "Professor Zhang - Responsible for - Artificial Intelligence Database"; in the second-layer expansion, it identifies "Applicable - Gold OA Policy" through the "Artificial Intelligence Database" node. During this process, the system applies a preset inference rule set and selects the business rule "If expert C is responsible for resource A, and resource A is subject to policy P, then expert C needs to interpret policy P". The system calculates the logical strength of this path as the mean of the confidence scores of its segments. Given that the confidence score of the first segment is 0.92 and that of the second segment is 0.88, the logical strength is $(0.92 + 0.88)/2 = 0.90$. The system compares this value with the rule strength threshold of 0.85; because $0.90 > 0.85$, the system determines that the expanded path conforms to the business logic and generates candidate paths for logical consistency verification.
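
The layer-by-layer expansion and the mean-confidence check can be sketched as follows. The confidence values and the 0.85 threshold come from the text; the adjacency-list layout of `GRAPH`, the cycle check, and the function names are assumptions for illustration, and the business rule matching itself is not modeled.

```python
from collections import deque

RULE_STRENGTH_THRESHOLD = 0.85  # threshold stated in the text

# Toy graph: node -> list of (predicate, neighbor, confidence).
GRAPH = {
    "Professor Zhang": [("responsible", "Artificial Intelligence Database", 0.92)],
    "Artificial Intelligence Database": [("applicable", "Gold OA Policy", 0.88)],
    "Gold OA Policy": [],
}


def expand_paths(graph, start, max_hops):
    """Breadth-first expansion; returns every path of 1..max_hops edges."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, edges = queue.popleft()
        if len(edges) == max_hops:
            continue  # do not expand beyond the hop limit
        visited = {start} | {edge[2] for edge in edges}
        for predicate, neighbor, conf in graph.get(node, []):
            if neighbor in visited:
                continue  # skip cycles
            new_edges = edges + [(node, predicate, neighbor, conf)]
            paths.append(new_edges)
            queue.append((neighbor, new_edges))
    return paths


def logical_strength(edges):
    """Mean confidence over the segments of one path."""
    return sum(edge[3] for edge in edges) / len(edges)


candidates = [
    path for path in expand_paths(GRAPH, "Professor Zhang", max_hops=2)
    if logical_strength(path) >= RULE_STRENGTH_THRESHOLD
]
```

Because the queue is first-in, first-out, the one-hop path is produced before the two-hop path, which matches the first-layer and second-layer order in the paragraph.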

[0084] The path validity determination submodule calls the path expansion traversal submodule to traverse all candidate paths, calculates the cumulative confidence score of each path, and removes paths whose length exceeds the preset hop limit or contains conflicting relationships, retains paths that meet all constraints and whose confidence is higher than the preset threshold, and generates a multi-hop association reasoning result set.

[0085] The system retrieves all candidate paths and calculates the cumulative confidence score of each path as the product of the confidence scores of its triples. For the path "Professor Zhang - Responsible - Database - Applicable - Policy", given that the confidence scores of the two triples are 0.92 and 0.88 respectively, the cumulative confidence score is $0.92 \times 0.88 = 0.8096$. The system calculates the path length as 2 hops. The system sets a limit on the number of hops in a path and a legality confidence threshold of 0.75. The system then performs a dual judgment: first, it checks that the path length does not exceed the hop limit; second, it checks that the cumulative confidence score exceeds the threshold, here $0.8096 > 0.75$. Both conditions hold. The system further checks whether conflicting predicates exist in the path and, if none are found, retains the path. Through the confidence product operation and the hop limit judgment, the system eliminates unreliable multi-hop associations and generates a multi-hop association reasoning result set.
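
The dual judgment and conflict check can be sketched as below. The product rule, the confidence values, and the 0.75 threshold come from the text. The hop limit `MAX_HOPS` and the conflicting predicate pair are hypothetical, since the source values are not recoverable.

```python
import math

CONFIDENCE_THRESHOLD = 0.75  # threshold stated in the text
MAX_HOPS = 3                 # hypothetical hop limit
# Hypothetical pairs of predicates that must not appear in the same path.
CONFLICTING_PREDICATES = {("applicable", "not applicable")}


def cumulative_confidence(confidences):
    """Product of the per-triple confidence scores along a path."""
    return math.prod(confidences)


def is_valid_path(predicates, confidences,
                  max_hops=MAX_HOPS, threshold=CONFIDENCE_THRESHOLD):
    """Apply the hop limit, the confidence threshold, and the conflict check."""
    if len(predicates) > max_hops:
        return False
    if cumulative_confidence(confidences) <= threshold:
        return False
    used = set(predicates)
    return not any(a in used and b in used for a, b in CONFLICTING_PREDICATES)


score = cumulative_confidence([0.92, 0.88])
ok = is_valid_path(["responsible", "applicable"], [0.92, 0.88])
```

Multiplying confidences penalizes long paths more strongly than averaging does, so this filter is stricter than the mean-based logical strength used during expansion.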

[0086] The dynamic response generation module includes:

[0087] The template matching selection submodule calls the path type identifier in the multi-hop association reasoning result set, retrieves the matching template structure from the preset response template library, including resource location templates, rule interpretation templates, expert recommendation templates and policy application templates, and loads the mapping relationship between placeholders in the templates and knowledge graph attribute fields;

[0088] The system retrieves path type identifiers from the multi-hop association reasoning result set and searches the preset template library. It extracts the path endpoint entity type "Open Access Policy" and the predicate "Applicable," and matches them to the policy application template. The system loads the placeholder mapping rules from the template, mapping "Expert Name" to the "name" attribute of the path start node and "Policy Name" to the "policy_title" attribute of the path endpoint node. The system performs an integrity check by calculating the coverage of the required parameters. If the template has 3 placeholders and the reasoning result provides 3 parameters, the coverage is $3/3 = 1.0$. The template activation threshold is set to 1.0; because the coverage meets this threshold, the system locks the template for subsequent filling.
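
The coverage check before template activation can be sketched as follows. The activation threshold of 1.0 and the three-placeholder example come from the text; the template wording and the placeholder names `expert_name`, `resource_name`, and `policy_name` are illustrative assumptions.

```python
from string import Formatter

ACTIVATION_THRESHOLD = 1.0  # threshold stated in the text

# Hypothetical policy application template with three placeholders.
TEMPLATE = ("The {resource_name} managed by {expert_name} "
            "is subject to the {policy_name} policy.")


def placeholder_names(template):
    """List the named fields that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def coverage(template, params):
    """Fraction of placeholders for which a non-empty parameter exists."""
    names = placeholder_names(template)
    if not names:
        return 1.0
    filled = sum(1 for name in names if params.get(name))
    return filled / len(names)


params = {
    "expert_name": "Professor Zhang",
    "resource_name": "Artificial Intelligence Database",
    "policy_name": "Gold OA",
}
cov = coverage(TEMPLATE, params)
# Fill the template only when every required placeholder is covered.
response = TEMPLATE.format(**params) if cov >= ACTIVATION_THRESHOLD else None
```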

[0089] The context-aware fill submodule calls the mapping relationship, extracts the corresponding entity attribute values and relationship description text from the multi-hop association reasoning result set, and dynamically fills the template placeholders by combining the current dialogue round number, user identity category and historical interaction preferences to generate a preliminary response draft.

[0090] The system invokes the mapping relationship to extract entity attribute values from the reasoning result set. The current dialogue round is 2, and the user's identity is "Teacher." The system sets a response detail weight according to the user's identity, with a specific value assigned for the teacher identity. The system fills the placeholders with the attribute values "Professor Zhang", "Artificial Intelligence Database", and "Gold OA", and then, according to this weight, selects the supplementary explanatory text in the template.

[0091] Table 3: Attribute Extraction Table for Response Elements

[0092]

[0093] Referring to Table 3, due to the high correction coefficient, the system selected a detailed policy description. The generated preliminary response draft was: "Professor Zhang's artificial intelligence database is subject to the Gold OA policy, and the Gold OA policy is implemented, exempting page fees." This result indicates that the system, through the intervention of identity weights, achieved differentiated generation of response content.

[0094] The sentence fluency optimization submodule performs grammatical correction and connector insertion on the initial response draft to ensure that the generated natural language response text is logically coherent and conforms to the library's professional consultation service standards in terms of style, and finally outputs the natural language response text.

[0095] The preliminary response draft undergoes grammatical correction and connector insertion. The system calculates the semantic fluency score of the draft, which is derived from the co-occurrence probability of adjacent terms. The system identifies semantic redundancy between "is subject to the Gold OA policy" and "the Gold OA policy is implemented" and removes the redundancy. It then inserts the connector "specifically" between the two clauses. The system calculates the language model scores before and after optimization (a lower score indicates more fluent text). The system optimization threshold is 10.0; measured against this threshold, the optimization is deemed effective. The final natural language response text reads: "The AI database managed by Professor Zhang is subject to the Gold OA policy; specifically, this policy waives publication fees." This result ensures that the response text meets the rigorous stylistic requirements for professional library consultations.

[0096] The knowledge graph incremental update module includes:

[0097] The change event capture submodule subscribes to data change notifications from the library integrated management system, subject service platform, and institutional knowledge base through the message queue interface, parses the change type, entity identifier, and new and old value fields in the notification message, and generates change event records with timestamps.

[0098] The system subscribes to data change notifications from external systems via a message queue interface. Upon receiving an update message from the subject service platform, the system parses it to determine the changed entity ID as "Expert_001," the changed attribute as "Responsible Domain," the new value as "Deep Learning," and the old value as "Machine Learning." The system extracts the source identifier of this notification as "Sub_Platform_A" and obtains the synchronization timestamp. The system then sets a change importance score, whose value is determined by the field type. Since "Responsible Domain" is a core semantic relationship field, the system assigns it a correspondingly high score. The real-time update threshold is set to 0.80; because the score exceeds this threshold, the system marks the event as high-priority, generates a timestamped change event record, and stores it in the write queue.

[0099] The difference triple generation submodule calls the entity identifier in the change event record, locates the corresponding entity node in the fine-grained knowledge graph, compares the old and new attribute values or relation adjacency table, generates difference triples that reflect the change content, and attaches the change source and effective time tags.

[0100] The system retrieves the entity identifier from the change event record and locates the "Expert_001" node in the knowledge graph. It then extracts the node's adjacency table and compares it with the new value in the change event. Finally, the system calculates the edit distance between the old and new attribute values. If the new value "deep learning" and the old value "machine learning" are each 4 characters long (in the original language) and their characters are completely different, the edit distance equals the full length of 4 characters. The system calculates a difference coefficient from this edit distance. The difference judgment threshold is set to 0.1; because the difference coefficient exceeds this threshold, the system determines that the attribute has undergone a substantial change. It generates difference triples, marking the original "Responsible for Machine Learning" as invalid and "Responsible for Deep Learning" as newly added, and attaches an effective time tag to provide precise triple operation instructions for the subsequent physical write.
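
The difference triple generation can be sketched with a standard Levenshtein edit distance. The 0.1 threshold and the entity and attribute names come from the text. Normalizing the edit distance by the longer string length is an assumption, because the source formula for the difference coefficient is lost, and the timestamp is a sample value.

```python
DIFFERENCE_THRESHOLD = 0.1  # threshold stated in the text


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def difference_coefficient(old: str, new: str) -> float:
    # Assumption: normalize by the longer string; the source formula is lost.
    longest = max(len(old), len(new))
    return edit_distance(old, new) / longest if longest else 0.0


def diff_triples(entity_id, predicate, old, new, effective_time):
    """Emit invalidate/add operations when the change is substantial."""
    if difference_coefficient(old, new) <= DIFFERENCE_THRESHOLD:
        return []
    return [
        {"op": "invalidate", "triple": (entity_id, predicate, old),
         "effective": effective_time},
        {"op": "add", "triple": (entity_id, predicate, new),
         "effective": effective_time},
    ]


triples = diff_triples("Expert_001", "responsible_domain",
                       "Machine Learning", "Deep Learning",
                       "2026-01-01T00:00:00Z")  # sample timestamp
```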

[0101] The graph fusion writing submodule categorizes the set of difference triples according to the change type, performs graph insertion operation on newly added triples, performs attribute overwrite operation on updated triples, performs relation disconnection operation on invalid triples, and synchronously updates the cache index of related inference paths to complete the incremental update of the knowledge graph.

[0102] The system performs write operations on the set of difference triples, categorizing them by change type. For newly added triples, the system creates new edge connections in the graph database; for updated triples, it performs an attribute overwrite operation, changing the value of the "Responsible Domain" field from "Machine Learning" to "Deep Learning". Simultaneously, the system triggers a cache index update task that recalculates the reasoning paths involving that node. The system obtains the identifiers of the affected paths and recalculates their confidence, obtaining 0.94 in this example. The system compares the index weights before and after the update to ensure consistency of the query results. The advantage of this approach is that overwriting specific attribute slots and synchronously rebuilding the related path indexes eliminates the interference of outdated knowledge with the reasoning results and completes the incremental update of the fine-grained knowledge graph.
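
The categorized write and the cache refresh can be sketched as follows. The three operation types mirror the insert, overwrite, and disconnect operations in the text. The in-memory dictionary standing in for the graph database, the cache layout, and the choice to evict stale paths (so that they are recomputed on the next query) rather than recompute them in place are all assumptions.

```python
def apply_diff(graph, cache, diffs):
    """Apply difference triples and evict cached paths that touch changed nodes.

    graph: dict mapping (subject, predicate) -> set of objects
    cache: dict mapping path key -> list of node identifiers on that path
    Returns the list of evicted path keys.
    """
    touched = set()
    for diff in diffs:
        subject, predicate, obj = diff["triple"]
        if diff["op"] == "add":
            graph.setdefault((subject, predicate), set()).add(obj)  # insertion
        elif diff["op"] == "update":
            graph[(subject, predicate)] = {obj}                      # overwrite
        elif diff["op"] == "invalidate":
            graph.get((subject, predicate), set()).discard(obj)      # disconnect
        else:
            raise ValueError(f"unknown operation: {diff['op']}")
        touched.add(subject)

    stale = [key for key, nodes in cache.items() if touched & set(nodes)]
    for key in stale:
        del cache[key]
    return stale


graph = {("Expert_001", "responsible_domain"): {"Machine Learning"}}
cache = {
    "path_1": ["Expert_001", "Artificial Intelligence Database"],
    "path_2": ["Expert_002"],
}
stale = apply_diff(graph, cache, [
    {"op": "update",
     "triple": ("Expert_001", "responsible_domain", "Deep Learning")},
])
```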

[0103] Referring to Figure 3, a knowledge graph-based intelligent question-answering method for library reference services includes the following steps:

[0104] S1: Acquire multi-source heterogeneous data streams from the library, perform entity recognition and relation extraction on the data streams, normalize entity types according to library business semantic rules, and establish a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain.

[0105] S2: Receive natural language query input from users, generate a semantic vector representation through a pre-trained language model, and extract the core predicates, target resource types and constraints in the question by combining the named entity recognition results and the dependency parsing tree. Then call the entity index in the fine-grained knowledge graph for preliminary matching and generate a composite intent structured description containing the main intent and auxiliary constraints.

[0106] S3: Based on the target resource type and constraints in the composite intent structured description, perform path expansion operation in the fine-grained knowledge graph, sequentially traverse the adjacent nodes that are directly related to the starting entity, and perform legality verification on each candidate path according to the preset inference rule set, filter out the inference path set that meets all constraints and whose path length does not exceed the preset hop limit, and generate a multi-hop association inference result set.

[0107] S4: Call the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combine the context dialogue state and user role identifier, fill in the content and reorganize the sentences according to the preset response template library, and generate natural language response text that conforms to the professional context of the library and has logical coherence.

[0108] S5: Monitor data change events in various library business systems. When new collection additions, rule revisions, changes in expert information, or policy updates are detected, trigger the incremental extraction process, perform entity recognition and relationship comparison on the changed data, generate a set of difference triples, and merge them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph.

[0109] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A library reference consultation intelligent question-answering system based on knowledge graph reasoning, characterized in that, The system includes: The fine-grained knowledge graph construction module acquires multi-source heterogeneous data streams from the library, performs entity recognition and relation extraction on the data streams, normalizes entity types according to library business semantic rules, and establishes a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain. The composite intent parsing module receives natural language query input from users, generates semantic vector representations through a pre-trained language model, and extracts the core predicates, target resource types, and constraints in the question by combining named entity recognition results and dependency parsing tree. It then calls the entity index in the fine-grained knowledge graph for preliminary matching and generates a composite intent structured description containing the main intent and auxiliary constraints. The multi-hop logic reasoning engine module, based on the target resource type and constraints in the composite intent structured description, performs path expansion operations in the fine-grained knowledge graph, sequentially traverses the adjacent nodes that are directly related to the starting entity, and performs legality verification on each candidate path according to the preset reasoning rule set, filters out the set of reasoning paths that meet all constraints and whose path length does not exceed the preset hop limit, and generates a multi-hop association reasoning result set. 
The dynamic response generation module calls the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combines the contextual dialogue state and user role identifier, and fills in the content and reorganizes the sentences according to the preset response template library to generate natural language response text that conforms to the professional context of the library and has logical coherence.

2. The intelligent question-answering system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that: The fine-grained knowledge graph includes a subgraph of entities representing library resources, borrowing rules, subject matter experts, and open access policies. The structured description of the composite intent includes the intent to query core resources, the intent to interpret rules, the intent to recommend experts, and the intent to judge policy applicability. The multi-hop association reasoning result set includes single-resource location paths, cross-database retrieval paths, rule chain derivation paths, and policy-resource mapping paths. The natural language response text includes resource acquisition guidance, rule clause citations, expert contact information, and open access compliance suggestions.

3. The intelligent question-and-answer system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that, The fine-grained knowledge graph construction module includes: The multi-source data fusion submodule acquires the library catalog data stream output by the library integrated management system, the expert directory data stream published by the subject service platform, the open access policy document stream exported by the institutional knowledge base, and the user interaction sequence recorded by the reference consultation log system. It performs format standardization processing on each type of data stream, converts unstructured text into Unified Markup Language documents, and extracts the timestamps, source identifiers and update frequency metadata to generate a set of raw data units with spatiotemporal labels. The entity type normalization submodule calls all text units in the original data unit set, performs type mapping on the identified named entities based on the entity type system defined by the library domain ontology, unifies the resource descriptions into the library collection resource entity type, unifies the rule descriptions into the borrowing rule entity type, unifies the personnel descriptions into the subject expert entity type, and unifies the policy descriptions into the open access policy entity type, and generates a type-normalized entity list. The quadruple relation modeling submodule calls all entity pairs in the type-normalized entity list, extracts semantic relations between entities from the original text context according to the preset relation extraction rule set, and adds confidence scores and source evidence identifiers to each relation triple, constructing a quadruple structure consisting of four parts: subject, predicate, object, and evidence, and generating a fine-grained knowledge graph for the library domain.

4. The intelligent question-and-answer system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that, The composite intent parsing module includes: The semantic vector encoding submodule receives natural language consultation text input by the user, calls a pre-trained language model to perform word segmentation and embedding processing on the text, generates a sequence of word vectors with context awareness, and aggregates them into a global semantic vector through pooling operations. At the same time, it uses a named entity recognition model to annotate key entities and their type labels in the text, generating semantic vector representations with entity annotations. The dependency structure parsing submodule calls the semantic vector representation with entity annotations to construct a dependency parsing tree, identifies the root node predicate, subject-object components and modification and limiting structures in the sentence, extracts the core action verb and target noun phrase, and combines entity type labels to determine the resource domain, rule domain or personnel domain involved in the question, and generates a preliminary intent framework. The constraint extraction submodule calls the modification and limitation structure in the preliminary intent framework to identify constraint elements such as time range, geographical restrictions, language preferences, and access permissions, and aligns them with the corresponding attribute fields in the fine-grained knowledge graph to generate a composite intent structured description containing the main intent identifier and constraint key-value pairs.

5. The intelligent question-and-answer system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that, The multi-hop logic reasoning engine module includes: The starting node location submodule calls the main intent identifier and core entity name in the composite intent structured description, retrieves the matching set of starting nodes in the fine-grained knowledge graph, and performs preliminary filtering of the starting nodes according to the constraint key-value pairs to generate a list of valid starting nodes. The path expansion and traversal submodule takes each node in the list of valid starting nodes as the starting point, executes a breadth-first search strategy, expands to adjacent nodes layer by layer, and applies a preset inference rule set during each layer expansion process to generate candidate paths for logical consistency verification. The path validity determination submodule calls all candidate paths output by the path expansion traversal submodule, calculates the cumulative confidence score of each path, and removes paths whose length exceeds the preset hop count limit or contains conflicting relationships, retaining paths that meet all constraints and whose confidence score is higher than the preset threshold, and generating a multi-hop association reasoning result set.

6. The intelligent question-and-answer system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that, The dynamic response generation module includes: The template matching selection submodule calls the path type identifier in the multi-hop association reasoning result set, retrieves the matching template structure from the preset response template library, and loads the mapping relationship between placeholders in the template and knowledge graph attribute fields. The context-aware fill submodule calls the mapping relationship, extracts the corresponding entity attribute values and relationship description text from the multi-hop association reasoning result set, and dynamically fills the template placeholders by combining the current dialogue round number, user identity category and historical interaction preferences to generate a preliminary response draft. The sentence fluency optimization submodule performs grammatical correction and connector insertion operations on the preliminary response draft to ensure that the resulting response text is logically coherent and conforms to the library's professional consultation service standards in terms of style, and finally outputs the natural language response text.

7. The intelligent question-answering system for library reference services based on knowledge graph reasoning according to claim 1, characterized in that the system further includes a knowledge graph incremental update module, which monitors data change events in the library's business systems. When it detects new additions to the collection, rule revisions, changes in expert information, or policy updates, it triggers the incremental extraction process, performs entity recognition and relationship comparison on the changed data, generates a set of difference triples, and merges them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph. The set of difference triples includes triples for newly added entities, triples for updated attribute values, and triples for invalidated relations.
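The three kinds of difference triples named in claim 7 can be derived by comparing an old and a new snapshot of the affected triples. The sketch below is a simplification under stated assumptions: triples are plain `(subject, predicate, object)` tuples, each `(subject, predicate)` pair holds a single value, and the function name `diff_triples` and the sample data are hypothetical.

```python
def diff_triples(old, new):
    """Classify changes between two triple snapshots.

    old, new: iterables of (subject, predicate, object) tuples.
    Assumption: each (subject, predicate) pair has at most one object.
    """
    old_by_key = {(s, p): o for s, p, o in old}
    new_by_key = {(s, p): o for s, p, o in new}

    added, updated, invalidated = [], [], []
    for (s, p), o in sorted(new_by_key.items()):
        if (s, p) not in old_by_key:
            added.append((s, p, o))            # newly added entity or relation
        elif old_by_key[(s, p)] != o:
            updated.append((s, p, o))          # attribute value changed
    for (s, p), o in sorted(old_by_key.items()):
        if (s, p) not in new_by_key:
            invalidated.append((s, p, o))      # relation no longer valid
    return {"added": added, "updated": updated, "invalidated": invalidated}


if __name__ == "__main__":
    old_snapshot = [
        ("Book:1", "located_at", "Shelf:A"),
        ("Book:1", "loan_days", "30"),
        ("Expert:Wang", "advises", "Subject:Law"),
    ]
    new_snapshot = [
        ("Book:1", "located_at", "Shelf:A"),
        ("Book:1", "loan_days", "45"),
        ("Book:2", "located_at", "Shelf:B"),
    ]
    for kind, triples in diff_triples(old_snapshot, new_snapshot).items():
        print(kind, triples)
```

A graph with multi-valued relations would need the comparison keyed on the full triple instead of the `(subject, predicate)` pair.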

8. The intelligent question-answering system for library reference services based on knowledge graph reasoning according to claim 7, characterized in that the knowledge graph incremental update module includes:
A change event capture submodule, which subscribes to data change notifications from the library integrated management system, subject service platform, and institutional knowledge base through the message queue interface, parses the change type, entity identifier, and old and new value fields in the notification message, and generates change event records with timestamps.
A difference triple generation submodule, which calls the entity identifier in the change event record, locates the corresponding entity node in the fine-grained knowledge graph, compares the old and new attribute values or relation adjacency tables, generates difference triples that reflect the change content, and attaches change source and effective time tags.
A graph fusion writing submodule, which categorizes the set of difference triples according to the change type, performs a graph insertion operation on newly added triples, an attribute overwrite operation on updated triples, and a relation disconnection operation on invalidated triples, and synchronously updates the cache index of related inference paths to complete the incremental update of the knowledge graph.
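The graph fusion writing step of claim 8 can be sketched as one function that dispatches on change type. The data layout is an assumption for illustration: the graph is a nested dict `graph[subject][predicate] = object`, cached inference paths live in `path_cache`, and `cache_index` maps an entity to the cache keys of paths passing through it. The claim says the cache index is updated synchronously; this sketch interprets that as evicting affected entries, which is only one possible strategy.

```python
def apply_diff(graph, diff, path_cache, cache_index):
    """Merge difference triples into the graph and evict stale cached paths.

    graph:       dict subject -> {predicate: object} (hypothetical layout)
    diff:        {"added": [...], "updated": [...], "invalidated": [...]}
    path_cache:  dict cache key -> cached inference path
    cache_index: dict entity -> set of cache keys whose paths touch that entity
    Returns the set of entities touched by the update.
    """
    touched = set()
    for s, p, o in diff.get("added", []):
        graph.setdefault(s, {})[p] = o          # graph insertion
        touched.update((s, o))
    for s, p, o in diff.get("updated", []):
        graph.setdefault(s, {})[p] = o          # attribute overwrite
        touched.add(s)
    for s, p, o in diff.get("invalidated", []):
        graph.get(s, {}).pop(p, None)           # relation disconnection
        touched.update((s, o))
    # Assumed cache policy: drop every cached path that passes a touched entity.
    for entity in touched:
        for key in cache_index.pop(entity, set()):
            path_cache.pop(key, None)
    return touched


if __name__ == "__main__":
    graph = {
        "Book:1": {"loan_days": "30", "located_at": "Shelf:A"},
        "Expert:Wang": {"advises": "Subject:Law"},
    }
    path_cache = {"q1": ["Book:1", "Shelf:A"], "q2": ["Subject:Math"]}
    cache_index = {"Book:1": {"q1"}, "Subject:Math": {"q2"}}
    diff = {
        "added": [("Book:2", "located_at", "Shelf:B")],
        "updated": [("Book:1", "loan_days", "45")],
        "invalidated": [("Expert:Wang", "advises", "Subject:Law")],
    }
    apply_diff(graph, diff, path_cache, cache_index)
    print(graph)
    print(path_cache)
```

Eviction keeps the sketch short; recomputing the affected paths in place would avoid a cold cache at the cost of extra work during the write.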

9. An intelligent question-answering method for library reference consultation based on knowledge graph reasoning, characterized in that the method is used in the intelligent question-answering system for library reference consultation based on knowledge graph reasoning as described in any one of claims 1-8 and includes the following steps:
S1: Acquire multi-source heterogeneous data streams from the library, perform entity recognition and relation extraction on the data streams, normalize entity types according to library business semantic rules, and establish a four-tuple structured representation containing resource entities, rule entities, personnel entities, and policy entities to generate a fine-grained knowledge graph for the library domain.
S2: Receive natural language consultation input from the user, generate a semantic vector representation through a pre-trained language model, and combine named entity recognition results with the dependency parse tree to extract the core predicates, target resource types and constraints in the question. Call the entity index in the fine-grained knowledge graph for preliminary matching to generate a composite intent structured description containing the main intent and auxiliary constraints.
S3: Based on the target resource type and constraints in the composite intent structured description, perform a path expansion operation in the fine-grained knowledge graph, sequentially traverse the adjacent nodes that are directly related to the starting entity, perform legality verification on each candidate path according to the preset inference rule set, filter out the set of inference paths that meet all constraints and whose path length does not exceed the preset hop limit, and generate a multi-hop association reasoning result set.
S4: Call the entity attribute values and relationship descriptions corresponding to each reasoning path in the multi-hop association reasoning result set, combine the context dialogue state and user role identifier, fill in the content and reorganize the sentences according to the preset response template library, and generate natural language response text that is logically coherent and conforms to the professional context of the library.
S5: Monitor data change events in the library's business systems. When new collection additions, rule revisions, changes in expert information, or policy updates are detected, trigger the incremental extraction process, perform entity recognition and relationship comparison on the changed data, generate a set of difference triples, and merge them into the existing fine-grained knowledge graph to maintain the timeliness and integrity of the knowledge graph.

Citation Information

Patent Citations

  • An intelligent question-answering system based on business knowledge graph retrieval (CN106951470B)

  • A technical consultation question and answer method in the agricultural field (CN116911311B)