A knowledge graph-based knowledge question answering method, device, and medium

By using dual-location encoding and a graph-hybrid expert architecture model in medical diagnosis, the problem of global topological logic fusion of knowledge graphs is solved, the reliability and interpretability of answers in medical diagnosis are improved, and the probability of hallucination output is reduced.

CN122242777A · Pending · Publication Date: 2026-06-19 · UNIV OF JINAN

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: UNIV OF JINAN
Filing Date: 2026-05-21
Publication Date: 2026-06-19

Application Information

Patent Timeline

21 May 2026: Application
19 Jun 2026: Publication (CN122242777A)

IPC: G06N5/04; G06N3/042; G06N3/0464; G06N5/022

AI Tagging

Application Domain

Biological models; Inference methods

Abstract

This specification discloses a knowledge graph-based question answering method, device, and medium, relating to the field of knowledge question answering technology. The method includes: acquiring medical question text input by a user; extracting at least one triple from the medical question text; concatenating the triple and the medical question text to determine an input sequence; performing dual position encoding on each element in the input sequence based on a pre-constructed medical knowledge graph to determine the element position encoding corresponding to each element, thereby determining a corresponding first combined representation, the first combined representation including a text embedding vector, a sequence position encoding embedding vector, and a graph position encoding embedding vector; inputting the first combined representation into a graph hybrid expert architecture model; generating a graph expert output representation and a feedforward network expert output representation through the graph hybrid expert architecture model; and concatenating the graph expert output representation and the feedforward network expert output representation to obtain a model concatenated representation to generate a medical answer sequence.

Description

Technical Field

[0001] This specification relates to the field of knowledge question answering technology, and in particular to a knowledge question answering method, device, and medium based on a knowledge graph.

Background Technology

[0002] In fields with strict operational standards, such as disease diagnosis and treatment, intelligent question-answering systems not only need to accurately understand user intent but also must adhere to pre-defined rules and logical constraints. Currently, generative artificial intelligence, represented by large-scale language models, has made significant progress in natural language understanding and generation. However, its essence lies in probabilistic sequence prediction models. When interacting with complex, dynamic, and uncertain environments, it suffers from problems such as uncontrollable reasoning processes and poor interpretability. It is also susceptible to factual errors, i.e., hallucinations, due to empirical biases in the training data.

[0003] To address the aforementioned issues, existing technologies employ Retrieval-Augmented Generation (RAG). This involves retrieving information fragments relevant to the user's question from external knowledge sources (such as knowledge graphs, databases, and literature) and adding them as background knowledge to the input of a large model to strengthen the factual basis of the answer. Knowledge graphs, because they represent node relationships in structured form, are commonly used as external knowledge bases in RAG systems. However, existing RAG-based knowledge graph question answering processes typically acquire only local subgraph information from the knowledge graph. Large models struggle to capture complete global topological relationships, so inference paths deviate from the true logic. Furthermore, the self-attention mechanism of Transformers and the graph aggregation mechanism of graph convolutional networks have no inherent connection at the neural network structure level, and large models struggle to establish accurate semantic mappings between natural language entities in the user's question and entities in the knowledge graph. As a result, even when a knowledge graph is available, models may generate answers that contradict the logic of the knowledge graph in sensitive scenarios such as medical diagnosis. The generated answers cannot guarantee strict adherence to clinical pathways, drug contraindications, and other normative requirements. The output suggestions may contain logical jumps or violate known medical relationships, failing to meet the application requirements for controllable and interpretable behavior.

[0004] Therefore, in behaviorally sensitive scenarios such as medical diagnosis, existing RAG technology cannot intrinsically integrate the global topological logic of the knowledge graph into the reasoning process of large models. This results in problems such as uncontrollable reasoning paths and inaccurate entity mapping, which can easily lead to hallucinatory outputs when facing strictly regulated scenarios, making it difficult to meet the requirements of practical applications for behavioral reliability and decision traceability.

Summary of the Invention

[0005] This specification provides one or more embodiments of a knowledge graph-based question answering method, device, and medium to address the following technical problem: In behaviorally sensitive scenarios such as medical diagnosis, existing RAG technology cannot intrinsically integrate the global topological logic of the knowledge graph into the large model reasoning process, resulting in uncontrollable reasoning paths and inaccurate entity mapping. This leads to hallucinated outputs when facing strictly regulated scenarios, making it difficult to meet the requirements of practical applications for behavioral reliability and decision traceability.

[0006] One or more embodiments of this specification employ the following technical solutions: This specification provides one or more embodiments of a knowledge graph-based question answering method, the method comprising: The system obtains medical question text input by the user, extracts at least one triple from the medical question text, and concatenates the triple with the medical question text to determine the input sequence. The triple includes at least one element from a head entity, relation, and tail entity. Based on a pre-constructed medical knowledge graph, each element in the input sequence undergoes dual position encoding to determine the element position encoding corresponding to each element, thereby determining a corresponding first combined representation. Each node and each node relation in the medical knowledge graph corresponds to a unique graph position identifier. The first combined representation includes a text embedding vector, a sequence position encoding embedding vector, and a graph position encoding embedding vector for each element. The first combined representation is input into a pre-trained graph hybrid expert architecture model, which generates a graph expert output representation and a feedforward network expert output representation. The graph expert output representation and the feedforward network expert output representation are then concatenated to obtain a model concatenated representation to generate a medical answer sequence.

[0007] This specification provides one or more embodiments of a knowledge graph-based question-answering device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the above-described method.

[0008] This specification provides one or more embodiments of a non-volatile computer storage medium storing computer-executable instructions configured to perform the above-described method.

[0009] At least one of the technical solutions adopted in the embodiments of this specification can achieve the following beneficial effects: Compared with conventional Retrieval-Augmented Generation (RAG) techniques, the above solution addresses the missing global topology and structural mismatch inherent in RAG methods by embedding the structured logic of the knowledge graph into both the input representation and the model architecture of the large model. In the RAG framework, the knowledge graph is used only locally as an external retrieval source, and the model can obtain only a limited number of subgraphs directly connected to the question entities, losing the implicit constraints carried by multi-hop paths. In contrast, this solution explicitly injects the node and relation identifiers of the knowledge graph into the representation of each element at the input through dual position encoding. This lets the model perceive from the outset which words correspond to entities in the graph, preserving the semantic anchors of the global graph structure. Furthermore, in the RAG method, the Transformer and graph convolution are structurally isolated, and the retrieved graph information cannot interact dynamically with the model's self-attention computation. This solution instead places a graph convolutional network expert inside the graph hybrid expert architecture. At each layer, a routing module adaptively activates the expert based on the input semantics, so that entities identified by graph position encodings continue to receive neighbor aggregation updates from graph convolution after each layer of self-attention. The reasoning process of the large model is thus always guided by the topological logic of the knowledge graph: self-attention captures dependencies within the sequence, while graph convolution propagates structured constraints along graph edges. Through this fusion at every layer, the ambiguous expressions in the user's question are gradually locked onto clearly defined paths in the graph. Therefore, when generating answers, traceable intermediate results can be output along the graph reasoning path, and, because the graph position encodings of non-entity positions are set to zero, the model will not incorrectly associate irrelevant words with the graph structure, significantly reducing the probability of hallucinations. In sensitive scenarios such as disease diagnosis, this achieves controllable reasoning behavior and interpretable results.

Attached Figure Description

[0010] To more clearly illustrate the technical solutions in the embodiments of this specification or in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings: Figure 1 is a flowchart of a knowledge graph-based question-answering method provided in an embodiment of this specification; Figure 2 is a schematic diagram of dual position encoding provided in an embodiment of this specification; Figure 3 is a schematic diagram of the question-answering process of a graph hybrid expert architecture large model provided in an embodiment of this specification; Figure 4 is a schematic diagram of the structure of a knowledge graph-based question-answering device provided in an embodiment of this specification.

Detailed Implementation

[0011] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this specification, and not all embodiments. Based on the embodiments of this specification, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this specification.

[0012] This specification provides a knowledge graph-based question answering method. It should be noted that the execution entity in the embodiments of this specification can be a server or any device with data processing capabilities. Figure 1 is a flowchart of a knowledge graph-based question-answering method provided in an embodiment of this specification. As shown in Figure 1, the method mainly includes the following steps: Step S101: Obtain the medical question text input by the user, extract at least one triple from the medical question text, and concatenate the triple and the medical question text to determine the input sequence.

[0013] The triple includes at least one element from the head entity, relation, and tail entity. Concatenating the triple with the medical question text to determine the input sequence specifically includes: extracting text triples from the medical question text using a pre-trained large model, where each text triple includes a head entity, a directly extracted relation, and a tail entity; when the directly extracted relation is null, performing a relation path query in the pre-constructed medical knowledge graph based on the head entity and the tail entity to determine at least one relation path between the head entity and the tail entity; filling the null value of the directly extracted relation based on the relation path to determine the relation element of the triple; and concatenating the triple with the medical question text in a preset order, inserting a separator between the end of the triple and the beginning of the medical question text, and using the concatenated character sequence as the input sequence.

[0014] In one embodiment of this specification, the medical question text submitted by the user through the input interface is first obtained. The medical question text is a string composed of natural language, such as the user describing their own symptoms or asking for disease-related information in text form.

[0015] To transform unstructured medical question text into structured knowledge units, a pre-trained large model is invoked. This model has been fine-tuned on large amounts of medical text and knowledge graph triple data, enabling it to extract semantic triples from free text. The medical question text is input into the large model, which outputs text triples via an encoder-decoder structure or sequence labeling. Each text triple consists of three fields: a head entity, a directly extracted relation, and a tail entity. The head entity represents the medical concept that is the subject of the question, the tail entity represents the medical concept that is the object, and the directly extracted relation is the predicate, identified by the model directly from the text, that connects the head and tail entities. However, in real-world medical questions, users often do not state the relation word explicitly. For example, in the question "Does a cold cause fever?", the relation word "symptom" does not appear. In such cases the directly extracted relation is set to a null value, which can be a special placeholder such as "NULL" or "?".

[0016] The method then checks whether the directly extracted relation is null. If it is not null, the text triple is used directly as the triple in subsequent steps. If it is null, a relation path query and filling are performed. The pre-built medical knowledge graph, loaded into memory, is obtained. This graph contains nodes of types such as diseases, symptoms, and drugs, as well as relations such as recommended medication. Each node and relation has a unique identifier. Using the head and tail entities of the text triple as query keys, a graph traversal algorithm is executed over the graph's adjacency list. Specifically, a bidirectional breadth-first search or a depth-first search with path constraints can be used to find all paths from the head entity node, through several intermediate nodes and relation edges, to the tail entity node. Each path consists of a series of relation identifiers and the intermediate nodes they connect. A single-hop relation directly connecting the head and tail entities is extracted, or the semantically most significant single-step relation on the path is selected, as the relation element to be filled.
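
The relation path query and null filling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: it uses a plain breadth-first search rather than the bidirectional or path-constrained variants mentioned, and the names `ADJ`, `find_relation_path`, `fill_null_relation`, the `max_hops` limit, and the toy graph are all hypothetical. The specification does not define how the "semantically most significant" relation on a multi-hop path is chosen, so the sketch simply takes the first hop as a placeholder.

```python
from collections import deque

# Hypothetical toy adjacency list: node name -> list of (neighbor, relation).
ADJ = {
    "cold": [("fever", "manifestation"), ("ibuprofen", "recommended_medication")],
    "fever": [("headache", "accompanying")],
    "ibuprofen": [],
    "headache": [],
}


def find_relation_path(adj, head, tail, max_hops=3):
    """Breadth-first search for the shortest path from head to tail.

    Returns a list of (source, relation, target) edges, or None if no path
    with at most max_hops edges exists.
    """
    if head not in adj:
        return None
    queue = deque([(head, [])])
    visited = {head}
    while queue:
        node, path = queue.popleft()
        if node == tail and path:
            return path
        if len(path) >= max_hops:
            continue
        for neighbor, relation in adj.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


def fill_null_relation(adj, triple, null_markers=("NULL", "?", None)):
    """Replace a null relation in (head, relation, tail) using the graph."""
    head, relation, tail = triple
    if relation not in null_markers:
        return triple  # relation was extracted directly; keep it
    path = find_relation_path(adj, head, tail)
    if path is None:
        return triple  # nothing to fill with; leave the placeholder
    # Assumption: a single-hop path yields its only relation; for a multi-hop
    # path the first hop stands in for the unspecified selection rule.
    return (head, path[0][1], tail)
```

With the toy graph, `fill_null_relation(ADJ, ("cold", "NULL", "fever"))` fills the relation with `"manifestation"`, while a triple whose relation is already present is returned unchanged.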

[0017] The retrieved relation element is filled into the relation position of the text triple, replacing the null value, to form a complete triple containing three explicit elements: a head entity, a relation, and a tail entity. After the complete triple is obtained, the input sequence is constructed in a preset order. First, the head entity, relation, and tail entity of the triple are expanded into character sequences, which can be joined directly or separated by spaces. Then, a preset separator is inserted at the end of the triple character sequence. This separator can be a special marker such as "[SEP]" that clearly marks the boundary between the structured and unstructured parts of the sequence. Finally, the original medical question text is appended after the separator, giving a complete character sequence composed of the triple string, the separator, and the original question string concatenated in order. This sequence is the input sequence. It is passed to the subsequent dual position encoding module: the triple part carries precise relational information verified against the knowledge graph, and the original question part retains the user's original wording. The two parts are separated by the separator yet joined in a single sequence, giving the model a unified input form for understanding both structured knowledge and natural language context.
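
The concatenation order described above (triple string, separator, original question) can be illustrated with a short sketch. The function name `build_input_sequence` and the `joiner` parameter are hypothetical; the "[SEP]" marker is the example separator given in the text, and whether fields are joined directly or with spaces is left configurable because the specification allows both.

```python
def build_input_sequence(triple, question, sep="[SEP]", joiner=" "):
    """Concatenate a completed triple, a separator, and the question text.

    joiner=" " separates the fields with spaces; joiner="" joins them
    directly, which the specification also permits.
    """
    triple_text = joiner.join(triple)
    return f"{triple_text}{joiner}{sep}{joiner}{question}"
```

For instance, `build_input_sequence(("cold", "manifestation", "fever"), "Does a cold cause fever?")` yields `cold manifestation fever [SEP] Does a cold cause fever?`.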

[0018] Compared with conventional retrieval-augmented generation methods that simply concatenate the user question with retrieved knowledge graph fragments, this approach introduces relation path queries and null value filling during the input construction phase. By traversing the structure of the knowledge graph, it proactively completes missing relations with topologically verified relations, transforming incomplete triples into complete and verifiable ones. Because the logical constraints of the knowledge graph are injected into the input sequence beforehand, the uncertainty introduced when a large model guesses relations during inference is avoided. Furthermore, because the filling process uses global relation paths in the knowledge graph rather than only local neighbors, it can capture the semantic connections of multi-hop reasoning, and the path query can select the relation that best matches medical facts.

[0019] Step S102: Based on the pre-constructed medical knowledge graph, perform dual position encoding on each element in the input sequence to determine the element position encoding corresponding to each element, so as to determine the corresponding first combined representation.

[0020] In the medical knowledge graph, each node and each relation between nodes corresponds to a unique graph position identifier. The first combined representation includes the text embedding vector, the sequence position encoding embedding vector, and the graph position encoding embedding vector of each element. Before dual position encoding is performed on each element in the input sequence based on the pre-constructed medical knowledge graph to determine the element position encoding of each element, the method further includes: extracting multiple nodes and the relations connecting these nodes from a pre-acquired medical data source, where the nodes include at least disease nodes, symptom nodes, drug nodes, examination item nodes, and treatment plan nodes, and the relations include at least a manifestation relation, a recommended medication relation, a contraindicated medication relation, a required relation, an available relation, and an accompanying relation; generating a unique hierarchical combined node identifier for each node, where the hierarchical combined node identifier is a packed integer obtained by concatenating a type code, a level code, an intra-domain sequence number, and a check bit, the type code distinguishes the type of the node, the level code indicates the node's abstraction level in the medical ontology, the intra-domain sequence number is a unique, incrementing sequence number within the same type code and level code, and the check bit is generated by a hash function from the type code, the level code, and the intra-domain sequence number; generating a unique hierarchical combined relation identifier for each relation, where the identifier is a packed integer obtained by bitwise concatenation of a source type code, a relation type code, and a target type code, the source type code and target type code are the type codes of the nodes at the two ends of the relation, and the relation type code distinguishes the type of the relation; storing the mapping between node names and node graph position identifiers for all nodes, and the mapping between relation names and relation graph position identifiers for all relations, in an identifier mapping table; constructing an adjacency table based on the connections between nodes; and encapsulating the identifier mapping table and the adjacency table to determine the medical knowledge graph.

[0021] In one embodiment of this specification, multiple nodes and the relations connecting these nodes are first extracted from a pre-acquired medical data source. The medical data source includes structured databases such as electronic medical records, clinical guideline knowledge bases, and drug package inserts, as well as unstructured medical literature. Using natural language processing techniques, such as named entity recognition and relation extraction models, entities with medical significance are identified as nodes, and the semantic connections between these entities are identified as relations. Nodes include at least disease nodes (e.g., cold), symptom nodes (e.g., fever), drug nodes (e.g., ibuprofen), examination item nodes (e.g., complete blood count), and treatment plan nodes (e.g., surgical treatment). These node types cover the core elements of medical diagnosis and treatment. Relations include at least: a manifestation relation (connecting a disease and a symptom); a recommended medication relation (connecting a disease and a drug); a contraindicated medication relation (connecting a disease and a drug, indicating that use is prohibited); a required relation (connecting a disease and an examination item); an available relation (connecting a disease and a treatment plan); and an accompanying relation (connecting two symptoms, indicating that they often occur together). These relation types define the typical medical logic between nodes.

[0022] After extraction, a unique hierarchical combined node identifier is generated for each node. This identifier is a packed integer, meaning that multiple independent numerical fields are combined into a single integer variable through bitwise operations. The packed integer is obtained by concatenating the type code, level code, intra-domain sequence number, and check bit. Specifically, the type code occupies the high-order bits of the integer and distinguishes the node type; for example, the type code is set to 1 for a disease node, 2 for a symptom node, 3 for a drug node, 4 for an examination item node, and 5 for a treatment plan node. The level code occupies the second-highest bit segment and represents the node's abstraction level within the medical ontology; smaller values indicate greater abstraction, such as system level (1), organ level (2), disease level (3), and subtype level (4). This helps the model perceive how fine-grained the concept is. The intra-domain sequence number occupies the middle or low-order bits and is a unique, incrementing sequence number under the same combination of type code and level code, assigned sequentially starting from 1 to ensure that each node is unique within that combination. The check bit occupies the least significant bit segment and is generated by a hash function (e.g., XORing the three values and taking the lower 8 bits) from the type code, level code, and intra-domain sequence number. It is used to quickly verify the integrity of the identifier and to detect data errors introduced during transmission or storage.
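
A minimal sketch of such a packed node identifier follows. The field order (type code, level code, intra-domain sequence number, check bit from high to low) and the XOR-based 8-bit check follow the description above; the specific bit widths (4, 4, 16, and 8 bits) and the function names are assumptions made for illustration, since the specification does not fix them in this passage.

```python
# Assumed field widths (not given in the specification): 4 + 4 + 16 + 8 bits.
TYPE_BITS, LEVEL_BITS, SEQ_BITS, CHECK_BITS = 4, 4, 16, 8


def check_value(type_code, level_code, seq):
    """XOR the three fields and keep the lower 8 bits, as in the example."""
    return (type_code ^ level_code ^ seq) & 0xFF


def pack_node_id(type_code, level_code, seq):
    """Pack the fields into one integer, highest field first."""
    for value, bits, name in (
        (type_code, TYPE_BITS, "type_code"),
        (level_code, LEVEL_BITS, "level_code"),
        (seq, SEQ_BITS, "seq"),
    ):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (
        (type_code << (LEVEL_BITS + SEQ_BITS + CHECK_BITS))
        | (level_code << (SEQ_BITS + CHECK_BITS))
        | (seq << CHECK_BITS)
        | check_value(type_code, level_code, seq)
    )


def unpack_node_id(node_id):
    """Return (type_code, level_code, seq, check) from a packed identifier."""
    check = node_id & ((1 << CHECK_BITS) - 1)
    seq = (node_id >> CHECK_BITS) & ((1 << SEQ_BITS) - 1)
    level_code = (node_id >> (SEQ_BITS + CHECK_BITS)) & ((1 << LEVEL_BITS) - 1)
    type_code = (node_id >> (LEVEL_BITS + SEQ_BITS + CHECK_BITS)) & ((1 << TYPE_BITS) - 1)
    return type_code, level_code, seq, check


def is_valid_node_id(node_id):
    """Recompute the check field and compare it with the stored one."""
    type_code, level_code, seq, check = unpack_node_id(node_id)
    return check == check_value(type_code, level_code, seq)
```

Under these assumed widths, a disease node (type 1) at the disease level (3) with sequence number 1 packs to `0x13000103`, and flipping any single bit of the check field makes `is_valid_node_id` return `False`.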

[0023] It's important to note that in medical ontology, abstract levels are used to describe the hierarchical relationship between medical concepts, from broad to narrow, and from general to specific. For example, respiratory diseases are a relatively abstract higher-level concept, encompassing more specific sub-concepts such as lung infections. Lung infections can be further subdivided into more specific concepts like pneumonia and bronchitis, and pneumonia can be further subdivided into subtypes such as bacterial pneumonia and viral pneumonia. This hierarchical tree or network structure constitutes the hierarchical system of medical ontology. The position of each node within this hierarchy is called its abstract level. The closer to the root node (e.g., disease), the smaller the level code value, representing a more abstract concept with a wider scope of application; the closer to the leaf node (e.g., bacterial pneumonia), the larger the level code value, representing a more specific concept with more refined attributes. The purpose of setting level codes is to allow the model to perceive the level of abstraction of nodes, thereby reasonably controlling the scope of information propagation during graph convolution aggregation. For abstract nodes, the common features of all their subclass nodes should be aggregated; for specific nodes, more attention should be paid to their unique attributes and their relationships with adjacent specific nodes. This hierarchical coding simulates the reasoning process of human doctors, which gradually locates specific diseases from chief symptoms, and helps to improve the logical rigor and diagnostic accuracy of large models in medical question answering.

[0024] Following the bit allocation rules described above, the four fields are each shifted left by the corresponding number of bits and then combined with bitwise OR to form a single integer, which is the graph position identifier of the node. At the same time, a unique hierarchical combined relation identifier is generated for each relation. This identifier is also a packed integer, obtained by bitwise concatenation of the source type code, relation type code, and target type code. The source type code occupies the high-order bits and its value is the type code of the relation's starting node (head entity); the target type code occupies the low-order bits and its value is the type code of the relation's ending node (tail entity); the relation type code occupies the middle bits and distinguishes the type of relation, for example manifestation relation 1, recommended medication relation 2, contraindicated medication relation 3, required relation 4, available relation 5, and accompanying relation 6. These three fields are then left-shifted and merged to obtain an integer that serves as the graph position identifier of the relation.
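
The relation identifier layout can be sketched in the same way. The field order (source type code high, relation type code in the middle, target type code low) follows the text; the bit widths (4, 8, and 4 bits), the function names, and the `RELATION_SCHEMA` table are assumptions for illustration. The schema entries are derived from the node and relation types listed in the specification and show how the embedded type codes allow an early consistency check without looking up the nodes.

```python
# Assumed field widths (not given in the specification): 4 + 8 + 4 bits.
SRC_BITS, REL_BITS, TGT_BITS = 4, 8, 4

# Node type codes from the text: 1 disease, 2 symptom, 3 drug,
# 4 examination item, 5 treatment plan.
# Hypothetical schema: relation type code -> (source type, target type).
RELATION_SCHEMA = {
    1: (1, 2),  # manifestation: disease -> symptom
    2: (1, 3),  # recommended medication: disease -> drug
    3: (1, 3),  # contraindicated medication: disease -> drug
    4: (1, 4),  # required: disease -> examination item
    5: (1, 5),  # available: disease -> treatment plan
    6: (2, 2),  # accompanying: symptom -> symptom
}


def pack_relation_id(source_type, relation_type, target_type):
    """Shift each field into place and merge with bitwise OR."""
    return (
        (source_type << (REL_BITS + TGT_BITS))
        | (relation_type << TGT_BITS)
        | target_type
    )


def unpack_relation_id(relation_id):
    """Return (source_type, relation_type, target_type)."""
    target_type = relation_id & ((1 << TGT_BITS) - 1)
    relation_type = (relation_id >> TGT_BITS) & ((1 << REL_BITS) - 1)
    source_type = (relation_id >> (REL_BITS + TGT_BITS)) & ((1 << SRC_BITS) - 1)
    return source_type, relation_type, target_type


def matches_schema(relation_id):
    """Check that the embedded end types agree with the relation type."""
    source_type, relation_type, target_type = unpack_relation_id(relation_id)
    return RELATION_SCHEMA.get(relation_type) == (source_type, target_type)
```

Under these assumed widths, a recommended medication relation from a disease to a drug packs to `0x1023`; the same relation type with a symptom as its source fails `matches_schema` and could be filtered out before any graph convolution is run.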

[0025] After assigning identifiers to all nodes and relationships, key-value pairs of node names and their graph location identifiers are stored in an identifier mapping table. Similarly, key-value pairs of relationship names and their graph location identifiers are stored in the same or separate identifier mapping tables. Identifier mapping tables typically use a hash table structure, allowing for quick lookup of the corresponding identifier by name. Furthermore, an adjacency table is constructed based on the connections between nodes. An adjacency table is a graph storage structure that, for each node identifier, records the identifiers of all its neighboring nodes and the relationship identifiers connecting the two nodes.

[0026] Iterate through all extracted relations, using the source node identifier as the key, and add (target node identifier, relation identifier) as an adjacency item to the list corresponding to that key. Since relations in a medical knowledge graph are usually directed (e.g., medication recommendations point from disease to drug), reverse adjacency items can also be added to support bidirectional queries. Finally, the identifier mapping table and adjacency list are encapsulated into a medical knowledge graph and loaded into memory for real-time access during subsequent dual positional encoding and graph convolution operations. The knowledge graph not only contains semantic information about entities and relations but also embeds type and hierarchical knowledge through hierarchical combination of identifiers, providing a structured numerical foundation for subsequent sine / cosine embedding and path querying.
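
The adjacency construction and encapsulation step can be sketched as below. The class `MedicalKG`, the helper names, and the small integer identifiers in the usage note are hypothetical. The specification stores (target node identifier, relation identifier) pairs; the extra boolean marking a reverse item is an assumption added here so that forward and reverse entries remain distinguishable.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MedicalKG:
    """Identifier mapping tables plus adjacency list, bundled together."""
    node_ids: dict      # node name -> node graph position identifier
    relation_ids: dict  # relation name -> relation graph position identifier
    adjacency: dict     # node id -> list of (neighbor id, relation id, is_reverse)


def build_adjacency(edges, add_reverse=True):
    """Build an adjacency list from (source_id, relation_id, target_id) edges."""
    adjacency = defaultdict(list)
    for source_id, relation_id, target_id in edges:
        adjacency[source_id].append((target_id, relation_id, False))
        if add_reverse:
            # Reverse item so the directed edge can also be queried from the target.
            adjacency[target_id].append((source_id, relation_id, True))
    return dict(adjacency)


def build_medical_kg(node_ids, relation_ids, named_edges, add_reverse=True):
    """Translate named edges to identifiers and encapsulate the graph."""
    edges = [
        (node_ids[source], relation_ids[relation], node_ids[target])
        for source, relation, target in named_edges
    ]
    return MedicalKG(node_ids, relation_ids, build_adjacency(edges, add_reverse))
```

As a usage example with placeholder identifiers, `build_medical_kg({"cold": 101, "fever": 201}, {"manifestation": 1}, [("cold", "manifestation", "fever")])` produces an adjacency list in which node 101 points to `(201, 1, False)` and node 201 holds the reverse item `(101, 1, True)`.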

[0027] By assigning nodes a hierarchical combination of identifiers containing type code, hierarchy code, domain sequence number, and check bits, the identifiers themselves carry rich semantic information. Type codes allow the model to directly perceive whether a node represents a disease, symptom, or medication, facilitating differentiated processing of different types of nodes during graph convolution aggregation. Hierarchical codes reflect the degree of abstraction to concreteness of a concept, enabling the model to adjust the scope of information aggregation based on the level of abstraction during inference. For example, nodes at the abstract level (such as respiratory diseases) should aggregate broader subclass information, while nodes at the concrete level (such as the common cold) focus more on their own attributes. The introduction of check bits provides a lightweight verification method for data integrity, allowing for rapid verification of whether identifiers have been tampered with or corrupted before performing graph convolution, enhancing the system's robustness. For relation identifiers, the type information of the nodes at both ends of the relation is embedded in the identifier through source and target type codes. This allows for determining the validity of a relation without additional node type queries during processing. For example, a medication recommendation relation requires the source node type to be a disease and the target node type to be a medication. If the type codes in the identifiers do not match, filtering can be performed in advance, reducing invalid computation. Furthermore, the mapping table and adjacency list are encapsulated as independent knowledge graph modules, enabling rapid name-to-identifier conversion and fast retrieval of neighbor information. This provides an efficient data access path for graph position encoding generation in dual positional encoding and neighbor aggregation in graph convolution. 
In summary, not only is a semantically rich knowledge graph constructed, but the hierarchical combination of identifiers lays a structured foundation for the endogenous fusion of large models and knowledge graphs, significantly improving the model's logical controllability and inference efficiency in medical question-answering tasks.
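A hedged sketch of how such identifiers could be packed and checked is given below. The node layout (8-bit type code, 8-bit level code, 40-bit domain sequence number, 8-bit check field) follows the example widths given in paragraph [0035]. Everything else is an assumption made for illustration: the XOR checksum, the 8/16/8 split of the 32-bit relation identifier, the type code 3 for medication, and all function names.

```python
SEQ_MASK = (1 << 40) - 1

def checksum(payload):
    """Assumed check scheme: XOR of all bytes of the payload."""
    value = 0
    while payload:
        value ^= payload & 0xFF
        payload >>= 8
    return value

def pack_node_id(type_code, level_code, seq):
    if not (0 <= type_code < 256 and 0 <= level_code < 256 and 0 <= seq <= SEQ_MASK):
        raise ValueError("component out of range")
    payload = (type_code << 48) | (level_code << 40) | seq  # 56-bit payload
    return (payload << 8) | checksum(payload)               # check field in the low 8 bits

def is_intact(node_id):
    """Recompute the check field and compare it with the stored one."""
    return checksum(node_id >> 8) == (node_id & 0xFF)

def pack_relation_id(source_type, relation_type, target_type):
    # Assumed layout: 8-bit source type | 16-bit relation type | 8-bit target type.
    return (source_type << 24) | (relation_type << 8) | target_type

def relation_is_valid(relation_id, source_node_id, target_node_id):
    """Compare the type codes embedded in the relation with those of both end nodes."""
    return ((relation_id >> 24) == (source_node_id >> 56)
            and (relation_id & 0xFF) == (target_node_id >> 56))

disease = pack_node_id(1, 3, 7)      # type 1 = disease
drug = pack_node_id(3, 3, 9)         # type 3 = medication (hypothetical)
recommends = pack_relation_id(1, 5, 3)
```

Under these assumptions `relation_is_valid(recommends, disease, drug)` holds while the reversed call does not, so a mistyped edge can be filtered without looking up either node in a separate table.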

[0028] Based on a pre-constructed medical knowledge graph, each element in the input sequence is subjected to dual positional encoding to determine the element positional code corresponding to each element. Specifically, this includes: obtaining the sequence length of the input sequence; assigning a unique position index to each element in the input sequence according to its order of appearance in the sequence, where the position index is an integer from 1 to the sequence length; determining the sequence positional code corresponding to each element from its position index; obtaining the identifier mapping table of the medical knowledge graph, where the identifier mapping table includes a node identifier mapping table and a relation identifier mapping table; querying whether each element in the input sequence matches a node name or relation name in the mapping table; if it matches, obtaining the corresponding node graph position identifier or relation graph position identifier and using it as the graph positional code of the element; if it does not match, setting the graph positional code corresponding to the element to 0.

[0029] In one embodiment of this specification, the length of the input sequence is first obtained. The length refers to the total number of characters or word segments in the input sequence; for example, the sequence after concatenation and delimiter insertion contains several basic elements. Each element in the input sequence is assigned a unique position index according to its left-to-right order of appearance in the sequence. This position index is an integer starting from 1 and increasing to the sequence length; for example, the first element's index is 1, the second element's index is 2, and so on. Based on the position index of each element, a sequence position code corresponding to that element is generated. The sequence position code provides the model with the sequential relationship between elements, enabling the self-attention mechanism to distinguish between the different semantics of "cold" preceding "fever" and the reverse order. In actual implementation, the sequence position code typically uses sine and cosine functions to generate fixed-dimensional embedding vectors, requiring no additional training parameters.

[0030] Simultaneously, the pre-constructed identifier mapping table in the medical knowledge graph is retrieved. This identifier mapping table consists of two parts: a node identifier mapping table and a relationship identifier mapping table. The node identifier mapping table stores the mapping relationship between each node name and its corresponding hierarchical composite node identifier in key-value pairs. Node names are, for example, "cold" or "fever," and node identifiers are integers obtained by packaging type code, level code, domain sequence number, and check digit. Similarly, the relationship identifier mapping table stores the mapping relationship between each relationship name and its corresponding hierarchical composite relationship identifier. Relationship names are, for example, "recommended medication," etc. Each element in the input sequence is traversed. For the current element, it is first checked whether the string content of the element completely matches any node name in the node identifier mapping table. If it matches, the node graph position identifier corresponding to the node name is directly obtained as the graph position code of the element. If it does not match, it continues to check whether the element matches any relationship name in the relationship identifier mapping table. If it matches, the corresponding relationship graph position identifier is obtained as the graph position code. It should be noted that the elements in the input sequence include words obtained by expanding the head entity, relation, and tail entity in the triple, as well as ordinary words in the original medical question text. Only elements that are completely identical to the node or relation name in the knowledge graph can be successfully matched.

[0031] For elements that match successfully, their graph position code is set to the integer value obtained from the query. For elements that do not match, such as ordinary verbs, adjectives, punctuation marks, or separators, their graph position code is set to the value 0. 0 is a special identifier indicating that the element has no corresponding node or relation in the knowledge graph. Through this process, each element in the input sequence obtains two encoded values: one is the sequence position code derived from its position index, and the other is the graph position code determined by the knowledge graph matching result, which is a non-zero integer or 0. These two codes will be converted into embedding vectors using sine / cosine functions in subsequent steps and concatenated with the text embedding vector, enabling the model to simultaneously perceive the sequential position of elements and their underlying semantic associations within the knowledge graph.
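The two codes assigned to each element can be sketched as below. This is an illustrative sketch only: the function name, the word-level segmentation, and the small identifiers 8, 2, and 6 (the toy numbers the Figure 2 description uses for cold, symptom, and fever) are assumptions; real identifiers would be the packed integers from the mapping tables.

```python
def dual_position_codes(elements, node_ids, relation_ids):
    """Return (sequence codes, graph codes) for a list of string elements."""
    # Position index runs from 1 to the sequence length.
    sequence_codes = list(range(1, len(elements) + 1))
    graph_codes = []
    for element in elements:
        if element in node_ids:            # exact match against node names first
            graph_codes.append(node_ids[element])
        elif element in relation_ids:      # then against relation names
            graph_codes.append(relation_ids[element])
        else:                              # no counterpart in the knowledge graph
            graph_codes.append(0)
    return sequence_codes, graph_codes

# Toy mapping tables, for illustration only.
node_ids = {"cold": 8, "fever": 6}
relation_ids = {"symptom": 2}

question = ["does", "cold", "cause", "fever", "?"]
seq_codes, graph_codes = dual_position_codes(question, node_ids, relation_ids)
# seq_codes   -> [1, 2, 3, 4, 5]
# graph_codes -> [0, 8, 0, 6, 0]
```

Only exact string matches receive a non-zero graph code, which mirrors the requirement that an element be completely identical to a node or relation name.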

[0032] Conventional Transformer models use only sequence position encoding, which tells the model the order in which elements appear but cannot indicate whether an element corresponds to a specific entity or relation in the knowledge graph. This causes the model to treat ordinary words and medical terms equally when processing medical questions, losing structured knowledge association information. This technical solution introduces graph position encoding in addition to sequence position encoding, explicitly marking whether each element matches a node or relation in the knowledge graph, and which specific node or relation it matches. When the graph convolutional network expert in the subsequent graph hybrid expert architecture receives this information, it can precisely identify, based on the non-zero positions of the graph position encoding, which positions need to be replaced with graph convolutional embeddings and which positions need to be cleared, thus achieving alignment between text semantics and graph structural information. The model no longer needs to learn from scratch that there might be a relationship between "cold" and "fever"; it learns directly from the graph position encoding that these two words have corresponding node identifiers in the knowledge graph, and then uses graph experts to aggregate their neighbor information in the graph. Furthermore, encoding the graph positions of unmatched elements as 0 ensures that graph experts only process elements with genuine graph correspondences, avoiding noise interference and improving computational efficiency. Compared to the simple concatenation after retrieval used in retrieval-augmented generation (RAG) methods, embedding the knowledge graph's identifier information directly into the model's input representation allows knowledge constraints to permeate the entire forward propagation process, rather than being concatenated only at the input layer. 
Therefore, the model can perceive which locations carry structured knowledge in each layer of self-attention and expert hybrid computation, thereby generating answers that better conform to logical constraints.

[0033] Determining the corresponding first combined representation specifically includes: inputting each element in the input sequence into a pre-trained word embedding matrix, and converting it into a text embedding vector of a preset fixed dimension through a table lookup; inputting the sequence position code of each element into a preset first sine position encoding function to generate a sequence position encoding embedding vector of the same dimension as the text embedding vector; generating a graph position encoding embedding vector of the same dimension as the text embedding vector through a preset second sine position encoding function and the graph position code of each element; concatenating the text embedding vector, the sequence position encoding embedding vector, and the graph position encoding embedding vector along the vector dimension to obtain a combined vector for each element; and stacking the combined vectors of all elements in the input sequence in order to obtain the first combined representation. The graph position encoding embedding vector is generated from the preset second sine position encoding function and the graph position code of each element as follows. 
Specifically, this includes: obtaining each element whose graph position encoding is non-zero; unpacking the graph position encoding into multiple components, where the node graph position identifier is unpacked into four components: type code, level code, domain sequence number, and check bit; and the relation graph position identifier is unpacked into three components: source type code, relation type code, and target type code; inputting each component into the second sine position encoding function to convert it into a component embedding vector of the same dimension as the text embedding vector; concatenating multiple component embedding vectors in the unpacking order to obtain a component concatenated vector; compressing the component concatenated vector into a vector of the same dimension as the text embedding vector through a preset learnable linear projection layer to determine the graph position encoding embedding vector of the element; and setting the graph position encoding embedding vector corresponding to the element whose graph position encoding is 0 to a zero vector of the same dimension as the text embedding vector.

[0034] In one embodiment of this specification, each element in the input sequence is obtained. An element can be a single character, a word segmentation unit, or a subword, depending on the word segmentation strategy. Each element is input to a pre-trained word embedding matrix, which is a two-dimensional lookup table with the number of rows equal to the vocabulary size and the number of columns equal to a preset fixed dimension (denoted as d). This matrix is obtained through pre-training on a large-scale general corpus or fine-tuned in the current task. Using the element's index as the lookup key, the corresponding row is searched in the word embedding matrix, and the d-dimensional floating-point vector of that row is extracted as the text embedding vector for that element. This vector captures the semantic information of the element. Simultaneously, the sequence position encoding of each element is input to a preset first sine position encoding function. This function transforms the integer index using sine and cosine formulas to generate a floating-point vector of the same dimension d as the text embedding vector, called the sequence position encoding embedding vector. This vector represents the order relationship of the elements in the sequence.

[0035] For graph position encoding, the first step is to determine whether the graph position code of an element is 0. If it is 0, the element has no corresponding node or relation in the knowledge graph, and the processor directly generates a zero vector of dimension d as the graph position encoding embedding vector for the element. If the graph position code is a non-zero integer, that integer is a packed hierarchical combination identifier, which must be unpacked first. Specifically, for the node graph position identifier (a 64-bit integer formed by concatenating the type code, level code, domain sequence number, and check bits bitwise), four independent components are extracted through bit masking and right-shift operations: type code (e.g., occupying the high 8 bits, with values from 1 to 5 representing diseases, symptoms, etc.), level code (occupying the next 8 bits, representing the level of abstraction), domain sequence number (occupying the middle 40 bits, a unique sequence number within the same type and level), and check bits (occupying the low 8 bits, used for verification). For the relation graph position identifier (a 32-bit integer composed of the source type code, relation type code, and target type code), three components are extracted: source type code, relation type code, and target type code. Each component is then treated as an independent integer and input into the second sine position encoding function. The second sine position encoding function has the same form as the first sine position encoding function (both are based on sine / cosine transformations), so the two can share the same function implementation, differing only in their input values. For each component value v, a d-dimensional component embedding vector is generated. Specifically, for the dimension index j, which ranges from 0 to d-1, one branch of the sine / cosine formula is applied when j is even and the other when j is odd, with BASE being the preset base. 
Thus, the four components of the node identifier result in four d-dimensional vectors, and the three components of the relation identifier result in three d-dimensional vectors. These component embedding vectors are concatenated along the vector dimension in the order of unpacking (e.g., type code, level code, domain sequence number, check bit order), resulting in a concatenated vector with a dimension of (number of components × d). Since the dimension of this vector is usually greater than d, and the subsequent self-attention module requires a uniform input dimension of d, this concatenated vector is input to a pre-set learnable linear projection layer. This linear projection layer is a fully connected network with a weight matrix dimension of (d, number of components × d), which learns through training how to compress the high-dimensional concatenated vector back to the target dimension d. After linear projection, the final d-dimensional vector is obtained, serving as the graph position encoding embedding vector for that element.
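The unpack, embed, and project pipeline for a node identifier can be sketched in plain Python as follows. The exact even/odd formulas are not legible in this text, so the sketch assumes the standard Transformer sinusoid (sine on even j, cosine on odd j, each pair sharing one frequency) with BASE = 10000. The randomly initialised projection stands in for the learnable layer, and all names are hypothetical.

```python
import math
import random

def unpack_node_id(node_id):
    """Split a 64-bit node identifier into type, level, sequence number, check field."""
    return [node_id >> 56,
            (node_id >> 48) & 0xFF,
            (node_id >> 8) & ((1 << 40) - 1),
            node_id & 0xFF]

def sinusoid(value, d, base=10000.0):
    """Assumed form: sin(v / base^(2i/d)) at j = 2i, cos(v / base^(2i/d)) at j = 2i + 1."""
    vec = []
    for j in range(d):
        angle = value / base ** ((j - j % 2) / d)
        vec.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
    return vec

def graph_position_embedding(node_id, projection, d):
    """projection: weight matrix of shape (d, 4 * d), given as a list of rows."""
    if node_id == 0:
        return [0.0] * d                      # unmatched element: zero vector
    concatenated = []
    for component in unpack_node_id(node_id):  # concatenate in unpacking order
        concatenated.extend(sinusoid(component, d))
    return [sum(w * x for w, x in zip(row, concatenated)) for row in projection]

d = 4
rng = random.Random(0)
projection = [[rng.uniform(-0.1, 0.1) for _ in range(4 * d)] for _ in range(d)]
node_id = (1 << 56) | (2 << 48) | (12345 << 8) | 0x5A   # toy identifier
embedding = graph_position_embedding(node_id, projection, d)
```

A relation identifier would follow the same path with three components and a projection of shape (d, 3 × d).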

[0036] Each element corresponds to three d-dimensional vectors: a text embedding vector, a sequence position encoding embedding vector, and a knowledge graph position encoding embedding vector. These three vectors are concatenated along the last dimension to obtain a combined vector for each element, which has a dimension of 3d. For all elements in the input sequence, the combined vectors of each element are stacked sequentially according to their order in the original sequence to form a two-dimensional tensor of (sequence length L, 3d). This two-dimensional tensor is the first combined representation, which simultaneously contains the semantics, positional order, and knowledge graph structured identifier of each element.

[0037] The above technical solution unpacks the packaged hierarchical combined identifier into multiple semantic components, and then fuses each component independently using sine / cosine positional encoding. Different components (such as type codes and hierarchy codes) have independent frequency features in the embedding space. The model can focus on the category attributes or abstract hierarchy attributes of nodes through attention mechanisms, without having to compress all information into a single encoding. For example, the component embedding of type codes can help the model quickly distinguish between disease nodes and symptom nodes, while the component embedding of hierarchy codes allows the model to perceive the coarseness and fineness of concepts, thus rationally selecting the neighbor range during graph convolution aggregation. In addition, the use of sine / cosine encoding instead of learnable embedding layers allows the relative numerical relationships between different components (such as the difference between type codes 1 and 2) to be mapped to meaningful distances through periodic functions, whereas learnable embeddings may ignore this numerical order. Finally, a learnable linear projection layer compresses the concatenated high-dimensional vector back to the same dimension as the text embedding, preserving the independent information of each component while ensuring compatibility with other parts of the model.

[0038] Step S103: Input the first combined representation into the pre-trained graph hybrid expert architecture large model, generate graph expert output representation and feedforward network expert output representation through the graph hybrid expert architecture large model, and concatenate the graph expert output representation and the feedforward network expert output representation to obtain the model concatenated representation to generate the medical answer sequence.

[0039] The generation of graph expert output representations and feedforward network expert output representations through the graph hybrid expert architecture large model specifically includes: inputting the first combined representation into the self-attention module of the graph hybrid expert architecture large model to obtain a second combined representation; inputting the second combined representation into the routing module of the graph hybrid expert architecture large model, through which the routing module calculates the selection weight of each expert, and activates at least one feedforward network expert and at least one graph convolutional network expert based on the selection weight; determining the initial node features of each activated graph convolutional network expert using the subset of non-zero positions corresponding to the non-zero positions of the graph position encoding in the second combined representation; performing graph convolution operation on each activated graph convolutional network expert, aggregating neighbor node information according to the neighbor node relationship in the medical knowledge graph, and setting the subset of the second combined representation corresponding to the zero position of the graph position encoding vector to a zero vector to obtain the graph expert output representation; inputting the second combined representation into each activated feedforward network expert, performing feedforward calculation to obtain the feedforward network expert output representation.

[0040] The routing module calculates the selection weight for each expert, specifically including: determining the input of the routing module using the second combined representation, wherein the routing module includes a linear transformation layer and a Softmax output layer; performing global average pooling on the second combined representation to determine a global feature vector; inputting the global feature vector into the linear transformation layer, and obtaining through linear mapping an original score vector whose length equals the number of experts, wherein each original score corresponds to a feedforward network expert or a graph convolutional network expert; inputting the original score vector into the Softmax output layer to calculate the activation probability of each expert, and determining the selection weight of the expert based on the activation probability; comparing each selection weight with a preset activation threshold, and activating all experts whose selection weights are greater than the activation threshold; when none of the selection weights is greater than the activation threshold, determining the activated feedforward network expert based on the maximum selection weight among the feedforward network experts, and determining the activated graph convolutional network expert based on the maximum selection weight among the graph convolutional network experts.

[0041] In one embodiment of this specification, the previously generated first combined representation is input into a pre-trained graph hybrid expert architecture large model. The graph hybrid expert architecture large model improves upon the standard Transformer structure by introducing a hybrid expert mechanism at the feedforward network position of each layer. The experts are divided into feedforward network experts (i.e., traditional fully connected feedforward networks, used for nonlinear transformation of text semantics) and graph convolutional network experts (i.e., graph convolutional networks, used to aggregate neighbor node information in the knowledge graph).

[0042] Specifically, the first combined representation is a tensor of shape (sequence length L, 3d), where each position corresponds to an element in the input sequence, and the representation of each element includes three parts: text embedding, sequence position encoding embedding, and graph position encoding embedding. This tensor is input into a self-attention module, which contains multiple self-attention heads. Each head captures the dependency between any two elements in the sequence through linear transformations of queries, keys, and values, and the calculation of attention weights, outputting a second combined representation of shape (L, 3d). The vector at each position in the second combined representation has already incorporated the contextual information of the entire sequence.

[0043] The second combined representation is then input into the routing module. The routing module is a learnable gating network that dynamically determines which experts should process the current input. The routing module consists of two main parts: a linear transformation layer and a softmax output layer. To calculate the selection weights for each expert, the second combined representation is first subjected to global average pooling, i.e., the average value of each feature dimension is calculated along the sequence length dimension to obtain a global feature vector. This 3d-dimensional vector represents the global semantic information of the entire sequence. This global feature vector is then input into the linear transformation layer, whose weight matrix has the shape (total number of experts, 3d). Matrix multiplication maps the 3d-dimensional vector to a raw score vector of the same length as the total number of experts, where each component corresponds to the raw score of one expert. This raw score vector is then input into the softmax output layer. The softmax function converts each raw score into a probability value, such that the sum of the probabilities of all experts is 1. These probability values are the selection weights for each expert.

[0044] To determine which experts are activated, the selection weight of each expert is compared to a preset activation threshold (e.g., a hyperparameter between 0 and 1), and all experts whose weights are greater than this threshold are selected as activated experts. This dynamic activation allows the model to adaptively decide which experts to use based on the input content, rather than using all experts at once. However, there is a boundary case in which no expert's selection weight is greater than the activation threshold, so that no expert is activated and the model cannot continue computation. To avoid this situation, a safety net strategy is adopted: the feedforward network expert with the largest selection weight is selected as the activated feedforward network expert; simultaneously, the graph convolutional network expert with the largest selection weight is selected as the activated graph convolutional network expert. This ensures that, even in this case, at least one feedforward network expert and one graph convolutional network expert are activated at each layer.
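The routing computation (global average pooling, linear scoring, softmax, thresholding, and the safety net) can be sketched as below. The function name `route`, the convention that the first `num_ffn` experts are feedforward experts, and the toy weights are assumptions for illustration; the linear layer is shown without a bias term because the text mentions only a weight matrix.

```python
import math

def route(second_repr, weight, num_ffn, threshold):
    """second_repr: L rows of width D; weight: one row of width D per expert.
    Experts [0, num_ffn) are FFN experts, the remaining ones are GCN experts."""
    length, dim = len(second_repr), len(second_repr[0])
    # Global average pooling along the sequence dimension.
    pooled = [sum(row[k] for row in second_repr) / length for k in range(dim)]
    scores = [sum(w * x for w, x in zip(w_row, pooled)) for w_row in weight]
    # Numerically stable softmax over the raw scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    active = [i for i, p in enumerate(probs) if p > threshold]
    if not active:
        # Safety net: best FFN expert plus best GCN expert.
        best_ffn = max(range(num_ffn), key=probs.__getitem__)
        best_gcn = max(range(num_ffn, len(probs)), key=probs.__getitem__)
        active = [best_ffn, best_gcn]
    return probs, active

second_repr = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 1.0], [0.0, 0.0], [2.0, 2.0], [0.0, 0.0]]  # 2 FFN + 2 GCN experts
probs, active = route(second_repr, weight, num_ffn=2, threshold=0.5)
```

In this toy run only expert 2 (a GCN expert) clears the 0.5 threshold, whereas a threshold of 0.9 triggers the safety net and returns one expert of each kind. As described, the safety net fires only when no expert clears the threshold; guaranteeing one expert of each kind in every case would need an additional per-group check, which is not shown here.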

[0045] Next, the activated graph convolutional network experts and the activated feedforward network experts are processed separately. For each activated graph convolutional network expert, a subset corresponding to non-zero positions of the graph position encoding is first extracted from the second combined representation. Specifically, since the first combined representation of each element contains a graph position encoding embedding vector, and whether the embedding vector is all zero reflects whether the element matches a node or relation in the knowledge graph, the non-zero values of the previously saved graph position encoding can be used to determine which positions belong to the graph position. Vectors at these positions are selected from the second combined representation as the initial node features of the graph convolutional network expert. Each initial node feature corresponds to a node or relation in the knowledge graph, and since the second combined representation has already incorporated contextual information, these features already carry textual semantics. Then, a graph convolution operation is performed on the graph convolutional network expert. Based on the adjacency list pre-stored in the medical knowledge graph, for each initial node feature, the features of all its neighboring nodes are aggregated. Neighboring nodes include nodes directly connected by relations. The aggregation method can be summation, averaging, or attention weighting. After aggregation, a linear transformation is performed using a learnable weight matrix, followed by a non-linear activation function to obtain the updated node features. It's important to note that the graph convolution operation only applies to nodes corresponding to non-zero positions in the graph location encoding; positions with zero graph location encoding (i.e., ordinary text words) do not participate in graph convolution. 
Simultaneously, the subset of positions in the second combined representation corresponding to zero graph location encoding is directly set to all-zero vectors to eliminate the influence of these positions on the graph expert output. Finally, all positions (including node positions updated by graph convolution and non-node positions set to zero) together constitute the graph expert output representation, with the same shape (L, 3d) as the second combined representation.
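A minimal sketch of one graph convolutional network expert follows. Several choices are assumptions rather than details from the text: mean aggregation over the node itself plus its neighbours, ReLU as the activation, and the restriction that only neighbours that also appear in the input sequence contribute features (the text does not say where features of absent neighbours would come from). If an identifier occurs at several positions, the last occurrence supplies the neighbour feature.

```python
def graph_expert(second_repr, graph_codes, adjacency, weight):
    """second_repr: L rows of width D; graph_codes: L integers (0 = no match);
    adjacency: id -> list of (neighbour_id, relation_id); weight: D x D matrix."""
    dim = len(second_repr[0])
    # Initial node features taken from positions with a non-zero graph code.
    features = {code: row for code, row in zip(graph_codes, second_repr) if code != 0}
    output = []
    for code, row in zip(graph_codes, second_repr):
        if code == 0:
            output.append([0.0] * dim)         # ordinary text position: zero vector
            continue
        neighbours = [features[n] for n, _ in adjacency.get(code, []) if n in features]
        group = [row] + neighbours             # self plus available neighbours
        mean = [sum(v[k] for v in group) / len(group) for k in range(dim)]
        transformed = [sum(w * x for w, x in zip(w_row, mean)) for w_row in weight]
        output.append([max(0.0, x) for x in transformed])   # ReLU (assumed)
    return output

second_repr = [[1.0, 0.0], [5.0, 5.0], [0.0, 1.0]]
graph_codes = [8, 0, 6]                        # toy codes: cold, plain word, fever
adjacency = {8: [(6, 2)], 6: [(8, 2)]}
identity = [[1.0, 0.0], [0.0, 1.0]]
graph_out = graph_expert(second_repr, graph_codes, adjacency, identity)
# graph_out -> [[0.5, 0.5], [0.0, 0.0], [0.5, 0.5]]
```

The output keeps the input shape: matched positions carry aggregated features and every other position is zero.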

[0046] For each activated feedforward network expert, the second combined representation is directly input into the feedforward network expert, which is a two-layer fully connected network (the intermediate dimension is usually greater than 3d, and the output dimension is restored to 3d). Standard feedforward computation is performed, i.e., linear transformation, activation function, and then linear transformation again, to obtain the feedforward network expert output representation, which also has the shape (L, 3d). When multiple feedforward network experts are activated, the output of each expert is calculated separately, and then averaged or weighted summed according to element position to obtain a fused feedforward output. Similarly, the outputs of multiple graph convolutional network experts are also fused.
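The feedforward expert and the fusion of several activated experts can be sketched as follows. ReLU as the activation, the bias terms, and the renormalisation of selection weights over the activated experts are assumptions; the text only states that outputs are averaged or weighted-summed per position.

```python
def linear(x, weight, bias):
    """Apply one fully connected layer to a single vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def ffn_expert(second_repr, w1, b1, w2, b2):
    """Two-layer feedforward network applied independently at every position."""
    output = []
    for row in second_repr:
        hidden = [max(0.0, h) for h in linear(row, w1, b1)]  # ReLU (assumed)
        output.append(linear(hidden, w2, b2))
    return output

def fuse(outputs, weights):
    """Weighted sum of expert outputs, position by position."""
    total = sum(weights)                       # renormalise over active experts
    length, dim = len(outputs[0]), len(outputs[0][0])
    return [[sum(w / total * out[i][k] for w, out in zip(weights, outputs))
             for k in range(dim)] for i in range(length)]

# Toy parameters: width 2 in, hidden width 3, width 2 out.
w1, b1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
w2, b2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]], [0.5, 0.5]
ffn_out = ffn_expert([[1.0, -2.0]], w1, b1, w2, b2)
fused = fuse([[[2.0, 4.0]], [[4.0, 8.0]]], [0.25, 0.75])
# ffn_out -> [[1.5, 0.5]], fused -> [[3.5, 7.0]]
```

Passing equal weights to `fuse` gives the plain average mentioned in the text.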

[0047] After obtaining the graph expert output representation and the feedforward network expert output representation, they are concatenated in the last dimension to obtain a concatenated representation of shape (L, 6d). This concatenated representation simultaneously incorporates the structured knowledge features captured by the graph convolutional network and the textual semantic features captured by the traditional feedforward network. To match the dimensionality requirements of subsequent layers or the output module, this concatenated representation is input to a fully connected layer (linear transformation layer) with a weight matrix of shape (3d, 6d). Matrix multiplication maps the 6d-dimensional vector back to 3d dimensions, yielding the model concatenated representation. This model concatenated representation has the same dimension (L, 3d) as the first combined representation and can be used as input to the next layer of the graph hybrid expert architecture or as a hidden representation before the final output. The model concatenated representation output from the last layer is used as the final hidden representation and then input to the output mapping layer. The output mapping layer contains a linear transformation that maps the 3d-dimensional vector at each position to a vocabulary-sized logits vector, which is then transformed into a probability distribution for the next character using a Softmax function. An autoregressive approach is used to generate medical answer sequences character by character. At each step, based on the currently generated partial answers and the original input, the probability distribution of the next character is calculated. The character with the highest probability is sampled or selected as the newly generated character and appended to the answer sequence. This process is repeated until a termination character is generated. 
During the generation process, when the generated content belongs to the graph reasoning path (e.g., symbols or node names representing reasoning chains), its graph position code is kept non-zero during encoding, allowing graph experts to intervene in subsequent generation. When the generated content belongs to the plain text description (e.g., dietary advice, registration methods), the processor sets its graph position code to zero, so that graph experts no longer constrain this part, thus balancing logical rigor and linguistic fluency.
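The concatenate-and-project step that turns the two expert outputs into the model concatenated representation can be sketched as below. The function name and the toy averaging matrix are hypothetical; in the patent the projection is a learned weight matrix of shape (3d, 6d), and here width 2 stands in for 3d.

```python
def merge_expert_outputs(graph_out, ffn_out, projection):
    """graph_out, ffn_out: L rows of width D each; projection: D rows of width 2 * D."""
    merged = []
    for g_row, f_row in zip(graph_out, ffn_out):
        concatenated = g_row + f_row           # concatenate along the last dimension
        merged.append([sum(w * x for w, x in zip(row, concatenated))
                       for row in projection])
    return merged

graph_out = [[2.0, 4.0], [0.0, 0.0]]           # second position is plain text (zeroed)
ffn_out = [[6.0, 8.0], [1.0, 3.0]]
# Toy projection that averages matching coordinates of the two halves.
projection = [[0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5]]
merged = merge_expert_outputs(graph_out, ffn_out, projection)
# merged -> [[4.0, 6.0], [0.5, 1.5]]
```

Because the output width equals the input width, the result can be fed to the next layer unchanged.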

[0048] Compared to conventional Transformer models, this approach treats graph convolutional networks (GCNs) as specialized experts operating in parallel with traditional feedforward network (FFN) experts. A routing module dynamically selects which experts to activate, enabling the model to adaptively decide, based on the semantic features of the input content, when to leverage the topological constraints of the knowledge graph (activating the graph expert) and when to focus solely on pure text semantic transformation (activating the FFN expert). The routing module calculates selection weights based on the feature vector after global average pooling and employs a safety net strategy to ensure that at least one FFN expert and one graph expert are activated at each layer. This guarantees the model can handle both pure text and knowledge graph fusion. Compared to RAG techniques, this approach's knowledge fusion occurs within each layer of the model, rather than in a single retrieval and concatenation at the input. The graph expert performs graph convolution aggregation on the node features corresponding to the graph positions at each layer, allowing the structural information of the knowledge graph to repeatedly influence the representation of the entire sequence, achieving true endogenous fusion. Furthermore, zeroing out the positions of the second combined representation whose graph position encoding is zero avoids interference from non-entity words on graph convolution, improving computational efficiency and representation purity. When generating answers, the graph reasoning path parts and the text description parts are distinguished, and a different graph position encoding strategy is adopted for each. This ensures both the interpretability of the reasoning process (the path part is constrained by the graph) and the naturalness of the answer (the description part is freely generated). This significantly improves the controllability and interpretability of large model question answering in norm-sensitive scenarios and reduces the hallucinations caused by the lack of knowledge constraints.

[0049] At least one of the technical solutions adopted in the embodiments of this specification can achieve the following beneficial effects: Compared to conventional retrieval-augmented generation (RAG) techniques, the above-mentioned solution addresses the issues of missing global topology and structural mismatch inherent in RAG methods by embedding the structured logic of the knowledge graph into the input representation and model architecture of the large model. In the RAG framework, the knowledge graph is only used locally as an external retrieval source, and the model can only obtain a limited number of subgraphs directly connected to the question entities, losing the implicit constraints in multi-hop paths. In contrast, this solution explicitly injects the node and relation identifiers of the knowledge graph into the representation of each element at the input end through dual positional encoding. This allows the model to perceive from the outset which words correspond to entities within the graph, thus preserving the semantic anchors of the global graph structure. Furthermore, in the RAG method, the Transformer and graph convolution are structurally isolated, and the retrieved graph information cannot dynamically interact with the model's self-attention computation. This solution, however, sets up graph convolutional network experts within the graph hybrid expert architecture. At each layer, these experts are adaptively activated by the routing module based on the input semantics. This allows entities identified by graph position encodings to continue receiving neighbor aggregation updates from graph convolutions after each layer of self-attention. This ensures that the reasoning process of the large model is always guided by the topological logic of the knowledge graph. Self-attention captures dependencies within the sequence, while graph convolution propagates structured constraints along graph edges. 
At each layer of deep fusion, the ambiguous expressions in the user's question are gradually locked onto clearly defined paths in the graph. Therefore, when generating answers, not only can traceable intermediate results be output along the graph reasoning path, but also, because the graph encodings of non-entity locations are set to zero, the model will not incorrectly associate irrelevant words with the graph structure, thus significantly reducing the probability of hallucinations. In sensitive scenarios such as disease diagnosis, this achieves controllable reasoning behavior and interpretable results.
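
The neighbor aggregation and zeroing described in this paragraph can be illustrated with a minimal sketch. It is not the model of this specification: it assumes plain mean aggregation without learned weights, it only aggregates over neighbors that also occur in the input, and the names `graph_expert`, `hidden`, `graph_codes`, and `adjacency` are hypothetical.

```python
def graph_expert(hidden, graph_codes, adjacency):
    """One simplified graph-convolution step over entity positions.

    hidden:      per-token feature vectors (e.g. the self-attention output)
    graph_codes: per-token graph position code, 0 for non-entity tokens
    adjacency:   dict mapping a graph code to the codes of its neighbours
    """
    dim = len(hidden[0])

    # Initial node features: average the tokens that share a graph code.
    sums, counts = {}, {}
    for vec, code in zip(hidden, graph_codes):
        if code == 0:
            continue
        acc = sums.setdefault(code, [0.0] * dim)
        for j, x in enumerate(vec):
            acc[j] += x
        counts[code] = counts.get(code, 0) + 1
    node_feat = {c: [x / counts[c] for x in s] for c, s in sums.items()}

    # Mean aggregation over the node itself and its neighbours in the input.
    # (Assumption: a real GCN layer would also apply a learned weight matrix.)
    updated = {}
    for code, feat in node_feat.items():
        group = [feat] + [node_feat[n] for n in adjacency.get(code, [])
                          if n in node_feat]
        updated[code] = [sum(col) / len(group) for col in zip(*group)]

    # Entity positions receive the aggregated feature; all others become zero.
    return [list(updated[c]) if c != 0 else [0.0] * dim for c in graph_codes]


hidden = [[1.0, 0.0], [5.0, 5.0], [0.0, 1.0]]
codes = [8, 0, 6]                      # middle token is not in the graph
adjacency = {8: [6], 6: [8]}           # hypothetical edge between nodes 8 and 6
out = graph_expert(hidden, codes, adjacency)
```

In this toy input the middle token carries code 0, so its output is a zero vector regardless of its own features, while the two entity tokens exchange information along the edge between nodes 8 and 6.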

[0050] Figure 2 is a schematic diagram of the dual positional encoding method provided in the embodiments of this specification. As shown in Figure 2, for the input text T, existing encoding methods mainly include text embedding encoding and sequence position encoding. The embodiments of this specification additionally introduce graph position embedding encoding. Assuming that the numbers for cold, symptoms, and fever in the knowledge graph are 8, 2, and 6 respectively, then T... the graph position encoding sequence for "fever with cold symptoms" in the text T is "882266", and the graph position encoding sequence for "fever with cold symptoms" in the medical question text T is "88006600". That is, only the corresponding entities implied in T are encoded; the implied relations and other parts are not. To facilitate subsequent self-attention operations, the graph position encodings are transformed into an equivalent embedding form using sine/cosine encoding or similar methods.
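
The lookup and sine/cosine conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the encoding of Figure 2: it assigns one code per whitespace token (the digit strings in the example may use a finer granularity), it uses the standard Transformer sine/cosine formula, and the names `GRAPH_IDS`, `graph_position_codes`, `sinusoidal`, and `graph_position_embedding` are hypothetical.

```python
import math

# Hypothetical identifier table, using the numbers from the example above.
GRAPH_IDS = {"cold": 8, "symptoms": 2, "fever": 6}


def graph_position_codes(tokens, id_table):
    """Look up each token; tokens that are not in the graph get code 0."""
    return [id_table.get(token, 0) for token in tokens]


def sinusoidal(value, dim):
    """Transformer-style sine/cosine encoding of one integer into `dim` floats."""
    vec = []
    for i in range(dim):
        angle = value / (10000 ** (2 * (i // 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def graph_position_embedding(code, dim):
    """Zero vector for code 0, sinusoidal encoding otherwise."""
    if code == 0:
        return [0.0] * dim
    return sinusoidal(code, dim)


triple = ["cold", "symptoms", "fever"]
question = ["does", "a", "cold", "cause", "fever"]
triple_codes = graph_position_codes(triple, GRAPH_IDS)
question_codes = graph_position_codes(question, GRAPH_IDS)
embeddings = [graph_position_embedding(c, 8) for c in triple_codes + question_codes]
```

The explicit zero-vector branch matters in this sketch: passing code 0 through the sine/cosine formula would yield alternating 0 and 1 values rather than zeros, so non-entity positions would still carry a graph signal.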

[0051] Figure 3 is a schematic diagram of the question-answering process of the graph hybrid expert architecture large model provided in the embodiments of this specification. As shown in Figure 3, the text sequence embedding, text position embedding, and graph position embedding are concatenated to obtain the first combined representation S0. S0 is input into the attention calculation module to obtain the second combined representation S1. Softmax routing is used to select and weight the FFN expert and the GCN graph expert. In the graph expert calculation, the graph position embedding in S1 is replaced with the embedding aggregated by GCN graph convolution, and all other positions are set to zero. The outputs of the FFN expert and the GCN graph expert are concatenated and passed through a fully connected layer whose output dimension is consistent with that of the first combined representation S0. Based on the LLM endogenous fusion model, the input and T... a consistent Prompt format is used to generate graph reasoning path answers and text description answers. When answers are generated token by token based on the Prompt, the newly generated graph reasoning path answer still uses graph position encoding, whereas the graph position encoding of the newly generated text description answer is set to zero. By combining the graph hybrid expert architecture with the sequence-graph hybrid position encoding technique, the structured logic of the knowledge graph is integrated into the endogenous reasoning process of large models. This solves the problems of missing global knowledge topology and unconstrained answer generation in existing RAG methods. It improves the controllability and interpretability of the behavioral decisions of large models in strictly regulated scenarios such as medical diagnosis, so that the model output follows the logical constraints defined by the knowledge graph and fewer hallucinations are generated.
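
The Softmax routing step can be sketched as below, following the pooling, linear mapping, threshold, and fallback recited in claim 8. It is a minimal illustration under assumptions: the linear layer has no bias, ties are broken by taking the first expert, and the names `softmax`, `route`, `weight`, and `expert_kinds` are hypothetical. The case where only experts of one kind exceed the threshold is not handled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden, weight, expert_kinds, threshold):
    """Return (selection weights, indices of activated experts).

    hidden:       per-token feature vectors (the second combined representation)
    weight:       one row of linear-layer weights per expert (hypothetical values)
    expert_kinds: "ffn" or "gcn" for each expert
    """
    dim = len(hidden[0])
    # Global average pooling over the sequence.
    pooled = [sum(vec[j] for vec in hidden) / len(hidden) for j in range(dim)]
    # Linear mapping to one raw score per expert, then Softmax.
    scores = [sum(w * x for w, x in zip(row, pooled)) for row in weight]
    probs = softmax(scores)

    active = [i for i, p in enumerate(probs) if p > threshold]
    if not active:
        # Fallback: the best FFN expert and the best GCN expert.
        for kind in ("ffn", "gcn"):
            idx = [i for i, k in enumerate(expert_kinds) if k == kind]
            if idx:
                active.append(max(idx, key=lambda i: probs[i]))
    return probs, sorted(active)


hidden = [[1.0, 0.0], [1.0, 0.0]]
weight = [[2.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
kinds = ["ffn", "gcn", "gcn"]
probs, active_low = route(hidden, weight, kinds, threshold=0.5)
_, active_high = route(hidden, weight, kinds, threshold=0.9)
```

With the lower threshold only the first expert exceeds it; with the higher threshold no expert does, so the fallback activates one expert of each kind.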

[0052] This specification also provides an embodiment of a knowledge graph-based question-answering device. As shown in Figure 4, the device includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the above-described method.

[0053] This specification also provides a non-volatile computer storage medium storing computer-executable instructions configured to perform the above-described method.

[0054] The various embodiments in this specification are described in a progressive manner. Identical or similar parts of the embodiments can be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and non-volatile computer storage medium embodiments are essentially similar to the method embodiments, so their descriptions are relatively brief; for the relevant parts, refer to the descriptions of the method embodiments.

[0055] The above description is merely one or more embodiments of this specification and is not intended to limit this specification. Various modifications and variations can be made to the one or more embodiments of this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principle of one or more embodiments of this specification should be included within the scope of the claims of this specification.

Claims

1. A knowledge question answering method based on a knowledge graph, characterized in that the method includes:
obtaining medical question text input by a user, extracting at least one triple from the medical question text, and concatenating the triple with the medical question text to determine an input sequence, wherein the triple includes at least one element among a head entity, a relation, and a tail entity;
based on a pre-constructed medical knowledge graph, performing dual position encoding on each element in the input sequence to determine the element position encoding corresponding to each element, thereby determining a corresponding first combined representation, wherein each node and each node relationship in the medical knowledge graph corresponds to a unique graph position identifier, and the first combined representation includes the text embedding vector, the sequence position encoding embedding vector, and the graph position encoding embedding vector of each element;
inputting the first combined representation into a pre-trained graph hybrid expert architecture large model, generating a graph expert output representation and a feedforward network expert output representation through the graph hybrid expert architecture large model, and concatenating the graph expert output representation and the feedforward network expert output representation to obtain a model concatenated representation, which is used to generate a medical answer sequence.

2. The knowledge question answering method based on a knowledge graph according to claim 1, characterized in that concatenating the triple with the medical question text to determine the input sequence specifically includes:
extracting a text triple from the medical question text using a pre-trained large model, wherein the text triple includes a head entity, a directly extracted relation, and a tail entity;
when the directly extracted relation is null, performing a relationship path query in the preset medical knowledge graph based on the head entity and the tail entity to determine at least one relationship path between the head entity and the tail entity;
filling in the null directly extracted relation based on the relationship path to determine the relation element corresponding to the triple, thereby determining the triple;
concatenating the triple with the medical question text in a preset order, and inserting a separator between the end of the triple and the beginning of the medical question text, with the concatenated character sequence taken as the input sequence.

3. The knowledge question answering method based on a knowledge graph according to claim 1, characterized in that, before performing dual position encoding on each element in the input sequence based on the pre-constructed medical knowledge graph to determine the element position encoding corresponding to each element, the method further includes:
extracting multiple nodes and the relationships connecting the nodes from a pre-acquired medical data source, wherein the nodes include at least disease nodes, symptom nodes, drug nodes, examination item nodes, and treatment plan nodes, and the relationships include at least manifestation relationships, recommended medication relationships, contraindicated medication relationships, required relationships, available relationships, and accompanying relationships;
generating a unique hierarchical composite node identifier for each node, wherein the hierarchical composite node identifier is a packed integer obtained by bitwise concatenation of a type code, a level code, a domain sequence number, and a check bit; the type code distinguishes the type of the node, the level code represents the abstraction level of the node in the medical ontology, the domain sequence number is a unique incrementing sequence number under the same type code and level code, and the check bit is generated by hashing the type code, the level code, and the domain sequence number;
generating a unique hierarchical composite relation identifier for each relation, wherein the hierarchical composite relation identifier is a packed integer obtained by bitwise concatenation of a source type code, a relation type code, and a target type code; the source type code and the target type code correspond to the type codes of the nodes at the two ends of the relation, and the relation type code distinguishes the type of the relation;
storing, in an identifier mapping table, the mapping between the node names and node graph position identifiers of all nodes and the mapping between the relation names and relation graph position identifiers of all relations;
constructing an adjacency table based on the connection relationships between nodes, and encapsulating the mapping table and the adjacency table to determine the medical knowledge graph.

4. The knowledge question answering method based on a knowledge graph according to claim 1 or 3, characterized in that performing dual position encoding on each element in the input sequence based on the pre-constructed medical knowledge graph to determine the element position encoding corresponding to each element specifically includes:
obtaining the sequence length of the input sequence, assigning a unique position index to each element in the input sequence according to its order of appearance in the sequence, and determining the sequence position code corresponding to each element, wherein the position index is an integer from 1 to the sequence length;
obtaining the identifier mapping table of the medical knowledge graph, wherein the identifier mapping table includes a node identifier mapping table and a relation identifier mapping table;
querying whether each element in the input sequence matches a node name or a relation name in the mapping table; if a match is found, obtaining the corresponding node graph position identifier or relation graph position identifier and determining the graph position code corresponding to the element;
if there is no match, setting the graph position code corresponding to the element to 0.

5. The knowledge question answering method based on a knowledge graph according to claim 4, characterized in that determining the corresponding first combined representation specifically includes:
inputting each element in the input sequence into a pre-trained word embedding matrix and converting it into a text embedding vector of a preset fixed dimension through a lookup operation;
inputting the sequence position code of each element into a preset first sinusoidal position encoding function to generate a sequence position encoding embedding vector with the same dimension as the text embedding vector;
generating, by means of a preset second sinusoidal position encoding function and the graph position code of each element, a graph position encoding embedding vector with the same dimension as the text embedding vector;
concatenating the text embedding vector, the sequence position encoding embedding vector, and the graph position encoding embedding vector along the vector dimension to obtain a combined vector for each element, and stacking the combined vectors of all elements in the input sequence in order to obtain the first combined representation.

6. The knowledge question answering method based on a knowledge graph according to claim 5, characterized in that generating, by means of the preset second sinusoidal position encoding function and the graph position code of each element, a graph position encoding embedding vector with the same dimension as the text embedding vector specifically includes:
obtaining each element whose graph position code is non-zero and unpacking the graph position code into multiple components, wherein a node graph position identifier is unpacked into four components, namely the type code, the level code, the domain sequence number, and the check bit, and a relation graph position identifier is unpacked into three components, namely the source type code, the relation type code, and the target type code;
inputting each component into the second sinusoidal position encoding function and converting it into a component embedding vector with the same dimension as the text embedding vector;
concatenating the component embedding vectors in unpacking order to obtain a component concatenated vector, and compressing the component concatenated vector, through a preset learnable linear projection layer, into a vector with the same dimension as the text embedding vector to determine the graph position encoding embedding vector of the element;
setting the graph position encoding embedding vector of each element whose graph position code is 0 to a zero vector with the same dimension as the text embedding vector.

7. The knowledge question answering method based on a knowledge graph according to claim 1, characterized in that generating the graph expert output representation and the feedforward network expert output representation through the graph hybrid expert architecture large model specifically includes:
inputting the first combined representation into the self-attention module of the graph hybrid expert architecture large model to obtain a second combined representation;
inputting the second combined representation into the routing module of the graph hybrid expert architecture large model, wherein the routing module calculates a selection weight for each expert so as to activate at least one feedforward network expert and at least one graph convolutional network expert based on the selection weights;
determining the initial node features of each activated graph convolutional network expert from the subset of the second combined representation located at the positions where the graph position encoding is non-zero;
for each activated graph convolutional network expert, performing a graph convolution operation that aggregates neighbor node information based on the neighbor relationships in the medical knowledge graph, and setting the subset of the second combined representation located at the positions where the graph position encoding is zero to zero vectors, to obtain the graph expert output representation;
inputting the second combined representation into each activated feedforward network expert and performing feedforward computation to obtain the feedforward network expert output representation.

8. The knowledge question answering method based on a knowledge graph according to claim 7, characterized in that the routing module calculating the selection weight for each expert specifically includes:
inputting the second combined representation into the routing module, wherein the routing module includes a linear transformation layer and a Softmax output layer;
performing global average pooling on the second combined representation to determine a global feature vector;
inputting the global feature vector into the linear transformation layer and obtaining, through linear mapping, a raw score vector whose length equals the number of experts, wherein each raw score corresponds to one feedforward network expert or one graph convolutional network expert;
inputting the raw score vector into the Softmax output layer to calculate the activation probability of each expert, and determining the selection weight of each expert from its activation probability;
comparing the selection weights with a preset activation threshold and activating all experts whose selection weights are greater than the activation threshold;
when none of the selection weights is greater than the activation threshold, determining the activated feedforward network expert by the maximum selection weight among the feedforward network experts, and determining the activated graph convolutional network expert by the maximum selection weight among the graph convolutional network experts.

9. A knowledge question answering device based on a knowledge graph, characterized in that the device includes:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions that can be executed by the at least one processor to enable the at least one processor to perform the method as described in any one of claims 1-8.

10. A non-volatile computer storage medium storing computer-executable instructions, wherein the computer-executable instructions are configured to perform the method as described in any one of claims 1-8.
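
The hierarchical composite identifiers recited in claim 3 can be illustrated with a minimal bit-packing sketch. The claims do not state field widths or the hash function, so everything concrete here is an assumption: 4-bit type and level codes, a 20-bit domain sequence number, a 4-bit check field taken from a CRC-32, an 8-bit relation type code, and the function names themselves.

```python
import zlib

# Assumed field widths; the claims do not specify them.
TYPE_BITS, LEVEL_BITS, SEQ_BITS, CHECK_BITS, REL_BITS = 4, 4, 20, 4, 8


def _require(name, value, bits):
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")


def _check(type_code, level_code, seq):
    """Assumed check field: low bits of a CRC-32 over the three other fields."""
    payload = f"{type_code}:{level_code}:{seq}".encode("ascii")
    return zlib.crc32(payload) & ((1 << CHECK_BITS) - 1)


def pack_node_id(type_code, level_code, seq):
    _require("type_code", type_code, TYPE_BITS)
    _require("level_code", level_code, LEVEL_BITS)
    _require("seq", seq, SEQ_BITS)
    value = (type_code << LEVEL_BITS) | level_code
    value = (value << SEQ_BITS) | seq
    return (value << CHECK_BITS) | _check(type_code, level_code, seq)


def unpack_node_id(value):
    check = value & ((1 << CHECK_BITS) - 1)
    value >>= CHECK_BITS
    seq = value & ((1 << SEQ_BITS) - 1)
    value >>= SEQ_BITS
    level_code = value & ((1 << LEVEL_BITS) - 1)
    type_code = value >> LEVEL_BITS
    if check != _check(type_code, level_code, seq):
        raise ValueError("check field mismatch")
    return type_code, level_code, seq, check


def pack_relation_id(src_type, rel_type, dst_type):
    _require("src_type", src_type, TYPE_BITS)
    _require("rel_type", rel_type, REL_BITS)
    _require("dst_type", dst_type, TYPE_BITS)
    return (src_type << (REL_BITS + TYPE_BITS)) | (rel_type << TYPE_BITS) | dst_type


def unpack_relation_id(value):
    dst_type = value & ((1 << TYPE_BITS) - 1)
    rel_type = (value >> TYPE_BITS) & ((1 << REL_BITS) - 1)
    src_type = value >> (REL_BITS + TYPE_BITS)
    return src_type, rel_type, dst_type


node_id = pack_node_id(1, 2, 37)       # hypothetical disease node
rel_id = pack_relation_id(1, 3, 2)     # hypothetical disease-to-symptom relation
```

Because claim 4 reserves graph position code 0 for elements that match nothing in the graph, a scheme like this would need type codes to start at 1 so that no packed identifier equals 0. The unpacking functions correspond to the component split that claim 6 performs before encoding each component separately.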