Text acquisition method and device, electronic equipment and storage medium
By using instances, inheritance, and inclusion relationships of concept graphs to determine the target query graph in a knowledge base question-answering system, the accuracy and complexity issues of ranking reasoning problems are solved, and more efficient text retrieval is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING XIAOMI MOBILE SOFTWARE CO LTD
- Filing Date
- 2021-09-30
- Publication Date
- 2026-06-16
AI Technical Summary
Existing knowledge base question answering systems suffer from an explosion of constraints and complex logical operations when solving ranking and reasoning problems. In particular, they cannot correctly parse query text when core entities are missing, leading to a decrease in accuracy.
By extracting core entity and constraint information from the query text, and using instance relationships, inheritance relationships, and inclusion relationships in the concept graph to determine the target association relationships, a target query graph is constructed to obtain the target text corresponding to the query text.
It improves the accuracy of retrieving target text from query text, avoids the problems of an explosion in the number of condition constraints and complex logical operations, and enhances the recall ability of sorting and reasoning problems.
Figure CN115905457B_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates to the field of artificial intelligence technology, and in particular to a text acquisition method and apparatus, an electronic device, and a storage medium.
Background Technology
[0002] In recent years, with the rise of artificial intelligence technology, intelligent question answering systems, as one of the important research directions in the field of artificial intelligence, have received widespread attention. Knowledge Base Question Answering (KBQA) is one of the implementation methods of intelligent question answering systems. KBQA uses a knowledge graph as its knowledge base, and its main task is to translate the user's input query text into a query statement of the knowledge graph in order to obtain the answer from the knowledge graph.
[0003] Ranking and reasoning problems are one of the main types of problems that KBQA aims to solve. Currently, KBQA primarily uses a parsing approach to solve ranking and reasoning problems. In practice, when the distance between the core entity and the answer is long, parsing can lead to an explosion of constraints and complex logical operations, thus requiring constraints to be limited to one hop. However, this approach affects KBQA's ability to solve ranking and reasoning problems. Furthermore, if the query text lacks core entities, KBQA cannot locate the constraints through them, leading to incorrect query text parsing and reduced accuracy in solving ranking and reasoning problems.
Summary of the Invention
[0004] To overcome the problems existing in related technologies, this disclosure provides a text acquisition method, apparatus, electronic device, and storage medium.
[0005] According to a first aspect of the present disclosure, a text acquisition method is provided, the method comprising:
[0006] Extract the core entity from the query text, as well as the constraint information of the query text;
[0007] The target association relationships of the core entities are determined based on the concept graph; the concept graph includes instance relationships, inheritance relationships, and inclusion relationships, wherein the instance relationship is used to characterize that a specified entity among two connected entities in the concept graph is an instance of another entity; the inheritance relationship is used to characterize the hierarchical relationship between two connected entities in the concept graph; the inclusion relationship is used to characterize whether a specified entity among two connected entities in the concept graph belongs to another entity; the target association relationship includes at least one of the target instance relationship, target inheritance relationship, and target inclusion relationship;
[0008] The target query graph is determined based on the core entity, the constraint information, and the target association.
[0009] Based on the target query graph, the target text corresponding to the query text is obtained from the concept graph.
[0010] Optionally, obtaining the core entity from the query text and the constraint information of the query text includes:
[0011] Information is extracted from the query text to obtain the core entity and the constraint information, which includes sorting attributes, sorting rules, and sorting positions.
[0012] Optionally, determining the target query graph based on the core entity, the constraint information, and the target association includes:
[0013] When the target association includes the target instance relationship, sorting constraint rules are determined based on the core entity, the constraint information, and the target instance relationship;
[0014] If the target association relationship includes the target inheritance relationship, the target instance relationship is updated according to the target inheritance relationship, and the sorting constraint rule is determined according to the core entity, the constraint information, and the updated target instance relationship;
[0015] If the target association relationship only includes the target inclusion relationship, the entity that has the target inclusion relationship with the core entity is taken as the new core entity, and the sorting constraint rule is determined according to the new core entity, the constraint information and the target instance relationship;
[0016] The target query graph is determined based on the sorting constraint rules.
[0017] Optionally, if the constraint information further includes supplementary entities and supplementary associations of the supplementary entities, determining the target query graph according to the sorting constraint rules includes:
[0018] Based on the supplementary entities and the supplementary relationships, determine the supplementary constraint rules;
[0019] Multiple query subgraphs are constructed using the sorting constraint rules and the supplementary constraint rules;
[0020] Based on the query text, a target query subgraph is determined from multiple query subgraphs.
[0021] Optionally, determining the target query subgraph from multiple query subgraphs based on the query text includes:
[0022] For each query subgraph, the query text and the query subgraph are input into a pre-trained matching model to obtain the matching degree of the query subgraph;
[0023] The query subgraph with the highest matching degree is taken as the target query subgraph.
[0024] Optionally, obtaining the target text corresponding to the query text from the concept graph based on the target query graph includes:
[0025] Generate a query statement based on the target query subgraph;
[0026] The target text is obtained by querying the concept graph using the query statement.
[0027] Optionally, the query text is a question text, which is the text corresponding to the question entered by the user, and the target text is an answer text, which is the text corresponding to the answer to the question obtained from the concept graph.
[0028] According to a second aspect of the present disclosure, a text acquisition apparatus is provided, the apparatus comprising:
[0029] The acquisition module is configured to acquire the core entity and the constraint information of the query text from the query text;
[0030] The determination module is configured to determine the target association relationships of the core entities based on a concept graph; the concept graph includes instance relationships, inheritance relationships, and inclusion relationships, wherein the instance relationship is used to characterize that a specified entity among two connected entities in the concept graph is an instance of another entity; the inheritance relationship is used to characterize the hierarchical relationship between two connected entities in the concept graph; the inclusion relationship is used to characterize whether a specified entity among two connected entities in the concept graph belongs to another entity; the target association relationship includes at least one of target instance relationship, target inheritance relationship, and target inclusion relationship;
[0031] The determining module is further configured to determine a target query graph based on the core entity, the constraint information, and the target association relationship;
[0032] The query module is configured to retrieve the target text corresponding to the query text from the concept graph based on the target query graph.
[0033] Optionally, the acquisition module is configured as follows:
[0034] Information is extracted from the query text to obtain the core entity and the constraint information, which includes sorting attributes, sorting rules, and sorting positions.
[0035] Optionally, the determining module includes:
[0036] The first determining submodule is configured to determine sorting constraint rules based on the core entity, the constraint information, and the target instance relationship, when the target association includes the target instance relationship.
[0037] The first determining submodule is further configured to, when the target association includes the target inheritance relationship, update the target instance relationship according to the target inheritance relationship, and determine the sorting constraint rules according to the core entity, the constraint information and the updated target instance relationship;
[0038] The first determining submodule is further configured to, when the target association relationship only includes the target inclusion relationship, take the entity that has the target inclusion relationship with the core entity as the new core entity, and determine the sorting constraint rule according to the new core entity, the constraint information and the target instance relationship;
[0039] The second determining submodule is configured to determine the target query graph based on the sorting constraint rules.
[0040] Optionally, if the constraint information further includes supplementary entities and supplementary associations of the supplementary entities, the second determining submodule is configured as follows:
[0041] Based on the supplementary entities and the supplementary relationships, determine the supplementary constraint rules;
[0042] Multiple query subgraphs are constructed using the sorting constraint rules and the supplementary constraint rules;
[0043] Based on the query text, a target query subgraph is determined from multiple query subgraphs.
[0044] Optionally, the second determining submodule is configured as follows:
[0045] For each query subgraph, the query text and the query subgraph are input into a pre-trained matching model to obtain the matching degree of the query subgraph;
[0046] The query subgraph with the highest matching degree is taken as the target query subgraph.
[0047] Optionally, the query module includes:
[0048] The generation submodule is configured to generate a query statement based on the target query subgraph;
[0049] The query submodule is configured to use the query statement to perform a query in the concept graph to obtain the target text.
[0050] Optionally, the query text is a question text, which is the text corresponding to the question entered by the user, and the target text is an answer text, which is the text corresponding to the answer to the question obtained from the concept graph.
[0051] According to a third aspect of the present disclosure, an electronic device is provided, comprising:
[0052] A processor;
[0053] A memory for storing processor-executable instructions;
[0054] Wherein the processor is configured to perform the steps of the text acquisition method provided in the first aspect of this disclosure.
[0055] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided that stores computer program instructions thereon, which, when executed by a processor, implement the steps of the text acquisition method provided in the first aspect of the present disclosure.
[0056] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects:
[0057] This disclosure first extracts the core entities and constraint information of the query text from the query text. Then, it determines the target relationships of the core entities based on a concept graph, where the concept graph includes instance relationships, inheritance relationships, and inclusion relationships. The target relationships include at least one of target instance relationships, target inheritance relationships, and target inclusion relationships. Next, it determines a target query graph based on the core entities, constraint information, and target relationships, and then extracts the target text corresponding to the query text from the concept graph based on the target query graph. This disclosure, through a concept graph including instance relationships, inheritance relationships, and inclusion relationships, can accurately determine the target relationships of core entities in the query text, achieving correct parsing of the query text and improving the accuracy of obtaining the target text from the query text. Furthermore, obtaining the target text through a target query graph avoids the problems of an explosion in the number of condition constraints and complex logical operations caused by excessive distance between the core entity and the answer, reducing the restriction on the distance between the core entity and the answer, thereby improving the ability to obtain the target text from the query text.
[0058] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.
Attached Figure Description
[0059] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure.
[0060] Figure 1 is a flowchart illustrating a text acquisition method according to an exemplary embodiment.
[0061] Figure 2 is a schematic diagram of a concept graph according to an exemplary embodiment.
[0062] Figure 3 is a flowchart of step 103 according to the embodiment shown in Figure 1.
[0063] Figure 4 is a schematic diagram illustrating a query subgraph according to an exemplary embodiment.
[0064] Figure 5 is a flowchart of step 104 according to the embodiment shown in Figure 1.
[0065] Figure 6 is a block diagram illustrating a text acquisition device according to an exemplary embodiment.
[0066] Figure 7 is a block diagram of a determining module according to the embodiment shown in Figure 6.
[0067] Figure 8 is a block diagram of a query module according to the embodiment shown in Figure 6.
[0068] Figure 9 is a block diagram illustrating an electronic device according to an exemplary embodiment.
[0069] Figure 10 is a block diagram illustrating another electronic device according to an exemplary embodiment.
Detailed Implementation
[0070] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.
[0071] Before introducing the text acquisition methods, apparatus, electronic devices, and storage media provided in this disclosure, the application scenarios involved in the various embodiments of this disclosure are first described. These application scenarios can be those where text required by a user is obtained through a knowledge graph, such as knowledge base question-and-answer scenarios or search scenarios. In a knowledge base question-and-answer scenario, a user can engage in intelligent question-and-answer with KBQA through a terminal. For example, a user can input query text containing a ranking reasoning question through a terminal, so that KBQA can retrieve the target text corresponding to the query text (containing the answer to the ranking reasoning question) from KBQA's knowledge graph based on the query text, and return the target text to the user. The terminal can be, for example, a mobile terminal such as a smartphone, tablet, smartwatch, smart bracelet, or PDA (Personal Digital Assistant), or a fixed terminal such as a desktop computer.
[0072] Figure 1 is a flowchart illustrating a text acquisition method according to an exemplary embodiment. As shown in Figure 1, the method includes the following steps:
[0073] In step 101, the core entity and constraint information of the query text are obtained from the query text.
[0074] For example, the query text entered by the user according to their query needs can be obtained first. The query text can be a question text, that is, the text corresponding to a question entered by the user, and the question can be, for example, a ranking reasoning question. In practice, a ranking reasoning question can be described as follows: the multiple entities that meet specified conditions are sorted by a specified sortable attribute (e.g., in ascending or descending order), and the entity at a specified position in the sorted entity sequence is selected as the answer to the question. It follows that a ranking reasoning question necessarily has the following characteristics: (1) All entities to be sorted belong to the same entity type; that is, a ranking reasoning question always carries an entity type constraint. For example, when the query text is "the largest animal", the entity type to be sorted is "animal". In some cases the entity type does not appear directly in the query text, but the constraint still exists. For example, when the query text is "the largest seed", the complete query text should be "the plant with the largest seed", and the corresponding entity type is "plant". (2) The entity type has a sortable attribute. For example, when the query text is "the animal with the largest weight", the entity type constraint is "animal", and the sortable attribute is "weight" as defined under the "animal" type. (3) The sorting process must specify an ascending or descending rule. For example, when the query text is "the animal with the largest weight", entities are sorted in descending order by "weight", and when the query text is "which Olympic Games were held the earliest", entities are sorted in ascending order by "held time". (4) The sorting position of the answer in the sorted entity sequence must be specified. For example, when the query text is "what is the largest dinosaur", the corresponding sorting position is 1, and when the query text is "the country with the second smallest area", the corresponding sorting position is 2.
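The four characteristics above can be sketched as a small Python function. This is only an illustrative sketch under stated assumptions: the `Entity` record, the function name `answer_ranking_question`, and the placeholder entities and weights are all hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity record: a name, its entity types, and its attributes."""
    name: str
    types: set
    attrs: dict = field(default_factory=dict)


def answer_ranking_question(entities, entity_type, sort_attr, descending, position):
    """Filter by entity type, sort by the attribute, and pick the 1-based position."""
    # (1) entity type constraint and (2) sortable attribute
    candidates = [
        e for e in entities
        if entity_type in e.types and sort_attr in e.attrs
    ]
    # (3) ascending or descending sorting rule
    ranked = sorted(candidates, key=lambda e: e.attrs[sort_attr], reverse=descending)
    # (4) sorting position of the answer
    if not 1 <= position <= len(ranked):
        return None
    return ranked[position - 1]


# Placeholder data; the names and numbers are made up for illustration only.
ENTITIES = [
    Entity("A", {"animal"}, {"weight": 30}),
    Entity("B", {"animal"}, {"weight": 10}),
    Entity("C", {"animal"}, {"weight": 20}),
    Entity("D", {"plant"}, {"weight": 99}),
]
```

With this placeholder data, "the animal with the largest weight" corresponds to `answer_ranking_question(ENTITIES, "animal", "weight", True, 1)`; entity `D` is excluded by the entity type constraint even though its weight is the highest.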
[0075] Based on the above analysis of ranking reasoning questions, the ranking constraints (i.e., the conditions of a ranking reasoning question) mainly include: the entity type, the sorting attribute (i.e., a specific attribute of the entity), the sorting rule (ascending or descending order), and the sorting position (i.e., the position of the answer in the sorted entity sequence). Therefore, a preset extraction method can first be used to extract information from the query text to obtain the core entity and the constraint information. The core entity can be understood as the entity in the query text that plays a key role in querying the target text, and the constraint information can include the sorting attribute, the sorting rule, and the sorting position. The preset extraction method can include slot filling, predicate prediction, and the like; vocabularies and rules can also be used to extract information from the query text.
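As a rough illustration of the vocabulary-and-rule option mentioned above, the following Python sketch extracts a core entity, sorting attribute, sorting rule, and sorting position from an English query. The vocabularies and the function name `extract_constraints` are assumptions made for this example; slot filling or predicate prediction would replace this logic in a realistic system.

```python
import re

# Hypothetical vocabularies; a real system would use much larger ones.
SORT_RULE_WORDS = {"largest": "desc", "smallest": "asc", "earliest": "asc"}
POSITION_WORDS = {"second": 2, "third": 3}
ATTRIBUTE_WORDS = {"weight", "area"}
ENTITY_WORDS = {"animal", "country", "dinosaur", "seed"}


def extract_constraints(query_text):
    """Return the core entity and constraint information found in the query text."""
    info = {"core_entity": None, "sort_attr": None, "sort_rule": None, "position": 1}
    for token in re.findall(r"[a-z]+", query_text.lower()):
        if token in SORT_RULE_WORDS:
            info["sort_rule"] = SORT_RULE_WORDS[token]
        elif token in POSITION_WORDS:
            info["position"] = POSITION_WORDS[token]
        elif token in ATTRIBUTE_WORDS:
            info["sort_attr"] = token
        elif token in ENTITY_WORDS and info["core_entity"] is None:
            # Keep only the first entity word as the core entity.
            info["core_entity"] = token
    return info
```

The sorting position defaults to 1 when no ordinal word is present, which matches the "what is the largest dinosaur" example. For "the largest seed" the sketch returns `seed` as the core entity with no sorting attribute, which is the case where the concept graph is needed to recover the missing entity type.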
[0076] In step 102, the target association relationships of the core entity are determined based on the concept graph.
[0077] The concept graph includes instance relations, inheritance relations, and inclusion relations. Instance relations are used to characterize that a specified entity in two connected entities in the concept graph is an instance of another entity. Inheritance relations are used to characterize the hierarchical relationship between two connected entities in the concept graph. Inclusion relations are used to characterize whether a specified entity in two connected entities in the concept graph belongs to another entity. Target association relations include at least one of target instance relations, target inheritance relations, and target inclusion relations.
[0078] For example, retrieving the target text from a knowledge graph by parsing has two problems: the constraints must be limited to one hop, and the recall capability for ranking reasoning questions is limited when the query text lacks a core entity. These problems can be avoided by constructing a concept graph (i.e., improving the knowledge graph) and refining the constraint extraction method. Specifically, an appropriate knowledge graph can be selected in advance based on specific business needs. For instance, if the business need is to solve animal-related questions, an animal-related knowledge graph can be selected. This knowledge graph can then be improved to obtain the corresponding concept graph.
[0079] A concept graph is a directed graph composed of nodes and edges. Each node corresponds to an entity, and entities in a concept graph can be further divided into conceptual entities and ordinary entities. Every two adjacent nodes are connected by an edge, and each edge specifies the relationship between the two entities it connects. The division between conceptual and ordinary entities is not fixed across scenarios; generally, real-world individuals are treated as ordinary entities and the rest as conceptual entities. For example, as shown in Figure 2 (the circles and rectangles in Figure 2 represent nodes in the concept graph, and the arrows represent edges), <Shark> can be defined as an ordinary entity that is a specific instance of <Fish>, where <Fish> is a conceptual entity. Alternatively, to achieve better ranking reasoning capability, <Shark> can be defined as a conceptual entity, with <Tiger Shark>, <Whale Shark>, etc. as its ordinary instances. Note that as the granularity of concepts becomes finer, KBQA answers ranking reasoning questions more precisely and accurately, but the cost of data construction also increases significantly. It is therefore necessary to choose an appropriate concept granularity based on the actual usage scenario. Relationships in the concept graph can be divided into instance relationships (denoted by <isa>), inheritance relationships (denoted by <ako>), and inclusion relationships (denoted by <part-of>). The <isa> relationship describes the attribution relationship between an ordinary entity and a conceptual entity; that is, it characterizes an ordinary entity as a specific instance of a conceptual entity. For example, "<Blue Whale> <isa> <Mammal>" means that <Blue Whale> is an instance of <Mammal>. The <ako> relationship describes the hierarchical relationship between two connected entities (it can also be understood as the inheritance relationship between a parent class and a child class). For example, "<Mammal> <ako> <Animal>" means that <Mammal> is a kind of <Animal>. In this case, the node corresponding to <Animal> is the parent node (or superior node) of the node corresponding to <Mammal>, so the attributes defined under <Animal> are inherited by <Mammal>. The <part-of> relationship describes whether a specified entity of two connected entities belongs to the other entity (attribute inheritance is not allowed between the two entities). For example, "<Seed> <part-of> <Plant>" indicates that <Seed> is a part of <Plant>, but <Seed> does not possess the attributes defined under the concept <Plant>. Note that conceptual entities such as <Seed>, which represent a part of another conceptual entity, usually have no corresponding concrete instances; that is, they have no <isa> relationship.
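One possible way to hold such a concept graph in memory is a labeled directed adjacency list. The sketch below is an assumption for illustration: the class name `ConceptGraph` and its methods are hypothetical, and only the relation labels and example edges are taken from the description above.

```python
from collections import defaultdict

ISA, AKO, PART_OF = "isa", "ako", "part-of"


class ConceptGraph:
    """Hypothetical directed graph whose edges are labeled isa, ako, or part-of."""

    def __init__(self):
        self._out = defaultdict(list)

    def add(self, head, relation, tail):
        """Add a directed edge head -relation-> tail."""
        if relation not in (ISA, AKO, PART_OF):
            raise ValueError(f"unknown relation: {relation}")
        self._out[head].append((relation, tail))

    def out_edges(self, head, relation=None):
        """Return outgoing (relation, tail) pairs, optionally filtered by relation."""
        return [
            (r, t) for r, t in self._out.get(head, [])
            if relation is None or r == relation
        ]


# Edges taken from the examples in the description.
graph = ConceptGraph()
graph.add("Blue Whale", ISA, "Mammal")
graph.add("Mammal", AKO, "Animal")
graph.add("Shark", ISA, "Fish")
graph.add("Seed", PART_OF, "Plant")
```

Here `graph.out_edges("Seed", ISA)` is empty, which reflects the remark that a "part" concept such as <Seed> usually has no <isa> relationship.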
[0080] Then, after obtaining the core entity, the target association relationships between the core entity and other entities connected to the core entity can be found in the concept graph. These target association relationships can include at least one of the following: target instance relationships, target inheritance relationships, and target inclusion relationships.
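The lookup described in this step can be sketched as a scan over the edges that touch the core entity, grouped by relation type. The triple representation and the function name `target_associations` are assumptions for this example, not the disclosure's implementation.

```python
def target_associations(triples, core_entity):
    """Group the edges that touch the core entity by relation type.

    `triples` is an iterable of (head, relation, tail) tuples, where relation
    is one of "isa", "ako", or "part-of". Only non-empty groups are returned.
    """
    found = {}
    for head, relation, tail in triples:
        if core_entity in (head, tail):
            found.setdefault(relation, []).append((head, relation, tail))
    return found


# Edges taken from the examples in the description.
TRIPLES = [
    ("Shark", "isa", "Fish"),
    ("Fish", "ako", "Aquatic Animals"),
    ("Seed", "part-of", "Plant"),
]
```

For the core entity `Seed`, the result contains only a `part-of` group, which is the situation handled separately when the query text lacks an entity type.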
[0081] In step 103, the target query graph is determined based on the core entities, constraint information, and target relationships.
[0082] In step 104, the target text corresponding to the query text is obtained from the concept graph based on the target query graph.
[0083] Furthermore, the entity type included in the ranking constraint rule is reflected in the concept graph as the <isa> relationship of the core entity. Therefore, after the target association relationships are determined, the <isa> relationship of the core entity can be determined from them, and a target query graph for retrieving the target text can be constructed from the core entity, the <isa> relationship of the core entity, the sorting attribute, the sorting rule, and the sorting position. Finally, the target query graph is used to search the concept graph, and the entity that satisfies the ranking constraints is taken as the target text. The target text can be an answer text, that is, the text, obtained from the concept graph, that corresponds to the answer to the question entered by the user. This approach ensures that the question text is parsed correctly and that the answer text is retrieved based on the core entity in the correctly parsed question text. It avoids the explosion of constraints and the complex logical operations caused by an excessive distance between the core entity in the question text and the answer, thereby improving the ability to retrieve answers to user questions in the knowledge base question answering domain.
[0084] It should be noted that the process of determining the target text can be implemented by the server or by the terminal, and this disclosure does not make any specific limitation on it.
[0085] In summary, this disclosure first obtains the core entities and constraint information of the query text from the query text, and determines the target relationships of the core entities based on a concept graph. The concept graph includes instance relationships, inheritance relationships, and inclusion relationships, and the target relationships include at least one of target instance relationships, target inheritance relationships, and target inclusion relationships. Then, a target query graph is determined based on the core entities, constraint information, and target relationships, and the target text corresponding to the query text is obtained from the concept graph based on the target query graph. This disclosure, through a concept graph including instance relationships, inheritance relationships, and inclusion relationships, can accurately determine the target relationships of core entities in the query text, achieving correct parsing of the query text and improving the accuracy of obtaining the target text from the query text. Furthermore, obtaining the target text through a target query graph avoids the problems of an explosion in the number of condition constraints and complex logical operations caused by excessive distance between the core entity and the answer, reducing the restriction on the distance between the core entity and the answer, thereby improving the ability to obtain the target text from the query text.
[0086] Figure 3 is a flowchart of step 103 according to the embodiment shown in Figure 1. As shown in Figure 3, step 103 may include the following steps:
[0087] Step 1031: When the target association includes the target instance relationship, determine the sorting constraint rules based on the core entity, constraint information and target instance relationship.
[0088] For example, if the target association of a core entity includes a target instance relationship, the entity type of the core entity can be directly limited by the target instance relationship of the core entity. Then, the sorting constraint rules for sorting reasoning problems can be determined by combining the core entity, sorting attribute, sorting rule and sorting position.
[0089] Step 1032: If the target association relationship includes the target inheritance relationship, update the target instance relationship according to the target inheritance relationship, and determine the sorting constraint rules according to the core entity, constraint information and the updated target instance relationship.
[0090] Specifically, since the attributes of the parent entity of two entities connected by an inheritance relationship are inherited by the child entity, the inheritance relationship can be used to update the instance relationships of the core entity when determining the entity type of the core entity. Taking Figure 2 as an example, there is an inheritance relationship between <Aquatic Animals> and <Fish>, and the node corresponding to <Aquatic Animals> is the parent node of the node corresponding to <Fish>. At the same time, there is an instance relationship between <Shark> and <Fish>, and <Shark> is a specific instance of <Fish>. It can therefore be concluded that there is also an instance relationship between <Shark> and <Aquatic Animals>; that is, <Shark> is a specific instance of <Aquatic Animals>. In other words, the instance relationship of <Aquatic Animals> is updated through the inheritance relationship between <Aquatic Animals> and <Fish>. Accordingly, when the target association relationship includes a target inheritance relationship, the target instance relationship can be updated based on the target inheritance relationship, and the entity type of the core entity can be limited by the updated target instance relationship of the core entity. The sorting constraint rules for the ranking reasoning question can then be determined based on the core entity, the constraint information, and the updated target instance relationship.
[0091] It should be noted that the instance relationships of the core entity can also be updated with inheritance relationships directly during construction of the concept graph, by expanding the <isa> relationships along the direction of the <ako> relationships. This makes the information in the concept graph easier to use, avoids the explosion in the number of condition constraints caused by an excessive distance between the core entity and the answer node, reduces the complexity of logical operations, and improves the recall capability for ranking reasoning questions.
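Expanding <isa> along <ako> can be sketched as follows: for every instance edge, walk up the inheritance chain and add an instance edge to each ancestor. The function name `expand_isa_along_ako` and the triple representation are assumptions for this example; the disclosure does not specify how the expansion is computed.

```python
def expand_isa_along_ako(triples):
    """Return the triples plus every isa edge implied by following ako upward."""
    # Map each concept to its direct parents via ako.
    parents = {}
    for head, relation, tail in triples:
        if relation == "ako":
            parents.setdefault(head, set()).add(tail)

    expanded = set(triples)
    for head, relation, tail in triples:
        if relation != "isa":
            continue
        # Depth-first walk over all ancestors; `seen` guards against cycles.
        stack, seen = [tail], {tail}
        while stack:
            concept = stack.pop()
            for parent in parents.get(concept, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
                    expanded.add((head, "isa", parent))
    return expanded


# Edges taken from the examples in the description.
TRIPLES = [
    ("Shark", "isa", "Fish"),
    ("Fish", "ako", "Aquatic Animals"),
    ("Blue Whale", "isa", "Mammal"),
    ("Mammal", "ako", "Animal"),
]
EXPANDED = expand_isa_along_ako(TRIPLES)
```

After expansion, an entity type constraint such as "?x isa Aquatic Animals" matches <Shark> in a single hop, which is the effect the paragraph above attributes to precomputing the expansion.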
[0092] Step 1033: If the target association relationship only includes the target inclusion relationship, the entity that has a target inclusion relationship with the core entity is taken as the new core entity, and the sorting constraint rules are determined based on the new core entity, constraint information and target instance relationship.
[0093] For example, if the target association relationship only includes the target inclusion relationship, the core entity in the query text does not have an instance relationship, and the entity type in the sorting constraint rules cannot be located directly through the current core entity (i.e., the query text has a missing core entity problem). In this case, the entity that has a target inclusion relationship with the core entity can be used as the new core entity; that is, the entity that may be missing from the query text can be inferred by using the inclusion relationship. Continuing with the example of Figure 2, when the query text is "the largest seed" and the core entity is <seed>, <seed> has no <isa> relationship. In this case, <part-of> relational reasoning can be used to identify the missing entity in the query text as <plant>, and <plant> is set as the new core entity. Finally, based on the new core entity, the constraint information, and the target instance relationship, the sorting constraint rules for the ranking reasoning problem are determined.
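The fallback from a core entity without an <isa> relationship to the entity it belongs to can be sketched as follows. This is a simplified illustration: the function name `resolve_core_entity`, the edge lists, and the choice of returning the first matching whole are assumptions, not details given in the disclosure.

```python
def resolve_core_entity(core, isa_edges, part_of_edges):
    """Return the entity to use as the core entity of the sorting constraint rules.

    isa_edges: (instance, concept) pairs.
    part_of_edges: (part, whole) pairs, e.g. ("seed", "plant").
    """
    # Keep the core entity when it takes part in at least one <isa> relationship.
    if any(core in edge for edge in isa_edges):
        return core
    # Otherwise follow a <part-of> edge to infer the entity missing from the query.
    for part, whole in part_of_edges:
        if part == core:
            return whole
    return core  # nothing to infer; leave the core entity unchanged


# Hypothetical edges for the "the largest seed" example.
isa_edges = [("Shark", "Fish")]
part_of_edges = [("seed", "plant")]
new_core = resolve_core_entity("seed", isa_edges, part_of_edges)
```

With these illustrative edges, `new_core` is `"plant"`, while an entity such as `"Fish"` that already has an <isa> relationship is returned unchanged.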
[0094] Step 1034: Determine the target query graph based on the sorting constraint rules.
[0095] In this step, a target query graph can be constructed based on the core entity in the sorting constraint rules, the target instance relationship of the core entity, the sorting attribute, the sorting rule, and the sorting position.
[0096] In one scenario, a ranking reasoning problem may carry other constraints besides the entity type constraint. Continuing with the example of Figure 2, when the query text is "the largest animal in the ocean," in addition to the entity type constraint of "animal," there is also the condition constraint that the "living environment" is "ocean." That is to say, during the information extraction process of the query text, besides the core entity, supplementary entities that further qualify the query target may also be extracted. In this case, the supplementary entities, together with the supplementary relationships found in the concept graph between the supplementary entities and the other entities connected to them, can be used as constraint information. When the constraint information also includes supplementary entities and their supplementary relationships, step 1034 may include the following steps:
[0097] Step a) Determine the supplementary constraint rules based on the supplementary entities and supplementary relationships.
[0098] Step b) Construct multiple query subgraphs based on the sorting constraint rules and the supplementary constraint rules.
[0099] For example, when the constraint information also includes supplementary entities and supplementary relationships, the supplementary constraint rules can first be determined based on these supplementary entities and relationships. Then, a query subgraph corresponding to the sorting constraint rules can be constructed based on the sorting constraint rules, and a query subgraph corresponding to the supplementary constraint rules can be constructed based on the supplementary constraint rules. Finally, the two query subgraphs can be combined to obtain a combined query subgraph. For example, when the query text is "What is the second largest animal in the ocean by weight?", the sorting constraint rules include the core entity <animal>, the target instance relationship <isa> of the core entity, the sorting attribute <weight>, a descending sorting rule, and sorting position 2, while the supplementary constraint rules include the supplementary entity <ocean> and the supplementary relationship <living environment>. In this case, the query subgraph corresponding to the sorting constraint rules can be as shown in (1) of Figure 4, the query subgraph corresponding to the supplementary constraint rules can be as shown in (2) of Figure 4, and the combined query subgraph can be as shown in (3) of Figure 4. Here, "{Max 2}" represents the second entity in the entity sequence after sorting in descending order, and "?x" represents the set of nodes corresponding to entities that satisfy the sorting constraint rules and/or the supplementary constraint rules.
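The effect of combining the sorting constraint rules with the supplementary constraint rules can be sketched by evaluating both against a small in-memory set of triples. The function `answer_ranked_query`, the triple layout, and every weight value below are assumptions for illustration; the weights are arbitrary placeholder numbers, not real measurements.

```python
def answer_ranked_query(triples, entity_type, sort_attr, position, extra=()):
    """Return the entity at 1-based `position` after sorting descending by `sort_attr`.

    extra: (predicate, value) pairs acting as supplementary constraints.
    """
    facts = set(triples)
    # Sorting constraint: candidates must be instances of the core entity type.
    candidates = {s for s, p, o in facts if p == "isa" and o == entity_type}
    # Supplementary constraints: keep candidates that match every extra pattern.
    for predicate, value in extra:
        candidates = {s for s in candidates if (s, predicate, value) in facts}
    values = {s: o for s, p, o in facts if p == sort_attr and s in candidates}
    ranked = sorted(values, key=values.get, reverse=True)
    return ranked[position - 1] if len(ranked) >= position else None


# Placeholder graph; the numbers are made up purely to exercise the ordering.
triples = [
    ("blue whale", "isa", "animal"), ("blue whale", "weight", 150),
    ("blue whale", "living environment", "ocean"),
    ("whale shark", "isa", "animal"), ("whale shark", "weight", 20),
    ("whale shark", "living environment", "ocean"),
    ("elephant", "isa", "animal"), ("elephant", "weight", 60),
    ("elephant", "living environment", "land"),
]
second = answer_ranked_query(
    triples, "animal", "weight", 2, extra=[("living environment", "ocean")]
)
```

With the supplementary constraint applied, `second` is `"whale shark"`; without it, the second position would go to `"elephant"`, which shows why the combined query subgraph can give a different answer from the sorting subgraph alone.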
[0100] Step c) Based on the query text, determine the target query subgraph from multiple query subgraphs.
[0101] In this step, for each query subgraph, the query text and the query subgraph can be input into a pre-trained matching model to obtain the matching degree of the query subgraph, and the query subgraph with the highest matching degree can be used as the target query subgraph.
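The selection step can be sketched as an arg-max over candidate query subgraphs. The disclosure uses a pre-trained matching model whose details are not given here, so the `token_overlap` scorer below is only a stand-in (a Jaccard overlap between query tokens and subgraph tokens) chosen so that the example runs; all names are hypothetical.

```python
def token_overlap(query_text, subgraph):
    """Stand-in matching degree: Jaccard overlap of query and subgraph tokens."""
    query_tokens = set(query_text.lower().split())
    graph_tokens = {
        token.lower()
        for triple in subgraph
        for element in triple
        for token in element.split()
    }
    union = query_tokens | graph_tokens
    return len(query_tokens & graph_tokens) / len(union) if union else 0.0


def select_target_subgraph(query_text, subgraphs, match_fn=token_overlap):
    """Return the query subgraph with the highest matching degree."""
    return max(subgraphs, key=lambda graph: match_fn(query_text, graph))


# Two hypothetical candidates: sorting constraint only, and the combined subgraph.
sorting_only = [("?x", "isa", "animal"), ("?x", "weight", "?w")]
combined = sorting_only + [("?x", "living environment", "ocean")]
best = select_target_subgraph(
    "second largest animal in the ocean by weight", [sorting_only, combined]
)
```

Here `best` is the combined subgraph, because it also covers the token "ocean". A trained model would replace `token_overlap` through the `match_fn` parameter without changing the selection logic.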
[0102] Figure 5 is a flowchart of step 104 according to the embodiment shown in Figure 1. As shown in Figure 5, step 104 may include the following steps:
[0103] Step 1041: Generate a query statement based on the target query subgraph.
[0104] Step 1042: Use a query statement to perform a query in the concept graph to obtain the target text.
[0105] For example, after determining the target query subgraph, a query statement corresponding to the target query subgraph can be generated. This query statement can then be used to search within the concept graph to obtain the target text (i.e., the answer to the ranking reasoning problem). Taking a SPARQL query statement as an example, when the query text is "What do blue whales eat?", the corresponding SPARQL statement could be "SELECT ?x WHERE { <blue whale> <food> ?x . }", and the target text could be "<fish>, <molluscs>, <plankton>".
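As a hedged illustration of step 1041, the following Python sketch assembles a SPARQL string from sorting constraint rules and optional supplementary constraints. Mapping the sorting position to ORDER BY with LIMIT and OFFSET is an assumption of this sketch rather than something the disclosure states, and the bracketed names follow the notation of the example above, so they would need to be replaced by valid IRIs before being sent to a real SPARQL endpoint.

```python
def build_sparql(entity_type, sort_attr, position, descending=True, extra=()):
    """Build a SPARQL query selecting the entity at 1-based `position`.

    extra: (predicate, object) pairs acting as supplementary constraints.
    """
    patterns = [f"?x <isa> <{entity_type}> .", f"?x <{sort_attr}> ?v ."]
    patterns += [f"?x <{pred}> <{obj}> ." for pred, obj in extra]
    order = "DESC(?v)" if descending else "ASC(?v)"
    body = " ".join(patterns)
    return (
        f"SELECT ?x WHERE {{ {body} }} "
        f"ORDER BY {order} LIMIT 1 OFFSET {position - 1}"
    )


# "What is the second largest animal in the ocean by weight?"
query = build_sparql(
    "animal", "weight", 2, extra=[("living environment", "ocean")]
)
```

The resulting string sorts candidates by the sorting attribute in descending order and skips `position - 1` rows, which is one possible reading of the "{Max 2}" marker in the query subgraph.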
[0106] In summary, this disclosure first obtains the core entities and constraint information of the query text from the query text, and determines the target relationships of the core entities based on a concept graph. The concept graph includes instance relationships, inheritance relationships, and inclusion relationships, and the target relationships include at least one of target instance relationships, target inheritance relationships, and target inclusion relationships. Then, a target query graph is determined based on the core entities, constraint information, and target relationships, and the target text corresponding to the query text is obtained from the concept graph based on the target query graph. This disclosure, through a concept graph including instance relationships, inheritance relationships, and inclusion relationships, can accurately determine the target relationships of core entities in the query text, achieving correct parsing of the query text and improving the accuracy of obtaining the target text from the query text. Furthermore, obtaining the target text through a target query graph avoids the problems of an explosion in the number of condition constraints and complex logical operations caused by excessive distance between the core entity and the answer, reducing the restriction on the distance between the core entity and the answer, thereby improving the ability to obtain the target text from the query text.
[0107] Figure 6 is a block diagram illustrating a text acquisition device according to an exemplary embodiment. As shown in Figure 6, the device 200 includes an acquisition module 201, a determination module 202, and a query module 203.
[0108] The acquisition module 201 is configured to retrieve the core entity and constraint information of the query text from the query text.
[0109] The determination module 202 is configured to determine the target association relationships of the core entity based on the concept graph.
[0110] The concept graph includes instance relations, inheritance relations, and inclusion relations. Instance relations are used to characterize that a specified entity in two connected entities in the concept graph is an instance of another entity. Inheritance relations are used to characterize the hierarchical relationship between two connected entities in the concept graph. Inclusion relations are used to characterize whether a specified entity in two connected entities in the concept graph belongs to another entity. Target association relations include at least one of target instance relations, target inheritance relations, and target inclusion relations.
[0111] The determination module 202 is also configured to determine the target query graph based on the core entities, constraint information, and target relationships.
[0112] The query module 203 is configured to retrieve the target text corresponding to the query text from the concept graph based on the target query graph.
[0113] Optionally, the acquisition module 201 is configured as follows:
[0114] Information is extracted from the query text to obtain core entities and constraint information, including sorting attributes, sorting rules, and sorting positions.
[0115] Figure 7 is a block diagram of the determination module according to the embodiment shown in Figure 6. As shown in Figure 7, the determination module 202 includes:
[0116] The first determining submodule 2021 is configured to determine sorting constraint rules based on core entities, constraint information, and target instance relationships when the target association includes target instance relationships.
[0117] The first determining submodule 2021 is also configured to, when the target association includes the target inheritance relationship, update the target instance relationship based on the target inheritance relationship, and determine the sorting constraint rules based on the core entity, constraint information and the updated target instance relationship.
[0118] The first determining submodule 2021 is also configured to, when the target association relationship only includes the target inclusion relationship, take the entity that has a target inclusion relationship with the core entity as the new core entity, and determine the sorting constraint rules based on the new core entity, constraint information and target instance relationship.
[0119] The second determination submodule 2022 is configured to determine the target query graph based on sorting constraint rules.
[0120] Optionally, if the constraint information also includes supplementary entities and supplementary relationships of the supplementary entities, the second determining submodule 2022 is configured as follows:
[0121] Based on the supplementary entities and supplementary relationships, determine the supplementary constraint rules.
[0122] Multiple query subgraphs are constructed based on the sorting constraint rules and the supplementary constraint rules.
[0123] Based on the query text, determine the target query subgraph from multiple query subgraphs.
[0124] Optionally, the second determining submodule 2022 is configured as follows:
[0125] For each query subgraph, the query text and the query subgraph are input into a pre-trained matching model to obtain the matching degree of the query subgraph.
[0126] The query subgraph with the highest matching degree is used as the target query subgraph.
[0127] Figure 8 is a block diagram of the query module according to the embodiment shown in Figure 6. As shown in Figure 8, the query module 203 includes:
[0128] The generation submodule 2031 is configured to generate query statements based on the target query subgraph.
[0129] The query submodule 2032 is configured to use query statements to perform queries in the concept graph to obtain the target text.
[0130] Optionally, the query text is the question text, which is the text corresponding to the question entered by the user, and the target text is the answer text, which is the text corresponding to the answer to the question obtained from the concept graph.
[0131] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0132] This disclosure also provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the steps of the text acquisition method provided in this disclosure.
[0133] In summary, this disclosure first obtains the core entities and constraint information of the query text from the query text, and determines the target relationships of the core entities based on a concept graph. The concept graph includes instance relationships, inheritance relationships, and inclusion relationships, and the target relationships include at least one of target instance relationships, target inheritance relationships, and target inclusion relationships. Then, a target query graph is determined based on the core entities, constraint information, and target relationships, and the target text corresponding to the query text is obtained from the concept graph based on the target query graph. This disclosure, through a concept graph including instance relationships, inheritance relationships, and inclusion relationships, can accurately determine the target relationships of core entities in the query text, achieving correct parsing of the query text and improving the accuracy of obtaining the target text from the query text. Furthermore, obtaining the target text through a target query graph avoids the problems of an explosion in the number of condition constraints and complex logical operations caused by excessive distance between the core entity and the answer, reducing the restriction on the distance between the core entity and the answer, thereby improving the ability to obtain the target text from the query text.
[0134] Figure 9 is a block diagram illustrating an electronic device according to an exemplary embodiment. For example, the electronic device 300 may be a mobile phone, computer, digital broadcasting terminal, messaging device, game console, tablet device, medical device, fitness equipment, personal digital assistant, etc.
[0135] Referring to Figure 9, the electronic device 300 may include one or more of the following components: processing component 302, memory 304, power component 306, multimedia component 308, audio component 310, input / output (I / O) interface 312, sensor component 314, and communication component 316.
[0136] Processing component 302 typically controls the overall operation of electronic device 300, such as operations associated with display, telephone calls, data communication, camera operation, and recording. Processing component 302 may include one or more processors 320 to execute instructions to complete all or part of the steps of the text acquisition method described above. Furthermore, processing component 302 may include one or more modules to facilitate interaction between processing component 302 and other components. For example, processing component 302 may include a multimedia module to facilitate interaction between multimedia component 308 and processing component 302.
[0137] Memory 304 is configured to store various types of data to support the operation of electronic device 300. Examples of such data include instructions for any application or method operating on electronic device 300, contact data, phonebook data, messages, pictures, videos, etc. Memory 304 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0138] Power component 306 provides power to various components of electronic device 300. Power component 306 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to electronic device 300.
[0139] Multimedia component 308 includes a screen that provides an output interface between the electronic device 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundaries of the touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 308 includes a front-facing camera and / or a rear-facing camera. When the electronic device 300 is in an operating mode, such as a shooting mode or a video mode, the front-facing camera and / or the rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
[0140] Audio component 310 is configured to output and / or input audio signals. For example, audio component 310 includes a microphone (MIC) configured to receive external audio signals when electronic device 300 is in an operating mode, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 304 or transmitted via communication component 316. In some embodiments, audio component 310 also includes a speaker for outputting audio signals.
[0141] I / O interface 312 provides an interface between processing component 302 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.
[0142] Sensor assembly 314 includes one or more sensors for providing state assessments of various aspects of electronic device 300. For example, sensor assembly 314 can detect the on / off state of electronic device 300, the relative positioning of components such as the display and keypad of electronic device 300, changes in position of electronic device 300 or a component of electronic device 300, the presence or absence of user contact with electronic device 300, orientation or acceleration / deceleration of electronic device 300, and temperature changes of electronic device 300. Sensor assembly 314 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor assembly 314 may also include an accelerometer, gyroscope, magnetometer, pressure sensor, or temperature sensor.
[0143] Communication component 316 is configured to facilitate wired or wireless communication between electronic device 300 and other devices. Electronic device 300 can access wireless networks based on communication standards, such as WiFi, 2G, or 3G, or combinations thereof. In one exemplary embodiment, communication component 316 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 316 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
[0144] In an exemplary embodiment, the electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the text acquisition method described above.
[0145] In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 304 including instructions, which can be executed by a processor 320 of an electronic device 300 to complete the text acquisition method described above. For example, the non-transitory computer-readable storage medium may be a ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage device, etc.
[0146] In another exemplary embodiment, a computer program product is also provided, the computer program product comprising a computer program executable by a programmable device, the computer program having a code portion for performing the text acquisition method described above when executed by the programmable device.
[0147] Figure 10 is a block diagram illustrating another electronic device according to an exemplary embodiment. For example, device 400 may be provided as a server. Referring to Figure 10, the device 400 includes a processing component 422, which further includes one or more processors, and memory resources represented by memory 432 for storing instructions, such as application programs, that can be executed by the processing component 422. The application programs stored in memory 432 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 422 is configured to execute instructions to perform the text acquisition method described above.
[0148] Device 400 may also include a power supply component 426 configured to perform power management of device 400, a wired or wireless network interface 450 configured to connect device 400 to a network, and an input / output (I / O) interface 458. Device 400 can operate based on an operating system stored in memory 432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
[0149] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of this disclosure. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0150] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. A text acquisition method, characterized in that, The method includes: Information is extracted from the query text to obtain the core entity and the constraint information of the query text. The constraint information includes sorting attributes, sorting rules and sorting positions. The target association relationships of the core entities are determined based on the concept graph; the concept graph includes instance relationships, inheritance relationships, and inclusion relationships, wherein the instance relationship is used to characterize that a specified entity among two connected entities in the concept graph is an instance of another entity; the inheritance relationship is used to characterize the hierarchical relationship between two connected entities in the concept graph; the inclusion relationship is used to characterize whether a specified entity among two connected entities in the concept graph belongs to another entity; the target association relationship includes at least one of the target instance relationship, target inheritance relationship, and target inclusion relationship; The sorting constraint rules are determined based on the core entity, the constraint information, and the target association. Based on the sorting constraint rules, determine the target query graph; Based on the target query graph, the target text corresponding to the query text is obtained from the concept graph.
2. The method according to claim 1, characterized in that, The step of determining the sorting constraint rules based on the core entity, the constraint information, and the target association includes: When the target association includes the target instance relationship, sorting constraint rules are determined based on the core entity, the constraint information, and the target instance relationship; If the target association relationship includes the target inheritance relationship, the target instance relationship is updated according to the target inheritance relationship, and the sorting constraint rule is determined according to the core entity, the constraint information, and the updated target instance relationship; If the target association relationship only includes the target inclusion relationship, the entity that has the target inclusion relationship with the core entity is taken as the new core entity, and the sorting constraint rule is determined according to the new core entity, the constraint information and the target instance relationship.
3. The method according to claim 2, characterized in that, If the constraint information further includes supplementary entities and supplementary associations of the supplementary entities, then determining the target query graph according to the sorting constraint rules includes: Based on the supplementary entities and the supplementary relationships, determine the supplementary constraint rules; Multiple query subgraphs are constructed using the sorting constraint rules and the supplementary constraint rules; Based on the query text, a target query subgraph is determined from multiple query subgraphs.
4. The method according to claim 3, characterized in that, The step of determining the target query subgraph from multiple query subgraphs based on the query text includes: For each query subgraph, the query text and the query subgraph are input into a pre-trained matching model to obtain the matching degree of the query subgraph; The query subgraph with the highest matching degree is taken as the target query subgraph.
5. The method according to claim 1, characterized in that, The step of obtaining the target text corresponding to the query text from the concept graph based on the target query graph includes: Generate a query statement based on the target query subgraph; The target text is obtained by querying the concept graph using the query statement.
6. The method according to any one of claims 1-5, characterized in that, The query text is the question text, which is the text corresponding to the question entered by the user. The target text is the answer text, which is the text corresponding to the answer to the question obtained from the concept graph.
7. A text acquisition device, characterized in that, The device includes: The acquisition module is configured to extract information from the query text to obtain the core entity and the constraint information of the query text, wherein the constraint information includes sorting attributes, sorting rules and sorting positions. The determination module is configured to determine the target association relationships of the core entities based on a concept graph; the concept graph includes instance relationships, inheritance relationships, and inclusion relationships, wherein the instance relationship is used to characterize that a specified entity among two connected entities in the concept graph is an instance of another entity; the inheritance relationship is used to characterize the hierarchical relationship between two connected entities in the concept graph; the inclusion relationship is used to characterize whether a specified entity among two connected entities in the concept graph belongs to another entity; the target association relationship includes at least one of target instance relationship, target inheritance relationship, and target inclusion relationship; The determining module is further configured to determine sorting constraint rules based on the core entity, the constraint information, and the target association; and to determine the target query graph based on the sorting constraint rules. The query module is configured to retrieve the target text corresponding to the query text from the concept graph based on the target query graph.
8. An electronic device, characterized in that, it includes: a processor; a memory used to store processor-executable instructions; wherein the processor is configured to perform the steps of the method according to any one of claims 1-6.
9. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, When executed by a processor, the program instructions implement the steps of the method described in any one of claims 1-6.