A hybrid storage and semantic retrieval method, device, computer equipment and storage medium

By combining hybrid storage and semantic retrieval methods with relational and vector databases, the problem of low retrieval efficiency of agents in multi-turn dialogues is solved, achieving efficient semantic understanding and contextual information retrieval, and improving the accuracy and efficiency of agents' dialogue understanding.

CN122240645A · Pending · Publication Date: 2026-06-19 · BEIJING ZHONGSHURUIZHI TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: BEIJING ZHONGSHURUIZHI TECH CO LTD
Filing Date: 2026-03-19
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, intelligent agents in multi-turn dialogues cannot quickly find historical dialogues that are semantically related to the current question but not temporally adjacent, which lowers retrieval efficiency and accuracy. The stored dialogue is returned as raw text without any semantic understanding, and a single storage mode struggles to support both rapid querying and diversified management of massive historical data, which limits scalability.

Method used

A hybrid storage and semantic retrieval method is adopted. Time-series text information is obtained from a relational database through a first storage index, and the request text is converted into a semantic embedding vector for similarity search in a vector database. The text information of the two storage modes is fused to generate context text.

Benefits of technology

It improves the accuracy and efficiency of the agent's understanding in multi-turn dialogues, and can quickly recall semantically relevant historical context information while preserving temporal continuity.

Figure CN122240645A_ABST

Abstract

This application discloses a method, apparatus, computer device, and storage medium based on hybrid storage and semantic retrieval. The method includes the following steps: responding to a session request instruction, the session request instruction carrying request text information, session identifier information, and timestamp information; retrieving first text information from a first database indicated by a first storage index based on the session identifier information and timestamp information; converting the request text information into a semantic embedding vector; performing a similarity search on a second database indicated by a second storage index based on the semantic embedding vector to retrieve second text information, the second text information being the text information with the highest similarity to the request text information; and fusing the first text information and the second text information to generate contextual text information. Using this application can improve the accuracy and efficiency of intelligent agents in multi-turn dialogues.

Description

Technical Field

[0001] This application relates to the field of information retrieval, and in particular to a method, apparatus, computer device, and storage medium based on hybrid storage and semantic retrieval.

Background Technology

[0002] With the rapid development of artificial intelligence, interaction between users and intelligent agents has become essential. Intelligent agents typically need to remember historical dialogues to support multi-turn interactions. Currently, there are two main technical approaches. The first relies on the agent's limited context window for memory, but its capacity is limited and it cannot be persisted. The second uses an external database (such as a key-value store) to store dialogue records linearly by session ID. While this approach persists the records, its retrieval is coarse: it usually retrieves only the most recent entries in chronological order, making it difficult to efficiently support semantically related references or long-range contextual backtracking. This leads to three common problems. First, linear retrieval cannot quickly find historical dialogues that are semantically related to the current question but not temporally adjacent, resulting in low retrieval efficiency and accuracy. Second, it cannot understand the semantics of the dialogue and provides only raw text, which does not help the agent in complex reference scenarios. Third, a single storage mode struggles to meet the needs of rapid querying and diverse management of massive historical data, resulting in poor scalability.

Summary of the Invention

[0003] This application provides a method, apparatus, computer device, and storage medium based on hybrid storage and semantic retrieval, which can improve the accuracy and efficiency of intelligent agents in multi-turn dialogues.

[0004] One embodiment of this application provides a method based on hybrid storage and semantic retrieval, which may include: The system responds to a session request instruction, which carries request text information, session identifier information, and timestamp information. The first text information is retrieved from the first database indicated by the first storage index based on the session identifier information and timestamp information; The request text information is converted into a semantic embedding vector. Based on the semantic embedding vector, a similarity search is performed on the second database indicated by the second storage index to obtain the second text information, which is the text information with the highest similarity to the request text information. The first text information and the second text information are merged to generate contextual text information.

[0005] In one feasible implementation, it further includes: Historical dialogue data is acquired, and a storage index is generated by indexing the historical dialogue data; the storage index includes a first storage index and a second storage index. The historical dialogue data is stored in the database corresponding to the storage index.

[0006] In one feasible implementation, it further includes: Obtain new dialogue data input by the user, and index the new dialogue data to generate a storage index for the new dialogue data; Add the storage index of the newly added dialogue data to the first storage index and the second storage index; The newly added dialogue data is added to the database corresponding to the storage index.

[0007] In one feasible implementation, the step of acquiring historical dialogue data and indexing the historical dialogue data to construct a storage index for the historical dialogue data includes: Acquire historical dialogue data, and generate a first storage index based on the session identifier information, round information, and timestamp information of the historical dialogue data; The text information of the historical dialogue data is concatenated to generate a text string; Based on the semantic encoding model, the text string is converted into a semantic vector, and a second storage index is generated according to the semantic vector, the session identifier information, round information, and timestamp information of the historical dialogue data.

[0008] In one feasible implementation, retrieving the first text information from the database indicated by the first storage index based on the session identifier information and timestamp information includes: determining the target session in the first storage index based on the session identifier information; based on the target session, retrieving from the database indicated by the first storage index at least two rounds of text information closest in time to the timestamp, each round of text information including user query information and agent response information; and determining the at least two rounds of text information as the first text information.

[0009] In one feasible implementation, the step of converting the request text information into a semantic embedding vector, and performing a similarity search on the database indicated by the second storage index based on the semantic embedding vector to obtain the second text information, includes: inputting the request text information into a pre-trained semantic encoding model, the semantic encoding model outputting a semantic embedding vector corresponding to the request text information, the semantic embedding vector including multiple word vectors; obtaining the target word vectors of a target vector in the second database, the target vector including multiple target word vectors; calculating the vector angle between each word vector and the corresponding target word vector in sequence, and obtaining the sum of the vector angles; and performing similarity matching based on the sum of the vector angles, wherein, when the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text information. Alternatively, the semantic embedding vector is matched against the target vectors in the second database using cosine similarity, and when the cosine similarity between the semantic embedding vector and a target vector in the second database is greater than a similarity threshold, the text information corresponding to that target vector is determined as the second text information.
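The first matching option above (pairing word vectors and summing their angles) can be sketched as follows. This is only an illustrative sketch: the function names, the pairing of word vectors by position, and the handling of zero vectors are assumptions, not details given in the application. It computes the sum only; how that sum is compared with the threshold follows the text above.

```python
import math

def angle(u, v):
    """Angle in radians between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0.0:
        # Assumption: a zero vector carries no direction, so treat it as orthogonal.
        return math.pi / 2
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def angle_sum(query_words, target_words):
    """Sum of the angles between positionally paired word vectors."""
    return sum(angle(u, v) for u, v in zip(query_words, target_words))

# Three word vectors per side, as in the three-vector example of the description.
query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
total = angle_sum(query, target)  # only the second pair differs (by a right angle)
```

Geometrically, a sum of zero means every pair of word vectors points in the same direction, and the sum grows as the pairs diverge.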

[0010] In one feasible implementation, it further includes: When the cosine similarity between the semantic embedding vector and the target vector in the second database is less than or equal to the similarity threshold, a matching failure message is sent to the user. In response to a follow-up request based on the prompt information, the similarity threshold is updated to generate a new similarity threshold; The semantic embedding vector is matched with the target vector in the second database using cosine similarity. When the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than a new similarity threshold, the text information corresponding to the target vector is determined as the second text.
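The fallback described above can be sketched as follows. The application only says that the threshold is "updated" after a follow-up request; lowering it by a fixed step, the `step` value, the message wording, and the `user_wants_retry` callback are all assumptions made for illustration.

```python
def best_match(scores, threshold):
    """Return the (text, score) pair with the highest score above threshold, or None."""
    above = [(text, s) for text, s in scores.items() if s > threshold]
    return max(above, key=lambda pair: pair[1]) if above else None

def match_with_follow_up(scores, threshold=0.7, step=0.1,
                         user_wants_retry=lambda message: True):
    """Try once; on failure, notify the user and retry with an updated threshold."""
    hit = best_match(scores, threshold)
    if hit is not None:
        return hit, threshold
    # Matching failed: send the failure prompt and wait for a follow-up request.
    if not user_wants_retry("No sufficiently similar history was found. Search more broadly?"):
        return None, threshold
    # Assumption: "updating" the threshold means relaxing it by a fixed step.
    new_threshold = max(0.0, threshold - step)
    return best_match(scores, new_threshold), new_threshold

# Hypothetical cosine similarities between the request and stored rounds.
scores = {"speaker battery life": 0.65, "return policy": 0.2}
hit, used_threshold = match_with_follow_up(scores)
```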

[0011] In one feasible implementation, the step of fusing the first text information and the second text information to generate contextual text information includes: Both the first and second text information include at least one round of text information; At least one round of text information in the first text information and the second text information is sorted according to the round information, and each round of text information in the first text information and the second text information carries round information; The sorted text information from at least one round is deduplicated to generate context text information.

[0012] One embodiment of this application provides a hybrid storage and semantic retrieval device, which may include: A session response unit is used to respond to a session request instruction, wherein the session request instruction carries request text information, session identifier information, and timestamp information; The first text acquisition unit is used to acquire first text information from the first database indicated by the first storage index according to the session identifier information and timestamp information; The second text acquisition unit is used to convert the request text information into a semantic embedding vector, perform a similarity search on the second database indicated by the second storage index according to the semantic embedding vector, and acquire the second text information, wherein the second text information is the text information with the highest similarity to the request text information; The text fusion unit is used to fuse the first text information and the second text information to generate context text information.

[0013] In one feasible implementation, it further includes: A data storage unit is used to acquire historical dialogue data, index the historical dialogue data, and construct a storage index for the historical dialogue data; the storage index includes a first storage index and a second storage index. The historical dialogue data is stored in the database corresponding to the storage index.

[0014] In one feasible implementation, the data storage unit is further used for: Obtain new dialogue data input by the user, and index the new dialogue data to generate a storage index for the new dialogue data; Add the storage index of the newly added dialogue data to the first storage index and the second storage index; The newly added dialogue data is added to the database corresponding to the storage index.

[0015] In one feasible implementation, the data storage unit is further used for: Acquire historical dialogue data, and generate a first storage index based on the session identifier information, round information, and timestamp information of the historical dialogue data; The text information of the historical dialogue data is concatenated to generate a text string; Based on the semantic encoding model, the text string is converted into a semantic vector, and a second storage index is generated according to the semantic vector, the session identifier information, round information, and timestamp information of the historical dialogue data.

[0016] In one feasible implementation, the first text acquisition unit is used to: determine the target session in the first storage index based on the session identifier information; based on the target session, retrieve from the database indicated by the first storage index at least two rounds of text information closest in time to the timestamp, each round of text information including user query information and agent response information; and determine the at least two rounds of text information as the first text information.

[0017] In one feasible implementation, the second text acquisition unit is used to: input the request text information into a pre-trained semantic encoding model, the semantic encoding model outputting a semantic embedding vector corresponding to the request text information, the semantic embedding vector including multiple word vectors; obtain the target word vectors of a target vector in the second database, the target vector including multiple target word vectors; calculate the vector angle between each word vector and the corresponding target word vector in sequence, and obtain the sum of the vector angles; and perform similarity matching based on the sum of the vector angles, wherein, when the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text information. Alternatively, the semantic embedding vector is matched against the target vectors in the second database using cosine similarity, and when the cosine similarity between the semantic embedding vector and a target vector in the second database is greater than a similarity threshold, the text information corresponding to that target vector is determined as the second text information.

[0018] In one feasible implementation, the second text acquisition unit is further configured to: When the cosine similarity between the semantic embedding vector and the target vector in the second database is less than or equal to the similarity threshold, a matching failure message is sent to the user. In response to a follow-up request based on the prompt information, the similarity threshold is updated to generate a new similarity threshold; The semantic embedding vector is matched with the target vector in the second database using cosine similarity. When the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than a new similarity threshold, the text information corresponding to the target vector is determined as the second text.

[0019] In one feasible implementation, the text fusion unit is used for: Both the first and second text information include at least one round of text information; At least one round of text information in the first text information and the second text information is sorted according to the round information, and each round of text information in the first text information and the second text information carries round information; The sorted text information from at least one round is deduplicated to generate context text information.

[0020] One embodiment of this application provides a computer-readable storage medium storing a computer program adapted to be loaded by a processor and executed by the above-described method steps.

[0021] One embodiment of this application provides a computer device, including: a processor, a memory, and a network interface; the processor is connected to the memory and the network interface, wherein the network interface is used to provide network communication functions, the memory is used to store program code, and the processor is used to call the program code to execute the above-described method steps.

[0022] One embodiment of this application provides a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the method steps described above.

[0023] In this embodiment, by responding to a session request instruction, which carries request text information, session identifier information, and timestamp information, first text information is further obtained from a first database indicated by a first storage index based on the session identifier information and timestamp information. The request text information is converted into a semantic embedding vector, and a similarity search is performed on a second database indicated by a second storage index based on the semantic embedding vector to obtain second text information, which is the text information with the highest similarity to the request text information. Finally, the first text information and the second text information are fused to generate context text information. This solution obtains text information from databases with two storage modes and fuses the text information, ensuring temporal continuity and quickly recalling semantically highly relevant historical context information, thereby improving the accuracy and efficiency of the agent's understanding in multi-turn dialogues.

Description of the Drawings

[0024] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0025] Figure 1 is a network architecture diagram based on hybrid storage and semantic retrieval provided in an embodiment of this application; Figure 2 is a flowchart of a hybrid storage and semantic retrieval method provided in an embodiment of this application; Figure 3 is a flowchart of a hybrid storage and semantic retrieval method provided in an embodiment of this application; Figure 4 is a schematic diagram of a hybrid storage and semantic retrieval device provided in an embodiment of this application; Figure 5 is a schematic diagram of the structure of a computer device provided in an embodiment of this application.

Detailed Implementation

[0026] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0027] Please refer to Figure 1, which is a network architecture diagram based on hybrid storage and semantic retrieval provided in an embodiment of this application. The network architecture may include a service server 100 and a user terminal cluster. The user terminal cluster may include user terminal 10a, user terminal 10b, ..., user terminal 10c. Communication connections may exist between user terminals in the cluster; for example, there is a communication connection between user terminal 10a and user terminal 10b, and a communication connection between user terminal 10b and user terminal 10c. Furthermore, any user terminal in the user terminal cluster may have a communication connection with the service server 100; for example, there is a communication connection between user terminal 10a and service server 100, and a communication connection between user terminal 10b and service server 100.

[0028] Each user terminal in the aforementioned user terminal cluster (including user terminal 10a, user terminal 10b, and user terminal 10c) can have the target application installed. Optionally, the target application may include an application with the function of displaying data information such as text, images, and videos.

[0029] Database 10d and Database 10e are different types of databases. Specifically, Database 10d can be a relational database, while Database 10e can be a vector database. Database 10d and Database 10e each store different types of data to accommodate different query needs in the system design. User terminals can query relevant context information through session request commands.

[0030] Optionally, the aforementioned user terminal can be any user terminal selected from the user terminal cluster of the embodiment corresponding to Figure 1; for example, the user terminal can be the aforementioned user terminal 10b.

[0031] It is understood that the methods provided in the embodiments of this application can be executed by computer devices, including but not limited to terminals or servers. The service server 100 in this embodiment can be a computer device, and the user terminals in the user terminal cluster can also be computer devices; this is not limited here. The service server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminals can include, but are not limited to, smartphones, tablets, laptops, desktop computers, smart TVs, smart speakers, smartwatches, and other smart terminals with image recognition functions. The user terminals and the service server can be connected directly or indirectly via wired or wireless communication; this application does not restrict the connection method.

[0032] Furthermore, for ease of understanding, please refer to Figure 2, which is a flowchart of the hybrid storage and semantic retrieval method provided in an embodiment of this application. The method can be executed by a user terminal (e.g., a user terminal shown in Figure 1), or jointly by the user terminal and a service server (e.g., the service server 100 in the embodiment corresponding to Figure 1). For ease of understanding, this embodiment takes execution by the user terminal as an example. The method based on hybrid storage and semantic retrieval may include at least the following steps S101-S104: S101, respond to the session request instruction, the session request instruction carrying request text information, session identifier information and timestamp information; Specifically, the user terminal responds to a session request instruction carrying request text information, session identifier information, and timestamp information. This session request instruction is used to retrieve context information related to the request text information from the database. The request text information is the specific text content used to match data in the database. The session identifier information identifies different users and can be a user's session ID. The timestamp information identifies the time attribute of the data, specifically the time when the user sent a message (text, image, video, etc.) or the time when the intelligent agent replied to the user's message. The intelligent agent is an agent with autonomy, adaptability, and interactive capabilities; it can be software, hardware, or a system, such as an automation platform or terminal. The session request instruction can be a request initiated by the user through the terminal or an API, or it can be a task instruction automatically triggered by the user terminal according to a preset strategy.

[0033] S102, retrieve the first text information from the first database indicated by the first storage index according to the session identifier information and timestamp information; Specifically, the user terminal retrieves the first text information from the first database indicated by the first storage index based on the session identifier information and timestamp information. It can be understood that the first database is a relational database, and the first storage index is the storage index corresponding to the first database, specifically a time-series storage index of the relational database. Relevant data can be found in the database based on the storage index. Specifically, the user terminal retrieves the most recent N rounds of original dialogue text from the first database based on the session ID in the session request instruction and the first storage index. These N rounds of original dialogue text constitute the first text information. Each round of dialogue includes user query information and agent response information. For example, in round 51, the user asks, "How is the battery life of that waterproof Bluetooth speaker I looked at before?", and the agent replies, "The Bluetooth speaker has a battery life of 5 hours." The user query and agent response constitute one round of information. It should be noted that the number of dialogue rounds retrieved can be set according to specific circumstances; for example, it can be set to the most recent 10 rounds. Retrieving the most recent N rounds of original dialogue text ensures the basic coherence and freshness of the context.
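A minimal sketch of step S102, assuming the relational store is SQLite and the table is laid out as below. The table name `dialogue`, its column names, and the function `recent_rounds` are hypothetical; the application does not name a database product or schema.

```python
import sqlite3

def recent_rounds(conn, session_id, before_ts, n=10):
    """Return the n rounds of a session closest to (and not after) before_ts, oldest first."""
    rows = conn.execute(
        """
        SELECT round_no, user_query, agent_reply, ts
        FROM dialogue
        WHERE session_id = ? AND ts <= ?
        ORDER BY round_no DESC
        LIMIT ?
        """,
        (session_id, before_ts, n),
    ).fetchall()
    return rows[::-1]  # restore chronological order for the context window

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dialogue (
        session_id  TEXT    NOT NULL,
        round_no    INTEGER NOT NULL,
        user_query  TEXT    NOT NULL,
        agent_reply TEXT    NOT NULL,
        ts          REAL    NOT NULL,
        PRIMARY KEY (session_id, round_no)  -- plays the role of the time-series index
    )
    """
)
conn.executemany(
    "INSERT INTO dialogue VALUES (?, ?, ?, ?, ?)",
    [("s1", k, f"question {k}", f"answer {k}", 1000.0 + k) for k in range(1, 6)],
)
latest = recent_rounds(conn, "s1", before_ts=1004.5, n=2)
```

Each returned row holds one round, that is, the user query together with the agent response.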

[0034] S103, the request text information is converted into a semantic embedding vector, and a similarity search is performed on the second database indicated by the second storage index according to the semantic embedding vector to obtain the second text information, wherein the second text information is the text information with the highest similarity to the request text information; Specifically, the semantic embedding vector includes multiple word vectors, and the user terminal obtains the target word vector of the target vector in the second database; the target vector includes multiple target word vectors; further, the vector angle between each word vector and the target word vector is calculated in sequence, and the sum of the vector angles is obtained; for example, both the semantic embedding vector and the target vector include three word vectors, namely the first word vector, the second word vector, and the third word vector, respectively. The first vector angle between the first word vectors, the second vector angle between the second word vectors, and the third vector angle between the third word vectors are calculated respectively, and the three vector angles are summed to obtain the sum of the vector angles. Further, similarity matching is performed based on the sum of the vector angles. When the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text. Alternatively, cosine similarity can be used to obtain the second text. Specifically, the user terminal converts the request text information into a semantic embedding vector, and performs a similarity search on the second database indicated by the second storage index based on the semantic embedding vector to obtain the second text information. It is understood that the second database is a vector database, and the second storage index is the storage index corresponding to the second database. 
Specifically, it can be the semantic vector storage index of the vector database, and relevant data can be found in the database based on the storage index. Specifically, the user terminal converts the request text information in the session request instruction into a semantic embedding vector. Based on the second storage index, a similarity search is performed on the second database for the same session ID to find the most similar text information. Similar text information can include multi-turn information, for example, the three most similar turns. The similarity search can use cosine similarity, and a configurable similarity threshold can be set, for example, the default similarity threshold is 0.7. Results below the threshold are not recalled to prevent forced recall of semantically irrelevant content. Through the above steps, temporal barriers can be overcome to recall key conversations that are semantically highly relevant to the current issue but may have occurred in an earlier period.
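The cosine-similarity variant of step S103 can be sketched as a brute-force scan. This assumes the embeddings have already been produced by some encoder, and it stands in for a real vector database with a plain list of records; the record fields and function names are hypothetical. The 0.7 threshold and the three returned rounds come from the examples in the text above.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_search(query_vec, records, session_id, threshold=0.7, top_k=3):
    """Return up to top_k (score, record) pairs from one session, best first."""
    scored = [
        (cosine(query_vec, rec["vector"]), rec)
        for rec in records
        if rec["session_id"] == session_id  # search is scoped to the same session ID
    ]
    # Results at or below the threshold are not recalled at all.
    hits = [(score, rec) for score, rec in scored if score > threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]

# Toy three-dimensional embeddings standing in for the encoder's output.
records = [
    {"session_id": "s1", "round_no": 3, "vector": [1.0, 0.0, 0.0], "text": "waterproof Bluetooth speaker"},
    {"session_id": "s1", "round_no": 7, "vector": [0.0, 1.0, 0.0], "text": "return policy"},
    {"session_id": "s2", "round_no": 1, "vector": [1.0, 0.0, 0.0], "text": "another session"},
]
hits = semantic_search([0.9, 0.1, 0.0], records, "s1")
```

A production vector database would typically replace the linear scan with an approximate nearest-neighbour index, but the threshold and session filter apply in the same way.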

[0035] It should be noted that steps S102 and S103 can be executed individually or in parallel; this scheme does not impose any restrictions.
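One way to run the two lookups in parallel is a small thread pool, as sketched below. The function `gather_context` and the two callables passed to it are hypothetical placeholders for the S102 and S103 queries; the application does not prescribe a concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

def gather_context(fetch_recent, fetch_semantic):
    """Run the S102 and S103 lookups concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        recent_future = pool.submit(fetch_recent)
        semantic_future = pool.submit(fetch_semantic)
        # result() blocks until each lookup finishes and re-raises any exception.
        return recent_future.result(), semantic_future.result()

# Stand-in callables; real ones would query the relational and vector databases.
first_text, second_text = gather_context(
    lambda: ["round 50", "round 51"],
    lambda: ["round 3"],
)
```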

[0036] S104, the first text information and the second text information are fused to generate contextual text information.

[0037] Specifically, the user terminal merges the first and second text information to generate contextual text information. This means the first and second text information include multiple rounds of text dialogue. The user terminal merges the first and second text information in terms of both time sequence and content. Specifically, duplicate round information can be deleted, and the contextual text information can be generated by sorting according to timestamp information. During the time sorting process, sorting can also be done by round number. For example, if the round number and timestamp are inconsistent, the round number takes precedence because the round number is atomically and sequentially allocated by the server during writing, accurately reflecting the logical order of the dialogue. Timestamps, however, are affected by system clock deviations and may become out of order in concurrent write scenarios. When concurrent writes occur within the same session, the server ensures the seriality of round number allocation through database row-level locks or atomic increment operations, preventing sorting conflicts. Through these steps, a coherent final historical contextual text sequence that integrates temporal proximity and semantic relevance can be formed.
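The fusion in step S104 can be sketched as below, assuming each round is a dictionary that carries its round number and that all rounds belong to the same session. The field names and the output format are hypothetical. Ordering uses the round number rather than the timestamp, following the precedence rule described above.

```python
def fuse(first_text, second_text):
    """Merge two lists of rounds: deduplicate by round number, then order by it."""
    by_round = {}
    for rnd in [*second_text, *first_text]:
        by_round[rnd["round_no"]] = rnd  # a repeated round is kept only once
    ordered = [by_round[k] for k in sorted(by_round)]
    lines = []
    for rnd in ordered:
        lines.append(f"[round {rnd['round_no']}] user: {rnd['user']}")
        lines.append(f"[round {rnd['round_no']}] agent: {rnd['agent']}")
    return ordered, "\n".join(lines)

recent = [
    {"round_no": 50, "ts": 2000.2, "user": "Any discounts?", "agent": "10% off today."},
    # Clock skew: round 51 has an earlier timestamp than round 50.
    {"round_no": 51, "ts": 2000.1, "user": "Battery life?", "agent": "5 hours."},
]
semantic = [
    {"round_no": 3, "ts": 1000.0, "user": "Show waterproof speakers.", "agent": "Here is one."},
    {"round_no": 50, "ts": 2000.2, "user": "Any discounts?", "agent": "10% off today."},
]
ordered, context = fuse(recent, semantic)
```

In the example, round 50 appears in both inputs and survives once, and round 51 stays after round 50 even though its timestamp is slightly earlier.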

[0038] In this embodiment, by responding to a session request instruction, which carries request text information, session identifier information, and timestamp information, first text information is further obtained from a first database indicated by a first storage index based on the session identifier information and timestamp information. The request text information is converted into a semantic embedding vector, and a similarity search is performed on a second database indicated by a second storage index based on the semantic embedding vector to obtain second text information, which is the text information with the highest similarity to the request text information. Finally, the first text information and the second text information are fused to generate context text information. This solution obtains text information from databases with two storage modes and fuses the text information, ensuring temporal continuity and quickly recalling semantically highly relevant historical context information, thereby improving the accuracy and efficiency of the agent's understanding in multi-turn dialogues.

[0039] Please see Figure 3. Figure 3 is a flowchart illustrating the hybrid storage and semantic retrieval method provided in an embodiment of this application. This method can be executed by a user terminal (e.g., the user terminal shown in Figure 1 above), or jointly by the user terminal and a service server (e.g., the service server 100 in the embodiment corresponding to Figure 1 above). For ease of understanding, this embodiment uses execution by the aforementioned user terminal as an example. The method based on hybrid storage and semantic retrieval may include at least the following steps S201-S207: S201, obtain historical dialogue data, and index the historical dialogue data to construct a storage index for the historical dialogue data; the storage index includes a first storage index and a second storage index; store the historical dialogue data in the database corresponding to each storage index.

[0040] Specifically, the user terminal acquires historical dialogue data and indexes it to generate a storage index. This storage index includes a first storage index and a second storage index. The historical dialogue data is then stored in the database corresponding to each storage index. Essentially, the user terminal acquires historical dialogue data and generates the first storage index based on the session identifier, round information, and timestamp information. The session identifier can be a session ID, and the round information is a round number generated chronologically. Each round of historical dialogue data includes fields such as session ID, round number, user query text, agent response text, and timestamp. The round number is monotonically increased from 1 within the same session ID, following the order of dialogue occurrence (i.e., the round number of the kth round within the same session is k). This round number is atomically allocated by the server upon receiving a dialogue storage request, ensuring that the round number is unique and ordered within the same session. This storage provides efficient and accurate range queries and sorting by session ID and round number, ensuring rapid retrieval of dialogue streams arranged chronologically.
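
As a rough illustration of the first storage index, the sketch below uses SQLite from the Python standard library. The table name `dialogue_rounds`, its column names, and the helper functions are assumptions made for this example and do not come from the application; a server-side database would rely on row-level locks or atomic increments as described above, whereas this sketch computes the next round number inside a single INSERT statement.

```python
import sqlite3
import time


def create_schema(conn):
    # One row per dialogue round; (session_id, round_no) is unique and ordered.
    conn.execute(
        """
        CREATE TABLE dialogue_rounds (
            session_id  TEXT    NOT NULL,
            round_no    INTEGER NOT NULL,
            user_query  TEXT    NOT NULL,
            agent_reply TEXT    NOT NULL,
            ts          REAL    NOT NULL,
            PRIMARY KEY (session_id, round_no)
        )
        """
    )


def append_round(conn, session_id, user_query, agent_reply, ts=None):
    """Store one round and return the round number assigned to it."""
    ts = time.time() if ts is None else ts
    with conn:  # commit on success, roll back on error
        # The next round number is computed and inserted in one statement.
        conn.execute(
            "INSERT INTO dialogue_rounds "
            "SELECT ?, COALESCE(MAX(round_no), 0) + 1, ?, ?, ? "
            "FROM dialogue_rounds WHERE session_id = ?",
            (session_id, user_query, agent_reply, ts, session_id),
        )
    row = conn.execute(
        "SELECT MAX(round_no) FROM dialogue_rounds WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return row[0]


def rounds_in_range(conn, session_id, lo, hi):
    """Range query by session ID and round number, in dialogue order."""
    return conn.execute(
        "SELECT round_no, user_query, agent_reply FROM dialogue_rounds "
        "WHERE session_id = ? AND round_no BETWEEN ? AND ? ORDER BY round_no",
        (session_id, lo, hi),
    ).fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
for i in range(3):
    append_round(conn, "s1", f"question {i}", f"answer {i}")
print(rounds_in_range(conn, "s1", 2, 3))
```

The composite primary key doubles as the index that serves both the range query and the ordering, so no separate index is needed in this sketch.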

[0041] Furthermore, the user terminal concatenates the text information of the historical dialogue data to generate a text string. Based on a semantic encoding model, the text string is converted into a semantic vector. A second storage index is generated based on the semantic vector, the session identifier information, round information, and timestamp information of the historical dialogue data. It is understood that the semantic encoding model is a pre-trained encoding model, such as Sentence-BERT or text-embedding-ada-002. This type of model can map natural language text to dense vectors in the semantic space, making semantically similar texts close in distance in the vector space. Specifically, the text content of each round of dialogue is concatenated to generate a high-dimensional semantic embedding vector. The text processing method adopts unified encoding after concatenation. The user query text and agent reply text of the same round are concatenated into a single string in the format "User query: [text] Agent reply: [text]", input into the semantic encoding model, and output a vector representing the complete semantics of that round of dialogue. The above method can simultaneously capture the semantic information of user intent and agent response, and further store the generated vector along with metadata such as session ID and round number into the vector database.
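
The concatenate-then-encode step can be sketched as follows. The `toy_encode` function is only a stand-in (a hashed bag of words) for a real pre-trained semantic encoding model such as those named above; it, the record layout, and the function names are assumptions for illustration and do not capture real semantics.

```python
import hashlib
import math


def round_to_text(user_query, agent_reply):
    # Concatenation format taken from the description above.
    return f"User query: {user_query} Agent reply: {agent_reply}"


def toy_encode(text, dim=8):
    """Placeholder encoder: hashed bag of words, L2-normalised. Not a semantic model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_vector_record(session_id, round_no, ts, user_query, agent_reply):
    """One entry of the second storage index: a vector plus its metadata."""
    text = round_to_text(user_query, agent_reply)
    return {
        "vector": toy_encode(text),
        "session_id": session_id,
        "round_no": round_no,
        "timestamp": ts,
        "text": text,
    }


record = build_vector_record(
    "s1", 12, 0.0, "recommend some waterproof Bluetooth speakers", "Here are a few options"
)
print(record["text"])
```

Keeping the session ID and round number next to the vector is what later allows a semantic hit to be mapped back to a specific round during fusion.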

[0042] Furthermore, when adding a new dialogue record, an index can be written simultaneously to both databases, with the two processes executing concurrently in the writing sequence without blocking each other. During queries, the request is routed to the corresponding database for independent querying based on the request type. Specifically, the new dialogue data input by the user is obtained, an index is built on the new dialogue data to generate a storage index, and this storage index is then added to both the first and second storage indexes. Finally, the new dialogue data is added to the database corresponding to the storage index. These steps complete the database update.
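
One possible way to issue the two writes concurrently is sketched below with a thread pool. The in-memory lists stand in for the relational and vector databases, and all function names are hypothetical; the placeholder vector is not a real embedding.

```python
from concurrent.futures import ThreadPoolExecutor

relational_store = []  # stand-in for the first (relational) database
vector_store = []      # stand-in for the second (vector) database


def write_relational(record):
    relational_store.append(record)
    return "relational"


def write_vector(record):
    # Placeholder vector; a real system would store the semantic embedding here.
    vector_store.append({"round_no": record["round_no"], "vector": [float(len(record["text"]))]})
    return "vector"


def dual_write(record):
    """Submit both writes at once so that neither blocks the other."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(write_relational, record),
            pool.submit(write_vector, record),
        ]
        # result() re-raises any exception from the corresponding write.
        return [f.result() for f in futures]


print(dual_write({"round_no": 1, "text": "User query: hi Agent reply: hello"}))
```

This sketch does not handle the case where one write succeeds and the other fails; how such partial failures are reconciled is not specified in the text above.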

[0043] S202, responding to a session request instruction, wherein the session request instruction carries request text information, session identifier information, and timestamp information; S203, retrieve the first text information from the database indicated by the first storage index according to the session identifier information and timestamp information; Specifically, the user terminal retrieves the first text information from the database indicated by the first storage index based on the session identifier information and the timestamp information. It can be understood that the user terminal determines the session target in the first storage index based on the session identifier information. The first storage index includes information on multiple session targets. The user's corresponding session target can be determined based on the session ID. Further, based on the session target, the user terminal queries the database indicated by the first storage index for at least two rounds of text information most recent to the timestamp. Each round of text information includes user query information and agent response information. It should be noted that the number of dialogue rounds retrieved can be set according to specific circumstances; for example, it can be set to the most recent 10 rounds. Further, the at least two rounds of text information are determined as the first text information.
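
The recent-rounds lookup in step S203 can be sketched in plain Python. The row layout (dictionaries with `session_id`, `round_no`, and `timestamp` keys) and the function name are assumptions for illustration; in practice this would be a query against the relational database.

```python
def recent_rounds(rows, session_id, request_ts, n=10):
    """Return up to n rounds of one session closest before request_ts, in dialogue order."""
    if n < 2:
        raise ValueError("the method retrieves at least two rounds")
    # Select the session target, then keep only rounds written before the request.
    candidates = [
        r for r in rows
        if r["session_id"] == session_id and r["timestamp"] <= request_ts
    ]
    # Take the n latest rounds, then restore ascending dialogue order.
    candidates.sort(key=lambda r: r["round_no"], reverse=True)
    return sorted(candidates[:n], key=lambda r: r["round_no"])


rows = [{"session_id": "s1", "round_no": i, "timestamp": float(i)} for i in range(1, 51)]
first_text = recent_rounds(rows, "s1", request_ts=50.0, n=10)
print([r["round_no"] for r in first_text])
```

The default of 10 rounds mirrors the example value given above and would be tuned per deployment.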

[0044] S204, the query text information is converted into a semantic embedding vector, and a similarity search is performed on the database indicated by the second storage index based on the semantic embedding vector to obtain the second text information. Specifically, the user terminal inputs the query text information into a pre-trained semantic encoding model, which outputs a semantic embedding vector corresponding to the query text information. The semantic embedding vector includes multiple word vectors. The user terminal obtains the target word vectors of a target vector in the second database; the target vector includes multiple target word vectors. The vector angle between each word vector and the corresponding target word vector is then calculated in sequence, and the vector angles are summed. For example, if both the semantic embedding vector and the target vector include three word vectors (a first, a second, and a third word vector), the first vector angle between the two first word vectors, the second vector angle between the two second word vectors, and the third vector angle between the two third word vectors are calculated, and the three vector angles are summed to obtain the sum of the vector angles. Similarity matching is then performed based on the sum of the vector angles: when the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text. Alternatively, cosine similarity can be used to obtain the second text. In that case, the user terminal inputs the query text information into a pre-trained semantic encoding model, the semantic encoding model outputs a semantic embedding vector corresponding to the query text information, and cosine similarity matching is performed between the semantic embedding vector and the target vector in the second database. When the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than a similarity threshold, the text information corresponding to the target vector is determined as the second text. For example, the default similarity threshold is 0.7; if 3 rounds have a similarity greater than 0.7, the information from those 3 rounds is determined as the second text. When the cosine similarity between the semantic embedding vector and the target vector in the second database is less than or equal to the similarity threshold, a matching failure message is sent to the user. Furthermore, if the user makes another request based on the previous session request instruction, the terminal can perform another match with a lowered threshold. Specifically, in response to the request instruction based on the prompt message, the similarity threshold is updated to generate a new similarity threshold. The update is usually a reduction; for example, if the previous threshold was 0.7, the updated threshold is 0.65. The semantic embedding vector is then matched again with the target vector in the second database using cosine similarity. When the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than the new similarity threshold, the text information corresponding to the target vector is determined as the second text. Similarly, the second text can contain multiple rounds of information; if 3 rounds have a similarity greater than 0.65, those 3 rounds are determined as the second text.
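
The cosine-similarity branch and the lowered-threshold retry can be sketched as follows. This is a brute-force scan over an in-memory list, not a vector database query, and the function names and the step size parameter are assumptions. In the text, the retry is triggered by the user's follow-up request after the failure message; the sketch collapses that interaction into a single call.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query_vec, records, threshold=0.7):
    """Return every record whose cosine similarity exceeds the threshold, best first."""
    scored = [(cosine(query_vec, r["vector"]), r) for r in records]
    hits = [(s, r) for s, r in scored if s > threshold]
    hits.sort(key=lambda h: h[0], reverse=True)
    return [r for _, r in hits]


def search_with_retry(query_vec, records, threshold=0.7, step=0.05):
    """Search once; if nothing matches, search again with a lowered threshold."""
    hits = search(query_vec, records, threshold)
    if hits:
        return hits, threshold
    # Stands in for: send failure message, user asks again, threshold is reduced.
    lowered = round(threshold - step, 10)
    return search(query_vec, records, lowered), lowered


records = [
    {"round_no": 12, "vector": [1.0, 0.0]},
    {"round_no": 30, "vector": [0.0, 1.0]},
]
hits, used = search_with_retry([0.9, 0.1], records)
print([r["round_no"] for r in hits], used)
```

Returning the threshold that was actually used lets the caller tell a first-pass match from a relaxed one.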

[0045] S205, the first text information and the second text information are fused to generate context text information. Specifically, both the first and second text information include at least one round of text information, and each round carries round information. The user terminal sorts the rounds of text information in the first and second text information according to the round information, and then deduplicates the sorted rounds to generate the context text information. For example, suppose the database stores 50 rounds of dialogue. When a user asks, "How is the battery life of that waterproof Bluetooth speaker I looked at before?", the user terminal first retrieves the most recent 6 rounds of dialogue from the relational database, i.e., the dialogue text from rounds 45 to 50. It then searches the vector database and finds that the 12th round of dialogue (where the user asked "recommend some waterproof Bluetooth speakers") is the most semantically similar, so the 12th round is also retrieved. Merging the results from the two databases gives the dialogue text of rounds [45, 46, 47, 48, 49, 50, 12]. After deduplication and sorting by time, the context text consists of rounds [12, 45, 46, 47, 48, 49, 50]. This context text includes both the recent coherent conversation and the precise information from the earlier mention of "waterproof Bluetooth speakers," enhancing the agent's understanding accuracy in multi-turn dialogues.
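
The round numbers in this example can be reproduced with a short sketch. The `build_context` function and the placeholder round texts are hypothetical; only the round numbers come from the example above.

```python
def build_context(recent, semantic):
    """recent / semantic: lists of (round_no, text) tuples from the two databases."""
    by_round = {}
    for round_no, text in recent + semantic:
        # A round returned by both databases is kept only once.
        by_round.setdefault(round_no, text)
    return [(n, by_round[n]) for n in sorted(by_round)]


recent = [(n, f"round {n} text") for n in range(45, 51)]
semantic = [(12, "recommend some waterproof Bluetooth speakers")]
context = build_context(recent, semantic)
print([n for n, _ in context])  # [12, 45, 46, 47, 48, 49, 50]
```

If the semantic hit had fallen inside the recent window (for instance round 48), the dictionary would have absorbed the duplicate and the context would contain only the six recent rounds.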

[0046] For step S202 of this embodiment, refer to the specific description of step S101 in the embodiment shown in Figure 2, which will not be repeated here.

[0047] In this embodiment, by responding to a session request instruction, which carries request text information, session identifier information, and timestamp information, first text information is further obtained from a first database indicated by a first storage index based on the session identifier information and timestamp information. The request text information is converted into a semantic embedding vector, and a similarity search is performed on a second database indicated by a second storage index based on the semantic embedding vector to obtain second text information, which is the text information with the highest similarity to the request text information. Finally, the first text information and the second text information are fused to generate context text information. This solution obtains text information from databases with two storage modes and fuses the text information, ensuring temporal continuity and quickly recalling semantically highly relevant historical context information, thereby improving the accuracy and efficiency of the agent's understanding in multi-turn dialogues.

[0048] Please see Figure 4. Figure 4 is a schematic diagram of a hybrid storage and semantic retrieval device provided in an embodiment of this application. The hybrid storage and semantic retrieval device can be a computer program (including program code) running on a computer device; for example, the device is application software. This device can be used to execute the corresponding steps in the method provided in the embodiments of this application. As shown in Figure 4, the hybrid storage and semantic retrieval device 1 described in this embodiment may include: a session response unit 11, a first text acquisition unit 12, a second text acquisition unit 13, a text fusion unit 14, and a data storage unit 15.

[0049] Session response unit 11 is used to respond to session request instructions, wherein the session request instructions carry request text information, session identifier information and timestamp information; The first text acquisition unit 12 is used to acquire first text information from the first database indicated by the first storage index according to the session identifier information and timestamp information; The second text acquisition unit 13 is used to convert the request text information into a semantic embedding vector, perform a similarity search on the second database indicated by the second storage index according to the semantic embedding vector, and acquire the second text information, wherein the second text information is the text information with the highest similarity to the request text information. The text fusion unit 14 is used to fuse the first text information and the second text information to generate context text information.

[0050] In one feasible implementation, the device further includes a data storage unit 15, which is used to acquire historical dialogue data and index the historical dialogue data to construct a storage index for the historical dialogue data; the storage index includes a first storage index and a second storage index. The historical dialogue data is stored in the database corresponding to each storage index.

[0051] In one feasible implementation, the data storage unit is further used to: obtain new dialogue data input by the user, and index the new dialogue data to generate a storage index for the new dialogue data; add the storage index of the new dialogue data to the first storage index and the second storage index; and add the new dialogue data to the database corresponding to the storage index.

[0052] In one feasible implementation, the data storage unit is further used to: acquire historical dialogue data, and generate a first storage index based on the session identifier information, round information, and timestamp information of the historical dialogue data; concatenate the text information of the historical dialogue data to generate a text string; and convert the text string into a semantic vector based on the semantic encoding model, and generate a second storage index according to the semantic vector, the session identifier information, round information, and timestamp information of the historical dialogue data.

[0053] In one feasible implementation, the first text acquisition unit is used to: determine the session target in the first storage index based on the session identifier information; query, based on the session target, at least two rounds of text information most recent to the timestamp from the database indicated by the first storage index, each round of text information including user query information and agent response information; and determine the at least two rounds of text information as the first text information.

[0054] In one feasible implementation, the second text acquisition unit is used to: input the query text information into a pre-trained semantic encoding model, the semantic encoding model outputting a semantic embedding vector corresponding to the query text information, wherein the semantic embedding vector includes multiple word vectors; obtain the target word vectors of the target vector in the second database, wherein the target vector includes multiple target word vectors; calculate the vector angle between each word vector and the target word vector in sequence, and obtain the sum of the vector angles; and perform similarity matching based on the sum of the vector angles, wherein, when the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text. Alternatively, the semantic embedding vector is matched with the target vector in the second database using cosine similarity; when the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than a similarity threshold, the text information corresponding to the target vector is determined as the second text.

[0055] In one feasible implementation, the second text acquisition unit is further configured to: send a matching failure message to the user when the cosine similarity between the semantic embedding vector and the target vector in the second database is less than or equal to the similarity threshold; update the similarity threshold to generate a new similarity threshold in response to a follow-up request based on the prompt information; and match the semantic embedding vector with the target vector in the second database using cosine similarity, wherein, when the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than the new similarity threshold, the text information corresponding to the target vector is determined as the second text.

[0056] In one feasible implementation, the text fusion unit is used as follows: both the first and second text information include at least one round of text information, and each round of text information in the first text information and the second text information carries round information; the at least one round of text information in the first text information and the second text information is sorted according to the round information; and the sorted text information is deduplicated to generate context text information.

[0057] In this embodiment, by responding to a session request instruction, which carries request text information, session identifier information, and timestamp information, first text information is further obtained from a first database indicated by a first storage index based on the session identifier information and timestamp information. The request text information is converted into a semantic embedding vector, and a similarity search is performed on a second database indicated by a second storage index based on the semantic embedding vector to obtain second text information, which is the text information with the highest similarity to the request text information. Finally, the first text information and the second text information are fused to generate context text information. This solution obtains text information from databases with two storage modes and fuses the text information, ensuring temporal continuity and quickly recalling semantically highly relevant historical context information, thereby improving the accuracy and efficiency of the agent's understanding in multi-turn dialogues.

[0058] Please see Figure 5. Figure 5 is a schematic diagram of the structure of a computer device provided in an embodiment of this application. As shown in Figure 5, the computer device 1000 may include: at least one processor 1001, such as a CPU; at least one network interface 1004; a user interface 1003; a memory 1005; and at least one communication bus 1002. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen, and optionally, it may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the memory 1005 may also be at least one storage device located remotely from the aforementioned processor 1001. As shown in Figure 5, the memory 1005, which serves as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data processing application program.

[0059] In the computer device 1000 shown in Figure 5, the network interface 1004 provides network communication functions, the user interface 1003 is mainly used to provide an input interface for the user, and the processor 1001 can be used to call the data processing application stored in the memory 1005 to implement the hybrid storage and semantic retrieval method described in any of the embodiments corresponding to Figures 2-3 above, which will not be repeated here.

[0060] It should be understood that the computer device 1000 described in the embodiments of this application can execute the hybrid storage and semantic retrieval method described in any of the embodiments corresponding to Figures 2-3 above, and can also implement the apparatus described in the embodiment corresponding to Figure 4 above, which will not be repeated here. Furthermore, the beneficial effects of using the same method will also not be repeated.

[0061] Furthermore, it should be noted that this application embodiment also provides a computer-readable storage medium, which stores the computer program executed by the aforementioned hybrid storage and semantic retrieval device, and the computer program includes program instructions. When the processor executes the program instructions, it can execute the hybrid storage and semantic retrieval method described in any of the embodiments corresponding to Figures 2-3 above, which will not be repeated here. Furthermore, the beneficial effects of using the same method will also not be repeated. For technical details not disclosed in the computer-readable storage medium embodiments involved in this application, please refer to the description of the method embodiments of this application. As an example, program instructions can be deployed to execute on a single computing device, or on multiple computing devices located in one location, or on multiple computing devices distributed across multiple locations and interconnected via a communication network. These multiple computing devices distributed across multiple locations and interconnected via a communication network can constitute a blockchain system.

[0062] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The computer-readable storage medium can be an internal storage unit of the apparatus or device provided in any of the foregoing embodiments, such as a hard disk or memory of an electronic device. The computer-readable storage medium can also be an external storage device of the electronic device, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the electronic device. The computer-readable storage medium can also include magnetic disks, optical disks, read-only memory (ROM), or random access memory. Furthermore, the computer-readable storage medium can include both internal storage units and external storage devices of the electronic device. The computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

[0063] The terms "first," "second," etc., used in the claims, description, and drawings of this invention are used to distinguish different objects, not to describe a particular order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that comprises a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or apparatuses. References to "embodiment" herein mean that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of the invention. The presentation of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments. The term "and / or" as used in this specification and the appended claims refers to any combination of one or more of the associated listed items and all possible combinations, and includes such combinations.

[0064] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Those skilled in the art can implement the described functions using different methods for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0065] In the various embodiments of this application, the functional units can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0066] The above-disclosed embodiments are merely preferred embodiments of this application and should not be construed as limiting the scope of this application. Therefore, any equivalent variations made in accordance with the claims of this application shall still fall within the scope of this application.

Claims

1. A method based on hybrid storage and semantic retrieval, characterized in that it includes: responding to a session request instruction, wherein the session request instruction carries request text information, session identifier information, and timestamp information; retrieving first text information from a first database indicated by a first storage index based on the session identifier information and the timestamp information, which includes: determining a session target in the first storage index based on the session identifier information; querying, based on the session target, at least two rounds of text information most recent to the timestamp from the database indicated by the first storage index, each round of text information including user query information and agent response information; and determining the at least two rounds of text information as the first text information; converting the request text information into a semantic embedding vector, and performing a similarity search on a second database indicated by a second storage index based on the semantic embedding vector to obtain second text information, the second text information being the text information with the highest similarity to the request text information; and fusing the first text information and the second text information to generate contextual text information.

2. The method based on hybrid storage and semantic retrieval according to claim 1, characterized in that it further includes: obtaining historical dialogue data, and indexing the historical dialogue data to build a storage index for the historical dialogue data, wherein the storage index includes a first storage index and a second storage index; and storing the historical dialogue data in the database corresponding to the storage index.

3. The method based on hybrid storage and semantic retrieval according to claim 2, characterized in that the step of acquiring historical dialogue data and indexing the historical dialogue data to construct a storage index for the historical dialogue data includes: acquiring historical dialogue data, and generating a first storage index based on the session identifier information, round information, and timestamp information of the historical dialogue data; concatenating the text information of the historical dialogue data to generate a text string; and converting the text string into a semantic vector based on the semantic encoding model, and generating a second storage index according to the semantic vector, the session identifier information, round information, and timestamp information of the historical dialogue data.

4. The method based on hybrid storage and semantic retrieval according to claim 1, characterized in that the step of converting the query text information into a semantic embedding vector, and performing a similarity search on the database indicated by the second storage index based on the semantic embedding vector to obtain the second text information includes: inputting the query text information into a pre-trained semantic encoding model, the semantic encoding model outputting a semantic embedding vector corresponding to the query text information, wherein the semantic embedding vector includes multiple word vectors; obtaining the target word vectors of the target vector in the second database, wherein the target vector includes multiple target word vectors; calculating the vector angle between each word vector and the target word vector in sequence, and obtaining the sum of the vector angles; and performing similarity matching based on the sum of the vector angles, wherein, when the vector angle is greater than the vector angle threshold, the text information corresponding to the target word vector is determined as the second text; or, matching the semantic embedding vector with the target vector in the second database using cosine similarity, wherein, when the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than a similarity threshold, the text information corresponding to the target vector is determined as the second text.

5. The method based on hybrid storage and semantic retrieval according to claim 4, characterized in that it further includes: sending a matching failure message to the user when the cosine similarity between the semantic embedding vector and the target vector in the second database is less than or equal to the similarity threshold; updating the similarity threshold to generate a new similarity threshold in response to a follow-up request based on the prompt information; and matching the semantic embedding vector with the target vector in the second database using cosine similarity, wherein, when the cosine similarity between the semantic embedding vector and the target vector in the second database is greater than the new similarity threshold, the text information corresponding to the target vector is determined as the second text.

6. The method based on hybrid storage and semantic retrieval according to claim 1, characterized in that the step of fusing the first text information and the second text information to generate contextual text information includes: the first text information and the second text information each include at least one round of text information, and each round of text information carries round information; sorting the rounds of text information in the first text information and the second text information according to the round information; and deduplicating the sorted rounds of text information to generate the contextual text information.
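
A minimal sketch of the fusion step in claim 6, under stated assumptions: each round is modelled as a hypothetical `Turn` record carrying an integer round number, and two rounds count as duplicates when both the round number and the text are identical. The claim itself does not define the duplicate criterion.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Turn:
    """One round of dialogue text (hypothetical record layout)."""
    round_no: int  # the round information carried by each round
    text: str


def fuse_context(first: Iterable[Turn], second: Iterable[Turn]) -> List[Turn]:
    """Merge rounds from both sources, sort them by round number, then drop
    duplicates while keeping the first occurrence of each round."""
    merged = sorted(list(first) + list(second), key=lambda t: t.round_no)
    seen: Set[Tuple[int, str]] = set()
    fused: List[Turn] = []
    for turn in merged:
        key = (turn.round_no, turn.text)
        if key in seen:
            continue  # same round retrieved from both databases
        seen.add(key)
        fused.append(turn)
    return fused
```

Sorting before deduplication follows the order given in the claim; because the sort is stable, rounds from the first (relational) source precede identical rounds from the second source and are the ones retained.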

7. A hybrid storage and semantic retrieval device, characterized in that it comprises: a session response unit, configured to respond to a session request instruction, wherein the session request instruction carries request text information, session identifier information, and timestamp information; a first text acquisition unit, configured to acquire first text information from the first database indicated by the first storage index according to the session identifier information and the timestamp information; a second text acquisition unit, configured to convert the request text information into a semantic embedding vector, perform a similarity search on the second database indicated by the second storage index according to the semantic embedding vector, and acquire second text information, wherein the second text information is the text information with the highest similarity to the request text information; and a text fusion unit, configured to fuse the first text information and the second text information to generate contextual text information.

8. A computer device, characterized in that it comprises a processor, a memory, and a network interface, wherein the processor is connected to the memory and the network interface, the network interface is used to provide network communication functions, the memory is used to store program code, and the processor is used to call the program code to execute the method according to any one of claims 1-6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor and to execute the method according to any one of claims 1-6.