Response message generation method and apparatus, electronic device, and storage medium

By expanding the data types of prompts and optimizing word examples, and combining context, history, and professional knowledge to generate response messages, the mismatch problem in response of large language models is solved, improving interaction efficiency and accuracy.

CN117112755B · Active · Publication Date: 2026-06-19 · SHANGHAI MOBVOI INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI MOBVOI INFORMATION TECH CO LTD
Filing Date
2023-08-22
Publication Date
2026-06-19

Abstract

This disclosure provides a method for generating a response message, comprising: processing a question text to determine prompt data associated with the question text, wherein the prompt data includes contextual reference data, historical reference data, and related knowledge documents; optimizing the quantity of the prompt data based on a word example retrieval threshold of a question-answering model to form target prompt information; and calling a question-answering model to analyze the target prompt information and generate a response message corresponding to the question text. This disclosure also provides a response message generation apparatus, electronic device, and storage medium.

Description

Technical Field

[0001] This disclosure relates to the field of artificial intelligence technology, and in particular to a method, apparatus, electronic device, and storage medium for generating response messages.

Background Technology

[0002] Large Language Models (LLMs) are essentially dialogue models that can understand and generate human language, enabling everyday conversations with users and resolving various user problems.

[0003] However, current LLMs, when responding to user questions, only consider the preceding context of the dialogue, without incorporating domain-specific expertise or relevant content from previous conversations with the user. Therefore, when interacting with LLMs in related technologies, each round of dialogue requires providing rich question descriptions as prompts, which increases the user's interaction costs. If rich question descriptions are not provided, the responses typically differ significantly from the user's desired outcome, failing to achieve the purpose of human-computer interaction and reducing the user's overall experience.

Summary of the Invention

[0004] To address at least one of the problems described above, this disclosure provides a method, apparatus, electronic device, and storage medium for generating response messages.

[0005] According to one aspect of this disclosure, a method for generating a response message is provided, comprising: processing a question text to determine prompt data associated with the question text, wherein the prompt data includes contextual association data, historical association data, and associated knowledge documents; optimizing the quantity of the prompt data based on a word example retrieval threshold of a question-answering model to form target prompt information; and invoking the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text.

[0006] In some implementations, processing the question text to determine the prompt data associated with the question text includes: converting the question text into a question vector, wherein the question vector is used to represent the semantic information of the question text; extracting multiple knowledge-related documents that match the question vector from a knowledge document library; extracting multiple question-answer pairs associated with the question vector from a historical dialogue database to form historical association data composed of multiple question-answer pairs; and extracting multi-turn dialogue information adjacent to the question time of the question text in the target scene to form contextual association data.

[0007] In some implementations, after processing the question text and determining the prompt data associated with the question text, the method includes: integrating the question text, the prompt data, and the role description of the question-and-answer model to form original prompt information.

[0008] In some implementations, optimizing the quantity of the prompt data based on the word example retrieval threshold of the question-answering model to form target prompt information includes: comparing the number of word examples in the original prompt information with the word example retrieval threshold of the question-answering model to obtain a comparison result; in response to the comparison result being that the number of word examples is greater than the word example retrieval threshold, truncating the multiple related knowledge documents in the prompt data and retaining any one of the related knowledge documents; and when the number of word examples in the original prompt information containing the unique related knowledge document is less than or equal to the word example retrieval threshold, forming the target prompt information composed of contextual related data, historical related data, and the unique related knowledge document.

[0009] In some implementations, after truncating multiple related knowledge documents in the prompt data and retaining any one of the related knowledge documents, the method further includes: when the number of word examples in the original prompt information containing a unique related knowledge document is greater than the word example retrieval threshold, truncating the multi-turn dialogue information in the contextual association data and retaining the unique dialogue information that is directly adjacent to the question text in time sequence; and in response to the number of word examples in the original prompt information containing the unique dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the unique dialogue information, the historical association data, and the unique related knowledge document.

[0010] In some implementations, after truncating the multi-turn dialogue information in the context-related data and retaining only the unique dialogue information directly adjacent to the question text in time sequence, the method further includes: when the number of word examples in the original prompt information containing the unique dialogue information is greater than the word example retrieval threshold, pre-calculating the unique dialogue information to form accelerated dialogue information; and in response to the number of word examples in the original prompt information containing the accelerated dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the unique accelerated dialogue information, historical related data, and the unique related knowledge document; or in response to the number of word examples in the original prompt information containing the accelerated dialogue information being greater than the word example retrieval threshold, deleting the context-related data to form the target prompt information consisting of the historical related data and the unique related knowledge document.

[0011] In some implementations, before processing the question text and determining the prompt data associated with the question text, the process includes: processing historical question-and-answer documents to form a historical dialogue database comprising multiple dialogue document vectors, wherein the dialogue document vectors correspond to question-and-answer pairs in the historical question-and-answer documents, including: discretizing the multiple historical question-and-answer documents to obtain multiple question-and-answer pairs with a target number of word examples; encoding each question-and-answer pair to obtain the dialogue document vector corresponding to the question-and-answer pair; and integrating the various dialogue document vectors to form the historical dialogue database.

[0012] According to another aspect of this disclosure, an apparatus for generating a response message is provided, comprising: a prompt data determination module, configured to process a question text and determine prompt data associated with the question text, wherein the prompt data includes contextual association data, historical association data, and associated knowledge documents; a target prompt information construction module, configured to optimize the quantity of the prompt data based on a word example retrieval threshold of a question-answering model to form target prompt information; and a response message generation module, configured to call the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text.

[0013] According to another aspect of this disclosure, an electronic device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method for generating a response message as described in any of the above embodiments.

[0014] According to another aspect of this disclosure, a readable storage medium is provided that stores a computer program adapted for loading by a processor to perform a response message generation method as described in any of the above embodiments.

Attached Figure Description

[0015] The accompanying drawings illustrate exemplary embodiments of the present disclosure and, together with the description thereof, serve to explain the principles of the present disclosure. These drawings are included to provide a further understanding of the present disclosure and are incorporated in and constitute a part of this specification.

[0016] Figure 1 is a flowchart illustrating a method for generating a response message according to one embodiment of the present disclosure.

[0017] Figure 2 is a schematic diagram of the response message generation architecture according to one embodiment of the present disclosure.

[0018] Figure 3 is a first schematic diagram of context-related data according to one embodiment of this disclosure.

[0019] Figure 4 is a second schematic diagram of context-related data according to one embodiment of this disclosure.

[0020] Figure 5 is an architecture diagram showing the adjustment of the number of word examples according to one embodiment of this disclosure.

[0021] Figure 6 is a block diagram of a response message generation apparatus according to one embodiment of this disclosure.

Detailed Implementation

[0022] The present disclosure will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the disclosure. Furthermore, it should be noted that, for ease of description, only the parts relevant to the present disclosure are shown in the accompanying drawings.

[0023] It should be noted that, where there is no conflict, the embodiments and features described in this disclosure can be combined with each other. The technical solutions of this disclosure will now be described in detail with reference to the accompanying drawings and embodiments.

[0024] Unless otherwise stated, the exemplary implementations / embodiments shown are to be understood as providing exemplary features of various details that provide ways in which the technical concepts of this disclosure can be implemented in practice. Therefore, unless otherwise stated, the features of various implementations / embodiments may be additionally combined, separated, interchanged and / or rearranged without departing from the technical concepts of this disclosure.

[0025] The terminology used herein is for the purpose of describing particular embodiments and is not restrictive. As used herein, unless the context clearly indicates otherwise, the singular forms “a” and “the” are intended to include the plural forms as well. Furthermore, the terms “comprising” and / or “including” and variations thereof, when used in this specification, indicate the presence of the stated features, integers, steps, operations, parts, components, and / or groups thereof, but do not exclude the presence or addition of one or more other features, integers, steps, operations, parts, components, and / or groups thereof. It should also be noted that, as used herein, the terms “substantially,” “about,” and other similar terms are used as terms of approximation rather than as terms of degree, and thus account for the inherent deviations in measured values, calculated values, and / or provided values that would be recognized by one of ordinary skill in the art.

[0026] Figure 1 is a flowchart illustrating a method for generating a response message according to one embodiment of the present disclosure. Figure 2 is a schematic diagram of the response message generation architecture according to one embodiment of this disclosure. The response message generation method S100 is described in detail below with reference to Figures 1 and 2.

[0027] Step S102: Process the question text to determine the prompt data associated with the question text.

[0028] In a dialogue scenario, the question text consists of the questions posed by the user or the information retrieval instructions given. Each dialogue scenario typically involves multiple rounds of conversation, with each round consisting of the user's question text and the response message from the question-and-answer model.

[0029] Prompt data refers to files that provide data support for the generation of response messages by the question-answering model, including at least context-related data, historical related data, and related knowledge documents. In related technologies, question-answering models typically only consider context-related data during human-computer interaction. However, the questions asked by users in each round of dialogue are not necessarily related, easily leading to misinterpretation of user semantics and the generation of response messages unrelated to the question text, thus degrading the user's dialogue experience. This disclosure expands the types of prompt data by including historical related data and related knowledge documents as prompt data, providing data support for the analysis of the question text and maximizing the relevance between the model's output response message and the user's needs.

[0030] Contextualized data is used to represent the dialogue information between the question-answering model and the user in the previous rounds of the current dialogue scenario, providing the question-answering model with contextual information. This dialogue information is usually referred to as the short-term memory of the question-answering model and is part of the original prompt information. Contextualized data can shape question-answering models with different expression styles and response capabilities.

[0031] Figure 3 is a first schematic diagram illustrating contextually related data in one embodiment of this disclosure. As shown in Figure 3, four rounds of dialogue take place between the user and the question-answering model. Q1, Q2, Q3, and Q4 are the user's four questions, and A1, A2, and A3 are the question-answering model's responses to Q1, Q2, and Q3, respectively. Here, Q represents the user's question, and A represents the question-answering model's response. In the fourth round of dialogue, the user poses question Q4. When analyzing question Q4, the question-answering model uses the three rounds of dialogue preceding this round as contextual data, i.e., the surrounding context information, and then combines this with Q4 and other prompts to deduce the response message A4.

[0032] However, as the number of dialogue turns increases, and given the limit on the number of word examples that the question-answering model can retrieve from the target prompt information, this disclosure limits the amount of context-related data. Specifically, it uses multi-turn dialogue information that is temporally adjacent to the question text as context-related data, while earlier dialogues are deleted from short-term memory. In other words, a preset number of dialogue rounds temporally adjacent to the question text are used as context-related data.

[0033] Figure 4 is a second schematic diagram illustrating contextually related data in one embodiment of this disclosure. Referring to Figure 4, when more rounds of dialogue occur in the interaction scenario, such as Q5, A5, Q6, A6, Q7, A7, and Q8, the first four rounds of dialogue information mentioned above will be removed from the contextual association data, and the remaining three rounds of dialogue information (Q5, A5, Q6, A6, Q7, and A7) will be retained as new contextual association data. In this case, Q8 is the question text, and A8 is the response information from the question-answering model. Of course, the preset number of dialogue rounds adjacent to the question text can be set according to the processing capability of the question-answering model, and is not limited to three rounds; other numbers fall within the protection scope of this disclosure.
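
The sliding window over dialogue rounds described in [0032] and [0033] can be illustrated with a minimal Python sketch. The function name `make_context_window` and the tuple representation of a round are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
from collections import deque


def make_context_window(max_rounds=3):
    """Return a buffer that keeps only the most recent `max_rounds` rounds.

    A deque with `maxlen` silently drops the oldest round when a new one
    is appended, which mirrors deleting earlier dialogue from short-term memory.
    """
    return deque(maxlen=max_rounds)


# Hypothetical dialogue: seven completed rounds, each stored as (question, answer).
window = make_context_window(max_rounds=3)
for i in range(1, 8):
    window.append((f"Q{i}", f"A{i}"))

# When the user asks Q8, only rounds 5 to 7 remain as contextual data.
context_rounds = list(window)
print(context_rounds)
```

The preset of three rounds is only the value used in the Figure 4 example; `max_rounds` can be set to any number the model can handle.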

[0034] Historical association data refers to the question-and-answer pairs, associated with the question text, that were generated during historical interactions between the user and the question-answering model. Since the dialogue information is cleared after each conversation, the user cannot refer to previous interaction data when invoking the question-answering model again for a new dialogue. However, the end of each dialogue between the user and the model means that the user has received the response they requested; therefore, historical interaction data is valuable for reference in a new round of dialogue. This disclosure processes and stores the short-term memory generated in each dialogue scenario as the long-term memory of the question-answering model, so that the model can retrieve historical association data to support the analysis of the question text.

[0035] Related knowledge documents are professional files uploaded by users that share the same or similar knowledge categories as the current question text. These documents provide background knowledge to support the question-answering model's analysis of the question text, ensuring that the model's answer does not contain knowledge gaps. Furthermore, since the same concept may correspond to different knowledge categories, filtering related knowledge documents from user-specified files can improve the match between the response and the user's needs.

[0036] Step S104: Based on the word example retrieval threshold of the question-answering model, optimize the quantity of prompt data to form target prompt information.

[0037] Question answering models are large language models with both long-term and short-term memory capabilities. They have the ability to retrieve various types of data, such as contextual data in the current dialogue scenario, historical data generated from past dialogues, and related knowledge documents. Therefore, when analyzing users' question texts, question answering models can better understand user needs and have professional industry knowledge as data support.

[0038] The word example retrieval threshold is the maximum number of word examples that a question-answering model can retrieve when receiving prompt information. Because the essence of a question-answering model is a large language model, which belongs to a neural network model, its neural network structure supports a limited number of inputs, such as 2048 word examples. Therefore, before determining the target prompt information, it is necessary to process the prompt data according to the question-answering model's own retrieval capabilities to ensure that the question-answering model can fully retrieve these word examples.

[0039] Word examples are the result of segmenting the original prompt information; each word example (token) can be understood as a meaningful word. The original prompt information is model prompt information that contains the prompt data and follows the model's input format. However, the number of word examples in the original prompt information may not match the processing capacity of the question-answering model. Therefore, it is necessary to optimize the quantity of prompt data it contains. The optimized target prompt information can then be used as input data for the question-answering model.

[0040] The target prompt information is the optimized result of the prompt data, with its number of word examples less than or equal to the word example retrieval threshold of the question-answering model. After determining the prompt data, the prompt data and the question text are integrated to obtain the original prompt information. Based on the word example retrieval threshold, the prompt data in the original prompt information is optimized to form the target prompt information, which consists of the optimized prompt data and the question text.
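
As a rough illustration of the comparison in step S104, the sketch below counts word examples and checks the count against a threshold. Whitespace splitting is an assumption standing in for the model's real tokenizer, and the names `count_word_examples` and `fits_threshold` are hypothetical; 2048 is the example limit mentioned in [0038].

```python
def count_word_examples(text):
    """Count word examples in `text`.

    Assumption: whitespace splitting is a stand-in for the tokenizer of the
    question-answering model, which the disclosure does not specify.
    """
    return len(text.split())


def fits_threshold(prompt, threshold=2048):
    """Return True if the prompt can be passed to the model unchanged."""
    return count_word_examples(prompt) <= threshold


original_prompt = "Question: what should I buy at the supermarket?"
print(count_word_examples(original_prompt), fits_threshold(original_prompt))
```

In practice the count would come from the same tokenizer the model uses, since a whitespace count can differ noticeably from the model's own segmentation.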

[0041] Step S106: Call the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text.

[0042] The response message is the result of the question-answering model's analysis of the question text, which integrates contextual and historical data, as well as related knowledge documents. The response message conforms to the context and meets the personalized response needs revealed by the user during historical dialogues; at the same time, the response message also possesses industry-specific capabilities to resolve the issues raised in the question text.

[0043] In some implementations, the execution process of step S102 includes: converting the question text into a question vector, wherein the question vector is used to represent the semantic information of the question text; extracting multiple knowledge-related documents that match the question vector in a knowledge document library; extracting multiple question-answer pairs associated with the question vector in a historical dialogue database to form historical association data composed of multiple question-answer pairs; and extracting multi-turn dialogue information adjacent to the question time of the question text in the target scene to form contextual association data.

[0044] Specifically, after a user inputs a question text, it is encoded into a question vector to facilitate matching with knowledge document vectors in the knowledge document base and dialogue document vectors in the historical dialogue database, while also facilitating data transmission. The question vector serves at least to represent the semantic information of the question text.

[0045] Once the question vector is obtained, multiple knowledge-related documents matching the question vector are extracted from the knowledge document library. The knowledge document library is a FAISS (Facebook AI Similarity Search) computation library that calculates the vector distance between the question vector and each knowledge document. The vector distance can be calculated using methods such as Euclidean distance or dot product. The knowledge documents with the smallest distances to the question vector are extracted as the knowledge-related documents for the question vector. Of course, the number of knowledge-related documents recalled can be set according to needs and is not limited here.
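
The distance-based recall in [0045] can be sketched without any external library. The code below is a brute-force stand-in for a similarity index, not the FAISS API; the function names, the dictionary layout, and the toy two-dimensional vectors are all assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_documents(question_vec, doc_vecs, k=4):
    """Return the ids of the k documents closest to the question vector.

    `doc_vecs` maps a document id to its vector. A real deployment would
    delegate this search to a vector index instead of sorting every entry.
    """
    ranked = sorted(doc_vecs, key=lambda doc_id: euclidean(question_vec, doc_vecs[doc_id]))
    return ranked[:k]


# Toy vectors; real embeddings would have hundreds of dimensions.
docs = {"doc_a": [0.0, 1.0], "doc_b": [1.0, 0.0], "doc_c": [0.9, 0.1]}
nearest = top_k_documents([1.0, 0.0], docs, k=2)
print(nearest)
```

Sorting every document is fine for a handful of vectors; an index structure matters once the library grows large.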

[0046] Multiple question-and-answer pairs associated with the question vector are extracted from the historical dialogue database, and these pairs are used as historical association data for the question text. The historical dialogue database is essentially a FAISS (Facebook AI Similarity Search) computation library that stores question-and-answer pairs generated during historical interactions, together with the dialogue document vector corresponding to each pair. Then, by calculating the distance between the question vector and each dialogue document vector, the few dialogue documents with the smallest calculated distance values are extracted, and the question-and-answer pairs in these documents are used as historical association data for the question vector. Of course, due to the time-sensitivity of question-and-answer pairs, the retrieved pairs are arranged in reverse chronological order, ensuring that the most recently occurring pair appears at the beginning and end of the historical association data.

[0047] Based on the chronological order of the various rounds of dialogue in the current dialogue scenario, information from multiple rounds of dialogue that are temporally adjacent to the current question text is extracted as contextual data. This dialogue information provides contextual support for analyzing the question text. Of course, the number of rounds of dialogue information extracted can be set according to needs and is not limited here.

[0048] In some implementations, after step S102, the process includes: integrating the question text, prompt data, and role descriptions of the question-and-answer model to form the original prompt information.

[0049] The number of word examples in the original prompt information may exceed the word example retrieval threshold. Therefore, the original prompt information needs to be processed before calling the question answering model. The original prompt information contains the prompt input format of the question answering model, as well as information such as the question text, prompt data, and role description.

[0050] The role descriptions in a question-and-answer model provide guidance on expression style and task setting for the response messages. Different role descriptions correspond to different expression styles.

[0051] Once the question text, prompt data, and role description of the question-answering model are obtained, the original prompt information can be constructed. The prompt input format can be, for example:

[0052] """

[0053] {role}

[0054] Extract useful content from the following text to answer. If the content is not mentioned in the following text,

[0055] you can generate relevant content.

[0056] {knowledge}

[0057] {history}

[0058] {memory}

[0059] Question: {query}

[0060] Answer:

[0061] """

[0062] Here, role represents the role description, knowledge represents the knowledge-related documents in the prompt data, history represents the historical related data in the prompt data, memory represents the context-related data, and query represents the question text.
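
A minimal sketch of filling the five placeholders is given below. The instruction sentence is taken from the example prompt in this document; the helper `build_original_prompt`, the use of `str.format`, and the choice of joining items with a newline are assumptions for illustration rather than the disclosed implementation.

```python
PROMPT_TEMPLATE = (
    "{role}\n\n"
    "Extract useful content from the following text to answer. "
    "If the content is not mentioned in the following text, "
    "you can generate relevant content.\n\n"
    "{knowledge}\n\n"
    "{history}\n\n"
    "{memory}\n\n"
    "Question: {query}\n"
    "Answer:"
)


def build_original_prompt(role, knowledge_docs, history_pairs, context_rounds, query):
    """Fill the template; pairs and rounds are (question, answer) tuples."""

    def flatten(pairs):
        # One line per utterance: question first, then its answer.
        return "\n".join(f"{q}\n{a}" for q, a in pairs)

    return PROMPT_TEMPLATE.format(
        role=role,
        knowledge="\n".join(knowledge_docs),
        history=flatten(history_pairs),
        memory=flatten(context_rounds),
        query=query,
    )


prompt = build_original_prompt(
    role="You are now playing the role of a secretary.",
    knowledge_docs=["Document one.", "Document two."],
    history_pairs=[("What do I need to buy?", "Potatoes and cabbage.")],
    context_rounds=[("Please remember the supermarket trip.", "Okay, noted.")],
    query="What am I going to the supermarket for?",
)
print(prompt)
```

Keeping each category in its own placeholder makes it straightforward to shorten or drop one category later without touching the others.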

[0063] First, obtain the role description, for example: "You are now playing the role of a secretary. Your task is to help me analyze the content of articles. Your speaking style is: concise and to the point. You can help me record what I say, so I can look up these conversations in the following dialogues."

[0064] Furthermore, the system retrieves prompt data. Four related knowledge documents are retrieved from the user-uploaded knowledge document library and concatenated using the symbol "\n". The system also retrieves the latest question-and-answer pairs related to the current question text from the historical dialogue database, selecting two pairs by default and arranging them in reverse chronological order. Finally, the system extracts the three most recent rounds of dialogue from the current dialogue context and arranges them in reverse chronological order.

[0065] Finally, the role description, prompt data, and question text are merged to form the original prompt information in a single-round question-and-answer format, for example:

[0066] messages = [{'role':'user','content':'\n\nYou are now playing the role of a secretary. Your task is to help me analyze the content of articles. Your speaking style is: concise and succinct. You can help me record what I say so that I can look up the content in the following conversation.\n\nExtract useful content from the following text to answer. If the content is not mentioned in the following text, you can generate relevant content.\n\nWhat do I need to buy in a while?\nYou need to buy potatoes, cabbage, and pork in a while.\nRemember that the Dragon Boat Festival is June 3rd\nThe Dragon Boat Festival is the fifth day of the fifth lunar month every year. The corresponding date in the Gregorian calendar will be different. This year (2023), the Dragon Boat Festival is June 3rd.\n\nPlease remember that I need to go to the supermarket in a while\nOkay, please tell me what you need to buy at the supermarket, and I will write it down for you.\n\nPotatoes, cabbage, tomatoes\nYou need to buy potatoes, cabbage, and tomatoes at the supermarket.\n\nQuestion: What are you going to the supermarket for in a while?\nAnswer:'}]

[0067] Here, messages represents the original prompt information, user is the value of the role field, and content holds the merged text: the role's task and style, the prompt data, and the question text.

[0068] "What am I going to buy in a bit?" and "You need to buy potatoes, cabbage, and pork in a bit" is one question-and-answer pair; "Remember that the Dragon Boat Festival is June 3rd" and "The Dragon Boat Festival is the fifth day of the fifth lunar month every year, but the corresponding Gregorian calendar date will be different. This year (2023), the Dragon Boat Festival is June 3rd" is another question-and-answer pair. "Please remember that I'm going to the supermarket in a bit" and "Okay, please tell me what you need to buy at the supermarket, and I will write it down for you" is a set of dialogue information in the current dialogue scenario; "Potatoes, cabbage, and tomatoes" and "You need to buy potatoes, cabbage, and tomatoes at the supermarket" is a set of dialogue information in the current dialogue scenario. "What are you going to the supermarket for in a bit?" is the question text. It should be noted that the symbol "\n\n" represents the separator between different categories of prompt information.

[0069] In some implementations, after obtaining the original prompt information, step S104 includes: comparing the number of word examples in the original prompt information with the word example retrieval threshold of the question-answering model to obtain a comparison result; in response to the comparison result being that the number of word examples is greater than the word example retrieval threshold, truncating multiple related knowledge documents in the prompt data and retaining any one related knowledge document; and when the number of word examples in the original prompt information containing the unique related knowledge document is less than or equal to the word example retrieval threshold, forming target prompt information composed of contextual related data, historical related data, and the unique related knowledge document.

[0070] Theoretically, inputting all the previously obtained prompt data into the question-answering model is the optimal approach. However, due to the limitations of question-answering models in terms of the number of word examples they can access, it is necessary to determine and control the number of word examples in the original prompt information to suit the processing capabilities of the question-answering model.

[0071] Figure 5 is a diagram showing the adjustment of the number of word examples in one embodiment of the present disclosure.

[0072] Referring to Figure 5, first compare the number of word examples in the original prompt information with the word example retrieval threshold of the question-answering model. If the number of word examples is less than or equal to the word example retrieval threshold, the processing of the original prompt information can be ended directly, and the original prompt information can be used as the target prompt information input into the question-answering model.

[0073] Conversely, if the number of word examples exceeds the word example retrieval threshold, the prompt data in the original prompt information needs to be truncated. The related knowledge documents are truncated first: only one related knowledge document is kept and the rest are deleted. A single related knowledge document usually provides enough industry-specific expertise to meet the response requirements.

[0074] Then, it is determined whether the number of word examples in the original prompt information containing only the unique associated knowledge document is less than or equal to the word example retrieval threshold. If so, the original prompt information containing the unique associated knowledge document is used as the target prompt information, and the other prompt data, the role description, and the question text are not modified.

[0075] In some implementations, when the number of word examples in the original prompt information containing the unique associated knowledge document exceeds the word example retrieval threshold, the multi-turn dialogue information in the context association data is truncated, retaining only the unique dialogue information that is directly adjacent to the question text in time sequence; and in response to the number of word examples in the original prompt information containing the unique dialogue information being less than or equal to the word example retrieval threshold, target prompt information consisting of the unique dialogue information, the historical association data, and the unique associated knowledge document is formed.

[0076] Because context-related data is recalled by recency alone and not by relevance, it can easily be irrelevant to the question text. Therefore, in this case, the dialogue information in the context-related data is truncated first, while the question-answer pairs in the historical related data, which are known to be related to the question text, are retained.

[0077] In this process, the unique dialogue information closest in time to the question text is retained, and the other dialogue information is deleted. The number of word examples is then determined for the original prompt information that contains only the unique dialogue information and only the unique associated knowledge document. If the number of word examples is less than or equal to the word example retrieval threshold, this original prompt information is used as the target prompt information, and the other prompt data, the role description, and the question text remain unchanged.

[0078] In some implementations, when the number of word examples in the original prompt information containing the unique dialogue information is greater than the word example retrieval threshold, the unique dialogue information is pre-calculated to form accelerated dialogue information.

[0079] Pre-calculation primarily serves to accelerate processing by eliminating irrelevant words in the dialogue. The number of word examples in the original prompt information containing the accelerated dialogue information is then determined; if it is less than or equal to the word example retrieval threshold, that original prompt information is used as the target prompt information, while the other prompt data, the role description, and the question text remain unchanged.

[0080] Conversely, if the number of word examples in the original prompt message containing accelerated dialogue information exceeds the word example retrieval threshold, context-related data is deleted, and a target prompt message consisting of historical related data and a unique related knowledge document is formed.

[0081] Of course, if the number of word examples used at this time is still greater than the word example call threshold, historical related data can be discarded, and only one related knowledge document can be retained to form target prompt information and ensure the professionalism of the response message.
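
The truncation order described in paragraphs [0072] to [0081] can be sketched as a cascade of checks. This is a minimal illustration under stated assumptions: whitespace-separated words stand in for the model's word examples, the first knowledge document is the one retained (the text allows any one), and `precompute` is a hypothetical caller-supplied function that removes irrelevant words from a dialogue turn. None of the names below come from this disclosure.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (user utterance, model reply)


@dataclass(frozen=True)
class PromptData:
    role_description: str
    question_text: str
    knowledge_documents: List[str] = field(default_factory=list)
    history_pairs: List[Turn] = field(default_factory=list)
    context_turns: List[Turn] = field(default_factory=list)  # oldest first


def render(p: PromptData) -> str:
    parts = [p.role_description, *p.knowledge_documents]
    parts += [f"{q}\n{a}" for q, a in p.history_pairs + p.context_turns]
    parts.append(p.question_text)
    return "\n\n".join(parts)


def count_word_examples(text: str) -> int:
    # Assumption: whitespace words approximate the model tokenizer's count.
    return len(text.split())


def fit_to_threshold(
    p: PromptData, threshold: int, precompute: Callable[[Turn], Turn]
) -> PromptData:
    def fits(candidate: PromptData) -> bool:
        return count_word_examples(render(candidate)) <= threshold

    if fits(p):
        return p  # original prompt information is already small enough
    p = replace(p, knowledge_documents=p.knowledge_documents[:1])
    if fits(p):
        return p  # one knowledge document + all context + history
    p = replace(p, context_turns=p.context_turns[-1:])
    if fits(p):
        return p  # only the turn adjacent to the question is kept
    p = replace(p, context_turns=[precompute(t) for t in p.context_turns])
    if fits(p):
        return p  # accelerated (pre-calculated) dialogue information
    p = replace(p, context_turns=[])
    if fits(p):
        return p  # history + one knowledge document
    # Last resort: keep only the knowledge document (may still exceed the threshold).
    return replace(p, history_pairs=[])
```

Each step only discards data when the previous, less destructive step was not enough, which mirrors the priority described above: knowledge documents are trimmed first, then context, and historical data last.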

[0082] In some implementations, prior to step S102, the process includes: processing historical question-and-answer documents to form a historical dialogue database comprising multiple dialogue document vectors, wherein the dialogue document vectors correspond to question-and-answer pairs in the historical question-and-answer documents.

[0083] Specifically, multiple historical question-and-answer documents are discretized to obtain multiple question-and-answer pairs with the target number of word examples; each question-and-answer pair is encoded to obtain the corresponding dialogue document vector; and the various dialogue document vectors are integrated to form a historical dialogue database.

[0084] Historical question-and-answer documents essentially serve as long-term memory for the question-answering model. Fine-tuning the weights of the question-answering model on historical question-and-answer documents would require a large amount of question-and-answer data and considerable computing power. To avoid these requirements and remain universally applicable, the historical question-and-answer pairs are instead stored as historical question-and-answer documents in a historical dialogue database, which acts as an external database for the question-answering model. During each dialogue, relevant question-and-answer pairs are retrieved from this database as historical association data, providing long-term memory for the model.

[0085] More specifically, historical question-and-answer pairs are organized into documents, with each document corresponding to one or a few question-and-answer pairs. This keeps the number of word examples in each document small enough for the document to be input into the question-answering model. Next, each question-and-answer document is encoded by an encoding service to form a dialogue document vector, which is a neural network's compressed representation of the question-and-answer document. The dialogue document vectors are then integrated into a historical dialogue database built with the FAISS library, in which the keys are the document vectors and the corresponding values are the question-and-answer documents. When a user submits a question, the database is queried with the question vector of that question, and the query result is used as the historical association data, achieving long-term memory retrieval.
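
The key-value arrangement described above (document vector as key, question-and-answer document as value) can be illustrated with a dependency-free sketch. It deliberately does not use FAISS: a brute-force list stands in for the index, and a toy bag-of-words vector stands in for the neural encoding service. The class and function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def encode(text: str) -> Vector:
    """Toy stand-in for the encoding service: L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {word: v / norm for word, v in counts.items()}


def similarity(a: Vector, b: Vector) -> float:
    # Dot product of two normalised sparse vectors (cosine similarity).
    return sum(v * b.get(word, 0.0) for word, v in a.items())


class HistoryDialogueDatabase:
    """Brute-force vector store; a real system would delegate to a vector index."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Vector, str]] = []

    def add(self, qa_document: str) -> None:
        # Key: dialogue document vector. Value: the question-and-answer document.
        self._entries.append((encode(qa_document), qa_document))

    def search(self, question_text: str, top_k: int = 2) -> List[str]:
        question_vector = encode(question_text)
        ranked = sorted(
            self._entries,
            key=lambda entry: similarity(question_vector, entry[0]),
            reverse=True,
        )
        return [document for _, document in ranked[:top_k]]
```

The documents returned by `search` play the role of the historical association data; swapping the list for an actual vector index changes the storage and lookup cost but not this interface.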

[0086] In some implementations, a knowledge document library is also established for knowledge documents that users upload via document paths. Since the content of a knowledge document may be too long, the document is first sliced into multiple document blocks, each document block serving as one record. To preserve sentence integrity within each document block, the document content is first divided into text units of the smallest granularity using delimiters. A delimiter can be any one of the following: ["\n", ".", "?", "?", ";", ";", ",", ",", ""]. During segmentation, the knowledge document is traversed and the text is split wherever one of the above delimiters is encountered. The smallest-granularity text units are then concatenated in sequence. Before each text unit is concatenated, the number of word examples in the would-be concatenated result is calculated. If it does not exceed a preset block word example threshold, the concatenation is performed; otherwise, the text accumulated before the concatenation is taken as a document block, and the text unit to be concatenated becomes the beginning of the next document block. These steps are repeated until the document is exhausted.
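
The split-then-concatenate procedure above can be sketched as follows. The sketch assumes an ASCII subset of the delimiter list, counts whitespace-separated words in place of the model's word examples, and uses hypothetical function names; a unit that is longer than the threshold on its own is kept as an oversized block rather than being cut mid-sentence.

```python
import re
from typing import Callable, List

# Assumption: an ASCII subset of the delimiters listed in the description.
DELIMITERS = ["\n", ".", "?", ";", ","]


def split_units(text: str, delimiters: List[str] = DELIMITERS) -> List[str]:
    """Split text into smallest-granularity units, keeping each delimiter."""
    pattern = "(" + "|".join(re.escape(d) for d in delimiters) + ")"
    # With a capturing group, re.split returns [text, delim, text, delim, ..., text].
    pieces = re.split(pattern, text)
    units = []
    for i in range(0, len(pieces), 2):
        delimiter = pieces[i + 1] if i + 1 < len(pieces) else ""
        unit = pieces[i] + delimiter
        if unit:
            units.append(unit)
    return units


def count_word_examples(text: str) -> int:
    # Assumption: whitespace words approximate the tokenizer's count.
    return len(text.split())


def build_blocks(
    text: str,
    max_word_examples: int,
    count: Callable[[str], int] = count_word_examples,
) -> List[str]:
    """Greedily concatenate units into blocks that stay within the threshold."""
    blocks: List[str] = []
    current = ""
    for unit in split_units(text):
        # Check the would-be result before concatenating.
        if current and count(current + unit) > max_word_examples:
            blocks.append(current)
            current = unit  # this unit starts the next document block
        else:
            current += unit
    if current:
        blocks.append(current)
    return blocks
```

Because delimiters stay attached to the unit they terminate, joining the blocks reproduces the original document exactly, and every block boundary falls on a delimiter.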

[0087] After obtaining multiple document blocks of the entire knowledge document, a vector index corresponding to each document block is constructed. That is, the encoding service is called to encode each document block to form a knowledge document vector, and a knowledge document library in FAISS format is formed.

[0088] The method for generating response messages disclosed herein searches for various types of prompt data associated with the question text, including historical related data, related knowledge documents, and contextual related data, and uses them as target prompt information. This provides the question-answering model with multi-dimensional data support encompassing long-term memory, professional knowledge, and short-term memory, which improves the accuracy of the generated response messages and the efficiency of each round of dialogue and reduces the occurrence of irrelevant answers during the dialogue. Furthermore, because the question-answering model can process only a limited number of word examples, the number of word examples in the target prompt information is optimized so that the target prompt information fits the question-answering model.

[0089] Figure 6 is a block diagram of a response message generation apparatus according to one embodiment of the present disclosure.

[0090] Referring to Figure 6, this disclosure provides a response message generation apparatus 1000, comprising: a prompt data determination module 1002, used to process the question text and determine prompt data associated with the question text, wherein the prompt data includes contextual association data, historical association data, and associated knowledge documents; a target prompt information construction module 1004, used to optimize the quantity of prompt data based on the word example retrieval threshold of the question-answering model to form target prompt information; and a response message generation module 1006, used to call the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text.
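
One way to picture how the three modules cooperate is as a pipeline of three callables. The sketch below is only an illustration: the class name, field names, and the dictionary shape of the prompt data are assumptions, and the comments map each field to the module reference numeral it loosely corresponds to.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

PromptData = Dict[str, List[str]]


@dataclass
class ResponseMessageGenerator:
    # Corresponds loosely to the prompt data determination module 1002.
    determine_prompt_data: Callable[[str], PromptData]
    # Corresponds loosely to the target prompt information construction module 1004.
    build_target_prompt: Callable[[str, PromptData], str]
    # Corresponds loosely to the response message generation module 1006.
    call_model: Callable[[str], str]

    def generate(self, question_text: str) -> str:
        prompt_data = self.determine_prompt_data(question_text)
        target_prompt = self.build_target_prompt(question_text, prompt_data)
        return self.call_model(target_prompt)
```

Keeping the modules as injected callables lets each one be replaced independently, for example by substituting a stub model during testing.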

[0091] Each module in the response message generation apparatus 1000 is configured to carry out the corresponding step of the response message generation method. The execution principles and steps are as described above and are not repeated here.

[0092] The apparatus 1000 may include corresponding modules that perform one or more steps in the flowcharts described above. Therefore, each step or several steps in the flowcharts can be performed by a corresponding module, and the apparatus may include one or more of these modules. A module may be one or more hardware modules specifically configured to perform the corresponding step, or may be implemented by a processor configured to perform the corresponding step, or may be stored in a computer-readable medium for implementation by a processor, or may be implemented through some combination thereof.

[0093] This hardware structure can be implemented using a bus architecture. The bus architecture can include any number of interconnecting buses and bridges, depending on the specific application and overall design constraints of the hardware. Bus 1100 connects various circuits, including one or more processors 1200, memory 1300, and/or hardware modules. Bus 1100 can also connect various other circuits 1400, such as peripherals, voltage regulators, power management circuits, external antennas, etc.

[0094] Bus 1100 can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of representation, only one connection line is used in the figure, but this does not imply that there is only one bus or only one type of bus.

[0095] Any process or method description in the flowcharts or otherwise herein can be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing a particular logical function or process. The scope of the preferred embodiments of this disclosure includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as will be understood by those skilled in the art to which embodiments of this disclosure pertain. The processor performs the various methods and processes described above. For example, the method embodiments of this disclosure may be implemented as software programs tangibly contained in a machine-readable medium, such as memory. In some embodiments, part or all of the software program may be loaded and/or installed via memory and/or a communication interface. When the software program is loaded into memory and executed by the processor, one or more steps of the methods described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform one of the methods described above by any other suitable means (e.g., by means of firmware).

[0096] The logic and/or steps represented in the flowcharts or otherwise described herein may be embodied in any readable storage medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them).

[0097] The response message generation apparatus disclosed herein searches for various types of prompt data associated with the question text, including historical related data, related knowledge documents, and contextual related data, and uses them as target prompt information. This provides the question-answering model with multi-dimensional data support, including long-term memory, professional knowledge, and short-term memory, which improves the accuracy of the generated response messages and the efficiency of each round of dialogue and reduces the occurrence of irrelevant answers during the dialogue. Furthermore, because the question-answering model can process only a limited number of word examples, the number of word examples in the target prompt information is optimized so that the target prompt information fits the question-answering model.

[0098] For the purposes of this specification, a "readable storage medium" can be any means capable of containing, storing, communicating, propagating, or transmitting a program for use by or in conjunction with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more wires (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Furthermore, a readable storage medium can even be paper or another suitable medium on which a program is printed, since the program can be obtained electronically, for example, by optically scanning the paper or other medium, followed by editing, interpreting, or otherwise processing it as necessary, and then stored in memory.

[0099] It should be understood that various parts of this disclosure can be implemented in hardware, software, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.

[0100] Those skilled in the art will understand that all or part of the steps of the methods described above can be implemented by a program instructing the related hardware, and the program can be stored in a readable storage medium. When executed, the program performs one or a combination of the steps of the method embodiments.

[0101] Furthermore, the functional units in the various embodiments of this disclosure can be integrated into a single processing module, or each unit can exist physically separately, or two or more units can be integrated into a single module. The integrated module can be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it can also be stored in a readable storage medium. The storage medium can be a read-only memory, a disk, or an optical disk, etc.

[0102] Those skilled in the art should understand that the above embodiments are merely for illustrating the present disclosure and are not intended to limit the scope of the disclosure. Those skilled in the art can make other changes or modifications based on the above disclosure, and these changes or modifications still fall within the scope of the present disclosure.

Claims

1. A method for generating a response message, characterized by comprising:
processing a question text to determine prompt data associated with the question text, wherein the prompt data includes contextual association data, historical association data, and associated knowledge documents;
optimizing the quantity of the prompt data based on a word example retrieval threshold of a question-answering model to form target prompt information; and
invoking the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text;
wherein, after processing the question text and determining the prompt data associated with the question text, the method includes: integrating the question text, the prompt data, and a role description of the question-answering model to form original prompt information;
wherein optimizing the quantity of the prompt data based on the word example retrieval threshold of the question-answering model to form the target prompt information includes: comparing the number of word examples in the original prompt information with the word example retrieval threshold of the question-answering model to obtain a comparison result; in response to the comparison result indicating that the number of word examples is greater than the word example retrieval threshold, truncating the multiple associated knowledge documents in the prompt data and retaining any one of the associated knowledge documents; and when the number of word examples in the original prompt information containing the unique associated knowledge document is less than or equal to the word example retrieval threshold, forming the target prompt information composed of the contextual association data, the historical association data, and the unique associated knowledge document;
wherein, after truncating the multiple associated knowledge documents in the prompt data and retaining any one of the associated knowledge documents, the method further includes: when the number of word examples in the original prompt information containing the unique associated knowledge document is greater than the word example retrieval threshold, truncating the multi-turn dialogue information in the contextual association data and retaining the unique dialogue information that is directly adjacent to the question text in time sequence; and in response to the number of word examples in the original prompt information containing the unique dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the unique dialogue information, the historical association data, and the unique associated knowledge document;
wherein, after truncating the multi-turn dialogue information in the contextual association data and retaining the unique dialogue information directly adjacent to the question text in time sequence, the method further includes: when the number of word examples in the original prompt information containing the unique dialogue information is greater than the word example retrieval threshold, pre-calculating the unique dialogue information to form accelerated dialogue information; and in response to the number of word examples in the original prompt information containing the accelerated dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the accelerated dialogue information, the historical association data, and the unique associated knowledge document; or in response to the number of word examples in the original prompt information containing the accelerated dialogue information being greater than the word example retrieval threshold, deleting the contextual association data and forming the target prompt information consisting of the historical association data and the unique associated knowledge document;
wherein processing the question text to determine the prompt data associated with the question text includes: converting the question text into a question vector, wherein the question vector represents the semantic information of the question text; extracting, from a knowledge document library, multiple associated knowledge documents that match the question vector; extracting, from a historical dialogue database, multiple question-answer pairs associated with the question vector to form the historical association data consisting of the multiple question-answer pairs; and extracting multi-turn dialogue information adjacent to the questioning time of the question text in a target scene to form the contextual association data;
wherein, before processing the question text and determining the prompt data associated with the question text, the method includes: processing historical question-and-answer documents to form the historical dialogue database comprising multiple dialogue document vectors, wherein the dialogue document vectors correspond to question-and-answer pairs in the historical question-and-answer documents, and the processing includes: discretizing the multiple historical question-and-answer documents to obtain multiple question-and-answer pairs with a target number of word examples; encoding each question-and-answer pair to obtain the dialogue document vector corresponding to the question-and-answer pair; and integrating the dialogue document vectors to form the historical dialogue database.

2. An apparatus for generating a response message, characterized by comprising:
a prompt data determination module, used to process a question text and determine prompt data associated with the question text, wherein the prompt data includes contextual association data, historical association data, and associated knowledge documents;
a target prompt information construction module, used to optimize the quantity of the prompt data based on a word example retrieval threshold of a question-answering model to form target prompt information; and
a response message generation module, used to call the question-answering model to analyze the target prompt information and generate a response message corresponding to the question text;
wherein, after processing the question text and determining the prompt data associated with the question text, the process includes: integrating the question text, the prompt data, and a role description of the question-answering model to form original prompt information;
wherein optimizing the quantity of the prompt data based on the word example retrieval threshold of the question-answering model to form the target prompt information includes: comparing the number of word examples in the original prompt information with the word example retrieval threshold of the question-answering model to obtain a comparison result; in response to the comparison result indicating that the number of word examples is greater than the word example retrieval threshold, truncating the multiple associated knowledge documents in the prompt data and retaining any one of the associated knowledge documents; and when the number of word examples in the original prompt information containing the unique associated knowledge document is less than or equal to the word example retrieval threshold, forming the target prompt information composed of the contextual association data, the historical association data, and the unique associated knowledge document;
wherein, after truncating the multiple associated knowledge documents in the prompt data and retaining any one of the associated knowledge documents, the process further includes: when the number of word examples in the original prompt information containing the unique associated knowledge document is greater than the word example retrieval threshold, truncating the multi-turn dialogue information in the contextual association data and retaining the unique dialogue information that is directly adjacent to the question text in time sequence; and in response to the number of word examples in the original prompt information containing the unique dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the unique dialogue information, the historical association data, and the unique associated knowledge document;
wherein, after truncating the multi-turn dialogue information in the contextual association data and retaining the unique dialogue information directly adjacent to the question text in time sequence, the process further includes: when the number of word examples in the original prompt information containing the unique dialogue information is greater than the word example retrieval threshold, pre-calculating the unique dialogue information to form accelerated dialogue information; and in response to the number of word examples in the original prompt information containing the accelerated dialogue information being less than or equal to the word example retrieval threshold, forming the target prompt information consisting of the accelerated dialogue information, the historical association data, and the unique associated knowledge document; or in response to the number of word examples in the original prompt information containing the accelerated dialogue information being greater than the word example retrieval threshold, deleting the contextual association data and forming the target prompt information consisting of the historical association data and the unique associated knowledge document;
wherein processing the question text to determine the prompt data associated with the question text includes: converting the question text into a question vector, wherein the question vector represents the semantic information of the question text; extracting, from a knowledge document library, multiple associated knowledge documents that match the question vector; extracting, from a historical dialogue database, multiple question-answer pairs associated with the question vector to form the historical association data consisting of the multiple question-answer pairs; and extracting multi-turn dialogue information adjacent to the questioning time of the question text in a target scene to form the contextual association data;
wherein, before processing the question text and determining the prompt data associated with the question text, the process includes: processing historical question-and-answer documents to form the historical dialogue database comprising multiple dialogue document vectors, wherein the dialogue document vectors correspond to question-and-answer pairs in the historical question-and-answer documents, and the processing includes: discretizing the multiple historical question-and-answer documents to obtain multiple question-and-answer pairs with a target number of word examples; encoding each question-and-answer pair to obtain the dialogue document vector corresponding to the question-and-answer pair; and integrating the dialogue document vectors to form the historical dialogue database.

3. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method for generating a response message as described in claim 1.

4. A readable storage medium, characterized in that the readable storage medium stores a computer program, and the computer program is suitable for being loaded by a processor to execute the method for generating a response message according to claim 1.