A problem processing method, apparatus and device
By selecting historical fragments strongly correlated with the current text question from the agent's key information pool, and constructing prompt words to assist the large model in recognizing user intent, the problems of user intent bias and wasted computing resources caused by noisy information are solved, and accurate and reliable answers and performance improvements are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NEW H3C TECH CO LTD
- Filing Date
- 2026-05-19
- Publication Date
- 2026-06-19
AI Technical Summary
When an agent processes text problems, the large historical text input to the model contains a lot of noisy information, which leads to a misunderstanding of the user's intent, wastes computational resources, and results in poor processing performance, making it impossible to provide accurate and reliable answers.
By selecting historical fragments strongly correlated with the current text question from the key information pool as key information, prompt words are constructed to assist the large model in recognizing user intent, dynamically updating the key information pool, and avoiding input noise information.
It improves the accuracy of large models in recognizing user intent, saves computing resources, provides accurate and reliable answers to questions, and enhances processing performance.
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and in particular to a problem-solving method, apparatus, and device.
Background Technology
[0002] An intelligent agent is an artificial intelligence entity designed for a specific industry. It perceives environmental information, autonomously decides actions based on these perceptions, and executes those actions to achieve pre-defined goals. Intelligent agents can be implemented as software programs or hardware. They are deeply integrated with industry-specific expertise and tools, providing customized problem-solving capabilities. Intelligent agents are widely used in industries such as transportation and customer service, processing user queries through integrated professional knowledge, such as customer service dialogues or data analysis tasks.
[0003] When an agent answers a current text question, it needs to input the current text question and historical text questions (i.e., dialogue history information, such as text questions from the previous 20 rounds) into a large model. The large model then answers the question based on the historical text questions and outputs the answer.
[0004] However, in the above method, the historical text questions input to the large model contain a large amount of noisy information (such as irrelevant historical statements, redundant data, etc.), which biases the large model's understanding of the user's intent and prevents it from obtaining accurate and reliable answers, i.e., it produces incorrect answers. Moreover, the large model needs to consume a lot of computing resources to process the noisy information in the historical text questions, wasting computing resources and resulting in poor processing performance.
Summary of the Invention
[0005] This application provides a problem-solving method, the method comprising: obtaining a current text question and historical text questions, wherein the historical text questions are text questions preceding the current text question and include multiple historical segments; selecting K target key information items from multiple key information items in a key information pool, wherein, if a historical segment strongly related to the current text question exists among the multiple historical segments, a historical segment whose correlation degree is greater than a preset threshold is selected as a key information item and added to the key information pool, and the correlation degree of a historical segment represents the degree of correlation between that historical segment and the current text question; constructing prompt words based on the current text question and the K target key information items; and determining the answer to the current text question based on the prompt words.
[0006] This application provides a problem-solving apparatus, the apparatus comprising: The acquisition module is used to acquire the current text question and the historical text questions. The historical text questions are the text questions preceding the current text question and include multiple historical segments. The selection module is used to select K target key information from multiple key information in the key information pool; wherein, if there is a historical segment among the multiple historical segments that is strongly related to the current text question, then the historical segment with a correlation degree greater than a preset threshold is selected as key information and added to the key information pool; the correlation degree of the historical segment indicates the degree of correlation between the historical segment and the current text question; The determination module is used to construct prompt words based on the current text question and the K target key information, and to determine the question answer corresponding to the current text question based on the prompt words.
[0007] This application provides an electronic device, including: a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions that can be executed by the processor; the processor is used to execute the machine-executable instructions to implement the problem-solving method described in the above example of this application.
[0008] This application provides a computer program product, which includes a computer program that, when executed by a processor, implements the problem-solving method described above in this application.
[0009] This application provides a machine-readable storage medium storing machine-executable instructions that can be executed by a processor; wherein the processor is used to execute the machine-executable instructions to implement the problem-solving method described in the above example of this application.
[0010] As can be seen from the above technical solutions, in this embodiment, when answering the current text question, key information (i.e., some historical fragments from historical text questions) is input into the large model, instead of the entire content of historical text questions. This avoids inputting noisy information (such as irrelevant historical statements, redundant data, etc.) from historical text questions into the large model, enabling the large model to accurately identify user intent and obtain accurate and reliable answers. Furthermore, the large model does not require consuming significant computing resources to process noisy information from historical text questions, saving computing resources and exhibiting better processing performance. For example, K target key information pieces can be selected from multiple key information pieces in the key information pool, thereby assisting the large model in accurately identifying user intent and obtaining accurate and reliable answers. In addition, when there are historical fragments in multiple historical text questions that are strongly correlated with the current text question, historical fragments with a correlation greater than a preset threshold can be selected as key information and added to the key information pool, thereby dynamically updating the key information pool and ensuring that the key information pool retains important historical fragments.
Attached Figure Description
[0011] Figure 1 is a flowchart of a problem-solving method in one embodiment of this application;
Figure 2 is a schematic diagram of a large-model user question processing system in one embodiment of this application;
Figure 3 is a flowchart of a multi-turn dialogue process in one embodiment of this application;
Figure 4 is a schematic diagram of the multi-turn dialogue layer processing in one embodiment of this application;
Figure 5 is a schematic structural diagram of a problem processing device in one embodiment of this application;
Figure 6 is a hardware structure diagram of an electronic device in one embodiment of this application.
Detailed Implementation
[0012] This application proposes a problem-solving method that can be applied to electronic devices. The electronic device can be any device supporting intelligent agents, or any device supporting large models, such as a personal computer, laptop, smartphone, IoT device, cloud device, or server; the type of electronic device is not limited. As shown in Figure 1, which is a flowchart of the problem-solving method, the method includes: Step 101: Obtain the current text question and historical text questions. The historical text questions can be text questions preceding the current text question, and they include multiple historical segments.
[0013] Step 102: Select K target key information from multiple key information in the key information pool; where, if there are historical fragments that are strongly related to the current text question among multiple historical fragments, select historical fragments with a correlation degree greater than a preset threshold as key information and add them to the key information pool; where, the correlation degree of historical fragments represents the degree of correlation between the historical fragment and the current text question.
[0014] Step 103: Construct prompt words based on the current text question and the K target key information.
[0015] Step 104: Determine the answer to the current text question based on the prompt words.
[0016] In one example, a first association score can be determined between the historical text question and the current text question; if the first association score is less than a reference threshold, then it is determined that there is a historical segment among multiple historical segments that is strongly associated with the current text question; wherein, the reference threshold is determined based on the dialogue rounds of the historical text question and the configured compression intensity, the reference threshold is positively correlated with the dialogue rounds, and the reference threshold is negatively correlated with the compression intensity.
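The threshold test in this paragraph can be sketched as follows. The source only states that the reference threshold grows with the number of dialogue rounds and shrinks with the compression intensity; the logarithmic form, the `base` parameter, and both function names are assumptions made purely for illustration.

```python
import math


def reference_threshold(dialogue_rounds: int,
                        compression_intensity: float,
                        base: float = 1.0) -> float:
    """Hypothetical threshold: increases with rounds, decreases with intensity."""
    if dialogue_rounds < 1 or compression_intensity <= 0:
        raise ValueError("rounds must be >= 1 and intensity must be > 0")
    # Assumed form only; any function with the same monotonic behaviour would fit.
    return base * math.log(1 + dialogue_rounds) / compression_intensity


def has_strongly_related_segment(first_association_score: float,
                                 dialogue_rounds: int,
                                 compression_intensity: float) -> bool:
    """A score below the reference threshold signals a strongly related segment."""
    threshold = reference_threshold(dialogue_rounds, compression_intensity)
    return first_association_score < threshold
```

Under this assumed form, a longer dialogue raises the threshold, so more turns are flagged as containing strongly related segments, while a higher compression intensity lowers it.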
[0017] In one example, determining the first association score between the historical text question and the current text question may include, but is not limited to: inputting the current text question and the historical text question into H self-attention network layers of a Transformer model to obtain H weight matrices output by the H self-attention network layers, where H is greater than 1; wherein, the weight matrix includes the attention score of the self-attention network layer for each historical segment; for each historical segment, determining the degree of association of the historical segment based on the attention scores of the H self-attention network layers for that historical segment; and determining the first association score based on the degree of association of each historical segment.
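A minimal sketch of this two-stage computation is given below. It assumes that the per-segment association degree is the equal-weight mean of the H attention scores, and that the first association score is the Shannon entropy of the normalized degrees; the entropy reading follows the "attention entropy" name used elsewhere in this description, but the exact formula is not given in the source.

```python
import math
from typing import List


def association_degrees(head_scores: List[List[float]]) -> List[float]:
    """head_scores[h][w] is the attention score of layer h for historical segment w.

    Returns one association degree per segment (assumed: equal-weight mean).
    """
    num_heads = len(head_scores)
    num_segments = len(head_scores[0])
    return [
        sum(head[w] for head in head_scores) / num_heads
        for w in range(num_segments)
    ]


def first_association_score(degrees: List[float]) -> float:
    """Assumed form: Shannon entropy of the normalized association degrees."""
    total = sum(degrees)
    if total <= 0:
        raise ValueError("association degrees must have a positive sum")
    probabilities = [d / total for d in degrees]
    # Zero-probability terms contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

With this reading, attention concentrated on a few segments gives a low score, which is consistent with a score below the reference threshold indicating a strongly related segment.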
[0018] In one example, a self-attention network layer may include a first sub-network, a second sub-network, and a third sub-network. The current text question and the historical text questions are input into the self-attention network layer to obtain the weight matrix output by the self-attention network layer. This may include, but is not limited to: inputting the current text question into the first sub-network to obtain the query vector of the current text question; inputting the historical text questions into the second sub-network to obtain the key vector of the historical text questions; and inputting the query vector and the key vector into the third sub-network to obtain the weight matrix.
[0019] In one example, the key information pool includes multiple key information items and the importance score of each key information item; the importance score of each key information item is dynamically decayed according to a specified time period; K target key information items are selected from the multiple key information items in the key information pool, which may include, but is not limited to: selecting the K key information items with the largest importance scores from the key information pool as the K target key information items based on the importance scores of each key information item in the key information pool.
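The pool described above can be sketched as a small container with periodic decay and top-K selection. The exponential decay, the `decay_factor` default, and the class and method names are assumptions; the source only says that importance scores decay dynamically over a specified time period.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class KeyInfo:
    text: str
    importance: float


class KeyInfoPool:
    """Hypothetical key information pool with decaying importance scores."""

    def __init__(self, decay_factor: float = 0.9) -> None:
        if not 0 < decay_factor <= 1:
            raise ValueError("decay_factor must be in (0, 1]")
        self.decay_factor = decay_factor
        self.items: List[KeyInfo] = []

    def add(self, text: str, importance: float) -> None:
        self.items.append(KeyInfo(text, importance))

    def decay(self, periods: int = 1) -> None:
        """Apply the assumed exponential decay for the elapsed time periods."""
        factor = self.decay_factor ** periods
        for item in self.items:
            item.importance *= factor

    def top_k(self, k: int) -> List[KeyInfo]:
        """Return the K items with the largest importance scores, highest first."""
        return heapq.nlargest(k, self.items, key=lambda item: item.importance)
```

Because decay is applied to whatever is in the pool at the time, an older item can be overtaken by a newer item that started with a lower score.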
[0020] In one example, constructing prompt words based on the current text question and the K target key information can include, but is not limited to: compressing the historical text questions to obtain context text questions; and constructing prompt words based on the current text question, the context text questions, and the K target key information.
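As a rough illustration, prompt construction from the three inputs could look like the following. The section titles, their order, and the function name `build_prompt` are assumptions; the source does not prescribe a prompt layout.

```python
from typing import List


def build_prompt(current_question: str,
                 context_text: str,
                 key_info: List[str]) -> str:
    """Assemble a prompt from key information, compressed context, and the question."""
    lines = ["Key information from earlier dialogue:"]
    lines += [f"- {item}" for item in key_info]
    lines += [
        "",
        "Compressed dialogue context:",
        context_text,
        "",
        "Current question:",
        current_question,
    ]
    return "\n".join(lines)
```

Placing the key information first is only one possible design choice; the point is that the large model receives the selected items and the compressed context rather than the full dialogue history.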
[0021] In one example, compressing a historical text question to obtain a context text question may include, but is not limited to: compressing a historical text question to obtain a compressed text question; determining a second association score between the compressed text question and the current text question; determining an information retention rate based on the first and second association scores; wherein the information retention rate represents the proportion of original information of the historical text question retained by the compressed text question; if the information retention rate meets the configured retention conditions, then the compressed text question is identified as the context text question; or, if the information retention rate does not meet the retention conditions, then the operation of compressing the historical text question to obtain the compressed text question is returned.
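The compress-and-check loop above can be sketched as follows. Several pieces are assumptions: the retention rate is taken to be the ratio of the second score to the first, clipped to [0, 1]; the retention condition is a simple minimum; each retry uses a less aggressive `keep_ratio`; and `compress` and `score` are hypothetical callables supplied by the caller.

```python
from typing import Callable, List, Sequence


def information_retention_rate(first_score: float, second_score: float) -> float:
    """Assumed form: ratio of the two association scores, clipped to [0, 1]."""
    if first_score <= 0:
        raise ValueError("first_score must be positive")
    return max(0.0, min(1.0, second_score / first_score))


def compress_with_retention(
    history: List[str],
    first_score: float,
    compress: Callable[[List[str], float], List[str]],
    score: Callable[[List[str]], float],
    min_retention: float = 0.8,
    keep_ratios: Sequence[float] = (0.25, 0.5, 0.75),
) -> List[str]:
    """Retry compression until the retention condition is met."""
    for keep_ratio in keep_ratios:
        compressed = compress(history, keep_ratio)
        second_score = score(compressed)
        if information_retention_rate(first_score, second_score) >= min_retention:
            return compressed  # accepted as the context text question
    # No attempt met the condition: fall back to the uncompressed history.
    return history
```

Bounding the number of attempts avoids an endless loop when no compression level satisfies the retention condition; the fallback behaviour is a design choice of this sketch, not something stated in the source.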
[0022] As can be seen from the above technical solutions, in this embodiment, when answering the current text question, key information (i.e., some historical fragments from historical text questions) is input into the large model, instead of the entire content of historical text questions. This avoids inputting noisy information (such as irrelevant historical statements, redundant data, etc.) from historical text questions into the large model, enabling the large model to accurately identify user intent and obtain accurate and reliable answers. Furthermore, the large model does not require consuming significant computing resources to process noisy information from historical text questions, saving computing resources and exhibiting better processing performance. For example, K target key information pieces can be selected from multiple key information pieces in the key information pool, thereby assisting the large model in accurately identifying user intent and obtaining accurate and reliable answers. In addition, when there are historical fragments in multiple historical text questions that are strongly correlated with the current text question, historical fragments with a correlation greater than a preset threshold can be selected as key information and added to the key information pool, thereby dynamically updating the key information pool and ensuring that the key information pool retains important historical fragments.
[0023] The technical solutions described above in the embodiments of this application will be explained below in conjunction with specific application scenarios.
[0024] When an agent answers a current text question, both the current text question and the historical text questions need to be input into a large model. The large model then answers the question based on both inputs and outputs the answer. However, in this approach, the historical text questions input into the large model contain a large amount of noise (such as irrelevant historical statements and redundant data), which biases the large model's understanding of the user's intent and results in an inaccurate and unreliable answer. Furthermore, the large model consumes significant computational resources to process the noise in the historical text questions, wasting resources and resulting in poor performance.
[0025] The large-scale model employs an end-to-end natural language processing framework, which suffers from the following problems when facing complex business scenarios: Limited continuity in multi-turn dialogues: The large-scale model uses a fixed-length sliding window to store dialogue history. When handling complex cross-session business, key information may be discarded after the dialogue turns exceed the window capacity, forcing users to re-enter information, severely impacting user experience and business efficiency. Insufficient key information recognition capability: It cannot effectively distinguish between key information and ordinary text in the dialogue history, leading to the loss of key information data during information compression, thus depriving business decisions of accurate basis. Lack of dynamic adaptability: It cannot dynamically adjust the information retention strategy according to the dialogue turns, causing a sharp decline in system performance in long dialogue scenarios. For example, the tender number in the early dialogues will almost certainly be lost after 50 dialogue turns, causing business processing interruptions.
[0026] To address the aforementioned problems, this application proposes a problem-solving method. This method can be implemented using large models (such as pre-trained language models based on the Transformer architecture, possessing natural language understanding and generation capabilities) and can achieve large-model dialogue understanding based on dynamic context compression. This problem-solving method can be applied to scenarios such as intelligent customer service and knowledge-based question answering, as well as professional business scenarios such as bidding, financial consulting, and government affairs Q&A (i.e., scenarios requiring high-precision entity recognition and long-term dialogue continuity). Dynamic context compression dynamically adjusts the retention strategy based on the information density of each token in the dialogue history.
[0027] The problem-solving method in this embodiment can mitigate the following problems. Context loss in long dialogues: the continuity of key information cannot be maintained within limited storage capacity. Insufficient key information recognition: key information in the dialogue history cannot be effectively distinguished from ordinary text. Lack of dynamic adaptability: the information retention strategy cannot be adjusted dynamically based on dialogue rounds, which undermines system stability in long dialogue scenarios.
[0028] As shown in Figure 2, the embodiments of this application propose a four-layer collaborative large-model user question processing system. This system can include a noun matching layer, a multi-turn dialogue layer, a long-term memory layer, and a path planning layer. The noun matching layer implements the noun matching process, enabling standardization and accurate retrieval of business entities. This solves the problem of insufficient entity retrieval accuracy and effectively handles variant representations of entities in business scenarios (such as company abbreviations and project aliases), avoiding the omission of key information. The multi-turn dialogue layer implements the multi-turn dialogue process, ensuring the continuity of long-term dialogue context and solving the problem of continuity breaks in multi-turn dialogues. It retains early key parameters in long dialogues, preventing interruptions in business processing flows. The long-term memory layer implements the long-term memory process, providing enhanced memory storage and retrieval of business metadata. This solves the problem of memory being disconnected from business scenarios and binds long-term memory to business parameters, resulting in high relevance of historical information retrieval. The path planning layer implements the path planning process, dynamically mapping intent recognition to business processes. This solves the problem of intent routing mismatch with business topology and adapts to dynamic business links.
[0029] The noun matching layer, multi-turn dialogue layer, long-term memory layer, and path planning layer can achieve context pass-through through cross-layer state synchronization interfaces, forming a closed-loop collaborative mechanism. For the noun matching process, noun matching refers to the process of identifying and standardizing business entities (such as company names and project numbers) from user input. For the multi-turn dialogue process, it refers to a continuous dialogue management mechanism spanning multiple interaction rounds, requiring the maintenance of context consistency. For the long-term memory process, it refers to the persistent storage and efficient retrieval of historical dialogues and business parameters, transcending the limitations of a large model context window. For the path planning process, it refers to the ability to dynamically select and execute multiple business processing flows based on user intent.
[0030] This embodiment describes the processing procedure of the multi-turn dialogue layer. The processing procedures of the noun matching layer, long-term memory layer, and path planning layer will not be elaborated upon here. In the multi-turn dialogue layer, to address the problems of lost context in long dialogues, insufficient key information recognition, and lack of dynamic adaptability, the multi-turn dialogue layer employs a dynamic context compression method based on attention entropy (first association score).
[0031] As shown in Figure 2, the multi-turn dialogue layer involves a short-term memory loop and a key information pool. The short-term memory loop stores historical text questions, while the key information pool stores multiple key pieces of information. Based on the short-term memory loop and the key information pool, the multi-turn dialogue layer employs an attention entropy-based dynamic context compression method, namely AEDCC (Attention Entropy-based Dynamic Context Compression). AEDCC identifies key information and performs intelligent compression based on an attention mechanism.
[0032] In one example, at the multi-turn dialogue layer, a dynamic context compression method based on attention entropy is used to address the loss of key parameters (key information) in long dialogue scenarios. For example, Figure 3 is a flowchart of the multi-turn dialogue process, which may include the following steps: Step 301: Obtain the historical text questions of the current text question. The historical text questions can be the text questions preceding the current text question, and the historical text questions can include multiple historical segments.
[0033] In one example, a user can enter multiple text questions in multiple rounds. For the current text question, the text questions from the previous A rounds can be used as historical text questions, where A can be a positive integer, such as the text questions from the previous 20 rounds.
[0034] Historical text questions can include multiple tokens, also known as lexical units. In AI, a token is the smallest unit of text processing and can be a word, subword, or character. For example, in natural language processing, text is segmented into tokens for the model to process. In this embodiment, the tokens in the historical text questions are referred to as historical segments; historical segments can also be called historical lexical units.
[0035] Step 302: Input the current text question and the historical text questions into H self-attention network layers of the Transformer model to obtain H weight matrices output by the H self-attention network layers, where H is greater than or equal to 1. The weight matrices include the attention scores of the self-attention network layers for each historical segment.
[0036] In one example, a self-attention layer can be a core component of a Transformer model, enabling it to capture dependencies between different positions in the input sequence without relying on the structure of a recurrent neural network (RNN) or convolutional neural network (CNN). The core idea of a self-attention layer is to calculate the relevance (or attention) score of each element in the input sequence to other elements, and then update the representation of each element based on these relevance scores. This self-attention mechanism allows the Transformer model to pay attention to other relevant elements in the sequence while processing the current element, thereby capturing richer contextual information.
[0037] To improve the representational and generalization capabilities of the Transformer model, the self-attention network layers can employ a multi-head attention mechanism. This means the Transformer model can include H self-attention network layers, where H can be a positive integer greater than 1. These H layers project the input sequence into multiple distinct subspaces, allowing for independent computation of self-attention within each subspace. This multi-head attention mechanism enables the Transformer model to capture different dependencies in different subspaces, thereby enhancing its representational power and helping to mitigate the vanishing and exploding gradient problems, as each subspace has its own gradient path.
[0038] For each self-attention network layer, the self-attention network layer determines the query vector based on the current text question, determines the key vector based on the historical text questions, and processes the query vector and key vector to obtain a weight matrix, which includes the attention score of the self-attention network layer for each historical segment.
[0039] For example, this self-attention network layer determines the query vector based on the current text question, determines the key vector based on the historical text questions, and processes the query vector, key vector, and value vector to obtain the weight matrix. The value vector can be the same as the query vector, or the value vector can be the same as the key vector, or the value vector can be a fixed vector value, or the value vector can be obtained in other ways.
[0040] For example, a self-attention network layer can include a first sub-network, a second sub-network, and a third sub-network. The current text question is input into the first sub-network to obtain the query vector, and historical text questions are input into the second sub-network to obtain the key vector. The query vector and key vector are then input into the third sub-network to obtain the weight matrix. Alternatively, the query vector, key vector, and value vector are input into the third sub-network to obtain the weight matrix. The value vector is the same as the query vector, or the value vector is the same as the key vector.
[0041] For example, the Transformer model includes self-attention network layer 1, self-attention network layer 2, ..., self-attention network layer H. After the current text question and the historical text questions are input into self-attention network layer 1, self-attention network layer 1 can determine the query vector based on the current text question. For example, the current text question can be processed using a linear layer or a convolutional layer (i.e., the first sub-network can be either a linear layer or a convolutional layer) to obtain the query vector (which can be called the Q vector); this processing method is not restricted, and the query vector can be denoted as Q. Self-attention network layer 1 can determine the key vector based on the historical text questions. For example, the historical text questions can be processed using a linear layer or a convolutional layer (i.e., the second sub-network can be either a linear layer or a convolutional layer, and the second sub-network differs from the first sub-network) to obtain the key vector (which can be called the K vector); this processing method is not restricted, and the key vector can be denoted as K. Self-attention network layer 1 can also determine the value vector (which can be called the V vector). For example, the query vector can be used as the value vector, or the key vector can be used as the value vector, or other methods can be used to obtain the value vector.
[0042] The query vector Q can also be called the current query vector. Its type can be a vector (a floating-point array whose dimension is consistent with the hidden layers of the large model). The query vector Q is used to encode the semantic features of the current user input. For example, when a user inputs "query XXX tender document", the query vector Q is used to capture the core intent of "query" and "tender". As the query input of the attention mechanism, the query vector Q can drive the relevance filtering of historical information.
[0043] The key vector K can also be called the history key vector. Its type can be a vector (an array of floating-point numbers). The key vector K is used to encode the semantic features of the historical dialogue context (historical text questions), which is stored in a circular cache of the multi-turn dialogue layer (e.g., the most recent 20 turns of dialogue). Based on the query vector Q and the key vector K, self-attention network layer 1 can calculate the correlation strength between the current text question and the historical context, which supports passing parameters across rounds.
[0044] After obtaining the query vector, key vector, and value vector, self-attention network layer 1 can process these vectors to obtain the weight matrix. That is, the third sub-network of self-attention network layer 1 processes these vectors to obtain the weight matrix. The network structure of this third sub-network is not restricted; its input features are the query vector, key vector, and value vector, and its output feature is the weight matrix. For example, the weight matrix can be calculated using the following formula:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
In the above formula, Q can represent the query vector, K can represent the key vector, and V can represent the value vector. $d_k$ can represent the vector dimension, such as the dimension of the key vector, i.e., its length or size. For example, if the key vector is 512-dimensional, then $d_k$ is 512. The softmax function normalizes the output weights into a probability distribution whose values sum to 1.
[0045] Attention(Q, K, V) represents the weight matrix, which is the output feature of the self-attention network layer 1. That is, after inputting the query vector, key vector, and value vector into the self-attention network layer 1, the self-attention network layer 1 can output the weight matrix, which can include the attention score of the self-attention network layer 1 for each historical segment (i.e., each historical segment in the historical text question).
[0046] For example, when converting the current text question into a query vector, we obtain a query vector of dimension P, where P can be a positive integer, meaning the query vector includes P vector values. When converting historical text questions into key vectors, if the historical text question includes W historical segments, we obtain a key vector of dimension P × W, where W can be a positive integer, meaning the key vector includes P × W vector values. For instance, for each historical segment of the historical text question, the key vector includes P vector values corresponding to that segment, and the W historical segments correspond to P × W vector values, which together form the key vector.
[0047] After the query vector, key vector, and value vector are input into self-attention network layer 1, the layer can output a weight matrix of dimension W, that is, a weight matrix that includes W feature values. The first feature value of the weight matrix represents the attention score of self-attention network layer 1 for historical segment 1, the second feature value represents its attention score for historical segment 2, and so on, up to the W-th feature value, which represents its attention score for historical segment W. In this way, the attention score of self-attention network layer 1 for each historical segment can be obtained.
[0048] Similarly, we can obtain the attention score of self-attention network layer 2 for each historical segment, the attention score of self-attention network layer 3 for each historical segment, and so on. For H self-attention network layers, we can obtain the attention score of each of the H self-attention network layers for each historical segment.
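The per-layer scoring described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the attention score of a segment is taken to be the softmax-normalized scaled dot product between the query vector and that segment's key vector (the term that precedes multiplication by V), and the names `softmax` and `segment_attention_scores` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def segment_attention_scores(query, keys):
    """Return W attention scores, one per historical segment.

    query: list of P floats derived from the current text question.
    keys:  list of W lists of P floats, one list per historical segment.
    Assumption: score = softmax(q . k / sqrt(d_k)) with d_k = P.
    """
    d_k = len(query)
    logits = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    return softmax(logits)


query = [1.0, 0.0, 1.0, 0.0]  # P = 4
keys = [
    [1.0, 0.0, 1.0, 0.0],  # segment 1: aligned with the query
    [0.0, 1.0, 0.0, 1.0],  # segment 2: orthogonal to the query
    [0.5, 0.5, 0.5, 0.5],  # segment 3: partially aligned
]
scores = segment_attention_scores(query, keys)  # W = 3 scores summing to 1
```

Repeating this call with the parameters of each of the H layers would yield the H attention scores per segment that the following steps consume.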
[0049] Step 303: For each historical segment, determine the degree of association of the historical segment based on the attention scores of H self-attention network layers for that historical segment. The degree of association can also be called the weight.
[0050] For example, based on the attention scores of H self-attention network layers for this historical segment, these attention scores (i.e., the H attention scores) can be weighted to obtain the total attention score. When weighting the H attention scores, the weighting coefficients of different self-attention network layers (attention scores) can be the same, or the weighting coefficients of different self-attention network layers (attention scores) can be different. For example, the weighting coefficient of self-attention network layer 1 is greater than the weighting coefficient of self-attention network layer 2, the weighting coefficient of self-attention network layer 2 is greater than the weighting coefficient of self-attention network layer 3, and so on.
[0051] For example, based on the total attention score and the total number H of self-attention network layers, the degree of association of the historical segment can be determined. For instance, the quotient of the total attention score and H can be used as the degree of association of the historical segment, which indicates the degree of association between the historical segment and the current text problem.
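As a hedged illustration of step 303, the sketch below weights the H attention scores of one historical segment, sums them into the total attention score, and divides by H. The function name `degree_of_association` and the sample coefficients are assumptions made for this example only.

```python
def degree_of_association(layer_scores, coefficients=None):
    """Combine the H per-layer attention scores of ONE historical segment.

    layer_scores: list of H floats, one per self-attention network layer.
    coefficients: optional list of H weighting coefficients; equal
                  coefficients of 1.0 are assumed when omitted.
    """
    h = len(layer_scores)
    if h == 0:
        raise ValueError("at least one layer score is required")
    if coefficients is None:
        coefficients = [1.0] * h
    if len(coefficients) != h:
        raise ValueError("one coefficient per layer is required")
    # Total attention score: weighted sum over the H layers.
    total = sum(c * s for c, s in zip(coefficients, layer_scores))
    # Degree of association: quotient of the total attention score and H.
    return total / h


equal = degree_of_association([0.6, 0.4, 0.5])
# Earlier layers weighted more heavily than later ones.
skewed = degree_of_association([0.6, 0.4, 0.5], [1.5, 1.0, 0.5])
```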
[0052] In summary, the degree of association can be obtained for each historical segment in the historical text question. The degree of association of a historical segment represents its contribution to the current text question. Thus, based on the current text question and the historical text question, the degree of association of each historical segment, and consequently its contribution to the current text question, can be determined.
[0053] In one example, the degree of association of a historical fragment can also be denoted as Wtoken or Token weight (Token weight is used as an example below). The Token weight is a scalar floating-point number in the range [0, 1]. Its role is to quantify the importance of a particular Token (historical fragment) in the historical dialogue (historical text question): the higher the Token weight, the greater the Token's contribution to the current dialogue (current text question). For example, in a bidding scenario, the weight of the "bid number" Token might be close to 1.0, while the weight of an auxiliary word (such as "of") might be close to 0. The Token weight also serves as the decision criterion for dynamic memory compression: a Token with Wtoken > 0.7 is permanently retained, which prevents the loss of key parameters (key information). That is, historical fragments with high Token weights are permanently retained instead of being deleted.
[0054] For the self-attention network layer (head), head represents the attention head index (i.e., the index over the H self-attention network layers) and is an integer ranging from 1 to H. Its role is to identify a specific head in the multi-head attention mechanism. The Transformer model uses multi-head attention (multiple parallel attention layers) to capture semantic dependencies of different dimensions, and the index head traverses all heads so that global information is aggregated. H self-attention network layers are used to ensure that the weight calculation covers semantic, syntactic, and other features, improving robustness.
[0055] The total number of self-attention network layers, H, represents the total number of attention heads. H is an integer (e.g., 12 or 16). H is the total number of heads in the multi-head attention mechanism, used to normalize the weights (dividing by H to avoid numerical bias). H is used to standardize weight calculations, making token weights comparable across different model configurations.
[0056] Step 304: Determine the first association score based on the degree of association of each historical segment. The first association score represents the degree of association between the historical text question and the current text question, i.e., determine the first association score between the historical text question and the current text question. For example, the first association score can be attention entropy, or other types of parameters can be used, as long as they can reflect the degree of association. In the following embodiments, attention entropy is used as an example for the first association score, and the implementation process of other parameters is similar.
[0057] In one example, attention entropy can be the information entropy calculated from the attention distribution of the Transformer model. Attention entropy represents the degree of association between the historical text question and the current text question. A lower attention entropy (first association score) indicates more concentrated attention: the Transformer model focuses on a few key historical fragments, meaning that the historical text question contains fragments strongly associated with the current text question, and these fragments are the key information that must be preserved during compression. A higher attention entropy (first association score) indicates a flatter attention distribution: the Transformer model spreads its attention across most historical fragments, meaning that the historical text question contains no fragment strongly associated with the current text question.
[0058] In one example, attention entropy can be calculated using the following formula: $H_{attn} = -\sum_{i=1}^{Y} a_i \log a_i$. In the above formula, $H_{attn}$ represents the attention entropy, which measures how concentrated the attention distribution is. $a_i$ represents the attention weight of the i-th historical segment, i.e., its degree of association (the degree of association can also be called the weight of the historical segment); it reflects the degree of attention the Transformer model pays to that segment, and the values $a_1, \dots, a_Y$ together can be understood as an attention probability distribution. The value of i ranges from 1 to Y, where Y represents the total number of historical segments.
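To make the behavior of attention entropy concrete, the following sketch computes the Shannon entropy of a list of segment weights. It assumes the natural logarithm and renormalizes the weights so that they sum to 1; both choices, and the name `attention_entropy`, are assumptions for illustration.

```python
import math


def attention_entropy(weights):
    """Shannon entropy of the attention distribution over segments.

    weights: non-negative degrees of association, one per historical segment.
    Assumption: natural logarithm; weights are renormalized to sum to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


concentrated = attention_entropy([0.97, 0.01, 0.01, 0.01])  # one dominant segment
flat = attention_entropy([0.25, 0.25, 0.25, 0.25])          # uniform attention
```

A uniform distribution over Y segments gives the maximum value log(Y), while a distribution dominated by one segment gives a value near 0, which matches the low-entropy and high-entropy cases described above.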
[0059] For example, the Attention Entropy-Based Dynamic Context Compression (AEDCC) method is built on the information bottleneck principle. Its goal is to minimize the mutual information between the compressed context C and the original dialogue history X, while maximizing the mutual information between the compressed context C and the task output Y.
[0060] For example, the above requirement can be expressed as: min I(X;C) - βI(C;Y). I(X;C) represents the mutual information between X and C, used to measure the amount of original information retained in the compressed context. I(C;Y) represents the mutual information between C and Y, used to measure the contribution of the compressed context to the task output. β represents the information compression coefficient, used to control the balance between information retention and compression. The information compression coefficient is configured according to actual needs, such as 0.85, which achieves the optimal balance between information retention and compression efficiency on the test set. The information compression coefficient can be adjusted within the range of [0.75, 0.95] according to business scenario requirements to adapt to different compression needs.
[0061] The information bottleneck theory requires minimizing I(X;C) - βI(C;Y). Mathematical derivation proves that minimizing I(X;C) - βI(C;Y) is equivalent to minimizing attention entropy, while ensuring the retention of key information. Based on this, attention entropy can be calculated, and the retention of key information can be ensured based on this entropy. When calculating attention entropy, the attention score between the current query vector and the historical dialogue key vectors can be calculated and converted into a probability distribution using softmax, thus determining the attention entropy based on the relevance of each historical segment.
[0062] Step 305: Determine whether the first association score of the current text question is less than the reference threshold.
[0063] If so, i.e., the first association score is less than the reference threshold, then step 306 can be executed. When the first association score is less than the reference threshold, it means that among the multiple historical fragments of the historical text question, there are historical fragments that are strongly associated with the current text question. In other words, the Transformer model focuses its attention on a few key historical fragments (i.e., historical fragments that are strongly associated with the current text question). Thus, historical fragments that are strongly associated with the current text question (historical fragments with an association degree greater than the preset threshold are considered strongly associated historical fragments) are key information and need to be retained in the key information pool. The relevant process can be found in step 306.
[0064] If not, i.e., the first association score is not less than the reference threshold, then step 307 can be executed. When the first association score is not less than the reference threshold, it means that there are no historical fragments among the multiple historical fragments of the historical text question that are strongly associated with the current text question. In other words, the Transformer model distributes its attention to most historical fragments. Thus, there is no key information in multiple historical fragments, and there is no need to retain key information in the key information pool.
[0065] In one example, the reference threshold can be configured according to actual needs, or it can be determined based on the number of dialogue turns in the historical text questions and the configured compression strength (i.e., the information compression coefficient). For instance, the reference threshold is positively correlated with the number of dialogue turns and negatively correlated with the compression strength. For example, the reference threshold can be derived using the following formula: H_threshold = log(A) - β. H_threshold represents the reference threshold, A represents the number of dialogue turns in the historical text questions, and β represents the compression strength.
[0066] For example, the number of dialogue turns A can represent the length of the dialogue history. The multi-turn dialogue layer involves a short-term memory (SSM) ring, which stores the historical text questions. Assuming the SSM ring stores the text questions of the 20 turns preceding the current text question, the number of dialogue turns A is 20. For A turns of dialogue, the theoretically expected value of the attention entropy is log(A). Regarding the compression strength β, experiments have determined that its optimal value is 0.85, which achieves the best balance between information preservation and compression efficiency on the test set.
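The threshold formula and the comparison in step 305 can be sketched as below. The base of the logarithm is not stated in the text, so the natural logarithm is assumed here; the function names are hypothetical.

```python
import math


def reference_threshold(dialogue_turns, beta=0.85):
    """H_threshold = log(A) - beta, with the natural logarithm assumed."""
    if dialogue_turns < 1:
        raise ValueError("dialogue_turns must be at least 1")
    return math.log(dialogue_turns) - beta


def has_strongly_associated_segment(first_score, dialogue_turns, beta=0.85):
    """Step 305: True when the first association score is below the threshold."""
    return first_score < reference_threshold(dialogue_turns, beta)


threshold_20 = reference_threshold(20)                     # A = 20, beta = 0.85
update_pool_needed = has_strongly_associated_segment(0.5, 20)
no_update_needed = has_strongly_associated_segment(2.5, 20)
```

Under these assumptions the threshold grows with the number of dialogue turns and shrinks as the compression strength increases, as the text requires.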
[0067] Step 306: Select candidate historical fragments (one or more historical fragments) from multiple historical fragments of the historical text problem, use the candidate historical fragments as key information, and add the selected key information (i.e., candidate historical fragments) to the key information pool. After step 306, step 307 can be executed.
[0068] For example, candidate historical fragments are historical fragments whose degree of association is greater than a preset threshold, where the degree of association of a historical fragment indicates how strongly it is associated with the current text question. As described in the steps above, the degree of association of each historical fragment in the historical text question has already been determined. Therefore, historical fragments with a degree of association greater than the preset threshold can be used as candidate historical fragments and added to the key information pool.
[0069] In one example, when the first association score is less than the reference threshold, it indicates that the multiple historical fragments include fragments strongly associated with the current text question. These fragments are key information and are retained in the key information pool. On this basis, the historical fragments strongly associated with the current text question can be used as candidate historical fragments and added to the key information pool as key information.
[0070] In one example, for each historical fragment of the historical text question, if its degree of association is greater than the preset threshold (which can be configured according to actual needs), the fragment is strongly associated with the current text question and can be used as a candidate historical fragment. If its degree of association is not greater than the preset threshold, the fragment is not used as a candidate historical fragment. Based on this processing, candidate historical fragments can be selected from the multiple historical fragments of the historical text question.
[0071] In one example, the key information pool can include multiple key information items and their importance scores. On this basis, when a candidate historical fragment is added to the key information pool as key information, its degree of association is used as its importance score, and this importance score is recorded in the key information pool. In summary, candidate historical fragments are selected from the multiple historical fragments and added to the key information pool as key information, with their degrees of association serving as their importance scores.
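Step 306 can be illustrated with the sketch below, which models the key information pool as a dictionary from key information to importance score. The dictionary representation, the function names, the choice to keep the higher score when an item is already in the pool, and the example threshold of 0.7 are all assumptions for illustration.

```python
def select_candidates(associations, preset_threshold):
    """Keep fragments whose degree of association exceeds the threshold.

    associations: dict mapping each historical fragment to its degree
                  of association with the current text question.
    """
    return {
        fragment: degree
        for fragment, degree in associations.items()
        if degree > preset_threshold
    }


def update_pool(pool, associations, preset_threshold):
    """Add candidates to the pool, using the degree as the importance score.

    Assumption: if a fragment is already pooled, the higher score is kept.
    """
    for fragment, degree in select_candidates(associations, preset_threshold).items():
        pool[fragment] = max(pool.get(fragment, 0.0), degree)
    return pool


pool = {}
associations = {"ZH-2024-038": 0.82, "EPC": 0.75, "of": 0.02}
update_pool(pool, associations, preset_threshold=0.7)
```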
[0072] In one example, the key information pool can include multiple key information items and their importance scores. The importance score of each key information item decays dynamically over time, for example decreasing by 5% in every specified time period. For instance, the key information items in the pool can undergo exponential decay at a rate of 0.05 (daily decay rate). Taking a time period of one day as an example, and assuming the initial importance score of a key information item is 1, its importance score is 0.95 (1 × 95%) on the first day after it is stored, approximately 0.90 (0.95 × 95% = 0.9025) on the second day, and so on, decreasing by 5% each day. Based on this processing, the importance scores of the key information items in the key information pool decay dynamically rather than remaining constant.
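The daily decay in this example amounts to multiplying the score by 0.95 once per period. A minimal sketch, assuming this discrete form (the function name `decayed_score` is hypothetical):

```python
def decayed_score(initial_score, periods, rate=0.05):
    """Importance score after `periods` time periods of 5% decay each."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return initial_score * (1.0 - rate) ** periods


day_1 = decayed_score(1.0, 1)   # 0.95
day_2 = decayed_score(1.0, 2)   # 0.9025
```

A continuous form such as exp(-0.05 × t) would give slightly different values; the discrete form is used here because it reproduces the worked numbers in the example.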
[0073] The key information in the key information pool can also be called key business parameters. These are parameters of critical significance in a specific business scenario, such as the tender number, project type, amount range, and time point. They are the focus of attention and need to be kept intact throughout long conversations.
[0074] Step 307: Select K target key information from multiple key information in the key information pool.
[0075] For example, based on the importance scores of each key piece of information in the key information pool, K key pieces of information with high importance scores can be selected from multiple key pieces of information in the key information pool as K target key pieces of information, where K can be a positive integer.
[0076] For example, if the first association score is less than the reference threshold, the key information pool is first updated. Based on the updated key information pool, the K key information pieces with the highest importance scores are selected as the K target key information pieces. Alternatively, if the first association score is not less than the reference threshold, the key information pool does not need to be updated; the K key information pieces with the highest importance scores are directly selected from the key information pool as the K target key information pieces.
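Selecting the K target key information items by importance score can be sketched with the standard library's `heapq.nlargest`. The dictionary-based pool and the function name are assumptions for illustration; the sample scores are invented.

```python
import heapq


def select_target_key_info(pool, k):
    """Return the k (key information, importance score) pairs with the
    highest importance scores, in descending order of score."""
    return heapq.nlargest(k, pool.items(), key=lambda item: item[1])


pool = {
    "ZH-2024-038": 0.95,
    "EPC": 0.80,
    "123.456 million yuan": 0.85,
    "March 15, 2024": 0.60,
}
targets = select_target_key_info(pool, k=2)
```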
[0077] Step 308: Compress the historical text question to obtain the context text question.
[0078] In one example, historical text questions can be compressed to obtain compressed text questions. For instance, historical text questions can be input into a large model, which can then compress them, such as removing noise (irrelevant historical statements, redundant data, and useless words), resulting in compressed text questions. Another example is that historical text questions can be input into a large model, which can then generate a text summary based on the historical text questions, including useful information from the historical text questions.
[0079] Of course, the above are just two examples of compression methods. This embodiment places no restriction on how the historical text questions are compressed, as long as information compression is achieved.
[0080] In one example, after obtaining the compressed text question, a second association score can be determined between the current text question and the compressed text question. For instance, the current text question and the compressed text question (i.e., replacing historical text questions with the compressed text question) are input into H self-attention network layers of a Transformer model, resulting in H weight matrices output by the H self-attention network layers. For each historical segment within the compressed text question, the association degree of that historical segment is determined based on the attention scores of the H self-attention network layers for that historical segment. The second association score is then determined based on the association degree of each historical segment. The process for determining the second association score is similar to that for the first association score; simply replace the historical text question with the compressed text question.
[0081] In one example, after obtaining the second association score, the information retention rate is determined based on the first association score (determined from the current text question and the historical text questions) and the second association score (determined from the current text question and the compressed text question). The information retention rate represents the proportion of original information from the historical text questions that is retained by the compressed text question, and it is a core indicator for evaluating the compression effect. For example, the information retention rate can be determined using a formula (not reproduced in this text) in which retention_rate represents the information retention rate, H(original) represents the first association score, and H(compressed) represents the second association score.
[0082] In one example, after obtaining the information retention rate, if the information retention rate meets the configured retention condition, such as being greater than a preset threshold (which can be configured according to actual needs, such as 0.93), then the compressed text question can be used as the context text question, and step 309 can be executed.
[0083] After obtaining the information retention rate, if the information retention rate does not meet the retention conditions, such as being less than a preset threshold, the historical text question can be recompressed to obtain a compressed text question. For example, the large model is notified that the information retention rate does not meet the retention conditions, triggering the large model to recompress the historical text question, and so on. There are no restrictions on the recompression process of this historical text question.
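The compress, check, and recompress loop of step 308 can be sketched as below. Because the retention formula is not reproduced in this text, the sketch takes it as a caller-supplied function; `compress`, `score`, and `retention` are hypothetical callables, `max_attempts` is an added safeguard that the source does not mention, and the toy stand-ins at the bottom exist only to exercise the control flow and are not the embodiment's formulas.

```python
from typing import Callable, Optional


def compress_until_retained(
    history: str,
    compress: Callable[[str, int], str],
    score: Callable[[str], float],
    retention: Callable[[float, float], float],
    min_retention: float = 0.93,
    max_attempts: int = 3,
) -> Optional[str]:
    """Recompress until the retention condition is met or attempts run out."""
    h_original = score(history)  # stands in for the first association score
    for attempt in range(1, max_attempts + 1):
        compressed = compress(history, attempt)
        h_compressed = score(compressed)  # stands in for the second score
        if retention(h_original, h_compressed) > min_retention:
            return compressed  # accepted as the context text question
    return None  # no attempt satisfied the retention condition


# Toy stand-ins (assumptions, NOT the embodiment's formulas).
history = " ".join(f"w{i}" for i in range(20))


def toy_compress(text, attempt):
    keep = 10 if attempt == 1 else 19  # first attempt drops too much
    return " ".join(text.split()[:keep])


def toy_score(text):
    return float(len(text.split()))


def toy_retention(h_original, h_compressed):
    return h_compressed / h_original


context = compress_until_retained(history, toy_compress, toy_score, toy_retention)
```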
[0084] Step 309: Construct prompt words based on the current text question, the context text question, and the K target key information, and determine the answer to the current text question based on the prompt words.
[0085] For example, the prompt words can be reconstructed so that they include the current text question, the context text question, and the K target key information. Within the prompt words, key business parameters (such as tender number, project type, and amount range) are retained through the K target key information. By converting the historical text questions into the context text question, redundant dialogue content from the historical text questions can be removed, ensuring that the prompt words stay within the token limit.
[0086] The prompt words can be input into the large model, and the large model determines the answer to the current text question based on the prompt words. There are no restrictions on the processing performed by the large model, as long as the answer to the question is obtained.
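One possible way to assemble the prompt words is sketched below. The section labels and layout are hypothetical; the text only requires that the prompt include the current text question, the context text question, and the K target key information.

```python
def build_prompt(current_question, context_question, target_key_info):
    """Assemble prompt words from the three inputs of step 309.

    The "Key information / Context / Question" layout is an assumption.
    """
    lines = ["Key information:"]
    lines += [f"- {item}" for item in target_key_info]
    lines += ["", "Context:", context_question]
    lines += ["", "Question:", current_question]
    return "\n".join(lines)


prompt = build_prompt(
    current_question="When was the bidding for project ZH-2024-038 opened?",
    context_question="The user previously asked about an EPC project.",
    target_key_info=["ZH-2024-038", "EPC"],
)
```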
[0087] In one example, see Figure 4, which illustrates the processing of the multi-turn dialogue layer. This layer employs a two-layer buffer architecture comprising a short-term memory ring and a key information pool. The short-term memory ring, implemented as a circular queue, stores historical text questions, such as the dialogue text of the previous 20 rounds (configurable). That is, the circular queue caches the most recent A rounds (e.g., 20 rounds), namely the text questions of the A rounds preceding the current text question (the historical text questions). Text questions in the circular queue are overwritten over time; for example, in the 21st round of dialogue the text question from the 1st round is no longer available, in the 22nd round the text question from the 2nd round is no longer available, and so on. Each time a current text question is obtained, it is written to the circular queue and the earliest text question is deleted. The key information pool, implemented as a priority queue, stores the filtered key business parameters (i.e., key information).
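The two-layer buffer can be sketched with a bounded `collections.deque` for the short-term memory ring and a `heapq`-based priority queue for the key information pool. The class and method names are hypothetical, and storing negated scores to obtain highest-first ordering is an implementation choice of this sketch.

```python
import heapq
from collections import deque


class DialogueBuffer:
    """Short-term memory ring plus key information pool (illustrative)."""

    def __init__(self, turns=20):
        # A deque with maxlen drops its oldest entry when a new one arrives.
        self.ring = deque(maxlen=turns)
        # Min-heap of (-importance, key_info), so the highest score is first.
        self._heap = []

    def add_question(self, question):
        self.ring.append(question)

    def add_key_info(self, key_info, importance):
        heapq.heappush(self._heap, (-importance, key_info))

    def top_key_info(self, k):
        return [info for _, info in heapq.nsmallest(k, self._heap)]


buffer = DialogueBuffer(turns=20)
for turn in range(1, 23):  # 22 rounds of dialogue
    buffer.add_question(f"question {turn}")
buffer.add_key_info("EPC", 0.80)
buffer.add_key_info("ZH-2024-038", 0.95)
oldest = buffer.ring[0]  # rounds 1 and 2 have been overwritten
```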
[0088] In the multi-turn dialogue layer, the first association score of the current text question can be determined based on the historical text questions within the short-term memory ring. If the first association score is less than a reference threshold (e.g., 1.2), it is determined that the historical text question contains key business parameters, and the key information pool needs to be updated (i.e., the key business parameters are added to the key information pool). K target key information is selected from the key information pool. If the first association score is not less than the reference threshold (e.g., 1.2), K key information is directly selected from the key information pool as K target key information. Then, the prompt words are reconstructed based on the K target key information, and the prompt words are input into the large model for processing to obtain the question answer corresponding to the current text question.
[0089] In one example, at the multi-turn dialogue layer, in addition to dynamic context compression based on attention entropy, an entropy-sensitive sampling compression method can also be used. For instance, the attention entropy value of each token can be calculated, the tokens can be sorted by attention entropy value from low to high, the K tokens with the lowest attention entropy values can be retained, and the dialogue history can be reconstructed as the retained token sequence. This approach is suitable for application scenarios where coherence requirements are not high.
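A minimal sketch of entropy-sensitive sampling follows. The per-token attention entropy values are taken as given, since the text does not specify how they are computed; restoring the original token order after selection is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
def entropy_sensitive_sample(tokens, entropies, k):
    """Keep the k tokens with the lowest attention entropy values.

    tokens:    list of tokens from the dialogue history.
    entropies: list of attention entropy values, aligned with tokens.
    Assumption: retained tokens are returned in their original order.
    """
    if len(tokens) != len(entropies):
        raise ValueError("tokens and entropies must have the same length")
    # Rank token positions from lowest to highest entropy.
    ranked = sorted(range(len(tokens)), key=lambda i: entropies[i])
    keep = sorted(ranked[:k])
    return [tokens[i] for i in keep]


tokens = ["the", "bid", "number", "of", "ZH-2024-038"]
entropies = [1.9, 0.4, 0.6, 2.1, 0.1]
retained = entropy_sensitive_sample(tokens, entropies, k=3)
```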
[0090] For example, in simple consultation scenarios, the multi-turn dialogue layer can use an entropy-sensitive sampling compression method to quickly implement basic functions.
[0091] In one example, the system collaboration workflow can be as follows. The user inputs the text question: "When was the bidding for XX Transportation's ZH-2024-038 project opened?" In the multi-turn dialogue layer, the attention entropy is calculated based on the historical text questions, the key parameter (i.e., key information) is identified as "ZH-2024-038", the key parameter is retained in the key information pool, and redundant dialogue content is compressed.
[0092] Then, the reconstructed prompt is: query the bidding time of project ZH-2024-038 with unified social credit code "91000000000000000B". The prompt may also include contextual text questions, etc. Based on this prompt, the large model accurately returns "Project ZH-2024-038 was opened for bidding on March 15, 2024".
[0093] In one example, a specific application in a bidding scenario is as follows. The user inputs the text question: "When was the bidding for XX Transportation's ZH-2024-038 project opened?" Multi-turn dialogue layer processing (assuming the current text question is the 35th round of dialogue) includes: short-term memory ring: stores the content of the most recent 20 rounds of dialogue; key information pool: extracts key parameters from the historical dialogue; attention entropy calculation: calculates the attention entropy based on the weight of each historical segment; multi-turn to single-turn prompt word reconstruction: retains the key parameters "Unified Social Credit Code 91000000000000000B" and "ZH-2024-038".
[0094] Redundant content was removed from the historical text question, and the prompt was reconstructed as follows: Query the bidding time of project ZH-2024-038 with unified social credit code "91000000000000000B". Based on this prompt, the large model accurately returned "Project ZH-2024-038 was opened for bidding on March 15, 2024".
[0095] In one example, a performance demonstration is presented for a 50-round dialogue scenario, simulating the complete process of a user conducting 50 rounds of bidding consultation. The dialogue scenario describes the user starting with a project inquiry and gradually delving into multiple stages, including qualification requirements, bidding procedures, and bid bond payment. Key business parameters include: project number (ZH-2024-038), project type (EPC), amount range (123.456 million yuan), and bid opening time (March 15, 2024). The first round inputs the project number, and the 50th round inquires, "How should the bid bond for the previously mentioned 123.456 million yuan EPC project be paid?". Based on this, the processing flow is as follows: Multi-turn dialogue layer: Short-term memory ring: stores the most recent 20 rounds of dialogue; Key information pool: retains key parameters such as project number, project type, amount range, bidding time, etc. through the AEDCC algorithm.
[0096] System output: The large model accurately answers the question, "The deposit for EPC project ZH-2024-038 must be paid before March 10, 2024, and the amount is 2% of the contract price, i.e., RMB 2,469,120." Based on the above processing method, it can successfully identify and retain all key parameters, accurately answering the user's question.
[0097] In one example, the test dataset is constructed as follows. Business scenario data: real bidding data from 2020 to 2024, obtained from the public service platform for bidding and tendering, including 15,682 bidding projects and information on 28,431 companies. Dialogue simulation data: based on real user consultation records, 5 domain experts simulated 50-round dialogue scenarios, resulting in 2,350 dialogue samples. Evaluation metrics: entity retrieval F1 score, which evaluates the precision and recall of entity recognition; 50-round dialogue completion rate, which evaluates the system's ability to retain key parameters in long dialogue scenarios; long-tail question response time, which evaluates the efficiency of handling complex business questions; and key parameter retention rate, which evaluates how well key business parameters are retained in multi-round dialogues. Business scenario comparison: the bidding and tendering scenario focuses on key parameters such as company name, bidding number, and amount; the financial compliance scenario focuses on company code, transaction amount, and time range; the government consultation scenario focuses on policy number, scope of application, and execution time. By comparing performance across these business scenarios, the universality of this embodiment is verified.
[0098] Comparative testing showed that the technical solution in this embodiment performs substantially better in long dialogue scenarios: in a 50-turn dialogue scenario, the task completion rate reaches 92%, compared with 61% for the fixed-window solution, and the key parameter retention rate is 92.7%. Response efficiency is improved: the response time for long-tail questions is reduced from 8.3 s to 3.1 s, shortening user waiting time. Resource consumption is reduced: through dynamic context compression, the token consumption of the large model is reduced by 58.3%, lowering the cost of computing resources.
[0099] In this embodiment, a dynamic context compression algorithm based on attention entropy is adopted in the multi-turn dialogue layer. Based on the theory of maximizing key information density through attention entropy, it breaks through the context length limitation of fixed windows. In a 50-turn dialogue scenario, the key parameter retention rate is 92.7%, and the task completion rate is improved by 31.4%. An attention entropy threshold determination theory is proposed: H_threshold = log(A) - β, where A is the number of historical dialogue turns and β is the compression coefficient (0.85). An information retention rate metric is designed. Dynamic retention and decay of key information are achieved, with a decay rate λ = 0.05 (daily decay rate). A multi-turn to single-turn prompt word reconstruction technology is proposed: based on the semantic-preserving reconstruction mechanism of the key information pool, key business parameters are retained while semantic coherence is maintained. After reconstruction, the business intent retention rate of the prompt words reaches 98.2%.
[0100] Based on the same concept as the above method, this application proposes a problem-solving apparatus. See Figure 5, which is a structural schematic of the problem-solving apparatus. The apparatus may include: an acquisition module 51, used to acquire the current text question and the historical text questions, wherein the historical text questions are the text questions preceding the current text question and include multiple historical segments; a selection module 52, used to select K target key information from multiple key information in the key information pool, wherein, if the multiple historical segments include a historical segment that is strongly associated with the current text question, the historical segments with a degree of association greater than a preset threshold are selected from the multiple historical segments as key information and added to the key information pool, and the degree of association of a historical segment indicates the degree of association between that historical segment and the current text question; and a determining module 53, used to construct prompt words based on the current text question and the K target key information, and to determine the question answer corresponding to the current text question based on the prompt words.
[0101] In one example, the determining module 53 is further configured to determine a first association score between the historical text question and the current text question; if the first association score is less than a reference threshold, it is determined that there is a historical segment among the plurality of historical segments that is strongly associated with the current text question. The reference threshold is determined based on the dialogue rounds of the historical text question and the configured compression intensity, and is positively correlated with the dialogue rounds and negatively correlated with the compression intensity.
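As a minimal sketch of this decision rule, the following uses the threshold form H_threshold = log(N) - β given in paragraph [0099], which increases with the number of dialogue rounds N and decreases with the compression coefficient β. The natural logarithm and the function names are assumptions; the application does not state the logarithm base.

```python
import math


def reference_threshold(num_turns: int, beta: float = 0.85) -> float:
    """H_threshold = log(N) - beta (natural logarithm assumed)."""
    if num_turns < 1:
        raise ValueError("num_turns must be at least 1")
    return math.log(num_turns) - beta


def has_strongly_associated_segment(first_score: float, num_turns: int,
                                    beta: float = 0.85) -> bool:
    """A strongly associated segment exists when the score is below the threshold."""
    return first_score < reference_threshold(num_turns, beta)


threshold_50 = reference_threshold(50)  # about 3.06 for a 50-turn dialogue
```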
[0102] In one example, when determining the first association score between the historical text question and the current text question, the determining module 53 specifically performs the following steps: inputting the current text question and the historical text question into H self-attention network layers of the Transformer model to obtain H weight matrices output by the H self-attention network layers, where H is greater than 1; wherein, the weight matrix includes the attention score of the self-attention network layer for each historical segment; for each historical segment, determining the association degree of the historical segment based on the attention score of the H self-attention network layers for that historical segment; and determining the first association score based on the association degree of each historical segment.
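The aggregation described above can be sketched as follows. This section does not state how the H attention scores are combined or how the first association score is derived from the association degrees, so the sketch assumes two things: the association degree of a segment is the mean of its attention scores over the H layers, and the first association score is the Shannon entropy of the normalized association degrees, in line with the attention-entropy wording of paragraph [0099]. Both function names are hypothetical.

```python
import math
from typing import List


def association_degrees(layer_scores: List[List[float]]) -> List[float]:
    """Average each segment's attention score over the H layers (assumed rule)."""
    num_layers = len(layer_scores)
    if num_layers < 2:
        raise ValueError("H must be greater than 1")
    num_segments = len(layer_scores[0])
    return [sum(layer[i] for layer in layer_scores) / num_layers
            for i in range(num_segments)]


def first_association_score(degrees: List[float]) -> float:
    """Shannon entropy (natural log) of the normalized degrees (assumed rule)."""
    total = sum(degrees)
    if total <= 0:
        raise ValueError("association degrees must have a positive sum")
    probabilities = [d / total for d in degrees]
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Two layers (H = 2), three historical segments.
layers = [[0.7, 0.2, 0.1], [0.9, 0.05, 0.05]]
degrees = association_degrees(layers)
score = first_association_score(degrees)
```

Under these assumptions, attention concentrated on one segment yields a lower score than attention spread evenly, which matches the rule that a score below the reference threshold signals a strongly associated segment.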
[0103] In one example, the self-attention network layer includes a first sub-network, a second sub-network, and a third sub-network. When the determining module 53 inputs the current text question and the historical text question into the self-attention network layer to obtain the weight matrix output by the self-attention network layer, it specifically performs the following steps: inputting the current text question into the first sub-network to obtain a query vector; inputting the historical text question into the second sub-network to obtain a key vector; and inputting the query vector and the key vector into the third sub-network to obtain the weight matrix.
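A minimal sketch of the three sub-networks, assuming the standard Transformer formulation: the first and second sub-networks are taken to be linear projections producing the query and key vectors, and the third is taken to be a scaled dot product followed by softmax. The inputs are assumed to be already embedded as vectors, one per historical segment; the names `project`, `attention_weights`, `w_q`, and `w_k` are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def project(x: Vector, w: Matrix) -> Vector:
    """Linear projection x @ w, where w has shape (len(x), d_out)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def softmax(values: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(current: Vector, segments: List[Vector],
                      w_q: Matrix, w_k: Matrix) -> Vector:
    query = project(current, w_q)                  # first sub-network
    keys = [project(s, w_k) for s in segments]     # second sub-network
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(logits)                         # third sub-network


identity = [[1.0, 0.0], [0.0, 1.0]]
weights = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], identity, identity)
```

With a single query vector the weight matrix reduces to one row holding one attention score per historical segment; the segment whose key aligns with the query receives the larger weight.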
[0104] In one example, the key information pool includes multiple key information items and an importance score for each key information item; wherein, the importance score of each key information item is dynamically decayed according to a specified time period. When the selection module 52 selects K target key information from multiple key information in the key information pool, it specifically selects K key information with high importance scores from the key information pool as the K target key information based on the importance scores of each key information in the key information pool.
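The selection step can be sketched as below. The exponential decay form and the reading of "high importance scores" as the top K after decay are assumptions; the daily rate of 0.05 is the value of λ given in paragraph [0099]. The `KeyInfo` class and function names are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass
from typing import List


@dataclass
class KeyInfo:
    text: str
    importance: float      # importance score when the item entered the pool
    age_days: float = 0.0  # time since the item entered the pool


def decayed_importance(item: KeyInfo, decay_rate: float = 0.05) -> float:
    """Exponential decay per day (assumed form of the dynamic decay)."""
    return item.importance * math.exp(-decay_rate * item.age_days)


def select_top_k(pool: List[KeyInfo], k: int, decay_rate: float = 0.05) -> List[KeyInfo]:
    """Pick the K items with the highest decayed importance scores."""
    return heapq.nlargest(k, pool, key=lambda item: decayed_importance(item, decay_rate))


pool = [
    KeyInfo("order id 123", importance=0.9, age_days=30),
    KeyInfo("delivery city", importance=0.6, age_days=1),
    KeyInfo("greeting", importance=0.1, age_days=0),
]
selected = [item.text for item in select_top_k(pool, k=2)]
```

In this example the 30-day-old item still ranks second despite its decay, while the recent but low-importance greeting is left out.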
[0105] In one example, when the determining module 53 constructs prompt words based on the current text question and the K target key information, it is specifically used to: compress the historical text question to obtain a context text question; and construct prompt words based on the current text question, the context text question, and the K target key information. When compressing the historical text question to obtain the context text question, the determining module 53 is specifically used to: compress the historical text question to obtain a compressed text question; determine a second association score between the compressed text question and the current text question; and determine an information retention rate based on the first association score and the second association score, wherein the information retention rate represents the proportion of the original information of the historical text question retained after compression. If the information retention rate meets the configured retention condition, the compressed text question is taken as the context text question; if the information retention rate does not meet the retention condition, the process returns to the operation of compressing the historical text question to obtain a compressed text question.
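The compress-and-check loop can be sketched as follows. This section does not give the formula for the information retention rate, so the sketch assumes it is the ratio of the second association score to the first. The compression and scoring routines are passed in as callables, and the two toy stand-ins (`keep_last_words`, `shared_words`) are hypothetical placeholders rather than the attention-based scoring of the application; relaxing the compression on each retry is likewise an assumption.

```python
from typing import Callable


def compress_with_retention_check(
    history: str,
    current: str,
    compress: Callable[[str, float], str],
    score: Callable[[str, str], float],
    min_retention: float = 0.9,
    max_attempts: int = 5,
    step: float = 0.1,
) -> str:
    """Retry compression until the retention condition is met."""
    first_score = score(history, current)
    keep_ratio = 0.5
    for _ in range(max_attempts):
        compressed = compress(history, keep_ratio)
        second_score = score(compressed, current)
        # Assumed definition: retention = second score / first score.
        retention = second_score / first_score if first_score else 1.0
        if retention >= min_retention:
            return compressed
        keep_ratio = min(1.0, keep_ratio + step)  # compress less aggressively
    return history  # fall back to the uncompressed history


def keep_last_words(text: str, keep_ratio: float) -> str:
    """Toy compressor: keep the most recent fraction of the words."""
    words = text.split()
    count = max(1, round(len(words) * keep_ratio))
    return " ".join(words[-count:])


def shared_words(text: str, question: str) -> float:
    """Toy score: number of distinct words shared with the question."""
    return float(len(set(text.split()) & set(question.split())))


history = "hello there my order number is 123 please check delivery"
current = "check delivery for order 123"
context = compress_with_retention_check(history, current, keep_last_words, shared_words)
```

In this toy run the first two attempts drop the word "order" and fail the retention condition, so the loop retries with lighter compression until the shared words are all retained.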
[0106] Based on the same concept as the above method, this application proposes an electronic device. As shown in Figure 6, the electronic device includes a processor 61 and a machine-readable storage medium 62, the machine-readable storage medium 62 storing machine-executable instructions that can be executed by the processor 61; the processor 61 is used to execute the machine-executable instructions to implement the problem-solving method disclosed in the above examples of this application.
[0107] Based on the same application concept as the above method, this application embodiment also provides a machine-readable storage medium storing a plurality of computer instructions, which, when executed by a processor, can implement the problem-solving method disclosed in the above examples of this application.
[0108] The aforementioned machine-readable storage medium can be any electronic, magnetic, optical, or other physical storage device that can contain or store information, such as executable instructions, data, etc. For example, machine-readable storage media can be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, storage drives (such as hard disk drives), solid-state drives, any type of storage disk (such as optical discs, DVDs, etc.), or similar storage media, or combinations thereof.
[0109] Based on the same concept as the methods described above, this application also provides a computer program product, which may include a computer program. When executed by a processor, the computer program implements the problem-solving method disclosed in the examples above.
[0110] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of this application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0111] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.
Claims
1. A problem-solving method, characterized in that the method includes: obtaining the current text question and historical text questions, wherein the historical text questions are text questions preceding the current text question, and the historical text questions include multiple historical segments; selecting K target key information items from multiple key information items in the key information pool, wherein, if there is a historical segment among the multiple historical segments that is strongly related to the current text question, a historical segment with a correlation degree greater than a preset threshold is selected as a key information item and added to the key information pool, and the correlation degree of a historical segment represents the degree of correlation between that historical segment and the current text question; constructing prompt words based on the current text question and the K target key information items; and determining the answer to the current text question based on the prompt words.
2. The method according to claim 1, characterized in that the method further includes: determining a first association score between the historical text question and the current text question; and, if the first association score is less than a reference threshold, determining that there is a historical segment among the plurality of historical segments that is strongly associated with the current text question; wherein the reference threshold is determined based on the dialogue rounds of the historical text question and the configured compression intensity, and the reference threshold is positively correlated with the dialogue rounds and negatively correlated with the compression intensity.
3. The method according to claim 2, characterized in that determining the first association score between the historical text question and the current text question includes: inputting the current text question and the historical text question into H self-attention network layers of the Transformer model to obtain H weight matrices output by the H self-attention network layers, where H is greater than 1, and wherein the weight matrix includes the attention score of the self-attention network layer for each historical segment; for each historical segment, determining the association degree of the historical segment based on the attention scores of the H self-attention network layers for that historical segment; and determining the first association score based on the association degree of each historical segment.
4. The method according to claim 3, characterized in that, The self-attention network layer includes a first sub-network, a second sub-network, and a third sub-network. The current text question and the historical text question are input into the self-attention network layer to obtain the weight matrix output by the self-attention network layer, which includes: The current text question is input into the first sub-network to obtain the query vector; The historical text question is input into the second sub-network to obtain the key vector; The query vector and the key vector are input into the third sub-network to obtain the weight matrix.
5. The method according to claim 1, characterized in that, The key information pool includes multiple key information items and importance scores for each key information item; the importance scores of each key information item are dynamically decayed according to a specified time period. The step of selecting K target key information items from multiple key information items in the key information pool includes: Based on the importance scores of each key piece of information in the key information pool, K key pieces of information with high importance scores are selected from the key information pool as the K target key pieces of information.
6. The method according to any one of claims 2-4, characterized in that, The construction of prompt words based on the current text question and the K target key information includes: The historical text question is compressed to obtain the context text question; prompt words are constructed based on the current text question, the context text question, and the K target key information.
7. The method according to claim 6, characterized in that compressing the historical text question to obtain the context text question includes: compressing the historical text question to obtain a compressed text question; determining a second association score between the compressed text question and the current text question; determining an information retention rate based on the first association score and the second association score, wherein the information retention rate represents the proportion of the original information of the historical text question retained after compression; and, if the information retention rate meets the configured retention condition, taking the compressed text question as the context text question, or, if the information retention rate does not meet the retention condition, returning to the operation of compressing the historical text question to obtain a compressed text question.
8. A problem-solving apparatus, characterized in that the apparatus includes: an acquisition module, used to acquire the current text question and the historical text questions, wherein the historical text questions are the text questions preceding the current text question, and the historical text questions include multiple historical segments; a selection module, used to select K target key information from multiple key information in the key information pool, wherein, if there is a historical segment among the multiple historical segments that is strongly related to the current text question, the historical segment with a correlation degree greater than a preset threshold is selected as key information and added to the key information pool, and the correlation degree of a historical segment indicates the degree of correlation between that historical segment and the current text question; and a determining module, used to construct prompt words based on the current text question and the K target key information, and to determine the question answer corresponding to the current text question based on the prompt words.
9. The apparatus according to claim 8, characterized in that, The determining module is further configured to determine a first association score between the historical text question and the current text question; if the first association score is less than a reference threshold, then it is determined that there is a historical segment among the plurality of historical segments that is strongly associated with the current text question; wherein, the reference threshold is determined based on the dialogue rounds of the historical text question and the configured compression intensity, wherein, the reference threshold is positively correlated with the dialogue rounds and negatively correlated with the compression intensity.
10. The apparatus according to claim 9, characterized in that, The determining module, when determining the first association score between the historical text question and the current text question, specifically performs the following steps: inputting the current text question and the historical text question into H self-attention network layers of the Transformer model to obtain H weight matrices output by the H self-attention network layers, where H is greater than 1; wherein, the weight matrix includes the attention score of the self-attention network layer for each historical segment; for each historical segment, determining the association degree of the historical segment based on the attention score of the H self-attention network layers for that historical segment; and determining the first association score based on the association degree of each historical segment.
11. The apparatus according to claim 9 or 10, characterized in that the determining module, when constructing prompt words based on the current text question and the K target key information, is specifically used to: compress the historical text question to obtain a context text question; and construct prompt words based on the current text question, the context text question, and the K target key information. When compressing the historical text question to obtain the context text question, the determining module is specifically used to: compress the historical text question to obtain a compressed text question; determine a second association score between the compressed text question and the current text question; and determine an information retention rate based on the first association score and the second association score, wherein the information retention rate represents the proportion of the original information of the historical text question retained after compression. If the information retention rate meets the configured retention condition, the compressed text question is taken as the context text question; or, if the information retention rate does not meet the retention condition, the process returns to the operation of compressing the historical text question to obtain a compressed text question.
12. An electronic device, characterized in that it includes: a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions that can be executed by the processor; wherein the processor is configured to execute the machine-executable instructions to implement the method of any one of claims 1-7.