A question answering method based on context iteration, electronic device and storage medium
By constructing a question queue and dynamic evaluation, and iteratively processing questions based on a shared task context, the problem of incomplete and unreliable answers in existing question-answering systems for complex open-domain questions is solved, achieving efficient and accurate answer generation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING YUCHEN SHIMEI SCI & TECH
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-19
AI Technical Summary
Existing question-answering systems based on large language models lack deep reasoning capabilities when dealing with complex open-domain problems. The single-round processing method leads to incomplete answers or factual errors, and the quality of the initial retrieval results is difficult to recover.
A question queue is constructed to prioritize the initial user questions. Sub-questions are processed through dynamic evaluation and iterative processing. Multi-round iterative reasoning is performed using a shared task context to generate hierarchical prompts. The quality of the answers is evaluated, and targeted corrections are made based on the evaluation results.
It significantly improves the accuracy, reliability, and comprehensiveness of answers to complex open-domain problems. Through dynamic decomposition and iterative processing, it optimizes the reasoning path and ensures the completeness and reliability of the answer.
Figure CN122242736A_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent question answering technology, and in particular to a question answering method, electronic device, and storage medium based on context iteration.
Background Technology
[0002] Currently, question-answering systems based on large language models, especially those employing retrieval-enhanced generative architectures, have become the mainstream approach for handling knowledge-intensive tasks. These systems typically follow a one-time "retrieval-generation" process: first, relevant document fragments are retrieved based on the user's query, and then the large language model directly generates the final answer based on the retrieved context.
[0003] However, this traditional paradigm has significant limitations. First, its single-round, linear processing method lacks deep reasoning and proactive exploration capabilities, making it difficult to handle complex open-domain problems involving multi-step logical deduction, implicit premises, or the need to integrate information from multiple sources. Second, the system relies entirely on the quality of the initial retrieval results. Once relevant key information is not covered by the initial retrieval, the subsequent generation process cannot recover, which can easily lead to incomplete answers or factual errors. The accuracy and reliability of the output answers still need to be improved.
Summary of the Invention
[0004] To address the aforementioned technical problems, this invention provides a context-based iterative question-answering method, electronic device, and storage medium. This method enables targeted correction of question defects and iterative processing based on updated context, achieving dynamic decomposition and in-depth understanding of complex, open-domain questions, and significantly improving the accuracy, reliability, and comprehensiveness of the final answer.
[0005] According to a first aspect of the present invention, a question-answering method based on context iteration is provided, comprising the following steps: S1, construct a problem queue and determine the current problem to be processed from the problem queue; the problem queue contains at least the initial user problem, and the initial user problem is placed at the tail of the problem queue.
[0006] S2, generate prompt information based on the current shared task context and the current problem to be processed, input it into the preset large language model to obtain the answer to be output, and evaluate the quality of the answer to be output.
[0007] S3, if the quality assessment passes, remove the current problem to be processed from the problem queue and determine whether it is the initial user problem. If it is, generate and output the final answer based on the shared task context. If it is not, determine a new current problem to be processed from the problem queue and return to step S2.
[0008] S4, if the quality assessment fails and the preset resource constraint threshold has not been reached, determine the operation to be executed from the preset operation set according to the quality assessment result, update the execution result information of the operation to be executed to the shared task context, and then return to step S2 for iteration until the final answer is output; wherein, when a sub-problem is generated after executing the operation to be executed, the sub-problem is placed at the head of the problem queue to update the current problem to be processed.
[0009] According to a second aspect of the present invention, a non-transitory computer-readable storage medium is provided, wherein at least one instruction or at least one program is stored therein, the at least one instruction or the at least one program being loaded and executed by a processor to implement the above-described context-based iterative question-answering method.
[0010] According to a third aspect of the present invention, an electronic device is provided, including a processor and the aforementioned non-transitory computer-readable storage medium.
[0011] The present invention has at least the following beneficial effects: This invention provides a context-based iterative question-answering method. First, a question queue is constructed, and the current question to be processed is determined from the queue. The initial user question is placed at the tail of the queue, while newly discovered, potentially more fundamental sub-questions are placed at the head of the queue for priority processing. This allows the knowledge gained from previously solved questions to accumulate in the shared task context, where it can be directly utilized by all subsequent questions; this dynamic decomposition enhances the understanding of complex open-domain problems. Then, based on the current shared task context and the current question to be processed, prompt information is generated and input into a preset large language model, providing the large language model with a hierarchical and context-rich instruction set. After the answer to be output is obtained, its quality is assessed. If the quality assessment passes, the method either outputs the final answer directly or continues with the next question in the queue, depending on whether the current question is the initial user question. If the quality assessment fails, the operation to be executed is determined and its execution result information is updated to the shared task context, so that prompt information is generated iteratively until the final answer is output. By accurately mapping the conclusion of a failed assessment to specific operations, the method achieves targeted correction of the identified defects, thereby systematically guiding the reasoning path and significantly improving the efficiency of iteration and the accuracy, reliability, and comprehensiveness of the generated answer.
Attached Figure Description
[0012] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0013] Figure 1 is a flowchart of a context-based iterative question-answering method provided in an embodiment of the present invention.
Detailed Implementation
[0014] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0015] This invention provides a question-answering method based on context iteration. As shown in Figure 1, the method includes the following steps: S1. Construct a problem queue and determine the current problem to be processed from the problem queue; the problem queue contains at least the initial user problem, and the initial user problem is placed at the tail of the problem queue. It can be understood that the problem queue initially contains only the initial user problem. When a sub-problem is generated in the reflection operation, the sub-problem is added to the head of the problem queue, while the initial user problem remains at the tail, so that the answers to the sub-problems are generated first and the answer to the initial user problem is output last.
[0016] Specifically, the current problem to be addressed is determined through the following steps: S101, If the problem queue contains only the initial user problem, then the initial user problem is identified as the current problem to be processed.
[0017] S102, add the sub-problems generated after the reflection operation to the problem queue; if only one sub-problem is generated, then the generated sub-problem is determined as the current problem to be processed; if at least two sub-problems are generated, then analyze the importance of each sub-problem to rank them, and determine the sub-problem with the highest importance as the current problem to be processed. For example, the importance of each sub-problem can be analyzed by using a preset large language model.
[0018] As described above, by constructing a problem queue, the current problem to be processed is always taken from the head of the queue in each round of inference iteration, while the initial user problem is processed last. This design ensures that newly discovered, potentially more fundamental sub-problems are given priority, while guaranteeing that the system eventually returns to the original problem. Throughout this process, a continuously accumulating shared task context is maintained: the knowledge gained from previously solved problems accumulates in this context and can be directly utilized by all subsequent problems, which strengthens the knowledge base and improves the accuracy and robustness of the solution to the original complex problem. Moreover, compared with the traditional recursive, depth-first strategy, the queue-based scheduling provided in this application avoids the uncontrollable resource consumption and potential loss of computational paths caused by infinite recursion or excessive nesting, achieving a good balance between exploration depth and breadth.
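The queue discipline described in S1 can be illustrated with a minimal Python sketch. It assumes a double-ended queue and that the sub-problems arrive already ordered from most to least important; the function names are hypothetical and not taken from the source.

```python
from collections import deque

def make_queue(initial_question):
    # The queue initially contains only the initial user question (the tail).
    return deque([initial_question])

def push_sub_questions(queue, sub_questions):
    # ASSUMPTION: sub_questions is ordered from most to least important.
    # Inserting in reverse keeps the most important one at the head.
    for question in reversed(sub_questions):
        queue.appendleft(question)

def current_question(queue):
    # The current question to be processed is always the head of the queue.
    return queue[0]

queue = make_queue("Why did X happen?")
push_sub_questions(queue, ["What is X?", "When did X happen?"])
print(list(queue))
```

Because sub-problems are only ever added at the head, the initial user problem stays at the tail until every sub-problem ahead of it has been removed.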
[0019] In a preferred embodiment, if there are at least two sub-problems, the priority score of each sub-problem is calculated based on several preset evaluation indicators, and the sub-problems are inserted into the head of the problem queue in descending order of priority score. The sub-problems at the head of the problem queue are taken as the current problems to be processed, while the initial user problem is kept at the tail of the problem queue.
[0020] Preferably, the preset evaluation indicators include at least the basic question score, the degree of missing context, and the degree of question dependency.
[0021] Specifically, the basic question score is used to quantify the strength of the association between the corresponding sub-question and the initial user question in terms of conceptual and semantic centrality; it can be understood as follows: the basic question score is obtained by calculating the semantic similarity between the sub-question and the initial user question.
[0022] Specifically, the context missingness is used to quantify the degree of information missing in the current shared task context for answering the corresponding sub-question, and the calculation formula is as follows: $C = 1 - S_{\max}$, where $C$ represents the degree of context missingness and $S_{\max}$ is the highest semantic similarity between the sub-problem and any knowledge fragment in the current shared task context.
[0023] Specifically, the problem dependency correlation is used to quantify the contribution of solving the corresponding sub-problem to advancing other unsolved problems in the problem queue. The calculation formula (not reproduced in the source text) uses the following quantities: $D_i$ represents the contribution of the i-th sub-problem to advancing other unsolved problems in the problem queue, $S(q_i, q_j)$ represents the semantic similarity between the i-th sub-problem $q_i$ and another unsolved problem $q_j$ in the problem queue, and $N$ is the total number of questions in the current question queue.
[0024] As described above, by inserting prioritized sub-problems at the head of the queue and always keeping the initial user problem at the tail, depth-first exploration is encouraged while ensuring that the final goal is not forgotten. This effectively balances the depth and breadth of exploration. Through dynamic priority sorting, the system can identify and prioritize the most basic, core, or information-gap-filling sub-problems, thereby optimizing the exploration path, ensuring the efficient and reasonable construction of the shared task context, and improving the reliability of the final answer.
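The three preset evaluation indicators can be sketched as follows. This is an illustration only: the embedding vectors are toy lists standing in for the output of an embedding model, cosine similarity is assumed as the semantic similarity measure, and the dependency score is assumed to be the mean similarity to the other unsolved problems, since the exact formula is not reproduced in the text. All function names are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors (assumed similarity measure).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def basic_score(sub_vec, initial_vec):
    # Basic question score: similarity between sub-question and initial user question.
    return cosine(sub_vec, initial_vec)

def context_missingness(sub_vec, fragment_vecs):
    # C = 1 - S_max; with an empty context S_max is taken as 0 (assumption).
    s_max = max((cosine(sub_vec, f) for f in fragment_vecs), default=0.0)
    return 1.0 - s_max

def dependency(sub_vec, other_vecs):
    # ASSUMPTION: mean similarity to the other unsolved questions in the queue.
    if not other_vecs:
        return 0.0
    return sum(cosine(sub_vec, o) for o in other_vecs) / len(other_vecs)
```

A sub-question that closely matches an existing knowledge fragment receives a missingness near 0, so it contributes little to the priority score under this indicator.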
[0025] Furthermore, the calculation of the priority score for each sub-problem based on several preset evaluation indicators includes the following steps: S401, obtain the resolution status records of all processed sub-problems within the preset sliding time window and the corresponding preset evaluation index values; wherein, the resolution status records include successfully resolved and unsuccessfully resolved.
[0026] S402, for each preset evaluation index, calculate the preset evaluation index value corresponding to the successfully solved sub-problem and the preset evaluation index value corresponding to the unsuccessfully solved sub-problem, and calculate the difference between the two as the feature discrimination degree of the preset evaluation index itself within the current sliding time window.
[0027] S403, calculate the initial adjustment weight of each preset evaluation index based on the feature discrimination of each preset evaluation index, and smoothly merge the initial adjustment weight with the preset evaluation index weight of the previous iteration to obtain the updated weight of each preset evaluation index; wherein, the initial state of the weight of each preset evaluation index is an average weight distribution.
[0028] Specifically, the initial adjustment weights of each preset evaluation index are calculated using the softmax function, and the calculation formula is as follows: $W_d = \frac{\exp(H_d/\sigma)}{\sum_{g=1}^{k}\exp(H_g/\sigma)}$, where $W_d$ is the initial adjustment weight for the d-th preset evaluation indicator, $H_d$ is the feature discrimination of the d-th preset evaluation index within the current sliding time window, $\sigma$ is the preset temperature parameter used to control the adjustment range, $H_g$ is the feature discrimination of the g-th preset evaluation index within the current sliding time window, and $k$ is the total number of preset evaluation indicators.
[0029] Preferably, the initial adjustment weights are smoothly integrated with the preset evaluation index weights of the previous iteration using an exponential moving average algorithm. Those skilled in the art are familiar with the specific calculation method of the exponential moving average algorithm, and it will not be described in detail here.
[0030] S404: Based on the updated weights of each preset evaluation index, the preset evaluation index values of each subproblem are weighted and summed to obtain the priority score of each subproblem.
[0031] As described above, by analyzing historical success and failure cases, the weights of each preset evaluation indicator are automatically adjusted, enabling the priority ranking criteria to adapt to specific task and context characteristics and dynamically optimize the problem exploration strategy. In this process, a sliding window-based weight smoothing update mechanism is also introduced, which not only ensures rapid response based on the latest feedback, but also avoids strategy instability caused by fluctuations in single evaluation results, ensuring the continuous reliability and robustness of the scheduling strategy. This is conducive to generating logically reliable shared task contexts with a deep understanding of complex open-domain problems, thereby ensuring the reliability and accuracy of the final answer.
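Steps S401 to S404 can be sketched in Python as below. The sketch assumes that the per-indicator value for solved and unsolved sub-problems is the mean over the sliding window, that the softmax uses the conventional temperature form, and that the exponential moving average uses a smoothing factor `alpha`; these choices and all names are illustrative assumptions rather than details confirmed by the source.

```python
import math

def feature_discrimination(solved_values, unsolved_values):
    # ASSUMPTION: compare the window means of solved vs. unsolved sub-problems.
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(solved_values) - mean(unsolved_values)

def softmax_weights(discriminations, sigma=1.0):
    # Temperature softmax; subtracting the max keeps exp() numerically stable.
    top = max(discriminations)
    exps = [math.exp((h - top) / sigma) for h in discriminations]
    total = sum(exps)
    return [e / total for e in exps]

def ema_update(previous, initial_adjustment, alpha=0.3):
    # Smoothly merge the new weights with the weights of the previous iteration.
    return [(1 - alpha) * p + alpha * n for p, n in zip(previous, initial_adjustment)]

def priority_score(weights, indicator_values):
    # Weighted sum of the indicator values of one sub-problem.
    return sum(w * v for w, v in zip(weights, indicator_values))

weights = [1 / 3, 1 / 3, 1 / 3]  # initial state: average weight distribution
h = [feature_discrimination([0.9, 0.8], [0.4]),
     feature_discrimination([0.5], [0.5]),
     feature_discrimination([0.2], [0.6])]
weights = ema_update(weights, softmax_weights(h, sigma=0.5))
```

Since both inputs to the moving average sum to 1, the updated weights also sum to 1, so the priority scores stay on a comparable scale across iterations.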
[0032] S2, generate prompt information based on the current shared task context and the current problem to be processed, input it into the preset large language model to obtain the answer to be output, and evaluate the quality of the answer to be output.
[0033] Specifically, the shared task context includes at least the history, all questions and keywords that have been asked, accumulated knowledge, and unaccessed Uniform Resource Locators.
[0034] In one specific embodiment, the prompt message is generated through the following steps: S201, Set the system instruction target; the system instruction target includes at least the role of the agent in the preset large language model, the preset inference principle followed, and the preset data response format. For example, the role of the agent is a senior AI research assistant, the preset inference principle is multi-step inference, and the preset data response format is structured data conforming to JSON Schema.
[0035] S202, dynamically combine several information categories to construct a contextual situation; the several information categories include verified knowledge fragments, behavioral history records, and evaluation feedback for the answer to be output.
[0036] Specifically, the knowledge fragments are those that have been verified and accumulated during previous iterations.
[0037] Specifically, the evaluation feedback for the output answer refers to the output answers that failed the evaluation and the improvement strategies summarized from them.
[0038] S203, based on the current shared task context, obtain the current operation constraints; the operation constraints are used to disable specified operations in a preset operation set. For example, when there is no available Uniform Resource Locator in memory, access operations are prohibited, thereby guiding the direction of subsequent operation decisions, ensuring the rationality and efficiency of their action paths, and avoiding invalid loops.
[0039] S204, combine the system instruction target, the contextual situation, and the current operation constraints to generate the prompt information.
[0040] As described above, by constructing several structured information blocks, the function of each part of the prompt information is clearly defined, thereby providing a hierarchical and context-rich instruction set for the large language model. Current advanced inference models can generate some instructions automatically, but they do not achieve fine-grained control over agent behavior or make efficient use of a limited context window; the dynamic combination approach described above addresses these shortcomings. It ensures a high degree of controllability and stability in the interaction process, thereby significantly improving the reliability and efficiency of complex task execution.
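The assembly of the prompt information in S201 to S204 might look like the following sketch. The section titles, the dictionary keys of the shared task context, and the operation name `visit` are assumptions introduced for illustration.

```python
def operation_constraints(context):
    # S203: disable the access ("visit") operation when no unvisited URL is available.
    disabled = []
    if not context.get("unvisited_urls"):
        disabled.append("visit")
    return disabled

def build_prompt(system_goal, context, question):
    # S204: combine system goal, contextual situation, and constraints.
    disabled = operation_constraints(context)
    sections = [
        ("SYSTEM", system_goal),
        ("VERIFIED KNOWLEDGE", "\n".join(context.get("knowledge", [])) or "(none)"),
        ("ACTION HISTORY", "\n".join(context.get("history", [])) or "(none)"),
        ("EVALUATION FEEDBACK", "\n".join(context.get("feedback", [])) or "(none)"),
        ("DISABLED OPERATIONS", ", ".join(disabled) or "(none)"),
        ("CURRENT QUESTION", question),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)

ctx = {"knowledge": ["fact A"], "unvisited_urls": []}
prompt = build_prompt("You are a senior AI research assistant.", ctx, "What is X?")
```

Keeping each information category in its own labeled section makes it straightforward to drop or truncate a category when the context window is tight.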
[0041] Furthermore, the quality assessment of the output answer includes the following steps: S210, dynamically determine the corresponding set of evaluation criteria based on the type of the current problem to be addressed; this can be understood as: different sets of evaluation criteria are set for different problem types. For example, if the current problem to be addressed is a factual problem, the corresponding set of evaluation criteria includes accuracy, completeness, verifiability, consistency, relevance, and timeliness; if the current problem to be addressed is an analytical problem, the corresponding set of evaluation criteria includes multi-angle coverage, sufficiency of argumentation, balance of positions, logical coherence, and innovativeness.
[0042] S220, for each evaluation criterion in the evaluation criterion set, compare the answer to be output with the preset reference example and generate a judgment sub-result and judgment basis for that evaluation criterion. It can be understood as: if any evaluation criterion is not met, the answer to be output is identified as having a knowledge gap, so that the knowledge gap can be analyzed and corresponding sub-questions can be generated in the subsequent reflection operation.
[0043] S230, integrate the judgment sub-results of all evaluation criteria to generate a comprehensive evaluation result; the comprehensive evaluation result includes at least a Boolean judgment value representing whether the quality evaluation result passes or fails, and a judgment reasoning chain composed of the judgment basis of each evaluation criterion.
[0044] As described above, a dedicated evaluation module objectively and systematically assesses the quality of the generated output answer. During the assessment process, a consistency comparison is performed with reference to preset examples. This approach has higher stability and reliability compared to the model's self-evaluation strategy. The evaluation results also include a detailed reasoning chain, which records the judgment logic and basis for each evaluation criterion. Through this standardized and multi-dimensional evaluation mechanism, the accuracy, completeness, and relevance of the final output answer are improved.
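The evaluation flow of S210 to S230 can be sketched as follows. The `judge` callable is a hypothetical stand-in for the comparison against preset reference examples (for instance, a call to a large language model); the result structure is likewise an assumption.

```python
from dataclasses import dataclass

# S210: criteria sets per question type, as listed in the description.
CRITERIA_BY_TYPE = {
    "factual": ["accuracy", "completeness", "verifiability",
                "consistency", "relevance", "timeliness"],
    "analytical": ["multi-angle coverage", "sufficiency of argumentation",
                   "balance of positions", "logical coherence", "innovativeness"],
}

@dataclass
class Judgment:
    criterion: str
    passed: bool
    basis: str

def assess(answer, question_type, judge):
    # S220: one judgment per criterion; S230: integrate into a comprehensive result.
    judgments = [judge(answer, c) for c in CRITERIA_BY_TYPE[question_type]]
    return {
        "passed": all(j.passed for j in judgments),
        "reasoning_chain": [
            f"{j.criterion}: {'pass' if j.passed else 'fail'} ({j.basis})"
            for j in judgments
        ],
    }
```

In this sketch a single failed criterion makes the overall Boolean value false, which matches the statement that any unmet criterion indicates a knowledge gap.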
[0045] S3, if the quality assessment result is passed, remove the current pending problem from the problem queue and determine whether the current pending problem is the initial user problem; if so, generate and output the final answer based on the shared task context; if not, determine a new current pending problem from the problem queue and return to step S2; this can be understood as: each time a new current pending problem is determined in the problem queue, the problem at the head of the problem queue is taken as the current pending problem.
[0046] As described above, by determining whether a sub-problem that has passed evaluation is the initial user problem, the system can automatically schedule the next problem in the problem queue when a sub-problem is determined to have passed evaluation. This allows all subsequent problems to directly utilize the accumulated knowledge gained from previously solved problems, ensuring that the system only outputs the final answer when the answer to the initial user problem is verified as passed. This achieves structured and sequential processing of complex problems, improving the accuracy and completeness of the final answer.
[0047] S4, if the quality assessment fails and the preset resource constraint threshold has not been reached, determine the operation to be executed from the preset operation set based on the quality assessment result, update the shared task context with the execution result information of the operation to be executed, and then return to step S2 for iteration until the final answer is output. When a sub-problem is generated after executing the operation to be executed, the sub-problem is placed at the head of the problem queue to update the current problem to be processed. Otherwise, if the preset resource constraint threshold has been reached, the final answer is forcibly generated and output based on the current shared task context.
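The overall S1 to S4 loop can be summarized in a short Python sketch. The callables `generate`, `evaluate`, and `act` are hypothetical stand-ins for the large language model call, the quality assessment, and the execution of the selected operation; the resource threshold is simplified to a count of consecutive failed evaluations.

```python
from collections import deque

def run(initial_question, generate, evaluate, act, max_failures=3):
    queue = deque([initial_question])             # S1: initial question at the tail
    context = {"knowledge": [], "history": []}    # shared task context (simplified)
    failures = 0
    while queue:
        question = queue[0]                       # head of the queue
        answer = generate(context, question)      # S2: prompt + model call
        passed, feedback = evaluate(question, answer)
        if passed:                                # S3
            failures = 0
            queue.popleft()
            context["knowledge"].append((question, answer))
            if not queue:                         # only the initial question can be last
                return answer
            continue
        failures += 1                             # S4
        if failures >= max_failures:
            # Threshold reached: force a final answer from the current context.
            return generate(context, initial_question)
        sub_questions = act(feedback, context, question)
        context["history"].append(feedback)
        for sub in reversed(sub_questions):
            queue.appendleft(sub)                 # sub-problems go to the head
    return None
```

Because the initial user problem always sits at the tail, an empty queue after a removal implies that the problem just removed was the initial one.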
[0048] Specifically, the preset operation set includes reflection operations, search operations, access operations, and encoding operations; wherein, the sub-problems generated when performing the reflection operation are placed at the head of the problem queue.
[0049] In a specific embodiment, the rule for determining the operation to be executed from the preset operation set based on the quality assessment result is as follows: S401, Analyze the decision reasoning chain in the comprehensive evaluation result.
[0050] S402, if the determination reasoning chain indicates that the coverage of the answer to be output is insufficient, then the operation to be performed is determined to be a reflection operation, so as to analyze the current shared task context and generate at least one sub-problem to be added to the head of the problem queue.
[0051] S403, if the determination reasoning chain indicates that the reliability of the answer to be output is insufficient, then the operation to be performed is determined to be a search operation and an access operation.
[0052] S404, if the judgment reasoning chain indicates that the answer to be output requires an encoding operation, then the operation to be performed is determined to be an encoding operation; the encoding operation is any one of programmed calculation, calling an external application programming interface, or processing structured data. For example, in scenarios such as mathematical operations and statistical modeling, programmed calculation is used; in scenarios such as acquiring real-time data or external data, calling an external application programming interface is used; in scenarios such as data cleaning, parsing, and information extraction, processing structured data is used.
[0053] The above approach precisely maps the conclusion of failing the evaluation to specific operations, enabling targeted correction of the problem's defects. This systematically guides the reasoning path, significantly improving the efficiency of iteration and the effectiveness of the generated answers. Furthermore, updating the execution results of the operations to the shared task context provides an information basis for the next iteration, ensuring the reliability of the generated answers.
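The mapping from a failed evaluation to operations can be expressed as a small lookup, sketched below. The issue labels (`coverage`, `reliability`, `computation`), the operation names, and the structure of the reasoning chain entries are assumptions made for illustration.

```python
# Mapping from the kind of defect found in the reasoning chain to operations.
OPERATIONS_BY_ISSUE = {
    "coverage": ["reflect"],             # insufficient coverage -> reflection
    "reliability": ["search", "visit"],  # insufficient reliability -> search + access
    "computation": ["code"],             # needs calculation / API / data processing
}

def choose_operations(reasoning_chain, disabled=()):
    # Collect the operations for every failed entry, skipping disabled ones.
    chosen = []
    for item in reasoning_chain:
        if item["passed"]:
            continue
        for op in OPERATIONS_BY_ISSUE.get(item["issue"], []):
            if op not in disabled and op not in chosen:
                chosen.append(op)
    return chosen

chain = [
    {"issue": "coverage", "passed": False},
    {"issue": "reliability", "passed": False},
    {"issue": "computation", "passed": True},
]
print(choose_operations(chain))
```

The `disabled` argument lets the operation constraints derived from the shared task context suppress an operation, for example the access operation when no unvisited link is available.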
[0054] Specifically, the preset resource constraint thresholds include a preset maximum number of consecutive failed evaluations and a maximum token usage. For example, the system initializes a step counter to count the number of consecutive failed evaluations.
[0055] Preferably, the maximum token usage meets a condition (formula not reproduced in the source text) defined over the following quantities: $T_{\max}$ is the maximum token usage, $T_{base}$ is the preset token budget, $S_h$ represents the number of times historical evaluations have passed in the current session, $S_t$ represents the total number of iterations that have occurred in the current session, $T_u$ represents the amount of tokens used in the current session, and $\eta$ and $\zeta$ are both adjustment coefficients.
[0056] The above formula can be understood as: Maximum token budget = Basic token budget × (1 + Success reward factor) × (1 - Consumption penalty factor). As the success reward factor increases, the system increases the maximum token budget to encourage further exploration for better answers. The consumption penalty factor is based on the resource consumption progress: as the task progresses, the consumption progress increases, the consumption penalty factor increases, and the term (1 - Consumption penalty factor) decreases, so the system dynamically reduces the maximum token budget to prevent excessive consumption in the later stages. That is, based on the real-time success rate and resource consumption progress of the task, the remaining available resources of the task are adjusted dynamically and automatically. Setting the maximum token usage provides a clear resource budget boundary for the entire question-answering task, preventing the loss of control over computational costs caused by infinite iteration or complex reasoning. Furthermore, the final answer output mechanism can be triggered dynamically, forcing the generation and output of the answer based on the accumulated context before resources are exhausted, ensuring the economic efficiency, feasibility, and reliability of this question-answering method.
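A minimal sketch of the dynamic budget follows. The product structure is taken from the description above, but the concrete factors are assumptions: the success reward factor is taken as the coefficient `eta` times the pass ratio, and the consumption penalty factor as the coefficient `zeta` times the fraction of the base budget already used. The function name and default coefficients are hypothetical.

```python
def max_token_budget(t_base, passed, iterations, used, eta=0.5, zeta=0.5):
    # ASSUMPTION: success reward factor = eta * (passed evaluations / iterations).
    reward = eta * (passed / iterations) if iterations else 0.0
    # ASSUMPTION: consumption penalty factor = zeta * (tokens used / base budget),
    # capped so the penalty never exceeds zeta.
    penalty = zeta * min(used / t_base, 1.0)
    return t_base * (1 + reward) * (1 - penalty)

# Early in a session with a good pass rate the budget grows;
# late in a session with heavy usage it shrinks.
early = max_token_budget(1000, passed=2, iterations=2, used=100)
late = max_token_budget(1000, passed=2, iterations=8, used=900)
```

Under these assumptions the budget can be compared against the tokens already used after each iteration to decide whether the forced final answer should be triggered.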
[0057] In one embodiment, step S4 further includes the following steps: S410, when performing a search operation, calculate the semantic similarity between the current search request and the historical search records. If the semantic similarity exceeds a preset similarity threshold, the current search request is filtered out as a duplicate and the deduplication result is recorded in the shared task context; the search operation is then disabled, or the result of the search operation is set to null by default, so as to influence the next decision.
[0058] Preferably, the embedding model jina-embeddings-v3 is used to calculate semantic similarity, so as to make the similarity control more accurate and reliable, thereby ensuring the deduplication effect.
[0059] S420, when the semantic similarity does not exceed a preset similarity threshold, the current search request is input to a preset search rewriting module for optimization; the optimization includes at least keyword optimization, synonym expansion, and expression transformation adapted to a specified retrieval algorithm. For example, the specified retrieval algorithm may be the BM25 algorithm.
[0060] S430, take the optimized target search request as the new current search request and return to step S410. If the semantic similarity still does not exceed the preset similarity threshold, perform a network search based on the new current search request, and store the search results and the current search record in the shared task context.
[0061] As described above, by performing deduplication, rewriting, and further deduplication on search requests, the effectiveness of search requests and the quality of results can be improved. Deduplication can eliminate completely duplicate or highly semantically similar requests, avoiding redundant calculations. Rewriting optimization can cover relevant information under different languages, styles, and content formats, thereby improving the recall and accuracy of search results.
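The deduplicate, rewrite, and re-check flow of S410 to S430 can be sketched as follows. Token-set Jaccard similarity is used here only as a stand-in for embedding-based semantic similarity, and `rewrite` and `search` are hypothetical callables representing the search rewriting module and the network search.

```python
def jaccard(a, b):
    # Stand-in similarity: overlap of lowercase token sets (not an embedding model).
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def handle_search(request, history, rewrite, search, similarity=jaccard, threshold=0.8):
    def is_duplicate(query):
        return any(similarity(query, past) > threshold for past in history)

    if is_duplicate(request):                 # S410: filter duplicates
        return {"status": "filtered", "query": request, "results": None}
    rewritten = rewrite(request)              # S420: optimize the request
    if is_duplicate(rewritten):               # S430: re-check after rewriting
        return {"status": "filtered", "query": rewritten, "results": None}
    results = search(rewritten)               # S430: perform the network search
    history.append(rewritten)                 # keep the search record
    return {"status": "searched", "query": rewritten, "results": results}

history = ["context iteration question answering"]
first = handle_search("question answering context iteration", history,
                      rewrite=lambda q: q, search=lambda q: ["unused"])
second = handle_search("token budget", history,
                       rewrite=lambda q: q + " limit",
                       search=lambda q: ["doc about " + q])
```

The returned dictionary is what would be written to the shared task context, so a filtered request is still visible to the next decision step.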
[0062] In another embodiment, step S4 further includes the following steps: S41, when performing an access operation, obtain the list of Uniform Resource Locators returned by the corresponding search operation from the shared task context.
[0063] S42, perform format normalization processing on each Uniform Resource Locator (URL) in the list of URLs, and filter out URLs that have been accessed based on the historical access records stored in the shared task context to obtain a set of links to be accessed; this can be understood as: the set of links to be accessed refers to URLs that have not yet been accessed.
[0064] S43, use a parallel asynchronous mechanism to initiate content retrieval requests for several links in the set of links to be accessed simultaneously, and call a preset content extraction tool to extract the target text content. For example, the content extraction tool can use the JinaReader API. When the parallel asynchronous mechanism is used, a limit is imposed on the number of Uniform Resource Locators (URLs) that can be accessed within a single step, in order to control the memory usage of a single iteration and ensure the long-term stability of the system.
[0065] S44: The extracted target text content is stored as an independent knowledge entry, associated with the corresponding link, in the shared task context. At the same time, the content acquisition results of all links in the set of links to be accessed are integrated into the shared task context. This can be understood as: the content acquisition results of all links include the content acquired successfully or the information of failure, providing the agent with comprehensive information feedback.
[0066] As described above, by filtering out previously accessed Uniform Resource Locators (URLs), we can avoid redundant information collection and wasted computing resources. Furthermore, the use of parallel and asynchronous mechanisms can optimize access efficiency, thereby efficiently extracting structured text content and ensuring the structured preservation of the original network information, providing high-quality information input for subsequent reasoning steps.
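Steps S41 to S44 can be illustrated with the following sketch, written under several assumptions: the per-step limit of 5 URLs is an example value, the normalization rules (lowercasing the host, dropping the fragment and a trailing slash) are one plausible choice, and `fetch_text` is a hypothetical coroutine standing in for the content extraction tool. The dictionary keys of the shared task context are likewise illustrative.

```python
import asyncio
from urllib.parse import urlsplit, urlunsplit

MAX_URLS_PER_STEP = 5  # assumed cap on URLs accessed within a single step


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def links_to_visit(urls, visited, limit=MAX_URLS_PER_STEP):
    """S42: normalize, skip already-accessed URLs, and apply the step limit."""
    pending = []
    for url in urls:
        norm = normalize_url(url)
        if norm not in visited and norm not in pending:
            pending.append(norm)
    return pending[:limit]


async def access_step(context, fetch_text):
    """S41-S44: fetch pending links concurrently and record every outcome."""
    # S41: take the URL list returned by the corresponding search operation.
    pending = links_to_visit(context["search_urls"], context["visited"])
    # S43: issue all requests concurrently; failures are returned, not raised.
    outcomes = await asyncio.gather(
        *(fetch_text(url) for url in pending), return_exceptions=True
    )
    # S44: store successes as knowledge entries and log successes and failures.
    for url, outcome in zip(pending, outcomes):
        context["visited"].add(url)
        if isinstance(outcome, BaseException):
            context["access_log"].append(
                {"url": url, "ok": False, "error": str(outcome)}
            )
        else:
            context["knowledge"].append({"url": url, "text": outcome})
            context["access_log"].append({"url": url, "ok": True})
    return pending
```

Collecting exceptions with `return_exceptions=True` lets one failed link be logged as failure information without cancelling the other requests, which matches the requirement that the results of all links, successful or not, are written back to the shared task context.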
[0067] Embodiments of the present invention also provide a non-transitory computer-readable storage medium that can be disposed in an electronic device to store at least one instruction or at least one program related to implementing a method in the method embodiments, wherein the at least one instruction or the at least one program is loaded and executed by the processor to implement the context-based iterative question-answering method provided in the above embodiments.
[0068] Embodiments of the present invention also provide an electronic device, including a processor and the aforementioned non-transitory computer-readable storage medium.
[0069] While specific embodiments of the invention have been described in detail by way of example, those skilled in the art should understand that the examples are for illustrative purposes only and not intended to limit the scope of the invention. It should also be understood that various modifications can be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.
Claims
1. A question-answering method based on context iteration, characterized in that the method includes the following steps: S1, construct a problem queue and determine the current problem to be processed from the problem queue; the problem queue contains at least the initial user problem, and the initial user problem is placed at the end of the problem queue; S2, based on the current shared task context and the current problem to be processed, generate prompt information and input it into the preset large language model to obtain the answer to be output, and evaluate the quality of the answer to be output; S3, if the quality assessment result is a pass, remove the current problem to be processed from the problem queue and determine whether it is the initial user problem; if so, generate and output the final answer based on the shared task context; if not, determine a new current problem to be processed from the problem queue and return to step S2; S4, if the quality assessment result is a fail and the preset resource constraint threshold has not been reached, determine the operation to be executed from the preset operation set according to the quality assessment result, update the shared task context with the execution result information of the operation to be executed, and then return to step S2 for iteration until the final answer is output; wherein, when a sub-problem is generated after executing the operation to be executed, the sub-problem is placed at the head of the problem queue to update the current problem to be processed.
2. The question-answering method based on context iteration according to claim 1, characterized in that, in step S1, the current problem to be processed is determined through the following steps: S101, if the problem queue contains only the initial user problem, the initial user problem is determined as the current problem to be processed; S102, add the sub-problems generated after performing the reflection operation to the problem queue; if only one sub-problem is generated, the generated sub-problem is determined as the current problem to be processed; if at least two sub-problems are generated, the importance of each sub-problem is analyzed to rank them, and the sub-problem with the highest importance is determined as the current problem to be processed.
3. The question-answering method based on context iteration according to claim 1, characterized in that, in step S2, the prompt information is generated through the following steps: S201, set system instruction objectives; the system instruction objectives include at least the role of the agent in the preset large language model, the preset reasoning principles it follows, and the preset data response format; S202, dynamically combine several information categories to construct a contextual situation; the several information categories include verified knowledge fragments, behavioral history records, and evaluation feedback for the answer to be output; S203, based on the current shared task context, obtain the current operation constraints; the operation constraints are used to disable specified operations in the preset operation set; S204, combine the system instruction objectives, the contextual situation, and the current operation constraints to generate the prompt information.
4. The question-answering method based on context iteration according to claim 1, characterized in that, in step S2, the quality assessment of the answer to be output includes the following steps: S210, dynamically determine the corresponding set of evaluation criteria based on the type of the current problem to be processed; S220, for each evaluation criterion in the set of evaluation criteria, compare the answer to be output with the preset reference example for consistency, and generate a judgment sub-result and a judgment basis for that evaluation criterion; S230, integrate the judgment sub-results of all evaluation criteria to generate a comprehensive evaluation result; the comprehensive evaluation result includes at least a Boolean judgment value representing whether the quality assessment result is a pass or a fail, and a judgment reasoning chain composed of the judgment basis of each evaluation criterion.
5. The question-answering method based on context iteration according to claim 1, characterized in that the preset resource constraint thresholds include the preset maximum number of consecutive failed evaluations and the maximum token usage; the maximum token usage meets the following condition: [formula not reproduced in the source text]; wherein T_max is the maximum token usage, T_base is the preset token budget, S_h is the number of times historical evaluations have passed in the current session, S_t is the total number of iterations that have occurred in the current session, T_u is the amount of tokens used in the current session, and η and ζ are both adjustment coefficients.
6. The question-answering method based on context iteration according to claim 4, characterized in that the preset operation set includes reflection operations, search operations, access operations, and encoding operations; and in step S4, the rules for determining the operation to be executed from the preset operation set based on the quality assessment result are as follows: S401, analyze the judgment reasoning chain in the comprehensive evaluation result; S402, if the judgment reasoning chain indicates that the coverage of the answer to be output is insufficient, the operation to be executed is determined to be a reflection operation, so as to analyze the current shared task context and generate at least one sub-problem to be added to the head of the problem queue; S403, if the judgment reasoning chain indicates that the reliability of the answer to be output is insufficient, the operation to be executed is determined to be a search operation and an access operation; S404, if the judgment reasoning chain indicates that the answer to be output needs to be encoded, the operation to be executed is determined to be an encoding operation; the encoding operation is any one of programmed calculation, calling an external application interface, or processing structured data.
7. The question-answering method based on context iteration according to claim 1, characterized in that step S4 also includes the following steps: S410, when performing a search operation, calculate the semantic similarity between the current search request and historical search records, and filter out the current search request when the semantic similarity exceeds a preset similarity threshold; S420, when the semantic similarity does not exceed the preset similarity threshold, input the current search request to the preset search rewriting module for optimization; the optimization includes at least keyword optimization, synonym expansion, and expression conversion adapted to the specified retrieval algorithm; S430, use the optimized target search request as the new current search request and return to step S410; if the semantic similarity still does not exceed the preset similarity threshold, perform a network search based on the new current search request, and store the search results and the current search record in the shared task context.
8. The question-answering method based on context iteration according to claim 1, characterized in that step S4 also includes the following steps: S41, when performing an access operation, obtain the list of Uniform Resource Locators returned by the corresponding search operation from the shared task context; S42, perform format normalization processing on each Uniform Resource Locator in the list of Uniform Resource Locators, and filter out the Uniform Resource Locators that have been accessed based on the historical access records stored in the shared task context to obtain a set of links to be accessed; S43, employ a parallel asynchronous mechanism to simultaneously initiate content retrieval requests for several links in the set of links to be accessed, and call a preset content extraction tool to extract the target text content; S44, store the extracted target text content as an independent knowledge entry, associated with the corresponding link, in the shared task context, and at the same time integrate the content acquisition results of all links in the set of links to be accessed into the shared task context.
9. A non-transitory computer-readable storage medium, wherein the storage medium stores at least one instruction or at least one program segment, characterized in that the at least one instruction or the at least one program segment is loaded and executed by a processor to implement the question-answering method based on context iteration as described in any one of claims 1-8.
10. An electronic device, characterized in that it includes a processor and the non-transitory computer-readable storage medium as described in claim 9.