A method, device and storage medium for iterative question answering based on a priority queue

By employing a priority queue iterative question-answering method, dynamic priority sorting, and quality assessment, this approach addresses the issues of resource waste and insufficient logical relationship evaluation in large language model question-answering systems when handling complex questions. It achieves logical reliability, a deep understanding of complex open-domain questions, and efficient answer generation.

CN122242735A · Pending · Publication Date: 2026-06-19 · BEIJING YUCHEN SHIMEI SCI & TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: BEIJING YUCHEN SHIMEI SCI & TECH
Filing Date: 2026-03-13
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing question-answering systems based on large language models struggle to dynamically assess the inherent logical relationships between sub-questions when dealing with complex open-domain problems. This leads to wasted resources and unreliable reasoning paths. Furthermore, the lack of quantitative assessment of the importance of sub-questions affects the accuracy and efficiency of the final answer.

Method used

A priority-queue-based iterative question-answering method is adopted. A question queue is constructed, hierarchical prompts are generated, and each candidate answer undergoes quality assessment. When an answer fails the assessment, sub-questions are generated, scored, and inserted at the head of the queue in descending order of priority, so that the system handles the most fundamental or information-gap-filling sub-questions first and optimizes the reasoning path.

Benefits of technology

Dynamic prioritization and quality assessment optimize the reasoning path, improving the reliability, accuracy, and completeness of answers to complex open-domain questions.

✦ Generated by Eureka AI based on patent content.

Figure CN122242735A_ABST

Abstract

This invention relates to the field of intelligent question answering technology, and in particular to a method, device, and storage medium for iterative question answering based on a priority queue. The method includes: constructing a question queue and placing the initial user question corresponding to the target task as the current pending question; generating prompt information based on the current shared task context and the current pending question and inputting it into a preset large language model to obtain the answer to be output and perform quality evaluation on the answer to be output; if the quality evaluation result is unsuccessful and the determined operation to be executed is a reflection operation, generating a sub-question; calculating the priority score of each sub-question based on several preset evaluation indicators and inserting it into the head of the question queue in descending order to determine a new current pending question for iteration until the final answer is output. This invention can prioritize the iteratively generated sub-questions, thereby optimizing the reasoning path and ensuring the reliability and accuracy of the final answer.

Description

Technical Field

[0001] This invention relates to the field of intelligent question answering technology, and in particular to a method, device and storage medium for iterative question answering based on priority queues.

Background Technology

[0002] Currently, question-answering systems based on large language models, especially those employing retrieval-augmented generation architectures, have become the mainstream approach for handling knowledge-intensive tasks. These systems typically follow a single-pass "retrieve-then-generate" process: first, relevant document fragments are retrieved based on the user's query, and then the large language model directly generates the final answer based on the retrieved context.

[0003] When using iterative methods to handle complex open-domain question answering, the initial question is typically recursively decomposed into multiple sub-questions, which are then processed according to the generation order or simple heuristic rules. This approach can lead to several problems: First, it fails to dynamically assess the inherent logical relationships between sub-questions, potentially wasting significant resources on secondary or marginal issues, resulting in unreliable reasoning paths and difficulty in achieving optimal problem-solving within limited resource constraints. Second, it lacks a quantitative assessment mechanism for the importance of sub-questions and dynamic priority scheduling capabilities. During iteration, while the system can identify knowledge gaps and generate sub-questions, it often relies solely on the question generation order or simple rule-based judgments. This makes it difficult for the system to autonomously identify and prioritize the most fundamental and critical sub-questions when faced with complex problems, thus impacting reasoning efficiency and the quality of the final answer.

Summary of the Invention

[0004] To address the aforementioned technical problems, this invention provides a method, device, and storage medium for iterative question answering based on a priority queue. This method can prioritize the sub-questions generated iteratively, which is beneficial for generating a logically reliable shared task context that provides a deep understanding of complex open-domain problems. This optimizes the reasoning path and ensures the reliability and accuracy of the final answer.

[0005] According to a first aspect of the present invention, a method for iterative question answering based on a priority queue is provided, comprising the following steps: S1, construct a problem queue and add the initial user problem corresponding to the target task as the current problem to be processed.

[0006] S2, based on the current shared task context and the current problem to be processed, generate prompt information and input it into the preset large language model to obtain the answer to be output and evaluate the quality of the answer to be output; the shared task context refers to the task context that is updated in real time when the target task is executed.

[0007] S3. If the quality assessment result is "not passed" and the operation to be executed determined from the preset operation set based on the quality assessment result is a reflection operation, generate at least one sub-problem.

[0008] S4. If there are at least two sub-problems, calculate the priority score of each sub-problem based on several preset evaluation indicators, and insert it into the head of the problem queue in descending order of priority score. Take the sub-problems at the head of the problem queue as the current problem to be processed and keep the initial user problem at the tail of the problem queue, and return to step S2. The several preset evaluation indicators include at least the basic score of the problem, the degree of missing context, and the degree of dependency of the problem.

[0009] S5. If the quality assessment result is passed, the current problem to be processed is determined from the problem queue in sequence and steps S2 to S5 are repeated until all problems in the problem queue are processed. Then, the final answer is generated and output based on the shared task context.

[0010] According to a second aspect of the present invention, a non-transitory computer-readable storage medium is provided, wherein at least one instruction or at least one program is stored therein, the at least one instruction or the at least one program being loaded and executed by a processor to implement the above-described priority queue-based iterative question-and-answer method.

[0011] According to a third aspect of the present invention, an electronic device is provided, including a processor and the aforementioned non-transitory computer-readable storage medium.

[0012] The present invention has at least the following beneficial effects: This invention provides a priority queue-based iterative question-answering method. First, a question queue is constructed, and the initial user question corresponding to the target task is placed as the current pending question. Then, based on the current shared task context and the current pending question, prompt information is generated and input into a preset large language model, providing the large language model with a hierarchical and context-rich instruction set. After obtaining the answer to be output, its quality is evaluated. If the quality evaluation result is satisfactory, the current pending question is removed from the question queue, and the iteration continues until the final answer is output. If the quality evaluation result is unsatisfactory and the determined operation to be performed is a reflection operation, a sub-question is generated. Then, based on several preset evaluation indicators, the priority score of each sub-question is calculated and inserted into the head of the question queue in descending order to determine a new current pending question for iteration. During this process, dynamic priority sorting ensures that the system can identify and prioritize the most basic, core, or informationally gap-filling sub-questions. This facilitates the generation of logically reliable shared task contexts with a deep understanding of complex open-domain problems, thereby optimizing the reasoning path and ensuring the reliability and accuracy of the final answer.

Attached Figure Description

[0013] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0014] Figure 1 is a flowchart of a priority queue-based iterative question-answering method provided in an embodiment of the present invention.

Detailed Implementation

[0015] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0016] This invention provides a method for iterative question answering based on a priority queue. As shown in Figure 1, the method includes the following steps: S1: Construct a problem queue and place the initial user problem corresponding to the target task in it as the current problem to be processed. This can be understood as follows: in its initial state, the problem queue contains only the initial user problem. When a sub-problem is generated in a reflection operation, it is added to the head of the problem queue while the initial user problem remains at the tail, so that the answer content for the sub-problem is generated first and the answer to the initial user problem is output last.

[0017] As described above, by constructing a problem queue, the current problem to be processed is always extracted from the head of the queue in each round of inference iteration, while the initial user problem is executed last. This design ensures that newly discovered, potentially more fundamental, sub-problems are given priority, while guaranteeing that the system will eventually return to solving the original problem. In this process, a continuously accumulating shared task context is maintained. The knowledge gained from solving the problems in the previous stages can be accumulated in this context and then directly utilized by all subsequent problems, thus strengthening the knowledge base and improving the accuracy and robustness of the solutions to the original complex problems.
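The queue discipline described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the callbacks `answer_fn`, `evaluate_fn`, and `reflect_fn`, the list-based shared context, and the `max_steps` cap are all hypothetical names introduced here.

```python
from collections import deque

def run_queue(initial_question, answer_fn, evaluate_fn, reflect_fn, max_steps=20):
    """Process questions from the head of the queue until it is empty."""
    queue = deque([initial_question])  # initially holds only the user question
    context = []                       # shared task context: (question, answer) pairs
    for _ in range(max_steps):         # hypothetical cap so the sketch always terminates
        if not queue:
            break
        current = queue[0]             # the head is always the current pending question
        answer = answer_fn(current, context)
        if evaluate_fn(current, answer, context):
            context.append((current, answer))  # knowledge accumulates for later questions
            queue.popleft()                    # passed: move on to the next head
        else:
            subs = reflect_fn(current, context)
            # Insert at the head in the given order; the user question stays at the tail.
            queue.extendleft(reversed(subs))
    return context

# Toy callbacks: the user question "Q" only passes once some knowledge exists.
def demo_answer(question, context):
    return f"answer to {question!r} using {len(context)} known facts"

def demo_evaluate(question, answer, context):
    return question != "Q" or len(context) > 0

def demo_reflect(question, context):
    return ["sub A", "sub B"]

solved = run_queue("Q", demo_answer, demo_evaluate, demo_reflect)
print([question for question, _ in solved])
```

In this toy run the two sub-problems are answered before the original question, which is processed last with their answers already in the shared context.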

[0018] S2, based on the current shared task context and the current problem to be processed, generate prompt information and input it into the preset large language model to obtain the answer to be output and evaluate the quality of the answer to be output; the shared task context refers to the task context that is updated in real time when the target task is executed.

[0019] Furthermore, the shared task context includes at least the history, all questions and keywords that have been raised, accumulated knowledge, and unaccessed Uniform Resource Locators.

[0020] In one embodiment, in step S2, the prompt message is generated through the following steps: S201, Set the system instruction target; the system instruction target includes at least the role of the agent in the preset large language model, the preset inference principle followed, and the preset data response format. For example, the role of the agent is a senior AI research assistant, the preset inference principle is multi-step inference, and the preset data response format is structured data conforming to JSON Schema.

[0021] S202, dynamically combine several information categories to construct a contextual situation; the several information categories include verified knowledge fragments, behavioral history records, and evaluation feedback for the answer to be output.

[0022] Specifically, the knowledge fragments are those that have been verified and accumulated during previous iterations.

[0023] Specifically, the evaluation feedback for the output answer refers to the output answers that failed the evaluation and the improvement strategies summarized from them.

[0024] S203, based on the current shared task context, obtain the current operation constraints; the operation constraints are used to disable specified operations in a preset operation set. For example, when there is no available Uniform Resource Locator in memory, access operations are prohibited, thereby guiding the direction of subsequent operation decisions, ensuring the rationality and efficiency of their action paths, and avoiding invalid loops.

[0025] S204, combine the system instruction target, the contextual situation, and the current operation constraints to generate the prompt information.

[0026] As described above, constructing several pieces of structured information clearly defines the function of each part of the prompt information and provides the large language model with a hierarchical, context-rich instruction set. This overcomes a shortcoming of current advanced reasoning models, which can generate some instructions automatically but cannot achieve fine-grained control over agent behavior or make efficient use of a limited context window. The dynamic combination approach described above keeps the interaction process highly controllable and stable, thereby improving the reliability and efficiency of complex task execution.
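Steps S201 to S204 can be illustrated with a short sketch. The dictionary keys, the operation names (`reflect`, `search`, `visit`, `code`), and the JSON layout of the prompt are assumptions made for illustration; the text above only fixes the three layers (system instruction target, contextual situation, operation constraints).

```python
import json

ALL_OPERATIONS = ("reflect", "search", "visit", "code")  # hypothetical operation names

def build_prompt(question, context):
    """Assemble a layered prompt from the shared task context (S201 to S204)."""
    # S201: system instruction target (role, reasoning principle, response format).
    system_goal = {
        "role": "senior AI research assistant",
        "reasoning_principle": "multi-step reasoning",
        "response_format": "structured data conforming to a JSON Schema",
    }
    # S202: contextual situation built from the available information categories.
    situation = {
        "knowledge": context.get("knowledge", []),
        "history": context.get("history", []),
        "evaluation_feedback": context.get("feedback", []),
    }
    # S203: operation constraints; visiting is disabled when no URL is available.
    disabled = set()
    if not context.get("unvisited_urls"):
        disabled.add("visit")
    allowed = [op for op in ALL_OPERATIONS if op not in disabled]
    # S204: combine the three layers into a single prompt.
    return json.dumps(
        {"system": system_goal, "context": situation,
         "allowed_operations": allowed, "question": question},
        indent=2,
    )

prompt_without_urls = json.loads(build_prompt("What is X?", {"knowledge": ["X is a thing"]}))
prompt_with_urls = json.loads(build_prompt("What is X?", {"unvisited_urls": ["https://example.com"]}))
print(prompt_without_urls["allowed_operations"])
```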

[0027] Furthermore, the quality assessment of the output answer includes the following steps: S210, dynamically determine the corresponding set of evaluation criteria based on the type of the current problem to be addressed; this can be understood as: different sets of evaluation criteria are set for different problem types. For example, if the current problem to be addressed is a factual problem, the corresponding set of evaluation criteria includes accuracy, completeness, verifiability, consistency, relevance, and timeliness; if the current problem to be addressed is an analytical problem, the corresponding set of evaluation criteria includes multi-angle coverage, sufficiency of argumentation, balance of positions, logical coherence, and innovativeness.

[0028] S220, for each evaluation criterion in the evaluation criterion set, compare the answer to be output with the preset reference example and generate a judgment sub-result and a judgment basis for that criterion. This can be understood as: if any evaluation criterion is not met, the answer to be output is identified as having a knowledge gap, so that the knowledge gap can be analyzed and corresponding sub-questions can be generated in the subsequent reflection operation.

[0029] S230, integrate the judgment sub-results of all evaluation criteria to generate a comprehensive evaluation result; the comprehensive evaluation result includes at least a Boolean judgment value representing whether the quality evaluation result passes or fails, and a judgment reasoning chain composed of the judgment basis of each evaluation criterion.

[0030] As described above, a dedicated evaluation module objectively and systematically assesses the quality of the generated output answer. During the assessment process, a consistency comparison is performed with reference to preset examples. This approach has higher stability and reliability compared to the model's self-evaluation strategy. The evaluation results also include a detailed reasoning chain, which records the judgment logic and basis for each evaluation criterion. Through this standardized and multi-dimensional evaluation mechanism, the accuracy, completeness, and relevance of the final output answer are improved.
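A compact sketch of steps S210 to S230 is given below. The `judge` callback stands in for the comparison against preset reference examples and is hypothetical, as is the rule that the overall result passes only when every criterion passes; the criterion lists are taken from the examples in the text.

```python
from dataclasses import dataclass, field

# S210: criterion sets per question type, as listed in the examples above.
CRITERIA = {
    "factual": ["accuracy", "completeness", "verifiability",
                "consistency", "relevance", "timeliness"],
    "analytical": ["multi-angle coverage", "sufficiency of argumentation",
                   "balance of positions", "logical coherence", "innovativeness"],
}

@dataclass
class Evaluation:
    passed: bool                                          # Boolean judgment value
    reasoning_chain: list = field(default_factory=list)   # per-criterion basis

def evaluate(answer, question_type, judge):
    """S220 and S230: judge each criterion, then integrate the sub-results."""
    chain = []
    for criterion in CRITERIA[question_type]:
        ok, basis = judge(criterion, answer)  # hypothetical judge callback
        chain.append({"criterion": criterion, "passed": ok, "basis": basis})
    # Assumption: the answer passes only if no criterion fails.
    return Evaluation(passed=all(item["passed"] for item in chain), reasoning_chain=chain)

def demo_judge(criterion, answer):
    if criterion == "completeness" and len(answer.split()) < 5:
        return False, "answer is too short to cover the question"
    return True, "consistent with the reference example"

result = evaluate("Paris.", "factual", demo_judge)
failed = [item["criterion"] for item in result.reasoning_chain if not item["passed"]]
print(result.passed, failed)
```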

[0031] S3. If the quality assessment result is "not passed" and the operation to be executed determined from the preset operation set based on the quality assessment result is a reflection operation, generate at least one sub-problem.

[0032] Specifically, the preset set of operations includes reflection operations, search operations, access operations, and encoding operations. The execution result information obtained from each operation is updated in the shared task context.

[0033] Specifically, the rules for determining the operation to be executed from the preset operation set based on the quality assessment results are as follows: S301, Analyze the decision reasoning chain in the comprehensive evaluation result.

[0034] S302, if the determination reasoning chain indicates that the coverage of the answer to be output is insufficient, then the operation to be performed is determined to be a reflection operation, so as to analyze the current shared task context and generate at least one sub-problem to be added to the head of the problem queue.

[0035] S303, if the determination reasoning chain indicates that the reliability of the answer to be output is insufficient, then the operation to be performed is determined to be a search operation and an access operation.

[0036] S304, if the judgment reasoning chain indicates that the answer to be output requires an encoding operation, then the operation to be performed is determined to be an encoding operation; the encoding operation is any one of programmed calculation, calling an external application programming interface, or processing structured data. For example, in scenarios such as mathematical operations and statistical modeling, programmed calculation is used; in scenarios such as acquiring real-time data or external data, calling an external application programming interface is used; in scenarios such as data cleaning, parsing, and information extraction, processing structured data is used.

[0037] The above approach precisely maps the conclusion of failing the evaluation to specific operations, enabling targeted correction of the problem's defects. This systematically guides the reasoning path, significantly improving the efficiency of iteration and the effectiveness of the generated answers. Furthermore, updating the execution results of the operations to the shared task context provides an information basis for the next iteration, ensuring the reliability of the generated answers.
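The mapping in S301 to S304 can be sketched as a small dispatch function. The `defect` labels on reasoning-chain entries, the operation names, and the precedence order when several defects are present are assumptions introduced here; the text does not specify how ties are resolved.

```python
# Hypothetical defect labels mapped to the operations named in S302 to S304.
OPERATIONS_BY_DEFECT = {
    "coverage": ["reflect"],             # S302: insufficient coverage
    "reliability": ["search", "visit"],  # S303: insufficient reliability
    "computation": ["code"],             # S304: encoding operation required
}

def choose_operations(reasoning_chain, disabled=()):
    """Pick the operations for the first failed defect type, skipping disabled ones."""
    for defect in ("coverage", "reliability", "computation"):  # assumed precedence
        if any(item["defect"] == defect and not item["passed"] for item in reasoning_chain):
            return [op for op in OPERATIONS_BY_DEFECT[defect] if op not in disabled]
    return []

chain = [
    {"defect": "coverage", "passed": True},
    {"defect": "reliability", "passed": False},
]
print(choose_operations(chain))
print(choose_operations(chain, disabled=("visit",)))
```

The `disabled` argument plays the role of the operation constraints from step S203, for example when no unvisited URL remains.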

[0038] In a preferred embodiment, before determining the operation to be executed, it is determined whether a preset resource constraint threshold has been reached; if the preset resource constraint threshold has not been reached, the operation to be executed is determined from the preset operation set based on the quality assessment result; if the preset resource constraint threshold has been reached, the final answer is forcibly generated and output based on the current shared task context.

[0039] Specifically, the preset resource constraint thresholds include a preset maximum number of consecutive failed evaluations and a maximum token usage. For example, the system initializes a step counter to count the number of consecutive failed evaluations.

[0040] Preferably, the maximum token usage satisfies the following condition: $T_{max} = T_{base} \times (1 + R) \times (1 - P)$, where $T_{max}$ is the maximum token usage, $T_{base}$ is the preset token budget, $R$ is the success reward factor, and $P$ is the consumption penalty factor. These factors are computed from $S_h$, the number of evaluations that have passed in the current session; $S_t$, the total number of iterations that have occurred in the current session; and $T_u$, the number of tokens used in the current session. $\eta$ and $\zeta$ are both adjustment coefficients.

[0041] The above formula can be understood as: maximum token budget = basic token budget × (1 + success reward factor) × (1 − consumption penalty factor). As the success reward factor increases, the system raises the maximum token budget to encourage further exploration for better answers. The consumption penalty factor is based on resource consumption progress: as the task progresses, the consumption progress increases and so does the consumption penalty factor, so the term (1 − consumption penalty factor) shrinks and the system dynamically reduces the maximum token budget to prevent excessive consumption in the later stages. In other words, the remaining available resources of the task are adjusted automatically according to its real-time success rate and resource consumption progress. Setting a maximum token usage gives the entire question-answering task a clear resource budget boundary and prevents computational cost from running out of control through unbounded iteration or complex reasoning. Furthermore, the final answer output mechanism can be triggered dynamically, forcing the answer to be generated and output from the accumulated context before resources are exhausted, which ensures the economic efficiency, feasibility, and reliability of this question-answering method.
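The budget rule can be made concrete with a small sketch. The exact expressions for the two factors are not recoverable from the text, so the forms used here are assumptions: the success reward factor is taken as η times the pass rate (passed evaluations over total iterations), and the consumption penalty factor as ζ times the fraction of the base budget already used. The default coefficients and the failure limit are likewise illustrative.

```python
def max_token_budget(t_base, s_h, s_t, t_u, eta=0.2, zeta=0.5):
    """Budget = base x (1 + reward) x (1 - penalty), with assumed factor forms."""
    success_rate = s_h / s_t if s_t else 0.0               # assumed: pass rate so far
    progress = min(t_u / t_base, 1.0) if t_base else 1.0   # assumed: share of base budget used
    reward = eta * success_rate
    penalty = zeta * progress
    return t_base * (1 + reward) * (1 - penalty)

def should_force_final(consecutive_failures, t_u, t_max, max_failures=3):
    """Force the final answer once either resource threshold is reached."""
    return consecutive_failures >= max_failures or t_u >= t_max

early = max_token_budget(t_base=1000, s_h=3, s_t=4, t_u=200)
late = max_token_budget(t_base=1000, s_h=3, s_t=4, t_u=800)
print(early > late)  # the budget shrinks as consumption progresses
```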

[0042] Furthermore, step S3 also includes the following steps: S310, when performing a search operation, calculate the semantic similarity between the current search request and the historical search records. If the semantic similarity exceeds a preset similarity threshold, filter out the current search request as a duplicate and record the deduplication result in the shared task context; the search operation is then disabled, or the result of the search operation is set to null by default, so as to influence the next decision.

[0043] Preferably, the embedding model jina-embeddings-v3 is used to calculate semantic similarity, so as to make the similarity control more accurate and reliable, thereby ensuring the deduplication effect.

[0044] S320, when the semantic similarity does not exceed a preset similarity threshold, the current search request is input to a preset search rewriting module for optimization; the optimization includes at least keyword optimization, synonym expansion, and expression transformation adapted to a specified retrieval algorithm. For example, the specified retrieval algorithm may be the BM25 algorithm.

[0045] S330, take the optimized target search request as the new current search request and return to step S310. If the semantic similarity still does not exceed the preset similarity threshold, perform a network search based on the new current search request and store the search results and the current search record in the shared task context.

[0046] As described above, by performing deduplication, rewriting, and further deduplication on search requests, the effectiveness of search requests and the quality of results can be improved. Deduplication can eliminate completely duplicate or highly semantically similar requests, avoiding redundant calculations. Rewriting optimization can cover relevant information under different languages, styles, and content formats, thereby improving the recall and accuracy of search results.
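The deduplicate, rewrite, deduplicate-again flow of S310 to S330 might look like the sketch below. A bag-of-words cosine similarity is used purely as a stand-in for an embedding model, and the `rewrite` callback, the threshold value, and the function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase bag of words."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def is_duplicate(query, history, threshold):
    vector = embed(query)
    return any(cosine(vector, embed(past)) > threshold for past in history)

def handle_search(query, history, rewrite, threshold=0.8):
    """Return the request to send to the search engine, or None if filtered out."""
    if is_duplicate(query, history, threshold):      # S310: first deduplication
        return None
    rewritten = rewrite(query)                       # S320: query rewriting
    if is_duplicate(rewritten, history, threshold):  # S330: deduplicate again
        return None
    history.append(rewritten)                        # record for later deduplication
    return rewritten

history = ["priority queue question answering"]
first = handle_search("Priority queue question answering", history, rewrite=str.lower)
second = handle_search("token budget formula", history, rewrite=lambda q: q + " explained")
print(first, second)
```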

[0047] In another embodiment, step S3 further includes the following steps: S31, when performing an access operation, obtain the list of Uniform Resource Locators returned by the corresponding search operation from the shared task context.

[0048] S32, perform format normalization processing on each Uniform Resource Locator (URL) in the list of URLs, and filter out URLs that have been accessed based on the historical access records stored in the shared task context to obtain a set of links to be accessed; this can be understood as: the set of links to be accessed refers to URLs that have not yet been accessed.

[0049] S33, use a parallel asynchronous mechanism to initiate content retrieval requests for several links in the set of links to be accessed simultaneously, and call a preset content extraction tool to extract the target text content. For example, the content extraction tool can be the JinaReader API. When the parallel asynchronous mechanism is used, the number of Uniform Resource Locators (URLs) that can be accessed within a single step is limited, in order to keep the memory usage of a single iteration under control and ensure the long-term stability of the system.

[0050] S34: The extracted target text content is stored as an independent knowledge entry, associated with the corresponding link, in the shared task context. At the same time, the content acquisition results of all links in the set of links to be accessed are integrated into the shared task context. This can be understood as: the content acquisition results of all links include the content acquired successfully or the information of failure, providing the agent with comprehensive information feedback.

[0051] As described above, by filtering out previously accessed Uniform Resource Locators (URLs), we can avoid redundant information collection and wasted computing resources. Furthermore, the use of parallel and asynchronous mechanisms can optimize access efficiency, thereby efficiently extracting structured text content and ensuring the structured preservation of the original network information, providing high-quality information input for subsequent reasoning steps.
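Steps S31 to S34 can be sketched with the standard library alone. The `fetch` coroutine is a hypothetical placeholder for the content extraction tool, and the normalization rules (lowercase host, dropped fragment, trimmed trailing slash) and the per-step limit of three URLs are assumptions chosen for illustration.

```python
import asyncio
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """S32: normalize scheme/host case, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

async def visit(urls, visited, fetch, max_per_step=3):
    """S32 to S34: filter visited URLs, fetch the rest concurrently, record outcomes."""
    pending = []
    for url in map(normalize, urls):
        if url not in visited and url not in pending:
            pending.append(url)
    pending = pending[:max_per_step]  # cap the number of URLs handled in one step
    outcomes = await asyncio.gather(*(fetch(u) for u in pending), return_exceptions=True)
    visited.update(pending)
    results = {}
    for url, outcome in zip(pending, outcomes):
        if isinstance(outcome, Exception):
            results[url] = {"ok": False, "error": str(outcome)}  # failure is recorded too
        else:
            results[url] = {"ok": True, "content": outcome}
    return results

async def fake_fetch(url):  # hypothetical stand-in for a content extraction tool
    if "bad" in url:
        raise RuntimeError("timeout")
    return f"text of {url}"

visited = {"https://example.com/a"}
results = asyncio.run(visit(
    ["HTTPS://Example.com/a/", "https://example.com/b#frag", "https://example.com/bad"],
    visited, fake_fetch,
))
print(sorted(results))
```

Both successes and failures end up in the returned mapping, mirroring the requirement that the agent receives feedback for every link it attempted.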

[0052] S4. If there are at least two sub-problems, calculate the priority score of each sub-problem based on several preset evaluation indicators, and insert it into the head of the problem queue in descending order of priority score. Take the sub-problems at the head of the problem queue as the current problem to be processed and keep the initial user problem at the tail of the problem queue. Return to step S2.

[0053] In another case, if there is only one subproblem, the subproblem is directly inserted into the head of the problem queue and becomes the current problem to be processed.
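The head insertion in descending score order can be shown with a double-ended queue. The function name and the `(question, score)` pair format are illustrative choices, not taken from the text.

```python
from collections import deque

def insert_subquestions(queue, scored_subs):
    """Insert sub-problems at the head so the highest score ends up first."""
    ordered = sorted(scored_subs, key=lambda item: item[1], reverse=True)
    # Push the lowest score first; each later push lands in front of it.
    for question, _score in reversed(ordered):
        queue.appendleft(question)
    return queue

queue = deque(["initial user problem"])
insert_subquestions(queue, [("a", 0.2), ("b", 0.9), ("c", 0.5)])
print(list(queue))
```

The initial user problem stays at the tail regardless of how many sub-problems are inserted.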

[0054] Preferably, the preset evaluation indicators include at least the basic question score, the degree of missing context, and the degree of question dependency.

[0055] Specifically, the basic question score is used to quantify the strength of the association between the corresponding sub-question and the initial user question in terms of conceptual and semantic centrality; it can be understood as follows: the basic question score is obtained by calculating the semantic similarity between the sub-question and the initial user question.

[0056] Specifically, the degree of missing context is used to quantify how much of the information needed to answer the corresponding sub-question is missing from the current shared task context. The calculation formula is as follows: $C = 1 - S_{max}$, where $C$ represents the degree of missing context and $S_{max}$ is the highest semantic similarity between the sub-question and any knowledge fragment in the current shared task context.

[0057] Specifically, the degree of question dependency is used to quantify the contribution of solving the corresponding sub-problem to advancing other unsolved problems in the problem queue. It is calculated from the following quantities: $D_i$ represents the contribution of the i-th sub-problem to advancing other unsolved problems in the problem queue; $S(q_i, q_j)$ represents the semantic similarity between the i-th sub-problem $q_i$ and another unsolved problem $q_j$ in the problem queue; and $N$ is the total number of questions in the current question queue.

[0058] As described above, by inserting priority sub-problems at the head of the queue and always keeping the problem at the tail, depth-first exploration is encouraged while ensuring that the final goal is not forgotten. This effectively balances the depth and breadth of exploration. Through dynamic priority sorting, the system can identify and prioritize the most basic, core, or information gap-filling sub-problems, thereby optimizing the exploration path, ensuring the efficient and reasonable construction of the shared task context, and improving the reliability of the final answer.
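The three indicators can be sketched as follows. Token-set (Jaccard) overlap is used only as a stand-in for semantic similarity, and the dependency indicator is assumed here to be the average similarity to the other queued problems; the text does not confirm that exact aggregation.

```python
def similarity(a, b):
    """Toy stand-in for semantic similarity: Jaccard overlap of word sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def indicators(sub, initial, knowledge, queue):
    # Basic question score: similarity between the sub-problem and the user problem.
    basic = similarity(sub, initial)
    # Degree of missing context: C = 1 - S_max over all knowledge fragments.
    s_max = max((similarity(sub, fragment) for fragment in knowledge), default=0.0)
    missing = 1.0 - s_max
    # Degree of dependency: assumed to be the mean similarity to other queued problems.
    others = [q for q in queue if q != sub]
    dependency = sum(similarity(sub, q) for q in others) / len(others) if others else 0.0
    return {"basic": basic, "missing": missing, "dependency": dependency}

initial = "how does a priority queue schedule questions"
scores = indicators("what is a priority queue", initial, knowledge=[], queue=[initial])
print(scores)
```

With an empty knowledge base the degree of missing context is at its maximum of 1, which pushes the sub-problem toward the head of the queue.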

[0059] Furthermore, the calculation of the priority score for each sub-problem based on several preset evaluation indicators includes the following steps: S401, obtain the resolution status records of all processed sub-problems within the preset sliding time window and the corresponding preset evaluation index values; wherein, the resolution status records include successfully resolved and unsuccessfully resolved.

[0060] S402, for each preset evaluation index, calculate the preset evaluation index value corresponding to the successfully solved sub-problem and the preset evaluation index value corresponding to the unsuccessfully solved sub-problem, and calculate the difference between the two as the feature discrimination degree of the preset evaluation index itself within the current sliding time window.

[0061] S403, calculate the initial adjustment weight of each preset evaluation index based on the feature discrimination of each preset evaluation index, and smoothly merge the initial adjustment weight with the preset evaluation index weight of the previous iteration to obtain the updated weight of each preset evaluation index; wherein, the initial state of the weight of each preset evaluation index is an average weight distribution.

[0062] Specifically, the initial adjustment weight of each preset evaluation index is calculated using the softmax function, and the calculation formula is as follows: $W_d = \frac{\exp(H_d / \sigma)}{\sum_{g=1}^{k} \exp(H_g / \sigma)}$, where $W_d$ is the initial adjustment weight of the d-th preset evaluation index, $H_d$ is the feature discrimination of the d-th preset evaluation index within the current sliding time window, $\sigma$ is the preset temperature parameter used to control the adjustment range, $H_g$ is the feature discrimination of the g-th preset evaluation index within the current sliding time window, and $k$ is the total number of preset evaluation indices.

[0063] Preferably, the initial adjustment weights are smoothly integrated with the preset evaluation index weights of the previous iteration using an exponential moving average algorithm. Those skilled in the art are familiar with the specific calculation method of the exponential moving average algorithm, and it will not be described in detail here.

[0064] S404: Based on the updated weights of each preset evaluation index, the values of each preset evaluation index for each sub-problem are weighted and summed to obtain the priority score for each sub-problem.
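The weighted sum of step S404, together with the descending ordering used when the sub-problems are inserted into the queue, can be sketched as follows. The index names, weights and sub-problem values are hypothetical sample data.

```python
def priority_score(values, weights):
    """Weighted sum of one sub-problem's evaluation index values."""
    return sum(weights[name] * values[name] for name in weights)

def rank_subproblems(subproblems, weights):
    """Return the sub-problems sorted by descending priority score."""
    return sorted(
        subproblems,
        key=lambda sp: priority_score(sp["values"], weights),
        reverse=True,
    )

# Hypothetical updated weights and two candidate sub-problems.
weights = {"basic": 0.5, "missing": 0.3, "dependency": 0.2}
subproblems = [
    {"text": "A", "values": {"basic": 0.2, "missing": 0.9, "dependency": 0.1}},
    {"text": "B", "values": {"basic": 0.8, "missing": 0.5, "dependency": 0.6}},
]
ranked = rank_subproblems(subproblems, weights)
order = [sp["text"] for sp in ranked]
```

With these sample values, sub-problem B scores 0.67 and sub-problem A scores 0.39, so B is placed ahead of A.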

[0065] As described above, by analyzing historical success and failure cases, the weights of each preset evaluation indicator are automatically adjusted, enabling the priority ranking criteria to adapt to specific task and context characteristics and dynamically optimize the problem exploration strategy. In this process, a sliding window-based weight smoothing update mechanism is also introduced, which not only ensures rapid response based on the latest feedback, but also avoids strategy instability caused by fluctuations in single evaluation results, ensuring the continuous reliability and robustness of the scheduling strategy. This is conducive to generating logically reliable shared task contexts with a deep understanding of complex open-domain problems, thereby ensuring the reliability and accuracy of the final answer.

[0066] S5. If the quality assessment passes, the next current problem to be processed is determined sequentially from the problem queue and steps S2 to S5 are repeated until all problems in the problem queue have been processed. The final answer is then generated and output based on the shared task context. In other words, each time a new current problem to be processed is determined, the problem at the head of the problem queue is taken as the current problem to be processed.

[0067] As described above, by determining whether a sub-problem that has passed evaluation is the initial user problem, the system can automatically schedule the next problem in the problem queue when a sub-problem is determined to have passed evaluation. This allows all subsequent problems to directly utilize the accumulated knowledge gained from previously solved problems, ensuring that the system only outputs the final answer when the answer to the initial user problem is verified as passed. This achieves structured and sequential processing of complex problems, improving the accuracy and completeness of the final answer.
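The overall control flow of steps S1 to S5 can be illustrated with the toy sketch below. It is not the claimed implementation: the three callables stand in for the large language model call, the quality assessment and the sub-problem generation; `max_steps` is an added safeguard against endless loops; and the sample questions and answers are invented for the demonstration.

```python
from collections import deque

def run_queue(initial_problem, answer_fn, passes_fn, decompose_fn, max_steps=50):
    queue = deque([initial_problem])  # S1: queue holds the user problem
    context = []                      # shared task context: (problem, answer)
    for _ in range(max_steps):
        if not queue:
            break
        current = queue[0]            # head of the queue is processed next
        answer = answer_fn(current, context)        # S2: generate answer
        if passes_fn(current, answer, context):     # S5: assessment passed
            queue.popleft()
            context.append((current, answer))
        else:                                       # S3: generate sub-problems
            ranked = decompose_fn(current, context)  # highest priority first
            # S4: extendleft reverses its input, so reverse first to keep
            # the highest-priority sub-problem at the head of the queue.
            queue.extendleft(reversed(ranked))
    return context

# Toy stand-ins (hypothetical): two facts answerable directly.
known = {"capital of France?": "Paris", "population of Paris?": "about 2 million"}

def answer_fn(problem, context):
    if problem in known:
        return known[problem]
    solved = dict(context)
    if all(p in solved for p in known):
        return "; ".join(solved[p] for p in known)
    return ""  # not answerable from the current context

def passes_fn(problem, answer, context):
    return answer != ""

def decompose_fn(problem, context):
    return list(known)

question = "capital of France and its population?"
context = run_queue(question, answer_fn, passes_fn, decompose_fn)
```

In this run the user problem fails first, the two sub-problems are answered from the head of the queue, and the user problem, which stayed at the tail, is answered last from the accumulated context.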

[0068] Embodiments of the present invention also provide a non-transitory computer-readable storage medium that can be disposed in an electronic device to store at least one instruction or at least one program related to implementing a method in the method embodiments, wherein the at least one instruction or the at least one program is loaded and executed by the processor to implement the priority queue-based iterative question-and-answer method provided in the above embodiments.

[0069] Embodiments of the present invention also provide an electronic device, including a processor and the aforementioned non-transitory computer-readable storage medium.

[0070] While specific embodiments of the invention have been described in detail by way of example, those skilled in the art should understand that the examples are for illustrative purposes only and not intended to limit the scope of the invention. It should also be understood that various modifications can be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims

1. A priority queue-based iterative question-answering method, characterized in that the method includes the following steps: S1, construct a problem queue and add the initial user problem corresponding to the target task as the current problem to be processed; S2, based on the current shared task context and the current problem to be processed, generate prompt information and input it into the preset large language model to obtain the answer to be output, and evaluate the quality of the answer to be output, wherein the shared task context refers to the task context that is updated in real time while the target task is executed; S3, if the quality assessment fails, and the operation to be executed determined from the preset operation set based on the quality assessment result is a reflection operation, generate at least one sub-problem; S4, if there are at least two sub-problems, calculate the priority score of each sub-problem based on several preset evaluation indicators, insert the sub-problems at the head of the problem queue in descending order of priority score, take the sub-problem at the head of the problem queue as the current problem to be processed while keeping the initial user problem at the tail of the problem queue, and return to step S2, wherein the several preset evaluation indicators include at least the basic score of the problem, the context missing degree, and the problem dependency correlation degree; S5, if the quality assessment passes, determine the current problem to be processed from the problem queue in sequence and repeat steps S2 to S5 until all problems in the problem queue have been processed, then generate and output the final answer based on the shared task context.

2. The priority queue-based iterative question-answering method according to claim 1, characterized in that step S4 further includes the following step: if there is only one sub-problem, insert the sub-problem directly at the head of the problem queue and take it as the current problem to be processed.

3. The priority queue-based iterative question-answering method according to claim 1, characterized in that, in step S2, evaluating the quality of the answer to be output includes the following steps: S210, dynamically determine the corresponding set of evaluation criteria based on the type of the current problem to be processed; S220, for each evaluation criterion in the set of evaluation criteria, compare the answer to be output with the preset reference example for consistency, and generate a judgment sub-result and a judgment basis for that evaluation criterion; S230, integrate the judgment sub-results of all evaluation criteria to generate a comprehensive evaluation result, wherein the comprehensive evaluation result includes at least a Boolean judgment value representing whether the quality assessment passes or fails, and a judgment reasoning chain composed of the judgment basis of each evaluation criterion.

4. The priority queue-based iterative question-answering method according to claim 3, characterized in that, in step S3, the rules for determining the operation to be executed from the preset operation set based on the quality assessment result are as follows: S301, analyze the judgment reasoning chain in the comprehensive evaluation result; S302, if the judgment reasoning chain indicates that the coverage of the answer to be output is insufficient, the operation to be executed is determined to be a reflection operation, so as to analyze the current shared task context and generate at least one sub-problem to be added to the head of the problem queue; S303, if the judgment reasoning chain indicates that the reliability of the answer to be output is insufficient, the operation to be executed is determined to be a search operation and an access operation; S304, if the judgment reasoning chain indicates that the answer to be output needs to be encoded, the operation to be executed is determined to be an encoding operation, wherein the encoding operation is any one of programmed calculation, calling an external application interface, or processing structured data.

5. The priority queue-based iterative question-answering method according to claim 1, characterized in that the basic score of the problem is used to quantify the strength of the association between the corresponding sub-problem and the initial user problem in terms of conceptual and semantic centrality; the context missing degree is used to quantify the degree to which the current shared task context lacks the information needed to answer the corresponding sub-problem; and the problem dependency correlation degree is used to quantify the contribution of solving the corresponding sub-problem to advancing other unsolved problems in the problem queue.

6. The priority queue-based iterative question-answering method according to claim 1, characterized in that, in step S4, calculating the priority score of each sub-problem based on several preset evaluation indicators includes the following steps: S401, obtain the resolution status records of all processed sub-problems within the preset sliding time window and the corresponding preset evaluation index values, wherein the resolution status records include successfully resolved and unsuccessfully resolved; S402, for each preset evaluation index, calculate the preset evaluation index value corresponding to the successfully solved sub-problems and the preset evaluation index value corresponding to the unsuccessfully solved sub-problems, and calculate the difference between the two as the feature discrimination degree of that preset evaluation index within the current sliding time window; S403, calculate the initial adjustment weight of each preset evaluation index based on the feature discrimination degree of each preset evaluation index, and smoothly merge the initial adjustment weight with the preset evaluation index weight of the previous iteration to obtain the updated weight of each preset evaluation index, wherein the initial state of the weights of the preset evaluation indices is a uniform weight distribution; S404, based on the updated weights of each preset evaluation index, the preset evaluation index values of each sub-problem are weighted and summed to obtain the priority score of each sub-problem.

7. The priority queue-based iterative question-answering method according to claim 1, characterized in that the preset operation set includes reflection operations, search operations, access operations, and encoding operations.

8. The priority queue-based iterative question-answering method according to claim 7, characterized in that step S3 further includes the following steps: S310, when performing a search operation, calculate the semantic similarity between the current search request and historical search records, and filter out the current search request when the semantic similarity exceeds a preset similarity threshold; S320, when the semantic similarity does not exceed the preset similarity threshold, input the current search request to the preset search rewriting module for optimization, wherein the optimization includes at least keyword optimization, synonym expansion, and expression conversion adapted to the specified retrieval algorithm; S330, take the optimized target search request as the new current search request and return to step S31; if the semantic similarity still does not exceed the preset similarity threshold, perform a network search based on the new current search request, and store the search results and the current search record in the shared task context.

9. A non-transitory computer-readable storage medium having stored therein at least one instruction or at least one program segment, characterized in that the at least one instruction or the at least one program segment is loaded and executed by a processor to implement the priority queue-based iterative question-answering method as described in any one of claims 1-8.

10. An electronic device, comprising a processor and the non-transitory computer-readable storage medium as described in claim 9.