An interactive guiding method and system of an intelligent question-answering system and a storage medium

By analyzing user expression patterns and rhythm preferences, and by combining intent confidence prediction with automatic generation of clarification path templates, the method addresses the low intent recognition accuracy of traditional intelligent question-answering systems in multi-round interactions, thereby improving the user interaction experience and dialogue fluency.

CN122064795B · Active · Publication Date: 2026-06-23 · SHAANXI YUNCHUANG NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHAANXI YUNCHUANG NETWORK TECH CO LTD
Filing Date
2026-04-22
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Traditional intelligent question-answering systems cannot achieve full-process dynamic tracking of user interaction status in multi-round interactions, making it difficult to accurately identify effective guidance opportunities, resulting in low intent recognition accuracy and poor human-computer interaction experience.

Method used

By analyzing users' expression patterns and rhythm preferences through natural language processing and long short-term memory networks, user response sequences and emotion fluctuation sequences are generated. Intent confidence is predicted by combining RBF kernel support vector machines, expression ambiguity markers and clarification path templates are automatically generated, and guidance strategies are optimized to improve the fluency of multi-turn dialogues.

Benefits of technology

It achieves dynamic trend prediction of user intent recognition, improves the accuracy of intent recognition and the fluency of multi-turn dialogue, and optimizes the human-computer interaction experience.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to the technical field of information processing, and discloses an interactive guiding method and system of an intelligent question-answering system and a storage medium. The method comprises the following steps: an intelligent question-answering system analyzes and processes expression mode data and rhythm preference data, generates a user reaction sequence and an emotional fluctuation sequence, judges a trend of user intention confidence evaluation, adjusts a dialogue guiding strategy, simultaneously generates an expression ambiguity label and a clarification path template, extracts question type classification statistical data of multi-round dialogue, determines an effective guiding opportunity, matches dialogue data of similar scenes, optimizes a hierarchical progressive strategy and a question boundary narrowing strategy of a guiding question, processes dialogue data of a current interaction round, judges a narrowing degree of the expression ambiguity label, generates a multi-round dialogue fluency index, and obtains a stable state of the user intention confidence evaluation. The application improves the intention recognition accuracy, ambiguity clarification efficiency and multi-round interaction fluency of the intelligent question-answering system.

Description

Technical Field

[0001] This application relates to the field of information processing technology, and in particular to an interactive guidance method, system and storage medium for an intelligent question-and-answer system.

Background Technology

[0002] Intelligent question-answering systems have been widely applied in various scenarios such as customer service, office work, and daily life services. However, traditional intelligent question-answering systems still have many technical shortcomings in multi-round interactive guidance. These systems often employ standardized dialogue guidance logic, ignoring users' personalized expression patterns, interaction rhythm preferences, and real-time emotional fluctuations. They cannot achieve full-process dynamic tracking of user interaction states, easily leading to mismatches between guidance and user needs. Furthermore, traditional systems primarily assess user intent using static numerical judgments, lacking the ability to dynamically predict the trend of intent confidence. When faced with ambiguous user expressions or knowledge gaps, they struggle to accurately identify effective guidance opportunities, often resulting in ineffective follow-up questions and chaotic guidance levels. This leads to inefficient narrowing of question boundaries and poor fluency in multi-round dialogues. In addition, traditional systems often use fixed templates for clarification strategies, failing to adaptively adjust guidance intensity based on changes in user willingness to cooperate. This easily triggers user resistance, ultimately resulting in low intent recognition accuracy and a poor human-computer interaction experience, failing to meet users' personalized, intelligent, and efficient interaction needs for intelligent question-answering systems.

Summary of the Invention

[0003] To address the aforementioned technical issues, this application provides an interactive guidance method, system, and storage medium for an intelligent question-answering system, which improves the accuracy of intent recognition, the efficiency of ambiguity clarification, and the fluency of multi-turn dialogues in the intelligent question-answering system.

[0004] Firstly, this application provides an interactive guidance method for an intelligent question-answering system, the method comprising:

[0005] The intelligent question-answering system extracts user expression pattern data and rhythm preference data from historical dialogue records, analyzes and processes the expression pattern data and rhythm preference data respectively, and generates user response sequence and emotion fluctuation sequence.

[0006] Based on the user response sequence and the emotion fluctuation sequence, the user's willingness to cooperate intensity classification result and knowledge blind spot labeling information are determined, and comprehensive impact data is obtained through comprehensive calculation. Based on the comprehensive impact data, the trend of user intention confidence assessment is judged, and the dialogue guidance strategy is adjusted according to the trend. At the same time, expression ambiguity markers and clarification path templates are generated.

[0007] Extract question type classification statistics from the clarification path template, calculate the control intensity change range, determine the effective guidance timing, match dialogue data of similar scenarios for the effective guidance timing, and optimize the hierarchical progression strategy and question boundary narrowing strategy of guidance questions based on the matching results.

[0008] The optimized hierarchical progression strategy for guiding questions and the narrowing strategy for question boundaries are used to process the dialogue data of the current interaction round, determine the degree of narrowing of the expression ambiguity markers, generate a multi-round dialogue fluency index, and obtain a stable state of user intent confidence assessment.

[0009] Secondly, this application provides an intelligent question-answering system, the system comprising:

[0010] The sequence extraction unit extracts user expression pattern data and rhythm preference data from historical dialogue records. It then analyzes and processes the expression pattern data and rhythm preference data to generate user response sequences and emotion fluctuation sequences.

[0011] The direction judgment unit is used to determine the user's willingness to cooperate intensity grading result and knowledge blind spot labeling information based on the user's response sequence and the emotional fluctuation sequence, and to perform comprehensive calculation to obtain comprehensive impact data. Based on the comprehensive impact data, it judges the direction trend of the user's intention confidence assessment, adjusts the dialogue guidance strategy according to the direction trend, and generates expression ambiguity markers and clarification path templates.

[0012] The matching unit is used to extract the question type classification statistics of the multi-turn dialogue from the clarification path template, calculate the control intensity change range, determine the effective guidance time, match dialogue data of similar scenarios for the effective guidance time, and optimize the hierarchical progression strategy and question boundary narrowing strategy of the guidance questions based on the matching results.

[0013] The interaction unit is used to process the dialogue data of the current interaction round using the optimized hierarchical progression strategy of the guiding questions and the question boundary narrowing strategy, determine the degree of narrowing of the expression ambiguity markers, generate a multi-round dialogue fluency index, and obtain a stable state of user intent confidence assessment.

[0014] A third aspect of this application provides a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the interactive guidance method of an intelligent question-and-answer system described above.

[0015] Compared with the prior art, the beneficial effects of the present invention are at least as follows:

[0016] First, by employing natural language processing, sliding window algorithms, and long short-term memory networks, the expression patterns and rhythm preferences of users' historical dialogues are structured and temporally processed to accurately generate user response sequences and emotion fluctuation sequences. This addresses the problem of traditional question-answering systems neglecting personalized user behavior and emotional characteristics, enabling full-process tracking of user interaction states. Next, based on the response sequences and emotion fluctuation sequences, user response speed and content relevance are quantified to obtain a grading of willingness to cooperate. Knowledge blind spots are labeled from emotion peaks, and a weighted calculation generates comprehensive impact data. A multi-dimensional feature vector is then constructed and input into an RBF kernel support vector machine and a trend prediction model to predict the trend of user intent confidence. This overcomes the limitations of analyzing a single numerical value, upgrading intent assessment from static judgment to dynamic trend prediction. Then, the guidance intensity is dynamically adjusted based on the confidence trend. Ambiguity markers and dynamic clarification path templates are automatically generated. K-means clustering and linear regression identify effective guidance opportunities. Matching similar scenario datasets optimizes the hierarchical progression strategy and boundary narrowing strategy for guidance questions, adapting to different user cooperation states. This significantly improves the efficiency of clarifying ambiguous expressions and avoids ineffective guidance and user interaction resistance. 
Finally, by calculating the degree of ambiguity narrowing and the multi-turn dialogue fluency index, the intention confidence segmentation evaluation model is adaptively updated, the stable state of intention confidence is locked and used as the threshold for subsequent strategy adjustment, forming a complete interactive guidance process from user feature extraction, intention analysis, strategy optimization to state closure, which significantly improves the intention recognition accuracy, multi-turn dialogue fluency and interaction adaptability of the intelligent question answering system, and optimizes the overall human-computer interaction experience.

Attached Figure Description

[0017] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is a flowchart of an interactive guidance method for an intelligent question-answering system according to an embodiment of this application;

[0019] Figure 2 is a schematic diagram illustrating the trend prediction of intent confidence in an embodiment of this application;

[0020] Figure 3 is a radar chart showing the multi-dimensional performance comparison of embodiments of this application;

[0021] Figure 4 is a schematic diagram of the structure of an interactive guidance system of an intelligent question-answering system according to an embodiment of this application.

Detailed Implementation

[0022] This application provides an interactive guidance method, system, and storage medium for an intelligent question-answering system. The terms "first," "second," "third," "fourth," etc. (if present) in the specification, claims, and accompanying drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments described herein can be implemented in a sequence other than that illustrated or described herein. Furthermore, the terms "comprising" or "having" and any variations thereof are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or devices.

[0023] For ease of understanding, the specific process of the embodiments of this application is described below. Referring to Figure 1, an embodiment of an interactive guidance method for an intelligent question-answering system in this application includes:

[0024] Step S1: The intelligent question-answering system extracts user expression pattern data and rhythm preference data from historical dialogue records, analyzes and processes the expression pattern data and rhythm preference data respectively, and generates user response sequence and emotion fluctuation sequence.

[0025] Extracting the user expression pattern data and rhythm preference data includes:

[0026] Natural language processing (NLP) techniques are used to perform word segmentation, part-of-speech tagging, and semantic role labeling on the historical dialogue records, and a feature vector library of user dialogue texts is constructed. Based on the feature vector library, user expression pattern data is extracted, including the user's high-frequency vocabulary, sentence structure types, expression redundancy features, frequency of use of professional terms, and patterns of occurrence of ambiguous words. Based on the historical dialogue records indexed by time sequence, rhythm preference data is extracted, including the character length distribution of a user's single expression, the response time interval between two adjacent rounds of dialogue, the difference in response time for different types of questions, and the time pattern of user-initiated dialogue. The rhythm preference data is smoothed using a sliding window algorithm to obtain optimized rhythm preference data.

[0027] Generating the user response sequence and emotion fluctuation sequence includes:

[0028] The user response sequence data is generated by serializing expression pattern data and rhythm preference data using a Long Short-Term Memory (LSTM) network. The LSTM network captures long-term dependencies in the sequence data through gating mechanisms such as the forget gate, input gate, and output gate. Based on the user response sequence data, the changes in the user's lexical sentiment values are tracked to generate emotion fluctuation sequence data. The changes in the user's response patterns in multi-turn dialogues are analyzed using the emotion fluctuation sequence data, and the user expression pattern database is updated based on these changes. The tracking results of the user response sequence and emotion fluctuation sequence are generated based on the updated user expression pattern database.

[0029] Specifically, historical dialogue records are text interaction data retained during past human-computer interactions between the intelligent question-and-answer system and the user, including basic information such as the system's questions, user responses, and interaction times. The intelligent question-and-answer system first uses natural language processing (NLP) technology to analyze the text features of the dialogue text in the historical dialogue records. NLP technology refers to the technology of using computers to analyze and process the form, sound, and meaning of human natural language. In this embodiment, three core processing steps are specifically applied: word segmentation, part-of-speech tagging, and semantic role labeling. Word segmentation involves dividing continuous dialogue text into word units with independent semantic meaning. Part-of-speech tagging involves labeling each segmented word unit with its corresponding part-of-speech category. Semantic role labeling involves identifying the semantic relationship between each component in the dialogue text and the core verb and labeling it with its corresponding role. Through the continuous processing of the above NLP technology, the intelligent question-and-answer system converts the unstructured dialogue text into structured text feature data. Then, it uses a vectorization mapping algorithm to convert this structured text feature data into a high-dimensional feature vector. The vectorization mapping algorithm is an algorithm that maps discrete text feature data into high-dimensional numerical vectors. In this embodiment, a word embedding (Word Embedding) algorithm is specifically used. This algorithm takes text feature data processed by word segmentation, part-of-speech tagging, and semantic role labeling as input and outputs a dense numerical vector of fixed dimension. 
Its working principle is based on a pre-trained language model, mapping each word unit, part-of-speech tag, and semantic role tag to a pre-defined high-dimensional vector space. This ensures that semantically similar features correspond to vectors that are close in distance within the vector space, thus transforming discrete text features into high-dimensional feature vectors that can be quantified and computed by a computer. The intelligent question-answering system integrates and stores the high-dimensional feature vectors corresponding to all user dialogue texts to construct a feature vector library for user dialogue texts. This feature vector library refers to the dataset formed by mapping text feature data processed by word segmentation, part-of-speech tagging, and semantic role tagging into feature vectors in a high-dimensional space using a vectorization mapping algorithm. It provides structured and quantifiable feature input for subsequent extraction of user expression pattern data.

[0030] Expression pattern data is a set of feature data that reflects users' personalized language expression habits. Specifically, it includes users' frequently used vocabulary, sentence structure types, expression redundancy characteristics, frequency of professional terminology use, and patterns of ambiguous vocabulary occurrence. The frequently used vocabulary refers to word combinations that appear more frequently than a preset frequency threshold in historical dialogues. Sentence structure types refer to the sentence structures users commonly use in their expressions. Expression redundancy characteristics refer to the proportion of irrelevant semantic content in users' expressions. Professional terminology usage frequency refers to the number of times users use professional terms related to their field in dialogues. Patterns of ambiguous vocabulary occurrence refer to the frequency and contextual characteristics with which users use words such as "probably" and "possibly," which lack clear semantic focus. To quantify expression redundancy, the intelligent question-answering system first performs word segmentation, part-of-speech tagging, and semantic role labeling on the user's text, filtering out stop words, modal particles, and punctuation marks without actual semantic meaning to obtain a valid vocabulary set. Then, from the valid vocabulary set, nouns, verbs, and core semantic role words that carry core semantic meaning are identified and marked as core semantic words. Modifiers, repetitive words, and irrelevant filler words in the valid vocabulary set are recorded as redundant words. The number of redundant words and the total number of valid words in the current expression are counted separately. The original redundancy value is calculated by dividing the number of redundant words by the total number of valid words, and this value is normalized to the 0-1 range. The normalized value is the quantified value of the expression redundancy feature. 
The closer the value is to 1, the higher the proportion of irrelevant semantic content and the higher the degree of redundancy in the user's expression; the closer the value is to 0, the more concise the user's expression and the more prominent the core intent. These feature dimensions together constitute the core content of user expression pattern data, which can accurately reflect the user's language expression habits.
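
The redundancy calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `expression_redundancy`, the sample tokens, and the set of redundant words are all hypothetical, and clamping to [0, 1] is assumed as the normalization step because the patent does not specify one.

```python
def expression_redundancy(valid_words, redundant_words):
    """Ratio of redundant words to all valid words, clamped to [0, 1].

    valid_words: tokens left after stop-word and punctuation filtering.
    redundant_words: tokens already classified as modifiers, repetitions,
    or fillers (how that classification is done is outside this sketch).
    """
    if not valid_words:
        return 0.0  # assumption: an empty expression has no redundancy
    redundant_count = sum(1 for word in valid_words if word in redundant_words)
    ratio = redundant_count / len(valid_words)
    return min(1.0, max(0.0, ratio))


# Hypothetical example data.
valid = ["really", "reset", "password", "basically", "account", "really"]
redundant = {"really", "basically"}
score = expression_redundancy(valid, redundant)
```

Here three of the six valid words are redundant, so `score` is 0.5, a moderately redundant expression on the 0-1 scale.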

[0031] Temporal indexing refers to the process of associating each user statement in a historical dialogue record with corresponding time and interaction node information. Historical dialogue records indexed temporally will possess temporal dimension characteristics. Rhythm preference data is a set of feature data that reflects user interaction response habits. Specifically, it includes the character length distribution of a user's single statement, the response time interval between two adjacent dialogue rounds, the difference in response time for different types of questions, and the time pattern of users initiating dialogues. The character length distribution of a user's single statement refers to the distribution characteristics of the number of text characters in each user's reply; the response time interval between two adjacent dialogue rounds refers to the time interval between the user receiving a question and responding; the difference in response time for different types of questions refers to the difference in response time for different types of questions posed by the system, such as confirmatory and exploratory questions; and the time pattern of users initiating dialogues refers to the time distribution characteristics of users actively interacting with the system. Since the rhythm preference data mentioned above may contain abnormal time data caused by non-user subjective factors such as network latency, the intelligent question answering system uses a sliding window algorithm to smooth the extracted rhythm preference data to ensure data validity. The sliding window algorithm is a time series data processing algorithm that traverses the time series data segment by segment by setting a fixed-length window and calculates the mean of the data within the window. 
The input of this algorithm in this step is the original extracted rhythm preference data containing abnormal data, and the output is the optimized rhythm preference data after eliminating the interference of abnormal data. Its working principle is to set a window length that is adapted to the interaction duration, input the original rhythm preference data into the window in chronological order, calculate the mean of the time feature data in each window, replace the original data in the window with the mean data, and complete the smoothing process of all rhythm preference data by traversing window by window, and finally obtain the optimized rhythm preference data.
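
As a rough illustration of the smoothing step, the sketch below replaces each value with the mean of a trailing fixed-length window. The patent does not say whether windows overlap or how the window length is chosen, so the trailing-window reading, the function name `sliding_window_smooth`, and the sample intervals are assumptions.

```python
def sliding_window_smooth(values, window=3):
    """Replace each value with the mean of the current trailing window.

    The window covers the current value and up to `window - 1` values
    before it, so early positions use a shorter window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical response intervals in seconds; 30.0 stands in for a
# network-latency outlier.
intervals = [4.0, 5.0, 30.0, 5.0, 4.0]
smoothed = sliding_window_smooth(intervals, window=3)
```

The outlier is damped rather than removed: the largest smoothed value is well below 30.0, at the cost of raising its neighbors.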

[0032] Long Short-Term Memory (LSTM) networks are an improved type of recurrent neural network that effectively solves the gradient vanishing problem in recurrent neural networks when processing long sequence data. They are suitable for temporal feature analysis in multi-turn dialogues. In this embodiment, the input to the network is a sequence of user behavior features composed of user expression pattern data and optimized rhythm preference data after normalization. The output is user response sequence data. Its working principle involves a three-level gating mechanism (forget gate, input gate, and output gate) to extract features from the sequence data. The forget gate filters and discards irrelevant historical features from the sequence data, the input gate inputs new sequence features into the network's cell state, and the output gate generates the current output features based on the cell state. This three-level gating mechanism works synergistically, enabling the LSTM network to accurately capture long-term dependencies in the user behavior feature sequence, thereby generating user response sequence data that reflects the user's behavioral response patterns in multi-turn dialogues.
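
To make the gating mechanism concrete, the following is a minimal single-unit LSTM step in pure Python using the standard LSTM equations. It is a sketch only: the hidden size of 1, the weight values, and the two-dimensional per-round feature vectors are illustrative assumptions, and a real system would use a trained network from a deep learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell.

    params maps each gate name to (input_weights, recurrent_weight, bias).
    """
    def pre_activation(gate):
        w, u, b = params[gate]
        return dot(w, x) + u * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new information to admit
    g = math.tanh(pre_activation("candidate"))  # candidate cell content
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical, untrained weights for illustration only.
params = {
    "forget": ([0.5, -0.3], 0.1, 0.0),
    "input": ([0.4, 0.2], -0.1, 0.0),
    "candidate": ([0.3, 0.7], 0.2, 0.0),
    "output": ([-0.2, 0.6], 0.1, 0.0),
}

# Each element stands for one round's normalized behavior features.
sequence = [[0.2, 0.1], [0.8, 0.4], [0.5, 0.9]]
h, c = 0.0, 0.0
responses = []
for x in sequence:
    h, c = lstm_step(x, h, c, params)
    responses.append(h)
```

The list `responses` plays the role of the user response sequence: one output per dialogue round, each depending on all earlier rounds through the cell state `c`.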

[0033] Lexical sentiment value represents the positive, negative, or neutral emotional attributes of words in a user's expression, along with their corresponding scores. The intelligent question-answering system uses pre-defined sentiment mapping rules to map different combinations of response speed and lexical sentiment values to corresponding emotional states. These pre-defined rules are quantitative matching rules pre-set in the intelligent question-answering system that map the combination of user response speed characteristics and lexical sentiment value characteristics to specific emotional states. These rules are constructed based on the correlation between user behavior and emotions in multi-turn dialogue interaction scenarios. First, response speed and lexical sentiment values are quantified and graded separately. Response speed is divided into three levels (fast, moderate, and slow) based on the time interval between the user receiving a question from the system and responding. Lexical sentiment values are divided into three levels (positive, negative, and neutral) based on the results of a sentiment dictionary and a text sentiment analysis model, with each level corresponding to a defined numerical range. A one-to-one correspondence is then established between these level combinations and emotional states. For example, a fast response speed with a positive lexical sentiment value corresponds to a pleasant emotion; a slow response speed with a negative lexical sentiment value corresponds to an irritable emotion; and a moderate response speed with a neutral lexical sentiment value corresponds to a calm emotion. Matching thresholds are set for each combination to ensure the accuracy of the emotional state mapping. 
Based on these preset emotional mapping rules, the intelligent question-answering system combines and matches the user's response speed level and lexical sentiment value extracted from each round of dialogue to obtain the specific emotional state corresponding to each round. The system then strings together the emotional states of each round in chronological order to generate an emotional fluctuation sequence data that reflects the user's emotional changes across multiple rounds of dialogue. After generating the emotion fluctuation sequence data, the intelligent question-answering system analyzes the changes in the user's reaction pattern in multiple rounds of dialogue using this data. Specifically, it identifies the emotion peaks, troughs, and trends in the emotion fluctuation sequence data, and combines this with the dialogue content of the corresponding rounds to determine the changes in the user's reaction pattern caused by factors such as dialogue content and interaction methods. For example, when the emotion fluctuation sequence data shows a negative emotion peak and the corresponding round is when the system poses a complex professional question, it is determined that the user's reaction pattern is changing towards impatience.
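
The rule-based mapping from graded response speed and graded lexical sentiment to an emotional state might look like the sketch below. Only the three combinations given as examples in the text are encoded; the grading thresholds (`fast_max`, `slow_min`, `margin`), the function names, and the `"unmapped"` fallback for combinations the text does not define are assumptions made for illustration.

```python
def grade_speed(seconds, fast_max=5.0, slow_min=20.0):
    """Grade a response interval as fast, moderate, or slow."""
    if seconds <= fast_max:
        return "fast"
    if seconds >= slow_min:
        return "slow"
    return "moderate"


def grade_sentiment(score, margin=0.2):
    """Grade a sentiment score in [-1, 1] as positive, negative, or neutral."""
    if score > margin:
        return "positive"
    if score < -margin:
        return "negative"
    return "neutral"


# Only the combinations named in the description are listed here.
EMOTION_RULES = {
    ("fast", "positive"): "pleasant",
    ("slow", "negative"): "irritable",
    ("moderate", "neutral"): "calm",
}


def emotion_sequence(rounds):
    """Map (response_seconds, sentiment_score) pairs to emotional states."""
    return [
        EMOTION_RULES.get((grade_speed(t), grade_sentiment(s)), "unmapped")
        for t, s in rounds
    ]


# Hypothetical per-round measurements, in chronological order.
rounds = [(3.0, 0.6), (10.0, 0.0), (25.0, -0.7), (3.0, -0.5)]
states = emotion_sequence(rounds)
```

Stringing the per-round states together in chronological order yields the emotion fluctuation sequence; a full rule table would replace the `"unmapped"` fallback with the remaining six combinations.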

[0034] The user expression pattern database stores personalized expression and response pattern characteristics of users. It is dynamically updated based on changes in user interaction behavior to ensure data timeliness and accuracy. After the database update, the intelligent question-answering system integrates and corrects the previously generated user response sequence data and emotion fluctuation sequence data based on the updated database. This generates tracking results for user response sequences and emotion fluctuation sequences. These tracking results are a set of sequence data that integrates users' historical behavioral characteristics, real-time response characteristics, and emotion change characteristics. They comprehensively reflect users' behavior and emotional state in multi-turn dialogues, providing core sequence feature data support for subsequent steps in the intelligent question-answering system, such as grading the intensity of user cooperation willingness and assessing intent confidence.

[0035] Step S2: Based on the user's response sequence and emotional fluctuation sequence, determine the user's willingness to cooperate intensity grading results and knowledge blind spot labeling information, and perform comprehensive calculations to obtain comprehensive impact data. Based on the comprehensive impact data, determine the trend of the user's intention confidence assessment, adjust the dialogue guidance strategy according to the trend, and generate expression ambiguity markers and clarification path templates.

[0036] Obtaining the comprehensive impact data includes:

[0037] For user response sequences and emotion fluctuation sequences, the initial cooperation willingness intensity value is obtained by quantifying user response speed and content relevance. This initial cooperation willingness intensity value is mapped to a preset grading scale to generate cooperation willingness intensity grading data for the current round, which is then used as the cooperation willingness intensity grading result. The cooperation willingness intensity grading data is used to analyze changes in user dialogue participation before and after rounds. Simultaneously, emotion peaks with emotion score standard deviations exceeding a preset standard threshold are detected from the emotion fluctuation sequence. The dialogue points corresponding to these emotion peaks are marked as potential knowledge blind spots and associated with the dialogue context to form dynamic knowledge blind spot data, which is then used as knowledge blind spot annotation information. The number of knowledge blind spot annotation information and the value of the cooperation willingness intensity grading data are weighted and summed according to preset weights to generate comprehensive impact data.
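
A possible reading of the blind-spot detection and weighted summation is sketched below. The patent's phrase about emotion score standard deviations exceeding a threshold is interpreted here as a score deviating from the sequence mean by more than `k` standard deviations; that interpretation, the value of `k`, the weights `w_blind` and `w_coop`, and the sample data are all assumptions.

```python
from statistics import mean, pstdev


def blind_spot_rounds(emotion_scores, k=1.5):
    """Return indices of rounds whose score deviates from the mean by more
    than k population standard deviations (treated as emotion peaks)."""
    mu = mean(emotion_scores)
    sigma = pstdev(emotion_scores)
    if sigma == 0:
        return []
    return [i for i, s in enumerate(emotion_scores) if abs(s - mu) > k * sigma]


def comprehensive_impact(num_blind_spots, cooperation_grade,
                         w_blind=0.4, w_coop=0.6):
    """Weighted sum of the blind-spot count and the cooperation grade."""
    return w_blind * num_blind_spots + w_coop * cooperation_grade


# Hypothetical emotion scores per round; round 3 is a sharp negative peak.
scores = [0.1, 0.0, 0.2, -0.9, 0.1, 0.0]
spots = blind_spot_rounds(scores)
impact = comprehensive_impact(len(spots), cooperation_grade=2)
```

With these inputs only round 3 is flagged, so the impact value is 0.4 × 1 + 0.6 × 2 = 1.6, which would then be compared against the preset impact threshold.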

[0038] Determining the trend of user intent confidence assessment includes:

[0039] If the overall impact data exceeds the preset impact threshold, the overall impact data is combined with the cross-round change value of the user cooperation intention intensity level and the change rate of the number of knowledge blind spot annotation information to construct a multi-dimensional feature vector. After normalization preprocessing, it is input into a support vector machine model with radial basis function as kernel function for classification processing. The evolution law data of the user cooperation intention intensity level is determined by the classification processing results. The non-linear change characteristics of the evolution law data are identified, and the non-linear change characteristics are input into a pre-established trend prediction model to generate the probability distribution of the next round of user intention confidence falling into the high, medium and low confidence intervals as the prediction result of the segmented evaluation. The trend of user intention confidence evaluation is judged based on the prediction results.
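
The preprocessing and kernel evaluation behind an RBF-kernel support vector machine can be illustrated as follows. This sketch covers only min-max normalization and the kernel function K(x, y) = exp(-γ‖x − y‖²); it does not train or apply a classifier, and the choice of min-max scaling, the `gamma` value, and the sample feature rows (impact, cross-round grade change, blind-spot change rate) are assumptions.

```python
import math


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical feature vectors, one per dialogue round:
# [comprehensive impact, grade change across rounds, blind-spot change rate]
features = [
    [1.6, 1.0, 0.5],
    [0.8, 0.0, 0.0],
    [2.4, -1.0, 1.0],
]
normed = min_max_normalize(features)
k_same = rbf_kernel(normed[0], normed[0])
k_far = rbf_kernel(normed[1], normed[2])
```

The kernel value is 1.0 for identical vectors and decays toward 0 as vectors move apart, which is what lets the classifier separate rounds whose cooperation patterns differ non-linearly.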

[0040] The generation of ambiguity markers and clarification path templates includes:

[0041] If the trend of user intent confidence assessment shows that the granularity of user expression decomposition is increasing, the guidance strength of the multi-round intent clarification sequence is enhanced through a real-time adjustment module; adjusted real-time ambiguity marker data is generated based on the enhanced guidance strength, and this data is used as ambiguity markers; dynamic switching script templates for clarification paths are generated based on the ambiguity markers, and these templates are used as clarification path templates; the distribution of ambiguity in multi-round dialogues is analyzed using the clarification path templates, the clarification path template database is updated based on the distribution data, and the updated database is used to optimize the guidance strategy in the next round of dialogue.

[0042] Specifically, regarding user response speed, the actual response time interval in each round of dialogue, measured from the moment the user receives the question instruction from the intelligent question-answering system to the moment the user sends a reply, is extracted from the user response sequence. This actual response time interval is compared with a preset response time benchmark interval, which is set based on the typical user response time in multi-round dialogue interaction scenarios. The upper limit of this benchmark interval is used as the quantization threshold. If the actual response time interval is less than or equal to the quantization threshold, it is normalized by dividing it by the quantization threshold. If the actual response time interval is greater than the quantization threshold, its quantized value is directly set to 1. In this way, the user response speed is converted into a value between 0 and 1: the closer the value is to 0, the faster the user response; the closer it is to 1, the slower the user response. Regarding content relevance, the similarity between the user's reply content and the question topic of the intelligent question-answering system in each round of dialogue is calculated using a text similarity algorithm. In this embodiment, a cosine similarity algorithm is used. This algorithm takes the feature vector of the question topic and the feature vector of the user's reply content as input and computes the cosine of the angle between the two feature vectors. The calculation formula is: $C = \frac{\mathbf{A} \cdot \mathbf{B}}{\lVert \mathbf{A} \rVert \, \lVert \mathbf{B} \rVert}$, where $C$ is the quantified value of content relevance, $\mathbf{A}$ is the feature vector of the question topic, and $\mathbf{B}$ is the feature vector of the user's reply content. The similarity result is then directly mapped to a value between 0 and 1. The closer the value is to 1, the higher the match between the user's reply and the question topic and the stronger the content relevance; the closer the value is to 0, the lower the match and the weaker the content relevance. Through these preset quantification rules, user response speed and content relevance are both standardized to values between 0 and 1, which provides a unified quantitative indicator for calculating the initial cooperation willingness. The intelligent question-answering system weights and sums these two values according to a preset ratio to obtain an initial cooperation willingness intensity value reflecting the user's participation in the current round of dialogue. Specifically, the quantified response speed value $V$ is inverted before it is combined with the quantified content relevance value $C$: since a larger $V$ indicates a slower response and a lower willingness to cooperate, $1 - V$ is used. The calculation formula is: $W = \alpha (1 - V) + \beta C$, where $W$ is the initial cooperation willingness intensity value, $\alpha$ and $\beta$ are the preset proportional weights, $V$ is the quantified value of response speed, and $C$ is the quantified value of content relevance. The preset grading scale refers to a pre-defined scale of numerical ranges in the intelligent question-answering system used to classify the intensity of the user's cooperation willingness. In this embodiment, the scale is divided into three levels: low, medium, and high. Each level corresponds to a fixed range of initial cooperation willingness intensity values. The intelligent question-answering system generates the cooperation willingness intensity grading data for the current round by matching the initial cooperation willingness intensity value to the corresponding numerical range, and directly uses this grading data as the user's cooperation willingness intensity grading result.
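
The quantification rules in this paragraph can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function names, the equal default weights, and the grade boundaries (1/3 and 2/3) are assumptions chosen for the example, since the text only states that the weights and ranges are preset.

```python
import math
from typing import Sequence


def quantify_response_speed(actual_seconds: float, threshold_seconds: float) -> float:
    """Map a response time to [0, 1]; 0 is fastest, 1 is slowest (capped)."""
    if actual_seconds > threshold_seconds:
        return 1.0
    return actual_seconds / threshold_seconds


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between the topic vector and the reply vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as unrelated
    return dot / (norm_a * norm_b)


def initial_willingness(v: float, c: float, alpha: float = 0.5, beta: float = 0.5) -> float:
    """W = alpha * (1 - V) + beta * C; the weights here are hypothetical."""
    return alpha * (1.0 - v) + beta * c


def grade_willingness(w: float, low_upper: float = 1 / 3, medium_upper: float = 2 / 3) -> str:
    """Map W onto a three-level scale; the boundaries are hypothetical."""
    if w < low_upper:
        return "low"
    if w < medium_upper:
        return "medium"
    return "high"


v = quantify_response_speed(5.0, 20.0)           # fast reply
c = cosine_similarity([1.0, 0.0], [1.0, 0.0])    # reply exactly on topic
w = initial_willingness(v, c)
grade = grade_willingness(w)
```

With a 5-second reply against a 20-second threshold and a fully on-topic reply, the sketch yields a high grade; a reply slower than the threshold would saturate the speed term at 1 and contribute nothing to the willingness value.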

[0043] By comparing the intensity of cooperation willingness data between the current round and several previous rounds, the system determines whether user engagement in the dialogue shows an upward, stable, or downward trend. Then, it detects dynamic data on users' knowledge blind spots from the emotion fluctuation sequence. Specifically, the system first calculates the standard deviation of the emotion score in the emotion fluctuation sequence. The emotion score is the quantified score corresponding to each emotional state in the sequence, and the standard deviation is a statistic reflecting the dispersion of the emotion score. The intelligent question-answering system compares the calculated standard deviation with a preset standard threshold. Emotion score nodes with standard deviations exceeding the preset threshold are marked as emotion peaks. These peaks represent dialogue points where the user's emotions fluctuate drastically across multiple rounds of dialogue, usually caused by misunderstandings or knowledge gaps in the dialogue content. The intelligent question-answering system marks the dialogue points corresponding to these emotion peaks as potential knowledge blind spots and associates these potential knowledge blind spots with the corresponding dialogue context, forming dynamic data on knowledge blind spots that reflects the location of user knowledge gaps and related dialogue scenarios. This dynamic data is used as knowledge blind spot annotation information, and the number of annotations is the number of potential knowledge blind spots detected in the current round.
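
One possible reading of the peak detection described above is sketched below. The text does not say over which scores the standard deviation is computed, so this sketch assumes a sliding window ending at each dialogue round; the window size, the threshold value, and the function names are likewise assumptions.

```python
from statistics import pstdev


def detect_emotion_peaks(scores, window=3, std_threshold=0.2):
    """Return round indices whose trailing window has a standard
    deviation above the threshold (sliding window is an assumption)."""
    peaks = []
    for i in range(window - 1, len(scores)):
        recent = scores[i - window + 1 : i + 1]
        if pstdev(recent) > std_threshold:
            peaks.append(i)
    return peaks


def mark_blind_spots(peaks, turns):
    """Associate each peak with its dialogue context."""
    return [{"round": i, "context": turns[i]} for i in peaks]


scores = [0.5, 0.5, 0.5, 0.9, 0.1, 0.5, 0.5, 0.5]
turns = [f"turn-{i}" for i in range(len(scores))]  # placeholder contexts
peaks = detect_emotion_peaks(scores)
blind_spots = mark_blind_spots(peaks, turns)
annotation_count = len(blind_spots)
```

In this example the sharp swing from 0.9 to 0.1 pushes the windowed standard deviation over the threshold at two consecutive rounds, which become the annotated blind spots.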

[0044] Preset weights refer to the weight values pre-defined in the intelligent question-answering system for allocating the proportions of the cooperation willingness intensity grading data and the number of knowledge blind spot annotations in the calculation of the comprehensive impact data. The weight coefficient of the number of knowledge blind spot annotations is set based on the severity of the knowledge blind spot, and this weight coefficient is higher than that of the cooperation willingness intensity grading data, in order to highlight the influence of knowledge blind spots on the user's intent expression. The intelligent question-answering system combines the two through a weighted summation formula to obtain comprehensive impact data that reflects the degree to which the user's cooperation willingness and knowledge blind spots affect intent recognition. Specifically, the number of knowledge blind spot annotations is first normalized to the 0-1 range using min-max normalization, and is then weighted and summed with the cooperation willingness intensity grading data according to the preset weights to obtain the comprehensive impact data.
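
As a rough illustration of this weighted summation, the sketch below normalizes the annotation count and combines it with a numeric grade value. The numeric encoding of the three grades and the weights 0.6 and 0.4 are assumptions; the text only requires that the blind spot weight be the larger of the two.

```python
def min_max(value: float, lo: float, hi: float) -> float:
    """Min-max normalization to [0, 1]; a degenerate range maps to 0."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


# Hypothetical numeric encoding of the three willingness grades.
GRADE_VALUE = {"low": 0.0, "medium": 0.5, "high": 1.0}


def comprehensive_impact(blind_spot_count, count_min, count_max, grade,
                         w_blind=0.6, w_grade=0.4):
    """Weighted sum of the normalized annotation count and the grade value.
    w_blind > w_grade, as the text requires; the exact values are assumed."""
    normalized = min_max(blind_spot_count, count_min, count_max)
    return w_blind * normalized + w_grade * GRADE_VALUE[grade]


impact = comprehensive_impact(2, 0, 4, "high")
```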

[0045] Specifically, the preset impact threshold is a numerical threshold pre-set in the intelligent question-answering system to determine whether the comprehensive impact data needs to be classified. If the comprehensive impact data does not exceed the threshold, the trend of the user intent confidence assessment is judged directly from the numerical change of the comprehensive impact data. The cross-round change value of the user cooperation willingness intensity grading is the difference between the cooperation willingness intensity grading data of the current round and that of the previous round. The change rate of the number of knowledge blind spot annotations is the ratio of the number of knowledge blind spots in the current round to the number in the previous round. The multi-dimensional feature vector then undergoes normalization preprocessing, a standardization method that maps the value of each dimension of the feature vector to a value between 0 and 1. Its purpose is to eliminate scale differences between the dimensions of the feature vector and to improve the accuracy of the subsequent classification. In this embodiment, the min-max normalization algorithm is used for the normalization preprocessing. This algorithm takes the multi-dimensional feature vector constructed from the comprehensive impact data as input and outputs the normalized feature vector.
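
The construction and normalization of the feature vector might look like the sketch below. The per-dimension bounds, the clamping to [0, 1], and the handling of a previous round with zero blind spots are assumptions that the text does not address.

```python
def build_feature_vector(impact, grade_now, grade_prev, blind_now, blind_prev):
    """[comprehensive impact, cross-round grade change, blind spot change rate]."""
    grade_delta = grade_now - grade_prev
    # Assumption: with no blind spots in the previous round the ratio is
    # undefined, so the current count is used instead.
    change_rate = blind_now / blind_prev if blind_prev else float(blind_now)
    return [impact, grade_delta, change_rate]


def normalize(vector, lows, highs):
    """Per-dimension min-max normalization, clamped to [0, 1]."""
    out = []
    for v, lo, hi in zip(vector, lows, highs):
        if hi == lo:
            out.append(0.0)
        else:
            out.append(min(1.0, max(0.0, (v - lo) / (hi - lo))))
    return out


raw = build_feature_vector(0.7, 1.0, 0.5, 3, 2)
# Hypothetical bounds for each dimension, e.g. taken from historical data.
features = normalize(raw, lows=[0.0, -1.0, 0.0], highs=[1.0, 1.0, 3.0])
```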

[0046] Support Vector Machine (SVM) is a supervised learning model based on statistical learning theory. Its core function is to perform binary or multi-class classification of data by finding the optimal classification hyperplane. In this embodiment, the model uses the radial basis function (RBF) as the kernel function. The radial basis function is a kernel function with strong locality, which can map nonlinearly separable data in low-dimensional space to high-dimensional space to achieve linear separability of data. The input of the SVM model is the normalized comprehensive influence data feature vector. The training set of the model is the comprehensive influence data feature vector set labeled with the evolution of cooperation intention in historical dialogues. The output is the evolution law data of the intensity of user cooperation intention. The working principle of the model is to map the input low-dimensional feature vector to the high-dimensional feature space through the radial basis function, find the maximum margin hyperplane in the high-dimensional feature space that can separate data with different evolution types of cooperation intention, and use the hyperplane to classify the input comprehensive influence data feature vector to obtain the classification result of whether the intensity of user cooperation intention is evolving positively or negatively. Based on the classification result, evolution law data that reflects the change law of user cooperation intention intensity is generated. Evolutionary pattern data refers to a structured dataset that integrates the classification results of user willingness to cooperate in different rounds, the trend of change across rounds, and the rate of change. It covers the evolution direction of willingness to cooperate in different rounds, the difference in changes between adjacent rounds, the number of rounds of change, and the fluctuation characteristics of the rate of change. 
It quantitatively and intuitively reflects the trajectory of the user's willingness to cooperate across dialogue rounds, from previous rounds to the current round, as well as the expected evolution from the current round to subsequent rounds. The evolution law data is also associated with the knowledge blind spot annotation information and the emotion fluctuation characteristics present when each change occurred, binding changes in the user's willingness to cooperate to the factors in the dialogue scenario that influence them.
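
To make the role of the radial basis function concrete, the sketch below evaluates the decision function of an already trained two-class kernel SVM. It does not perform training, which would normally be delegated to a machine learning library; the support vectors, dual coefficients, bias, and gamma are made-up values for illustration only.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Signed score of a trained kernel SVM for feature vector x."""
    score = sum(coef * rbf_kernel(sv, x, gamma)
                for sv, coef in zip(support_vectors, dual_coefs))
    return score + bias


def classify_evolution(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """'positive' or 'negative' evolution of cooperation willingness."""
    score = svm_decision(x, support_vectors, dual_coefs, bias, gamma)
    return "positive" if score >= 0 else "negative"


# Hypothetical model parameters (not taken from the embodiment).
SUPPORT_VECTORS = [[0.8, 0.7, 0.6], [0.2, 0.3, 0.4]]
DUAL_COEFS = [1.0, -1.0]
BIAS = 0.0

label_a = classify_evolution([0.7, 0.75, 0.5], SUPPORT_VECTORS, DUAL_COEFS, BIAS)
label_b = classify_evolution([0.2, 0.3, 0.4], SUPPORT_VECTORS, DUAL_COEFS, BIAS)
```

Because the kernel value decays with squared distance, each input is dominated by the support vectors closest to it, which is the locality property the paragraph attributes to the radial basis function.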

[0047] Nonlinear change characteristics refer to the nonlinear changes present in the evolution law data, specifically including the location of abrupt change points in the rate of change, the fluctuation amplitude, and the persistence of the direction of change. These characteristics reflect the complex changing trend of the intensity of the user's cooperation willingness and provide refined feature input for the subsequent judgment of the intent confidence assessment trend. The trend prediction model is a hybrid regression model based on a Long Short-Term Memory (LSTM) network and a fully connected layer. The LSTM network captures long-term temporal dependencies in the evolution law data, while the fully connected layer fuses and maps the hidden states output by the LSTM network together with the nonlinear change characteristics. The input of the trend prediction model is the nonlinear change characteristics in the evolution law data, and the training set of the model is a set of nonlinear change characteristics from historical dialogues labeled with intent confidence intervals. The output is the probability distribution of the user intent confidence falling into the high, medium, and low confidence intervals in the next round. The model works by extracting the temporal features of the nonlinear change characteristics through the LSTM network to obtain a hidden state vector that reflects the pattern of feature change, and then mapping the hidden state vector through the fully connected layer to the probability values of the three confidence intervals (high, medium, and low), which form the prediction result. The trend of the user intent confidence assessment is determined based on the prediction result. 
The specific judgment rules are as follows: if the probability value of the low confidence interval exceeds the preset probability threshold, the trend of user intent confidence assessment is judged to be a decreasing confidence trend; if the probability value of the high confidence interval is dominant, the trend is judged to be an increasing confidence trend; if the probability distribution of the three confidence intervals is relatively even, the trend is judged to be a vague or fluctuating confidence trend. This trend can accurately predict the clarity of the user's intent expression in the next round, providing a decision-making basis for adjusting the subsequent dialogue guidance strategy.
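
The judgment rules above can be written as a small decision function. The probability threshold and the margin used to decide that the high interval is "dominant" are assumed values, and the case where the medium interval dominates is not covered by the text, so this sketch folds it into the fluctuating case.

```python
def judge_trend(p_high, p_medium, p_low, low_threshold=0.5, dominance_margin=0.15):
    """Map a three-interval probability distribution to a confidence trend.

    low_threshold and dominance_margin are hypothetical parameters.
    """
    if p_low > low_threshold:
        return "decreasing"
    if p_high - max(p_medium, p_low) >= dominance_margin:
        return "increasing"
    # Assumption: anything else, including a near-even split or a dominant
    # medium interval, is treated as vague or fluctuating.
    return "fluctuating"


trend_clear = judge_trend(0.7, 0.2, 0.1)
trend_drop = judge_trend(0.1, 0.3, 0.6)
trend_even = judge_trend(0.35, 0.33, 0.32)
```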

[0048] For example, Figure 2 is a diagram illustrating the trend prediction of intent confidence. The left subplot of Figure 2 shows the predicted probability distribution of the user's intent confidence falling into the three confidence intervals: high, medium, and low. This prediction result indicates that, based on the current evolution law and nonlinear change characteristics of the user's willingness to cooperate, the clarity of the user's intent expression in the next round is likely to be at a high confidence level. The right subplot shows the actual trend of user intent confidence in a multi-round dialogue together with the model's prediction results. The trend prediction provides data support for the system to dynamically adjust the guidance strategy, enabling the system to narrow the question boundary in a timely manner when the user's intent is becoming clear, and to strengthen clarification guidance when the intent is ambiguous.

[0049] The increasing granularity of fuzzy expression decomposition indicates that the user's intention expression in the next round is expected to have more ambiguities, requiring more detailed decomposition and clarification to accurately identify the user's intention. If this trend does not occur, the original dialogue guidance strategy remains unchanged. If this trend is observed, the guidance intensity of the multi-round intention clarification sequence is enhanced through the real-time adjustment module in the intelligent question-answering system. The real-time adjustment module is a functional module in the intelligent question-answering system used to dynamically adjust the dialogue guidance intensity. This module presets multiple guidance intensity levels, each corresponding to different question density, follow-up question depth, and confirmation frequency. The enhancement of guidance intensity is specifically manifested in increasing the guidance intensity level, increasing the number of follow-up questions in a single round of dialogue, and increasing the frequency of intention confirmation, so as to achieve refined clarification of the user's fuzzy expression.

[0050] Real-time ambiguity tagging data refers to data generated by identifying and tagging ambiguities in a user's current statement. Ambiguities include lexical ambiguity, referential ambiguity, omission ambiguity, and structural ambiguity. Enhancing guidance intensity will simultaneously lower the sensitivity threshold for ambiguity identification, enabling the system to more accurately and comprehensively identify potential ambiguities in user statements. The intelligent question-answering system will directly use the generated real-time ambiguity tagging data as ambiguity tags, providing core tagging basis for the subsequent generation of clarification path templates. Subsequently, the intelligent question-answering system generates a clarification path dynamically switching script template based on the expression ambiguity marker. Specifically, the ambiguous points in the expression ambiguity marker are sorted according to confidence scores, and several core ambiguities with the highest confidence scores are selected. Basic clarification scripts matching the core ambiguity point type and dialogue scenario are retrieved from the intelligent question-answering system's script template library. Then, based on the enhanced guidance intensity and historical dialogue context, the basic clarification scripts are dynamically assembled in a serial or parallel manner to form a clarification path dynamically switching script template that can gradually clarify different ambiguities. This script template is then directly used as the clarification path template. This clarification path template can dynamically switch clarification paths according to the user's response content, achieving efficient clarification of the user's expression ambiguity.
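
The selection and assembly steps in this paragraph could be sketched as below. Only serial assembly is shown. The marker fields, the script wording, and the library layout are all hypothetical; the text specifies only that markers are ranked by confidence and that matching base scripts are retrieved from a template library.

```python
from dataclasses import dataclass


@dataclass
class AmbiguityMarker:
    kind: str          # "lexical", "referential", "omission", or "structural"
    span: str          # the ambiguous fragment of the user's statement
    confidence: float  # confidence score of the ambiguity detection


# Hypothetical base clarification scripts keyed by ambiguity type.
SCRIPT_LIBRARY = {
    "lexical": "Which meaning of '{span}' do you intend?",
    "referential": "What does '{span}' refer to?",
    "omission": "Could you complete the part around '{span}'?",
    "structural": "Could you rephrase '{span}'?",
}


def build_clarification_path(markers, top_k=2):
    """Pick the top_k markers by confidence and chain their scripts serially."""
    core = sorted(markers, key=lambda m: m.confidence, reverse=True)[:top_k]
    return [SCRIPT_LIBRARY[m.kind].format(span=m.span) for m in core]


markers = [
    AmbiguityMarker("lexical", "bank", 0.4),
    AmbiguityMarker("referential", "it", 0.9),
    AmbiguityMarker("omission", "send", 0.7),
]
path = build_clarification_path(markers)
```

A higher guidance intensity level would correspond to a larger `top_k` in this sketch, so that more ambiguity points are clarified in a single round.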

[0051] By statistically analyzing the frequency of different types of ambiguity points in multi-round dialogues, the success rate of clarification, and the average number of clarification rounds required, a distribution data reflecting the characteristics of expression ambiguity is generated. This distribution data is then used to update the clarification path template database in the intelligent question-answering system. The clarification path template database stores various clarification path templates and their corresponding ambiguity handling effects. The update operations specifically include: adjusting the success rate weights of existing dialogue templates, adding clarification path templates for high-frequency ambiguities, and optimizing the switching logic of clarification paths. The intelligent question-answering system uses the updated clarification path template database to optimize the guidance strategy in the next round of dialogue, making the guidance strategy more aligned with the user's expression ambiguity characteristics, improving the efficiency and accuracy of ambiguity clarification, and providing better template support for intent assessment and guidance in subsequent multi-round dialogues.
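
The three statistics named above (frequency, clarification success rate, and average clarification rounds per ambiguity type) can be aggregated as in the following sketch. The record layout of `(kind, resolved, rounds_used)` is an assumption made for the example.

```python
def ambiguity_distribution(records):
    """Aggregate per-type frequency, success rate, and average rounds.

    records: iterable of (kind, resolved, rounds_used) tuples (assumed layout).
    """
    totals = {}
    for kind, resolved, rounds_used in records:
        entry = totals.setdefault(kind, {"count": 0, "resolved": 0, "rounds": 0})
        entry["count"] += 1
        entry["resolved"] += int(resolved)
        entry["rounds"] += rounds_used
    return {
        kind: {
            "frequency": e["count"],
            "success_rate": e["resolved"] / e["count"],
            "avg_rounds": e["rounds"] / e["count"],
        }
        for kind, e in totals.items()
    }


records = [
    ("referential", True, 2),
    ("referential", False, 4),
    ("lexical", True, 1),
]
distribution = ambiguity_distribution(records)
```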

[0052] Step S3: Extract the question type classification statistics from the clarification path template, calculate the control intensity change range, determine the effective guidance timing, match dialogue data of similar scenarios for the effective guidance timing, and optimize the hierarchical progression strategy and question boundary narrowing strategy of guidance questions based on the matching results.

[0053] Optimizing the hierarchical progression strategy of guidance questions and the question boundary narrowing strategy based on the matching results includes:

[0054] Based on the question type classification statistics, the difference in the proportion of question types between adjacent rounds is calculated to obtain the control intensity change magnitude. The K-means clustering algorithm is used to cluster the control intensity change magnitudes, generating grouping results for the question type classification. Using the time-ordered sequence of control intensity change magnitudes as input, a linear regression method is applied to fit a trend line. The grouping results are then combined with the trend line to analyze the time-series trend of the question type classification statistics. The slope of the trend line is used to determine effective guidance opportunities: if the slope is greater than a preset slope threshold, the current moment is identified as effective guidance opportunity data. For an effective guidance opportunity, similar scenario data are extracted from the accumulated historical dialogue records. The similar scenario data are grouped using the K-means clustering algorithm to determine the scenario subset with the highest similarity to the current effective guidance opportunity. The user response sequence of the scenario subset and the current user behavior sequence are input into the support vector machine model for classification, and the sequence similarity score is calculated, to obtain the matching result. The degree of fit between the current user behavior and the similar scenario data is judged based on the matching result. Based on the degree of fit, nonlinear change features are extracted from the matching result, and the trend prediction model is used to judge the trend of user intent confidence. Based on the confidence trend, the hierarchical progression strategy of guidance questions is optimized. 
At the same time, based on the optimized hierarchical progression strategy of guidance questions, a framework for gradually narrowing the question boundary from broad to specific is constructed, generating a question boundary narrowing strategy.

[0055] Specifically, the question type classification statistics refer to the structured data arranged in chronological order, which quantifies and statistically analyzes the frequency, single-round percentage, and cross-round distribution patterns of different question types (confirmation, clarification, guidance, and open-ended exploration) used by the system within the multi-round interaction process corresponding to the clarification path template. This data records the percentage of each question type in each round of dialogue, intuitively reflecting the dynamic changes in the dialogue guidance rhythm and providing a chronological basis for calculating the subsequent control intensity variation. Based on this question type classification statistics, the intelligent question answering system calculates the difference in the percentage of the same question type in two adjacent rounds of dialogue. The absolute value of this difference is used as the control intensity variation. Then, the difference data of all rounds are integrated to obtain a sequence of control intensity variation amplitudes arranged chronologically by dialogue round. The control intensity variation amplitude is used to quantitatively characterize the adjustment range of the system's guidance method in continuous dialogue. The chronological sequence characteristics are a core prerequisite for ensuring the feasibility of subsequent linear regression analysis and avoiding logical contradictions in the algorithm due to disordered data.
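
A minimal sketch of the change magnitude calculation follows. The text takes the absolute difference per question type but does not say how the per-type differences are merged into one value per round pair; summing them is an assumption made here, as are the type keys.

```python
QUESTION_TYPES = ("confirmation", "clarification", "guidance", "open_ended")


def control_intensity_changes(rounds):
    """One change magnitude per pair of adjacent rounds, in dialogue order.

    rounds: list of dicts mapping question type to its share in that round.
    Assumption: per-type absolute differences are summed.
    """
    changes = []
    for prev, curr in zip(rounds, rounds[1:]):
        changes.append(sum(abs(curr.get(t, 0.0) - prev.get(t, 0.0))
                           for t in QUESTION_TYPES))
    return changes


rounds = [
    {"open_ended": 1.0},
    {"open_ended": 0.5, "clarification": 0.5},
    {"clarification": 0.5, "confirmation": 0.5},
]
changes = control_intensity_changes(rounds)
```

The output keeps one entry per adjacent round pair in dialogue order, which preserves the chronological property that the later regression step relies on.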

[0056] K-means clustering is an unsupervised clustering algorithm based on Euclidean distance. Its core function is to stratify and classify numerical data according to their magnitude characteristics without changing the temporal order of the original data. It only classifies the numerical amplitude within the sequence. In this step, the input of the algorithm is the numerical data in the time series of control intensity variation amplitude, and the output is the grouping results of high variation amplitude, medium variation amplitude, and low variation amplitude. Its working principle is to pre-set the number of clusters to 3, randomly select three initial cluster centers, iteratively calculate the Euclidean distance between each value and each cluster center, and assign the value to the nearest cluster. After each iteration, the mean of each cluster is updated as the new cluster center until the clustering results are stable, thereby achieving stratified differentiation of the degree of control intensity variation.
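
The iteration described above reduces, for one-dimensional magnitudes, to the sketch below. One deliberate difference from the text: the initial centers are spread deterministically over the sorted values rather than chosen at random, so that the example is reproducible. Labels are returned in the original time order, so the sequence itself is not reordered.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values into k groups; returns (labels, centers).

    Assumption: deterministic initial centers (min, middle, max of the
    sorted values) instead of the random choice described in the text.
    """
    ordered = sorted(values)
    centers = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: nearest center by absolute (Euclidean) distance.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, label in zip(values, labels) if label == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


magnitudes = [0.1, 0.12, 0.5, 0.55, 0.9, 1.0]
labels, centers = kmeans_1d(magnitudes)
LEVEL_NAMES = ["low", "medium", "high"]  # valid here because centers stay ascending
levels = [LEVEL_NAMES[label] for label in labels]
```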

[0057] The intelligent question-answering system takes the original, time-series sequence of control intensity changes as input and applies linear regression to fit a trend line. Linear regression, a time-series trend analysis algorithm based on least squares, is designed to uncover the overall trend of changes in ordered time-series data. The input is the original time-series sequence before clustering, and the output is a trend line representing the trend and its corresponding slope. Its working principle is to generate a linear equation reflecting the overall direction of change by minimizing the sum of squared errors from data points to the fitted line. The slope of the trend line visually reflects the rising, falling, or stable trend of control intensity changes. The system combines the grouping results obtained from K-means clustering to analyze the changing trends of question type classification statistics in the time dimension. Specifically, by combining the trend line slope with the magnitude level of the corresponding values, it clarifies the pattern of guidance intensity changes within different ranges of change. For example, a higher absolute slope for a high-amplitude group indicates a more significant adjustment in the guidance rhythm during that stage, while a gentler slope for a medium-to-low-amplitude group indicates a relatively stable guidance rhythm. This achieves accurate and hierarchical analysis of time-series trends.
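
The least-squares fit described in this paragraph has a closed form for a single predictor. The sketch below assumes the dialogue round index (0, 1, 2, ...) is used as the time variable, which the text implies but does not state.

```python
def fit_trend_line(series):
    """Ordinary least squares fit of y = slope * t + intercept, t = 0..n-1.

    Requires at least two points.
    """
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return slope, intercept


slope, intercept = fit_trend_line([1.0, 3.0, 5.0])
```

A positive slope corresponds to a rising control intensity change, a negative slope to a falling one, and a slope near zero to a stable guidance rhythm.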

[0058] A slope threshold is pre-set for the trend line slope. If the slope of the trend line is greater than the preset slope threshold, it indicates that the current guidance intensity has changed significantly and the user's intention is becoming more concentrated. The current time is determined to be an effective guidance opportunity, and the corresponding opportunity information is used as effective guidance opportunity data and directly used as an effective guidance opportunity. For users with high emotional fluctuations or high knowledge gaps, the system can adaptively reduce the preset slope threshold to capture potential effective guidance nodes earlier and improve the timeliness of guidance.
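
The threshold comparison and its adaptive reduction might be sketched as follows. The text says only that the threshold can be reduced for users with high emotional fluctuation or many knowledge gaps; the limits and the halving factor used here are hypothetical.

```python
def effective_slope_threshold(base, emotion_volatility, blind_spot_count,
                              volatility_limit=0.3, blind_spot_limit=3,
                              reduction=0.5):
    """Lower the slope threshold for volatile or gap-heavy users.

    All limits and the reduction factor are assumed values.
    """
    if emotion_volatility > volatility_limit or blind_spot_count > blind_spot_limit:
        return base * reduction
    return base


def is_effective_guidance_timing(slope, threshold):
    """The current moment qualifies when the trend line slope exceeds the threshold."""
    return slope > threshold


calm_threshold = effective_slope_threshold(0.1, emotion_volatility=0.1, blind_spot_count=1)
volatile_threshold = effective_slope_threshold(0.1, emotion_volatility=0.4, blind_spot_count=1)
```

With the same slope of 0.08, the calm user's threshold of 0.1 is not reached, while the volatile user's reduced threshold is, so guidance is triggered earlier for the latter.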

[0059] Similar scenario data refers to multi-turn dialogue data that is highly similar to the current dialogue in terms of question type, user willingness to cooperate, distribution of knowledge blind spots, and changes in intent confidence in historical interactions. The intelligent question answering system again uses the K-means clustering algorithm to group this similar scenario data. Taking the scenario feature vector as input, this feature vector integrates three core features: willingness to cooperate, knowledge blind spots, and question type. The output data consists of multiple scenario subsets arranged from high to low similarity to the current effective guidance moment. The scenario subset with the highest similarity is selected, and this subset contains the complete historical dialogue process, guidance strategy, user feedback, and intent convergence results.

[0060] The historical user response sequences within this subset of the scenario are input together with the current user behavior sequences generated in real-time during the current dialogue into a Support Vector Machine (SVM) model for behavior consistency classification. This SVM model uses radial basis functions as its kernel function. The input data is a set of dual-sequence features composed of the historical scenario user response sequences and the current user behavior sequences. The output data is the classification result of behavior matching or mismatch. Its working principle is to map low-dimensional sequence features to a high-dimensional space, find the optimal classification hyperplane, and determine whether the behavioral features of the two sequences belong to the same category. Simultaneously, the system uses a cosine similarity algorithm to calculate the similarity score between the historical user response sequences and the current user behavior sequences. The input data for the cosine similarity algorithm are the feature vectors corresponding to the two sequences, and the output data is a similarity score in the range of 0-1. The closer the score is to 1, the higher the similarity of the sequence features. This similarity score is only used as a quantitative reference for the degree of matching and is not the matching result itself.

[0061] Based on the matching result, the system further determines the degree of fit between the current user behavior and similar scenario data. The degree of fit is used to characterize the degree of overlap between the current real-time user behavior, interaction scenario, and expression habits and historical similar scenarios. Specifically, it is divided into three levels: high, medium, and low. High fit means that the current user behavior trajectory, willingness to cooperate, and expression characteristics are highly consistent with historical successful guidance scenarios, with extremely strong scenario adaptability, and the core guidance logic of the past can be directly reused. Medium fit means that the current user behavior partially matches the historical scenario, and only minor adjustments to the guidance rhythm are needed for adaptation. Low fit means that the current user behavior and expression habits are significantly different from the historical scenario. If the historical guidance logic is directly copied, it is very easy to cause user resistance. The guidance rhythm and questioning methods need to be reconstructed in a targeted manner to fit the actual user interaction state.

[0062] The nonlinear change features specifically encompass three core temporal features: abrupt changes in the user's willingness to cooperate, fluctuations in intent confidence, and jump patterns in the proportion of question types. The intelligent question-answering system follows the degree-of-fit levels described above and extracts differentiated features from the historical similar scenario data corresponding to the matching result, rather than indiscriminately extracting all features. This implements the technical limitation in the claims that features are extracted based on the degree of fit. For high-fit scenarios, given the high compatibility between the current user behavior and historical successful guidance scenarios, the system selectively extracts from the matching result the nonlinear change features related to the intent convergence phase of the historical scenarios. It focuses on core features such as positive abrupt changes in the user's willingness to cooperate, small and stable fluctuations in intent confidence, and regular jumps in the proportion of question types. It also removes invalid fluctuations and abnormal interference features in the historical scenarios caused by network latency, temporary interruptions, and similar factors. These features reflect the key change patterns of an efficient guidance process and can be used directly to quickly optimize the current guidance strategy. For low-fit scenarios, because the current user behavior differs significantly from the historical scenarios, the system extracts nonlinear change features from the segments of the historical scenarios in which users resisted or showed significant behavioral fluctuations. It focuses on identifying nodes with negative abrupt changes in the user's willingness to cooperate, intervals with large fluctuations in intent confidence, and anomalous jumps in the proportion of question types. 
Simultaneously, it correlates these features with the fluctuation features of real-time user behavior in the current round for supplementary calibration. These features can effectively avoid guidance pitfalls and provide a core basis for subsequent gentle adjustments to the guidance pace and reducing user resistance. For medium-compatibility scenarios, the system takes into account both the nonlinear change features of stable convergence segments and smooth fluctuation segments in historical scenarios, balancing the dual needs of guidance efficiency and user interaction experience, and avoiding overly aggressive or overly conservative guidance methods.

[0063] The regularized nonlinear change features extracted above are input into the trend prediction model built in the previous steps. The trend prediction model is built by combining a long short-term memory network and a fully connected layer. It is specifically designed for the scenario of trend prediction of time-series features. Its input data is a set of nonlinear change features after matching degree classification and filtering. The output data is the prediction result of the user intent confidence rising, falling or stabilizing. The working principle is to effectively capture the long-term time-series dependency of nonlinear change features through the long short-term memory network, avoiding the problem of missed judgment of fluctuation features that is easy to occur in conventional models. Then, the fully connected layer maps the extracted time-series features to the corresponding confidence direction probability, accurately predicting the subsequent user intent expression state. Based on the user intent confidence trend predicted by the model, the hierarchical strategy of guiding questions is optimized accordingly. If the confidence level tends to rise and the scenario fit is high, the guiding question level is directly simplified, significantly reducing transitional and redundant questions, and achieving rapid convergence of user intent. If the confidence level tends to stabilize or decline and the scenario fit is low, intermediate transitional guiding questions are added layer by layer to moderately slow down the overall guiding pace, gradually guide users to sort out their expression logic, and prevent user resistance caused by jumping questions. For scenarios with medium fit, the guiding level is appropriately simplified to balance guiding efficiency and user experience, and adapt to a moderate level of scenario adaptability.
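
The strategy rules in this paragraph can be illustrated with a minimal sketch. It assumes the trend prediction model has already produced a trend label, and it does not implement the long short-term memory network itself. The names `Trend`, `Fit`, and `adjust_question_levels`, the way levels are dropped or inserted, and the behavior for combinations the text does not mention (left unchanged) are all hypothetical choices made for illustration.

```python
from enum import Enum


class Trend(Enum):
    RISING = "rising"
    STABLE = "stable"
    FALLING = "falling"


class Fit(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def adjust_question_levels(levels, trend, fit):
    """Return a new list of guiding-question levels (broad -> precise)."""
    levels = list(levels)
    if trend is Trend.RISING and fit is Fit.HIGH:
        # Rising confidence, high fit: drop all transitional questions.
        return [levels[0], levels[-1]] if len(levels) > 2 else levels
    if trend in (Trend.STABLE, Trend.FALLING) and fit is Fit.LOW:
        # Stable or falling confidence, low fit: add a transitional
        # question between every pair of adjacent levels.
        slowed = []
        for index, question in enumerate(levels):
            slowed.append(question)
            if index < len(levels) - 1:
                slowed.append(f"transition after: {question}")
        return slowed
    if fit is Fit.MEDIUM and len(levels) > 2:
        # Medium fit: moderate simplification (assumption: drop one
        # intermediate level near the middle).
        middle = len(levels) // 2
        return levels[:middle] + levels[middle + 1:]
    # Combinations not covered by the text are left unchanged (assumption).
    return levels


levels = ["broad", "mid1", "mid2", "precise"]
fast = adjust_question_levels(levels, Trend.RISING, Fit.HIGH)
slow = adjust_question_levels(levels, Trend.FALLING, Fit.LOW)
```

Here `fast` keeps only the broad and the precise question, while `slow` interleaves a transitional question after each non-final level.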

[0064] Building upon the optimized hierarchical question progression strategy, the system further constructs a framework that gradually narrows the question boundaries, moving from broad, open-ended questions to precise, closed-ended questions. The framework's construction rules are fully compatible with the hierarchical progression strategy. In scenarios with high relevance, the framework narrows faster, allowing for a quick transition from broad questions to precise questions, efficiently reducing the boundaries of intent definition. In scenarios with low relevance, the framework narrows slowly in multiple steps. First, it uses multi-level broad questions to lock in the general range of user intent, then gradually refines the question content and tightens the intent boundaries, ultimately generating a question boundary narrowing strategy that is fully adapted to the current interaction scenario. This achieves comprehensive adaptation between the guidance strategy, user behavior, and scenario relevance.

[0065] Subsequently, the intelligent question-answering system inputs the complete dialogue data, user behavior features, and ambiguity clarification features processed in the current round into the updated intent confidence segmentation evaluation model. The model calculates the corresponding confidence value sequence, a time series composed of intent confidence scores from multiple consecutive rounds that reflects the dynamic trend of intent confidence. The system records the confidence value sequence in real time and determines its fluctuation range. It compares the result with a preset fluctuation range, a critical range set in advance by the system to determine whether the intent confidence has stabilized without drastic fluctuations. If the fluctuation range of the confidence value sequence is less than the preset range, the user's intent confidence has stabilized without significant jumps. The system then takes the arithmetic mean of the sequence as the stable state data of the intent confidence assessment. This stable state data is used directly as the stable state of the user intent confidence assessment and also as a reference threshold for adjusting the parameters of the dialogue guidance strategy in the next round, providing a standardized quantitative basis for adjusting the guidance intensity, questioning method, and clarification logic in subsequent rounds. This closes the optimization loop of the interactive guidance process, from feature extraction and strategy optimization to stable state determination, and continuously improves the accuracy of intent recognition and the precision of interactive guidance in the intelligent question-answering system.

[0066] Step S4: Process the dialogue data of the current interaction round using the optimized hierarchical progression strategy of guiding questions and the question boundary narrowing strategy, determine the degree of narrowing of expression ambiguity markers, generate multi-round dialogue fluency indicators, and obtain a stable state of user intent confidence assessment.

[0067] Obtaining the stable state of the user intent confidence assessment includes:

[0068] Using the optimized hierarchical progression strategy of guiding questions and the question boundary narrowing strategy, the question focus of the user's expression in the current interaction round is extracted and compared point by point with the response sequence of previous rounds, and the behavior matching degree is calculated. If the behavior matching degree exceeds a preset matching value, the question boundary is gradually narrowed, and an adjusted hierarchical question sequence is output to process the dialogue data of the current interaction round. The expression ambiguity markers in the dialogue data before and after processing are compared, and the reduction ratio of the number of markers is calculated as the degree of narrowing of expression ambiguity markers. Based on the degree of narrowing, a multi-turn dialogue fluency index is calculated. If the multi-turn dialogue fluency index is higher than a preset threshold, valid samples with a coherence score higher than the preset score threshold are extracted. The numerical offset of each segment boundary of the intent confidence segmentation evaluation is adjusted based on the valid samples, and the intent confidence segmentation evaluation model is updated. The data processed in the current round is input into the updated intent confidence segmentation evaluation model for calculation, and the output confidence value sequence is recorded. If the fluctuation range of the confidence value sequence is less than the preset range, the average value of the sequence is determined as the stable state data of the intent confidence assessment. The stable state data is used as the stable state of the user intent confidence assessment and as the reference threshold for adjusting the dialogue strategy parameters in the next round.

[0069] Specifically, the hierarchical guidance strategy combines the user's fit with historical similar scenarios and the trend of intent confidence to set a layered guidance logic. For scenarios with high fit, the guidance level is simplified, and for scenarios with medium to low fit, transitional guidance questions are added. Its core function is to adapt to the user's expression habits and cooperation status, and gradually converge the user's intent. The question boundary narrowing strategy is an intent convergence framework that gradually transitions from broad open-ended questions to precise closed-ended questions. With the synergy of the two, the intelligent question answering system will first lock the core intent category of the user's expression, such as "weather query" belonging to the life service intent. Then, it will combine the features of the current guidance level to extract the core elements of the expression (such as "weather" as the core keyword in "check the weather"), expression complexity features (such as the complexity level of single-word expressions and short sentence expressions), and intent association features (such as whether it contains time, location, and other limiting words). The above features are vectorized and mapped to obtain a high-dimensional standardized feature vector of the current question focus. Each dimension of this feature vector corresponds to a quantifiable intent feature, providing structured and computable foundational data for point-by-point comparison with previous rounds of response sequences, thus solving the problem that surface text cannot directly participate in feature comparison. 
Therefore, the focus of the question is not merely the surface text content expressed by the user, such as "check the weather," but rather a composite feature set resulting from the structured decomposition of the user's core needs by combining the hierarchical features of the progressive strategy and the intent range features of the narrowing strategy.

[0070] Subsequently, the intelligent question-answering system extracts the response sequence from previous rounds. This sequence is a temporally sequenced feature set containing the user's historical response speed, response content relevance, willingness to cooperate, and emotional fluctuation characteristics. Each round corresponds to a complete set of behavioral feature items, and each feature has been quantified: response speed is the normalized value of the time interval between receiving the system's question and issuing a response (0-1 interval); response content relevance is the quantified value of the cosine similarity between the current question and the user's response (0-1 interval); willingness to cooperate is the weighted sum of response speed and content relevance (0-1 interval); and emotional fluctuation is the quantified score of emotional state obtained based on emotion mapping rules (0-1 interval). The system arranges the behavioral feature items from each previous round according to the dialogue round sequence, forming a set of feature vectors for the previous round response sequence. Each element in the set is a single-round behavioral feature vector arranged chronologically, with the vector dimension consistent with the feature vector dimension of the current question focus, providing an aligned dimensional basis for point-by-point comparison and cosine similarity calculation.
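
The quantification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the normalization cap of 60 seconds, the direction of the speed normalization (faster replies score closer to 1), the weights 0.4 and 0.6, and the function names are all hypothetical. The relevance and emotion scores are taken as already computed inputs.

```python
def clamp01(value: float) -> float:
    """Clamp a value into the 0-1 interval used for every feature."""
    return max(0.0, min(1.0, value))


def response_speed(interval_s: float, max_interval_s: float = 60.0) -> float:
    # Assumption: shorter reply intervals map to values closer to 1.
    return clamp01(1.0 - interval_s / max_interval_s)


def willingness(speed: float, relevance: float,
                w_speed: float = 0.4, w_relevance: float = 0.6) -> float:
    # Weighted sum of response speed and content relevance
    # (the weights are illustrative and sum to 1).
    return clamp01(w_speed * speed + w_relevance * relevance)


def round_features(interval_s: float, relevance: float, emotion: float) -> list:
    """Build one single-round behavioral feature vector."""
    speed = response_speed(interval_s)
    rel = clamp01(relevance)
    return [speed, rel, willingness(speed, rel), clamp01(emotion)]


# Response sequence of previous rounds, ordered by dialogue round.
history = [
    round_features(interval_s=15.0, relevance=0.8, emotion=0.6),
    round_features(interval_s=6.0, relevance=0.9, emotion=0.7),
]
```

Each element of `history` is a four-dimensional vector here only for brevity; the text requires the dimension to equal that of the question-focus feature vector.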

[0071] Point-by-point comparison does not refer to word-by-word comparison at the text level, but to point-by-point matching and verification at the feature dimension level. Specifically, the feature vector of the current question focus and the feature vector of each previous round of the response sequence contain the same number of feature dimensions, such as the core intent category dimension, the expression complexity dimension, the response speed adaptation dimension, and the cooperation willingness matching dimension. The point-by-point comparison proceeds as follows: for each corresponding feature dimension of the two vectors, the matching degree of the feature values is checked one by one. For example, if the core intent category dimension is "life service - weather query" in both vectors, the dimension is determined to match; if the expression complexity dimension is "short sentence expression" and the corresponding dimension of the previous round of the response sequence is "adapted to short sentence expression", the dimension is determined to match; if the response speed adaptation dimension of the guidance level corresponding to the current question focus is "moderate" and the quantified user response speed in the previous round of the response sequence falls in the "moderate range", the dimension is determined to match. Matching and verifying each feature dimension in this way completes the point-by-point comparison, clarifies the dimensional fit between the current question focus and the user behavior features of each previous round, and provides the per-dimension basis for the subsequent overall similarity calculation.

[0072] After point-by-point comparison, the intelligent question-answering system calculates the overall behavioral matching degree using the cosine similarity algorithm. The cosine similarity algorithm is a similarity calculation method based on a vector space model. Its core function is to quantify the overall similarity between two feature vectors of the same dimension. In this step, the input is the high-dimensional standardized feature vector of the current question focus and the single-round behavioral feature vectors of each previous round's response sequence. The output is a behavioral matching degree value in the range of 0-1. The algorithm works by mapping two feature vectors of the same dimension to a high-dimensional vector space and representing the similarity by calculating the cosine value of the angle between the two vectors. The closer the cosine value is to 1, the more consistent the directions of the two vectors, meaning a higher overall fit between the current user's question focus and the user's behavioral features in previous rounds; the closer the cosine value is to 0, the lower the overall fit. In practice, the intelligent question-answering system calculates the cosine similarity between the feature vector of the current question focus and the feature vector of the response sequence in each previous round. Then, it combines the temporal weights of each round (such as the higher weight of recent rounds) to perform a weighted summation to obtain the final behavior matching score. This score can comprehensively reflect the overall consistency between the current user's core intent and historical interaction behavior, providing a quantitative basis for decision-making on whether to narrow down the question boundaries and adjust the guidance strategy.
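
As a minimal sketch of this calculation, the following computes the cosine similarity against each previous round and combines the results with temporal weights. The exponential decay weighting, the `decay` parameter, and the function names are assumptions; the text only states that recent rounds receive higher weight. Feature values are assumed to be non-negative, so each similarity lies in the 0-1 range.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of the same dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def behavior_matching_degree(focus, history, decay=0.8):
    """Temporally weighted sum of per-round cosine similarities.

    `history` is ordered oldest to newest. The most recent round gets
    raw weight 1, the one before it `decay`, then `decay**2`, and so on;
    the weights are normalized to sum to 1.
    """
    if not history:
        return 0.0
    n = len(history)
    raw = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    score = sum((w / total) * cosine_similarity(focus, h)
                for w, h in zip(raw, history))
    return max(0.0, min(1.0, score))


focus = [1.0, 0.0]
recent_match = behavior_matching_degree(focus, [[0.0, 1.0], [1.0, 0.0]])
old_match = behavior_matching_degree(focus, [[1.0, 0.0], [0.0, 1.0]])
```

With the same two historical rounds in opposite order, `recent_match` is larger than `old_match`, because the round that agrees with the current focus is the more recent one.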

[0073] The preset matching value is a critical value set by the system to determine whether the guidance boundary can be tightened. If the behavior matching degree exceeds the preset matching value, it indicates that the user's behavior is stable and the expression of intent is becoming clearer. The system gradually narrows the scope of intent definition, i.e., the problem boundary, according to the problem boundary narrowing strategy, eliminates redundant guidance links, and outputs an adjusted hierarchical question sequence. This hierarchical question sequence completes the standardization processing of the dialogue data of the current interaction round. If the behavior matching degree does not reach the preset matching value, the original guidance boundary and question level are maintained to maintain a gentle guidance rhythm and avoid further ambiguity in the expression of user intent.

[0074] Expression ambiguity markers refer to the real-time annotation of various ambiguities in user expressions, such as lexical ambiguity, referential ambiguity, omission ambiguity, and structural ambiguity, forming marked data. Each marker corresponds to one point of ambiguity to be clarified. The system separately counts the total number of ambiguous markers before processing and the number of remaining ambiguous markers after processing. By calculating the ratio of the difference between the two to the initial total number, the reduction ratio of the number of markers is obtained, and this reduction ratio is directly defined as the narrowing degree of expression ambiguity markers. The narrowing degree ranges from 0 to 1. The higher the value, the higher the clarification ratio of the ambiguities in the user expression and the higher the clarity of the intended expression.

[0075] The multi-turn dialogue fluency index is a comprehensive indicator used to quantitatively evaluate the coherence of multi-turn interactions, the efficiency of ambiguity clarification, and the guidance effect. It is calculated by multiplying the degree of narrowing of the expression ambiguity markers by a preset turn connection weight. The turn connection weight is a coefficient preset by the system to balance the effect of single-turn ambiguity clarification and the coherence of multi-turn interactions, taking into account both the dimensions of ambiguity elimination and interactive experience. The value obtained after multiplication is the multi-turn dialogue fluency index. The index value is also in the range of 0-1. The higher the value, the smoother the multi-turn dialogue interaction and the better the overall guidance effect.
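
The two quantities defined above, the narrowing degree of expression ambiguity markers and the multi-turn dialogue fluency index, reduce to short arithmetic. The sketch below assumes marker counts are already available; the function names, the handling of a zero initial count, and the example weight of 0.9 are hypothetical.

```python
def narrowing_degree(markers_before: int, markers_after: int) -> float:
    """Reduction ratio of ambiguity markers, in the 0-1 range."""
    if markers_before <= 0:
        return 0.0  # assumption: nothing to clarify counts as no narrowing
    reduced = markers_before - markers_after
    return max(0.0, min(1.0, reduced / markers_before))


def fluency_index(degree: float, turn_connection_weight: float) -> float:
    """Narrowing degree multiplied by the preset turn connection weight."""
    return degree * turn_connection_weight


# Example: 5 ambiguity markers before processing, 1 remaining afterwards.
degree = narrowing_degree(5, 1)
index = fluency_index(degree, turn_connection_weight=0.9)
```

For the index to stay in the 0-1 range as the text states, the turn connection weight must itself lie between 0 and 1.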

[0076] The preset threshold is a critical value set by the system to determine whether the interaction is coherent and valuable for model updating. If the multi-turn dialogue fluency index is higher than the preset threshold, the current multi-turn interaction is coherent and the guidance strategy is effective. The system extracts, from the current round and previous rounds, valid samples whose coherence scores are higher than the preset score threshold. The coherence score is an interaction quality score calculated by combining user response speed, willingness to cooperate, and ambiguity clarification efficiency. Valid samples are high-quality interaction data that indicate successful guidance and clear convergence of intent. The intent confidence segmentation evaluation model divides the clarity of user intent into three confidence intervals: high, medium, and low. Its inputs are user behavior features, cooperation willingness evolution features, and ambiguity clarification features. Its outputs are the scores and the distribution results for the corresponding confidence intervals. The model works by quantifying and scoring the input features against preset segmentation boundaries. The adjustment optimizes the boundary values of the high, medium, and low confidence intervals based on the actual confidence distribution data in the valid samples, so that the model's segmentation judgment better matches real user interaction patterns. This completes the adaptive update of the model and improves the accuracy of subsequent assessments.
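
One way to picture the segmentation boundaries and their adjustment is sketched below. The text does not specify the update rule, so everything here is an assumption made for illustration: the initial boundaries of 0.4 and 0.7, the use of the terciles of the valid-sample scores as targets, and the `rate` that moves each boundary part of the way toward its target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundaries:
    low_medium: float = 0.4   # scores below this fall in the low interval
    medium_high: float = 0.7  # scores at or above this fall in the high interval


def segment(score: float, bounds: Boundaries) -> str:
    """Map a confidence score to the high, medium, or low interval."""
    if score >= bounds.medium_high:
        return "high"
    if score >= bounds.low_medium:
        return "medium"
    return "low"


def adjust_boundaries(bounds: Boundaries, valid_scores, rate: float = 0.2) -> Boundaries:
    """Shift each boundary toward the terciles of the valid-sample scores."""
    if len(valid_scores) < 3:
        return bounds  # too few valid samples to estimate a distribution
    ordered = sorted(valid_scores)
    lower_target = ordered[len(ordered) // 3]
    upper_target = ordered[(2 * len(ordered)) // 3]
    new_low = bounds.low_medium + rate * (lower_target - bounds.low_medium)
    new_high = bounds.medium_high + rate * (upper_target - bounds.medium_high)
    if new_low >= new_high:
        return bounds  # keep the intervals well ordered
    return Boundaries(new_low, new_high)


updated = adjust_boundaries(Boundaries(), [0.5, 0.6, 0.7, 0.8, 0.9, 0.95])
```

A small `rate` keeps a single batch of valid samples from moving the boundaries abruptly, which is a design choice of this sketch rather than something the text prescribes.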

[0077] The intelligent question answering system inputs the complete dialogue data, user behavior features, and ambiguity clarification features processed in the current round into the updated intent confidence segmentation evaluation model. The model calculates the corresponding confidence value sequence, which is a time series composed of intent confidence scores from multiple consecutive rounds, used to reflect the dynamic trend of intent confidence.

[0078] The system records the confidence value sequence and determines its fluctuation range. Specifically, it extracts the maximum and minimum values from the sequence, calculates the sequence range by subtracting the minimum from the maximum, and calculates the standard deviation of the sequence to characterize the degree of numerical dispersion. The system compares the calculated range and standard deviation with a preset range threshold and a preset standard deviation threshold, respectively. If the range is less than the preset range threshold and the standard deviation is less than the preset standard deviation threshold, the system determines that the fluctuation range of the confidence value sequence is less than the preset range, indicating that the user's intent confidence has stabilized without drastic fluctuations. The system then calculates the arithmetic mean of all scores in the sequence and takes this mean as the stable state data of the intent confidence assessment. This stable state data serves as the stable state of the user intent confidence assessment and as a reference threshold for adjusting the parameters of the next round's dialogue guidance strategy. This provides a standardized quantitative basis for optimizing subsequent guidance intensity, questioning methods, and clarification logic, achieving closed-loop optimization of the entire interactive guidance process and continuously improving the intent recognition and guidance capabilities of the intelligent question-answering system. For example, the processed data from this round is input into the updated intent confidence segmentation evaluation model, and the confidence value sequence for five consecutive rounds is recorded as [0.82, 0.85, 0.83, 0.86, 0.84]. The fluctuation range is then calculated: the sequence range is 0.86 - 0.82 = 0.04, the mean is 0.84, the variance is 0.0002, and the standard deviation is √0.0002 ≈ 0.014. Both the range threshold of 0.05 and the standard deviation threshold of 0.02 are satisfied, since 0.04 < 0.05 and 0.014 < 0.02. Therefore, the fluctuation range of the confidence value sequence is less than the preset range, indicating that the user intent confidence has stabilized. The stable state data of the intent confidence assessment equals the sequence average, 0.84. This value is taken as the final intent confidence of this interaction and as the reference threshold for adjusting the dialogue strategy parameters in the next round.
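
The worked example above can be reproduced with a short sketch. The function name `stable_state` and the choice to return `None` for an unstable sequence are assumptions. The population standard deviation is used because it is the form consistent with the stated variance of 0.0002 for these five scores.

```python
from statistics import fmean, pstdev


def stable_state(scores, range_threshold=0.05, std_threshold=0.02):
    """Return the mean of a stable confidence sequence, else None."""
    if not scores:
        return None
    value_range = max(scores) - min(scores)
    std = pstdev(scores)  # population standard deviation
    if value_range < range_threshold and std < std_threshold:
        return fmean(scores)
    return None  # fluctuation too large: no stable state yet


scores = [0.82, 0.85, 0.83, 0.86, 0.84]
state = stable_state(scores)  # approximately 0.84 for this sequence
```

A sequence such as [0.5, 0.9, 0.6] fails the range check, so no stable state is reported for it.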

[0079] For example, Figure 3 provides a multi-dimensional performance comparison radar chart showing the comparison between a traditional question-answering system and the system of this application on several key performance indicators.

[0080] The interactive guidance method of an intelligent question-answering system according to an embodiment of this application has been described above. The interactive guidance system of an intelligent question-answering system according to an embodiment of this application is described below. Referring to Figure 4, an embodiment of the interactive guidance system of an intelligent question-answering system in this application includes:

[0081] The sequence extraction unit extracts user expression pattern data and rhythm preference data from historical dialogue records. It then analyzes and processes the expression pattern data and rhythm preference data to generate user response sequences and emotion fluctuation sequences.

[0082] The direction judgment unit is used to determine the user's willingness to cooperate intensity level and knowledge blind spot labeling information based on the user's response sequence and emotional fluctuation sequence, and to perform comprehensive calculation to obtain comprehensive impact data. Based on the comprehensive impact data, it judges the trend of the user's intention confidence assessment, adjusts the dialogue guidance strategy according to the trend, and generates expression ambiguity markers and clarification path templates.

[0083] The matching unit is used to extract question type classification statistics from the clarification path template, calculate the magnitude of control intensity change, determine the effective guidance timing, match dialogue data of similar scenarios for the effective guidance timing, and optimize the hierarchical progression strategy and question boundary narrowing strategy of guidance questions based on the matching results.

[0084] The interaction unit is used to process the dialogue data of the current interaction round using an optimized hierarchical strategy for guiding questions and a strategy for narrowing question boundaries. It determines the degree of narrowing of expression ambiguity markers, generates multi-round dialogue fluency indicators, and obtains a stable state of user intent confidence assessment.

[0085] This application also provides a computer-readable storage medium, which can be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium, wherein the computer-readable storage medium stores instructions that, when the instructions are executed on a computer, cause the computer to perform the steps of the interactive guidance method of the intelligent question-answering system.

[0086] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0087] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0088] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.

Claims

1. An interactive guiding method of an intelligent question-answering system, characterized in that, The method includes: The intelligent question-answering system extracts user expression pattern data and rhythm preference data from historical dialogue records, analyzes and processes the expression pattern data and rhythm preference data respectively, and generates user response sequence and emotion fluctuation sequence. Based on the user response sequence and the emotion fluctuation sequence, the user's willingness to cooperate intensity classification result and knowledge blind spot labeling information are determined, and comprehensive impact data is obtained through comprehensive calculation. Based on the comprehensive impact data, the trend of user intention confidence assessment is judged, and the dialogue guidance strategy is adjusted according to the trend. At the same time, expression ambiguity markers and clarification path templates are generated. The comprehensive impact data obtained includes: For the user response sequence and the emotion fluctuation sequence, the user response speed and content relevance are quantified to obtain a preliminary cooperation willingness intensity value. This preliminary cooperation willingness intensity value is mapped to a preset grading scale to generate cooperation willingness intensity grading data for the current round, and this cooperation willingness intensity grading data is used as the cooperation willingness intensity grading result. The cooperation willingness intensity grading data is used to analyze the changes in user dialogue participation before and after rounds. At the same time, emotion peaks with emotion score standard deviations exceeding a preset standard threshold are detected from the emotion fluctuation sequence. 
The dialogue points corresponding to the emotion peaks are marked as potential knowledge blind spots and associated with the dialogue context to form dynamic knowledge blind spot data, which is used as the knowledge blind spot annotation information. The number of knowledge blind spot annotation information and the value of the cooperation willingness intensity grading data are weighted and summed according to preset weights to generate the comprehensive impact data. Extract question type classification statistics from the clarification path template, calculate the difference in the proportion of question types in previous and subsequent rounds based on the question type classification statistics, obtain the control intensity change range, determine the effective guidance timing, match dialogue data of similar scenarios for the effective guidance timing, and optimize the hierarchical progression strategy and question boundary narrowing strategy of guidance questions based on the matching results. The dialogue data of the current interaction round is processed using the optimized hierarchical progression strategy of the guiding questions and the question boundary narrowing strategy. The degree of narrowing of the expression ambiguity markers is judged, and a multi-round dialogue fluency index is generated to obtain a stable state of user intent confidence assessment.

2. The method of claim 1, wherein, Extract user expression pattern data and rhythm preference data, including: Natural language processing techniques are used to segment, tag, and semantically label the dialogue text in the historical dialogue records, and a feature vector library of user dialogue text is constructed. Based on the feature vector library, user expression pattern data containing user high-frequency vocabulary, sentence structure type, expression redundancy features, frequency of use of professional terms, and occurrence patterns of ambiguous words are extracted. Based on historical dialogue records indexed by time sequence, rhythm preference data is extracted, which includes the distribution of character length of a user's single statement, the time interval between user responses in two adjacent rounds of dialogue, the difference in response time for different types of questions, and the pattern of when users initiate dialogue. The rhythm preference data is then smoothed using a sliding window algorithm to obtain optimized rhythm preference data.

3. The method according to claim 1, characterized in that, Generate user response sequences and emotion fluctuation sequences, including: The expression pattern data and rhythm preference data are serialized using a long short-term memory network to generate user response sequence data. Based on the user response sequence data, changes in the user's lexical sentiment values are tracked to generate emotion fluctuation sequence data. The changes in the user's response patterns during multi-turn dialogues are analyzed using the emotion fluctuation sequence data, and the user expression pattern database is updated based on these changes. The updated user expression pattern database is then used to summarize and generate the user response sequence and the emotion fluctuation sequence.

4. The method according to claim 1, characterized in that, To determine the trend of user intent confidence assessment, including: If the comprehensive impact data exceeds the preset impact threshold, the comprehensive impact data is combined with the associated user cooperation willingness intensity level cross-cycle change value and the change rate of the number of knowledge blind spot annotation information to construct a multi-dimensional feature vector. After normalization preprocessing, it is input into the support vector machine model for classification processing. The evolution law data of user cooperation willingness intensity level is determined through the classification processing results. Identify the nonlinear change characteristics of the evolution data, input the nonlinear change characteristics into a pre-established trend prediction model, and generate the probability distribution of the next round of user intention confidence falling into the high, medium, and low confidence intervals as the prediction result of the segmented evaluation; judge the trend of the user intention confidence evaluation based on the prediction result.

5. The method according to claim 1, characterized in that generating the expression ambiguity markers and the clarification path templates includes: if the trend of the user intent confidence assessment shows that the granularity of the user's ambiguous expression is increasing, enhancing the guidance strength of the multi-round intent clarification sequence through the real-time adjustment module; generating adjusted real-time expression ambiguity labeling data based on the enhanced guidance strength, and using the real-time expression ambiguity labeling data as the expression ambiguity marker; based on the expression ambiguity marker, generating a dialogue template for dynamically switching clarification paths, and using the dialogue template as the clarification path template; and analyzing the distribution of ambiguity in the multi-turn dialogue through the clarification path template, updating the clarification path template database according to the distribution data, and optimizing the guidance strategy for the next dialogue round with the updated clarification path template database.

6. The method according to claim 1, characterized in that optimizing, based on the matching results, the hierarchical progression strategy and the question boundary narrowing strategy for guiding questions includes: clustering the control intensity change amplitudes to generate grouping results for question type classification; taking the time-ordered sequence of control intensity change amplitudes as input and fitting a trend line by linear regression; analyzing, in combination with the grouping results, the trend of the question type classification statistics over time; if the slope of the trend line is greater than the preset slope threshold, determining the data to be effective guidance timing data and using it as the effective guidance timing; for the effective guidance timing, extracting similar scene data from the historical dialogue records; grouping the similar scene data with the K-means clustering algorithm to determine the scene subset with the highest similarity to the current effective guidance timing; inputting the user response sequence of the scene subset and the current user behavior sequence into the support vector machine model for classification, and calculating a sequence similarity score to obtain the matching result; using the matching result to determine the degree of fit between the current user behavior and the similar scene data; based on the degree of fit, extracting nonlinear change features from the matching result and using the trend prediction model to determine the trend of the user intent confidence; optimizing the hierarchical progression strategy of guiding questions based on that trend; and, based on the optimized hierarchical progression strategy of guiding questions, constructing a framework for gradually narrowing the question boundary and generating the question boundary narrowing strategy.
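The trend line test in claim 6 (fit a line to the time-ordered control intensity change amplitudes, then compare its slope with a preset threshold) can be sketched as below. It assumes ordinary least squares against the round index; the function names, the sample amplitudes, and the threshold value are hypothetical.

```python
def fit_slope(values):
    """Least-squares slope of `values` against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance


def is_effective_guidance_timing(amplitudes, slope_threshold):
    """True when the fitted trend line rises faster than the threshold."""
    return fit_slope(amplitudes) > slope_threshold


# Hypothetical control intensity change amplitudes over four rounds.
amplitudes = [0.10, 0.15, 0.22, 0.30]
effective = is_effective_guidance_timing(amplitudes, slope_threshold=0.05)
```

Using the slope rather than the latest amplitude alone makes the decision depend on a sustained rise across rounds, not on a single large change.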

7. The method according to claim 1, characterized in that obtaining the stable state of the user intent confidence assessment includes: using the optimized hierarchical progression strategy of guiding questions and the question boundary narrowing strategy, extracting the focus of the user's question in the current interaction round, comparing it point by point with the response sequence of the previous round, and calculating the behavior matching degree; if the behavior matching degree exceeds the preset matching value, narrowing the question boundary and outputting the adjusted hierarchical question sequence to process the dialogue data of the current interaction round; based on the processed dialogue data, comparing the expression ambiguity markers before and after processing one by one, and calculating the reduction ratio of the number of markers as the narrowing degree of the expression ambiguity markers; calculating the multi-turn dialogue fluency index from the narrowing degree; if the multi-turn dialogue fluency index is higher than the preset index threshold, extracting valid samples whose coherence score is higher than the preset score threshold, applying a numerical offset adjustment to the segment boundaries of the intent confidence segmented evaluation in the valid samples, and updating the intent confidence segmented evaluation model; inputting the data processed in the current round into the updated intent confidence segmented evaluation model and recording the output confidence value sequence; if the fluctuation range of the confidence value sequence is less than the preset range, determining the average value of the sequence as the stable state data of the intent confidence assessment; and using the stable state data as the stable state of the user intent confidence assessment and as the reference threshold for adjusting the dialogue strategy parameters in the next round, until the interactive question-answering guidance is completed.
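Two of the calculations in claim 7 are simple enough to sketch directly: the reduction ratio of ambiguity markers, and the stable state taken as the mean of a confidence sequence whose fluctuation range is small. This sketch assumes "fluctuation range" means maximum minus minimum; the function names and sample values are hypothetical, and the fluency index and model update steps are not reproduced.

```python
def narrowing_degree(markers_before, markers_after):
    """Fraction by which the number of ambiguity markers dropped."""
    if markers_before == 0:
        return 0.0  # nothing to narrow
    return (markers_before - markers_after) / markers_before


def stable_state(confidences, max_range):
    """Mean of the sequence if max - min is below `max_range`, else None."""
    if not confidences:
        return None
    if max(confidences) - min(confidences) < max_range:
        return sum(confidences) / len(confidences)
    return None


# Hypothetical round: 8 markers before processing, 2 after.
degree = narrowing_degree(8, 2)
# Hypothetical confidence values output by the updated evaluation model.
steady = stable_state([0.5, 0.75], max_range=0.5)
unsettled = stable_state([0.5, 0.5, 0.75, 0.25], max_range=0.1)
```

Returning `None` when the sequence still fluctuates too much lets the caller keep iterating dialogue rounds until a stable value appears.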

8. An intelligent question-answering system, used to implement the interactive guidance method of an intelligent question-answering system as described in any one of claims 1-7, characterized in that the system includes: a sequence extraction unit, used to extract user expression pattern data and rhythm preference data from historical dialogue records, and to analyze and process the expression pattern data and the rhythm preference data to generate a user response sequence and an emotion fluctuation sequence; a trend judgment unit, used to determine the user's cooperation willingness intensity grading result and knowledge blind spot annotation information based on the user response sequence and the emotion fluctuation sequence, to perform a comprehensive calculation to obtain comprehensive impact data, to determine the trend of the user intent confidence assessment based on the comprehensive impact data, to adjust the dialogue guidance strategy according to the trend, and to generate expression ambiguity markers and clarification path templates; wherein obtaining the comprehensive impact data includes: quantifying the user's response speed and content relevance from the user response sequence and the emotion fluctuation sequence to obtain a preliminary cooperation willingness intensity value; mapping the preliminary cooperation willingness intensity value to a preset grading scale to generate the cooperation willingness intensity grading data for the current round, and using the cooperation willingness intensity grading data as the cooperation willingness intensity grading result; using the cooperation willingness intensity grading data to analyze the change in user dialogue participation between rounds; at the same time, detecting, from the emotion fluctuation sequence, emotional peaks whose emotional score standard deviation exceeds a preset standard threshold; marking the dialogue points corresponding to the emotional peaks as potential knowledge blind spots and associating them with the dialogue context to form dynamic knowledge blind spot data, and using the dynamic knowledge blind spot data as the knowledge blind spot annotation information; and computing a weighted sum of the number of knowledge blind spot annotation information and the value of the cooperation willingness intensity grading data according to preset weights to generate the comprehensive impact data; a matching unit, used to extract question type classification statistics from the clarification path template, calculate the difference in the proportion of question types between the previous and next rounds based on the question type classification statistics to obtain the control intensity change amplitude, determine the effective guidance timing, match dialogue data of similar scenarios for the effective guidance timing, and optimize the hierarchical progression strategy and the question boundary narrowing strategy of guiding questions based on the matching results; and an interaction unit, used to process the dialogue data of the current interaction round using the optimized hierarchical progression strategy of guiding questions and the question boundary narrowing strategy, determine the narrowing degree of the expression ambiguity markers, generate a multi-turn dialogue fluency index, and obtain the stable state of the user intent confidence assessment.
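The grading and weighted-sum steps that claim 8 uses to obtain the comprehensive impact data can be sketched as follows. This is an illustration under stated assumptions: the three-level scale, its boundaries, the default weights, and the function names are all hypothetical, since the claim only says that the scale and the weights are preset.

```python
from bisect import bisect_right

# Hypothetical grading scale: preliminary intensity in [0, 1] mapped to grades 1-3.
GRADE_BOUNDARIES = [0.33, 0.66]


def willingness_grade(preliminary_intensity):
    """Map a preliminary cooperation willingness value onto the preset scale."""
    return bisect_right(GRADE_BOUNDARIES, preliminary_intensity) + 1


def comprehensive_impact(blind_spot_count, grade, w_blind=0.4, w_grade=0.6):
    """Weighted sum of the blind spot count and the willingness grade."""
    return w_blind * blind_spot_count + w_grade * grade


# Hypothetical round: intensity 0.9 and two annotated knowledge blind spots.
grade = willingness_grade(0.9)
impact = comprehensive_impact(blind_spot_count=2, grade=grade)
```

The resulting value is what claim 4 compares against the preset impact threshold before building the feature vector.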

9. A computer-readable storage medium storing instructions thereon, characterized in that, when the instructions are executed by a processor, they implement the interactive guidance method of an intelligent question-answering system as described in any one of claims 1-7.