Intelligent interaction decision explainability analysis method and system combined with causal reasoning

By constructing a causal graph topology and visualizing it, the problem of users' long-term behaviors and interests not being considered in intelligent interactive systems is solved, which improves users' understanding and trust in the system's decisions and enhances the user experience.

CN122065992B · Active · Publication Date: 2026-06-19 · SHANGHAI MINGQI NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI MINGQI NETWORK TECH CO LTD
Filing Date
2026-04-16
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing intelligent interaction systems lack comprehensive consideration of users' long-term behavior and interests when generating response strategies, making it difficult for users to understand the logic and reasons behind the decisions, thus affecting trust and user experience.

Method used

By acquiring the raw interaction log data stream of the intelligent interaction system, we extract the user's long-term interest preference distribution vector, query intent core semantic vector, and response strategy content semantic vector. We then use a counterfactual causal reasoning framework to construct a causal graph topology, calculate the intervention effect measurement value, filter interpretable response content, and visualize it.

Benefits of technology

It enhances users' understanding of system decisions, improves trust in intelligent interaction and user experience, and quantifies the impact of users' long-term interests on response strategies through causal path search and visualization.


Figure CN122065992B_ABST

Abstract

This application provides a method and system for interpretability analysis of intelligent interactive decisions based on causal reasoning, belonging to the field of intelligent interaction technology. First, it acquires the original interaction log data stream of the intelligent interactive system. Then, it decouples the interaction elements of the original interaction log data stream, extracting the user's long-term interest preference distribution vector, the core semantic vector of the query intent, and the semantic vector of the response strategy content. Next, the former two are used as the set of antecedent variables, and the latter as the set of consequent variables, inputting them into a counterfactual causal reasoning framework to construct an initial causal graph topology. Then, it simulates intervention on the edges of the first causal path, calculates the first intervention effect measurement value, ranks the contribution, generates an interactive decision interpretation feedback data stream, and pushes it to the user interface for visualization. This allows users to intuitively understand the basis of the system's decisions, enhances user trust in the system, and improves the overall quality and user experience of intelligent interaction.

Description

Technical Field

[0001] This application relates to the field of intelligent interaction technology, and more specifically, to a method and system for interpretability analysis of intelligent interactive decisions that combines causal reasoning.

Background Technology

[0002] Existing intelligent interaction systems, when processing user queries and generating response strategies, often focus on improving the accuracy and efficiency of responses, but neglect the importance of the logic and reasons behind the decisions to the user.

[0003] Currently, most intelligent interaction systems generate response strategies based solely on the user's submitted query text, using pre-set algorithms and models, lacking a comprehensive consideration of the user's long-term behavior and interests. Furthermore, the system-generated response strategies are often presented as a black box, making it difficult for users to understand why they received the given response, which to some extent affects user trust in the system and the user experience. In addition, while some existing technologies attempt to analyze the interaction process, most are merely simple data statistics and correlation analysis, failing to delve into the causal relationships between user interests, query intent, and response strategies, thus failing to provide users with clear and accurate explanations for their decisions.

Summary of the Invention

[0004] In view of this, the purpose of this application is to provide a method and system for interpretability analysis of intelligent interactive decision-making that combines causal reasoning.

[0005] According to a first aspect of this application, a method for interpretability analysis of intelligent interactive decision-making combined with causal reasoning is provided, the method comprising:

[0006] Obtain the raw interaction log data stream of the intelligent interaction system. The raw interaction log data stream includes the user's unique identifier code, the original query text content submitted by the user, and the initial response strategy text content pushed by the interaction engine for the original query text content.

[0007] The raw interaction log data stream is subjected to interaction element decoupling processing. The user's long-term interest preference distribution vector is extracted from the historical interaction records associated with the user's unique identifier code. The query intent core semantic vector is extracted from the original query text content. The response strategy content semantic vector is extracted from the initial response strategy text content.

[0008] The user's long-term interest preference distribution vector and the query intent core semantic vector are used as the set of antecedent variables, and the response strategy content semantic vector is used as the consequence variable. They are input into the counterfactual causal reasoning framework for causal path search processing to generate a first causal path edge starting from the user's long-term interest preference distribution vector and a second causal path edge starting from the query intent core semantic vector, so as to construct an initial causal graph topology.

[0009] The first causal path edge is subjected to path blocking intervention simulation processing. The trajectory of the conditional probability distribution change of the semantic vector of the response strategy content is observed when the long-term interest preference distribution vector of the user takes different intervention values. The first intervention effect measure of the long-term interest preference distribution vector of the user on the semantic vector of the response strategy content is calculated based on the trajectory of the conditional probability distribution change.

[0010] Based on the first intervention effect measurement value, the initial response strategy text content is sorted by contribution. Response strategy text units whose first intervention effect measurement value exceeds a preset contribution threshold are selected as interpretable response content. The interpretable response content is then fused with the path identification information of the first causal path edge to generate an interactive decision explanation feedback data stream. The interactive decision explanation feedback data stream is then pushed to the user interface for visualization rendering and display.

[0011] According to a second aspect of this application, a system for interpretable analysis of intelligent interactive decisions based on causal reasoning is provided. The system includes a machine-readable storage medium and a processor. The machine-readable storage medium stores machine-executable instructions. When the processor executes the machine-executable instructions, the system implements the aforementioned method for interpretable analysis of intelligent interactive decisions based on causal reasoning.

[0012] Based on any of the above aspects, the technical effect of this application is as follows:

[0013] By acquiring the raw interaction log data stream of the intelligent interaction system and decoupling its interaction elements, the method extracts the user's long-term interest preference distribution vector, the query intent core semantic vector, and the response strategy content semantic vector. The antecedent variables and the consequence variable are input into a counterfactual causal reasoning framework for causal path search, constructing an initial causal graph topology that reveals the causal relationships among user interests, query intent, and response strategies. Path-blocking intervention simulation is performed on the first causal path edge, and the first intervention effect measurement value is calculated, quantifying the impact of the user's long-term interests on the response strategy. Based on the first intervention effect measurement value, contribution ranking is performed, interpretable response content is selected, and a visual interactive decision explanation feedback data stream is generated. This allows users to intuitively understand the basis of the system's decisions, enhancing user trust in the system and improving the overall quality and user experience of the intelligent interaction.

Attached Figure Description

[0014] Figure 1 is a flowchart of the interpretability analysis method for intelligent interactive decision-making combined with causal reasoning provided in an embodiment of this application.

[0015] Figure 2 is a schematic diagram of the component structure of the intelligent interactive decision interpretability analysis system combined with causal reasoning provided in an embodiment of this application.

Detailed Implementation

[0016] Figure 1 is a flowchart of the interpretability analysis method for intelligent interactive decision-making combined with causal reasoning provided in an embodiment of this application. The detailed steps are as follows:

[0017] Step S110: Obtain the raw interaction log data stream of the intelligent interaction system. The raw interaction log data stream includes the user's unique identifier code, the original query text content submitted by the user, and the initial response strategy text content pushed by the interaction engine for the original query text content.

[0018] In this embodiment, the raw interaction log data stream is first read in real-time or in batches from the log storage cluster of the intelligent interaction system. This raw interaction log data stream consists of log records arranged in a time sequence. Each log record is parsed into a triplet data structure. The first element is a unique user identifier, a fixed-length string that has undergone hash-based anonymization, used to uniquely associate all of the user's historical behaviors without revealing their true identity. The second element is the original query text content submitted by the user, a natural language string without any preprocessing, recording the user's input for this interaction. The third element is the initial response strategy text content pushed by the interaction engine based on the original query text content. This is also a natural language string, representing the original decision output before the interpretive analysis of this method. The obtained raw interaction log data stream is temporarily stored in a distributed message queue.
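The triplet structure described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `InteractionLogRecord`, the helper names, the raw field keys (`user_id`, `query`, `response`), the salt, and the choice of SHA-256 as the anonymizing hash are all assumptions introduced for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionLogRecord:
    """One parsed log record (hypothetical field names)."""
    user_id_code: str   # anonymized, fixed-length identifier
    query_text: str     # raw query string, no preprocessing
    response_text: str  # initial response strategy text from the engine


def anonymize_user_id(raw_user_id: str, salt: str = "demo-salt") -> str:
    """Hash a raw user ID into a fixed-length hex string.

    SHA-256 with a salt is an assumption; the source only says the
    identifier is hash-anonymized and fixed-length.
    """
    return hashlib.sha256((salt + raw_user_id).encode("utf-8")).hexdigest()


def parse_log_record(raw: dict) -> InteractionLogRecord:
    """Turn one raw log entry into the triplet structure."""
    return InteractionLogRecord(
        user_id_code=anonymize_user_id(raw["user_id"]),
        query_text=raw["query"],
        response_text=raw["response"],
    )


record = parse_log_record(
    {"user_id": "alice", "query": "best hiking boots", "response": "Try model X."}
)
```

Because the hash is deterministic, every record from the same user maps to the same identifier code, which is what allows the later history lookup without storing the real identity.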

[0019] Step S120: Perform interaction element decoupling processing on the raw interaction log data stream, extract the user's long-term interest preference distribution vector from the historical interaction records associated with the user's unique identifier code, extract the query intent core semantic vector from the original query text content, and extract the response strategy content semantic vector from the initial response strategy text content.

[0020] Step S121: Retrieve the user's historical behavior database based on the user's unique identifier code, and retrieve all historical interaction record entries associated with the user's unique identifier code from the user's historical behavior database. The historical interaction record entries include historical query timestamps, historical query text content, historical clicked item identifiers, and historical dwell time parameters.

[0021] After obtaining the log records in step S110, the user's unique identifier code is parsed out. Using this unique identifier code as the query key, a retrieval operation is performed in the user's historical behavior database. This database stores the historical behavior sequence corresponding to each user identifier code. The retrieval result is a list, where each item is a historical interaction record entry. Each historical interaction record entry is a structured data object containing four fields: historical query timestamp, recording the precise time of the historical interaction; historical query text content, the query string submitted by the user at that time; historical clicked item identifier, the ID of the specific content item clicked by the user after the interaction; and historical dwell time parameter, recording the duration the user spent on the page or consumed content after the interaction.
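A minimal sketch of the four-field record entry and the keyed lookup, assuming an in-memory dictionary stands in for the user's historical behavior database. The class name, field names, time units, and sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HistoricalInteractionRecord:
    """One historical interaction record entry (hypothetical field names)."""
    query_timestamp: float     # assumed unit: Unix epoch seconds
    query_text: str            # query string submitted at that time
    clicked_item_id: str       # ID of the content item clicked afterwards
    dwell_time_seconds: float  # assumed unit: seconds


def retrieve_history(
    db: Dict[str, List[HistoricalInteractionRecord]], user_id_code: str
) -> List[HistoricalInteractionRecord]:
    """Return all entries stored under the identifier code, or an empty list."""
    return list(db.get(user_id_code, []))


history_db = {
    "a1b2": [
        HistoricalInteractionRecord(1_700_000_000.0, "trail shoes", "item_b", 42.5),
        HistoricalInteractionRecord(1_700_000_600.0, "rain jacket", "item_d", 12.0),
    ]
}
entries = retrieve_history(history_db, "a1b2")
```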

[0022] Step S122: Perform topic distribution modeling on the historical query text content in the historical interaction record entries to generate the historical query topic distribution probability vector corresponding to the user's unique identifier code; perform frequency statistical aggregation on the historical click item identifiers in the historical interaction record entries to generate the historical click preference frequency vector corresponding to the user's unique identifier code.

[0023] The historical query text content from all retrieved historical interaction records is aggregated into a text set. A topic modeling algorithm, such as Latent Dirichlet Allocation, is applied to this text set. This model treats each piece of historical query text content as a mixture of multiple latent topics and outputs a topic distribution probability vector. The dimension of the vector corresponds to the preset number of topics, denoted K_topic. The value of each dimension represents the probability that the user's historical queries, taken as a whole, belong to the corresponding topic. Simultaneously, the frequency of historical clicked item identifiers in all historical interaction records is counted. First, a vocabulary containing all possible clicked items is built, with a dimension of M_item. Then, all historical clicked item identifiers of the user are traversed, and the frequency of each identifier is counted, generating a histogram vector of dimension M_item, i.e., the historical click preference frequency vector, where the value of each dimension represents the total number of times the user clicked the corresponding item.
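The click-frequency half of this step can be sketched with the standard library. The topic-modeling half is omitted here because it depends on a trained model. The item vocabulary and click history are made-up sample data, and the function name is an assumption.

```python
from collections import Counter


def click_frequency_vector(clicked_ids, item_vocabulary):
    """Return an M_item-dimensional count vector ordered by item_vocabulary."""
    counts = Counter(clicked_ids)
    # Items the user never clicked get a count of 0.
    return [counts.get(item, 0) for item in item_vocabulary]


item_vocabulary = ["item_a", "item_b", "item_c", "item_d"]  # M_item = 4
history_clicks = ["item_b", "item_a", "item_b", "item_d", "item_b"]
freq_vector = click_frequency_vector(history_clicks, item_vocabulary)
# freq_vector == [1, 3, 0, 1]
```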

[0024] Step S123: Standardize the historical query topic distribution probability vector and the historical click preference frequency vector respectively to obtain the standardized historical query topic distribution vector and the standardized historical click preference distribution vector.

[0025] The elements of the historical query topic distribution probability vector are probability values that sum to 1, whereas the elements of the historical click preference frequency vector are counts that range from 0 to arbitrarily large values, so the two vectors differ in scale and numerical range. Standardization is necessary to remove the influence of these scale differences. For the historical query topic distribution probability vector, further scaling is usually unnecessary because its sum is 1; however, for consistency with subsequent steps, Z-score standardization can be performed: the mean and standard deviation of the topic probability values are calculated, and the mean is subtracted from each original value and the result divided by the standard deviation to obtain the standardized value. For the historical click preference frequency vector, Z-score standardization is likewise used. First, the mean μ_click and standard deviation σ_click of all frequency values in the vector are calculated. Then, for each item's frequency value f_i, the standardized click preference value is calculated as (f_i minus μ_click) divided by σ_click, yielding the standardized historical click preference distribution vector. After processing, both vectors have zero mean and unit variance, facilitating subsequent fusion calculations.
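The Z-score computation can be illustrated as below. The sketch assumes the population standard deviation (dividing by the vector length) and returns zeros for a constant vector; the source does not specify either detail, so both are assumptions, as is the function name.

```python
import math


def z_score(values, eps=1e-12):
    """Standardize a vector to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance
    std = math.sqrt(variance)
    if std < eps:
        # A constant vector carries no preference signal; avoid division by zero.
        return [0.0] * n
    return [(v - mean) / std for v in values]


# Sample click counts f_i for four items (made-up data).
standardized_clicks = z_score([1, 3, 0, 1])
```

The same function applies unchanged to the topic probability vector, which is what makes the two outputs directly comparable before concatenation.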

[0026] Step S124: Perform vector dimension concatenation and fusion processing on the standardized historical query topic distribution vector and the standardized historical click preference distribution vector to obtain the user long-term interest preference distribution vector corresponding to the user's unique identifier code.

[0027] The two standardized vectors obtained from the above steps are concatenated. Assuming the standardized historical query topic distribution vector has dimension K_topic and the standardized historical click preference distribution vector has dimension M_item, the concatenation operation merges along the feature dimensions to generate a new vector with dimensions K_topic plus M_item. This new vector is the user's long-term interest preference distribution vector, which comprehensively represents the user's long-term behavioral preferences in both query topics and clicked items. Each element of this user's long-term interest preference vector corresponds to the standardized preference strength for a specific topic or item.

[0028] Step S125: Perform syntactic structure parsing on the original query text content, identify the core query entity word units and core query intent word units in the original query text content, input the core query entity word units into the pre-trained first semantic encoding network for entity semantic vectorization mapping processing, and obtain the core query entity semantic vector.

[0029] Obtain the original query text content string from step S110. First, perform syntactic structure parsing on the string, for example, using a dependency parser to identify noun phrases as core query entity units and verb phrases or specific interrogative words as core query intent units. Combine all identified core query entity units into an ordered word sequence. Input this sequence into a pre-trained first semantic encoding network. This first semantic encoding network can be a Transformer-based encoder, such as an entity-aware variant of BERT. The input sequence is processed layer by layer through the network's embedding layer, multi-head self-attention layer, and feedforward neural network layer, ultimately encoding the entire entity sequence into a fixed-dimensional dense vector, i.e., the core query entity semantic vector, which captures the main objects or concepts involved in the user query.

[0030] Step S126: Input the core query intent word unit into the pre-trained second semantic encoding network for intent semantic vectorization mapping to obtain the core query intent semantic vector. Perform attention weighted fusion processing on the core query entity semantic vector and the core query intent semantic vector to output the query intent core semantic vector corresponding to the original query text content.

[0031] The core query intent word unit sequence identified in step S125 is input into a separate, pre-trained second semantic encoding network. This network architecture can be similar to the first semantic encoding network, but its pre-training task may focus more on intent classification, for example, training on a large amount of query-intent pair data. After encoding, a fixed-dimensional core query intent semantic vector is obtained. To fuse entity and intent information, an attention-weighted fusion mechanism is adopted. First, the attention weights are calculated with the core query intent semantic vector as the query and the core query entity semantic vector as the key and value. Specifically, a single-layer neural network calculates the relevance score between the intent vector and each entity vector position, and the scores are then normalized into weights using a Softmax function. Finally, these weights are used to compute a weighted sum over the parts of the core query entity semantic vector, yielding a weighted entity representation. This weighted entity representation is then concatenated with the original core query intent semantic vector to obtain the query intent core semantic vector, which contains fine-grained semantic information on both "what to do" and "to whom to do it".
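The fusion step can be sketched as follows, under two assumptions that are not in the source: a plain dot product stands in for the single-layer scoring network, and the entity representation is given as one small vector per entity position. The function names and sample vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable Softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(intent_vec, entity_vecs):
    """Weight entity positions by relevance to the intent, then concatenate."""
    # Dot-product relevance of the intent (query) to each entity position (key).
    scores = [sum(q * k for q, k in zip(intent_vec, ent)) for ent in entity_vecs]
    weights = softmax(scores)
    dim = len(entity_vecs[0])
    # Weighted sum over entity positions (values).
    weighted_entity = [
        sum(w * ent[d] for w, ent in zip(weights, entity_vecs)) for d in range(dim)
    ]
    # Concatenate the weighted entity representation with the intent vector.
    return weighted_entity + list(intent_vec), weights


intent = [1.0, 0.0]
entities = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = attention_fuse(intent, entities)
```

In this toy input the first entity position aligns with the intent vector, so it receives the larger weight.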

[0032] Step S127: Perform text component parsing on the initial response strategy text content, extract the strategy backbone description text fragments and strategy detail modification text fragments from the initial response strategy text content, input the strategy backbone description text fragments into a pre-trained backbone semantic encoding network for backbone semantic vectorization mapping processing, and obtain the strategy backbone semantic vector.

[0033] Obtain the initial response strategy text string from step S110. Perform component parsing on this text, for example, using rule-based template matching or sequence labeling models, to segment the text into two parts: the strategy backbone description text fragment and the strategy detail modification text fragment. The strategy backbone description is the core information of the response, such as a conclusion or an action instruction; the strategy detail modification provides supplementary explanations of the backbone, such as reasons, conditions, or examples. Input the strategy backbone description text fragment into a pre-trained backbone semantic encoding network. This backbone semantic encoding network can also be a Transformer-based encoder, outputting a fixed-dimensional strategy backbone semantic vector that captures the core decision content of the response.

[0034] Step S128: Input the text fragment of the strategy detail modification part into the pre-trained detail semantic encoding network for detail semantic vectorization mapping to obtain the strategy detail semantic vector. Perform gating fusion processing on the strategy backbone semantic vector and the strategy detail semantic vector to generate the response strategy content semantic vector corresponding to the initial response strategy text content.

[0035] The strategy detail modification text fragment extracted in step S127 is input into an independent pre-trained detail semantic encoding network, which outputs a strategy detail semantic vector. To fuse the backbone and detail information, a gating fusion mechanism is employed. First, a gating coefficient g is calculated by a single-layer neural network that takes the strategy backbone semantic vector and the strategy detail semantic vector as input and outputs a value between 0 and 1. Then, the fused vector equals g multiplied by the strategy backbone semantic vector plus (1 minus g) multiplied by the strategy detail semantic vector. In this way, the model can adaptively determine the relative importance of backbone and detail information in the current response, ultimately generating a response strategy content semantic vector that fully represents the initial response strategy text content.
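The gating formula, fused = g × backbone + (1 − g) × detail, can be sketched as below. The single-layer gate is modeled here as a sigmoid over a weighted sum of the concatenated inputs. That form, the function names, and the all-zero weights are illustrative assumptions; a real gate would use trained weights.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(backbone_vec, detail_vec, gate_weights, gate_bias=0.0):
    """Blend two equal-length vectors with a scalar gate g in (0, 1)."""
    gate_input = list(backbone_vec) + list(detail_vec)
    g = sigmoid(sum(w * x for w, x in zip(gate_weights, gate_input)) + gate_bias)
    fused = [g * b + (1.0 - g) * d for b, d in zip(backbone_vec, detail_vec)]
    return fused, g


backbone = [1.0, 0.0]
detail = [0.0, 1.0]
# All-zero gate weights give g = 0.5, i.e. an equal blend of both vectors.
fused, g = gated_fusion(backbone, detail, gate_weights=[0.0, 0.0, 0.0, 0.0])
```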

[0036] Step S129: Align the user's long-term interest preference distribution vector, the query intent core semantic vector, and the response strategy content semantic vector according to a preset vector dimension alignment rule, so that the user's long-term interest preference distribution vector, the query intent core semantic vector, and the response strategy content semantic vector have the same number of feature dimensions, and obtain the dimension-aligned user's long-term interest preference distribution vector, the dimension-aligned query intent core semantic vector, and the dimension-aligned response strategy content semantic vector.

[0037] Step S1210: Perform vector distribution normalization processing on the dimension-aligned user long-term interest preference distribution vector, query intent core semantic vector, and response strategy content semantic vector to obtain normalized user long-term interest preference distribution vector, normalized query intent core semantic vector, and normalized response strategy content semantic vector, and combine them into an interaction triple element set. Assign a unique element identifier to each vector in the interaction triple element set, establish a mapping relationship between the user long-term interest preference distribution vector and the user unique identifier code, establish a mapping relationship between the query intent core semantic vector and the original query text content, and establish a mapping relationship between the response strategy content semantic vector and the initial response strategy text content.

[0038] The dimensions of the three vectors generated in the above steps may be inconsistent. For example, the user long-term interest preference distribution vector might have dimension K_topic + M_item, while the query intent core semantic vector and the response strategy content semantic vector might each have a fixed dimension such as H_dim. They therefore need to be aligned to the same dimensional space D for subsequent causal analysis. A linear transformation layer is used for alignment. For example, the user long-term interest preference distribution vector with dimension K_topic + M_item is multiplied by a transformation matrix of shape (K_topic + M_item, D) to obtain a D-dimensional aligned vector. A similar operation is performed on the query intent core semantic vector and the response strategy content semantic vector, using their respective transformation matrices to map them to the D-dimensional space, resulting in three dimension-aligned vectors. Next, the three dimension-aligned vectors undergo distribution normalization, typically using layer normalization. For each vector, the mean and variance of all its elements are calculated, and the mean is then subtracted from each element and the result divided by the standard deviation (the square root of the variance) to stabilize the distribution of the vector. This yields the final normalized user long-term interest preference distribution vector, the normalized query intent core semantic vector, and the normalized response strategy content semantic vector. Finally, these three vectors are combined into an interaction triplet element set, and a globally unique element identifier, such as a UUID string, is generated for each vector in the set. Simultaneously, mapping relationships are established in the metadata: the user long-term interest preference distribution vector is associated with the user unique identifier code from step S110; the query intent core semantic vector is associated with the original query text content; and the response strategy content semantic vector is associated with the initial response strategy text content. These mapping relationships are stored in a relational mapping table for subsequent tracing and interpretation.
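The alignment and normalization of one vector can be sketched as follows. The transformation matrix values, the small `eps` stabilizer, and the helper names are assumptions for illustration; in practice the matrix would be learned, and the layer normalization here has no learned scale or shift.

```python
import math
import uuid


def project(vec, matrix):
    """Multiply a length-n vector by an (n, D) matrix given as a list of rows."""
    d_out = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec))) for j in range(d_out)]


def layer_norm(vec, eps=1e-5):
    """Subtract the mean and divide by the standard deviation of the elements."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


user_vec = [0.5, -1.0, 2.0]                     # K_topic + M_item = 3
W_user = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # shape (3, D) with D = 2
aligned = project(user_vec, W_user)             # [2.5, 1.0]
normalized = layer_norm(aligned)

# Globally unique element identifier for the metadata mapping table.
element_id = str(uuid.uuid4())
mapping_row = {"element_id": element_id, "source": "user_id_code"}
```

The query intent and response strategy vectors would go through the same two functions with their own matrices, so all three end up in the same D-dimensional space.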

[0039] Step S130: The user's long-term interest preference distribution vector and the query intent core semantic vector are used as the set of antecedent variables, and the response strategy content semantic vector is used as the consequence variable. They are input into the counterfactual causal reasoning framework for causal path search processing to generate a first causal path edge starting from the user's long-term interest preference distribution vector and a second causal path edge starting from the query intent core semantic vector, so as to construct an initial causal graph topology.

[0040] Step S131: Combine the user's long-term interest preference distribution vector and the query intent core semantic vector to form an antecedent variable matrix, and use the response strategy content semantic vector as a consequence variable vector. Input the antecedent variable matrix and the consequence variable vector into the input interface of the counterfactual causal reasoning framework. Perform data type validation and dimension matching processing on the antecedent variable matrix and the consequence variable vector through the input interface, and output the validated antecedent variable matrix and consequence variable vector.

[0041] From the set of interaction triplet elements obtained in step S1210, extract the normalized user long-term interest preference distribution vector, denoted as vector U, and the normalized query intent core semantic vector, denoted as vector Q. Combine these two vectors as column vectors into a matrix with the number of rows equal to the number of samples (here, 1, representing one interaction) and the number of columns equal to 2. This matrix is the antecedent variable matrix X. The normalized response strategy content semantic vector, denoted as vector R, is used as the consequence variable vector Y. Input matrix X and vector Y into the input interface of the counterfactual causal reasoning framework. This input interface first performs data type validation to confirm that the elements of X and Y are all floating-point data and do not contain null values. Then, it performs dimension matching validation to confirm that the number of rows in X is equal to the length of Y, and that the feature dimension D in X and Y both conform to the framework's preset input dimension. After the validation passes, the validated X and Y are passed to the core processing module of the framework.
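A sketch of the type and dimension checks follows. It assumes X is represented as a list holding the two D-dimensional vectors U and Q and that Y is the D-dimensional vector R, which is one possible reading of the matrix layout described above. The function name and error messages are hypothetical.

```python
import math


def validation_errors(X, Y, expected_dim):
    """Return a list of problems found; an empty list means validation passed."""
    errors = []
    named_vectors = [(f"X[{i}]", vec) for i, vec in enumerate(X)] + [("Y", Y)]
    for name, vec in named_vectors:
        # Dimension matching: every vector must have the preset feature dimension.
        if len(vec) != expected_dim:
            errors.append(f"{name}: expected {expected_dim} features, got {len(vec)}")
        # Data type validation: floats only, no null or NaN values.
        for value in vec:
            if not isinstance(value, float):
                errors.append(f"{name}: non-float element {value!r}")
            elif math.isnan(value):
                errors.append(f"{name}: NaN element")
    return errors


U = [0.2, -1.1, 0.9]
Q = [1.0, 0.0, -1.0]
R = [0.3, 0.3, -0.6]
ok = validation_errors([U, Q], R, expected_dim=3)
bad = validation_errors([[0.2, None, float("nan")], Q], R, expected_dim=3)
```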

[0042] Step S132: Invoke the conditional independence test module of the counterfactual causal reasoning framework to perform conditional independence tests on the antecedent variable matrix and the consequent variable vector, calculate the first conditional mutual information value between the user's long-term interest preference distribution vector and the response strategy content semantic vector under the given query intent core semantic vector, calculate the second conditional mutual information value between the query intent core semantic vector and the response strategy content semantic vector under the given user's long-term interest preference distribution vector, and calculate the third conditional mutual information value between the user's long-term interest preference distribution vector and the query intent core semantic vector under the given response strategy content semantic vector.

[0043] The counterfactual causal reasoning framework invokes its built-in conditional independence testing module, whose core is the calculation of conditional mutual information. The first conditional mutual information value, I(U, R|Q), is the KL divergence between the joint distribution of U and R given Q and the product of their marginal distributions given Q, averaged over Q. In implementation, an entropy estimator based on k-nearest neighbors can be used. Specifically, the joint data points of U, R, and Q are embedded into a high-dimensional space. For each point, its k-th nearest neighbor in the joint space is found, and the number of neighbors in each subspace of U, R, and Q is counted. The value of the conditional mutual information is then estimated from these neighbor counts. The second and third conditional mutual information values, I(Q, R|U) and I(U, Q|R), are calculated using exactly the same method. These three calculations are performed independently, ultimately yielding three scalar values, denoted as CMI_UR_Q, CMI_QR_U, and CMI_UQ_R, respectively.
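To make the quantity concrete, the sketch below estimates conditional mutual information for discrete samples with a simple plug-in (frequency-count) estimator. This is deliberately not the k-nearest-neighbor estimator mentioned above, which targets continuous vectors; it is a swapped-in, simpler technique that computes the same definition. The function name and toy samples are assumptions.

```python
import math
from collections import Counter


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X, Y | Z) in nats from a list of (x, y, z) tuples."""
    n = len(samples)
    c_xyz = Counter(samples)
    c_xz = Counter((x, z) for x, _, z in samples)
    c_yz = Counter((y, z) for _, y, z in samples)
    c_z = Counter(z for _, _, z in samples)
    cmi = 0.0
    for (x, y, z), count in c_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ); the 1/n factors
        # inside the logarithm cancel, so raw counts can be used there.
        cmi += (count / n) * math.log(c_z[z] * count / (c_xz[(x, z)] * c_yz[(y, z)]))
    return cmi


# Y copies X regardless of Z: X still informs Y once Z is known.
dependent = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
# X and Y vary independently within each value of Z.
independent = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

cmi_dependent = conditional_mutual_information(dependent)      # log(2)
cmi_independent = conditional_mutual_information(independent)  # 0.0
```

A value near zero corresponds to conditional independence, which is the property the threshold comparison in the following steps relies on.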

[0044] Step S133: Compare the first conditional mutual information value, the second conditional mutual information value, and the third conditional mutual information value with a preset unified conditional independence threshold respectively; when the first conditional mutual information value exceeds the preset unified conditional independence threshold, determine that there is a direct causal relationship between the user's long-term interest preference distribution vector and the response strategy content semantic vector, and generate a first causal path edge candidate identifier with the user's long-term interest preference distribution vector as the starting point and the response strategy content semantic vector as the ending point.

[0045] The framework predefines a conditional independence threshold parameter T_CI, which is a decimal greater than 0, used to determine whether the conditional dependency is significant. The CMI_UR_Q calculated in step S132 is compared with T_CI. If CMI_UR_Q is greater than T_CI, it indicates that even with the known query intent Q, user interest U still provides additional information to the response R, thus suggesting a direct causal relationship between U and R. At this point, a first causal path edge candidate identifier is generated. This first causal path edge candidate identifier is a data structure containing the starting node ID (i.e., the feature identifier of U), the ending node ID (i.e., the feature identifier of R), and the associated CMI value.

[0046] Step S134: When the value of the second conditional mutual information exceeds the preset unified conditional independence threshold, it is determined that there is a direct causal relationship between the core semantic vector of the query intent and the semantic vector of the response strategy content, and a second causal path edge candidate identifier is generated with the core semantic vector of the query intent as the starting point and the semantic vector of the response strategy content as the ending point.

[0047] Similarly, CMI_QR_U is compared with T_CI. If CMI_QR_U is greater than T_CI, it indicates that, given the user's interest U, the query intent Q still has a direct impact on the response R, thus determining that there is a direct causal relationship between Q and R. The system generates a second causal path edge candidate identifier, with its starting point being the feature identifier of Q and its ending point being the feature identifier of R.

[0048] Step S135: When the third conditional mutual information value exceeds the preset unified conditional independence threshold, it is determined that there is a covariant correlation between the user's long-term interest preference distribution vector and the query intent core semantic vector, and a covariant path edge candidate identifier between the user's long-term interest preference distribution vector and the query intent core semantic vector is generated.

[0049] Compare CMI_UQ_R with T_CI. If CMI_UQ_R is greater than T_CI, it indicates that a dependency still exists between U and Q given the known response R. This usually implies an unobserved confounding factor between U and Q, or that they are inherently correlated; such a correlation is called a covariant correlation. The system generates a candidate identifier for a covariant path edge connecting nodes U and Q. This edge does not represent a direct causal relationship, but rather an undirected correlation or confounding association.
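
By way of example only, the threshold comparisons of steps S133 to S135 may be sketched as follows. The `EdgeCandidate` structure and its field names are hypothetical; the specification only requires that a candidate identifier carry a starting node ID, an ending node ID, and the associated CMI value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeCandidate:
    start: str      # feature identifier of the starting node
    end: str        # feature identifier of the ending node
    cmi: float      # conditional mutual information that triggered the candidate
    directed: bool  # True for a causal edge, False for a covariant edge


def candidate_edges(cmi_ur_q, cmi_qr_u, cmi_uq_r, t_ci):
    """Compare each CMI value with the unified threshold T_CI."""
    candidates = []
    if cmi_ur_q > t_ci:  # S133: first causal path edge candidate, U -> R
        candidates.append(EdgeCandidate("U", "R", cmi_ur_q, True))
    if cmi_qr_u > t_ci:  # S134: second causal path edge candidate, Q -> R
        candidates.append(EdgeCandidate("Q", "R", cmi_qr_u, True))
    if cmi_uq_r > t_ci:  # S135: covariant (undirected) edge candidate, U -- Q
        candidates.append(EdgeCandidate("U", "Q", cmi_uq_r, False))
    return candidates
```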

[0050] Step S136: Perform path edge confidence assessment on the first causal path edge candidate identifier, the second causal path edge candidate identifier, and the covariant path edge candidate identifier. Use a bootstrap sampling method to perform multiple resampling processes on the antecedent variable matrix and the consequent variable vector. Repeat the conditional independence test based on the data subset obtained from each resampling. Calculate the frequency percentage of the first causal path edge candidate identifier in the multiple resampling processes as the first path edge confidence score, and calculate the frequency percentage of the second causal path edge candidate identifier in the multiple resampling processes as the second path edge confidence score.

[0051] To improve the robustness of causal discovery, confidence assessment is required. A bootstrap sampling method is used to resample the original data X and Y. Since the current processing concerns a single interaction, the framework maintains a sliding window dataset containing a large number of historical interaction samples for bootstrap sampling. The (U, Q, R) triple of the current interaction is added to this window, forming a dataset containing N samples. N samples are randomly drawn with replacement from these N samples to form a new bootstrap sample set. This process is repeated B times, resulting in B bootstrap sample sets. For each bootstrap sample set, the conditional independence tests in steps S132 to S135 are re-executed. The number of times, b1, that the first causal path edge candidate (i.e., the edge from U to R) is generated across the B tests is counted, and its confidence score is calculated as b1 divided by B. Similarly, the confidence score of the second causal path edge candidate is calculated as b2 divided by B, where b2 is the number of times the second candidate is generated.
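
The resampling procedure above may be sketched as follows, purely as an illustration. The function `bootstrap_confidence` accepts any edge detector as a parameter; `toy_detector` is a hypothetical stand-in for the conditional independence tests of steps S132 to S135 and is not the test described in this application.

```python
import random


def bootstrap_confidence(window, detect_edges, num_resamples=200, seed=0):
    """Return {edge: b / B}, the fraction of bootstrap sets in which each edge appears.

    window        -- list of N (U, Q, R) samples from the sliding window
    detect_edges  -- callable mapping a sample list to a set of edge keys
    num_resamples -- B, the number of bootstrap sample sets
    """
    rng = random.Random(seed)
    n = len(window)
    counts = {}
    for _ in range(num_resamples):
        # Draw N samples with replacement from the N-sample window.
        resample = [window[rng.randrange(n)] for _ in range(n)]
        for edge in detect_edges(resample):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / num_resamples for edge, c in counts.items()}


def toy_detector(sample):
    """Hypothetical detector: reports U -> R when the mean product of u and r is positive."""
    mean_ur = sum(u * r for u, _q, r in sample) / len(sample)
    return {("U", "R")} if mean_ur > 0 else set()
```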

[0052] Step S137: When the confidence score of the first path edge exceeds the preset path edge confidence threshold, the first causal path edge candidate identifier is converted into a formal first causal path edge identifier. When the confidence score of the second path edge exceeds the preset path edge confidence threshold, the second causal path edge candidate identifier is converted into a formal second causal path edge identifier. An initial causal graph topology is constructed based on the formal first causal path edge identifier and the formal second causal path edge identifier.

[0053] The framework predefines a path edge confidence threshold T_conf, for example, 0.8. The confidence score of the first path edge is compared with T_conf. If b1 divided by B is greater than T_conf, the candidate edge is "converted" into a formal first causal path edge identifier, which includes the edge's unique ID, start point, end point, average CMI value, and confidence score. The second causal path edge is processed similarly. If the confidence score of a covariant path edge also exceeds the threshold, it is also retained as an undirected edge. Finally, using U, Q, and R as nodes, and these converted path edges as directed or undirected edges, an initial causal graph topology containing three nodes and several edges is constructed. This initial causal graph topology is typically stored in memory as an adjacency list or edge list.
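
A minimal sketch of the promotion step and of the resulting edge list and adjacency list is given below. The dictionary layout, the key names, and the edge ID format are assumptions made for illustration.

```python
def build_initial_graph(candidates, confidence, t_conf=0.8):
    """Promote candidates whose confidence exceeds T_conf and build the topology.

    candidates -- list of dicts with keys "start", "end", "directed", "mean_cmi"
    confidence -- dict mapping (start, end) to the bootstrap confidence score
    """
    edges = []
    for cand in candidates:
        score = confidence.get((cand["start"], cand["end"]), 0.0)
        if score > t_conf:
            arrow = "->" if cand["directed"] else "--"
            edge_id = f"{cand['start']}{arrow}{cand['end']}"
            edges.append({"id": edge_id, **cand, "confidence": score})
    adjacency = {node: [] for node in ("U", "Q", "R")}
    for edge in edges:
        adjacency[edge["start"]].append(edge["end"])
        if not edge["directed"]:
            # Undirected (covariant) edges are listed from both endpoints.
            adjacency[edge["end"]].append(edge["start"])
    return {"nodes": ["U", "Q", "R"], "edges": edges, "adjacency": adjacency}
```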

[0054] Step S140: Perform path blocking intervention simulation processing on the first causal path edge, observe the change trajectory of the conditional probability distribution of the response strategy content semantic vector when the user's long-term interest preference distribution vector takes different intervention values, and calculate the first intervention effect measure value of the user's long-term interest preference distribution vector on the response strategy content semantic vector based on the change trajectory of the conditional probability distribution.

[0055] Step S141: Obtain the original distribution range of the user's long-term interest preference distribution vector, divide the original distribution range into multiple discrete intervention intervals according to the equal interval division method, assign an intervention value label to each discrete intervention interval, and generate an intervention value label set containing multiple intervention value labels.

[0056] To simulate intervention, the intervention values must first be defined. From a sliding window dataset containing a large number of historical interaction samples, the user's long-term interest preference distribution vector U of every historical interaction is extracted, forming a sample set of U. For each dimension d of U, its minimum value u_min_d and maximum value u_max_d over the entire sample set are calculated. Since U is a D-dimensional vector, its distribution range is a D-dimensional hyperrectangle. To simplify the intervention simulation, one or several key dimensions of U are usually selected for intervention, or the entire vector is treated as a whole for intervention. Here, a global intervention approach is adopted: the Euclidean distance between each historical U vector and the center point of the sample set (e.g., the mean vector) is calculated, resulting in a sequence of distance values. The range of these distance values, [min_dist, max_dist], constitutes the original distribution range of U. This distance interval is divided into K equal intervals, and the center point or another representative value of each interval is selected as an intervention value. Each intervention value is assigned an intervention value label, for example, from 1 to K, forming an intervention value label set S_intervene.
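
For illustration, the global intervention grid described above may be computed as in the following sketch, which assumes the mean vector as the center point and the interval midpoint as the representative value. The function name `intervention_values` is hypothetical.

```python
import math


def intervention_values(u_vectors, k):
    """Map labels 1..K to midpoints of K equal intervals over [min_dist, max_dist]."""
    n = len(u_vectors)
    dim = len(u_vectors[0])
    # Center point: the mean vector of all historical U vectors.
    center = tuple(sum(v[d] for v in u_vectors) / n for d in range(dim))
    dists = [math.dist(tuple(v), center) for v in u_vectors]
    lo, hi = min(dists), max(dists)
    width = (hi - lo) / k
    return {label: lo + (label - 0.5) * width for label in range(1, k + 1)}
```

The returned dictionary plays the role of S_intervene together with the intervention value d_k attached to each label.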

[0057] Step S142: Select each intervention value label from the set of intervention value labels in sequence as the current intervention value label, force the vector value of the user's long-term interest preference distribution vector to be the intervention value corresponding to the current intervention value label, maintain the natural change of the vector value of the query intent core semantic vector, and construct the post-intervention data distribution environment when the user's long-term interest preference distribution vector takes the current intervention value.

[0058] An intervention value label k is selected from S_intervene, and its corresponding intervention value is a distance value d_k. To achieve the intervention of "forcing U to a certain value" in the data, it is not enough to simply change a single data point; a post-intervention environment needs to be constructed. First, all samples whose distance from the U vector to the center point is closest to d_k are selected from the historical dataset. Then, these samples are replicated and slightly perturbed to generate a synthetic dataset with a sufficiently large number of U values concentrated around d_k. In this dataset, the value of U is fixed near d_k, while Q retains its original natural variation. This constructs a post-intervention data distribution environment with "U equals d_k" as the intervention condition.
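
One possible way of constructing such an environment is sketched below. The number of nearest samples, the number of copies, the perturbation scale, and the choice to perturb only U with Gaussian noise are all assumptions; the specification does not fix them.

```python
import math
import random


def build_post_intervention_env(samples, center, d_k,
                                n_nearest=50, copies=4, noise=0.01, seed=0):
    """Build a synthetic dataset whose U values concentrate around distance d_k.

    samples -- list of (u, q, r) triples, each element a tuple of floats
    center  -- center point of the historical U vectors (a tuple)
    """
    rng = random.Random(seed)
    # Rank historical samples by how close their U-to-center distance is to d_k.
    ranked = sorted(samples, key=lambda s: abs(math.dist(s[0], center) - d_k))
    env = []
    for u, q, r in ranked[:n_nearest]:
        for _ in range(copies):
            # Replicate with a slight perturbation of U; Q and R are kept as observed.
            u_new = tuple(x + rng.gauss(0, noise) for x in u)
            env.append((u_new, q, r))
    return env
```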

[0059] Step S143: In the post-intervention data distribution environment, collect multiple observation sample values of the semantic vector of the response strategy content, perform probability density estimation processing on the multiple observation sample values, and generate the conditional probability density function of the semantic vector of the response strategy content under the condition that the user's long-term interest preference distribution vector takes the current intervention value.

[0060] In the post-intervention data distribution environment constructed in step S142, a large number of R values can be observed. Since U in these data is fixed while Q varies naturally, the distribution of R reflects P(R|do(U=d_k)), i.e., the distribution of R after intervening on U. The R vectors of all samples are collected from this environment, resulting in a sample set of R. Probability density estimation is performed on this multidimensional R sample set, for example, using kernel density estimation. Either a one-dimensional kernel density estimate can be fitted to each dimension of R, or a multi-dimensional kernel density estimate can be used to estimate the joint distribution. Finally, a conditional probability density function f_k(r) describing P(R|do(U=d_k)) is generated.
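
The per-dimension option may be sketched as follows. This example assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and a product of one-dimensional marginals as a simplified stand-in for the joint density; none of these choices is mandated by the specification, and the function names are hypothetical.

```python
import math
import statistics


def gaussian_kde_1d(values, bandwidth=None):
    """Return a one-dimensional Gaussian kernel density estimate as a callable."""
    n = len(values)
    if bandwidth is None:
        # Normal-reference rule of thumb: 1.06 * sigma * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


def fit_f_k(r_samples):
    """Approximate f_k(r) as a product of per-dimension KDEs (independence assumed)."""
    marginals = [gaussian_kde_1d(list(column)) for column in zip(*r_samples)]

    def f_k(r):
        return math.prod(m(x) for m, x in zip(marginals, r))

    return f_k
```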

[0061] Step S144: Repeat the steps of selecting the intervention value label, constructing the post-intervention data distribution environment, and generating the conditional probability density function until all intervention value labels in the intervention value label set are traversed to obtain the sequence of conditional probability density functions of the semantic vector of the response strategy content under the condition that the user's long-term interest preference distribution vector takes each intervention value.

[0062] For each label in the set of intervention value labels S_intervene, from 1 to K, steps S142 and S143 are repeated sequentially. Each execution yields a conditional probability density function f_k(r) corresponding to the intervention value d_k. After all executions are completed, a sequence of conditional probability density functions F_seq=[f_1(r), f_2(r), ..., f_K(r)] is obtained. This sequence fully describes how the distribution of R changes when U is forcibly set at different levels.

[0063] Step S145: Perform numerical integration on the conditional probability density function sequence, calculate the expected value of the response strategy content semantic vector under the condition that the user's long-term interest preference distribution vector takes various intervention values, and obtain the trajectory curve of the expected value of the response strategy content semantic vector changing with the intervention value of the user's long-term interest preference distribution vector.

[0064] For each conditional probability density function f_k(r), its expected value E[R|do(U=d_k)] is calculated. Since R is a multidimensional vector, the expected value is also a multidimensional vector. The calculation method is to perform a multidimensional integration of f_k(r): over the entire domain of r, calculate the integral of r multiplied by f_k(r). In practical implementation, this is approximated using the Monte Carlo method: M samples of R are drawn from the distribution represented by f_k(r), and the average of these M samples is taken as the approximate expected vector. This operation is performed on each f_k in the sequence F_seq, resulting in K expected vectors E_k. Arranging these expected vectors in ascending order of the intervention value d_k yields an expected value change trajectory curve. The horizontal axis of this trajectory curve represents the intervention value d (a scalar, representing the distance of U from the center point), and the vertical axis represents the expected vector E_k (a vector with the same dimension as R). This trajectory curve visually demonstrates the average causal effect of the intervention on U on R.
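
The Monte Carlo approximation may be sketched as follows, assuming that f_k is a Gaussian kernel density estimate with a single shared bandwidth. Under that assumption, drawing from f_k amounts to picking a stored R sample at random and adding kernel noise. The function names and default values are hypothetical.

```python
import random


def sample_from_kde(r_samples, bandwidth, rng):
    """Draw one point from a Gaussian KDE: random data point plus kernel noise."""
    base = r_samples[rng.randrange(len(r_samples))]
    return tuple(x + rng.gauss(0, bandwidth) for x in base)


def monte_carlo_expectation(r_samples, bandwidth, m=5000, seed=0):
    """Approximate E[R | do(U = d_k)] as the mean of M draws from the KDE."""
    rng = random.Random(seed)
    dim = len(r_samples[0])
    totals = [0.0] * dim
    for _ in range(m):
        draw = sample_from_kde(r_samples, bandwidth, rng)
        for d in range(dim):
            totals[d] += draw[d]
    return tuple(t / m for t in totals)
```

Applying `monte_carlo_expectation` to the R samples of each of the K environments, in ascending order of d_k, gives the points (d_k, E_k) of the trajectory curve.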

[0065] Step S146: Perform first-order difference processing on the expected value change trajectory curve, calculate the difference between the expected values corresponding to adjacent intervention values, take the ratio of the difference to the interval distance between adjacent intervention values as the local intervention effect value, perform weighted average processing on the local intervention effect value, and generate the first intervention effect measure value of the user's long-term interest preference distribution vector to the response strategy content semantic vector.

[0066] To obtain an overall measure of the intervention effect, the trajectory curve is further analyzed. First, for two adjacent points (d_k, E_k) and (d_{k+1}, E_{k+1}) on the trajectory, the difference in the expected vector ΔE_k = E_{k+1} - E_k is calculated. This is a multidimensional vector. Simultaneously, the interval of the intervention value Δd_k = d_{k+1} - d_k is calculated. Then, the local intervention effect vector in interval k is ΔE_k divided by Δd_k. This represents the average rate of change of R when U changes by one unit near d_k. Since these local effect vectors are multidimensional, each needs to be reduced to a scalar local effect s_k. This can be obtained by calculating the L2 norm of the local effect vector ΔE_k / Δd_k, or by calculating its dot product with a reference direction (such as the global average direction). Then, for all K-1 intervals, a weighted average of the local intervention effects is calculated. The weight w_k can be the interval length Δd_k, or the density of data points within that interval. Ultimately, the first intervention effect measure, ATE_U_R, is equal to the sum of the products of all interval local effects s_k and their weights w_k, divided by the sum of the weights. This final scalar value comprehensively measures the average causal effect strength of the user's long-term interest preference U on the response strategy content R.
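
As an illustrative sketch, the first-order difference and weighted average may be implemented as follows, using the L2 norm as the scalar reduction and the interval length as the default weight. The function name `intervention_effect` is hypothetical.

```python
import math


def intervention_effect(d_values, expected_vectors, weights=None):
    """Weighted average of local effects s_k = ||E_{k+1} - E_k|| / (d_{k+1} - d_k).

    d_values         -- intervention values in ascending order (length K)
    expected_vectors -- expected vectors E_k aligned with d_values (length K)
    weights          -- optional weights w_k for the K-1 intervals
    """
    local_effects = []
    gaps = []
    for k in range(len(d_values) - 1):
        gap = d_values[k + 1] - d_values[k]
        delta = [b - a for a, b in zip(expected_vectors[k], expected_vectors[k + 1])]
        local_effects.append(math.sqrt(sum(x * x for x in delta)) / gap)
        gaps.append(gap)
    if weights is None:
        weights = gaps  # default: weight each interval by its length
    return sum(w * s for w, s in zip(weights, local_effects)) / sum(weights)
```

For instance, with d = [1, 2, 4] and E = [(0, 0), (3, 4), (3, 10)], the local effects are 5 and 3, the weights are 1 and 2, and the measure is (1 × 5 + 2 × 3) / 3 = 11/3.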

[0067] Step S150: Based on the first intervention effect measurement value, perform contribution tracing and sorting processing on the initial response strategy text content, select response strategy text units whose first intervention effect measurement value exceeds the preset contribution threshold as interpretable response content, fuse the interpretable response content with the path identification information of the first causal path edge to generate an interactive decision explanation feedback data stream, and push the interactive decision explanation feedback data stream to the user interaction interface for visualization rendering and display.

[0068] Step S151: Perform text segmentation processing on the initial response strategy text content, divide the initial response strategy text content into multiple response strategy text units according to semantic integrity, assign a unit identifier to each response strategy text unit, and extract the local response strategy content semantic vector corresponding to each response strategy text unit.

[0069] Obtain the original initial response strategy text content string from step S110. Using semantic segmentation techniques, such as a BERT-based sentence boundary detection model, the long text is segmented into several short sentences or phrases with independent semantics, each serving as a response strategy text unit. A unit identifier is generated for each unit. Next, the semantic vector for each unit needs to be obtained. This can be achieved using the semantic detail encoding network from step S128, where each text unit is input separately to obtain its corresponding local response strategy content semantic vector. Assuming P units are segmented, P local vectors r_1, r_2, ..., r_P are obtained.

[0070] Step S152: Perform correlation analysis on the semantic vector of the local response strategy content corresponding to each response strategy text unit and the user's long-term interest preference distribution vector, calculate the path propagation contribution weight of each local response strategy content semantic vector on the first causal path edge, and multiply the path propagation contribution weight with the first intervention effect metric to obtain the first intervention effect contribution score corresponding to each response strategy text unit.

[0071] To calculate the contribution of each text unit to the overall effect, the overall intervention effect ATE_U_R (a scalar) needs to be decomposed into individual units. A gradient-based contribution allocation method is employed. First, the overall response vector R is viewed as a function of the individual local unit vectors r_i. For example, R can be obtained from r_i through some aggregation operation (such as averaging or weighted summation). Then, ATE_U_R can be seen as the indirect influence of U on R through the units r_i. According to the chain rule, the effect of U on R can be approximated by multiplying the effect of U on each r_i by the effect of r_i on R. The propagation contribution weight w_i of the local unit vector r_i on the causal path U->R is calculated. This weight can be obtained by calculating the derivative of R with respect to r_i; for example, if R is the average of the r_i, then the weight is 1 / P. In principle, the local intervention effect of U on each local unit r_i could then be calculated by repeating step S140 with the consequent variable R replaced by r_i, yielding ATE_U_ri. However, this is computationally intensive. Therefore, an approximation method is adopted: the overall ATE_U_R is multiplied by the weight w_i to obtain the contribution of this unit. That is, the first intervention effect contribution score c_i is equal to ATE_U_R multiplied by w_i.
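
The approximation c_i = ATE_U_R × w_i may be sketched as follows. The function name and the form of the inputs are assumptions; when no aggregation weights are supplied, the sketch falls back to the averaging case w_i = 1 / P mentioned above.

```python
def contribution_scores(ate_u_r, unit_ids, agg_weights=None):
    """Return {unit_id: c_i} with c_i = ATE_U_R * w_i.

    agg_weights -- weights w_i of each local vector r_i in the aggregation that
                   forms R; defaults to 1 / P for simple averaging
    """
    p = len(unit_ids)
    if agg_weights is None:
        agg_weights = [1.0 / p] * p
    return {uid: ate_u_r * w for uid, w in zip(unit_ids, agg_weights)}
```

Under simple averaging all units receive the same score, so a non-uniform aggregation (for example, weighted summation) is needed for the scores to discriminate between units.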

[0072] Step S153: Sort multiple response strategy text units in descending order according to the first intervention effect contribution score corresponding to each response strategy text unit, and generate a contribution ranking list containing the response strategy text unit identifier and its corresponding first intervention effect contribution score. Starting from the beginning of the contribution ranking list, extract the response strategy text units with the highest contribution ranking as a set of candidate interpretable response content.

[0073] The contribution scores c_i of all P units calculated in step S152 and their corresponding unit identifiers are compiled into a list. This list is then sorted in descending order of c_i values to obtain a contribution ranking list. Starting from the first element of the list, the first T units are extracted, where T is a preset number of interpretable units to be displayed. The extracted units and their related information constitute the candidate interpretable response content set.

[0074] Step S154: Select response strategy text units from the candidate explainable response content set whose contribution score of the first intervention effect exceeds the preset contribution threshold as formal explainable response content, assign an explainable content identifier to each formal explainable response content, and record the original position information of each formal explainable response content in the initial response strategy text content.

[0075] To ensure that the displayed units truly make a significant contribution, a contribution threshold T_contrib is set. Each unit in the candidate set is iterated through, and its contribution score c_i is compared with T_contrib. Only units with c_i greater than T_contrib are ultimately adopted as formally interpretable response content. For each adopted unit, a new interpretable content identifier is generated. Simultaneously, the start and end character positions of this unit in the original initial response strategy text are recorded for highlighting during visualization.
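
Steps S153 and S154 may be illustrated together by the following sketch. It assumes that each text unit is a verbatim substring of the initial response strategy text, so that its position can be located with `str.find`; the identifier format and the function name are hypothetical.

```python
def select_explainable_units(text, units, scores, top_t, t_contrib):
    """Rank units by c_i, keep the first T, then keep those above T_contrib.

    text   -- initial response strategy text content
    units  -- dict mapping unit identifier to the unit's text
    scores -- dict mapping unit identifier to contribution score c_i
    """
    ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    kept = [(uid, s) for uid, s in ranking[:top_t] if s > t_contrib]
    selected = []
    for n, (uid, score) in enumerate(kept, start=1):
        start = text.find(units[uid])  # assumes the unit occurs verbatim in text
        selected.append({
            "explain_id": f"X{n}",
            "unit_id": uid,
            "score": score,
            "start": start,
            "end": start + len(units[uid]),
        })
    return ranking, selected
```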

[0076] Step S155: Perform data encapsulation processing on the formal interpretable response content and the path identification information of the first causal path edge to generate a first interpretable data package containing interpretable response text data, the first causal path edge identifier and the original location information, and input the first interpretable data package into the visualization rendering engine.

[0077] For each formally interpretable response content unit, its text content, the first causal path edge identifier generated in step S137 (i.e., the ID of the U->R edge), and the original location information recorded in step S154 are encapsulated into a structured data object. All the above data objects are combined into a list, i.e., the first interpretability data package. This data package is then sent to the front-end visualization rendering engine.

[0078] Step S156: The first interpretable data packet is processed by the visualization rendering engine to map graphical elements, the formal interpretable response content is mapped to a highlighted text area, the first causal path edge identifier is mapped to an arrow line graphic pointing from the user's interest node to the response text area, an interactive decision explanation image frame containing causal path visualization elements is generated, and the interactive decision explanation image frame is pushed to the display buffer of the user interaction interface for real-time refresh display.

[0079] After receiving the data packet, the visualization rendering engine parses its content. First, based on the original location information, it highlights the corresponding text unit in the display area of the original response text, for example, by changing its background color or making the font bold. Then, it draws a node representing the "user's long-term interest" (e.g., a circular icon) on the canvas, and draws an arrow line from this node pointing to the highlighted text area. Path identifiers or contribution scores can be labeled next to the arrow line. This generates a complete interactive decision explanation image, clearly showing that "because of the user's long-term interest (U), a certain part (r_i) in the response became key content." Finally, the rendering engine pushes this image frame to the display buffer of the user interface, completing real-time refresh and display, allowing the user to intuitively understand the causal logic behind the decision.

[0080] Step S160: Perform path blocking intervention simulation processing on the second causal path edge, fix the vector value of the query intent core semantic vector to each discrete value within the reference distribution range, observe the conditional probability distribution change trajectory of the response strategy content semantic vector when the query intent core semantic vector takes each discrete value, and calculate the second intervention effect measure value of the query intent core semantic vector on the response strategy content semantic vector based on the conditional probability distribution change trajectory.

[0081] This step mirrors step S140, with the intervention target being Q instead of U. First, the original distribution range of the query intent core semantic vector Q is obtained. This requires collecting and analyzing all Q vectors from the historical dataset. As with U, the intervention can be applied to a key dimension or to the overall norm of Q. For example, the cosine similarity between each Q vector and a centroid (such as the mean vector) is calculated to obtain a sequence of similarity values, whose range serves as the basis for intervention. This range is then divided into multiple discrete intervention intervals, each corresponding to an intervention value (e.g., a specific cosine similarity value). For each intervention value, a post-intervention data distribution environment is constructed in which the Q value of all samples is forced to equal the intervention value, while U is not intervened on. In this environment, the distribution of R is observed, the conditional probability density is estimated, and the expected value is calculated. By analyzing the trajectory of the expected value of R as a function of the Q intervention value, the local effect of Q on R is calculated, and a weighted average is finally taken to obtain the second intervention effect measure, ATE_Q_R. This value quantifies the strength of the influence of the user's current query intent on the response strategy after excluding the interference of U.

[0082] Step S170: Compare and analyze the second intervention effect measure with the first intervention effect measure, and calculate the ratio of the second intervention effect measure to the first intervention effect measure as the relative importance index of the causal path.

[0083] Compare the ATE_Q_R calculated in step S160 with the ATE_U_R calculated in step S140. Calculate the ratio between the two; for example, RI_Q_over_U equals ATE_Q_R divided by ATE_U_R. This ratio RI is the relative importance index of the causal path. If RI is much greater than 1, it means that in the current interaction, the query intent has a much greater impact than the user's long-term interest; if RI is much less than 1, it means that the user's long-term interest plays a dominant role; if RI is close to 1, the two have roughly equal influence.

[0084] Step S180: Perform path edge weighting processing on the first causal path edge and the second causal path edge in the initial causal graph topology according to the relative importance index of the causal path, assigning a first path edge weight value to the first causal path edge and a second path edge weight value to the second causal path edge.

[0085] The relative importance index RI calculated in step S170 is equal to ATE_Q_R divided by ATE_U_R. However, for weight allocation, the two intervention effect measures need to be normalized. Specifically, the first path edge weight value W_UR is calculated as W_UR = ATE_U_R / (ATE_U_R + ATE_Q_R). The second path edge weight value W_QR is calculated as W_QR = ATE_Q_R / (ATE_U_R + ATE_Q_R). The two weight values calculated by the above formulas satisfy W_UR + W_QR = 1, representing the relative contribution ratios of the two causal paths in generating the response. These weight values are then added as new attributes to the first and second causal path edges in the initial causal graph topology generated in step S137.
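
The ratio of step S170 and the normalization of step S180 may be sketched as follows. The handling of a zero denominator is an assumption added for robustness and is not addressed in the specification.

```python
def relative_importance(ate_q_r, ate_u_r):
    """RI = ATE_Q_R / ATE_U_R; values above 1 mean the query intent dominates."""
    if ate_u_r == 0:
        raise ValueError("ATE_U_R is zero; the ratio is undefined")
    return ate_q_r / ate_u_r


def path_edge_weights(ate_u_r, ate_q_r):
    """Return (W_UR, W_QR), normalized so that W_UR + W_QR = 1."""
    total = ate_u_r + ate_q_r
    if total == 0:
        raise ValueError("both intervention effects are zero")
    return ate_u_r / total, ate_q_r / total
```

For example, ATE_U_R = 1.0 and ATE_Q_R = 3.0 give RI = 3.0, W_UR = 0.25, and W_QR = 0.75.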

[0086] Step S190: Perform dual contribution tracing and sorting processing on the initial response strategy text content according to the first path edge weight value and the second path edge weight value, and generate a first weighted contribution score corresponding to the first causal path edge and a second weighted contribution score corresponding to the second causal path edge for each response strategy text unit in the initial response strategy text content.

[0087] For each response strategy text unit, its corresponding local response strategy content semantic vector is denoted as r_i. The formula for calculating the first weighted contribution score c_i_U of this response strategy text unit on the first causal path edge is c_i_U = ATE_U_R × w_i, where w_i is the aggregate weight of the local vector r_i with respect to the overall response vector R, which is obtained by calculating the derivative of R with respect to r_i. Similarly, the formula for calculating the second weighted contribution score c_i_Q of this unit on the second causal path edge is c_i_Q = ATE_Q_R × w_i. Ultimately, each unit r_i corresponds to two scores, c_i_U and c_i_Q, which quantify the independent contribution of the user's long-term interests and current query intent to this text unit, respectively.

[0088] Step S200: Perform a weighted summation and fusion process on the first weighted contribution score and the second weighted contribution score to obtain the comprehensive causal contribution score of each response strategy text unit. Perform a comprehensive sorting process on multiple response strategy text units according to the comprehensive causal contribution score to generate a comprehensive contribution ranking list.

[0089] The two contribution scores of each unit r_i in step S190 are weighted and fused according to the path edge weights determined in step S180. The formula for calculating the comprehensive contribution score c_i_total is c_i_total = W_UR × c_i_U + W_QR × c_i_Q, where W_UR and W_QR are the weight values of the first and second path edges, respectively. This comprehensive score combines the joint influence of the two antecedent variables on the text unit. Then, the c_i_total values of all units are sorted in descending order to generate a comprehensive contribution ranking list. This comprehensive contribution ranking list reflects the importance ranking of each part of the response after comprehensively considering the user's long-term interests and current query intent.
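
Steps S190 and S200 may be illustrated by the following sketch, which computes c_i_U, c_i_Q, and c_i_total for each unit and sorts the result. The function name and the dictionary-based inputs are assumptions.

```python
def comprehensive_ranking(agg_weights, ate_u_r, ate_q_r, w_ur, w_qr):
    """Return [(unit_id, c_i_total), ...] sorted in descending order.

    agg_weights -- dict mapping unit identifier to its aggregation weight w_i
    """
    totals = {}
    for unit_id, w_i in agg_weights.items():
        c_i_u = ate_u_r * w_i  # first weighted contribution score
        c_i_q = ate_q_r * w_i  # second weighted contribution score
        totals[unit_id] = w_ur * c_i_u + w_qr * c_i_q
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```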

[0090] Step S210: Perform time window segmentation processing on the original interaction log data stream. Based on the timestamp information of each interaction event in the original interaction log data stream, divide the original interaction log data stream into multiple consecutive time window data blocks according to a preset time window length.

[0091] While acquiring the raw interaction log data stream in step S110, traffic analysis can be performed in parallel. The timestamp information, implicit or explicit, carried in each log record is parsed. A time window length parameter T_win is preset, for example, set to 10 minutes. Then, based on its timestamp, each log record is assigned to the corresponding time window data block; the blocks are consecutive and non-overlapping. For example, logs with timestamps from T0 to T0+T_win-1 belong to the first window, those from T0+T_win to T0+2×T_win-1 belong to the second window, and so on. Each time window data block contains all interaction logs that occurred within that time period.
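
A minimal sketch of the window assignment is given below, assuming integer timestamps (for example, in seconds) stored under a hypothetical "ts" field of each log record.

```python
from collections import defaultdict


def split_into_windows(records, t_win, t0=None):
    """Group log records into consecutive, non-overlapping windows of length T_win.

    Window index 0 covers [T0, T0 + T_win - 1], index 1 covers the next T_win, etc.
    """
    if t0 is None:
        t0 = min(rec["ts"] for rec in records)
    windows = defaultdict(list)
    for rec in records:
        windows[(rec["ts"] - t0) // t_win].append(rec)
    return dict(windows)
```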

[0092] Step S220: Perform deduplication and statistical processing on the unique user identifier codes in each time window data block, count the number of different unique user identifier codes appearing in each time window data block, and generate the active user count statistics for each time window data block.

[0093] For each time window data block, extract the user unique identifier code field of all log records. Perform a set deduplication operation on the above codes, that is, keep only the first occurrence of the code and remove duplicates. Then, count the total number of deduplicated codes, and this value is the statistical value of the number of active users in that time window. Perform this operation on all time windows to obtain a time series U_active[t], where t represents the window index.

[0094] Step S230: Perform query frequency statistics on the original query text content within each time window data block, count the total number of times the original query text content appears within each time window data block, and generate the total query statistics value corresponding to each time window data block.

[0095] For each time window data block, count all log records within it without deduplication. This count result is the total query volume statistic for that time window. Perform this operation on all time windows to obtain a time series Q_total[t].
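
Steps S220 and S230 reduce to a set-based distinct count and a plain record count per window. The sketch below assumes the windows are held in a dict keyed by window index and that each record carries a hypothetical "user_id" field.

```python
def window_statistics(windows):
    """Return (U_active, Q_total), each ordered by window index t.

    U_active[t]: number of distinct user identifiers in window t.
    Q_total[t]:  number of log records in window t (no deduplication).
    """
    indices = sorted(windows)
    u_active = [len({rec["user_id"] for rec in windows[t]}) for t in indices]
    q_total = [len(windows[t]) for t in indices]
    return u_active, q_total


# Hypothetical windows: user "a" issues two queries in window 0.
windows = {
    0: [{"user_id": "a"}, {"user_id": "a"}, {"user_id": "b"}],
    1: [{"user_id": "c"}],
}
u_active, q_total = window_statistics(windows)
```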

[0096] Step S240: Perform response strategy diversity analysis on the initial response strategy text content in each time window data block, extract the strategy type tags of the initial response strategy text content in each time window data block, count the frequency of different strategy type tags, calculate the proportion of each strategy type frequency to the total frequency of strategies in that time window, and generate the response strategy type proportion distribution vector corresponding to each time window data block.

[0097] For each time window data block, analyze the text content of all initial response strategies. A predefined strategy type classification system is used, for example, responses can be divided into categories such as "providing information," "guiding actions," "confirming questions," and "emotional reassurance." A pre-trained classification model is used to classify each response text and assign a strategy type label. Then, within the window, the frequency of each strategy type label is counted. Assuming there are M strategy types, the frequency vector F_win = [f1, f2, ..., fM] is obtained. Then, the total strategy frequency S_win for the window is calculated as f1 + f2 + ... + fM. Finally, the frequency vector is divided by S_win to obtain the proportion vector P_win = [f1 / S_win, f2 / S_win, ..., fM / S_win], which is the proportional distribution vector of the response strategy types for that window. This process is performed on all windows to obtain a proportional vector sequence P_win[t].
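
The proportion vector P_win can be illustrated as below. The sketch assumes the strategy type labels have already been assigned by the classification model, which is outside its scope; the zero-vector fallback for an empty window is also an assumption.

```python
from collections import Counter

# The four example categories named in the text; M = 4 here.
STRATEGY_TYPES = [
    "providing information",
    "guiding actions",
    "confirming questions",
    "emotional reassurance",
]


def strategy_proportions(labels, strategy_types=STRATEGY_TYPES):
    """Compute P_win = F_win / S_win for one time window."""
    counts = Counter(labels)
    freq = [counts.get(s, 0) for s in strategy_types]  # F_win
    s_win = sum(freq)                                   # S_win
    if s_win == 0:
        # Assumption: an empty window yields an all-zero vector.
        return [0.0] * len(strategy_types)
    return [f / s_win for f in freq]


p_win = strategy_proportions(
    [
        "providing information",
        "providing information",
        "guiding actions",
        "emotional reassurance",
    ]
)
```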

[0098] Step S250: Standardize the active user count sequence and the total query count sequence respectively. Then, perform time series alignment and splicing on the standardized active user count sequence, the standardized total query count sequence, and the response strategy type proportion distribution vector sequence to generate the interaction traffic time series feature matrix corresponding to the original interaction log data stream.

[0099] The active user count sequence U_active[t] obtained in step S220 is Z-score standardized. The mean μ_U and standard deviation σ_U of the entire sequence are calculated, and then (U_active[t] minus μ_U) is divided by σ_U to obtain the standardized sequence U_active_norm[t]. Similarly, the query total count sequence Q_total[t] obtained in step S230 is Z-score standardized to obtain Q_total_norm[t]. The response strategy type proportion distribution vector P_win[t] obtained in step S240 is already a proportion value, ranging from 0 to 1, and usually does not need to be standardized again, or it can be standardized by maximum and minimum values. Finally, the three sequences are aligned in the time dimension t. For each time window t, U_active_norm[t] (a scalar), Q_total_norm[t] (a scalar), and P_win[t] (an M-dimensional vector) are concatenated to form a feature vector with dimension (2+M). Stacking all the feature vectors of t in chronological order forms a two-dimensional interactive traffic time-series feature matrix of shape (number of windows, 2+M).
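
A minimal sketch of the standardization and splicing follows. It assumes the population standard deviation is intended (the text does not say whether the population or the sample form is used) and maps a constant sequence to zeros to avoid division by zero.

```python
import statistics


def z_score(seq):
    """Z-score standardize a sequence: (x - mu) / sigma."""
    mu = statistics.fmean(seq)
    sigma = statistics.pstdev(seq)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in seq]
    return [(x - mu) / sigma for x in seq]


def build_feature_matrix(u_active, q_total, p_win_seq):
    """Stack one (2 + M)-dimensional feature row per time window."""
    if not (len(u_active) == len(q_total) == len(p_win_seq)):
        raise ValueError("all sequences must cover the same windows")
    u_norm = z_score(u_active)
    q_norm = z_score(q_total)
    return [[u, q, *p] for u, q, p in zip(u_norm, q_norm, p_win_seq)]


# Hypothetical three-window example with M = 2 strategy types.
matrix = build_feature_matrix(
    u_active=[10, 20, 30],
    q_total=[100, 100, 100],
    p_win_seq=[[0.5, 0.5], [1.0, 0.0], [0.25, 0.75]],
)
```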

[0100] Step S310: Perform path strength quantization processing on the first causal path edge, calculate the path coefficient estimate between the user's long-term interest preference distribution vector and the response strategy content semantic vector, and use the path coefficient estimate as the initial value of the path strength of the first causal path edge.

[0101] After constructing the initial causal graph in step S137, the strength of each edge can be further quantified. For the first causal path edge U->R, the path coefficients are estimated using linear or nonlinear regression. A regression model is constructed using U as the input feature and R as the output target. If a linear relationship is assumed, multiple linear regression can be used to learn a weight matrix W_U_R with shape (D, D), so that the mapping from U to R is approximated as R = U * W_U_R. A norm of this weight matrix, such as the Frobenius norm, can be used as the initial value of the path strength. If a nonlinear relationship is assumed, a shallow neural network can be used, and the expected norm of the Jacobian matrix of the output R with respect to the input U can be calculated as the path strength. The resulting scalar is recorded as the initial path strength value S_UR_init for the first causal path edge.

[0102] Step S320: Perform path strength quantization processing on the second causal path edge, calculate the path coefficient estimate between the core semantic vector of the query intent and the semantic vector of the response strategy content, and use the path coefficient estimate as the initial value of the path strength of the second causal path edge.

[0103] Similar to step S310, for the second causal path edge Q->R, a regression model is constructed with Q as input and R as output to estimate the mapping relationship. The initial path strength value S_QR_init of the second causal path edge is calculated from the norm of the learned weight matrix or Jacobian matrix.

[0104] Step S330: Input the initial path strength values of the first causal path edge and the second causal path edge into the path strength normalization function for normalization processing to generate the normalized strength values of the first causal path edge and the second causal path edge, which are in the range of zero to one.

[0105] To visually compare the strength of the two edges, the strength values need to be normalized. The Softmax function is used as the normalization function. The initial path strength values S_UR_init and S_QR_init of the first and second causal path edges are taken as input. The first normalized strength is N_UR = exp(S_UR_init) / (exp(S_UR_init) + exp(S_QR_init)). The second normalized strength is N_QR = exp(S_QR_init) / (exp(S_UR_init) + exp(S_QR_init)). Both normalized strength values lie within the range of 0 to 1 and satisfy N_UR + N_QR = 1, intuitively reflecting the relative strength of the two causal paths.
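
The two-way Softmax can be sketched as follows. Subtracting the larger input before exponentiating is a standard numerical-stability step added here as an assumption; it leaves the result mathematically unchanged. The input values 2.0 and 1.0 are illustrative only.

```python
import math


def normalize_path_strengths(s_ur_init, s_qr_init):
    """Two-way Softmax over the initial path strengths.

    Returns (N_UR, N_QR), each in (0, 1), with N_UR + N_QR = 1.
    """
    shift = max(s_ur_init, s_qr_init)  # avoids overflow for large norms
    e_ur = math.exp(s_ur_init - shift)
    e_qr = math.exp(s_qr_init - shift)
    total = e_ur + e_qr
    return e_ur / total, e_qr / total


# Hypothetical initial strengths for the U->R and Q->R edges.
n_ur, n_qr = normalize_path_strengths(2.0, 1.0)
```

Because Softmax depends only on the difference between the two inputs, unnormalized matrix norms that differ by a large amount will push one edge close to 1 and the other close to 0.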

[0106] Step S340: Thicken the path edges of the initial causal graph topology based on the normalized intensity values of the first and second causal path edges. In the initial causal graph topology, associate the thickness of the visual line of the first causal path edge with the normalized intensity value of the first causal path edge, and associate the thickness of the visual line of the second causal path edge with the normalized intensity value of the second causal path edge.

[0107] When generating the causal graph for display, the visualization attributes of the edges are dynamically adjusted based on the normalized intensity values calculated in step S330. A base line thickness value is set, for example, base_width equals 2 pixels. The display width of the first causal path edge can then be set to base_width multiplied by N_UR multiplied by a magnification factor (e.g., 5). The display width of the second causal path edge is set to base_width multiplied by N_QR multiplied by the same magnification factor. In this way, in the rendered causal graph, the thickness of each edge intuitively reflects the relative strength of its causal path.

[0108] Step S350: Store the initial causal graph topology structure after path edge thickening processing to the causal graph structure cache database, and assign a version identifier and timestamp information to the initial causal graph topology structure after path edge thickening processing.

[0109] The causal graph data structure updated in step S340 with the edge thickness attribute is serialized, for example, stored in JSON or GraphML format. This data is then written to a dedicated causal graph structure cache database, such as Redis or MongoDB. During writing, a globally unique version identifier, such as a UUID, is assigned to the graph data, and the current timestamp is recorded. This allows subsequent queries or analyses to trace the causal graph structure at different times based on the timestamp and version.
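
Steps S340 and S350 can be illustrated with the sketch below, which computes the edge widths and serializes the graph to JSON with a UUID version identifier and a UTC timestamp. The JSON field names are hypothetical, and the write to the cache database (Redis, MongoDB, or similar) is deliberately left out.

```python
import json
import uuid
from datetime import datetime, timezone


def edge_width(normalized_strength, base_width=2.0, magnification=5.0):
    """Display width = base_width * normalized strength * magnification."""
    return base_width * normalized_strength * magnification


def serialize_causal_graph(n_ur, n_qr):
    """Serialize the thickened causal graph as a JSON string.

    The field names below are illustrative assumptions, not a fixed schema.
    """
    graph = {
        "version_id": str(uuid.uuid4()),
        "created_at": datetime.now(timezone.utc).isoformat(),
        "nodes": ["U", "Q", "R"],
        "edges": [
            {"source": "U", "target": "R", "strength": n_ur,
             "width": edge_width(n_ur)},
            {"source": "Q", "target": "R", "strength": n_qr,
             "width": edge_width(n_qr)},
        ],
    }
    return json.dumps(graph)


# Hypothetical normalized strengths from step S330.
serialized = serialize_causal_graph(0.73, 0.27)
payload = json.loads(serialized)
```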

[0110] Step S410: Obtain the original distribution function of the user's long-term interest preference distribution vector in the natural state, randomly sample multiple natural state sample values from the original distribution function, and record the natural state observation value of the response strategy content semantic vector corresponding to each natural state sample value.

[0111] This step provides an alternative method for calculating the intervention effect based on counterfactual comparison. First, the set of all U vectors is obtained from the historical dataset, and its distribution function P(U) under natural conditions is fitted using methods such as kernel density estimation. From this distribution function P(U), multiple random samplings are performed, for example, N_nat times, resulting in a set of natural state sample values U_nat_1, U_nat_2, ..., U_nat_N_nat. For each sampled U_nat_i, the closest real interaction sample is found in the historical dataset, and the corresponding response strategy content semantic vector R_nat_i is recorded. This yields a set of paired observations of U and R under natural conditions (i.e., without intervention).

[0112] Step S420: Apply an intervention operation to the user's long-term interest preference distribution vector, forcibly setting the vector value of the user's long-term interest preference distribution vector to a preset intervention target value, sampling and generating multiple intervention state sample values from the post-intervention distribution corresponding to the intervention target value, and recording the intervention state observation value of the response strategy content semantic vector corresponding to each intervention state sample value.

[0113] Define an intervention target value U_do, which could be the mean vector of U or a specific quantile vector. Construct a post-intervention distribution P(U|do(U=U_do)), a degenerate distribution where all probability mass is concentrated on U_do. Sampling from this distribution essentially involves repeatedly taking the value U_do. To obtain the corresponding R value, a post-intervention environment can be constructed as in step S142, i.e., finding all samples in the historical dataset whose U values are closest to U_do, and then using the R values of these samples as the intervention state observations. Randomly sample N_do times from these samples (sampling with replacement is allowed) to obtain a set of intervention state sample values R_do_1, R_do_2, ..., R_do_N_do.
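
One way to sketch this nearest-sample construction is shown below. It assumes "closest" means the k smallest Euclidean distances to U_do; both the metric and the cutoff k are assumptions, as the text does not fix them. A seeded generator keeps the resampling reproducible.

```python
import math
import random


def sample_intervention_responses(history, u_do, k, n_do, seed=0):
    """Draw N_do intervention-state observations of R.

    history: list of (U, R) pairs from the historical dataset.
    Assumption: the k pairs whose U is nearest to u_do (Euclidean)
    stand in for the post-intervention environment.
    """
    nearest = sorted(history, key=lambda pair: math.dist(pair[0], u_do))[:k]
    rng = random.Random(seed)
    # Sampling with replacement from the selected neighbours.
    return [rng.choice(nearest)[1] for _ in range(n_do)]


# Hypothetical history with 2-dimensional U and 1-dimensional R.
history = [
    ([0.0, 0.0], [1.0]),
    ([0.1, 0.0], [2.0]),
    ([5.0, 5.0], [9.0]),
    ([0.0, 0.2], [3.0]),
]
r_do = sample_intervention_responses(history, u_do=[0.0, 0.0], k=3, n_do=5)
```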

[0114] Step S430: Perform difference calculation on the intervention state observation value and the natural state observation value to obtain the set of changes in the semantic vector of the response strategy content caused by the user's long-term interest preference distribution vector changing from the natural state to the intervention target value.

[0115] The set of intervention state observations R_do_j obtained in step S420 is paired with the set of natural state observations R_nat_i obtained in step S410 to calculate the change. Since the two sets of samples may have different sizes and are not naturally paired, a comparison strategy is needed. For example, the mean E_R_do of all intervention state observations and the mean E_R_nat of all natural state observations can be calculated, and then the difference vector ΔR = E_R_do - E_R_nat can be calculated. Alternatively, a more refined pairing can be performed; for example, for each R_nat_i, the most similar sample R_do_j in the R_do set can be found, and the difference ΔR_ij can be calculated. Finally, a set of changes is obtained, where each element is a D-dimensional vector representing the change in R when switching from one state to another.
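
Both comparison strategies can be sketched in a few lines. The second function assumes "most similar" means smallest Euclidean distance, which the text leaves open; the sample vectors are illustrative.

```python
import math


def mean_vector(vectors):
    """Component-wise arithmetic mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def mean_difference(r_do, r_nat):
    """Strategy 1: delta_R = E_R_do - E_R_nat."""
    e_do, e_nat = mean_vector(r_do), mean_vector(r_nat)
    return [a - b for a, b in zip(e_do, e_nat)]


def nearest_pair_differences(r_do, r_nat):
    """Strategy 2: pair each R_nat_i with its closest R_do_j.

    Assumption: similarity is measured by Euclidean distance.
    """
    diffs = []
    for nat in r_nat:
        closest = min(r_do, key=lambda do: math.dist(do, nat))
        diffs.append([a - b for a, b in zip(closest, nat)])
    return diffs


# Hypothetical observations with D = 2; the two sets differ in size.
r_nat = [[0.0, 0.0], [1.0, 1.0]]
r_do = [[1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
delta_r = mean_difference(r_do, r_nat)
delta_r_set = nearest_pair_differences(r_do, r_nat)
```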

[0116] Step S440: Perform statistical summary processing on the set of changes, calculate the average value of the set of changes as the average intervention effect estimate, and calculate the standard deviation of the set of changes as the intervention effect volatility estimate.

[0117] Statistical analysis is performed on the set of changes obtained in step S430. Assume the set of changes contains L difference vectors, each denoted as ΔR_l. The average intervention effect estimate is ATE_vec = (1/L) ∑_{l=1}^{L} ΔR_l, the arithmetic mean of all difference vectors, which is a D-dimensional average vector. The intervention effect volatility estimate is STE_vec = √((1/L) ∑_{l=1}^{L} (ΔR_l - ATE_vec)²), where the square and the square root are applied to each dimension separately, which gives a D-dimensional standard deviation vector. These two vectors together provide a richer description of the intervention effect: ATE_vec reflects the average direction and magnitude of the effect, while STE_vec reflects the stability and volatility range of the effect.
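
The two summary vectors can be computed per dimension as in the sketch below, which follows the formulas above (dividing by L, i.e., the population form). The three difference vectors are made-up values with D = 2.

```python
import math


def summarize_changes(delta_r_set):
    """Return (ATE_vec, STE_vec) for a set of L difference vectors.

    ATE_vec: per-dimension mean of the difference vectors.
    STE_vec: per-dimension standard deviation, dividing by L.
    """
    count = len(delta_r_set)
    columns = list(zip(*delta_r_set))
    ate_vec = [sum(col) / count for col in columns]
    ste_vec = [
        math.sqrt(sum((x - mean) ** 2 for x in col) / count)
        for col, mean in zip(columns, ate_vec)
    ]
    return ate_vec, ste_vec


# Hypothetical set of L = 3 difference vectors.
ate_vec, ste_vec = summarize_changes([[1.0, 0.0], [3.0, 0.0], [2.0, 3.0]])
```

Here the second dimension has the larger volatility because one difference vector (3.0) departs strongly from the other two.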

[0118] Step S450: Combine the average intervention effect estimate and the intervention effect volatility estimate into a supplementary descriptive parameter for the first intervention effect measure of the user's long-term interest preference distribution vector to the response strategy content semantic vector.

[0119] The average intervention effect estimate ATE_vec and the intervention effect volatility estimate STE_vec calculated in step S440 are used as supplementary parameters and associated with the scalar first intervention effect measure ATE_U_R calculated in step S146. Specifically, an intervention effect description object containing three components is constructed. This intervention effect description object consists of (ATE_U_R, ATE_vec, STE_vec), which includes not only the scalar value of the overall effect strength, but also the specific direction and magnitude changes of the effect in each dimension, as well as the corresponding volatility information.

[0120] For example, step S510: collect user interaction feedback signals triggered by the user on the user interaction interface in response to the interactive decision explanation feedback data stream. The user interaction feedback signals include the user's click and view operation records of explainable response content, the user's hover browsing time records of causal path visualization elements, and the user's satisfaction rating records of the overall interactive decision.

[0121] After displaying the explanation information to the user in step S156, front-end event tracking is initiated. Three types of user interaction feedback signals are collected. The first is click-to-view operation logs: when a user clicks on a highlighted, explainable response content unit, the front-end records the explainable content identifier and click timestamp. The second is hover browsing duration recording: when the user's mouse hovers over the arrow line along a causal path, a timer starts and continues until the mouse moves away, recording the path identifier and hover duration. The third is satisfaction rating recording: after the interaction, the user can rate their overall satisfaction with the explanation, for example, by clicking on controls from 1 to 5 stars; the front-end records the rating value and submission timestamp. All these signals, along with the user identifier and session identifier, are packaged and sent back to the back-end server.

[0122] Step S520: Associate and store the user interaction feedback signal with the corresponding interpretable response content identifier and the first causal path edge identifier to generate an interaction decision interpretation effect evaluation dataset containing user feedback annotation information.

[0123] After receiving user interaction feedback signals, the backend server parses the content. For click and hover signals, it associates them with the corresponding interpretable content identifiers and causal path edge identifiers. For example, a click record is stored as a triple (user identifier, interpretable content ID, click timestamp). All collected feedback signals, including clicks, hovers, and ratings, are stored in a dedicated evaluation dataset. Each record is associated with the specific session that generated the interpretation, allowing the server to trace which user interest vector U, query vector Q, response vector R, and causal graph version led to these user responses.

[0124] Step S530: Perform periodic statistical analysis on the interactive decision explanation effect evaluation dataset, calculate the average click-through rate of different user groups for different types of explainable response content, calculate the average hover browsing time corresponding to different causal path edges, and calculate the average satisfaction score of the overall interactive decision explanation scheme.

[0125] Perform offline statistical analysis on the evaluation dataset collected in step S520 periodically (e.g., daily or weekly). Divide users into different user groups based on the clustering results of their long-term interest preferences U. For each group, calculate the average click-through rate (CTR) of interpretable response content for each type (e.g., categories obtained through text clustering), which is the number of people in that group who clicked on that type of content divided by the total number of people in that group. For each causal path edge, calculate the average hover time for all users. Simultaneously, calculate the average satisfaction rating submitted by all users. These statistical indicators constitute a quantitative evaluation of the effectiveness of the explanation scheme.
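
The three indicators can be sketched as follows. The input layouts (group membership sets, click tuples, hover tuples, a rating list) are hypothetical; the click-through rate follows the definition above, counting distinct clicking users per group and content type.

```python
from collections import defaultdict


def evaluate_feedback(group_members, clicks, hovers, ratings):
    """Compute per-group CTR, per-edge mean hover time, and mean rating.

    group_members: {group: set of user ids}
    clicks:        list of (user_id, content_type)
    hovers:        list of (edge_id, seconds)
    ratings:       list of numeric satisfaction scores
    """
    user_group = {u: g for g, users in group_members.items() for u in users}

    clickers = defaultdict(set)  # (group, content_type) -> distinct users
    for user_id, content_type in clicks:
        clickers[(user_group[user_id], content_type)].add(user_id)
    ctr = {
        key: len(users) / len(group_members[key[0]])
        for key, users in clickers.items()
    }

    hover_times = defaultdict(list)
    for edge_id, seconds in hovers:
        hover_times[edge_id].append(seconds)
    avg_hover = {e: sum(v) / len(v) for e, v in hover_times.items()}

    avg_rating = sum(ratings) / len(ratings) if ratings else None
    return ctr, avg_hover, avg_rating


# Hypothetical feedback: user "a" clicks twice but counts once.
ctr, avg_hover, avg_rating = evaluate_feedback(
    group_members={"g1": {"a", "b"}, "g2": {"c"}},
    clicks=[("a", "pricing"), ("a", "pricing"), ("c", "pricing")],
    hovers=[("U->R", 4.0), ("U->R", 6.0), ("Q->R", 2.0)],
    ratings=[4, 5, 3],
)
```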

[0126] Step S540: Generate a framework performance evaluation report of the counterfactual causal reasoning framework based on the average click-through rate, the average hover browsing time, and the average satisfaction score, and compare the evaluation indicators in the framework performance evaluation report with preset performance benchmark thresholds.

[0127] The statistical indicators calculated in step S530 are compiled into a framework performance evaluation report. The report includes the click-through rate for each group, the hover duration for each causal path edge, and the overall satisfaction. These indicators are compared with preset performance benchmark thresholds. For example, if the preset overall satisfaction threshold is 4.0 points and the current average satisfaction score is 3.8 points, the result is considered unsatisfactory. Similarly, if the preset hover duration threshold for a critical path is 5 seconds and the actual average hover duration is 4 seconds, this indicates that the user may not have understood, or may not be interested in, the explanation.

[0128] Step S550: When the average satisfaction score is lower than the preset satisfaction threshold, the framework parameter adjustment process of the counterfactual causal reasoning framework is triggered, and the conditional independence threshold parameter and path edge confidence threshold parameter in the counterfactual causal reasoning framework are adaptively adjusted to generate an updated version of the counterfactual causal reasoning framework.

[0129] When the comparison in step S540 finds that the evaluation index is below a threshold, such as satisfaction being below a threshold, the framework's self-adjustment process is automatically triggered. The hyperparameter optimization algorithm is initiated, using the framework's key parameters—the conditional independence threshold T_CI and the path edge confidence threshold T_conf—as variables to be optimized. The objective function F(T_CI, T_conf) is defined as maximizing the average satisfaction score S_avg. The optimization algorithm iteratively tries different combinations of (T_CI, T_conf), calculates the corresponding F(T_CI, T_conf) value, and finally selects the parameter combination that maximizes F(T_CI, T_conf) as the new framework parameters, generating an updated version of the counterfactual causal reasoning framework.
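
The text does not name the hyperparameter optimization algorithm, so the sketch below substitutes an exhaustive grid search purely for illustration. The objective `toy_satisfaction` is a hypothetical stand-in: in practice, evaluating F(T_CI, T_conf) would mean rebuilding the causal graph with those thresholds and measuring user satisfaction.

```python
from itertools import product


def tune_thresholds(objective, t_ci_grid, t_conf_grid):
    """Grid search for the (T_CI, T_conf) pair that maximizes the objective."""
    best_params, best_score = None, float("-inf")
    for t_ci, t_conf in product(t_ci_grid, t_conf_grid):
        score = objective(t_ci, t_conf)
        if score > best_score:
            best_params, best_score = (t_ci, t_conf), score
    return best_params, best_score


def toy_satisfaction(t_ci, t_conf):
    """Hypothetical objective peaking at T_CI = 0.05, T_conf = 0.8."""
    return 5.0 - 40.0 * (t_ci - 0.05) ** 2 - 2.0 * (t_conf - 0.8) ** 2


best_params, best_score = tune_thresholds(
    toy_satisfaction,
    t_ci_grid=[0.01, 0.05, 0.10],
    t_conf_grid=[0.6, 0.7, 0.8, 0.9],
)
```

Because each real evaluation requires fresh user feedback, a sample-efficient method such as Bayesian optimization may be a better fit than a grid; that choice is left open here.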

[0130] For example, step S610: Obtain the historical interaction session sequence belonging to the same user dimension as the current interaction session. The historical interaction session sequence includes multiple historical interaction session units arranged in chronological order. Each historical interaction session unit includes a historical user long-term interest preference distribution vector, a historical query intent core semantic vector, and a historical response strategy content semantic vector.

[0131] This step introduces causal transmission over time. For a user in the current interaction session, in addition to extracting their static long-term interest preferences U, it is also necessary to obtain their historical interaction sequence. From the user's historical behavior database, retrieve all historical interaction records associated with the user's unique identifier. Sort these records by timestamp to form a historical interaction session sequence. For each historical interaction session unit in the sequence, a corresponding element vector is required: the historical user long-term interest preference distribution vector U_prev, the historical query intent core semantic vector Q_prev, and the historical response strategy content semantic vector R_prev. These vectors can be obtained by processing the historical logs using the same method as in step S120.

[0132] Step S620: Perform reverse traversal processing on the historical interaction session sequence along the time axis. Starting from the historical interaction session unit furthest from the current interaction session time point, extract the historical user long-term interest preference distribution vector, the historical query intent core semantic vector, and the historical response strategy content semantic vector from each historical interaction session unit in sequence to construct a historical interaction element time series matrix.

[0133] From the historical interaction session sequence, extract the triples (U_prev, Q_prev, R_prev) of each unit in chronological order from oldest to most recent. Arrange the extracted vectors in chronological order (oldest first, most recent last) to form a time series matrix. The rows of the matrix correspond to time points, and the columns correspond to variables. Assuming there are T historical interaction units, and each vector has dimension D, then the shape of the matrix is (T, 3*D), and each row is a concatenation of three vectors at a single time point.
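
Building the (T, 3*D) matrix amounts to sorting by timestamp and concatenating the three vectors per session. In the sketch below, the session dict layout with "ts", "U", "Q", and "R" keys is a hypothetical representation.

```python
def build_history_matrix(sessions):
    """Build the (T, 3*D) matrix, oldest session first.

    Each row is the concatenation [U_prev | Q_prev | R_prev].
    """
    ordered = sorted(sessions, key=lambda s: s["ts"])
    return [list(s["U"]) + list(s["Q"]) + list(s["R"]) for s in ordered]


# Hypothetical history with T = 3 sessions and D = 2, given out of order.
sessions = [
    {"ts": 30, "U": [0.3, 0.3], "Q": [0.6, 0.6], "R": [0.9, 0.9]},
    {"ts": 10, "U": [0.1, 0.1], "Q": [0.4, 0.4], "R": [0.7, 0.7]},
    {"ts": 20, "U": [0.2, 0.2], "Q": [0.5, 0.5], "R": [0.8, 0.8]},
]
matrix = build_history_matrix(sessions)
```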

[0134] Step S630: Input the historical interaction element time series matrix into the temporal causal transmission analysis module for causal influence backpropagation path mining along the time axis, calculate the first temporal influence transmission coefficient of the historical response strategy content semantic vector of each historical interaction session unit in the historical interaction session sequence on the historical user long-term interest preference distribution vector of subsequent adjacent historical interaction session units, and calculate the second temporal influence transmission coefficient of the historical response strategy content semantic vector of each historical interaction session unit in the historical interaction session sequence on the historical query intent core semantic vector of subsequent adjacent historical interaction session units.

[0135] The time series matrix constructed in step S620 is input into the temporal causal transmission analysis module. This module uses a vector autoregression model. For adjacent time points, the regression equations are U_t = A × R_{t-1} + B × U_{t-1} + C × Q_{t-1} + ε_u and Q_t = D × R_{t-1} + E × U_{t-1} + F × Q_{t-1} + ε_q. The learned coefficient matrix A has shape (D, D), where the D in the shape denotes the vector dimension rather than the coefficient matrix D, and represents the influence strength of the historical response R_{t-1} on each dimension of the current user interest U_t. The first temporal influence transmission coefficient is α_t = √(∑_{i=1}^{D} ∑_{j=1}^{D} A_{ij}²), the Frobenius norm of matrix A. Similarly, the second temporal influence transmission coefficient is β_t = √(∑_{i=1}^{D} ∑_{j=1}^{D} D_{ij}²), the Frobenius norm of the coefficient matrix D.
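
The coefficient calculation itself is just a Frobenius norm. The sketch below assumes the matrices A and D have already been fitted by the vector autoregression step, which is not shown; the 2 × 2 values are illustrative, and `d_coef` is named that way to avoid confusion with the dimension D.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


# Hypothetical fitted coefficient matrices for dimension 2.
a_coef = [[3.0, 0.0], [0.0, 4.0]]  # R_{t-1} -> U_t
d_coef = [[1.0, 1.0], [1.0, 1.0]]  # R_{t-1} -> Q_t

alpha = frobenius_norm(a_coef)  # first temporal influence transmission coefficient
beta = frobenius_norm(d_coef)   # second temporal influence transmission coefficient
```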

[0136] Step S640: Construct a cross-session causal chain propagation network based on the first temporal influence transmission coefficient and the second temporal influence transmission coefficient. The cross-session causal chain propagation network includes a first type of temporal causal edge that starts from the historical response strategy content semantic vector and points to the subsequent historical user long-term interest preference distribution vector, and a second type of temporal causal edge that starts from the historical response strategy content semantic vector and points to the subsequent historical query intent core semantic vector.

[0137] Based on the influence coefficients calculated in step S630, a cross-session causal chain propagation network is constructed. The nodes of the network are U_t, Q_t, and R_t at each time point. Two types of directed edges are added between adjacent time points: the first type of temporal causal edge, pointing from R_{t-1} to U_t, with the edge weight being the first temporal influence transmission coefficient; and the second type of temporal causal edge, pointing from R_{t-1} to Q_t, with the edge weight being the second temporal influence transmission coefficient. In this way, a causal chain is formed along the time axis from the past to the future, revealing how past system responses shape users' future interests and queries.

[0138] Step S650: Perform network topology analysis on the cross-session causal chain propagation network to identify causal chain convergence nodes and causal chain forking nodes in the cross-session causal chain propagation network. The causal chain convergence node is a historical user long-term interest preference distribution vector or historical query intent core semantic vector that is jointly pointed to by multiple temporal causal edges. The causal chain forking node is a historical response strategy content semantic vector that sends out multiple temporal causal edges pointing to different subsequent nodes.

[0139] Perform graph analysis on the network constructed in step S640. Calculate the in-degree of each node. For a node U_t or Q_t, if its in-degree is greater than 1, meaning multiple edges from different historical responses point to it, then this node is marked as a causal chain convergence node, indicating that multiple historical events jointly influence the current state. Calculate the out-degree of each node. For a node R_t, if its out-degree is greater than 1, meaning there are edges pointing to subsequent nodes such as U_{t+1} and Q_{t+1}, then this node is marked as a causal chain branching node, indicating that a historical response simultaneously influences multiple aspects of the future.
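
The degree-based classification can be sketched as below, with nodes represented as hypothetical (kind, time index) tuples. The example adds one lag-2 edge from R_0 to U_2 as an assumption, because with only adjacent-step edges every U_t and Q_t node has in-degree 1 and no convergence node would appear.

```python
from collections import Counter


def classify_nodes(edges):
    """Find convergence nodes (U/Q with in-degree > 1) and
    fork nodes (R with out-degree > 1) in a directed edge list."""
    in_degree = Counter(dst for _, dst in edges)
    out_degree = Counter(src for src, _ in edges)
    convergence = sorted(
        node for node, deg in in_degree.items()
        if deg > 1 and node[0] in ("U", "Q")
    )
    forks = sorted(
        node for node, deg in out_degree.items()
        if deg > 1 and node[0] == "R"
    )
    return convergence, forks


edges = [
    (("R", 0), ("U", 1)),
    (("R", 0), ("Q", 1)),
    (("R", 1), ("U", 2)),
    (("R", 1), ("Q", 2)),
    (("R", 0), ("U", 2)),  # hypothetical lag-2 edge, added for illustration
]
convergence_nodes, fork_nodes = classify_nodes(edges)
```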

[0140] Step S660: Based on the positional distribution of the causal chain convergence node and the causal chain branch node in the network topology, generate the causal transmission path backbone of the cross-session causal chain propagation network, and perform graph structure overlay and fusion processing on the causal transmission path backbone and the initial causal graph topology of the current interactive session to generate an enhanced causal graph topology containing cross-session temporal causal transmission information.

[0141] A subnetwork consisting of convergence and branching nodes is extracted from the cross-session causal chain propagation network to serve as the backbone of the causal propagation path. This backbone highlights the key nodes affecting propagation. Then, this backbone is superimposed and merged with the initial causal graph topology of the current interaction session constructed in step S137 (containing nodes U, Q, and R and their edges). During merging, the nodes U_{t-1}, Q_{t-1}, and R_{t-1} from the last time point in the historical sequence (i.e., the historical session immediately adjacent to the current session), along with their temporal causal edges pointing to the current session's U and Q, are added to the current causal graph. This generates an enhanced causal graph topology that contains not only the causal logic within the current interaction but also the influence paths of historical interactions on the current state.

[0142] Step S670: Based on the enhanced causal graph topology, perform path blocking intervention simulation again, fix the vector value of the user's long-term interest preference distribution vector to each discrete value within the reference distribution range, observe the conditional probability distribution change trajectory of the response strategy content semantic vector under the condition of considering the cross-session temporal causal transmission information, and generate a first intervention effect correction metric value that incorporates the influence of historical interactions.

[0143] Using the enhanced causal graph generated in step S660, the intervention simulation similar to step S140 is re-executed. However, this time, although the object of intervention is the current session's U, the data environment needs to consider the influence of historical causal chains. When constructing the post-intervention data distribution environment, it is not enough to simply select samples with similar U values; it is also necessary to ensure that the historical causal chain structure of these samples is similar, or to use a time-series model to simulate how historical influences are transmitted after U is intervened. For example, a time-series simulator based on structural equation modeling can be used. The current value of U is fixed as d_k, and then the learned time-series influence coefficients are used to deduce the possible historical paths under the new U value, thereby generating synthetic samples that conform to the constraints of the enhanced causal graph. Then, the distribution of R is observed on these samples, the conditional probability density and expectation are calculated, and finally a modified intervention effect measure ATE_U_R_augmented is obtained. This value reflects the true causal effect of current user interest on the response after considering the long-term shaping effect of historical interactions.

[0144] Step S680: Based on the first intervention effect correction metric value that incorporates historical interaction influence, perform contribution tracing and sorting processing on the initial response strategy text content, and select response strategy text units whose first intervention effect correction metric value that incorporates historical interaction influence exceeds a preset contribution threshold as interpretable response content of the fused temporal causal chain. Encapsulate and fuse the interpretable response content of the fused temporal causal chain with the path identification information of the causal transmission path backbone to generate an enhanced interactive decision explanation feedback data stream containing the historical interaction causal transmission trajectory.

[0145] Finally, the corrected measure ATE_U_R_augmented obtained in step S670 replaces ATE_U_R from step S140, and the contribution tracing and sorting process of step S150 is repeated. Response text units that remain important after historical influence is considered are selected. The final explanation data package must include not only the causal path identifier of the current session but also the key nodes and edge information in the backbone of the causal transmission path related to the user, as identified in step S650. This information is encapsulated together to generate an enhanced interactive decision explanation feedback data stream. After the stream is pushed to the front end, the visualization rendering engine can display both the edge from U to R and the temporal edge from a past response to the current U, giving the user a more complete explanation with temporal depth and revealing the full causal chain: "your past interactions shaped your current interests, which in turn influenced this response."
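As a rough illustration of the re-ranking and packaging just described, the sketch below assumes that each response text unit already carries a path propagation contribution weight, multiplies it by ATE_U_R_augmented, keeps the units above the threshold, and bundles them with the path information. The function name, dictionary keys, and sample identifiers are hypothetical and are not taken from the application.

```python
from typing import Dict, List, Sequence, Tuple


def build_enhanced_explanation(
    units: Sequence[Tuple[str, str, float]],  # (unit_id, text, path weight)
    ate_augmented: float,
    threshold: float,
    current_path_id: str,
    backbone_edges: Sequence[Tuple[str, str]],
) -> Dict[str, object]:
    """Rank text units by weight * ATE_U_R_augmented and package the result."""
    scored = sorted(
        ((uid, text, weight * ate_augmented) for uid, text, weight in units),
        key=lambda item: item[2],
        reverse=True,  # highest contribution first
    )
    selected: List[Dict[str, object]] = [
        {"unit_id": uid, "text": text, "score": score}
        for uid, text, score in scored
        if score > threshold
    ]
    return {
        "interpretable_content": selected,
        "current_path": current_path_id,
        # Temporal edges from past responses to the current U (step S650).
        "historical_backbone": list(backbone_edges),
    }


package = build_enhanced_explanation(
    units=[("u1", "a", 0.2), ("u2", "b", 0.9), ("u3", "c", 0.5)],
    ate_augmented=2.0,
    threshold=0.9,
    current_path_id="U->R",
    backbone_edges=[("R_past", "U")],
)
```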

[0146] For example, step S710: Obtain a set of parallel interactive sessions that belong to the same interactive scenario dimension as the current interactive session. The set of parallel interactive sessions contains multiple parallel interactive session units. Each parallel interactive session unit corresponds to a different user unique identifier code but has the same original query text content. Each parallel interactive session unit contains a parallel user long-term interest preference distribution vector, a parallel query intent core semantic vector, and a parallel response strategy content semantic vector.

[0147] This step introduces a group comparison dimension. All interaction records with the same original query text content as the current interaction session (e.g., both are "how to subscribe to a data package") are retrieved from the historical database. These records come from different users (with different user unique identifier codes), thus forming a set of parallel interaction sessions. For each session unit in the set, the corresponding parallel user long-term interest preference distribution vector U_parallel, the parallel query intent core semantic vector Q_parallel (although the query text is the same, the semantic vector may be slightly different due to context or word segmentation differences), and the parallel response strategy content semantic vector R_parallel are extracted according to the method in step S120.

[0148] Step S720: Perform counterfactual alignment processing on the set of parallel interaction sessions, forcibly setting the parallel query intent core semantic vector of every parallel interaction session unit in the set to the same vector value as the query intent core semantic vector of the current interaction session, thereby constructing a query-intent-normalized parallel interaction session control set.

[0149] To compare fairly how different user interests affect responses, interference from differences in query intent must be eliminated. Therefore, an intervention operation is performed on the set of parallel interaction sessions: the Q_parallel vector of each session is forcibly replaced with Q_current, the query intent core semantic vector of the current interaction session. This replacement is a counterfactual operation that assumes the parallel users asked exactly the same question as the current user (identical in semantic vector space). The result is a new dataset, the query-intent-normalized parallel interaction session control set, in which all samples have the same Q value (equal to Q_current) while the U values (U_parallel) and R values (R_parallel) retain their original, natural variation.
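The alignment operation can be sketched in a few lines. This is an illustrative example only; the `SessionUnit` record and the `align_query_intent` function are hypothetical names, and the vectors are shortened to two dimensions for readability.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SessionUnit:
    user_id: str
    u: Tuple[float, ...]  # U_parallel
    q: Tuple[float, ...]  # Q_parallel
    r: Tuple[float, ...]  # R_parallel


def align_query_intent(
    units: Sequence[SessionUnit], q_current: Sequence[float]
) -> List[SessionUnit]:
    """Overwrite every unit's Q with Q_current; U and R are left untouched."""
    q_fixed = tuple(q_current)
    return [replace(unit, q=q_fixed) for unit in units]


parallel_set = [
    SessionUnit("user_a", (0.1, 0.9), (0.50, 0.50), (0.3, 0.7)),
    SessionUnit("user_b", (0.8, 0.2), (0.52, 0.48), (0.6, 0.4)),
]
control_set = align_query_intent(parallel_set, q_current=(0.51, 0.49))
```

Returning new records instead of mutating the originals keeps the observed sessions available for later comparison with the counterfactual control set.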

[0150] Step S730: Perform parallel intervention effect analysis on the query-intent-normalized parallel interaction session control set. For each parallel interaction session unit in the control set, calculate the parallel intervention effect measure of its parallel user long-term interest preference distribution vector on the corresponding parallel response strategy content semantic vector, and generate a set of parallel intervention effect measure values.

[0151] In the control set constructed in step S720, Q is fixed, so variations in R can only be attributed to differences in U. For each sample (U_parallel_i, R_parallel_i) in the set, an "intervention effect" can be calculated, but this requires defining the "dosage" of the intervention. The quantity of interest here is not the absolute effect of U_parallel_i on R_parallel_i but its effect relative to a benchmark. One approach is to use the mean vector U_mean of all U_parallel vectors in the set as the benchmark, with R_mean defined analogously as the mean of all R_parallel vectors. For each sample i, the deviation of its U from the benchmark is ΔU_i = U_parallel_i - U_mean, and the deviation of its R from the benchmark is ΔR_i = R_parallel_i - R_mean. The parallel intervention effect measure ATE_parallel_i can then be defined as the ratio of a chosen norm of ΔR_i to a chosen norm of ΔU_i, or estimated with a local linear regression model, to assess how sensitive R is to changes in U around U_mean. Performing this operation on every sample in the set yields N_parallel scalar values, which form the set of parallel intervention effect measures.
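A minimal sketch of the norm-ratio variant follows. It assumes the Euclidean norm as the "chosen norm" and assigns an effect of 0.0 to any sample whose U coincides with U_mean; both choices are assumptions of this example, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def l2_norm(vector: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in vector))


def parallel_effects(
    us: Sequence[Sequence[float]],
    rs: Sequence[Sequence[float]],
    eps: float = 1e-12,
) -> List[float]:
    """ATE_parallel_i = ||R_i - R_mean|| / ||U_i - U_mean|| for each sample i."""
    u_mean, r_mean = mean_vector(us), mean_vector(rs)
    effects = []
    for u, r in zip(us, rs):
        delta_u = l2_norm([a - b for a, b in zip(u, u_mean)])
        delta_r = l2_norm([a - b for a, b in zip(r, r_mean)])
        # Assumption: a sample sitting exactly on the benchmark gets 0.0.
        effects.append(delta_r / delta_u if delta_u > eps else 0.0)
    return effects


effects = parallel_effects(us=[[0.0, 0.0], [2.0, 0.0]], rs=[[0.0], [4.0]])
# Here U_mean = [1, 0] and R_mean = [2], so both ratios equal 2 / 1.
```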

[0152] Step S740: Perform distribution statistical processing on the set of parallel intervention effect measures, and calculate the statistical distribution characteristic parameters of the set of parallel intervention effect measures. The statistical distribution characteristic parameters include the mean, variance, and quantile distribution interval of the parallel intervention effect measures.

[0153] Descriptive statistical analysis is performed on the set of parallel intervention effect measures generated in step S730. The arithmetic mean μ_parallel, the variance σ²_parallel, and selected quantiles, such as the 25th percentile, the median, and the 75th percentile, are calculated. These statistical parameters characterize, at the "group" level, the average sensitivity and the range of fluctuation in responses to the same query when users have different long-term interests.
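These statistics can be computed with Python's standard `statistics` module, as in the sketch below. The use of the population variance and of the "inclusive" quantile method are assumptions of this example; the text does not specify either, and the function name `describe_effects` is hypothetical.

```python
from statistics import fmean, pvariance, quantiles
from typing import Dict, Sequence


def describe_effects(effects: Sequence[float]) -> Dict[str, float]:
    """Mean, variance and quartiles of the parallel intervention effect set."""
    q25, median, q75 = quantiles(effects, n=4, method="inclusive")
    return {
        "mean": fmean(effects),
        "variance": pvariance(effects),  # population variance (assumption)
        "q25": q25,
        "median": median,
        "q75": q75,
    }


summary = describe_effects([1.0, 2.0, 3.0, 4.0, 5.0])
```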

[0154] Step S750: Compare and analyze the statistical distribution characteristic parameters of the first intervention effect measure corresponding to the current interactive session with the set of parallel intervention effect measures, calculate the percentile of the relative position of the first intervention effect measure corresponding to the current interactive session in the distribution of the set of parallel intervention effect measures, and generate the intervention effect abnormality index of the current interactive session.

[0155] The first intervention effect measure ATE_U_R of the current interaction, calculated in step S146, is taken as an observation and placed into the distribution of the set of parallel intervention effect measures characterized in step S740. The percentile position of this ATE_U_R value within that distribution is then calculated. If the distribution of the parallel effect set is approximately normal, the position can be obtained from the Z-score. A more robust method is to calculate the ordinal percentile directly: compare ATE_U_R with all values in the parallel set, count the values in the set that are less than ATE_U_R, and divide by the total number of samples in the set. The result is a value between 0 and 1, the intervention effect anomaly index AI. An AI close to 0.5 indicates that the current effect is at the group median level; an AI close to 0.95 indicates that the current effect is much larger than most cases in the group (a positive anomaly); an AI close to 0.05 indicates that it is much smaller than most cases in the group (a negative anomaly).
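The ordinal percentile calculation is short enough to show in full. The sketch below follows the counting rule stated above (strictly smaller values divided by the set size); the function name `anomaly_index` is a hypothetical label.

```python
from typing import Sequence


def anomaly_index(ate_current: float, parallel_effects: Sequence[float]) -> float:
    """Fraction of parallel effect values strictly below the current ATE_U_R."""
    if not parallel_effects:
        raise ValueError("the parallel effect set must not be empty")
    below = sum(1 for value in parallel_effects if value < ate_current)
    return below / len(parallel_effects)


ai_typical = anomaly_index(2.5, [1.0, 2.0, 3.0, 4.0])   # 2 of 4 values are smaller
ai_extreme = anomaly_index(10.0, [1.0, 2.0, 3.0, 4.0])  # all 4 values are smaller
```

Because only strict comparisons are counted, ties with the current value do not raise the index; a different tie-handling rule would be an equally reasonable choice.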

[0156] Step S760: When the abnormality index of the intervention effect exceeds the preset abnormality threshold, the counterfactual anomaly detection alarm process is triggered, the key difference factors that cause the abnormality of the intervention effect are extracted, the key difference factors are correlated with the user long-term interest preference distribution vector of the current interaction session, and an anomaly detection report containing the abnormality identifier of the intervention effect and the key difference factor label is generated.

[0157] An anomaly threshold T_AI is preset, for example an upper bound of 0.9 and a lower bound of 0.1. If the AI calculated in step S750 is greater than 0.9 or less than 0.1, the causal effect of the current interaction is considered to differ significantly from the group's normal state, and anomaly detection is triggered. To find the cause of the anomaly, feature attribution analysis is performed on the current user's U vector. For example, the current user's U vector is compared with the U vectors of users in the normal effect range of the parallel set. A binary classifier can be trained to distinguish the "normal group" from the "abnormal group" (the current user), and feature importance analysis (such as SHAP values) can then be used to find which feature dimensions (that is, which aspects of the user's interests) are the key factors that cause the user to be classified into the abnormal group. These key difference factors, such as an abnormally high score for a certain interest topic, are extracted and associated with the current user's U vector to generate an anomaly detection report. The report includes an anomaly indicator for the intervention effect, together with labels (e.g., "Preference for topic A is far greater than normal") and quantified values of the key difference factors.
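The paragraph above suggests a binary classifier combined with SHAP-style feature importance. The sketch below deliberately swaps in a much simpler stand-in, a per-dimension z-score of the current user's U against the normal group, to show the shape of the threshold check and the resulting report. The function names, report keys, and topic names are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Optional, Sequence, Tuple


def is_anomalous(ai: float, upper: float = 0.9, lower: float = 0.1) -> bool:
    return ai > upper or ai < lower


def key_difference_factors(
    u_current: Sequence[float],
    normal_us: Sequence[Sequence[float]],
    names: Sequence[str],
    top_k: int = 3,
    eps: float = 1e-9,
) -> List[Tuple[str, float]]:
    """Rank interest dimensions by |z-score| relative to the normal group."""
    scores = []
    for j, name in enumerate(names):
        column = [u[j] for u in normal_us]
        z = (u_current[j] - fmean(column)) / (pstdev(column) + eps)
        scores.append((name, z))
    scores.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return scores[:top_k]


def build_report(
    ai: float,
    u_current: Sequence[float],
    normal_us: Sequence[Sequence[float]],
    names: Sequence[str],
) -> Optional[Dict[str, object]]:
    """Return an anomaly report, or None when AI is inside the normal band."""
    if not is_anomalous(ai):
        return None
    return {
        "anomaly_flag": "positive" if ai > 0.5 else "negative",
        "anomaly_index": ai,
        "key_difference_factors": key_difference_factors(u_current, normal_us, names),
    }


report = build_report(
    ai=0.95,
    u_current=[0.9, 0.5],
    normal_us=[[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]],
    names=["topic_A", "topic_B"],
)
```

A z-score only captures how far each dimension sits from the group individually; a classifier with SHAP attribution, as the text proposes, could also capture interactions between dimensions.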

[0158] Step S770: The anomaly detection report is associated and encapsulated with the first intervention effect measurement value to generate an interpretability-enhanced data unit containing group comparison reference information. The interpretability-enhanced data unit is then pushed to the advanced analysis area of the user interface for comparative visualization.

[0159] Finally, the anomaly detection report generated in step S760 is encapsulated with the first intervention effect measure ATE_U_R of the current interaction to form a richer data unit. This unit not only includes "the current user's interest led to the above response" (individual causal effect) but also "compared to most other users with the same question, the current user's interest has an abnormally high / low impact on the response" (group comparison information). This interpretability-enhanced data unit is pushed to the advanced analysis area on the front end for display. In this area, two curves can be displayed simultaneously: one is the trajectory of the current user's expected value change, and the other is the trajectory of the group average expected value change. At the same time, areas with significant differences are marked with special colors or labels, and the key difference factors in the anomaly detection report are displayed in text form.

[0160] Figure 2 shows a system 100 for interpretable analysis of intelligent interactive decisions based on causal reasoning, provided in an embodiment of this application. The system includes a processor 1001, a memory 1003, and program code stored in the memory 1003. The processor 1001 executes the program code to implement the steps of the method for interpretable analysis of intelligent interactive decisions based on causal reasoning. The processor 1001 and the memory 1003 are connected, for example, via a bus 1002. Optionally, the system 100 may further include a transceiver 1004, which can be used for data interaction between this system and other systems, such as sending and/or receiving data. It should be noted that in practical applications the number of transceivers 1004 is not limited to one, and the structure of this system does not constitute a limitation on the embodiments of this application. The memory 1003 stores the program code for executing the embodiments of this application, and execution is controlled by the processor 1001. The processor 1001 is used to execute the program code stored in the memory 1003 to implement the steps shown in the foregoing method embodiments.

[0161] This application provides a computer-readable storage medium storing program code, which, when executed by a processor, can implement the steps and corresponding content of the aforementioned method embodiments.

[0162] The above description is only an optional implementation method for some implementation scenarios of this application. It should be noted that for those skilled in the art, other similar implementation methods based on the technical concept of this application, without departing from the technical concept of this application, also fall within the protection scope of the embodiments of this application.

Claims

1. An intelligent interaction decision explainability analysis method combined with causal reasoning, characterized in that, The method includes: Obtain the raw interaction log data stream of the intelligent interaction system. The raw interaction log data stream includes the user's unique identifier code, the original query text content submitted by the user, and the initial response strategy text content pushed by the interaction engine for the original query text content. The original interaction log data stream is subjected to interaction element decoupling processing. The user's long-term interest preference distribution vector is extracted from the historical interaction records associated with the user's unique identifier code. The core semantic vector of the query intent is extracted from the original query text content. The semantic vector of the reply strategy content is extracted from the initial reply strategy text content. The user's long-term interest preference distribution vector and the query intent core semantic vector are used as the set of antecedent variables, and the response strategy content semantic vector is used as the consequence variable. They are input into the counterfactual causal reasoning framework for causal path search processing to generate a first causal path edge starting from the user's long-term interest preference distribution vector and a second causal path edge starting from the query intent core semantic vector, so as to construct an initial causal graph topology. The first causal path edge is subjected to path blocking intervention simulation processing. The trajectory of the conditional probability distribution change of the semantic vector of the response strategy content is observed when the long-term interest preference distribution vector of the user takes different intervention values. 
The first intervention effect measure of the long-term interest preference distribution vector of the user on the semantic vector of the response strategy content is calculated based on the trajectory of the conditional probability distribution change. Based on the first intervention effect measurement value, the initial response strategy text content is sorted by contribution. Response strategy text units whose first intervention effect measurement value exceeds a preset contribution threshold are selected as interpretable response content. The interpretable response content is then fused with the path identification information of the first causal path edge to generate an interactive decision explanation feedback data stream. The interactive decision explanation feedback data stream is then pushed to the user interface for visualization rendering and display.

2. The method of claim 1, wherein the process of decoupling interaction elements from the original interaction log data stream, extracting the user's long-term interest preference distribution vector from the historical interaction records associated with the user's unique identifier code, extracting the core semantic vector of the query intent from the original query text content, and extracting the semantic vector of the response strategy content from the initial response strategy text content, includes: The user's historical behavior database is retrieved based on the user's unique identifier code. All historical interaction record entries associated with the user's unique identifier code are retrieved from the user's historical behavior database. The historical interaction record entries include historical query timestamps, historical query text content, historical clicked item identifiers, and historical dwell time parameters.
The historical query text content in the historical interaction record entries is modeled for topic distribution to generate a historical query topic distribution probability vector corresponding to the user's unique identifier. The historical click item identifiers in the historical interaction record entries are aggregated for frequency statistics to generate a historical click preference frequency vector corresponding to the user's unique identifier. The historical query topic distribution probability vector and the historical click preference frequency vector are standardized to obtain standardized historical query topic distribution vectors and standardized historical click preference distribution vectors. The standardized historical query topic distribution vectors and the standardized historical click preference distribution vectors are then concatenated and fused to obtain the user's long-term interest preference distribution vector corresponding to the user's unique identifier. The original query text content is parsed using syntactic structure analysis to identify the core query entity word units and core query intent word units in the original query text content. The core query entity word units are then input into a pre-trained first semantic encoding network for entity semantic vectorization mapping to obtain the core query entity semantic vector. The core query intent word unit is input into the pre-trained second semantic encoding network for intent semantic vectorization mapping to obtain the core query intent semantic vector. Attention weighted fusion processing is performed on the core query entity semantic vector and the core query intent semantic vector to output the query intent core semantic vector corresponding to the original query text content. 
The initial response strategy text content is parsed to extract the strategy backbone description text fragments and strategy detail modification text fragments from the initial response strategy text content. The strategy backbone description text fragments are then input into a pre-trained backbone semantic encoding network for backbone semantic vectorization mapping to obtain the strategy backbone semantic vector. The text fragments of the strategy detail modification are input into a pre-trained detail semantic encoding network for detail semantic vectorization mapping to obtain the strategy detail semantic vector. The strategy backbone semantic vector and the strategy detail semantic vector are then subjected to gating fusion processing to generate the response strategy content semantic vector corresponding to the initial response strategy text content. The user's long-term interest preference distribution vector, the query intent core semantic vector, and the response strategy content semantic vector are aligned according to a preset vector dimension alignment rule, so that the user's long-term interest preference distribution vector, the query intent core semantic vector, and the response strategy content semantic vector have the same number of feature dimensions, resulting in dimension-aligned user's long-term interest preference distribution vector, dimension-aligned query intent core semantic vector, and dimension-aligned response strategy content semantic vector. The user long-term interest preference distribution vector, query intent core semantic vector, and response strategy content semantic vector, after dimension alignment, are subjected to vector distribution normalization processing to obtain normalized user long-term interest preference distribution vector, normalized query intent core semantic vector, and normalized response strategy content semantic vector. These are then combined into an interaction triplet element set. 
Each vector in the interaction triplet element set is assigned a unique element identifier. A mapping relationship is established between the user long-term interest preference distribution vector and the user's unique identifier code, a mapping relationship is established between the query intent core semantic vector and the original query text content, and a mapping relationship is established between the response strategy content semantic vector and the initial response strategy text content.

3. The method of claim 1, wherein the step of taking the user's long-term interest preference distribution vector and the query intent core semantic vector as antecedent variables, and the response strategy content semantic vector as a consequent variable, and inputting them into a counterfactual causal reasoning framework for causal path search processing to generate a first causal path edge starting from the user's long-term interest preference distribution vector and a second causal path edge starting from the query intent core semantic vector includes: The user's long-term interest preference distribution vector and the query intent core semantic vector are used to form an antecedent variable matrix. The response strategy content semantic vector is used as a consequence variable vector. The antecedent variable matrix and the consequence variable vector are input into the input interface of the counterfactual causal reasoning framework. The antecedent variable matrix and the consequence variable vector are processed by data type validation and dimension matching through the input interface. The validated antecedent variable matrix and consequence variable vector are output. The conditional independence test module of the counterfactual causal reasoning framework is invoked to perform conditional independence tests on the antecedent variable matrix and the consequent variable vector.
The first conditional mutual information value between the user's long-term interest preference distribution vector and the response strategy content semantic vector is calculated under the condition of the query intent core semantic vector. The second conditional mutual information value between the query intent core semantic vector and the response strategy content semantic vector is calculated under the condition of the user's long-term interest preference distribution vector. The third conditional mutual information value between the user's long-term interest preference distribution vector and the query intent core semantic vector is calculated under the condition of the response strategy content semantic vector. The first conditional mutual information value, the second conditional mutual information value, and the third conditional mutual information value are compared with a preset unified conditional independence threshold. When the first conditional mutual information value exceeds the preset unified conditional independence threshold, it is determined that there is a direct causal relationship between the user's long-term interest preference distribution vector and the response strategy content semantic vector, and a first causal path edge candidate identifier is generated with the user's long-term interest preference distribution vector as the starting point and the response strategy content semantic vector as the ending point. When the value of the second conditional mutual information exceeds the preset unified conditional independence threshold, it is determined that there is a direct causal relationship between the core semantic vector of the query intent and the semantic vector of the response strategy content, and a second causal path edge candidate identifier is generated with the core semantic vector of the query intent as the starting point and the semantic vector of the response strategy content as the ending point. 
When the value of the third conditional mutual information exceeds the preset unified conditional independence threshold, it is determined that there is a covariant relationship between the user's long-term interest preference distribution vector and the query intent core semantic vector, and a covariant path edge candidate identifier between the user's long-term interest preference distribution vector and the query intent core semantic vector is generated. The first causal path candidate identifier, the second causal path candidate identifier, and the covariant path candidate identifier are subjected to path edge confidence assessment. The antecedent variable matrix and the consequent variable vector are resampled multiple times using a bootstrap sampling method. The conditional independence test is repeated based on the data subset obtained from each resampling. The frequency percentage of the first causal path candidate identifier in the multiple resampling processes is counted as the first path edge confidence score, and the frequency percentage of the second causal path candidate identifier in the multiple resampling processes is counted as the second path edge confidence score. When the confidence score of the first path edge exceeds the preset path edge confidence threshold, the first causal path edge candidate identifier is converted into a formal first causal path edge identifier. When the confidence score of the second path edge exceeds the preset path edge confidence threshold, the second causal path edge candidate identifier is converted into a formal second causal path edge identifier. An initial causal graph topology is constructed based on the formal first causal path edge identifier and the formal second causal path edge identifier. 

4. The method of claim 1, wherein the step of performing path blocking intervention simulation on the first causal path edge, observing the conditional probability distribution change trajectory of the response strategy content semantic vector when the user's long-term interest preference distribution vector takes different intervention values, and calculating the first intervention effect measure value of the user's long-term interest preference distribution vector on the response strategy content semantic vector based on the conditional probability distribution change trajectory, includes: Obtain the original distribution range of the user's long-term interest preference distribution vector, divide the original distribution range into multiple discrete intervention intervals according to an equally spaced division method, assign an intervention value label to each discrete intervention interval, and generate an intervention value label set containing multiple intervention value labels; Each intervention value label is selected sequentially from the set of intervention value labels as the current intervention value label. The vector value of the user's long-term interest preference distribution vector is forcibly set to the intervention value corresponding to the current intervention value label. The vector value of the query intent core semantic vector changes naturally, and the post-intervention data distribution environment when the user's long-term interest preference distribution vector takes the current intervention value is constructed.
In the post-intervention data distribution environment, multiple observation sample values of the semantic vector of the response strategy content are collected, and probability density estimation processing is performed on the multiple observation sample values to generate the conditional probability density function of the semantic vector of the response strategy content under the condition that the user's long-term interest preference distribution vector takes the current intervention value. Repeat the steps of selecting intervention value labels, constructing the post-intervention data distribution environment, and generating the conditional probability density function until all intervention value labels in the intervention value label set are traversed to obtain the sequence of conditional probability density functions of the semantic vector of the response strategy content under the condition that the user's long-term interest preference distribution vector takes each intervention value. Numerical integration is performed on the conditional probability density function sequence to calculate the expected value of the response strategy content semantic vector under the condition that the user's long-term interest preference distribution vector takes various intervention values, and the trajectory curve of the expected value of the response strategy content semantic vector as the intervention value of the user's long-term interest preference distribution vector changes is obtained. The expected value change trajectory curve is subjected to first-order difference processing to calculate the difference between the expected values corresponding to adjacent intervention values. The ratio of the difference to the interval distance between adjacent intervention values is used as the local intervention effect value.
The local intervention effect value is subjected to weighted average processing to generate the first intervention effect measure value of the user's long-term interest preference distribution vector to the semantic vector of the response strategy content.

5. The method of claim 1, wherein the step of performing contribution tracing and sorting processing on the initial response strategy text content based on the first intervention effect metric, selecting response strategy text units whose first intervention effect metric exceeds a preset contribution threshold as interpretable response content, fusing the interpretable response content with the path identification information of the first causal path edge to generate an interactive decision explanation feedback data stream, and pushing the interactive decision explanation feedback data stream to the user interface for visualization rendering and display includes: The initial response strategy text content is processed by text segmentation. The initial response strategy text content is divided into multiple response strategy text units according to semantic integrity. A unit identifier is assigned to each response strategy text unit. The local response strategy content semantic vector corresponding to each response strategy text unit is extracted. The semantic vector of the local response strategy content corresponding to each response strategy text unit is correlated with the user's long-term interest preference distribution vector. The path propagation contribution weight of each local response strategy content semantic vector on the first causal path edge is calculated. The path propagation contribution weight is multiplied with the first intervention effect measurement value to obtain the first intervention effect contribution score corresponding to each response strategy text unit.
Based on the contribution score of the first intervention effect corresponding to each response strategy text unit, multiple response strategy text units are sorted in descending order to generate a contribution ranking list containing the identifier of the response strategy text unit and its corresponding first intervention effect contribution score. Starting from the beginning of the contribution ranking list, the response strategy text units with the highest contribution ranking are extracted sequentially as a set of candidate interpretable response content. From the set of candidate explainable response content, response strategy text units whose contribution score of the first intervention effect exceeds a preset contribution threshold are selected as formal explainable response content. Each formal explainable response content is assigned an explainable content identifier, and the original position information of each formal explainable response content in the initial response strategy text content is recorded. The formal interpretable response content and the path identification information of the first causal path edge are encapsulated to generate a first interpretable data package containing interpretable response text data, the first causal path edge identifier and the original location information. The first interpretable data package is then input into the visualization rendering engine. The visualization rendering engine performs graphical element mapping processing on the first interpretable data package, maps the formal interpretable response content to a highlighted text area, maps the first causal path edge identifier to an arrow line connecting the user's interest node to the response text area, generates an interactive decision explanation image frame containing causal path visualization elements, and pushes the interactive decision explanation image frame to the display buffer of the user interaction interface for real-time refresh display.

6. The intelligent interactive decision interpretability analysis method combining causal reasoning according to claim 1, characterized in that, The method further includes: performing path blocking intervention simulation on the first causal path edge, observing the conditional probability distribution change trajectory of the response strategy content semantic vector when the user's long-term interest preference distribution vector takes different intervention values, and calculating the first intervention effect measure of the user's long-term interest preference distribution vector on the response strategy content semantic vector based on the conditional probability distribution change trajectory; The second causal path edge is subjected to path blocking intervention simulation processing. The vector value of the query intent core semantic vector is fixed to each discrete value within the reference distribution range. The conditional probability distribution change trajectory of the response strategy content semantic vector when the query intent core semantic vector takes each discrete value is observed. The second intervention effect measurement value of the query intent core semantic vector on the response strategy content semantic vector is calculated based on the conditional probability distribution change trajectory. The second intervention effect measure is compared and analyzed with the first intervention effect measure, and the ratio of the second intervention effect measure to the first intervention effect measure is calculated as the relative importance index of the causal path. Based on the relative importance index of the causal path, the first causal path edge and the second causal path edge in the initial causal graph topology are weighted, and a first path edge weight value is assigned to the first causal path edge, and a second path edge weight value is assigned to the second causal path edge. 
Dual contribution sorting is performed on the initial response strategy text content based on the first path edge weight value and the second path edge weight value, generating, for each response strategy text unit in the initial response strategy text content, a first weighted contribution score corresponding to the first causal path edge and a second weighted contribution score corresponding to the second causal path edge. The first weighted contribution score and the second weighted contribution score are weighted and summed to obtain the comprehensive causal contribution score of each response strategy text unit. Based on the comprehensive causal contribution score, the response strategy text units are sorted to generate a comprehensive contribution ranking list.
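
As a rough illustration of claim 6, the sketch below computes the relative importance index as the ratio of the second to the first intervention effect measure and combines per-unit scores into a comprehensive ranking. The claim does not state how the edge weight values are derived from the index or which coefficients the weighted sum uses; the mapping `w1 = 1 / (1 + r)`, `w2 = r / (1 + r)` and the default `mix=(0.5, 0.5)` are assumptions, as are the function names.

```python
def path_edge_weights(first_effect, second_effect):
    """Return (relative importance index, first edge weight, second edge weight)."""
    if first_effect == 0:
        raise ValueError("first intervention effect measure must be non-zero")
    ratio = second_effect / first_effect  # relative importance index
    # Assumed mapping from the index to weights that sum to one
    # (valid for non-negative effect measures).
    return ratio, 1.0 / (1.0 + ratio), ratio / (1.0 + ratio)


def comprehensive_ranking(unit_scores, first_effect, second_effect, mix=(0.5, 0.5)):
    """unit_scores maps unit_id -> (score on first edge, score on second edge)."""
    _, w1, w2 = path_edge_weights(first_effect, second_effect)
    ranked = []
    for unit_id, (c1, c2) in unit_scores.items():
        first_weighted = w1 * c1   # first weighted contribution score
        second_weighted = w2 * c2  # second weighted contribution score
        ranked.append((unit_id, mix[0] * first_weighted + mix[1] * second_weighted))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


ranking = comprehensive_ranking({"a": (0.9, 0.1), "b": (0.2, 0.8)}, 0.4, 0.8)
```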

7. The intelligent interactive decision interpretability analysis method combining causal reasoning according to claim 1, characterized in that, The method further includes, after obtaining the raw interaction log data stream of the intelligent interaction system, which includes the user's unique identifier code, the original query text content submitted by the user, and the initial response strategy text content pushed by the interaction engine in response to the original query text content: The original interaction log data stream is divided into multiple consecutive time window data blocks according to the timestamp information of each interaction event in the original interaction log data stream and the preset time window length. Perform deduplication and statistical processing on the unique user identifier codes in each time window data block, count the number of different unique user identifier codes appearing in each time window data block, and generate the active user count statistics for each time window data block. Perform query frequency statistics on the original query text content within each time window data block, count the total number of times the original query text content appears within each time window data block, and generate the total query statistics value corresponding to each time window data block. The initial response strategy text content within each time window data block is subjected to response strategy diversity analysis. The strategy type tags of the initial response strategy text content within each time window data block are extracted. The frequency of occurrence of different strategy type tags is counted, and the proportion of each strategy type frequency to the total frequency of strategies in that time window is calculated. The response strategy type proportion distribution vector corresponding to each time window data block is generated. 
The active user count sequence and the total query count sequence are standardized respectively. The standardized active user count sequence, the standardized total query count sequence, and the response strategy type proportion distribution vector sequence are then time-series aligned and concatenated to generate the interaction traffic time-series feature matrix corresponding to the original interaction log data stream.
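
The windowing and feature concatenation of claim 7 can be sketched as follows. The event tuple layout, the z-score form of standardization, the treatment of empty windows as zero rows, and the names `zscore` and `traffic_feature_matrix` are assumptions for illustration, not details given in the claim.

```python
from collections import Counter
from statistics import fmean, pstdev


def zscore(values):
    """Standardize a sequence; a constant sequence maps to zeros."""
    mu, sigma = fmean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def traffic_feature_matrix(events, window_len, tags):
    """events: non-empty iterable of (timestamp, user_id, query_text, strategy_tag)."""
    blocks = {}
    for ts, user, query, tag in events:
        blocks.setdefault(int(ts // window_len), []).append((user, query, tag))
    users, queries, proportions = [], [], []
    # Walk consecutive windows, including any that contain no events.
    for i in range(min(blocks), max(blocks) + 1):
        block = blocks.get(i, [])
        users.append(len({u for u, _, _ in block}))  # deduplicated active users
        queries.append(len(block))                   # total queries in the window
        counts = Counter(t for _, _, t in block)
        total = sum(counts[t] for t in tags)
        proportions.append([counts[t] / total if total else 0.0 for t in tags])
    zu, zq = zscore(users), zscore(queries)
    # One row per window: standardized counts followed by tag proportions.
    return [[zu[k], zq[k], *proportions[k]] for k in range(len(zu))]


events = [
    (0, "u1", "q1", "faq"),
    (5, "u2", "q2", "faq"),
    (7, "u1", "q3", "recommend"),
    (12, "u3", "q4", "recommend"),
]
matrix = traffic_feature_matrix(events, 10, ["faq", "recommend"])
```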

8. The intelligent interactive decision interpretability analysis method combining causal reasoning according to claim 1, characterized in that, The method further includes: taking the user's long-term interest preference distribution vector and the query intent core semantic vector as antecedent variables, and the response strategy content semantic vector as a consequent variable, inputting them into a counterfactual causal reasoning framework for causal path search processing, and generating a first causal path edge starting from the user's long-term interest preference distribution vector and a second causal path edge starting from the query intent core semantic vector; The first causal path edge is subjected to path strength quantization processing, and the path coefficient estimate between the user's long-term interest preference distribution vector and the response strategy content semantic vector is calculated. The path coefficient estimate is used as the initial value of the path strength of the first causal path edge. The second causal path edge is subjected to path strength quantization processing, and the path coefficient estimate between the core semantic vector of the query intent and the semantic vector of the response strategy content is calculated. The path coefficient estimate is used as the initial value of the path strength of the second causal path edge. The initial values of the path strength of the first causal path edge and the second causal path edge are input into the path strength normalization function for normalization processing to generate the normalized strength values of the first causal path edge and the second causal path edge, which are in the range of zero to one. The path edges of the initial causal graph topology are thickened based on the normalized strength values of the first and second causal path edges.
In the initial causal graph topology, the thickness of the visual line of the first causal path edge is associated with the normalized strength value of the first causal path edge, and the thickness of the visual line of the second causal path edge is associated with the normalized strength value of the second causal path edge. The initial causal graph topology structure after path edge thickening is stored in the causal graph structure cache database, and a version identifier and timestamp information are assigned to the initial causal graph topology structure after path edge thickening.
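
A minimal sketch of the normalization and edge thickening in claim 8 follows. The claim names a path strength normalization function without defining it; dividing each absolute path coefficient by the sum of absolute coefficients is an assumed choice that keeps values between zero and one. The pixel range, the record layout, and the function names are likewise hypothetical.

```python
import time


def normalize_strengths(coefficients):
    """Map path coefficient estimates to values in the range zero to one."""
    magnitudes = {edge: abs(c) for edge, c in coefficients.items()}
    total = sum(magnitudes.values())
    if total == 0:
        return {edge: 0.0 for edge in magnitudes}
    return {edge: m / total for edge, m in magnitudes.items()}


def thicken_edges(coefficients, min_px=1.0, max_px=5.0, version="v1"):
    """Attach a line width to each edge and stamp the record for caching."""
    strengths = normalize_strengths(coefficients)
    edges = {
        edge: {
            "strength": s,
            # Line thickness grows linearly with the normalized strength.
            "line_width_px": min_px + s * (max_px - min_px),
        }
        for edge, s in strengths.items()
    }
    return {"version": version, "timestamp": time.time(), "edges": edges}


graph = thicken_edges({"interest->response": 0.6, "intent->response": -0.2})
```

The returned dictionary stands in for the entry written to the causal graph structure cache database, with `version` and `timestamp` playing the role of the version identifier and timestamp information.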

9. The intelligent interactive decision interpretability analysis method combining causal reasoning according to claim 1, characterized in that, The method further includes: performing path blocking intervention simulation on the first causal path edge, observing the conditional probability distribution change trajectory of the response strategy content semantic vector when the user's long-term interest preference distribution vector takes different intervention values, and calculating the first intervention effect measure of the user's long-term interest preference distribution vector on the response strategy content semantic vector based on the conditional probability distribution change trajectory; Obtain the original distribution function of the user's long-term interest preference distribution vector in the natural state, randomly sample multiple natural state sample values from the original distribution function, and record the natural state observation value of the response strategy content semantic vector corresponding to each natural state sample value; An intervention operation is applied to the user's long-term interest preference distribution vector, forcibly setting the vector value of the user's long-term interest preference distribution vector to a preset intervention target value. Multiple intervention state sample values are generated by sampling from the post-intervention distribution corresponding to the intervention target value, and the intervention state observation value of the response strategy content semantic vector corresponding to each intervention state sample value is recorded. The difference between the intervention state observation value and the natural state observation value is calculated to obtain the set of changes in the semantic vector of the response strategy content caused by the change of the user's long-term interest preference distribution vector from the natural state to the intervention target value.
The set of changes is statistically summarized, and the average value of the set of changes is calculated as the average intervention effect estimate, and the standard deviation of the set of changes is calculated as the intervention effect volatility estimate. The average intervention effect estimate and the intervention effect volatility estimate are combined to form a supplementary descriptive parameter for the first intervention effect measure of the user's long-term interest preference distribution vector on the response strategy content semantic vector.
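
The average intervention effect estimate and the volatility estimate of claim 9 can be sketched as below. For brevity the vectors are reduced to scalars, and the response model `respond` is a hypothetical stand-in for whatever maps an interest value to a response observation; neither simplification comes from the claim. The population standard deviation is used for the volatility, which is also an assumption.

```python
from statistics import fmean, pstdev


def intervention_effect(natural_samples, respond, target):
    """Return (average intervention effect, intervention effect volatility)."""
    # Natural state: observe the response for each sampled interest value.
    natural_obs = [respond(x) for x in natural_samples]
    # Intervention state: the interest value is forced to the target value.
    intervened_obs = [respond(target) for _ in natural_samples]
    # Set of changes: intervention observation minus natural observation.
    changes = [yi - yn for yi, yn in zip(intervened_obs, natural_obs)]
    return fmean(changes), pstdev(changes)


# Toy response model (assumption): the response scales linearly with interest.
average_effect, volatility = intervention_effect(
    [0.0, 1.0, 2.0], lambda x: 2.0 * x, 3.0
)
```

With a noisy `respond`, repeated calls at the target value would differ, which corresponds to sampling from the post-intervention distribution described in the claim.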

10. An intelligent interactive decision interpretability analysis system combining causal reasoning, characterized in that the system includes a processor and a computer-readable storage medium storing machine-executable instructions that, when executed by the processor, implement the intelligent interactive decision interpretability analysis method combining causal reasoning as described in any one of claims 1-9.