Intelligent pre-judgment and adaptive quality control method for credibility of interactive behavior dynamic modeling

By collecting user interaction events in real time, constructing individual dynamic baselines and matching them with an anomaly pattern library, the problem of real-time dynamic analysis of user behavior characteristics in online evaluation systems is solved, improving the accuracy and personalization of credibility prediction, identifying and distinguishing abnormal behaviors, and ensuring the effectiveness of evaluation results.

CN122196478A · Pending · Publication Date: 2026-06-12 · HANGZHOU SEVENTH PEOPLES HOSPITAL

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU SEVENTH PEOPLES HOSPITAL
Filing Date
2026-03-16
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In online assessment systems, users may exhibit behaviors such as perfunctory responses, random selections, copying and pasting external content, or distraction during the evaluation process. This results in data quality that fails to accurately reflect the user's actual state. Existing technologies lack real-time dynamic analysis of user behavior characteristics and do not consider individual differences, leading to a high rate of misjudgment.

Method used

By collecting user interaction events in real time, extracting multi-dimensional behavioral feature values, constructing an individual dynamic baseline using an exponentially weighted moving average method, and matching the deviation vector with an abnormal behavior pattern library, a verification inquiry process is triggered, the credibility level is adaptively adjusted, and abnormal behaviors such as mechanically rapid, hesitant and repetitive, and inattentive behavior are identified.

Benefits of technology

It enables real-time process monitoring of users' cognitive state and response attitude, reduces misjudgments, improves the accuracy and personalization of credibility prediction, can identify various abnormal behaviors and distinguish between normal and abnormal behaviors through verification queries, and ensures the effectiveness of evaluation results.

✦ Generated by Eureka AI based on patent content.

Figure CN122196478A_ABST

Abstract

The application provides a credibility intelligent pre-judgment and self-adaptive quality control method for dynamic modeling of interactive behaviors, which collects interactive behavior events in the user answering process in real time, extracts multi-dimensional behavior characteristic values, constructs an individual dynamic baseline evolving with the user answering process by using an exponential weighted moving average method, calculates a deviation vector of each characteristic relative to the individual baseline, and matches a preset abnormal behavior pattern library to pre-judge a credibility level; triggers a verification inquiry process for a question item matching an abnormal pattern, and adaptively corrects the credibility level mark according to the user response. The application realizes real-time abnormal behavior recognition and closed-loop quality control based on individualized dynamic baseline, and effectively solves the problems of high misjudgment rate of a fixed threshold and lack of process monitoring in the prior art.

Description

Technical Field

[0001] This invention relates to the fields of computer information processing and artificial intelligence technology, specifically to a method for intelligent prediction and adaptive quality control of credibility in dynamic modeling of interactive behavior.

Background Technology

[0002] With the rapid development of internet technology, online assessment systems have been widely used in fields such as psychological testing, educational examinations, and questionnaire surveys. These systems present assessment items to users online, collect users' subjective response data, and then conduct psychological state assessments, knowledge and ability tests, or opinion and attitude surveys. However, because the online assessment environment lacks the on-site supervision of proctors found in traditional offline assessments, users may exhibit behaviors that affect data quality, such as perfunctory answers, random selections, copying and pasting external content, or distraction. This results in the collected response data failing to accurately reflect the user's actual state or true thoughts, seriously affecting the validity and credibility of the assessment results.

[0003] In existing technologies, the following methods are mainly used to address the credibility issue of online assessment data. The first method sets lie-detection questions or consistency checks: semantically opposite or similar questions are inserted into a scale, and the seriousness of the responses is judged from the consistency of the user's answers. The limitation of this method is that the check can only be performed after the assessment and cannot detect problems in real time during the response process. Furthermore, adding lie-detection questions increases the scale length, which may cause user aversion. The second method sets a minimum response time limit: if a user's time to complete a question is less than a preset threshold, the response is considered invalid. This method uses a fixed time threshold and does not consider individual differences between users, making it prone to misjudging users who read quickly or are familiar with the question content. The third method identifies anomalies by analyzing the distribution patterns of users' answers, such as detecting whether users selected the same option too many times or whether the answers exhibit obvious regularities. This method, too, can only be applied after the assessment, and it has difficulty identifying random responses.

[0004] A common problem with the existing methods is the lack of real-time dynamic analysis of user behavior during the response process. A user's cognitive state and attitude towards answering questions are reflected in their interactive behavior. For example, response speed, modification frequency, and input rhythm can reflect whether the user has carefully read the question, considered the answer before answering, and whether there is any lack of concentration. Existing technologies fail to fully utilize this procedural behavioral data for credibility assessment. Furthermore, existing methods generally employ fixed judgment criteria, failing to consider the behavioral differences between individual users. Different users have different cognitive styles and operating habits; some users are naturally quick to react, while others prefer to deliberate carefully before answering. Using a uniform, fixed threshold for judgment inevitably leads to a high false positive rate.

[0005] Therefore, there is a need for a method to predict the reliability of assessment data by using an adaptive quality control mechanism to improve the accuracy of judgment.

Summary of the Invention

[0006] To overcome the shortcomings of existing technologies, this invention proposes a method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior. The core of this method is: real-time collection of interactive behavior events during user response, extraction of multi-dimensional behavioral feature values, construction of individual dynamic baselines using an exponentially weighted moving average method, prediction of credibility level by matching deviation vectors with an abnormal behavior pattern library, triggering a verification query process for items matching abnormal patterns, and adaptively correcting the credibility level label based on user responses.

[0007] Specifically, the method includes the following steps:

S1: During the user's response to the assessment questions, the computer system collects the user's interactive behavior events in real time through an event listener. The interactive behavior events include keyboard input events, mouse click events, touch events, option selection events, text pasting events, deletion operation events, and submit operation events. The computer system generates an event record for each collected interactive behavior event, which includes the event type, event content, millisecond-level timestamp, and event sequence number, organizes the records into an event record sequence in chronological order, and stores the sequence in a memory buffer.

S2: Based on the event record sequence, the processor calculates the following four behavioral feature values for the current question item:
  • First response delay, defined as the time interval between the completion of the rendering of the question item content and the user's first valid input event;
  • Editing efficiency ratio, defined as the ratio of the effective character length of the final submitted content to the cumulative number of operations, reflecting the accuracy of the user's input;
  • Rollback density, defined as the ratio of the number of deletion or rollback operations to the answering time, reflecting the frequency of the user's modification behavior;
  • Rhythm stability coefficient, defined as the ratio of the geometric mean to the arithmetic mean of the time interval sequence of adjacent input events, with a value range of 0 to 1, reflecting the uniformity and stability of the user's input rhythm.

S3: The processor maintains the sequence of behavioral feature values of the questions completed by the current user in this evaluation session. For each behavioral feature, the processor uses an exponentially weighted moving average method to calculate the individual dynamic baseline value of the feature. For the first question in the evaluation session, the individual dynamic baseline value is taken as the initial value of the group baseline corresponding to the question type. For each subsequent question, the individual dynamic baseline value is obtained as a weighted combination of the feature value of the previous question and the baseline value of the previous question. The smoothing coefficient determines the relative weight of the recent feature value and the historical baseline, thereby constructing a personalized behavior model that dynamically evolves with the user's answering process.

S4: For each behavioral feature value of the current question item, the processor calculates its deviation from the corresponding individual dynamic baseline value. The deviation is the difference between the current feature value and the individual dynamic baseline value, divided by the individual dynamic baseline value. The sign of the deviation indicates the direction of deviation: a positive value indicates that the current feature value is higher than the baseline, and a negative value indicates that it is lower than the baseline. The processor combines the first response delay deviation, editing efficiency ratio deviation, rollback density deviation, and rhythm stability coefficient deviation to form a four-dimensional deviation vector for the current question item.

S5: The processor matches the deviation vector with a preset abnormal behavior pattern library. The abnormal behavior pattern library stores multiple abnormal pattern templates. Each abnormal pattern template is defined as a logical AND combination of multiple conditional constraints that the deviation vector must satisfy simultaneously. The processor traverses all abnormal pattern templates in the abnormal behavior pattern library. When the deviation vector satisfies all the conditional constraints defined by a certain abnormal pattern template, the processor automatically determines that the current item matches the abnormal pattern and records the identifier of the abnormal pattern in the matching result set. The processor predicts the credibility level of the current item based on whether the matching result set is empty.

S6: When the matching result set is empty, meaning the current item does not match any abnormal pattern, the processor sets the credibility level of the current item to normal, includes the behavioral feature values of the current item in the subsequent baseline calculation, and controls the evaluation process to proceed to the next item. When the matching result set is not empty, meaning the current item matches at least one abnormal pattern, the processor sets the credibility level of the current item to pending verification, retrieves the corresponding verification query template from the verification query configuration library according to the abnormal pattern type in the matching result set, generates the verification query content, and presents it to the user. Based on the user's response to the verification query, the processor adaptively decides whether to correct the credibility level identifier, whether to include the behavioral feature values of the current item in the subsequent baseline calculation, and whether to adjust the trigger threshold of the abnormal pattern template.
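Steps S3 and S4 can be illustrated with a minimal Python sketch. The feature keys, function names, and numeric values are illustrative assumptions, not taken from the application; the sketch also assumes every baseline value is nonzero, since the text does not say how a zero baseline is handled.

```python
# Hypothetical feature keys; the application names the features but not identifiers.
FEATURES = (
    "first_response_delay",
    "editing_efficiency_ratio",
    "rollback_density",
    "rhythm_stability",
)


def ewma_update(prev_baseline: float, prev_value: float, alpha: float) -> float:
    """S3: weight the previous item's feature value against the previous baseline."""
    return alpha * prev_value + (1.0 - alpha) * prev_baseline


def deviation_vector(values: dict, baseline: dict) -> dict:
    """S4: relative deviation (value - baseline) / baseline for each feature.

    Assumes baseline[f] != 0 for every feature (not specified in the source).
    """
    return {f: (values[f] - baseline[f]) / baseline[f] for f in FEATURES}


# Illustrative numbers only.
baseline = {"first_response_delay": 2000.0, "editing_efficiency_ratio": 0.5,
            "rollback_density": 0.1, "rhythm_stability": 0.8}
item = {"first_response_delay": 500.0, "editing_efficiency_ratio": 0.5,
        "rollback_density": 0.05, "rhythm_stability": 0.8}
deviations = deviation_vector(item, baseline)
```

In this example the first response delay deviation is negative (the user answered much faster than their own baseline), which is the kind of signal the pattern templates in S5 test for.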

[0008] Furthermore, the abnormal behavior pattern templates stored in the abnormal behavior pattern library include:

The mechanical fast pattern is used to identify behaviors such as copying and pasting, quickly filling in preset answers, or randomly clicking. Its conditional constraints are that the first response delay deviation is negative with an absolute value exceeding the first threshold, and the rollback density deviation is lower than the second threshold. That is, the first response delay is significantly lower than the individual baseline and rollback operations are very rare or absent.

The hesitation and repetition pattern is used to identify repeated modification behaviors caused by the user's difficulty in understanding the content of the question, emotional fluctuations, or lack of concentration. Its conditional constraints are that the rollback density deviation is positive and exceeds the third threshold, and the rhythm stability coefficient deviation is negative with an absolute value exceeding the fourth threshold. That is, the number of deletion and rollback operations is significantly higher than the individual baseline and the input rhythm is significantly unstable.

The distraction pattern is used to identify behaviors that may indicate that a user is distracted, inattentive, or engaged in other activities simultaneously. Its conditional constraints are that the first response delay deviation is positive and exceeds the fifth threshold, and the editing efficiency ratio deviation is negative with an absolute value exceeding the sixth threshold. That is, the first response delay is significantly higher than the individual baseline and the editing efficiency is significantly lower than the individual baseline.
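The three templates above, each a logical AND of constraints on the deviation vector, can be sketched as follows. The threshold values, dictionary keys, and pattern identifiers are placeholders chosen for illustration; the application does not give concrete numbers.

```python
from typing import Callable, Dict, List

Deviation = Dict[str, float]

# Placeholder thresholds (first to sixth); real values are not given in the source.
T = {"t1": 0.5, "t2": 0.0, "t3": 0.5, "t4": 0.3, "t5": 0.5, "t6": 0.3}

# Each template is a list of constraints that must all hold (logical AND).
PATTERNS: Dict[str, List[Callable[[Deviation], bool]]] = {
    "mechanical_fast": [
        lambda d: d["first_response_delay"] < -T["t1"],  # negative, |dev| > t1
        lambda d: d["rollback_density"] < T["t2"],       # lower than t2
    ],
    "hesitation_repetition": [
        lambda d: d["rollback_density"] > T["t3"],       # positive, > t3
        lambda d: d["rhythm_stability"] < -T["t4"],      # negative, |dev| > t4
    ],
    "distraction": [
        lambda d: d["first_response_delay"] > T["t5"],          # positive, > t5
        lambda d: d["editing_efficiency_ratio"] < -T["t6"],     # negative, |dev| > t6
    ],
}


def match_patterns(dev: Deviation) -> List[str]:
    """Traverse every template and collect the identifiers that fully match."""
    return [name for name, conds in PATTERNS.items() if all(c(dev) for c in conds)]


def prejudge(dev: Deviation) -> str:
    """An empty match set means normal; otherwise the item awaits verification."""
    return "pending_verification" if match_patterns(dev) else "normal"
```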

[0009] Furthermore, the abnormal behavior pattern templates stored in the abnormal behavior pattern library also include:

The fatigue decline pattern is used to identify cognitive fatigue and slowed responses that users may experience when the evaluation runs too long. Its conditional constraints are that the first response delay deviation is positive and exceeds the seventh threshold, the rhythm stability coefficient deviation is negative with an absolute value exceeding the eighth threshold, and the current question number is greater than the preset question number threshold.

The paste-fill pattern is used to identify behaviors where users may directly paste external content instead of answering independently. Its conditional constraints are that there is a text paste event in the event record sequence, the ratio of the length of the pasted content to the length of the final submitted content exceeds the preset paste ratio threshold, and the first response delay deviation is negative with an absolute value exceeding the ninth threshold.

[0010] Further, in step S3, the method for obtaining the initial value of the group baseline is as follows: the processor pre-establishes a group baseline dataset for each question type, collects behavioral feature data of multiple users under the same question type to form the original dataset, and uses the interquartile range method to remove outliers for each behavioral feature. Specifically, the first quartile and the third quartile of the feature data are calculated, data points lying more than 1.5 times the interquartile range below the first quartile or above the third quartile are marked as outliers and removed from the dataset, and the median of each behavioral feature is calculated on the remaining data as the initial value of the group baseline for that question type.
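A minimal sketch of this group baseline initialization, using Python's standard `statistics` module. The function name is hypothetical, and the quartile convention (the module's default "exclusive" method) is an assumption, since the application does not specify one.

```python
import statistics
from typing import Sequence


def group_baseline_initial(samples: Sequence[float]) -> float:
    """Remove IQR outliers from one feature's samples, then return the median."""
    q1, _, q3 = statistics.quantiles(samples, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [x for x in samples if low <= x <= high]
    return statistics.median(kept)


# Illustrative first-response delays in seconds; 30.0 is an obvious outlier.
delays = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 30.0]
initial = group_baseline_initial(delays)
```

The same routine would be run once per behavioral feature and per question type.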

[0011] Furthermore, in step S3, the smoothing coefficient ranges from 0.1 to 0.4. When the number of items completed by the user in the current evaluation session is less than the preset minimum sample size, the processor dynamically adjusts the smoothing coefficient by multiplying the smoothing coefficient by the ratio of the number of completed items to the minimum sample size. This enhances the weight of the group baseline in the early stage of evaluation and enhances the weight of individual characteristics in the later stage, thereby realizing the progressive personalized modeling of the individual dynamic baseline model.
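The warm-up scaling of the smoothing coefficient can be sketched as below. The function and parameter names are illustrative; only the 0.1 to 0.4 range and the scaling rule come from the text.

```python
def effective_alpha(alpha: float, completed: int, min_samples: int) -> float:
    """Scale the smoothing coefficient while fewer than min_samples items are done."""
    if not 0.1 <= alpha <= 0.4:
        raise ValueError("smoothing coefficient is expected to lie in [0.1, 0.4]")
    if completed < min_samples:
        return alpha * completed / min_samples
    return alpha
```

With `alpha=0.3` and `min_samples=5`, the effective coefficient is 0 before any item is completed, so the baseline stays at the group value, and it reaches the full 0.3 from the fifth completed item onward.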

[0012] Furthermore, the specific calculation method of the rhythm stability coefficient is as follows: the processor extracts the timestamps of all input events during the current question answering process from the event record sequence, calculates the time intervals between adjacent input events to form a time interval sequence, and the rhythm stability coefficient is equal to the geometric mean of the time interval sequence divided by the arithmetic mean; when there is a zero value or a negative value in the time interval sequence, the processor replaces the abnormal interval value with the preset minimum interval value; when the number of time intervals is less than 2, the processor sets the rhythm stability coefficient to the default value of 1.
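The rhythm stability coefficient can be sketched as follows. The default minimum interval of 1 ms is an assumed placeholder for the "preset minimum interval value". By the AM-GM inequality the geometric mean never exceeds the arithmetic mean, so the result lies in (0, 1], with 1 reached only when all intervals are equal.

```python
import math
from typing import Sequence


def rhythm_stability(timestamps_ms: Sequence[float], min_interval_ms: float = 1.0) -> float:
    """Geometric mean divided by arithmetic mean of adjacent input intervals."""
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(intervals) < 2:
        return 1.0  # default value when there are too few intervals
    # Replace zero or negative intervals with the preset minimum.
    intervals = [iv if iv > 0 else min_interval_ms for iv in intervals]
    geometric = math.exp(sum(math.log(iv) for iv in intervals) / len(intervals))
    arithmetic = sum(intervals) / len(intervals)
    return geometric / arithmetic
```

For example, intervals of 10 ms and 1000 ms give a geometric mean of 100 and an arithmetic mean of 505, so the coefficient is about 0.2, indicating an uneven rhythm.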

[0013] Furthermore, the specific calculation method for the editing efficiency ratio is as follows: the processor counts the cumulative number of operations performed by the user during the current question's answering process, the cumulative number of operations including the total number of keyboard key presses, mouse clicks, touch operations, and paste operations; the processor obtains the effective length of the final submitted content, the effective length for text-based questions is the number of characters after removing leading and trailing whitespace, and the effective length for selection-based questions is set to 1; the editing efficiency ratio is equal to the effective length divided by the cumulative number of operations; when the cumulative number of operations is zero, the processor sets the editing efficiency ratio to the median of the editing efficiency ratios in the group baseline.
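A minimal sketch of the editing efficiency ratio, with hypothetical parameter names. The caller is assumed to have already counted key presses, clicks, touches, and pastes into `operation_count`.

```python
def editing_efficiency_ratio(final_text: str, operation_count: int,
                             is_selection: bool, group_median: float) -> float:
    """Effective length of the submitted content divided by the operation count."""
    if operation_count == 0:
        return group_median  # fallback described in the text
    # Selection-type questions count as length 1; text is trimmed of outer whitespace.
    effective_length = 1 if is_selection else len(final_text.strip())
    return effective_length / operation_count
```

A user who types five characters with ten operations scores 0.5, while a pasted answer of 200 characters produced by a single paste operation scores 200, which is why unusually high ratios can indicate pasting.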

[0014] Furthermore, the specific calculation method for the rollback density is as follows: the processor identifies and counts the number of rollback-type operations performed by the user during the current question's answering process from the event record sequence. The rollback-type operations include backspace key press event, delete key press event, text selection and deletion event, undo operation event, and option deselection event. The processor calculates the answering time from the timestamp of the first input event to the timestamp of the submission event. The rollback density is equal to the number of rollback-type operations divided by the answering time. When the answering time is less than a preset minimum time threshold, the processor corrects the answering time to the minimum time threshold to avoid numerical anomalies.
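The rollback density calculation might look like the sketch below. The event type names, the use of seconds as the time unit, the 0.5 s minimum, and the return value of 0.0 when no input events exist are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical labels for the rollback-type operations listed in the text.
ROLLBACK_TYPES = {"backspace", "delete", "select_delete", "undo", "deselect"}


def rollback_density(events: List[Tuple[str, float]], submit_ts: float,
                     min_duration: float = 0.5) -> float:
    """Rollback operations per second between the first input and submission.

    `events` holds (event_type, timestamp_in_seconds) pairs for input events,
    in chronological order.
    """
    if not events:
        return 0.0  # assumption: no input events means no rollbacks
    rollbacks = sum(1 for kind, _ in events if kind in ROLLBACK_TYPES)
    # Clamp very short answering times to avoid numerical blow-up.
    duration = max(submit_ts - events[0][1], min_duration)
    return rollbacks / duration
```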

[0015] Furthermore, the specific calculation method for the first response delay is as follows: the processor records the timestamp of the current question content completing rendering, and the determination method for the completion of rendering is that the DOM element of the question content has been loaded and its visibility state has become visible; the processor identifies the first valid input event from the event recording sequence and obtains its timestamp, the valid input events include keyboard key events that can generate actual input content, option click or touch selection events, and text paste events, but do not include mouse movement events, focus switching events, and scrolling events; the first response delay is equal to the difference between the timestamp of the first valid input event and the timestamp of the question rendering completion; when the user skips directly without generating any valid input event for the current question, the processor sets the first response delay to a preset maximum value and marks the question as skipped in the event recording sequence.
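The first response delay rule can be sketched as follows. The event labels and the 600,000 ms ceiling are placeholders; the application only says that a preset maximum value is used for skipped items.

```python
from typing import List, Tuple

# Hypothetical labels for events that count as valid input.
VALID_INPUT = {"key_input", "option_select", "paste"}
MAX_DELAY_MS = 600_000  # placeholder for the preset maximum value


def first_response_delay(render_ts_ms: float,
                         events: List[Tuple[str, float]]) -> Tuple[float, bool]:
    """Return (delay_ms, skipped) for one question item.

    Mouse movement, focus switching, and scrolling are ignored because their
    labels are not in VALID_INPUT.
    """
    for kind, ts in events:
        if kind in VALID_INPUT:
            return ts - render_ts_ms, False
    return MAX_DELAY_MS, True  # no valid input: treat the item as skipped
```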

[0016] Furthermore, in step S6, the adaptive quality control response includes a verification query sub-process:

S6.1: The processor retrieves the corresponding verification query template from the verification query configuration library according to the matched abnormal pattern type. The verification query template includes the query type, query content text, response options, and processing logic corresponding to each response option.

S6.2: The processor generates a verification query interface based on the verification query template. The verification query interface is presented in the form of a pop-up window, an embedded area within a page, or an independent page, and includes query content text and interactive response options.

S6.3: The processor receives the user's response input to the verification query and executes the corresponding quality control processing logic according to the response option selected by the user. When the user's response confirms that the original answer is valid, the processor corrects the credibility level of the current question item to normal and includes the behavioral feature values of the current question item in the subsequent baseline calculation. When the user's response confirms that there is a problem with the original answer, the processor sets the credibility level of the current question item to abnormal, does not include the behavioral feature values of the current question item in the subsequent baseline calculation, and may selectively allow the user to answer the current question item again. When the user chooses to skip the verification query, the processor sets the credibility level of the current question item to unverified and marks the state in the evaluation output data.
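The three outcomes of step S6.3 can be sketched as a small dispatch function. The enum, the dictionary keys, and the choice to exclude skipped items from the baseline are assumptions; the text does not state how a skipped item affects the baseline.

```python
from enum import Enum


class Reply(Enum):
    CONFIRM_VALID = "confirm_valid"   # user confirms the original answer
    ADMIT_PROBLEM = "admit_problem"   # user confirms a problem with the answer
    SKIP = "skip"                     # user skips the verification query


def apply_verification(reply: Reply) -> dict:
    """Map the user's reply to a credibility level and baseline handling."""
    if reply is Reply.CONFIRM_VALID:
        return {"level": "normal", "include_in_baseline": True, "allow_retry": False}
    if reply is Reply.ADMIT_PROBLEM:
        return {"level": "abnormal", "include_in_baseline": False, "allow_retry": True}
    # Assumption: unverified items are kept out of the baseline.
    return {"level": "unverified", "include_in_baseline": False, "allow_retry": False}
```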

[0017] Furthermore, the verification query configuration library contains differentiated verification query templates configured for different abnormal pattern types: For the mechanical fast pattern, the verification query prompts the user to confirm whether the quickly given answer truly reflects their thoughts. The response options include an option to confirm that it reflects their true thoughts and an option to reconsider. For the hesitation and repetition pattern, the verification query prompts the user to confirm whether they are satisfied with the final answer after multiple revisions, and the response options include an "OK" option and a "Think about it again" option. For the distraction pattern, the verification query prompts the user to confirm whether they have carefully read and understood the question after spending a long time on it. The response options include an option indicating that the question has been understood and an option indicating that it needs to be read again.

[0018] Furthermore, it also includes session-level quality control monitoring steps: When the evaluation session is initialized, the processor creates an abnormality accumulation counter and sets its initial value to zero. Whenever the credibility judgment for a question is completed, the processor updates the abnormality accumulation counter according to the judgment result. When a question is judged to be normal, the counter is decremented by one step but not below zero. When a question is judged to be abnormal or unverified, the counter is incremented by one step. The step size can be set to different values according to the severity of the matched abnormal pattern. When the value of the abnormality accumulation counter exceeds the preset accumulation threshold, the processor determines that the current session has entered a continuous low-trust state, and the overall trust level of this session is marked as low trust in the metadata of the evaluation output data. The processor may selectively present an overall prompt to remind the user to pay attention to the quality of the answers. The processor may also selectively trigger a forced rest process that pauses the evaluation for a preset time before it can continue.
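A minimal sketch of the session-level counter. The class name, default step sizes, and threshold are illustrative assumptions.

```python
class SessionMonitor:
    """Accumulates abnormal or unverified items and flags a low-trust session."""

    def __init__(self, threshold: float, step_down: float = 1.0) -> None:
        self.threshold = threshold
        self.step_down = step_down
        self.counter = 0.0  # initialized to zero at session start

    def record(self, level: str, severity: float = 1.0) -> bool:
        """Update the counter for one judged item; return True if low trust."""
        if level == "normal":
            self.counter = max(0.0, self.counter - self.step_down)
        elif level in ("abnormal", "unverified"):
            self.counter += severity  # step size may depend on pattern severity
        return self.counter > self.threshold
```

Because normal items decrement the counter, isolated anomalies decay over time and only a sustained run of abnormal or unverified items pushes the session past the threshold.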

[0019] Furthermore, the adaptive quality control response in step S6 also includes a threshold adaptive adjustment mechanism: The processor maintains a matching statistics structure for each abnormal pattern template, which includes the total number of matches, the number of misjudgments, and the number of correct judgments. Each time an abnormal pattern match triggers a verification query, the processor updates the statistical data of the corresponding abnormal pattern template based on the user's response. When the user's response indicates that the original answer is valid, the processor records the match as a misjudgment. When the user's response confirms that there is a problem with the original answer, the processor records the match as a correct judgment. The processor periodically calculates the false positive rate of each abnormal pattern template. When the false positive rate of an abnormal pattern template exceeds the preset upper limit of the false positive rate and the total number of matches exceeds the preset minimum sample size, the processor increases the threshold of each condition constraint in the template by a preset adjustment amount, thereby adaptively increasing the trigger threshold of the abnormal pattern and reducing the false positive rate.
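The statistics structure and the threshold-raising rule can be sketched as below. All names are hypothetical, and the sketch only records matches for which the user actually answered the verification query; how skipped queries are counted is not specified in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatternStats:
    total: int = 0
    misjudged: int = 0   # user said the original answer was valid
    correct: int = 0     # user confirmed a problem

    def record(self, answer_confirmed_valid: bool) -> None:
        self.total += 1
        if answer_confirmed_valid:
            self.misjudged += 1
        else:
            self.correct += 1

    def false_positive_rate(self) -> float:
        return self.misjudged / self.total if self.total else 0.0


def maybe_raise(thresholds: List[float], stats: PatternStats,
                fp_upper: float, min_samples: int, step: float) -> List[float]:
    """Raise every constraint threshold when the template misfires too often."""
    if stats.total > min_samples and stats.false_positive_rate() > fp_upper:
        return [t + step for t in thresholds]
    return list(thresholds)
```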

[0020] Furthermore, the threshold adaptive adjustment mechanism also includes threshold lowering logic: The processor periodically calculates the missed detection risk index for each abnormal pattern template. The missed detection risk index is the proportion of items with a normal credibility level whose deviation vectors are close to, but have not reached, the trigger conditions of the abnormal pattern. When the missed detection risk index of a certain abnormal pattern template exceeds the preset risk threshold and the false positive rate is lower than the preset lower limit of the false positive rate, the processor reduces the threshold of each conditional constraint in the template by a preset adjustment amount, thereby adaptively lowering the trigger threshold of the abnormal pattern and reducing the missed detection risk. The processor sets a threshold fluctuation range for each abnormal pattern template to ensure that the dynamically adjusted threshold always stays within a reasonable range.
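One way to read the threshold lowering logic is sketched below. The notion of "close to the trigger condition" is modeled with a scalar `margin` on a single deviation magnitude, which is an assumption; the application does not define closeness precisely. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def miss_risk_index(normal_magnitudes: Sequence[float],
                    threshold: float, margin: float) -> float:
    """Share of normal items whose deviation falls just below the threshold."""
    if not normal_magnitudes:
        return 0.0
    near = sum(1 for m in normal_magnitudes if threshold - margin <= m < threshold)
    return near / len(normal_magnitudes)


def maybe_lower(thresholds: List[float], miss_risk: float, fp_rate: float,
                risk_limit: float, fp_lower: float, step: float,
                bounds: Tuple[float, float]) -> List[float]:
    """Lower thresholds when misses look likely, then clamp to the allowed range."""
    if miss_risk > risk_limit and fp_rate < fp_lower:
        thresholds = [t - step for t in thresholds]
    low, high = bounds
    return [min(max(t, low), high) for t in thresholds]
```

The clamp to `bounds` plays the role of the threshold fluctuation range, so repeated adjustments cannot push a template into triggering on nearly every item.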

[0021] Furthermore, in step S5, when the current item matches two or more abnormal patterns simultaneously, the processor executes a composite quality control process in step S6:

S6.A: The processor sorts the abnormal patterns in the matching result set according to a preset priority, wherein the priority is determined based on the severity of the abnormal pattern and the verification cost.

S6.B: The processor retrieves the verification query template corresponding to each abnormal pattern from the verification query configuration library according to the sorted abnormal pattern sequence, and generates a sequence of verification query content.

S6.C: The processor sequentially presents the verification query content to the user, receives the user's response to each verification query, and updates the exclusion status of each abnormal pattern according to the response. When the user's response to a certain verification query indicates that the original answer is valid, the processor marks the corresponding abnormal pattern as excluded. When the user's response to a certain verification query confirms that there is a problem, the processor marks the corresponding abnormal pattern as confirmed and terminates the presentation of subsequent verification queries.

S6.D: When all abnormal patterns are marked as excluded, the processor corrects the credibility level of the current item to normal; when at least one abnormal pattern is marked as confirmed, the processor sets the credibility level to abnormal.
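The composite process S6.A to S6.D can be sketched as a loop with early termination. The `ask` callback stands in for presenting a verification query and is a hypothetical interface; it returns True when the user says the original answer is valid. Lower priority numbers are assumed to be asked first.

```python
from typing import Callable, Dict, List, Tuple


def composite_check(matched: List[str], priority: Dict[str, int],
                    ask: Callable[[str], bool]) -> Tuple[str, Dict[str, str]]:
    """Query matched patterns in priority order; stop at the first confirmation."""
    status: Dict[str, str] = {}
    for pattern in sorted(matched, key=lambda p: priority[p]):  # S6.A
        if ask(pattern):                                        # S6.C
            status[pattern] = "excluded"
        else:
            status[pattern] = "confirmed"
            break  # no further queries once a problem is confirmed
    level = "abnormal" if "confirmed" in status.values() else "normal"  # S6.D
    return level, status
```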

[0022] Furthermore, it also includes the quality control result output step: After all questions have been answered, the processor generates an evaluation output data packet. The evaluation output data packet includes evaluation result data, credibility metadata, session-level quality control indicators, and behavioral feature statistics. The evaluation result data includes the user's answers and scores for each question. The credibility metadata includes the credibility level identifier for each question, a list of matched abnormal pattern types, and verification query response records. The session-level quality control indicators include the overall credibility level of this evaluation session, the proportion of abnormal questions, and the final value of the abnormality accumulation counter. The behavioral feature statistics include the session mean, standard deviation, and distribution characteristics of each behavioral feature. The processor encapsulates the evaluation output data packet in a structured manner and either outputs it as a data file in a preset format or returns it to the calling system through an interface.

[0023] Furthermore, it also includes differentiated modeling steps based on question type: The processor is pre-configured with a question type classification system, which includes at least single-choice questions, multiple-choice questions, fill-in-the-blank questions, short answer questions, scale questions, and ranking questions. For different question types, the processor uses differentiated behavioral feature calculation parameters and abnormal pattern thresholds. For selection-type questions, including single-choice, multiple-choice, and scale questions, the processor mainly uses the first response delay and rollback density to predict credibility and reduces the weight of the editing efficiency ratio. For text input questions, including fill-in-the-blank and short-answer questions, the processor comprehensively considers all four dimensions of behavioral features. For ranking questions, the processor additionally calculates the trajectory features of drag operations and incorporates them into the prediction criteria. For different question types, the processor maintains independent group baseline datasets and individual dynamic baseline value sequences to achieve refined dynamic modeling for each question type.

[0024] Furthermore, it also includes a multi-device adaptive step: When the evaluation session is initialized, the processor detects the type of device currently being used by the user, and the device type includes at least desktop computers, tablets, and smartphones. For different device types, the processor adopts different interactive behavior event collection methods and behavior feature calculation parameters. For desktop computers, the processor collects keyboard input events and mouse operation events, while for tablets and smartphones, the processor collects touch events, virtual keyboard input events, and gesture operation events. For different device types, the processor maintains an independent group baseline dataset to eliminate the impact of device differences on behavior modeling and credibility prediction.

[0025] Furthermore, the method also includes privacy protection processing steps: In the S1 interactive behavior event acquisition step, the processor only collects the structured feature data of the user's interactive behavior, and does not collect or store the specific text content entered by the user. After the S2 multidimensional behavioral feature extraction step is completed, the processor deletes the original event record sequence from memory and retains only the extracted behavioral feature values. In the quality control result output step, the processor desensitizes the behavioral feature statistics by mapping continuous feature values to discrete levels or quantile rankings, thereby minimizing the exposure of personal behavioral data while preserving the credibility prediction and quality control functions.

[0026] Compared with the prior art, the beneficial effects of the present invention are: 1. This invention provides a method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior. By collecting interactive behavior events during the user's answering process in real time, it extracts behavioral features in four dimensions: first response delay, editing efficiency ratio, rollback density, and rhythm stability coefficient. This enables process monitoring of the user's cognitive state and answering attitude. Compared with existing methods that rely solely on the answer content for post-event verification, this method can detect abnormal behaviors that may affect data credibility in real time during the user's answering process, and intervene in quality control in a timely manner, avoiding the accumulation of a large amount of low-quality data.

[0027] 2. This invention provides a method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior. It uses an exponentially weighted moving average method to construct an individual dynamic baseline that evolves dynamically with the user's response process. Anomaly judgment is made by calculating the deviation of the current behavioral characteristics from the individual baseline. This fully considers the individual differences between different users and avoids the misjudgment problem of fast-response users or cautious users caused by the use of fixed thresholds in the prior art. It significantly improves the accuracy and personalization of credibility prediction.

[0028] 3. This invention provides a method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior. By performing pattern matching against a preset abnormal behavior pattern library, it can identify a variety of typical abnormal response behaviors, such as mechanically rapid, hesitant and repetitive, and inattentive answering. Different verification inquiry processes are triggered for different abnormal patterns. Through interaction with the user, the method distinguishes genuinely abnormal behavior from normal behavioral variation. The credibility level label is adaptively adjusted according to the user's response, which avoids misjudging normal behavior and ensures effective identification of genuinely abnormal behavior.

Description of the Drawings

[0029] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0030] Figure 1 is a schematic diagram of the system architecture of the present invention; Figure 2 is a schematic diagram of the individual dynamic baseline evolution of the present invention.

Detailed Implementation

[0031] The technical solution of the present invention will be more clearly and completely explained below with reference to the accompanying drawings and through the description of preferred embodiments of the present invention.

[0032] As shown in Figure 1, the method of the present invention includes the following steps in sequence: The S1 interactive behavior event acquisition step captures in real time the interactive behavior events generated by the user during the answering process, such as keyboard input, mouse clicks, touch operations, text pasting, deletion and rollback, and submission, and generates an event record sequence with millisecond-level timestamps. The S2 multidimensional behavioral feature extraction step calculates behavioral feature values in four dimensions from the event record sequence: first response delay, editing efficiency ratio, rollback density, and rhythm stability coefficient. The S3 individual dynamic baseline modeling step uses an exponentially weighted moving average method to construct a personalized behavioral baseline model that evolves dynamically with the user's response process. The S4 feature deviation quantification step calculates the degree of deviation of each behavioral feature value of the current item from the corresponding individual dynamic baseline value and forms a deviation vector. The S5 credibility intelligent prediction step matches the deviation vector against a preset abnormal behavior pattern library to identify abnormal answering patterns such as mechanically rapid, hesitant and repetitive, and inattentive answering. The S6 adaptive quality control response step performs branching processing based on the matching results. When no abnormal pattern is matched, the credibility level is set to normal and the evaluation process proceeds to the next item. When at least one abnormal pattern is matched, the corresponding verification query process is triggered, and the user's response to the verification query is used to determine whether the answer is valid. If the user confirms that the original answer is valid, the credibility level is corrected to normal. If the user confirms that the original answer has a problem, the credibility level is set to abnormal.

[0033] As shown in Figure 2, taking the first response delay as an example, the diagram illustrates how the individual dynamic baseline evolves and updates question by question as the user answers. The horizontal axis represents the sequence number of the assessment item, ranging from question 1 to question 10; the vertical axis represents the first response delay in milliseconds. The horizontal dashed line represents the initial value of the group baseline, 2800 milliseconds, which serves as the baseline reference for the first item of the session. The solid line with dots represents the individual dynamic baseline. As the session progresses, the baseline value gradually adjusts from the initial group value according to the user's actual response behavior, reflecting an adaptive modeling process that accounts for individual behavioral characteristics. The hollow squares represent the normal behavioral feature values of each question item; these values lie near the individual baseline and fall within the normal range of behavioral fluctuation. The solid square represents an abnormal behavioral feature value, located at question 8. Its first response delay is approximately 1000 milliseconds, significantly lower than the individual dynamic baseline value for that question, forming a clear abnormal deviation. This is marked with the text "Abnormal Deviation" in the diagram, indicating that this question will trigger abnormal pattern matching in the credibility prediction process. The formula for updating the individual dynamic baseline is shown below the diagram: $B_n = \alpha x_{n-1} + (1 - \alpha) B_{n-1}$, where $B_n$ is the individual dynamic baseline value for question n, $x_{n-1}$ is the feature value of the preceding question, and α is the smoothing coefficient, which is 0.2 in this embodiment. This formula reflects the weighted fusion of recent feature values and the historical baseline performed by the exponentially weighted moving average method.

[0034] The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior provided by this invention is applicable to application scenarios that require the collection of subjective user response data, such as psychological assessment systems, online examination platforms, and questionnaire survey systems.

[0035] As a specific implementation, this method is deployed in a browser-based online mental health assessment system. The system front-end captures user interaction behavior through JavaScript event listeners, while the back-end server is responsible for behavioral feature calculation, baseline maintenance, pattern matching, and credibility prediction, triggering real-time, question-by-question detection when the user submits each question.

[0036] Interactive event capture is achieved by registering event listeners on the DOM elements of the evaluation page. For keyboard input, the system listens for the keydown, keyup, and keypress events; for mouse operations, the system listens for the click, mousedown, and mouseup events; and for touch devices, the system listens for the touchstart, touchend, and touchmove events. Additionally, the system listens for the paste event to detect paste operations and for the input event to track changes in the input content. Each event is encapsulated as a structured record containing the event type, event content, millisecond-level timestamp, and event sequence number, organized chronologically into an event record sequence.

[0037] When a user submits an answer, the system calculates behavioral feature values across four dimensions from the event record sequence. The first response delay is the timestamp of the first valid input event minus the timestamp at which rendering of the question content completed. The rendering completion time is determined using either the DOMContentLoaded event or the IntersectionObserver interface. Valid input events include key presses, option clicks, text pasting, and other events that change the answer content, but exclude mouse movement, scrolling, and focus switching.
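
The first response delay calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the event type names in `VALID_INPUT_TYPES`, the `(event_type, timestamp_ms)` tuple layout, and the function name are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Event types counted as valid input because they change the answer content.
# These names are illustrative assumptions, not taken from the source.
VALID_INPUT_TYPES = {"keydown", "option_click", "paste"}


def first_response_delay(
    render_done_ms: float, events: Sequence[Tuple[str, float]]
) -> Optional[float]:
    """Return the milliseconds from render completion to the first valid input.

    Returns None when the sequence contains no valid input event.
    """
    for event_type, timestamp_ms in sorted(events, key=lambda e: e[1]):
        if event_type in VALID_INPUT_TYPES:
            return timestamp_ms - render_done_ms
    return None


events = [("mousemove", 1100.0), ("scroll", 1500.0), ("option_click", 3800.0)]
delay = first_response_delay(1000.0, events)
```

Mouse movement and scrolling are skipped, so only the option click at 3800 ms counts toward the delay in this example.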

[0038] The editing efficiency ratio equals the effective character length of the final submitted content divided by the cumulative number of operations. The cumulative number of operations includes the sum of key presses, clicks, touches, and paste operations. The effective character length for text input questions is the number of characters after removing leading and trailing whitespace; for multiple-choice questions, it is uniformly set to 1.
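
A minimal Python sketch of the editing efficiency ratio follows. The function name, the `is_choice_question` flag, and the decision to raise an error when no operations were recorded are assumptions made for illustration; the source does not say how a zero operation count is handled.

```python
def edit_efficiency_ratio(
    final_text: str, operation_count: int, is_choice_question: bool = False
) -> float:
    """Effective length of the submitted content divided by the operation count.

    operation_count is the sum of key presses, clicks, touches, and pastes.
    """
    if operation_count <= 0:
        # Assumption: a submission with no recorded operations is an error.
        raise ValueError("operation_count must be positive")
    # Choice questions use a fixed effective length of 1; text questions use
    # the character count after stripping leading and trailing whitespace.
    effective_length = 1 if is_choice_question else len(final_text.strip())
    return effective_length / operation_count


text_ratio = edit_efficiency_ratio("  hello  ", operation_count=10)
choice_ratio = edit_efficiency_ratio("", operation_count=2, is_choice_question=True)
```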

[0039] The rollback density equals the number of rollback operations divided by the response time. Rollback operations include the backspace key, delete key, undo operation, and deselect event. The response time is the time span from the first valid input event to the submission event, adjusted to 1 second if it is less than 1 second.
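
The rollback density, including the one-second floor on the response time, can be sketched as below. The function and parameter names are hypothetical.

```python
def rollback_density(
    rollback_count: int, first_input_ms: float, submit_ms: float
) -> float:
    """Rollback operations per second of response time.

    The response time runs from the first valid input event to submission
    and is raised to 1 second when it is shorter than that.
    """
    response_time_s = max((submit_ms - first_input_ms) / 1000.0, 1.0)
    return rollback_count / response_time_s


quick = rollback_density(3, first_input_ms=0.0, submit_ms=500.0)
slow = rollback_density(2, first_input_ms=1000.0, submit_ms=11000.0)
```

The floor keeps a near-instant submission from producing an arbitrarily large density.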

[0040] The rhythm stability coefficient is equal to the geometric mean of the time interval sequence of adjacent input events divided by the arithmetic mean, and its value ranges from 0 to 1. The closer the value is to 1, the more uniform and stable the input rhythm.
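
The ratio of geometric mean to arithmetic mean can be sketched in Python as follows. Two choices here are assumptions not stated in the source: intervals shorter than 1 ms are clamped to 1 ms so that the geometric mean is defined, and a sequence with fewer than two input events is given a coefficient of 1.0.

```python
import math
from typing import Sequence


def rhythm_stability(timestamps_ms: Sequence[float]) -> float:
    """Geometric mean of adjacent input intervals divided by their arithmetic mean."""
    # Assumption: clamp each interval to at least 1 ms so the logarithm is defined.
    intervals = [max(b - a, 1.0) for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if not intervals:
        return 1.0  # Assumption: too few events to measure rhythm.
    geometric = math.exp(sum(math.log(i) for i in intervals) / len(intervals))
    arithmetic = sum(intervals) / len(intervals)
    # Guard against floating-point rounding pushing the ratio just above 1.
    return min(geometric / arithmetic, 1.0)


even = rhythm_stability([0.0, 100.0, 200.0, 300.0])   # uniform intervals
uneven = rhythm_stability([0.0, 100.0, 500.0])         # intervals of 100 and 400 ms
```

For intervals of 100 ms and 400 ms the geometric mean is 200 ms and the arithmetic mean is 250 ms, giving a coefficient of about 0.8.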

[0041] Individual dynamic baselines are constructed using an exponentially weighted moving average method. Let the current item be the n-th item, let the behavioral feature value of the i-th item be $x_i$, let the individual dynamic baseline value of this feature for the n-th item be $B_n$, and let the smoothing coefficient be α (0 < α < 1). When n = 1, $B_1 = B_0$, where $B_0$ is the initial value of the group baseline; when n > 1, $B_n = \alpha x_{n-1} + (1 - \alpha) B_{n-1}$. The initial value of the group baseline is obtained by taking behavioral feature data from at least 500 users on the same question type, removing outliers using the interquartile range method, and calculating the median.
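
The baseline construction can be sketched in Python as follows. Several details are assumptions: the recurrence is taken to blend the preceding item's feature value with the preceding baseline (the standard exponentially weighted moving average form), outliers are taken to be values more than 1.5 interquartile ranges outside the quartiles (the conventional fence, which the source does not specify), and all function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def group_baseline(samples: Sequence[float], fence: float = 1.5) -> float:
    """Median of the samples after interquartile-range outlier removal."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    kept = [s for s in samples if q1 - fence * iqr <= s <= q3 + fence * iqr]
    return statistics.median(kept)


def next_baseline(prev_baseline: float, prev_value: float, alpha: float = 0.2) -> float:
    """One update step: alpha * previous value + (1 - alpha) * previous baseline."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * prev_value + (1.0 - alpha) * prev_baseline


def baseline_sequence(
    initial_baseline: float, values: Sequence[float], alpha: float = 0.2
) -> List[float]:
    """Baselines for items 1..len(values)+1, starting from the group baseline."""
    baselines = [initial_baseline]
    for value in values:
        baselines.append(next_baseline(baselines[-1], value, alpha))
    return baselines


# A user who consistently answers in 2000 ms pulls the baseline down from 2800 ms.
sequence = baseline_sequence(2800.0, [2000.0, 2000.0])
```

With α = 0.2, each completed item moves the baseline one fifth of the way toward the newest feature value, so a single outlying item shifts the baseline only modestly.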

[0042] In this embodiment, the smoothing coefficient is set to 0.2. For Likert five-point scale questions, the initial values of the group baseline are: first response delay of 2800 milliseconds, editing efficiency ratio of 0.85, rollback density of 0.05 operations per second, and rhythm stability coefficient of 0.92. When the sample size is small in the early stage of the evaluation, the system dynamically adjusts the smoothing coefficient by multiplying it by the ratio of the number of completed items to the minimum sample size (set to 5), thereby achieving progressive personalized modeling.
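
The early-stage adjustment of the smoothing coefficient can be sketched as below. The function name is hypothetical; the defaults of 0.2 and 5 are the values given in this embodiment.

```python
def effective_alpha(completed_items: int, alpha: float = 0.2, min_sample: int = 5) -> float:
    """Scale alpha by completed_items / min_sample until min_sample items are done."""
    if completed_items >= min_sample:
        return alpha
    return alpha * completed_items / min_sample


# After 2 completed items the coefficient is 0.2 * 2 / 5; from 5 items on it is 0.2.
ramp = [effective_alpha(k) for k in range(0, 7)]
```

A smaller coefficient early in the session keeps the baseline close to the group value until enough of the user's own items have been observed.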

[0043] Feature deviation is equal to the difference between the current behavioral feature value and the individual dynamic baseline value, divided by the individual dynamic baseline value. A positive value indicates that the deviation is above the baseline, and a negative value indicates that the deviation is below the baseline. The system combines the deviations of the four features into a four-dimensional deviation vector. The abnormal behavior pattern library stores multiple abnormal pattern templates, each template being defined as a logical AND combination of the conditional constraints that the deviation vector must simultaneously satisfy.
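
The relative deviation and the four-dimensional deviation vector can be sketched as follows. The dictionary keys and function name are hypothetical, and the sketch assumes every baseline value is nonzero.

```python
from typing import Dict

FEATURES = (
    "first_response_delay",
    "edit_efficiency_ratio",
    "rollback_density",
    "rhythm_stability",
)


def deviation_vector(current: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Per-feature (current - baseline) / baseline; assumes nonzero baselines."""
    return {name: (current[name] - baseline[name]) / baseline[name] for name in FEATURES}


baseline = {
    "first_response_delay": 2420.0,
    "edit_efficiency_ratio": 0.91,
    "rollback_density": 0.02,
    "rhythm_stability": 0.95,
}
# Only the first response delay differs from the baseline in this example.
current = dict(baseline, first_response_delay=680.0)
deviation = deviation_vector(current, baseline)
```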

[0044] This embodiment configures three basic patterns: Mechanically rapid pattern: the first response delay deviation is negative with an absolute value exceeding 0.5, while the rollback density deviation is less than 0.3.

[0045] Hesitant and repetitive pattern: the rollback density deviation is positive and exceeds 0.6, while the rhythm stability coefficient deviation is negative with an absolute value exceeding 0.4.

[0046] Inattentive pattern: the first response delay deviation is positive and exceeds 1.0, while the editing efficiency ratio deviation is negative with an absolute value exceeding 0.5.
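
The three patterns above, each a logical AND of constraints on the deviation vector, can be sketched in Python as follows. The pattern identifiers, dictionary keys, and the representation of a template as a list of predicates are assumptions for illustration.

```python
from typing import Callable, Dict, List

Deviation = Dict[str, float]
Condition = Callable[[Deviation], bool]

# Each template matches only when all of its conditions hold.
PATTERN_LIBRARY: Dict[str, List[Condition]] = {
    "mechanically_rapid": [
        lambda d: d["first_response_delay"] < -0.5,
        lambda d: d["rollback_density"] < 0.3,
    ],
    "hesitant_repetitive": [
        lambda d: d["rollback_density"] > 0.6,
        lambda d: d["rhythm_stability"] < -0.4,
    ],
    "inattentive": [
        lambda d: d["first_response_delay"] > 1.0,
        lambda d: d["edit_efficiency_ratio"] < -0.5,
    ],
}


def match_patterns(deviation: Deviation) -> List[str]:
    """Return the identifiers of all templates whose conditions are all satisfied."""
    return [
        name
        for name, conditions in PATTERN_LIBRARY.items()
        if all(condition(deviation) for condition in conditions)
    ]


def make_deviation(**overrides: float) -> Deviation:
    """Build a deviation vector with unspecified components set to 0.0."""
    base = {
        "first_response_delay": 0.0,
        "edit_efficiency_ratio": 0.0,
        "rollback_density": 0.0,
        "rhythm_stability": 0.0,
    }
    base.update(overrides)
    return base


fast_click = match_patterns(make_deviation(first_response_delay=-0.719))
```

An empty result corresponds to a credibility level of normal; a non-empty result corresponds to pending verification.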

[0047] During the credibility intelligent prediction process, the system traverses the abnormal behavior pattern library and records into the matching result set the identifier of every abnormal pattern whose conditional constraints are all satisfied. When the matching result set is empty, the credibility level is set to normal, and the behavioral feature values are included in the subsequent baseline calculation; when it is not empty, the credibility level is set to pending verification, and a verification query is triggered.

[0048] The verification query is generated by retrieving the corresponding template from the verification query configuration library based on the anomaly pattern type. For the mechanically rapid pattern, it asks whether the answer truly reflects the thought; for the hesitant and repetitive pattern, it asks whether the final answer is satisfactory; and for the inattentive pattern, it asks whether the question has been understood.

[0049] When a user confirms that the original answer is valid, the credibility level is corrected to normal and included in the baseline calculation; when a problem is confirmed, it is set to abnormal, not included in the baseline calculation, and a re-answer is allowed.

[0050] When multiple abnormal patterns are matched at the same time, the system presents the verification questions in sequence according to a preset priority. If the user confirms that the original answer is valid, the pattern is marked as excluded and the next verification question is presented. If the user confirms that there is a problem, the pattern is marked as confirmed and the remaining verification questions are skipped.
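
The sequential verification of multiple matched patterns can be sketched as follows. The `ask` callback, the priority list, and the returned tuple are hypothetical; the sketch covers only the two answers described here and does not model a user skipping the question.

```python
from typing import Callable, Dict, Sequence, Tuple


def verify_matches(
    matched: Sequence[str],
    priority: Sequence[str],
    ask: Callable[[str], bool],
) -> Tuple[str, Dict[str, str]]:
    """Ask verification questions in priority order and stop at the first confirmed problem.

    ask(pattern) returns True when the user confirms the original answer is valid.
    Returns the resulting credibility level and a per-pattern mark.
    """
    marks: Dict[str, str] = {}
    for pattern in sorted(matched, key=priority.index):
        if ask(pattern):
            marks[pattern] = "excluded"
        else:
            marks[pattern] = "confirmed"
            return "abnormal", marks  # remaining questions are not asked
    return "normal", marks


PRIORITY = ["mechanically_rapid", "hesitant_repetitive", "inattentive"]
level, marks = verify_matches(
    ["inattentive", "mechanically_rapid"],
    PRIORITY,
    ask=lambda pattern: pattern != "inattentive",
)
```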

[0051] Session-level quality control monitoring is implemented through an anomaly accumulation counter, with an initial value of 0. The counter decrements by 1 (but not lower than 0) when an item is judged as normal, and increments by 2 when judged as abnormal or unverified. When the accumulated value exceeds a preset threshold of 10, the overall session credibility level is marked as low credibility.
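
The anomaly accumulation counter can be sketched in Python as below. The class and attribute names are hypothetical, and latching the low-credibility flag once the threshold has been exceeded is an assumption; the source does not say whether the mark is cleared if the counter later falls back.

```python
class AnomalyCounter:
    """Session-level anomaly accumulation counter."""

    def __init__(self, threshold: int = 10, increment: int = 2, decrement: int = 1) -> None:
        self.value = 0
        self.threshold = threshold
        self.increment = increment
        self.decrement = decrement
        # Assumption: once set, the flag stays set for the rest of the session.
        self.low_credibility = False

    def record(self, level: str) -> int:
        """Update the counter for one item's credibility level and return its value."""
        if level == "normal":
            self.value = max(0, self.value - self.decrement)
        elif level in ("abnormal", "unverified"):
            self.value += self.increment
        else:
            raise ValueError(f"unknown credibility level: {level}")
        if self.value > self.threshold:
            self.low_credibility = True
        return self.value


counter = AnomalyCounter()
for item_level in ["normal", "abnormal", "normal", "unverified"]:
    counter.record(item_level)
```

Because normal items decrement the counter, isolated anomalies decay, and only a sustained run of abnormal or unverified items pushes the session over the threshold.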

[0052] The threshold adaptive adjustment mechanism maintains matching statistics for each abnormal pattern template. When the false positive rate exceeds 30% and the total number of matches exceeds 50, the threshold is increased by 0.05; when the risk of missed detection is too high and the false positive rate is low, the threshold is appropriately decreased. The adjustment range is limited to ±30% of the initial value.
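
The upward threshold adjustment and the ±30% limit can be sketched as follows. The function and parameter names are hypothetical, and thresholds are treated as magnitudes. The downward adjustment is omitted because the source does not quantify when the risk of missed detection is too high or how large the decrease is.

```python
def adjust_threshold(
    current: float,
    initial: float,
    matches: int,
    false_positives: int,
    step: float = 0.05,
    max_fp_rate: float = 0.30,
    min_matches: int = 50,
    band: float = 0.30,
) -> float:
    """Raise a pattern threshold when its false positive rate is too high.

    The result is clamped to within +/- band of the initial threshold value.
    """
    if matches > min_matches and false_positives / matches > max_fp_rate:
        current += step
    lower = initial * (1.0 - band)
    upper = initial * (1.0 + band)
    return min(max(current, lower), upper)


# 24 false positives out of 60 matches is a 40% rate, so 0.5 is raised to 0.55.
raised = adjust_threshold(0.5, 0.5, matches=60, false_positives=24)
```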

[0053] Differentiated modeling is employed for different question types: for choice-type questions, the first response delay and rollback density are primarily considered, while the weight of the editing efficiency ratio is reduced; for text input questions, all four dimensions are used comprehensively; and for ranking questions, drag trajectory features are additionally calculated. Each question type maintains its own independent baseline dataset.

[0054] For different device types, device categories are detected by User-Agent and screen size. Desktop computers collect keyboard and mouse events, while mobile devices collect touch and virtual keyboard events, and separate population baseline datasets are maintained for each.

[0055] In terms of privacy protection, only structured feature data is collected, and the specific text content is not stored. After feature extraction, the original event records are deleted, and the feature values are mapped to discrete levels or quantile rankings during output.
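
The mapping of continuous feature values to discrete levels or quantile rankings can be sketched as below. The cut points, level labels, and function names are hypothetical examples; the source does not specify them.

```python
from bisect import bisect_right
from typing import Sequence


def to_level(value: float, cut_points: Sequence[float], labels: Sequence[str]) -> str:
    """Map a continuous value to a discrete level using ascending cut points."""
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, value)]


def quantile_rank(value: float, reference_sorted: Sequence[float]) -> float:
    """Fraction of the sorted reference values that are less than or equal to value."""
    return bisect_right(reference_sorted, value) / len(reference_sorted)


# Hypothetical cut points (ms) for reporting the first response delay.
level = to_level(2420.0, [1500.0, 4000.0], ["short", "typical", "long"])
rank = quantile_rank(3.0, [1.0, 2.0, 3.0, 4.0])
```

Reporting only the level or the rank means the exact timing values never leave the system.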

[0056] After the evaluation is completed, an evaluation output data packet in JSON format is generated, which includes evaluation result data, credibility metadata, session-level quality control indicators and behavioral characteristic statistics, and is returned to the calling system via HTTPS interface.

[0057] As a specific implementation, a university's psychological counseling center deployed this method for student psychological screening. The system contains 30 questions (20 Likert scale questions and 10 short answer questions).

[0058] Student Zhang accessed the system using a laptop. The system initialized the evaluation session and loaded the initial values of the group baseline.

[0059] Zhang answered questions 1-7 normally; none of them matched an abnormal pattern, so the credibility level of each was marked as normal. After the baseline updates, the individual dynamic baseline values were: first response delay 2420 milliseconds, editing efficiency ratio 0.91, rollback density 0.02 operations per second, and rhythm stability coefficient 0.95.

[0060] In question 8, Zhang clicked an option and submitted within 680 milliseconds. The calculated first response delay deviation was -0.719, meeting the conditions for the mechanically rapid pattern, and the credibility level was set to pending verification. The system displayed a verification question, and Zhang selected "Yes, this is my true thought," so the credibility level was corrected to normal.
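
As a check on the arithmetic, the deviation definition can be applied to question 8 using the first response delay baseline of 2420 milliseconds reached after questions 1-7 (the symbol δ is introduced here only for illustration):

```latex
\delta_{\text{delay}} = \frac{680 - 2420}{2420} = \frac{-1740}{2420} \approx -0.719
```

The deviation is negative and its absolute value exceeds 0.5, which satisfies the first response delay condition of the pattern. The rollback density deviation for this item is not stated in the embodiment.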

[0061] Question 15 was a short answer question. Zhang performed 23 backspace operations while answering, giving a rollback density deviation of 3.29. However, the rhythm stability coefficient deviation was only -0.193, which did not meet all the conditions for the hesitant and repetitive pattern, so the item was judged as normal. In question 22, Zhang was distracted by a phone call, resulting in a first response delay of 12500 milliseconds and an editing efficiency ratio of 0.28. The first response delay deviation was 3.03, and the editing efficiency ratio deviation was -0.517, meeting the conditions for the inattentive pattern. The credibility level was set to pending verification, and the anomaly accumulation counter was incremented by 2. Zhang selected "I need to reread," the credibility level was set to abnormal, and after he re-answered, the item passed the check.

[0062] For questions 25-28, Zhang answered more quickly, triggering the mechanically rapid pattern. For questions 25, 26, and 28, Zhang confirmed that the answers reflected his true thoughts, and the credibility level was corrected to normal; for question 27, he chose to reconsider and answer again.

[0063] After all questions are completed, the system generates an evaluation output data package: the session-level quality control indicator shows that the final value of the anomaly accumulation counter is 4 (not exceeding the threshold of 10), the overall credibility level is normal, and the proportion of abnormal items is 20%. Psychological counselors can refer to the credibility metadata to assess data quality and develop a counseling plan.

[0064] This embodiment demonstrates that the method of the present invention can perform real-time intelligent prediction of credibility, achieve adaptive quality control response through verification queries, effectively identify abnormal behavior, and provide credibility reference for evaluation results.

[0065] The above-described specific embodiments are merely preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Various modifications, substitutions, and improvements made by those skilled in the art to the technical solutions of the present invention based on the provided textual description and drawings, without departing from the design concept and spirit of the present invention, should all fall within the scope of protection of the present invention. The scope of protection of the present invention is determined by the claims.

Claims

1. A method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior, characterized in that it includes the following steps: S1: While the user answers assessment questions, the computer system collects user interaction events in real time and generates a sequence of event records with timestamps; S2: Based on the event record sequence, the processor calculates behavioral feature values for the current question item in at least three dimensions: response speed, editing behavior, and input rhythm; S3: Based on the behavioral feature values of the questions completed by the current user in this evaluation session, the processor uses an exponentially weighted moving average method to construct an individual dynamic baseline value that evolves dynamically with the user's answering process; S4: The processor calculates the deviation of each behavioral feature value of the current question item from the corresponding individual dynamic baseline value, and forms a deviation vector; S5: The processor matches the deviation vector against a preset abnormal behavior pattern library, wherein the abnormal behavior pattern library stores multiple abnormal pattern templates, each abnormal pattern template is defined as multiple conditional constraints that the deviation vector must satisfy simultaneously, and when the deviation vector satisfies all the conditional constraints of an abnormal pattern template, the current item is determined to match that abnormal pattern; S6: When the current item does not match any abnormal pattern, the processor sets the credibility level to normal and controls the evaluation process to proceed to the next item; when the current item matches at least one abnormal pattern, the processor triggers the corresponding verification query process and, based on the user's response, decides whether to correct the credibility level to normal or set it to abnormal.

2. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that the multidimensional behavioral feature values include the first response delay, editing efficiency ratio, rollback density, and rhythm stability coefficient; the first response delay is the time interval between the completion of the rendering of the question content and the generation of the first valid input event by the user; the editing efficiency ratio is the ratio of the effective length of the final submitted content to the cumulative number of operations; the rollback density is the ratio of the number of deletion or rollback operations to the response time; the rhythm stability coefficient is the ratio of the geometric mean to the arithmetic mean of the time interval sequence of adjacent input events.

3. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that, in step S3, the current question is question n, a given behavioral feature value of question i is $x_i$, the individual dynamic baseline value corresponding to question n is $B_n$, and the smoothing coefficient is α, where 0 < α < 1; when n = 1, $B_1 = B_0$, where $B_0$ is the initial value of the group baseline corresponding to this question type; when n > 1, $B_n = \alpha x_{n-1} + (1 - \alpha) B_{n-1}$; the initial value $B_0$ of the group baseline is obtained by removing outliers from a dataset of behavioral features of multiple users on the same question type using the interquartile range method and then calculating the median of each feature.

4. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that the abnormal pattern templates in the abnormal behavior pattern library include: a mechanically rapid pattern, whose conditional constraints are that the deviation of the first response delay exceeds a first threshold with a negative deviation direction, while the deviation of the rollback density is lower than a second threshold; a hesitant and repetitive pattern, whose conditional constraints are that the deviation of the rollback density exceeds a third threshold with a positive deviation direction, while the deviation of the rhythm stability coefficient exceeds a fourth threshold with a negative deviation direction; and an inattentive pattern, whose conditional constraints are that the deviation of the first response delay exceeds a fifth threshold with a positive deviation direction, while the deviation of the editing efficiency ratio exceeds a sixth threshold with a negative deviation direction.

5. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 3, characterized in that the smoothing coefficient α ranges from 0.1 to 0.4. When the number of items completed by the user is less than the preset minimum sample size, the processor multiplies the smoothing coefficient by the ratio of the number of completed items to the minimum sample size. This increases the weight of the group baseline in the early stage of the evaluation and increases the weight of individual features in the later stage, thereby achieving progressive personalized modeling.

6. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that the verification query process in step S6 includes: the processor retrieves the corresponding template from the verification query configuration library based on the matched abnormal pattern type, generates the verification query content, and presents it to the user; when the user responds and confirms that the original answer is valid, the processor corrects the credibility level to normal and includes the behavioral feature values in the subsequent baseline calculation; when the user responds and confirms that there is a problem with the original answer, the processor sets the credibility level to abnormal and does not include the behavioral feature values in the baseline calculation; when the user skips the verification question, the processor sets the credibility level to unverified.

7. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that, when the current question item matches two or more abnormal patterns at the same time, the processor presents the verification questions corresponding to each abnormal pattern to the user in sequence according to a preset priority; when all abnormal patterns are ruled out by the user's responses, the processor corrects the credibility level to normal; when at least one abnormal pattern is confirmed by a user response, the processor sets the credibility level to abnormal.

8. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that it also includes a session-level quality control monitoring step: the processor maintains an anomaly accumulation counter; when an item is judged to be normal, the counter is decremented by a preset amount but not below zero; when an item is judged to be abnormal, the counter is incremented by a preset increment value; when the counter exceeds the preset accumulation threshold, the processor determines that the current session has entered a continuous low-credibility state and marks the overall credibility level in the evaluation output data.

9. The method for intelligent prediction and adaptive quality control of credibility based on dynamic modeling of interactive behavior according to claim 1, characterized in that it also includes a threshold adaptive adjustment mechanism: the processor records the user's response to the verification query after each abnormal pattern match, and records each match in which the user confirms that the original answer is valid as a misjudged sample; the processor periodically calculates the misjudgment rate of each abnormal pattern template, and when the misjudgment rate exceeds the preset upper limit, it adaptively increases the threshold of each conditional constraint of the template to reduce the misjudgment rate.