A multi-source data fusion human-computer interaction task online acceleration method and system
By adopting a task-adaptive multi-source data fusion framework and an online update method for fusion parameters, the latency and reliability issues of online multi-source data fusion are solved, achieving efficient and reliable acceleration of human-computer interaction tasks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIHANG UNIV
- Filing Date
- 2023-03-31
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies have not yet achieved online fusion of multi-source data, and suffer from problems such as limited interaction methods, high data analysis latency, and poor reliability.
By using a task-adaptive multi-source data fusion framework and an online fusion parameter update method, various physiological or psychological perception data are acquired. Signal denoising, feature extraction, fusion discrimination, and a Bayesian causal inference model are employed to correct and update the fusion parameters in real time, thereby achieving online fusion of multi-source data with high reliability.
It improves the accuracy of prediction results from multi-source sensor data and the reliability of human-computer interaction, reduces the probability of decision-making errors under uncertain conditions, and meets the needs of various scenarios for rapid response and safety and reliability.
Figure CN116561698B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to an online acceleration method and system for multi-source data fusion human-computer interaction tasks, belonging to the field of human-computer interaction.
Background Technology
[0002] Human-computer interaction technology has been applied to many fields such as medical rehabilitation, smart education, Internet of Things, and military. It enables information exchange and cognition with the system through sound, movement or bioelectric signals, promoting the transformation of traditional applications to intelligence. The perception methods of human-computer interaction are also increasing, such as voice interaction, emotional interaction, body-sensing interaction, brain-computer interaction, etc., involving multiple types of information such as touch, vision and hearing. The data types acquired by multimodal human-computer interaction are diverse, mainly including the following types: (1) one-dimensional data: such as time series data such as voice, pulse, heart rate, etc.; (2) two-dimensional data: such as image data such as face, fingerprint, iris, expression, gesture, etc.; (3) high-dimensional data: such as complex physiological data such as electroencephalogram, cerebral blood oxygen, electromyography, etc.
[0003] Human-computer interaction is subject to various kinds of interference, which lead to missing, noisy, and mutually exclusive data. This poses challenges to instruction execution, decision-making, and task feedback, resulting in high latency and uncertain reliability. Take multimodal human-computer interaction for assisted wheelchair movement as an example: multiple sensors, including EEG and speech sensors, are used, so multi-source information must be analyzed and processed. Several problems remain unresolved: interaction tasks are delayed by slow data generation, sensor transmission, or recognition (e.g., of EEG signals); multiple types of information are difficult to coordinate; and speech or EEG signals are susceptible to interference under various uncertain conditions, which hinders the generation of effective instructions. Given existing sensing capabilities, improving the real-time performance of interactive decisions through software algorithms hinges on data fusion methods for multi-source sensing devices, including data preprocessing, fusion algorithms, and fusion parameter calibration. There is already considerable research on multi-source information fusion methods, primarily focusing on pixel-level, feature-level, and decision-level fusion.
[0004] Multi-source perception human-computer interaction applications, characterized by task adaptation, low-latency execution, group collaboration, and information management, require feature-level or decision-level data fusion. Existing research has used decision-level fusion of EEG, ECG, and EMG signals for fatigue assessment; a human-computer interaction method based on the fusion of forearm surface EMG signals and inertial information has been proposed for predicting robot movement intentions; researchers have also fused EEG and eye-tracking signals to determine human-computer interaction intentions. However, these studies rely on offline training using historical datasets, resulting in poor applicability to different users. Online fusion of multi-source data has not yet been achieved, leading to problems such as high data analysis latency, poor reliability, and limited interaction methods. Therefore, an efficient and reliable online acceleration method and system for multi-source data fusion human-computer interaction tasks plays a crucial role in the development of human-computer interaction technology.
Summary of the Invention
[0005] To address the limitations of existing technologies in achieving online fusion of multi-source data, which suffers from restricted interaction methods, high data analysis latency, and poor reliability, the main objective of this invention is to propose an online acceleration method and system for multi-source data fusion human-computer interaction tasks. Through a task-adaptive multi-source data fusion framework and an online update method for fusion parameters, this invention improves the accuracy of multi-source sensor data prediction results and the reliability of human-computer interaction under various uncertain conditions.
[0006] The objective of this invention is achieved through the following technical solution:
[0007] This invention discloses an online acceleration method and system for multi-source data fusion human-computer interaction tasks. By deploying various types of sensors, it acquires multiple physiological or psychological sensory data required during human-computer interaction across various scenarios. Unlike typical human-computer interaction, it offers multiple interaction modes and exhibits high adaptability and reliability under various uncertain conditions. Then, through a task-adaptive multi-source data fusion method, feature parameters are extracted from the acquired multi-source sensory signals, and data fusion is determined based on fusion discrimination criteria. This reduces the processing time for multiple types of information in multi-modal human-computer interaction tasks under uncertain conditions, solves the problem of redundant interaction commands, facilitates the analysis and decision-making of multi-type interactive sensory data, and improves response speed. Simultaneously, combined with an online fusion parameter update method, the fusion parameters are predicted in advance through multiple steps and corrected and updated in real time. This addresses the problem of low prediction accuracy and inability to fuse multi-modal human-computer interaction data online, improving the accuracy of multi-source sensor data prediction results and the reliability of human-computer interaction under various uncertain conditions.
[0008] This invention discloses an online acceleration method for multi-source data fusion human-computer interaction tasks, which specifically includes the following steps:
[0009] Step 1: Obtain multi-source information data containing the intent of human-computer interaction task instructions through a multi-source data fusion human-computer interaction task online acceleration system;
[0010] The multi-source information data specifically includes: electroencephalogram (EEG), cerebral blood oxygenation, electrooculogram (EOG), speech, and electromyography (EMG) signals; wherein the EEG signals include: steady-state visual evoked potentials, auditory evoked potentials, motor imagery, and event-related potentials; wherein the speech signals include: acoustic speech and bone conduction speech.
[0011] Step 2: Establish a task instruction dataset with multi-source information;
[0012] The dataset consists of various human-computer interaction task instruction data, and further, the dataset is divided into a training set and a test set;
[0013] Step 3: The task instruction dataset of the multi-source information is filtered using a signal denoising method to remove noise interference, and trend features of the various data are obtained using a feature extraction method.
[0014] Step 4: Use a signal classification and recognition algorithm to obtain the single-task classification result of multi-source information data;
[0015] Step 5: Based on the fusion discriminant learner, determine whether to fuse the multi-source data;
[0016] The fusion discriminative learner is obtained based on the combination strategy method:
[0017] A strategy combining average, maximum, minimum, and weighted average values is used to merge the task results from multiple sensors. The merged results are analyzed, and if there are differences between the merged results and those from previous steps, data fusion is performed; otherwise, data fusion is not performed.
[0018] Step 6: Construct a task-adaptive multi-source information data feature fusion model;
[0019] The multi-source information data is normalized.
[0020] The set of multi-source data nodes is obtained by calculating the angle between multi-source data points and the time-series preservation capability. Here, the time-series preservation capability refers to the time difference between two key points, and the angle between multi-source information data points is expressed as...
[0021]
[0022] In the formula, $\Delta_{i-1,i} = |x'_{i-1} - x'_i|$, $\Delta_{i,i+1} = |x'_i - x'_{i+1}|$, and $\Delta_{i-1,i+1} = |x'_{i-1} - x'_{i+1}|$ denote the intervals between the multi-source data points, which use a uniform time interval.
[0023] Dynamic discretization is performed on the multi-source data nodes. Specifically, considering that multi-source data nodes can be continuous, discrete, binary, or of other types, an optimal set $\Psi_X$ is sought within the defined intervals, together with the optimal constant function on them. When the dynamic discretization process converges, the discretized function closely approximates $f_X$. The relative entropy is used to evaluate the quality of the dynamic discretization. The relative entropy (Kullback-Leibler distance) between the probability density functions f(x) and g(x) is calculated using the following formula:
[0024] $D(f \,\|\, g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx$
[0025] The relative entropy of the probability density function before and after discretization of multi-source data nodes is:
[0026]
[0027] The boundary of each term on the right side of formula (3) is determined by the following formula:
[0028]
[0029] In the formula, the symbols denote, in order, the length of the interval and the mean, upper limit, and lower limit of f on that interval.
[0030] Fuzzy clustering is used to determine whether the time series data of a certain node set of multi-source data contains key decision points. The key decision points specifically include state transitions such as start, stop, danger, and invalid.
[0031] Based on the Bayesian causal inference model, the ability to identify key decision points is analyzed, that is, whether it can identify key task decision points such as start, stop, danger, invalidity and other state transitions, and obtain multi-source information data processing and fusion parameters.
[0032] By training a Long Short-Term Memory (LSTM) time series model for decision nodes using historical data with diverse information, multi-step prediction of time series data can be achieved, including regression prediction and deep learning.
[0033] The task-adaptive multi-source information data feature fusion model is continuously updated and learned based on the Sample Average Approximation (SAA) method, enabling the constructed model to perform incremental learning. The SAA method obtains the solution to the stochastic discrete optimization problem through Monte Carlo computation; when the number of samples N is sufficiently large, the probability of the event converges to 1.
[0034] The sample average approximation is expressed as follows:
[0035] $\hat{f}_N(x) = \frac{1}{N} \sum_{i=1}^{N} f(x, \omega_i)$
[0036] In the formula, $f(x, \omega_i)$ is Lipschitz continuous in x.
[0037] For any β > 0, based on the central limit theorem,
[0038]
[0039] Here, the above relation holds with probability 1-α. Therefore, the sample average converges to $E[f(x, \omega)]$ at a rate of $O(N^{-1/2})$.
[0040] Step 7: Real-time update of fusion parameters to predict future data from multiple sources. This involves determining the fusion coefficients before implementation and obtaining prediction results for future data from each multi-source sensor, thereby accelerating the control of human-computer interaction tasks. This improves the accuracy of multi-source sensor data prediction results and the reliability of human-computer interaction under various uncertain conditions.
[0041] Furthermore, the online update method for the fusion parameters specifically includes:
[0042] Obtain multi-step prediction results for time series data from multiple sources;
[0043] Generating independent and identically distributed (i.i.d.) data solves the problem of insufficient data in the prediction stage. Specifically, at the k-th time step, a large amount of i.i.d. data is generated and input into the data fusion model of the (k-n)-th time step (n = 1, 2, ...). The predicted value is used as the true value at the k-th time step, and the parameter combinations that meet the requirements are extracted and averaged to give the fusion parameter at the k-th time step.
[0044] Based on the fusion parameters at the k-th time step, at the (k+n)-th time step (n=1,2...), the difference between the predicted result and the actual value is compared.
[0045] A feedback loop is set up between the predicted result and the actual value. Let the predicted result be $t_p$, the actual value be $t_r$, and the preset value for an abnormal state be $b$; warn denotes the warning state. The judgment criterion for the feedback process is as follows:
[0046] $\mathrm{warn} = \begin{cases} 1, & |t_p - t_r| \ge b \\ 0, & \text{otherwise} \end{cases}$
[0047] When the difference is greater than or equal to the preset value b, warn = 1, indicating that the human-computer interaction system is in an abnormal operating state; otherwise, no abnormality has occurred. When warn = 1, the task-adaptive multi-source information data feature fusion model is adjusted to increase the number of prediction calculations, so that key decision points are not missed.
[0048] Update the fusion parameters, optimize the prediction error, and improve the fusion model's adaptability to uncertain environments.
[0049] This invention discloses an online acceleration system for human-computer interaction tasks using multi-source data fusion, comprising five modules: acquisition, storage, processing, updating, and prediction. The acquisition module collects multi-source information data for human-computer interaction and stores it in the storage module, establishing a multi-source information dataset for various task instructions. The processing module extracts features from the dataset, analyzes task results, and determines whether the results need to be fused, outputting the features of the multi-source information to the updating module. The updating module obtains fusion parameters based on the features learned from the multi-source information dataset of task instructions, updates the fusion parameters in real time, and outputs the updated fusion parameters. Finally, the prediction module calculates multi-step prediction results for the information time series based on the updated fusion parameters, thereby achieving online acceleration of human-computer interaction tasks.
[0050] The acquisition module is used to acquire multi-source perception data for human-computer interaction, and the acquisition module further includes:
[0051] EEG sensors, with no more than 32 leads, are distributed in the occipital and temporal lobe areas of the brain and are used to collect human EEG signals, including steady-state visual evoked potentials, motor imagery, and P300 signals.
[0052] Electrooculogram (EOG) sensor, used to collect EOG signals, including horizontal EOG (HEOG) and vertical EOG (VEOG) data;
[0053] The sound sensor, including an air microphone and flexible sensors distributed in the neck and behind the ears, is used to collect human voice signals;
[0054] Electromyography (EMG) sensors are used to acquire electromyographic signals, including surface muscle signals from the hand, forearm, and leg.
[0055] The storage module is used to store and establish a multi-source information dataset of various task instructions, wherein the multi-source information dataset of task instructions consists of various human-computer interaction instructions;
[0056] The processing module is used to obtain the classification results of the multi-source information task dataset and determine whether the results need to be fused.
[0057] The update module obtains fusion parameters based on the features of the multi-source information dataset of the task instructions, and updates the fusion parameters in real time based on the prediction results;
[0058] The prediction module is used to predict the next information trend using the fusion parameters, thereby accelerating the online processing of multimodal human-computer interaction tasks.
[0059] Beneficial effects:
[0060] 1. The present invention discloses a multi-source data fusion human-computer interaction task online acceleration method and system. By deploying various types of sensors for the acquisition of interactive data, it can acquire various physiological or psychological perception data required in the human-computer interaction process, under multiple scenarios, and comprehensively perceive the human's intention to the "machine". Compared with single interaction methods such as touch and voice, it can solve the problems of difficult information acquisition and data loss in complex environments in single human-computer interaction. It has high adaptability and high reliability in interaction under various uncertain conditions.
[0061] 2. This invention discloses an online acceleration method and system for multi-source data fusion human-computer interaction tasks. This method utilizes the fusion of multiple physiological signal features required for human-computer interaction, forming an accelerated auxiliary decision-making framework for multi-source data. By constructing a task-adaptive multi-source information data feature fusion model, the feature parameters of the acquired multi-source sensing signals are used in the fusion model, and a fusion discrimination criterion is adopted to determine whether data fusion should be performed. This solves the problem of redundant interaction instructions, reduces the processing time for multiple task information in multi-modal human-computer interaction under uncertain conditions, facilitates information coupling and decision calculation in multi-modal human-computer interaction, and improves the response speed of multi-modal human-computer interaction.
[0062] 3. This invention discloses an online acceleration method and system for multi-source data fusion human-computer interaction tasks, combined with an online fusion parameter update method. The method has incremental learning capabilities, predicts fusion parameters in advance through multiple steps, and corrects and updates fusion parameters online in real time. It can perform multi-step prediction of future multi-source data, accelerate the intent judgment of human-computer interaction command tasks, solve the problem that multi-modal human-computer interaction data cannot be fused online and the prediction accuracy is low, reduce the probability of decision-making errors in single human-computer interaction in complex environments, improve the accuracy of multi-source sensor data prediction results and the reliability of human-computer interaction under various uncertain conditions, and meet the needs of various scenarios for rapid response and safe and reliable human-computer interaction.
Attached Figure Description
[0063] Figure 1 is a flowchart of the online acceleration method and system for multi-source data fusion human-computer interaction tasks disclosed by this invention;
[0064] Figure 2 is a schematic diagram of an online acceleration method for multi-source data fusion human-computer interaction tasks provided in an embodiment of the present invention;
[0065] Figure 3 is a schematic diagram of a task-adaptive multi-source data fusion model provided in an embodiment of the present invention;
[0066] Figure 4 is a schematic diagram of an online fusion parameter update method provided in an embodiment of the present invention;
[0067] Figure 5 is a flowchart of a multi-source information sensing device provided in an embodiment of the present invention;
[0068] Figure 6 is a schematic diagram of the layout of the acquisition module of the multi-source information sensing device provided in an embodiment of the present invention.
Detailed Implementation
[0069] The present invention will now be described in detail with reference to the accompanying drawings and embodiments. The technical problems solved by the present invention and its beneficial effects are also described. It should be noted that the described embodiments are only intended to facilitate understanding of the present invention and do not constitute any limitation thereof.
[0070] Figure 1 is a flowchart of the online acceleration method and system for human-computer interaction tasks using multi-source data fusion provided by this invention. A multi-source information sensing device acquires the various physiological or psychological sensory data required during the interaction task across multiple scenarios and establishes a multi-source information task instruction dataset. The dataset is then filtered to remove noise, and trend features of the multi-source information are obtained. A classification algorithm is used to obtain the task classification results of the multi-source information. Based on these results, a learner determines whether data fusion is necessary. If fusion is required, a task-adaptive multi-source data fusion framework is constructed based on the trend features of the multi-source information, and fusion parameters are obtained. Finally, the predicted trend of the multi-source information in the time domain is obtained based on the fusion parameters, prediction errors are corrected in real time, and the fusion parameters are updated online. This completes the determination of multi-step fusion parameters for future multi-source information data and achieves accelerated control of the human-computer interaction task. Figure 2 is a schematic diagram of the online acceleration method for multi-source data fusion human-computer interaction tasks, corresponding to steps S103-S106 of this invention. Figures 3 and 4 are schematic diagrams of the task-adaptive multi-source data fusion model and of the online update method for fusion parameters, respectively. Figures 5 and 6 are the flowchart of the multi-source information sensing device and the layout diagram of its acquisition module, respectively. The specific implementation steps are as follows:
[0071] This embodiment discloses an online acceleration method for multi-source data fusion human-computer interaction tasks. As shown in Figure 1, it includes the following steps:
[0072] S101: Acquire multi-source information data containing the intent of human-computer interaction task instructions. The multi-source data is collected by the acquisition module of the multi-source information sensing device. As shown in the layout diagram of the acquisition module in Figure 6, sensors 1-4 are deployed on the user's head, around the eyes, and on the neck, arms, and legs to collect data, thereby improving the diversity of human-computer interaction methods and the adaptability to different interaction scenarios.
[0073] Sensor 1 is an EEG sensor with no more than 32 leads, distributed in the occipital and temporal lobe areas of the brain, used to collect human EEG signals, including steady-state visual evoked potentials, motor imagery, and P300 signals;
[0074] Sensor 2 is an electrooculogram (EOG) sensor used to collect EOG signals, including horizontal EOG (HEOG) and vertical EOG (VEOG) data.
[0075] Sensor 3 is a sound sensor, including an air microphone and flexible sensors distributed on the neck and behind the ears of a person, used to collect human voice signals;
[0076] Sensor 4 is an electromyography (EMG) sensor used to acquire EMG signals, which include surface muscle EMG signals from the hand, forearm, and leg.
[0077] Furthermore, the multi-source information data specifically includes: electroencephalogram (EEG), cerebral blood oxygenation, electrooculogram (EOG), speech, and electromyography (EMG) signals; wherein the EEG signals include: steady-state visual evoked potentials, auditory evoked potentials, motor imagery, and event-related potentials; wherein the speech signals include: acoustic speech and bone conduction speech.
[0078] S102: Establish a multi-source information task instruction dataset, which is constructed from multi-source data such as EEG, speech, EOG, and EMG signals. Further, the multi-source information task instruction dataset is divided into a training set and a test set.
[0079] S103: Obtain trend features from the multi-source information. As shown in Figure 2, this specifically includes using a signal denoising method to filter the task instruction dataset of the multi-source information to remove noise interference, and using a feature extraction method to obtain trend features of the various data.
[0080] The denoising methods for multi-source information include: filtering, wavelet decomposition, mode decomposition, low-rank matrix decomposition, and deep learning;
[0081] The feature extraction methods for multi-source information include: statistical methods, sparse transform, deep learning, and autoencoders. Among these, the Stacked Autoencoder (SAE) is a deep neural network composed of multiple autoencoders. The stacked autoencoder is trained layer by layer: each autoencoder is trained in turn, and the output of its encoder part is used as the input to the next layer. During the training of each layer, the weight matrix and bias vector of that layer are updated using the backpropagation algorithm, so the entire training process of the stacked autoencoder can be completed using backpropagation.
[0082] S104: The task classification results for the multi-source information data are obtained using signal classification and recognition algorithms. The classification and recognition methods include: Short-Time Fourier Transform, Wavelet Transform, Support Vector Machine, K-Nearest Neighbor Algorithm, Canonical Correlation Analysis (CCA), Hidden Markov Model, Vector Quantization, and Deep Learning. Among these, CCA is a method for analyzing electroencephalogram (EEG) signals: it calculates the correlation coefficient between the EEG signals and standard sine and cosine signals of different frequencies. For example, if a multi-channel EEG testing system acquires a time-domain signal X and a standard signal Y is generated for each frequency, CCA is performed on X and each Y to obtain the correlation coefficient ρ. Finally, the frequency with the largest correlation coefficient is selected as the prediction result, enabling feature recognition of the EEG signal.
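The CCA-based recognition described in S104 can be illustrated by the following minimal sketch. It is not the implementation of this invention: the sampling rate, candidate frequencies, number of harmonics, synthetic EEG data, and all function names are assumptions chosen for the example, and the canonical correlation is computed with a QR/SVD formulation in NumPy.

```python
import numpy as np

def canonical_corr(X, Y):
    """Largest canonical correlation between X and Y (rows are samples)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # Qx^T Qy are the canonical correlations (full column rank assumed).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))

def reference_signals(freq, n_samples, fs, n_harmonics=2):
    """Standard sine and cosine signals Y at a frequency and its harmonics."""
    t = np.arange(n_samples) / fs
    cols = []
    for h in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * h * freq * t))
        cols.append(np.cos(2 * np.pi * h * freq * t))
    return np.column_stack(cols)

def recognize_frequency(eeg, fs, candidates):
    """Return the candidate frequency with the largest correlation rho."""
    rhos = [canonical_corr(eeg, reference_signals(f, eeg.shape[0], fs))
            for f in candidates]
    return candidates[int(np.argmax(rhos))], rhos

# Synthetic 3-channel "EEG" containing a 10 Hz component plus noise
# (hypothetical data, used only to exercise the functions above).
rng = np.random.default_rng(0)
fs, n = 250, 500
t = np.arange(n) / fs
eeg = np.column_stack([np.sin(2 * np.pi * 10 * t + p) for p in (0.0, 0.5, 1.0)])
eeg = eeg + 0.5 * rng.standard_normal((n, 3))

best, rhos = recognize_frequency(eeg, fs, [8.0, 10.0, 12.0])
print(best, [round(r, 3) for r in rhos])
```

In this sketch each candidate frequency gets its own reference set Y, and the decision is simply the argmax over the resulting ρ values, which mirrors the selection rule stated in S104.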
[0083] S105: Based on the task classification results of the multi-source information, and using the fusion discriminant learner, determine whether multi-source data fusion is necessary. Specifically, this includes obtaining the fusion discriminant learner using a combination strategy of average, maximum, minimum, and weighted average values; merging the task results from multiple sensors; analyzing the merged results; and if the task results are the same as those from previous steps, no further data fusion is performed; otherwise, data fusion is performed. This reduces the processing time for multiple types of information in multi-modal human-computer interaction tasks under uncertain conditions and solves the problem of redundant interaction instructions.
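One possible reading of the fusion discrimination in S105 is sketched below. The sketch assumes that each sensor's classifier outputs a vector of class scores, merges these vectors with the four combination strategies, and requests data fusion whenever the merged decisions and the single-sensor decisions do not all agree. The score format, the weights, and the function names are assumptions for illustration; the invention does not specify them.

```python
import numpy as np

def combine(scores, weights):
    """Merge per-sensor class scores, shape (n_sensors, n_classes)."""
    return {
        "average": scores.mean(axis=0),
        "maximum": scores.max(axis=0),
        "minimum": scores.min(axis=0),
        "weighted_average": np.average(scores, axis=0, weights=weights),
    }

def needs_fusion(scores, weights):
    """Return (fuse?, merged decisions) for one task instruction."""
    single = scores.argmax(axis=1)  # single-task results from S104
    merged = {name: int(s.argmax()) for name, s in combine(scores, weights).items()}
    # Assumed criterion: fuse only if any decision differs from the others.
    decisions = set(single.tolist()) | set(merged.values())
    return len(decisions) > 1, merged

weights = np.array([0.5, 0.3, 0.2])  # hypothetical sensor weights

# All sensors agree on class 0: no fusion is needed.
agree = np.array([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])
# Sensors disagree (classes 0, 1, 1): fusion is requested.
conflict = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.4, 0.5, 0.1]])

fuse_agree, _ = needs_fusion(agree, weights)
fuse_conflict, merged_conflict = needs_fusion(conflict, weights)
print(fuse_agree, fuse_conflict, merged_conflict)
```

Skipping fusion when every decision already agrees is what removes redundant processing in this sketch; only conflicting instructions proceed to the fusion model of S106.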
[0084] S106: Construct a task-adaptive multi-source information data feature fusion model. As shown in Figure 3, this specifically includes acquiring a set of multi-source information data nodes, identifying key task decision points, and analyzing the ability to identify key decision points based on a Bayesian causal inference model to obtain fusion parameters. This is beneficial for the analysis and decision-making of multi-type interactive perception data and improves response speed.
[0085] Specifically, the multi-source information time-series data is normalized, and the angle between data points and the time-series preservation capability are calculated to obtain the multi-source data node set. Here, the time-series preservation capability refers to the time difference between two key points, and the angle between multi-source information data points can be expressed as...
[0086]
[0087] In the formula, $\Delta_{i-1,i} = |x'_{i-1} - x'_i|$, $\Delta_{i,i+1} = |x'_i - x'_{i+1}|$, and $\Delta_{i-1,i+1} = |x'_{i-1} - x'_{i+1}|$ denote the intervals between the multi-source data points, which use a uniform time interval.
[0088] Dynamic discretization is performed on the multi-source data nodes. Specifically, considering various scenarios such as continuous, discrete, and binary data nodes, an optimal set $\Psi_X$ is sought within the divided intervals, together with the optimal constant function on them. When the dynamic discretization process converges, the discretized function closely approximates $f_X$. The relative entropy is used to evaluate the quality of the dynamic discretization. The relative entropy (Kullback-Leibler distance) between the probability density functions f(x) and g(x) can be calculated using the following formula:
[0089] $D(f \,\|\, g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx$
[0090] The relative entropy of the probability density function before and after discretization of multi-source data nodes is:
[0091]
[0092] The boundary of each term on the right side of formula (3) can be determined by the following formula:
[0093]
[0094] In the formula, the symbols denote, in order, the length of the interval and the mean, upper limit, and lower limit of f on that interval.
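As an illustration of how relative entropy can score a discretization, the following sketch numerically evaluates the relative entropy between a density and a piecewise-constant approximation of it. The standard normal density, the use of the interval mean as the constant value, the truncation to [-4, 4], the midpoint-rule integration, and the function names are all assumptions made for this example and are not taken from the invention.

```python
import numpy as np

def relative_entropy_of_discretization(pdf, edges, n_sub=1000):
    """Approximate D(f || f_tilde), where f_tilde equals the mean of f
    on each interval [edges[j], edges[j+1]] (midpoint-rule integration)."""
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        dx = (hi - lo) / n_sub
        x = lo + dx * (np.arange(n_sub) + 0.5)  # midpoints of sub-cells
        fx = pdf(x)
        mean_f = fx.mean()                      # constant value on this interval
        mask = fx > 0
        total += float(np.sum(fx[mask] * np.log(fx[mask] / mean_f)) * dx)
    return total

def std_normal(x):
    """Standard normal density, used here only as a test density."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

coarse = relative_entropy_of_discretization(std_normal, np.linspace(-4.0, 4.0, 5))
fine = relative_entropy_of_discretization(std_normal, np.linspace(-4.0, 4.0, 33))
print(coarse, fine)
```

A smaller value indicates that the discretized function is closer to the original density, so a refinement step would be accepted in this sketch only if it lowers the relative entropy.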
[0095] Fuzzy clustering is used to determine whether the time series data of a certain node set of multi-source data contains key decision points. The key decision points specifically include state transitions such as start, stop, danger, and invalid.
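The invention does not state which fuzzy clustering algorithm is used. As a minimal sketch, assuming fuzzy c-means with two clusters applied to a hypothetical one-dimensional feature per time window (for example, the magnitude of change within the window), windows whose membership in the high-change cluster exceeds 0.5 can be flagged as candidates for key decision points. The feature, the threshold, the synthetic data, and the function name are assumptions for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means; X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)           # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Hypothetical window features: 30 steady windows (change near 0) followed
# by 30 windows containing a state transition (change near 1).
rng = np.random.default_rng(1)
steady = rng.normal(0.0, 0.05, size=(30, 1))
transition = rng.normal(1.0, 0.05, size=(30, 1))
X = np.vstack([steady, transition])

centers, U = fuzzy_c_means(X)
key_cluster = int(np.argmax(centers[:, 0]))     # cluster with the larger change
is_key = U[:, key_cluster] > 0.5
print(int(is_key.sum()), "windows flagged as containing key decision points")
```

Because the memberships are graded rather than binary, the threshold can be tightened for safety-critical transitions such as danger without re-clustering.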
[0096] Based on the Bayesian causal inference model, the ability to identify key decision points is analyzed, that is, whether it can identify key task decision points such as start, stop, danger, invalidity and other state transitions, and obtain multi-source information data processing and fusion parameters.
[0097] By training a Long Short-Term Memory (LSTM) time series model for decision nodes using historical data with diverse information, multi-step prediction of time series data can be achieved, including regression prediction and deep learning.
[0098] The task-adaptive multi-source information data feature fusion model is continuously updated and learned based on the Sample Average Approximation (SAA) method, enabling the constructed model to perform incremental learning. The SAA method obtains the solution to the stochastic discrete optimization problem through Monte Carlo computation; when the number of samples N is sufficiently large, the probability of the event converges to 1.
[0099] The SAA formula is expressed as follows.
[0100] $$\hat{f}_N(x) = \frac{1}{N}\sum_{i=1}^{N} f(x, \omega_i) \tag{5}$$
[0101] In the formula, $f(x, \omega_i)$ is Lipschitz continuous in $x$.
[0102] For any β > 0, based on the central limit theorem,
[0103]
[0104] where the above relation holds with probability $1-\alpha$. Therefore, the sample average converges to $E[f(x, \omega)]$ at a rate of $O(N^{-1/2})$.
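The sample average approximation idea can be sketched on a toy stochastic discrete optimization problem as follows. The cost function, the Gaussian distribution of the random outcome, the candidate set, and all function names are hypothetical choices made for illustration; the source does not state the objective that the fusion model actually optimizes.

```python
import random

def f(x, omega):
    """Hypothetical cost: squared mismatch between decision x and random outcome omega."""
    return (x - omega) ** 2

def saa_objective(x, samples):
    """Sample average (1/N) * sum_i f(x, omega_i)."""
    return sum(f(x, w) for w in samples) / len(samples)

def saa_solve(candidates, samples):
    """Pick the candidate that minimizes the sample-average objective."""
    return min(candidates, key=lambda x: saa_objective(x, samples))

rng = random.Random(0)          # fixed seed so the run is repeatable
candidates = range(7)           # finite decision set {0, ..., 6}
for n in (10, 100, 2000):
    samples = [rng.gauss(3.2, 1.0) for _ in range(n)]  # Monte Carlo draws of omega
    best = saa_solve(candidates, samples)
    print(n, best, round(saa_objective(best, samples), 3))
```

In this toy problem the true objective is $E[(x-\omega)^2] = (x-3.2)^2 + 1$, so the true optimum over the candidate set is $x = 3$ with value 1.04. As N grows, the sample-average objective settles near that value, and its fluctuation shrinks roughly in proportion to $N^{-1/2}$.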
[0105] S107: Update the fusion parameters in real time for error correction, predict future data from the multiple sources to determine the multi-step fusion coefficients, obtain the prediction results of future data from each multi-source sensor, and accelerate the control of human-computer interaction tasks.
[0106] As shown in Figure 4, based on the multi-step prediction results of the multi-source information time series, fusion parameters are generated, the error between the true value and the predicted data is calculated, the error is corrected, and new fusion parameters are generated. This specifically includes:
[0107] At the k-th time step, a large amount of independent and identically distributed data is generated and input into the data fusion model at the (k-n)-th time step (n=1,2,...). The predicted value is used as the true value at the k-th time step. The parameter combinations that meet the requirements are extracted and averaged to give the fusion parameters at the k-th time step.
[0108] Based on the fusion parameters at the k-th time step, at the (k+n)-th time step (n=1,2...), the difference between the predicted result and the actual value is compared.
[0109] A feedback loop is set up between the predicted result and the actual value. Let $t_p$ denote the predicted result and $t_r$ the actual value, let b be the preset value for the abnormal state, and let warn denote the warning state. The judgment criterion for the feedback loop is:
[0110] $$\text{warn} = \begin{cases} 1, & |t_p - t_r| \ge b \\ 0, & \text{otherwise} \end{cases}$$
[0111] When the difference is greater than or equal to the preset value b, warn=1, indicating that the human-computer interaction system is in an abnormal operating state; otherwise, no abnormality is considered to have occurred. When warn=1, the task-adaptive multi-source information data feature fusion model is adjusted by increasing the number of prediction calculations, so as to avoid missing key decision points.
[0112] Update the fusion parameters, optimize the prediction error, and improve the fusion model's adaptability to uncertain environments.
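The feedback criterion above can be sketched in a few lines. The source only says that the number of prediction calculations is increased when warn=1; the doubling policy, the upper limit, the use of an absolute difference, and the example values below are assumptions for illustration.

```python
def warn_flag(t_p, t_r, b):
    """Return 1 when |t_p - t_r| >= b (abnormal state), else 0.
    Taking the absolute difference is an assumption."""
    return 1 if abs(t_p - t_r) >= b else 0

def next_prediction_count(current, warn, factor=2, maximum=64):
    """Hypothetical policy: multiply the number of prediction calculations
    by `factor` (capped at `maximum`) whenever warn == 1."""
    return min(current * factor, maximum) if warn else current

predicted = [1.0, 1.1, 1.2, 2.5, 1.3]   # example predicted results t_p
actual = [1.0, 1.0, 1.25, 1.2, 1.3]     # example actual values t_r
b = 0.5                                  # preset abnormal-state value
count = 4                                # initial number of prediction calculations
history = []
for t_p, t_r in zip(predicted, actual):
    warn = warn_flag(t_p, t_r, b)
    count = next_prediction_count(count, warn)
    history.append((warn, count))
print(history)
```

In this sketch the count stays raised after a warning; whether and how it should fall back once predictions agree with measurements again is a design decision the source leaves open.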
[0113] Example 2
[0114] To implement the above embodiments, this embodiment also discloses an online acceleration system for multi-source data fusion human-computer interaction tasks. As shown in Figure 5, the system includes: an acquisition module 100, a storage module 200, an update module 300, a processing module 400, and a prediction module 500. The acquisition module collects multi-source information data for human-computer interaction and stores it in the storage module, establishing a multi-source information dataset for various task instructions. The processing module extracts features from the dataset, analyzes the task results, determines whether the results need to be fused, and outputs the features of the multi-source information to the update module. The update module obtains fusion parameters based on the features learned from the multi-source information dataset of the task instructions, updates the fusion parameters in real time, and outputs the updated fusion parameters. Finally, the prediction module calculates multi-step prediction results for the information time series based on the updated fusion parameters, achieving online acceleration of human-computer interaction tasks.
[0115] The acquisition module is used to acquire multi-source information about the user and perceived data from the interactive environment. The layout of the acquisition module is shown schematically in Figure 6, where:
[0116] Sensor 1 is an EEG sensor with no more than 32 leads, distributed in the occipital and temporal lobe areas of the brain, used to collect human EEG signals, including steady-state visual evoked potentials, motor imagery, and P300 signals;
[0117] Sensor 2 is an electrooculogram (EOG) sensor used to collect EOG signals, including horizontal EOG (HEOG) and vertical EOG (VEOG) data.
[0118] Sensor 3 is a sound sensor, including an air microphone and flexible sensors distributed on the neck and behind the ears of a person, used to collect human voice signals;
[0119] Sensor 4 is an electromyography (EMG) sensor used to acquire EMG signals, which include surface muscle EMG signals from the hand, forearm, and leg.
[0120] The storage module is used to store and establish a multi-source information dataset of various task instructions, wherein the multi-source information dataset of task instructions consists of various human-computer interaction instructions;
[0121] The processing module is used to obtain the classification results of the multi-source information task dataset and determine whether the results need to be fused.
[0122] The update module is used to obtain fusion parameters based on the features learned from the multi-source information dataset of the task instructions;
[0123] The prediction module is used to predict the next information trend using the fusion parameters, thereby accelerating the online execution of multi-modal human-computer interaction tasks. The prediction module includes:
[0124] Specifically, a feedback loop is set up between the predicted data and the measured data in the prediction module. When the difference is greater than the preset value / range, it is generally considered that there is an abnormal operating state or environmental state. The piecewise linear approximation scale in the prediction module is reduced, that is, the number of prediction calculations is increased, in order to avoid the problem of key states not being acquired. The connection strength parameters between nodes can be obtained through offline training and online optimization, thereby improving the adaptability of the multi-source information fusion model to uncertain environments.
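The data flow between the five modules can be pictured with the following skeleton. Only the module roles come from the description; every class name, method name, the stand-in sensor readers, and the placeholder rules for deciding on fusion and for re-weighting sources are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StorageModule:
    """Keeps the multi-source dataset (here: a list of per-source reading dicts)."""
    records: list = field(default_factory=list)

    def store(self, sample):
        self.records.append(sample)

class AcquisitionModule:
    def __init__(self, sources):
        self.sources = sources  # hypothetical: source name -> zero-argument reader

    def acquire(self):
        return {name: read() for name, read in self.sources.items()}

class ProcessingModule:
    def needs_fusion(self, sample):
        # Placeholder rule: fuse only when the sources disagree.
        return len(set(sample.values())) > 1

class UpdateModule:
    def __init__(self, names):
        self.weights = {name: 1.0 / len(names) for name in names}

    def update(self, errors):
        # Placeholder rule: weight each source inversely to its last error.
        inverse = {n: 1.0 / (1e-6 + abs(e)) for n, e in errors.items()}
        total = sum(inverse.values())
        self.weights = {n: v / total for n, v in inverse.items()}
        return self.weights

class PredictionModule:
    def predict(self, sample, weights):
        # Weighted combination of the per-source values.
        return sum(weights[n] * sample[n] for n in sample)

sources = {"eeg": lambda: 1.0, "emg": lambda: 0.0}  # stand-in sensor readers
acquisition, storage = AcquisitionModule(sources), StorageModule()
processing, update, prediction = ProcessingModule(), UpdateModule(list(sources)), PredictionModule()

sample = acquisition.acquire()
storage.store(sample)
if processing.needs_fusion(sample):
    weights = update.update({"eeg": 0.1, "emg": 0.3})  # example per-source errors
    fused = prediction.predict(sample, weights)
    print(weights, fused)
```

Keeping the update module separate from the prediction module mirrors the description: fusion parameters can be refreshed online without changing how predictions are computed from them.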
[0125] The above detailed description further illustrates the purpose, technical solution, and beneficial effects of the invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for accelerating online human-computer interaction tasks through multi-source data fusion, characterized in that: Includes the following steps, Step 1: Obtain multi-source information data containing the intent of human-computer interaction task instructions through a multi-source data fusion human-computer interaction task online acceleration system; Step 2: Establish a task instruction dataset with multi-source information; Step 3: The task instruction dataset of the multi-source information is filtered using signal denoising and feature extraction methods to remove noise interference, and trend features of various data are obtained through feature extraction methods. Step 4: Use a signal classification and recognition algorithm to obtain the single-task classification result of multi-source information data; Step 5: Based on the fusion discriminant learner, determine whether to fuse the multi-source data; Step 5 is implemented as follows: The fusion discriminative learner is based on a combination strategy method: it uses a combination strategy method of average value, maximum value, minimum value and weighted average value to merge the task results of multiple sensors, analyzes the merged result, and if the merged result differs from the previous steps, it performs data fusion. Otherwise, data fusion will not be performed; Step 6: Construct a task-adaptive multi-source information data feature fusion model; Step 7: Update fusion parameters in real time to predict future data from multiple sources. This will help determine the fusion coefficients before implementation, obtain the prediction results for future data from each multi-source sensor, accelerate the control of human-computer interaction tasks, improve the accuracy of multi-source sensor data prediction results, and enhance the reliability of human-computer interaction under various uncertain conditions. 
Step 7 is implemented as follows: Obtain multi-step prediction results for the multi-source information time series; Generate independent and identically distributed data to solve the problem of having no data in the prediction stage. Specifically, at the k-th time step, a large amount of independent and identically distributed data is generated and input into the data fusion model at the (k-n)-th time step; the predicted value is used as the true value at the k-th time step; the parameter combinations that meet the requirements are extracted and averaged to give the fusion parameters at the k-th time step. Based on the fusion parameters at the k-th time step, at the (k+n)-th time step, the difference between the predicted result and the actual value is compared. A feedback loop is set up between the predicted result and the actual value: the predicted result is $t_p$, the actual value is $t_r$, the preset value for the abnormal state is b, and warn represents the warning state. The feedback criterion is: when the difference is greater than or equal to the preset value b, warn=1, indicating that the human-computer interaction system is in an abnormal operating state; otherwise, no abnormality has occurred. When warn=1, the task-adaptive multi-source information data feature fusion model is adjusted by increasing the number of prediction calculations, so as to avoid missing key decision points; the prediction error is optimized to improve the adaptability of the fusion model to uncertain environments.
2. The online acceleration method for multi-source data fusion human-computer interaction tasks as described in claim 1, characterized in that: The multi-source information data mentioned in step 1 specifically includes: electroencephalogram (EEG), cerebral blood oxygenation, electrooculogram (EOG), speech, and electromyography (EMG) signals; wherein, the EEG signals include: steady-state visual evoked potentials, auditory evoked potentials, motor imagery, and event-related potentials; wherein, the speech signals include: acoustic speech and bone conduction speech.
3. The online acceleration method for multi-source data fusion human-computer interaction tasks as described in claim 2, characterized in that: The dataset described in step 2 consists of various human-computer interaction task instruction data, and the dataset is divided into a training set and a test set.
4. The online acceleration method for multi-source data fusion human-computer interaction tasks as described in claim 1, characterized in that: Step 6 is implemented as follows: The multi-source information data is normalized; A multi-source data node set is obtained by calculating the angle between multi-source data points and the time-time preservation capability, where the time-time preservation capability refers to the time difference between two key points, and the angle between multi-source information data points is given by formula (1), in which the symbols denote the time intervals between data points and a uniform time interval is used; Dynamic discretization is performed on the multi-source data nodes: allowing for data nodes that may be continuous, discrete, or binary, an optimal set $\Psi_X$ and an optimal constant function are sought within the divided intervals, such that when the dynamic discretization process converges, the constant function closely approximates $f_X$; relative entropy is used to evaluate the quality of the dynamic discretization; the relative entropy between the probability density functions $f(x)$ and $g(x)$ is calculated by formula (2); the relative entropy of the probability density function before and after discretization of the multi-source data nodes is given by formula (3); the boundary of each term on the right side of formula (3) is determined by formula (4), in which the symbols denote the interval length and the mean, upper limit, and lower limit of f; A fuzzy clustering method is used to determine whether the time series data of a given node set of the multi-source data contains key decision points, the key decision points specifically including state transitions such as start, stop, danger, and invalid.
Based on the Bayesian causal inference model, the ability to identify key decision points is analyzed, that is, whether key task decision points such as the start, stop, danger, and invalid state transitions can be identified, and the multi-source information data processing and fusion parameters are obtained; By training a long short-term memory time series model of the decision nodes on historical data of diverse information, multi-step prediction of the time series data is achieved, including regression prediction and deep learning; The task-adaptive multi-source information data feature fusion model is continuously updated and learned based on the sample average approximation method, enabling the constructed model to perform incremental learning; the sample average approximation method obtains the solution of the stochastic discrete optimization problem through Monte Carlo calculation, and when the number of samples N is large enough, the probability of the event converges to 1; the sample average approximation is expressed by formula (5), in which $f(x, \omega_i)$ is Lipschitz continuous in $x$; for any $\beta > 0$, based on the central limit theorem, the relation in formula (6) holds with probability $1-\alpha$; thus, the convergence rate to $E[f(x, \omega)]$ is $O(N^{-1/2})$.
5. A multi-source data fusion human-computer interaction task online acceleration system, used to implement the multi-source data fusion human-computer interaction task online acceleration method as described in claim 1, 2, 3 or 4, characterized in that: It includes five modules: acquisition, storage, processing, update, and prediction. The acquisition module collects multi-source information data for human-computer interaction, stores it in the storage module, and establishes a multi-source information dataset for various task instructions. The processing module performs feature extraction and task result analysis on the dataset and determines whether the results need to be fused, and outputs the features of multi-source information to the update module; The update module learns the features from the multi-source information dataset of the task instructions, obtains the fusion parameters, updates the fusion parameters in real time, and outputs the updated fusion parameters. Finally, the prediction module calculates the multi-step prediction results of the information time series based on the updated fusion parameters, thereby realizing online acceleration of human-computer interaction tasks.
6. The online acceleration system for multi-source data fusion human-computer interaction tasks as described in claim 5, characterized in that: The acquisition module is used to acquire multi-source perception data for human-computer interaction, and the acquisition module further includes: an EEG sensor, with no more than 32 leads, distributed in the occipital and temporal lobe areas of the brain and used to collect human EEG signals, including steady-state visual evoked potentials, motor imagery, and P300 signals; an electrooculogram (EOG) sensor, used to acquire EOG signals, including horizontal EOG (HEOG) and vertical EOG (VEOG) data; a sound sensor, including an air microphone and flexible sensors distributed on the neck and behind the ears, used to collect human voice signals; an electromyography (EMG) sensor, used to acquire EMG signals, which are surface EMG signals including those from the hand, forearm, and leg; The storage module is used to store and build a multi-source information dataset of various task instructions, wherein the multi-source information dataset consists of various human-computer interaction instructions; The processing module is used to obtain the classification results of the multi-source information dataset and determine whether the results need to be fused; The update module obtains fusion parameters based on the features of the multi-source information dataset and updates the fusion parameters in real time based on the prediction results; The prediction module is used to predict the next information trend using the fusion parameters, thereby accelerating the online processing of multimodal human-computer interaction tasks.