A facial expression classification method and system

By using MXene-based hydrogel electrodes to collect facial electromyography and electrooculography signals and performing feature cross-referencing and fusion, the problems of poor signal quality and low classification performance in existing technologies are solved, achieving higher expression recognition accuracy and reduced skin irritation.

CN119150095B · Active · Publication Date: 2026-06-23 · BEIJING INST OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING INST OF TECH
Filing Date
2024-08-27
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing facial expression classification methods and systems suffer from poor signal quality, poor feature extraction, and poor facial expression classification performance. In particular, in VR, prolonged wearing of medical electrodes can cause skin irritation and inflammation, affecting recognition accuracy.

Method used

MXene-based hydrogel electrodes were used to collect facial electromyography and electrooculography signals. The recognition accuracy was improved through signal preprocessing, data preprocessing, feature extraction, cross-referencing and fusion, screening and machine learning classification.

Benefits of technology

By combining and fusing features, the number and types of salient features were significantly increased, improving the accuracy of facial expression classification and alleviating skin irritation caused by prolonged wear of medical electrodes.

✦ Generated by Eureka AI based on patent content.


Abstract

The present application belongs to the technical field of data representation and signal classification, and relates to a facial expression classification method and system. The system comprises a facial bioelectric sensor, a hardware data noise filtering module and a host computer. The facial bioelectric sensor collects facial electromyography (EMG) signals, electrooculography (EOG) signals and posture data. The hardware data noise filtering module processes the EMG and EOG signals. The method extracts time-domain, frequency-domain and transform-domain features from the digital EMG and EOG data obtained after analog-to-digital conversion. Correlation coefficients are calculated for the extracted features and compared with a threshold, and the features are then crossed and fused. After the average maximal information coefficient (MIC) is calculated, the features are screened and aligned, and the selected data is classified to obtain the expression type. By crossing and fusing features according to the correlation coefficient and threshold, the method and system increase the number and types of significant features; after statistics, screening and alignment, classification is more accurate than with existing methods and systems.

Description

Technical Field

[0001] This invention belongs to the field of data representation and signal classification technology, and relates to a method and system for classifying human facial expressions.

Background Technology

[0002] With the rapid development of virtual reality (VR) and augmented reality (AR) technologies, VR technology has been widely applied in areas such as rehabilitation for autistic patients, training of children's social skills, cognitive training for the elderly, and teleconferencing. Facial expressions are one of the important ways for humans to communicate emotions, and the demand for facial expression recognition in the field of human-computer interaction is growing. Facial expressions also play a key role in VR technology; therefore, facial expression recognition technology has become a hot research topic.

[0003] In VR facial expression recognition, conventional approaches use cameras to track the face and identify the user's expressions. However, the VR headset often obstructs the camera. In recent years, many researchers have attempted to use wearable devices to collect facial bioelectrical signals, such as EOG and facial EMG (fEMG) signals, from users immersed in VR experiences. However, due to limitations in the number of electrodes and in the algorithms, recognition accuracy is low. Furthermore, the medical electrode patches used to collect physiological signals can easily cause skin irritation and inflammation during prolonged wear. Using MXene-based hydrogel electrodes can solve the problem of wear affecting VR usage.

[0004] This application aims to improve the low recognition accuracy and to alleviate the problems caused by prolonged wearing of medical electrodes. It does so by processing EMG and EOG signals collected by hydrogel electrodes through signal preprocessing, data preprocessing, feature extraction, feature crossing, fusion, screening, statistics, and machine learning classification.

Summary of the Invention

[0005] The purpose of this invention is to address the shortcomings of existing facial expression classification methods and systems, such as poor signal quality, poor feature extraction, and poor classification performance. This invention proposes a facial expression classification method and system, comprising a facial bioelectric sensor, a hardware data filtering module, and a host computer. The facial bioelectric sensor collects facial electromyography (EMG) signals, electrooculography (EOG) signals, and posture data. The hardware data filtering module processes the EMG and EOG signals. The method extracts time-domain, frequency-domain, and transform-domain features from the EMG and EOG data obtained after analog-to-digital conversion. Correlation coefficients are calculated for the extracted features and compared with a threshold, and the features are then crossed and fused. The average maximal information coefficient (MIC) is calculated, and the features are then filtered and aligned. The selected data is then classified to obtain the expression type. By crossing and fusing features according to correlation coefficients and thresholds, the method and system increase the number and types of significant features. After statistical analysis, filtering, and alignment, classification achieves higher accuracy than existing methods and systems.

[0006] To achieve the above objectives, the present invention adopts the following technical solution:

[0007] In a first aspect, the present invention provides a facial expression classification system, comprising a data acquisition unit, a hardware data noise filtering module, a data preprocessing module, and a machine learning expression classification module. The data acquisition unit includes several electrodes and is connected to the hardware data noise filtering module, which is also connected to the data preprocessing module. The data preprocessing module outputs data for feature processing and selection. The system also includes an electromyography (EMG) feature processing module, an electrooculography (EOG) feature processing module, and a feature selection module. The data acquisition unit acquires and outputs facial EMG and EOG signals. The facial EMG and EOG signals are respectively input into the hardware data noise filtering module and the data preprocessing module to obtain preprocessed EMG and EOG data. The preprocessed EMG and EOG data are processed by the EMG and EOG feature processing modules to extract features, calculate correlation coefficients, and cross-reference preprocessed EMG and EOG data that meet thresholds to obtain feature-processed EMG and EOG data. The feature-processed EMG and EOG data are then processed by the feature selection module for feature fusion, statistics, filtering, and alignment to obtain feature-selected data. The feature-selected data is then classified by the machine learning expression classification module to obtain expression types.

[0008] The electromyography (EMG) feature processing module includes an EMG feature extraction unit, and the electrooculography (EOG) feature processing module includes an EOG feature extraction unit. The EMG feature extraction unit extracts the temporal, frequency, statistical, and transform domain features of the preprocessed facial EMG data to obtain EMG extracted features. The EOG feature extraction unit extracts the temporal, frequency, statistical, and transform domain features of the preprocessed EOG data to obtain EOG extracted features.

[0009] The electromyography (EMG) feature processing module further includes an EMG correlation coefficient calculation unit, an EMG correlation coefficient judgment unit, and an EMG feature crossing unit. The EMG correlation coefficient calculation unit calculates the EMG correlation coefficient of the extracted EMG features. The EMG correlation coefficient judgment unit determines whether the calculated EMG correlation coefficient is less than the EMG correlation coefficient threshold. If it is less, the extracted EMG features corresponding to the EMG correlation coefficient are sent to the EMG feature crossing unit for crossing to obtain the feature-processed EMG data. Otherwise, if it is greater than or equal to the EMG correlation coefficient threshold, the extracted EMG features corresponding to the EMG correlation coefficient are sent to the feature selection module as the feature-processed EMG data.

[0010] The electrooculography (EOG) feature processing module further includes an EOG correlation coefficient calculation unit, an EOG correlation coefficient judgment unit, and an EOG feature crossing unit. The EOG correlation coefficient calculation unit calculates the EOG correlation coefficient of the extracted EOG features. The EOG correlation coefficient judgment unit determines whether the calculated EOG correlation coefficient is less than the EOG correlation coefficient threshold. If it is less, the extracted EOG features corresponding to the EOG correlation coefficient are sent to the EOG feature crossing unit for crossing to obtain the feature-processed EOG data. Otherwise, if it is greater than or equal to the EOG correlation coefficient threshold, the extracted EOG features corresponding to the EOG correlation coefficient are sent to the feature selection module as the feature-processed EOG data.

[0011] In a specific implementation, the EMG feature crossing unit and the EOG feature crossing unit perform the crossing with the Kronecker product, which expands the number of salient features;
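
The threshold test and Kronecker-product crossing of paragraphs [0009] to [0011] can be illustrated with a minimal Python sketch. The function names, the use of the Pearson coefficient between a pair of feature vectors, the comparison on the absolute value, and the 0.5 threshold are all assumptions made for illustration; the text does not specify which correlation coefficient is used, which features are paired, or the threshold value.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant feature as uncorrelated
    return cov / (sx * sy)

def kron(u, v):
    """Kronecker product of two 1-D vectors; the result has len(u) * len(v) entries."""
    return [a * b for a in u for b in v]

def process_pair(f1, f2, threshold=0.5):
    """Cross the pair if |r| < threshold, otherwise pass both vectors through unchanged."""
    if abs(pearson(f1, f2)) < threshold:
        return [kron(f1, f2)]
    return [list(f1), list(f2)]

# Uncorrelated pair: replaced by one crossed vector with 3 * 3 = 9 entries.
print(process_pair([1, 2, 3], [1, -2, 1]))
# Strongly correlated pair: both vectors are passed through as they are.
print(process_pair([1, 2, 3], [2, 4, 6]))
```

Crossing two vectors of lengths m and n yields m × n product terms, which is how the step expands the number of candidate features.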

[0012] The feature selection module includes a feature fusion unit, an average MIC calculation unit, a feature filtering unit, and a feature alignment unit. The feature fusion unit receives and fuses the feature-processed EMG and EOG data from the EMG feature processing module and the EOG feature processing module, respectively, to obtain fused features. The average MIC calculation unit calculates the average MIC of the fused features. The feature filtering unit filters the fused features based on the calculated average MIC to obtain filtered features. The feature alignment unit aligns the filtered features to obtain aligned features. The aligned features are then input into the machine learning expression classification module.

[0013] The fusion refers to one or more combinations of addition, series fusion, multi-level fusion, and multi-scale fusion.

[0014] The fusion can yield new effective feature types;

[0015] In specific implementation, the screening can be based on the calculated average MIC, or it can employ statistical screening, model screening, and embedded screening. The statistical screening includes one or more of the following: variance selection, chi-square test, mutual information, and recursive methods. The model screening includes one or a combination of Pandas module screening, Sklearn module screening, LSTM model screening, CNN model screening, and DNN model screening. The embedded screening includes one or a combination of filtering, wrapping, and tree-based methods.

[0016] The classification system calculates the average MIC of the electromyography and electrooculography data after feature processing, and then filters and selects or combines other filtering methods to obtain filtered features. The features are then aligned to obtain feature-selected data. The feature-selected data not only increases the number of significant features, but also obtains new effective or significant feature types.

[0017] In a second aspect, the present invention provides a method for classifying facial expressions, comprising the following steps:

[0018] S1. Collect and acquire facial electromyography (EMG) and electrooculography (EOG) signals;

[0019] S2. Perform signal preprocessing on facial electromyography (EMG) signals and electrooculography (EOG) signals respectively;

[0020] S3. Perform analog-to-digital conversion on the preprocessed facial electromyography (EMG) signals and electrooculography (EOG) signals respectively.

[0021] S4. Perform data preprocessing on the facial electromyography and electrooculography data obtained after analog-to-digital conversion;

[0022] S5. Perform feature processing on the facial electromyography data and electrooculography data obtained after data preprocessing, respectively.

[0023] S6. Perform feature selection on the facial electromyography and electrooculography features obtained after feature processing to obtain aligned features;

[0024] S7. The aligned features obtained after feature selection are classified using a machine learning model to obtain the expression type.

[0025] In S1, the facial EMG and EOG signals are acquired by a data acquisition unit, which obtains bioelectrical signals with a high signal-to-noise ratio and high conductivity through several electrodes fixed inside the AR goggles.

[0026] The bioelectric signals include facial muscle electrical signals and electrooculography (EOG) signals;

[0027] The high signal-to-noise ratio is greater than or equal to 25 dB; the high conductivity is greater than or equal to 0.8 × 10⁻² S·cm⁻¹. The electrode is an MXene-based hydrogel electrode prepared from MXene-based hydrogel. The preparation has the advantages of a simple process, uniform distribution of MXene, and biocompatibility.
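
As a worked illustration of the 25 dB requirement, the following Python sketch estimates a signal-to-noise ratio in decibels from two recorded segments. The text does not say how the signal-to-noise ratio is measured; comparing the mean power of a segment recorded during an expression with the mean power of a resting baseline is one common estimate and is an assumption here, as are the function names.

```python
import math

def mean_power(samples):
    """Mean squared amplitude of a segment."""
    return sum(s * s for s in samples) / len(samples)

def snr_db(active, baseline):
    """SNR in dB: mean power during activity over mean power at rest (assumed estimate)."""
    return 10.0 * math.log10(mean_power(active) / mean_power(baseline))

def meets_snr_requirement(active, baseline, min_db=25.0):
    """True if the estimated SNR reaches the 25 dB lower bound stated in the text."""
    return snr_db(active, baseline) >= min_db

# An amplitude ratio of 10 gives 20 dB, which falls short of the 25 dB bound.
print(snr_db([10, -10, 10, -10], [1, -1, 1, -1]))
print(meets_snr_requirement([10, -10, 10, -10], [1, -1, 1, -1]))
```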

[0028] S2 performs signal preprocessing on the facial electromyography (EMG) signal, specifically: performing analog low-pass filtering on the EMG signal; performing a first-stage amplification on the low-pass filtered EMG signal; performing analog high-pass filtering on the first-stage amplified EMG signal; and performing a second-stage amplification on the high-pass filtered EMG signal to obtain the preprocessed facial EMG signal. Similarly, S2 performs signal preprocessing on the electrooculography (EOG) signal, specifically: performing analog low-pass filtering on the EOG signal; performing a first-stage amplification on the low-pass filtered EOG signal; performing analog high-pass filtering on the first-stage amplified EOG signal; and performing a second-stage amplification on the high-pass filtered EOG signal to obtain the preprocessed EOG signal.

[0029] S4 performs data preprocessing on the facial electromyography (EMG) data obtained after analog-to-digital conversion. Specifically, it performs digital filtering on the EMG data after analog-to-digital conversion and normalizes the digitally filtered EMG data to obtain preprocessed EMG data. It also performs data preprocessing on the electrooculography (EOG) data obtained after analog-to-digital conversion. Specifically, it performs digital filtering on the EOG data after analog-to-digital conversion and normalizes the digitally filtered EOG data to obtain preprocessed EOG data.

[0030] S5 performs feature processing on the facial electromyography (EMG) data obtained after data preprocessing. Specifically, it extracts time-domain, frequency-domain, statistical, and transform-domain features from the preprocessed facial EMG data to obtain EMG extraction features; it calculates the EMG correlation coefficient of the EMG extraction features, and determines whether the calculated EMG correlation coefficient is less than the EMG correlation coefficient threshold. If it is less than the EMG correlation coefficient threshold, the EMG extraction features corresponding to the EMG correlation coefficient are cross-referenced to obtain the feature-processed EMG data; otherwise, if it is greater than or equal to the EMG correlation coefficient threshold, the EMG extraction features corresponding to the EMG correlation coefficient are used as the feature-processed EMG data.

[0031] The electrooculogram (EOG) data obtained after data preprocessing is subjected to feature processing, specifically: extracting time-domain, frequency-domain, statistical, and transform-domain features from the preprocessed EOG data to obtain EOG extracted features; calculating the EOG correlation coefficient of the EOG extracted features; determining whether the correlation coefficient calculated by the EOG correlation coefficient calculation unit is less than the EOG correlation coefficient threshold; if it is less than the EOG correlation coefficient threshold, the EOG extracted features corresponding to the EOG correlation coefficient are cross-referenced to obtain the feature-processed EOG data; otherwise, if it is greater than or equal to the EOG correlation coefficient threshold, the EOG extracted features corresponding to the EOG correlation coefficient are used as the feature-processed EOG data.

[0032] S6 performs feature selection on the facial electromyography (EMG) and electrooculography (EOG) features obtained after feature processing. Specifically: receiving and fusing the feature-processed EMG and EOG data to obtain fused features; calculating the average MIC of the fused features; selecting the top N fused features ranked by the calculated average MIC values as filtered features; and aligning the filtered features to obtain aligned features. The fusion includes one or more combinations of addition, series fusion, multi-level fusion, and multi-scale fusion.

[0033] The fusion can yield new effective feature types; the filtering, in specific implementation, can be based on the calculated average MIC, or it can employ statistical filtering, model filtering, and embedded filtering; the statistical filtering includes one or more of variance selection, chi-square test, mutual information, and recursive methods; the model filtering includes one or a combination of Pandas module filtering, Sklearn module filtering, LSTM model filtering, CNN model filtering, and DNN model filtering; the embedded filtering includes one or a combination of filtering, wrapping, and tree-based methods.
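
The fuse, score, and top-N steps of S6 can be sketched in Python as follows. This is not the patented procedure: a simple histogram estimate of mutual information is swapped in as a stand-in for MIC, a single score per feature replaces the average MIC (the text does not say what the average is taken over), series fusion is read as concatenation, and the alignment step is omitted. All function names, feature names, and data values are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, labels, bins=4):
    """Histogram estimate of mutual information (in nats) between a feature and labels."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0          # guard against a constant feature
    xb = [min(int((v - lo) / width), bins - 1) for v in x]
    n = len(x)
    joint, px, py = Counter(zip(xb, labels)), Counter(xb), Counter(labels)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def fuse(emg_features, eog_features):
    """Series fusion (assumed to mean concatenation) of named feature columns."""
    fused = {f"emg_{k}": v for k, v in emg_features.items()}
    fused.update({f"eog_{k}": v for k, v in eog_features.items()})
    return fused

def select_top_n(fused, labels, n):
    """Names of the n fused features with the highest relevance score."""
    scores = {name: mutual_information(col, labels) for name, col in fused.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

labels = ["neutral", "neutral", "smile", "smile"]
fused = fuse({"rms": [0.1, 0.2, 0.9, 1.0]}, {"noise": [0.5, 0.1, 0.5, 0.1]})
print(select_top_n(fused, labels, 1))  # ['emg_rms']
```

In the example, the hypothetical `emg_rms` column separates the two labels and therefore outranks the uninformative `eog_noise` column.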

[0034] Overall, the innovations of the classification method and system include two aspects:

[0035] On the one hand, there is acquisition, signal preprocessing, and data preprocessing;

[0036] The data acquisition is achieved through a data acquisition unit with several electrodes. The electrodes are MXene-based hydrogel electrodes, which have a simple fabrication process, uniform MXene distribution on the electrodes, and good biocompatibility. The acquired facial electromyography (fEMG) and electrooculography (EOG) signals have the advantages of high signal-to-noise ratio and high conductivity.

[0037] The signal preprocessing interleaves two stages of filtering with two stages of amplification for the acquired electromyography (EMG) and electrooculography (EOG) signals: low-pass filtering, first-stage amplification, high-pass filtering, and second-stage amplification. Specifically, for the EMG signals the order of the low-pass and high-pass filtering can be interchanged, and the first-stage and second-stage amplification factors are adjusted based on the signal quality before and after processing. Each amplification factor is greater than or equal to 10 and less than or equal to 200. The cutoff frequency of the high-pass filter is greater than or equal to 10 Hz and less than or equal to 50 Hz. The cutoff frequency of the low-pass filter is greater than or equal to 200 Hz and less than or equal to 600 Hz.

[0038] In a specific implementation, the order of the low-pass and high-pass filtering of the EOG signal can also be interchanged; the first-stage and second-stage amplification factors are adjusted according to the signal quality before and after processing; each amplification factor is greater than or equal to 10 and less than or equal to 200; the cutoff frequency of the high-pass filter is less than or equal to 20 Hz; the cutoff frequency of the low-pass filter is greater than or equal to 20 Hz and less than or equal to 100 Hz.

[0039] In specific implementations of the data preprocessing, in addition to digital filtering and normalization, a data scaling unit is added before or after digital filtering to adjust the numerical range; a data augmentation unit can also be added before normalization to adjust the numerical range while highlighting data features to facilitate normalization processing.

[0040] On the other hand, the innovation lies in the feature processing and feature selection parts;

[0041] In a specific implementation, the feature processing extracts the time-domain, frequency-domain, statistical, and transform-domain features of the preprocessed facial electromyography (EMG) data to obtain EMG extracted features; feature decomposition can also be used.

[0042] The time-domain features include, but are not limited to, electromyographic integral values, root mean square values, waveform length, and mean absolute values;

[0043] The frequency domain features include, but are not limited to, features after FFT and STFT transformation;

[0044] The statistical features include, but are not limited to, variance and slope features;

[0045] The transform domain features include, but are not limited to, CWD distribution features;

[0046] The decomposition features include, but are not limited to, SVD decomposition;
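
The four time-domain features named in paragraph [0042] have standard textbook definitions, which the following Python sketch computes for one window of samples. The function names and the example window are hypothetical, and the window length and overlap used by the invention are not stated in the text.

```python
import math

def integrated_emg(x):
    """Integrated EMG: sum of absolute sample values."""
    return sum(abs(v) for v in x)

def root_mean_square(x):
    """Root mean square amplitude of the window."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def waveform_length(x):
    """Cumulative absolute difference between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))

def mean_absolute_value(x):
    """Mean of absolute sample values."""
    return integrated_emg(x) / len(x)

def time_domain_features(window):
    """Collect the four features for one window of preprocessed samples."""
    return {
        "iemg": integrated_emg(window),
        "rms": root_mean_square(window),
        "wl": waveform_length(window),
        "mav": mean_absolute_value(window),
    }

print(time_domain_features([1, -2, 3, -4]))
```

For the window [1, -2, 3, -4], the integrated value is 10, the mean absolute value is 2.5, the waveform length is 15, and the root mean square is the square root of 7.5.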

[0047] The feature selection is specifically implemented as follows: receiving and fusing the processed electromyography and electrooculography data to obtain fused features; calculating the average MIC of the fused features; selecting the top N fused features ranked by the calculated average MIC values as filtered features; and aligning the filtered features to obtain aligned features.

[0048] The fusion includes one or more combinations of addition, series fusion, multi-level fusion, and multi-scale fusion; the fusion can yield new effective feature types.

[0049] In specific implementation, the screening can be based on the calculated average MIC, or it can employ statistical screening, model screening, and embedded screening. Statistical screening includes, but is not limited to, variance selection, chi-square test, mutual information, and recursive methods. Model screening includes one or a combination of Pandas module screening, Sklearn module screening, LSTM model screening, CNN model screening, and DNN model screening. Embedded screening includes one or a combination of filtering, wrapping, and tree-based methods.

[0050] In specific implementation of the present invention, by relying on the aforementioned feature processing and feature selection output aligned features, the number and types of effective features are significantly increased, and a more accurate recognition rate can be obtained compared with existing methods and systems.

[0051] Beneficial effects

[0052] The present invention proposes a method and system for classifying human facial expressions, which has the following advantages compared with existing expression classification methods and systems:

[0053] 1. The classification system relies on a host computer to extract time-domain, frequency-domain, and transform-domain features from the electromyography and electrooculography data after analog-to-digital conversion, calculate the correlation coefficients between features, select the features that meet the threshold, and then cross them using the Kronecker product; this feature expansion increases the number of significant features;

[0054] 2. The classification method and system calculate the average MIC of the electromyography and electrooculography data after feature processing, and then perform screening and feature alignment to obtain the selected features. This can increase the number of significant features while fusing effective feature types.

[0055] 3. By relying on feature processing and feature selection, the classification method and system significantly increase the number and types of effective output features, and can achieve a more accurate recognition rate compared with existing methods and systems.

Attached Figure Description

[0056] Figure 1 is a schematic diagram showing the composition and connection of a facial expression classification system according to the present invention;

[0057] Figure 2 is a flowchart of a method for classifying human facial expressions according to the present invention;

[0058] Figure 3 is a schematic diagram of the electrode distribution in a specific implementation of the facial expression classification system of the present invention;

[0059] Figure 4 is a diagram showing the SVM classification effect in an embodiment of the present invention.

Detailed Implementation

[0060] The following detailed description, in conjunction with the accompanying drawings and embodiments, illustrates the specific implementation of the facial expression classification method and system proposed in this invention.

[0061] Example 1

[0062] This invention proposes a method and system for classifying facial expressions, applicable to VR technology and human-computer interaction, for example enabling robots to better judge facial expression features and thus better determine human emotions. The system includes a facial bioelectric sensor, a hardware data filtering module, and a host computer. The facial bioelectric sensor collects facial electromyography (EMG) signals, electrooculography (EOG) signals, and posture data. The hardware data filtering module processes the EMG and EOG signals. The method extracts time-domain, frequency-domain, and transform-domain features from the EMG and EOG data obtained after analog-to-digital conversion. Correlation coefficients are calculated for the extracted features, and feature crossing and fusion are performed. The average maximal information coefficient (MIC) is then calculated, followed by filtering and feature alignment. The selected data is then classified to obtain expression types. By crossing and fusing features according to correlation coefficients and thresholds, the method and system increase the number and types of significant features. After statistical analysis, filtering, and alignment, classification achieves higher accuracy than existing methods and systems.

[0063] Figure 1 is a schematic diagram illustrating the composition and connection of a facial expression classification system according to the present invention. The classification system includes a signal acquisition unit, a signal preprocessing module (which implements hardware noise filtering and multi-stage amplification of each signal in hardware), a data preprocessing module, a feature processing module, and a machine learning expression classification module. The signal acquisition unit includes several electrodes and is connected to the signal preprocessing module, which is connected to the data preprocessing module, which is connected to the feature processing module, and the feature processing module is connected to the machine learning expression classification module. The feature processing module includes an electromyography (EMG) feature processing module and an electrooculography (EOG) feature processing module. The data preprocessing module outputs preprocessed data, primarily through digital filtering and normalization, for feature processing and selection.

[0064] When normalization is implemented, it includes row normalization, column normalization, and normalization based on various norms, such as the 2-norm.
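
Two of the normalization variants mentioned in paragraph [0064] can be sketched in Python as follows. Scaling each row by its 2-norm follows directly from the text; using min-max scaling for the column normalization is an assumption, since the text does not say which column-wise scheme is meant. The function names are hypothetical.

```python
import math

def l2_normalize_rows(matrix):
    """Scale each row to unit 2-norm; an all-zero row is left as zeros."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm if norm else 0.0 for v in row])
    return out

def minmax_normalize_columns(matrix):
    """Scale each column to [0, 1] (assumed scheme); a constant column becomes zeros."""
    scaled = []
    for col in zip(*matrix):
        lo, span = min(col), max(col) - min(col)
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled)]

print(l2_normalize_rows([[3, 4], [0, 0]]))                    # [[0.6, 0.8], [0.0, 0.0]]
print(minmax_normalize_columns([[1, 10], [3, 20], [2, 30]]))  # [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]]
```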

[0065] After data preprocessing, the electromyography (EMG) and electrooculography (EOG) data are processed by the EMG feature processing module and the EOG feature processing module, respectively, to extract features and calculate correlation coefficients. The preprocessed EMG and EOG data that meet the threshold are then cross-referenced to obtain the feature-processed EMG and EOG data. The feature-processed EMG and EOG data are then processed by the feature selection module for feature fusion, statistics, filtering, and alignment to obtain the feature-selected data. The feature-selected data are then classified by the machine learning expression classification module to obtain the expression type.

[0066] In specific implementation, the data acquisition unit includes several electrodes, which are prepared using MXene-based hydrogels. The preparation process has the advantages of simple technology, uniform distribution of MXene, and biocompatibility. The bioelectrical signals (including facial muscle electrical signals and electrooculography signals) acquired by the MXene-based hydrogel electrodes have high signal-to-noise ratio and high conductivity.

[0067] In practical implementation, the signal quality of the facial electromyography (EMG) and electrooculography (EOG) signals acquired by the MXene-based hydrogel electrodes is superior to that acquired by existing medical electrodes. The average signal-to-noise ratio of the bioelectrical signals acquired by medical electrodes is less than 24 dB, and the conductivity does not exceed 0.8 × 10⁻² S·cm⁻¹. In this embodiment, the average signal-to-noise ratio of the facial EMG and EOG signals acquired using MXene-based hydrogel electrodes is no less than 26 dB, and the conductivity is greater than 1 × 10⁻² S·cm⁻¹. Acquiring signals with a high signal-to-noise ratio and high conductivity is more conducive to subsequent data processing and classification, thus improving classification accuracy.

[0068] In practice, test data were collected from eight subjects, four men and four women. Each subject wore multiple electrodes distributed across the face, including four surface electromyography (sEMG) electrodes and four electrooculography (EOG) electrodes.

[0069] Electrode distribution in existing technologies: existing technologies use a large number of acquisition electrodes, which are commonly placed over the seven muscles most involved in facial expressions. With the reference electrode on the forehead, the electrodes are placed over the frontalis muscle, corrugator supercilii muscle, orbicularis oculi muscle, levator labii superioris alaeque nasi muscle, zygomaticus major muscle, orbicularis oris muscle, and depressor anguli oris muscle. Because the facial muscles are symmetric, seven pairs of electrodes are required, resulting in a large number of electrodes.

[0070] The traditional layout for acquiring electrooculography (EOG) signals places electrodes above and below the eye and to the left and right of the eyes, with the reference electrode located in the middle of the forehead.

[0071] The electrode distribution in this application is as follows: Combining the characteristics of electrooculogram (EOG) and electromyogram (EMG) signals, EOG signal electrodes are distributed on the left and right sides of both eyes and on the top and bottom of one eye to collect EOG signals. EMG signal electrodes are also distributed, with a total of four electrodes located on the levator labii superioris and orbital muscles. Existing electrode layouts only place four electrodes on the forehead without a reference electrode. The electrode layout in this application differs from existing layouts: For EOG signals, the traditional layout involves electrode patches placed on the top, bottom, left, and right sides of the eyes, with an additional electrode serving as a reference electrode.

[0072] In order to collect facial measurement data using a hardware acquisition system, the hardware system includes a facial bioelectric sensor, a hardware data noise filtering module, and a host computer;

[0073] The facial bioelectric sensor, also known as the data acquisition unit, includes an electromyography (EMG) signal acquisition unit and an electrooculography (EOG) signal acquisition unit.

[0074] The facial bioelectric sensor includes eight MXene-based hydrogel electrodes; the facial measurement data includes electromyography (EMG) signals, inertial sensing signals, and electrooculogram (EOG) signals, wherein the surface EMG signals are facial EMG signals and are acquired through an EMG signal acquisition unit; the EOG signals are acquired through an EOG signal acquisition unit.

[0075] Electrode distribution of the electrooculography (EOG) signal acquisition unit and the facial electromyography (EMG) signal acquisition unit:

[0076] The hardware data noise filtering module includes a facial surface electromyography (EMG) module, an electrooculogram (EOG) signal module, and a motherboard; the EMG module processes facial EMG signals; the EOG signal module processes EOG signals; and the motherboard is used for analog-to-digital conversion and power supply.

[0077] The signal preprocessing module, in its specific implementation, performs hardware noise filtering and multi-stage amplification of each signal from a hardware perspective, including an electromyography (EMG) signal preprocessing module and an electrooculography (EOG) signal preprocessing module. The EMG signal preprocessing module includes a low-pass filter circuit, a first-stage amplification circuit, a high-pass filter circuit, and a second-stage amplification circuit. After the facial muscle EMG signals are acquired, the low-pass filter circuit performs initial noise reduction, filtering out noise above 500Hz. The first-stage amplification circuit increases the circuit gain. The high-pass filter circuit further reduces noise, filtering out noise below 10Hz. The second-stage amplification circuit provides the final circuit gain. Specifically, the first-stage and second-stage amplification circuits amplify the EMG signals by 15 times and 20 times, respectively.

[0078] The electrooculography (EOG) signal preprocessing module preprocesses the EOG signal and includes a low-pass filter circuit, a first-stage amplifier circuit, a high-pass filter circuit, and a second-stage amplifier circuit. Because the EOG signal differs from the facial electromyography (EMG) signal, different cutoff frequencies are used: after acquisition, the EOG signal passes through the low-pass filter circuit, which performs initial noise reduction by filtering out noise above 38 Hz. The first-stage amplifier circuit increases the gain of the circuit. The high-pass filter circuit further reduces noise by filtering out noise below 0.5 Hz. The second-stage amplifier circuit provides the final gain of the circuit. In specific implementations, the first-stage and second-stage amplifier circuits amplify the EOG signal by 20 times and 30 times, respectively.

[0079] The measured electromyography and electrooculography signals are acquired by the signal preprocessing module, undergo analog-to-digital conversion, and then enter the data preprocessing module.

[0080] The data preprocessing includes digital filtering and normalization to obtain normalized facial electromyography (EMG) data. This includes low-pass filtering of the preprocessed facial EMG data converted to digital signals and high-pass filtering of the preprocessed electrooculography (EOG) data converted to digital signals. The filtered facial EMG data and EOG data are then normalized to obtain normalized facial EMG data and EOG data.

[0081] The normalization used to obtain the normalized facial electromyography data and the normalized electrooculography data is specifically implemented as L2 normalization;
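A minimal sketch of L2 normalization, assuming it is applied to one window of samples from a single channel (the source does not say whether normalization is per window, per channel, or per recording). The function name `l2_normalize` and the handling of an all-zero window are illustrative choices, not taken from the patent.

```python
import math


def l2_normalize(samples):
    """Scale a sequence so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(x * x for x in samples))
    if norm == 0.0:
        # Assumption: an all-zero window is returned as zeros instead of dividing by zero.
        return [0.0 for _ in samples]
    return [x / norm for x in samples]


# The norm of [3, 4] is 5, so each sample is divided by 5.
normalized = l2_normalize([3.0, 4.0])
```

After normalization every window has unit energy, which removes amplitude differences between subjects or sessions before features are compared.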

[0082] Feature extraction was performed, extracting features from electrooculography (EOG) signals and facial electromyography (EMG) signals respectively;

[0083] The facial electromyography (EMG) signal features include, but are not limited to, time-domain, frequency-domain, and transform-domain features; the time-domain features include common surface facial EMG time-domain features, such as: EMG integral value, root mean square value, waveform length, and mean absolute value;
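The four time-domain EMG features named above can be sketched as follows, using their conventional textbook definitions (the source names the features but does not give formulas, so the exact definitions here are an assumption). The function name `emg_time_features` is hypothetical.

```python
import math


def emg_time_features(window):
    """Conventional time-domain features of one non-empty EMG window."""
    n = len(window)
    iemg = sum(abs(x) for x in window)                # EMG integral value
    mav = iemg / n                                    # mean absolute value
    rms = math.sqrt(sum(x * x for x in window) / n)   # root mean square value
    # Waveform length: cumulative absolute difference between adjacent samples.
    wl = sum(abs(window[k + 1] - window[k]) for k in range(n - 1))
    return {"iemg": iemg, "mav": mav, "rms": rms, "wl": wl}


features = emg_time_features([1.0, -1.0, 2.0, -2.0])
```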

[0084] The time-domain features of the electrooculogram signal include, but are not limited to, the difference between peak and trough values, the energy integral value, and the difference between peak and trough positions. In specific implementations, frequency domain features, transform domain features, statistical features, and entropy features can also be extracted. The frequency domain features include FFT and STFT features. The transform domain features include wavelet features. The statistical features include, but are not limited to, mean, variance, and higher-order statistical features.
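A possible reading of the three EOG time-domain features, given as a sketch. The source does not define them precisely, so the following are assumptions: the peak-trough difference is the maximum minus the minimum, the position difference is the peak index minus the trough index (in samples), and the energy integral is the sum of squared samples times the sampling interval `dt`. The function name `eog_time_features` is hypothetical.

```python
def eog_time_features(window, dt=1.0):
    """Peak-trough amplitude difference, position difference, and energy integral."""
    indices = range(len(window))
    peak_idx = max(indices, key=lambda k: window[k])
    trough_idx = min(indices, key=lambda k: window[k])
    vd = window[peak_idx] - window[trough_idx]    # peak-trough value difference
    ld = peak_idx - trough_idx                    # peak-trough position difference
    energy = sum(x * x for x in window) * dt      # discrete energy integral
    return {"vd": vd, "ld": ld, "i": energy}


eog = eog_time_features([0.0, 2.0, 1.0, -3.0, 0.0])
```

Under these definitions the sign of `ld` tells whether the peak precedes the trough, which may help distinguish opposite eye movements.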

[0085] The extracted electrooculography (EOG) features are fed into an EOG correlation coefficient calculation unit, which calculates the EOG correlation coefficient. The EOG correlation coefficient judgment unit then compares the calculated coefficient with a threshold. If the EOG correlation coefficient is less than the threshold, the feature is crossed. In practice, the EOG correlation coefficient threshold is set to 0.5; a correlation below this value indicates that the feature is important and needs to be expanded in the EOG feature crossing unit. One feature expansion method is to compute the Kronecker product of the feature with itself; another is to first apply a nonlinear transformation, such as tanh, to the feature and then compute the Kronecker product of the transformed feature with itself.

[0086] Similarly, the extracted facial electromyography (EMG) features are fed into the EMG correlation coefficient calculation unit, which calculates the EMG correlation coefficient. The EMG correlation coefficient judgment unit then compares the calculated coefficient with a threshold. If the EMG correlation coefficient is less than the threshold, the feature is crossed. In practice, the EMG correlation coefficient threshold is set to 0.45; a correlation below this value indicates that the feature is important and needs to be expanded by the EMG feature crossing unit. One feature expansion method is to compute the Kronecker product of the feature with itself; another is to first apply a nonlinear transformation, such as tanh, to the feature and then compute the Kronecker product of the transformed feature with itself.
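The threshold test and Kronecker expansion described above might look like the sketch below. Several points are assumptions because the source leaves them open: the correlation is taken to be the absolute Pearson coefficient between a feature vector and some reference vector (the source does not say which quantities are correlated), and the names `pearson`, `kron_self`, and `cross_if_uncorrelated` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # Assumption: a constant vector is treated as uncorrelated.
    return cov / (sx * sy)


def kron_self(vector, use_tanh=False):
    """Kronecker product of a vector with itself, optionally after a tanh transform."""
    if use_tanh:
        vector = [math.tanh(v) for v in vector]
    return [a * b for a in vector for b in vector]


def cross_if_uncorrelated(feature, reference, threshold, use_tanh=False):
    """Expand the feature when |r| is below the threshold, otherwise pass it through."""
    r = abs(pearson(feature, reference))
    return kron_self(feature, use_tanh) if r < threshold else list(feature)


kept = cross_if_uncorrelated([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], threshold=0.5)
expanded = cross_if_uncorrelated([1.0, 2.0, 3.0], [1.0, -2.0, 1.0], threshold=0.45)
```

A vector of length n grows to length n² when crossed, which is why the later alignment step has to reconcile features of different dimensions.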

[0087] The features output by the electromyography feature crossover unit and the electrooculography feature crossover unit are fed into the feature fusion unit for feature fusion. The feature fusion includes, but is not limited to, additive fusion, product fusion, and fusion based on self-attention mechanism.
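Additive and product fusion can be illustrated with element-wise operations, as in the sketch below. It assumes both feature vectors already have the same length, which the source does not guarantee; fusion based on a self-attention mechanism is not shown. The function names are hypothetical.

```python
def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("element-wise fusion needs vectors of equal length")


def additive_fusion(emg_features, eog_features):
    """Element-wise sum of two feature vectors."""
    _check_same_length(emg_features, eog_features)
    return [x + y for x, y in zip(emg_features, eog_features)]


def product_fusion(emg_features, eog_features):
    """Element-wise product of two feature vectors."""
    _check_same_length(emg_features, eog_features)
    return [x * y for x, y in zip(emg_features, eog_features)]


added = additive_fusion([1.0, 2.0], [3.0, 4.0])
multiplied = product_fusion([1.0, 2.0], [3.0, 4.0])
```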

[0088] The features obtained after fusion are first processed by the average MIC calculation unit, then input into the feature filtering unit to filter features, and finally input into the feature alignment unit to obtain aligned features for machine learning classification.

[0089] The feature selection, in specific implementation, includes, but is not limited to, feature selection based on NCA and feature selection based on mRMR;

[0090] The feature alignment unit supplements features of different dimensions and forms a set of vectors, which are then fed into the machine learning expression classification module.
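One simple way to "supplement" features of different dimensions is to pad the shorter vectors to the length of the longest one. The sketch below assumes zero padding at the end; the source does not state the padding value or position, and `align_features` is a hypothetical name.

```python
def align_features(vectors, pad_value=0.0):
    """Pad every vector with pad_value so that all share the longest length."""
    target = max(len(v) for v in vectors)
    return [list(v) + [pad_value] * (target - len(v)) for v in vectors]


aligned = align_features([[1.0, 2.0, 3.0], [4.0]])
```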

[0091] The machine learning expression classification module uses commonly used machine learning classification algorithms such as Support Vector Machine (SVM) to effectively classify facial expressions.

[0092] The specific classification steps of the SVM are to map the data into a high-dimensional space and find the optimal hyperplane, thereby achieving classification and regression; the optimal hyperplane maximizes the margin between different classes, so that the data points of different classes are separated as far as possible.

[0093] Test data were collected from eight subjects (four men and four women). Each subject wore multiple electrodes distributed across the face. As shown in Figure 3, the electrode distribution in this embodiment is as follows: the electrodes include 4 surface electromyography (EMG) electrodes and 4 electrooculography (EOG) signal acquisition electrodes; the EMG electrodes are distributed as ①, ②, ③, and ④; the EMG electrodes are distributed as ⑤, ⑥, ⑦, and ⑧, with ⑨ being the reference ground; the electrodes are worn around both eyes. Test data are collected from the subject under various facial expressions. Specifically, 3 sets of lip movements and 5 sets of eye movements are first collected from the subject. Each facial expression is then shown to the subject in turn, and the subject repeats the given expression 4 times; each acquisition lasts 7 seconds: the expression starts at the 2nd second, lasts 3 seconds, and ends at the 5th second, followed by a 2-second rest period.

[0094] As shown in Figure 2, the facial expression classification method includes the following steps:

[0095] S1. Acquire facial electromyography (EMG) and electrooculography (EOG) signals;

[0096] S1 acquires facial electromyography and electrooculography signals using multiple electrodes distributed on the face;

[0097] In practical implementation, the data acquisition equipment for collecting facial electromyography (EMG) and electrooculography (EOG) signals also includes a wireless transmission unit for transmitting facial expression measurement data to a data processing terminal, such as a PC or a remote server connected via a network. Wireless transmission methods include mobile communication networks, Wi-Fi networks, or Bluetooth.

[0098] S2. Perform signal preprocessing on facial electromyography (EMG) signals and electrooculography (EOG) signals respectively;

[0099] The preprocessing of electromyographic signals in S2 includes the following sub-steps:

[0100] S21A performs analog low-pass filtering on the electromyography signal;

[0101] S21B performs the first-stage amplification on the analog low-pass filtered electromyography signal;

[0102] S21C performs analog high-pass filtering on the first-stage amplified electromyography signal;

[0103] S21D performs the second-stage amplification on the analog high-pass filtered electromyography signal;

[0104] The preprocessing of the electrooculogram signal in step S2 includes the following sub-steps:

[0105] S22A performs analog low-pass filtering on the electrooculogram signal;

[0106] S22B performs the first-stage amplification on the analog low-pass filtered electrooculogram signal;

[0107] S22C performs analog high-pass filtering on the first-stage amplified electrooculogram signal;

[0108] S22D performs the second-stage amplification on the analog high-pass filtered electrooculogram signal;

[0109] In specific implementation, the low-pass and high-pass filters of S21A and S21C, as well as S22A and S22C, can be interchanged; the first-stage and second-stage amplification factors of S21B and S21D, as well as S22B and S22D, are adjusted according to the signal quality before and after processing; each of the first-stage and second-stage amplification factors is greater than or equal to 10 and less than or equal to 200.
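As a worked check of the amplification range, the sketch below validates each stage against the 10 to 200 limit and multiplies the two stages to get the overall gain. It assumes ideal cascaded stages with no loss in the filter passbands; the 15×/20× and 20×/30× values are the ones given in [0077] and [0078], and `total_gain` is a hypothetical helper.

```python
def total_gain(stage1, stage2, lowest=10, highest=200):
    """Overall gain of two cascaded stages, each limited to [lowest, highest]."""
    for gain in (stage1, stage2):
        if not lowest <= gain <= highest:
            raise ValueError(f"stage gain {gain} is outside [{lowest}, {highest}]")
    return stage1 * stage2


emg_gain = total_gain(15, 20)  # EMG chain: 15 x 20 = 300
eog_gain = total_gain(20, 30)  # EOG chain: 20 x 30 = 600
```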

[0110] In S2, the facial electromyography (EMG) and electrooculography (EOG) signals are each amplified and filtered. Specifically, the facial EMG signals are passed through a low-pass filter circuit, which performs preliminary noise reduction by filtering out noise above 500 Hz. The first-stage amplifier circuit increases the gain of the circuit. The high-pass filter circuit further reduces noise by filtering out noise below 10 Hz. The second-stage amplifier circuit provides the final gain of the circuit.

[0111] S3. Perform analog-to-digital conversion on the preprocessed facial electromyography (EMG) signals and electrooculography (EOG) signals respectively.

[0112] S4. Perform data preprocessing on the facial electromyography and electrooculography data obtained after analog-to-digital conversion;

[0113] In S4, the facial electromyography (EMG) data obtained after analog-to-digital conversion is preprocessed with digital filtering and normalization to obtain normalized facial EMG data. Specifically, this includes: performing baseline removal processing on the facial EMG data followed by high-pass filtering; performing low-pass filtering on the electrooculogram (EOG) data; in practice, the baseline of the facial EMG data is removed (i.e., the mean is subtracted) to obtain baseline-removed facial EMG data, and then high-pass filtering is performed to filter out low-frequency noise at the digital frequency corresponding to 10Hz.
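The baseline removal and digital high-pass step for EMG data can be sketched as follows. The filter shown is a simple first-order (single-pole) high-pass chosen for brevity; the source does not specify the filter type or order, and the 1000 Hz sampling rate is a placeholder assumption. The function names are hypothetical and the input is assumed to be non-empty.

```python
import math


def remove_baseline(samples):
    """Subtract the mean so that the signal is centered on zero."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


def highpass(samples, cutoff_hz=10.0, fs_hz=1000.0):
    """First-order digital high-pass filter (discretized RC filter)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs_hz
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for k in range(1, len(samples)):
        # Each output keeps a decayed memory plus the change in the input.
        out.append(alpha * (out[-1] + samples[k] - samples[k - 1]))
    return out


centered = remove_baseline([1.0, 2.0, 3.0])
decayed = highpass([1.0] * 1000)  # a constant (DC) input decays toward zero
```

A steeper filter, for example a higher-order Butterworth design, would reject low-frequency drift more sharply; the single-pole form only illustrates where the 10 Hz cutoff enters the computation.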

[0114] S5. Perform feature processing on the facial electromyography data and electrooculography data obtained after data preprocessing, respectively.

[0115] S5 performs feature processing on the facial electromyography (EMG) data obtained after data preprocessing. Specifically, it extracts time-domain, frequency-domain, statistical, and transform-domain features from the preprocessed facial EMG data to obtain EMG extraction features; it calculates the EMG correlation coefficient of the EMG extraction features, and determines whether the calculated EMG correlation coefficient is less than the EMG correlation coefficient threshold. If it is less than the EMG correlation coefficient threshold, the EMG extraction features corresponding to the EMG correlation coefficient are cross-referenced to obtain EMG data after feature processing. If it is greater than or equal to the EMG correlation coefficient threshold, the EMG extraction features corresponding to the EMG correlation coefficient are used as the EMG data after feature processing.

[0116] In specific implementation, the electromyography (EMG) extraction features are extracted from four different channels of facial EMG data and four different channels of electrooculography (EOG) data. Based on the different temporal characteristics of different eye movements and the differences in EOG amplitude, the difference between the peak and trough values of CH1 and CH2, and the energy integral value of the signal, are used as classification features. Common temporal features of the surface EMG signals of CH3 and CH4 are extracted. These common temporal features include EMG integral value, root mean square value, waveform length, and mean absolute value.

[0117] The electrooculogram (EOG) data obtained after data preprocessing is subjected to feature processing, specifically: extracting time-domain, frequency-domain, statistical, and transform-domain features from the preprocessed EOG data to obtain EOG extracted features; calculating the EOG correlation coefficient of the EOG extracted features; determining whether the correlation coefficient calculated by the EOG correlation coefficient calculation unit is less than the EOG correlation coefficient threshold; if it is less than the EOG correlation coefficient threshold, the EOG extracted features corresponding to the EOG correlation coefficient are cross-referenced to obtain the feature-processed EOG data; otherwise, if it is greater than or equal to the EOG correlation coefficient threshold, the EOG extracted features corresponding to the EOG correlation coefficient are used as the feature-processed EOG data.

[0118] S6. Perform feature selection on the facial electromyography and electrooculography features obtained after feature processing to obtain aligned features;

[0119] S6 performs feature selection on the facial electromyography (EMG) and electrooculography (EOG) features obtained after feature processing. Specifically, it receives the EMG and EOG data after feature processing and fuses them to obtain fused features; calculates the average MIC of the fused features; selects the top N fused features corresponding to the calculated average MIC values as filtered features; aligns the filtered features to obtain aligned features; the fusion includes one or more combinations of addition, serialization, multi-level fusion, and multi-scale fusion.
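The top-N selection in S6 reduces to ranking features by their average MIC score, as in the sketch below. Computing MIC itself is not shown: the scores are treated as precomputed inputs, and the feature names and score values in the example are made up purely for illustration. `select_top_n` is a hypothetical helper.

```python
def select_top_n(feature_names, avg_mic, n):
    """Return the n features with the highest average MIC, in original order."""
    ranked = sorted(range(len(avg_mic)), key=lambda k: avg_mic[k], reverse=True)
    kept = sorted(ranked[:n])  # restore the original feature order
    return [feature_names[k] for k in kept]


# Illustrative scores only; they are not results from the source.
selected = select_top_n(["a", "b", "c", "d"], [0.2, 0.9, 0.5, 0.7], n=2)
```

Keeping the selected features in their original order means the same index refers to the same feature in every sample, which is what the alignment step relies on.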

[0120] The fusion can yield new effective feature types; the filtering, in specific implementation, can be based on the average MIC calculated in this embodiment, or it can employ statistical filtering, model filtering, and embedded filtering; the statistical filtering includes one or more of variance selection, chi-square test, mutual information, and recursive methods; the model filtering includes one or a combination of Pandas module filtering, Sklearn module filtering, LSTM model filtering, CNN model filtering, and DNN model filtering; the embedded filtering includes one or a combination of filtering, wrapping, and tree-based methods.

[0121] S7. The aligned features obtained after feature selection are classified using a machine learning model to obtain the expression type.

[0122] In step S7, the specific classification steps of the SVM are to map the data into a high-dimensional space and find the optimal hyperplane, thereby achieving classification and regression; the optimal hyperplane maximizes the margin between different classes, so that the data points of different classes are separated as far as possible.

[0123] S2, S3, and S4 perform signal preprocessing, analog-to-digital conversion, and data preprocessing on the acquired signals, respectively. Signal preprocessing includes dual filtering and amplification, using a low-pass filter to remove noise above 500Hz and a high-pass filter to remove noise below 10Hz. The filtered electromyography (EMG) and electrooculography (EOG) signals are then converted from analog to digital and normalized to obtain normalized facial EMG and EOG data. Next, various extraction methods are used to extract features from the preprocessed facial EMG and EOG data.

[0124] S5 extracts the temporal features of the electrooculogram (EOG) signal, including the difference between peak and trough values, the energy integral value, and the peak-trough position difference. Facial electromyography (EMG) signal features are extracted as surface electromyography (SEMG) temporal features, including EMG integral value, root mean square value, waveform length, and mean absolute value.

[0125] The extracted features form a one-dimensional vector of length fourteen, and the combination method is as follows: where vd is the peak-to-trough difference, ld is the peak-to-trough position difference, i is the energy integral, mav is the mean absolute value, wl is the waveform length, rms is the root mean square, and iemg is the integrated value of the facial EMG (fEMG); 1, 2, 3, and 4 are the channel numbers.
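The length of fourteen is consistent with three EOG features on each of two channels plus four EMG features on each of two channels (2 × 3 + 2 × 4 = 14), matching the CH1/CH2 and CH3/CH4 assignment in [0116]. The sketch below assembles such a vector; the ordering of the entries is an assumption, since the combination formula is not reproduced in the text, and `build_feature_vector` is a hypothetical name.

```python
EOG_KEYS = ("vd", "ld", "i")             # per EOG channel (assumed CH1, CH2)
EMG_KEYS = ("mav", "wl", "rms", "iemg")  # per EMG channel (assumed CH3, CH4)


def build_feature_vector(eog_channels, emg_channels):
    """Concatenate per-channel feature dicts into one flat vector."""
    vector = []
    for channel in eog_channels:
        vector.extend(channel[key] for key in EOG_KEYS)
    for channel in emg_channels:
        vector.extend(channel[key] for key in EMG_KEYS)
    return vector


# Placeholder values, used only to show the resulting layout.
eog = [{"vd": 1.0, "ld": 2.0, "i": 3.0}, {"vd": 4.0, "ld": 5.0, "i": 6.0}]
emg = [{"mav": 7.0, "wl": 8.0, "rms": 9.0, "iemg": 10.0},
       {"mav": 11.0, "wl": 12.0, "rms": 13.0, "iemg": 14.0}]
vector = build_feature_vector(eog, emg)
```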

[0126] Using the previously extracted 14-dimensional features, 256 samples were collected for 8 facial expressions. After the feature processing, fusion, and selection described in S5 and S6, the data from each subject's last action were used as test samples, and the data from the first three actions were used as training samples for training the four algorithms. The four algorithms employed were the machine learning classification algorithms Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (KNN), and Linear Discriminant Analysis (LDA). The classification results are shown in Figure 4.
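The sample count can be cross-checked with a short sketch. It assumes the 256 samples arise from 8 subjects × 8 expressions × 4 repetitions, which matches the numbers stated in the description but is an inference, and that the fourth repetition of every subject and expression forms the test set.

```python
SUBJECTS, EXPRESSIONS, REPETITIONS = 8, 8, 4

# One (subject, expression, repetition) tuple per recorded sample.
samples = [(s, e, r)
           for s in range(SUBJECTS)
           for e in range(EXPRESSIONS)
           for r in range(REPETITIONS)]

# Hold out the last repetition for testing; train on the first three.
train = [x for x in samples if x[2] < REPETITIONS - 1]
test = [x for x in samples if x[2] == REPETITIONS - 1]
```

Under these assumptions the split is 192 training samples and 64 test samples, with every subject and every expression represented in both sets.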

[0127] Machine learning algorithms were used to classify facial expressions to evaluate the performance of the collected data for expression classification. The performance results are shown in Table 1.

[0128] Table 1 Test results of different classification algorithms

[0129]

[0130] Accuracy is the percentage of model outputs that agree with the actual results, that is, the percentage of samples that are correctly classified. As shown in Table 1, the SVM accuracy is 95.3%, and this model performed best in this embodiment.

[0131] The above description is merely a preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in the embodiments and drawings. Any equivalents or modifications made without departing from the spirit of the present invention fall within the scope of protection of the present invention.

Claims

1. A facial expression classification system, comprising a data acquisition unit, a hardware data noise filtering module, a data preprocessing module, and a machine learning expression classification module, wherein the data acquisition unit includes several electrodes, the data acquisition unit is connected to the hardware data noise filtering module, the hardware data noise filtering module is connected to the data preprocessing module, and the output of the data preprocessing module is used for feature processing and selection, characterized in that, It also includes an electromyography (EMG) feature processing module, an electrooculography (EOG) feature processing module, and a feature selection module; the data acquisition unit acquires and outputs facial EMG and EOG signals; the facial EMG and EOG signals are respectively fed into the hardware data noise filtering module and the data preprocessing module to obtain preprocessed EMG and EOG data. After preprocessing, the electromyography (EMG) and electrooculography (EOG) data are processed by the EMG feature processing module and the EOG feature processing module, respectively, to extract features and calculate correlation coefficients. The preprocessed EMG and EOG data that meet the threshold are then cross-referenced to obtain the feature-processed EMG and EOG data. The feature-processed EMG and EOG data are then processed by the feature selection module to perform feature fusion, statistics, filtering, and alignment to obtain the feature-selected data. After feature selection, the data is classified by the machine learning expression classification module to obtain the expression type; the electromyography feature processing module includes an electromyography feature extraction unit, and the electrooculography feature processing module includes an electrooculography feature extraction unit. 
The electromyography (EMG) feature processing module further includes an EMG correlation coefficient calculation unit, an EMG correlation coefficient judgment unit, and an EMG feature crossing unit. The EMG correlation coefficient calculation unit calculates the EMG correlation coefficient of the extracted EMG features. The EMG correlation coefficient judgment unit determines whether the calculated EMG correlation coefficient is less than the EMG correlation coefficient threshold. If it is less, the extracted EMG features corresponding to the EMG correlation coefficient are sent to the EMG feature crossing unit for crossing to obtain the EMG data after feature processing. If it is greater than or equal to the EMG correlation coefficient threshold, the extracted EMG features corresponding to the EMG correlation coefficient are sent to the feature selection module as the EMG data after feature processing. The electrooculography (EOG) feature processing module further includes an EOG correlation coefficient calculation unit, an EOG correlation coefficient judgment unit, and an EOG feature crossing unit; the EOG correlation coefficient calculation unit calculates the EOG correlation coefficient of the extracted EOG features; the EOG correlation coefficient judgment unit determines whether the calculated EOG correlation coefficient is less than the EOG correlation coefficient threshold. 
If it is less than the threshold, the extracted EOG features corresponding to the EOG correlation coefficient are sent to the EOG feature crossing unit for crossing to obtain the EOG data after feature processing; if it is greater than or equal to the EOG correlation coefficient threshold, the extracted EOG features corresponding to the EOG correlation coefficient are sent to the feature selection module as the EOG data after feature processing. The crossing in the EMG feature crossing unit and the EOG feature crossing unit is implemented using the Kronecker product.

2. The facial expression classification system according to claim 1, characterized in that, The electromyography (EMG) feature extraction unit extracts the temporal, frequency, statistical, and transform domain features of the facial EMG data after data preprocessing to obtain EMG extraction features; the electrooculography (EOG) feature extraction unit extracts the temporal, frequency, statistical, and transform domain features of the EOG data after data preprocessing to obtain EOG extraction features.

3. A facial expression classification system according to claim 2, characterized in that, the feature selection module includes a feature fusion unit, an average MIC calculation unit, a feature filtering unit, and a feature alignment unit. The feature fusion unit receives the processed electromyographic data and electrooculographic data from the electromyographic feature processing module and the electrooculographic feature processing module, respectively, and fuses them to obtain the fused features. The average MIC calculation unit calculates the average MIC of the fused features; the feature filtering unit filters the fused features based on the calculated average MIC to obtain the filtered features. The feature alignment unit aligns the filtered features to obtain aligned features; the aligned features are then input into the machine learning facial expression classification module.

4. A facial expression classification system according to claim 3, characterized in that, The feature fusion unit employs one or more combinations of the following fusion methods: addition, serialization, multi-level fusion, and multi-scale fusion.

5. A method for classifying facial expressions, characterized in that, Includes the following steps: S1. Collect and acquire facial electromyography (EMG) and electrooculography (EOG) signals; S2. Perform signal preprocessing on facial electromyography (EMG) signals and electrooculography (EOG) signals respectively; S3. Perform analog-to-digital conversion on the preprocessed facial electromyography (EMG) signals and electrooculography (EOG) signals respectively. S4. Perform data preprocessing on the facial electromyography and electrooculography data obtained after analog-to-digital conversion; S5. Perform feature processing on the facial electromyography data and electrooculography data obtained after data preprocessing, respectively. S6. Perform feature selection on the facial electromyography and electrooculography features obtained after feature processing to obtain aligned features; S7. The aligned features obtained after feature selection are classified using a machine learning model to obtain the expression type; S5 performs feature processing on the facial electromyography data obtained after data preprocessing to obtain electromyography extraction features. Calculate the electromyographic correlation coefficient of the extracted electromyographic features, and determine whether the calculated electromyographic correlation coefficient is less than the electromyographic correlation coefficient threshold. If it is less than the electromyographic correlation coefficient threshold, cross the electromyographic extracted features corresponding to the electromyographic correlation coefficient to obtain the electromyographic data after feature processing; otherwise, if it is greater than or equal to the electromyographic correlation coefficient threshold, the electromyographic extracted features corresponding to the electromyographic correlation coefficient are used as the electromyographic data after feature processing. 
The electrooculogram (EOG) data obtained after data preprocessing is subjected to feature processing to obtain EOG extracted features. Calculate the electrooculogram correlation coefficient of the extracted electrooculogram features; If the correlation coefficient calculated by the electrooculogram correlation coefficient calculation unit is less than the electrooculogram correlation coefficient threshold, then the electrooculogram extracted features corresponding to the electrooculogram correlation coefficient are cross-referenced to obtain the electrooculogram data after feature processing; otherwise, if it is greater than or equal to the electrooculogram correlation coefficient threshold, then the electrooculogram extracted features corresponding to the electrooculogram correlation coefficient are used as the electrooculogram data after feature processing. The electromyography (EMG) extraction feature crossover and electrooculography (EOG) extraction feature crossover are implemented using the Kronecker product.

6. The facial expression classification method according to claim 5, characterized in that, S2 performs signal preprocessing on the facial electromyography (EMG) signal, specifically: performing analog low-pass filtering on the EMG signal; performing a first-stage amplification on the analog low-pass filtered EMG signal; performing analog high-pass filtering on the first-stage amplified EMG signal; and performing a second-stage amplification on the analog high-pass filtered EMG signal to obtain the preprocessed facial EMG signal. Similarly, S2 performs signal preprocessing on the electrooculography (EOG) signal, specifically: performing analog low-pass filtering on the EOG signal; performing a first-stage amplification on the analog low-pass filtered EOG signal; performing analog high-pass filtering on the first-stage amplified EOG signal; and performing a second-stage amplification on the analog high-pass filtered EOG signal to obtain the preprocessed EOG signal. S4 performs data preprocessing on the facial EMG data obtained after analog-to-digital conversion. Specifically, it performs digital filtering on the EMG data after analog-to-digital conversion and normalizes the digitally filtered EMG data to obtain preprocessed EMG data. It also performs data preprocessing on the EOG data obtained after analog-to-digital conversion. Specifically, it performs digital filtering on the EOG data after analog-to-digital conversion and normalizes the digitally filtered EOG data to obtain preprocessed EOG data.

7. The facial expression classification method according to claim 6, characterized in that, The electromyography (EMG) extraction features described in S5 are obtained by extracting the time domain, frequency domain, statistical, and transform domain features of the facial EMG data after data preprocessing. The electrooculogram (EOG) extraction features are specifically obtained by extracting the time domain, frequency domain, statistical, and transform domain features of the EOG data after data preprocessing.

8. A method for classifying facial expressions according to claim 7, characterized in that, S6 performs feature selection on the facial electromyography (EMG) and electrooculography (EOG) features obtained after feature processing. Specifically, it receives the EMG and EOG data after feature processing and fuses them to obtain the fused features; calculates the average MIC of the fused features; selects the top N fused features corresponding to the calculated average MIC values as the filtered features; and aligns the filtered features to obtain aligned features. The fusion of the EMG and EOG data after feature processing uses one or more combinations of addition, serial fusion, multi-level fusion, and multi-scale fusion.