Non-contact cognitive load evaluation method based on infrared thermal imaging

An instance segmentation network and the inter-frame difference method locate the nostril region, and infrared thermal imaging combined with a classification model then assesses cognitive load. This addresses the inaccurate assessment and the contact-based measurement of existing technologies, giving an accurate assessment that is unobtrusive and non-invasive.

CN116807413B · Active · Publication Date: 2026-06-23 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Filing Date
2023-07-27
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies for assessing cognitive load suffer from problems such as lack of universality, high subjectivity, or the need for contact devices. In particular, when the face is obscured or the angle of deflection is large, infrared thermal imaging cannot accurately locate the nose area, resulting in inaccurate non-contact respiratory signal measurements.

Method used

An instance segmentation network is used to locate the nasal region. Combined with inter-frame difference and thresholding, respiratory signal features are extracted. Respiratory signals are measured non-contactly using infrared thermal imaging technology. A cognitive load level classification model is constructed, and recursive feature elimination and random forest algorithms are used for feature selection and evaluation.

Benefits of technology

It enables accurate and objective assessment of cognitive load under non-invasive and non-contact conditions, improves the robustness and accuracy of non-contact respiratory measurement, adapts to different environments and human movements, and reduces the influence of individual differences.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a non-contact cognitive load evaluation method based on infrared thermal imaging, which comprises the following steps: using an infrared thermal imager to shoot facial videos of a tester in a sitting state and when performing a cognitive task; positioning a nostril area in the facial images and extracting original breathing signals from temperature changes of the nostril area; after denoising the original signals, extracting breathing features in a time domain and a frequency domain; based on the extracted breathing features, performing feature selection, and selecting a random forest to construct a cognitive load level classification model; extracting breathing feature samples of the tester and inputting the breathing feature samples into the cognitive load level classification model to evaluate cognitive load levels of the tester. The application provides a non-contact cognitive load evaluation method which is objective, comfortable and robust.

Description

Technical Field

[0001] This invention relates to the field of signal processing technology, and more specifically to a non-contact cognitive load assessment method based on infrared thermal imaging.

Background Technology

[0002] Individuals experience cognitive load during the cognitive process; the more complex the cognitive process, the higher the level of cognitive load. Moderate cognitive load can promote work or learning progress, while excessively high or low cognitive load will reduce an individual's work or learning efficiency. Assessing cognitive load provides data support for developing reasonable work or learning strategies, which has significant practical implications in the information age.

[0003] Currently, there are three main methods for assessing cognitive load: task performance evaluation, subjective scales, and physiological assessment. Task performance evaluation requires setting specific performance indicators based on the task type; when the task changes, these indicators need to be reset, lacking universality. Subjective scales are simple to implement, requiring test takers to recall their subjective feelings during the task for evaluation, making them overly subjective. Physiological assessment offers accuracy and objectivity, but requires test takers to wear contact devices to collect physiological signals, which can cause discomfort and is not conducive to long-term cognitive load monitoring.

[0004] Breathing, as a physiological signal, is an effective indicator for assessing cognitive load. Among commonly used non-contact measurement techniques for respiratory signals, imaging photoplethysmography (IPPG) performs poorly in low-light environments, and technologies such as Wi-Fi CSI (Channel State Information) and Doppler radar are highly sensitive to spontaneous movements of the test subject. Infrared thermal imaging, however, collects respiratory signals by capturing temperature changes during breathing and is usable in dark environments. With appropriate processing, accurate measurements can be achieved even when the subject moves spontaneously, making it an effective non-contact respiratory signal measurement technique that overcomes the problems associated with contact measurements.

[0005] In respiratory signal measurement based on infrared thermal imaging, existing methods locate the nose region by detecting facial landmarks or using facial structural information, all of which require complete facial information. However, when the face is occluded or turned at an excessive angle, the nose region cannot be located.

Summary of the Invention

[0006] The purpose of this invention is to address the aforementioned deficiencies in the prior art and provide a non-contact cognitive load assessment method based on infrared thermal imaging. This invention, by introducing an instance segmentation network, solves the problem that existing methods cannot locate the nose region when the face is occluded or the angle of deflection is too large, effectively improving the robustness of non-contact respiratory measurement based on infrared thermal imaging, and enabling accurate and objective cognitive load assessment without any discomfort or invasiveness.

[0007] The objective of this invention can be achieved by adopting the following technical solutions:

[0008] A non-contact cognitive load assessment method based on infrared thermal imaging, the assessment method comprising the following steps:

[0009] S1. Use an infrared thermal imager to capture facial videos of the test subject while seated and during cognitive tasks, and obtain facial images by segmenting them into frames.

[0010] S2. Locate the nostril region from the facial image and extract the raw respiratory signal;

[0011] S3. Denoise the original respiratory signal and extract respiratory features in the time and frequency domains;

[0012] S4. Based on the extracted respiratory features, feature selection is performed to construct a cognitive load level classification model;

[0013] S5. Extract respiratory feature samples from the test subject and input them into the cognitive load level classification model to assess the test subject's cognitive load level.

[0014] Furthermore, in step S1, an infrared thermal imager is used to capture facial videos of the test subject while seated and during the cognitive task, and the process of obtaining facial images by segmenting the frames is as follows:

[0015] S101. Place an infrared thermal imager directly in front of the test subject, ensuring that the test subject's face is within the field of view of the infrared thermal imager and that the nose is not obstructed. Take facial videos of the test subject in a seated state, performing a low-load cognitive task, and performing a high-load cognitive task.

[0016] S102. The captured facial video is divided into a series of facial images in chronological order. This frame division process is to facilitate the location of the nostrils in the images in subsequent steps.

[0017] Furthermore, the process of locating the nostril region from the facial image and extracting the original respiratory signal in step S2 is as follows:

[0018] S201. On each self-acquired infrared thermal imaging facial image, the nose region is marked with a rectangle using the Labelme annotation tool to construct a facial nose region dataset. The purpose of this step is to enable the instance segmentation network in step S202 to learn the features of the facial nose region. Therefore, the nose region in the facial images must be annotated in advance to construct the dataset required for training and testing the instance segmentation network. Labelme is a commonly used open-source image annotation tool in the field.

[0019] S202. Train the instance segmentation network using the constructed facial nose region dataset. Divide the facial nose region dataset into a training set and a test set in a 4:1 ratio. Use the training set to train the instance segmentation network, and use the test set to test the performance of the instance segmentation network. After training, the instance segmentation network can automatically locate the nose region. Considering the high frame rate of facial videos and the large number of images that need to be processed, the YOLACT instance segmentation network, which combines speed and accuracy, was selected to locate the facial nose region. This step involves training with a large amount of data to obtain an instance segmentation network that can automatically locate the facial nose region.

[0020] S203. For a series of facial images obtained after segmenting the same facial video into frames, use the trained instance segmentation network to automatically locate the nose region in the facial images.

[0021] S204. On the located nose region images, compute successive differences over m adjacent frames and take the absolute value of each grayscale difference to obtain the difference image D(x,y). The calculation formula is as follows:

[0022]

[0023] Here, f_n(x,y) represents the grayscale value of pixel (x,y) in the nose region of the n-th frame image, and m represents the number of images used for the difference operation.

[0024] Temperature changes during respiration are reflected in the grayscale changes of the nostril region in thermal imaging images. The purpose of this step is to obtain the nostril region with grayscale changes through multi-frame difference, thereby reducing the influence of other areas in the nose region besides the nostrils on the extraction of respiratory signals.
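
The formula for D(x,y) is not reproduced in this text. As a minimal sketch, the following Python assumes the difference image is the sum of absolute grayscale differences between consecutive frames within a window of m adjacent nose-region crops. That summation form and the function name multi_frame_difference are assumptions for illustration, not the patent's stated formula.

```python
import numpy as np

def multi_frame_difference(frames):
    """Accumulate absolute differences between consecutive frames.

    frames: array of shape (m, rows, cols) holding m adjacent grayscale crops
    of the nose region. Returns D with shape (rows, cols).
    The summation form is an assumption; the source formula is not shown.
    """
    frames = np.asarray(frames, dtype=np.float64)  # avoid uint8 wrap-around
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("need at least two frames of shape (rows, cols)")
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)

# Toy data: one "nostril" pixel alternates between 100 and 120, the rest is static.
m = 15
frames = np.full((m, 4, 4), 100, dtype=np.uint8)
frames[:, 2, 1] = 100 + 20 * (np.arange(m) % 2)
D = multi_frame_difference(frames)
```

Pixels whose grayscale fluctuates with breathing accumulate large values in D, while static skin stays at zero.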

[0025] S205. Thresholding and denoising are performed on the difference image D(x,y). The threshold T is calculated using the Otsu thresholding method, and the difference image D(x,y) is thresholded with T to obtain the thresholded image E(x,y). An opening operation is then performed to eliminate background noise points, and the two largest connected components are retained as the nostril region P(x,y). The thresholding formula is as follows:

[0026]

[0027] The thresholding process in this step sets the grayscale value of pixels below the threshold to zero, while retaining the nostril area with large grayscale variations; the opening operation removes isolated noise points.
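
The thresholding formula is not reproduced in this text. Going by the description above, the sketch below assumes E(x,y) keeps D(x,y) where D(x,y) >= T and is zero elsewhere. It uses NumPy and scipy.ndimage; the 256-bin histogram, the 3x3 structuring element, and the function names are assumptions chosen for illustration.

```python
import numpy as np
from scipy import ndimage

def otsu_threshold(image, bins=256):
    """Otsu's method: pick the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(np.ravel(image), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                    # pixel count up to and including each bin
    w1 = w0[-1] - w0                        # pixel count above each bin
    s0 = np.cumsum(hist * centers)          # intensity mass up to each bin
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return edges[1:][int(np.argmax(between))]   # upper edge of the best bin

def nostril_mask(D, keep=2):
    """Threshold D, open it, and keep the `keep` largest connected components."""
    T = otsu_threshold(D)
    E = np.where(D >= T, D, 0.0)            # assumed form of E(x,y)
    opened = ndimage.binary_opening(E > 0, structure=np.ones((3, 3)))
    labels, n = ndimage.label(opened)
    if n == 0:
        return np.zeros(D.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())[1:]          # component areas, label 1..n
    largest = np.argsort(sizes)[::-1][:keep] + 1     # labels of the largest areas
    return np.isin(labels, largest)

# Toy difference image: two large blobs, one smaller blob, one isolated noise pixel.
D = np.zeros((20, 20))
D[3:7, 3:7] = 200.0      # 16 pixels
D[3:7, 12:15] = 180.0    # 12 pixels
D[12:15, 3:6] = 190.0    # 9 pixels, dropped by the "two largest" rule
D[15, 15] = 220.0        # removed by the opening operation
mask = nostril_mask(D)
```

In the toy image the opening removes the isolated pixel, and the area ranking then discards the third, smaller blob.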

[0028] S206. For a series of facial images obtained after segmenting the same facial video, the average grayscale value of all pixels in the nostril region is used as the amplitude of the respiratory signal. The calculation formula for the original respiratory signal is as follows:

[0029]

[0030] Here, W represents the number of facial images obtained after frame segmentation, b represents the number of pixels in the nostril region located in the facial image, P_a(x,y) represents the grayscale value of pixel (x,y) in the nostril region of the a-th frame image, and S(a) represents the average grayscale value of all pixels in the nostril region of the a-th frame image. This step extracts the raw respiratory signal from the change of the average grayscale value of the nostril region over time.
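
The formula itself is missing here, but the definitions imply S(a) = (1/b) times the sum of P_a(x,y) over the nostril pixels, for a = 1, ..., W. A minimal Python sketch of that averaging follows; the function name and the choice to return NaN for a frame with an empty nostril mask are assumptions.

```python
import numpy as np

def raw_respiratory_signal(frames, masks):
    """S(a): mean grayscale over the nostril pixels of frame a, for each frame.

    frames: (W, rows, cols) grayscale nose-region crops.
    masks:  (W, rows, cols) boolean nostril masks, one per frame.
    A frame whose mask is empty yields NaN (an assumption, not from the source).
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=bool)
    signal = np.full(frames.shape[0], np.nan)
    for a, (frame, mask) in enumerate(zip(frames, masks)):
        if mask.any():
            signal[a] = frame[mask].mean()   # (1/b) * sum of P_a(x, y)
    return signal

frames = np.array([[[10, 20], [30, 40]],
                   [[50, 60], [70, 80]],
                   [[1, 2], [3, 4]]])
masks = np.array([[[True, False], [False, True]],
                  [[False, True], [True, False]],
                  [[False, False], [False, False]]])
S = raw_respiratory_signal(frames, masks)
```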

[0031] Furthermore, the process of denoising the original respiratory signal and extracting respiratory features in the time and frequency domains in step S3 is as follows:

[0032] S301. Perform bandpass filtering on the raw respiratory signal obtained in step S206. Design a Butterworth bandpass filter with a passband of 0.1-0.5Hz, and input the raw respiratory signal into the Butterworth bandpass filter for filtering; this step takes into account that the frequency range of normal respiratory signals is 0.1-0.5Hz, and thus a bandpass filter is designed to remove noise and reduce the interference of noise on the extraction of respiratory features;
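
A 0.1-0.5 Hz Butterworth bandpass of this kind can be sketched with scipy.signal. The source does not give the filter order, the sampling rate, or whether the filtering is zero-phase, so the fourth-order design, the 25 Hz frame rate, and the forward-backward filtering below are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_respiration(signal, fs, low=0.1, high=0.5, order=4):
    """Zero-phase Butterworth bandpass over the respiratory band.

    fs is the sampling rate in Hz (the thermal video frame rate).
    order=4 and forward-backward filtering are assumptions.
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

# Synthetic check: 0.25 Hz breathing plus 3 Hz noise and a constant offset.
fs = 25.0
t = np.arange(3000) / fs                      # 120 s of samples
breathing = np.sin(2 * np.pi * 0.25 * t)
raw = breathing + 0.5 * np.sin(2 * np.pi * 3.0 * t) + 2.0
filtered = bandpass_respiration(raw, fs)
```

Second-order sections are used because a narrow low-frequency passband at video frame rates can be numerically fragile in transfer-function form.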

[0033] S302. Remove baseline drift from the bandpass-filtered respiratory signal. Locate the troughs of the bandpass-filtered respiratory signal, fit a baseline between the troughs using cubic spline interpolation, and subtract the baseline from the respiratory signal to obtain the baseline-drift-free respiratory signal. The purpose of this step is to reduce the impact of baseline drift on the extraction of respiratory features.

[0034] S303. Subtract the mean value from the baseline-drift-removed respiratory signal to remove the DC component and obtain the denoised respiratory signal. The purpose of this step is to prevent the excessive DC component from interfering with the extraction of respiratory features.
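
Steps S302 and S303 can be sketched together. The version below finds troughs with scipy.signal.find_peaks, fits a cubic spline through them, subtracts it, and removes the mean. The minimum trough spacing and the decision to keep only the samples between the first and last trough (to avoid extrapolating the spline) are assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import find_peaks

def remove_baseline(signal, fs, min_breath_s=2.0):
    """Subtract a cubic-spline baseline fitted through the troughs, then the mean."""
    signal = np.asarray(signal, dtype=np.float64)
    # Troughs are peaks of the negated signal; the spacing limit is an assumption.
    troughs, _ = find_peaks(-signal, distance=max(1, int(min_breath_s * fs)))
    if len(troughs) < 2:
        raise ValueError("need at least two troughs to fit a baseline")
    spline = CubicSpline(troughs, signal[troughs])
    idx = np.arange(troughs[0], troughs[-1] + 1)   # stay inside the fitted range
    detrended = signal[idx] - spline(idx)          # S302: remove baseline drift
    return detrended - detrended.mean()            # S303: remove the DC component

# Synthetic check: 0.25 Hz breathing riding on a slow linear drift.
fs = 10.0
t = np.arange(600) / fs
drifting = np.sin(2 * np.pi * 0.25 * t) + 0.02 * t
cleaned = remove_baseline(drifting, fs)
slope = np.polyfit(np.arange(cleaned.size) / fs, cleaned, 1)[0]
```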

[0035] S304. Extract the temporal features of the denoised respiratory signal. First, locate the peaks and troughs of the denoised respiratory signal to determine each respiratory cycle, then calculate the respiratory rate and the tidal volume of each cycle. The respiratory rate is the number of breaths per minute. The average tidal volume is obtained by averaging the tidal volume over all cycles, and multiplying the respiratory rate by the average tidal volume gives the minute ventilation. The respiratory rate and minute ventilation serve as the overall features of the respiratory signal. Subsequently, calculate the mean, standard deviation, root mean square, and coefficient of variation of each of five feature sequences (average tidal volume, inspiratory to expiratory time ratio, inspiratory time percentage, inspiratory amplitude, and expiratory amplitude) as respiratory variability features. The overall features and the respiratory variability features together form the temporal features of the respiratory signal. The purpose of this step is to capture the effect of cognitive load on the respiratory signal in the time domain. Tidal volume refers to the amount of gas inhaled in each respiratory cycle, calculated using the following formula:

[0036]

[0037] Here, TV represents tidal volume, z1 represents the horizontal-axis coordinate of the peak of the respiratory cycle, z2 represents the horizontal-axis coordinate of the trough of the respiratory cycle, and A represents the amplitude of the respiratory signal.
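
The tidal volume formula is not reproduced in this text. Based on these definitions and on the description of Figure 8 (the area of the descending branch), the sketch below integrates the signal from each peak z1 to the following trough z2. Measuring the amplitude above the trough level so that the area stays non-negative, taking the rate from the mean peak-to-peak interval, and the function name are all assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def temporal_features(signal, fs, min_breath_s=2.0):
    """Respiratory rate, mean tidal volume and minute ventilation (sketch)."""
    signal = np.asarray(signal, dtype=np.float64)
    gap = max(1, int(min_breath_s * fs))            # assumed minimum breath spacing
    peaks, _ = find_peaks(signal, distance=gap)
    troughs, _ = find_peaks(-signal, distance=gap)
    tidal_volumes = []
    for z1 in peaks:
        later = troughs[troughs > z1]
        if later.size == 0:
            break
        z2 = later[0]                               # trough ending this descending branch
        branch = signal[z1:z2 + 1] - signal[z2]     # height above the trough (assumption)
        # Trapezoidal area under the descending branch, in amplitude x seconds
        tidal_volumes.append(np.sum((branch[1:] + branch[:-1]) / 2.0) / fs)
    if len(peaks) < 2 or not tidal_volumes:
        raise ValueError("signal too short to measure breathing cycles")
    rate = 60.0 / (np.mean(np.diff(peaks)) / fs)    # breaths per minute
    mean_tv = float(np.mean(tidal_volumes))
    return {"respiratory_rate": float(rate),
            "mean_tidal_volume": mean_tv,
            "minute_ventilation": float(rate) * mean_tv}

# Synthetic check: a clean 0.25 Hz sine is 15 breaths per minute.
t = np.arange(600) / 10.0
feats = temporal_features(np.sin(2 * np.pi * 0.25 * t), fs=10.0)
```

Because the signal is a grayscale amplitude rather than an airflow, the resulting tidal volume is a relative quantity, not a volume in litres.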

[0038] S305. Extract the frequency domain features of the denoised respiratory signal. Calculate the power spectrum of the denoised respiratory signal and extract the power spectrum within the ranges of 0.1-0.2Hz, 0.2-0.3Hz, 0.3-0.4Hz, and 0.4-0.5Hz as the frequency domain features of the respiratory signal; the purpose of this step is to extract the frequency domain influence of cognitive load on the respiratory signal.
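
The four band powers can be sketched with a plain FFT periodogram. The source does not say how the power spectrum is estimated, so the periodogram, its scaling, and the half-open band edges below are assumptions.

```python
import numpy as np

BANDS_HZ = [(0.1, 0.2), (0.2, 0.3), (0.3, 0.4), (0.4, 0.5)]

def band_powers(signal, fs, bands=BANDS_HZ):
    """Sum periodogram power inside each frequency band."""
    signal = np.asarray(signal, dtype=np.float64)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / signal.size   # simple periodogram
    # Half-open bands [lo, hi) so that no frequency bin is counted twice
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands])

# Synthetic check: a 0.25 Hz sine should land in the 0.2-0.3 Hz band.
fs = 10.0
t = np.arange(600) / fs
powers = band_powers(np.sin(2 * np.pi * 0.25 * t), fs)
```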

[0039] S306. Combine the time-domain and frequency-domain features to form a respiratory feature vector. Subtract the respiratory feature vector under the sitting state from the respiratory feature vector under the cognitive task to obtain the baseline-removed respiratory feature sample. Breathing patterns differ between individuals, and the baseline respiratory rate also varies. To reduce the influence of these individual differences on the respiratory features, baseline removal is performed so that the sample reflects the change in respiratory features due to cognitive load.

[0040] Furthermore, the process of constructing a cognitive load level classification model based on the extracted respiratory features in step S4 is as follows:

[0041] S401. Collect respiratory feature samples from several test subjects, label the baseline-removed respiratory feature samples, set the label of samples performing low-load cognitive tasks to 0, and set the label of samples performing high-load cognitive tasks to 1, and construct a respiratory feature set; the purpose of this step is to establish sample labels for subsequent steps to build a classification model.

[0042] S402. Divide the respiratory feature set into a training set and a test set in a 4:1 ratio;

[0043] S403. A recursive feature elimination algorithm is used to filter the labeled respiratory feature set, selecting the subset of features with the highest f1-weighted score, termed the optimal feature subset. The f1-weighted score is a common classification metric, so the score achieved with a given feature subset reflects, to some extent, the impact of those features on model performance. The purpose of this step is to select features sensitive to cognitive load level classification, reduce the influence of irrelevant features on the model, improve model training efficiency, and prevent the curse of dimensionality.

[0044] S404. The obtained optimal feature subset is fed into a random forest for training. K-fold cross-validation is used during training to construct the mapping relationship between respiratory features and cognitive load levels, resulting in a well-constructed cognitive load level classification model. Since random forests exhibit randomness in selecting training samples and features, they can automatically handle highly correlated features and have strong resistance to overfitting. Therefore, random forests are chosen to construct the classification model. The purpose of this step is to fully evaluate the performance of the classification model, reduce the risk of overfitting, and construct a cognitive load level classification model.

[0045] Further, in step S5, respiratory feature samples of the test subject are extracted and input into the cognitive load level classification model. The process of evaluating the test subject's cognitive load level is as follows:

[0046] S501. Use an infrared thermal imager to capture facial video of the test subject during a cognitive task and segment it into facial images.

[0047] S502. After steps S2 and S3, the respiratory characteristic sample of the test subject is obtained;

[0048] S503. Input the obtained respiratory feature samples into the cognitive load level classification model constructed in step S404. Determine the level of cognitive load based on the result label output by the classification model. If the result label is 0, it represents a low cognitive load level. If the result label is 1, it represents a high cognitive load level.

[0049] The present invention has the following advantages and effects compared with the prior art:

[0050] (1) This invention assesses the level of cognitive load through respiratory signals, which is a physiological assessment method. It is objective, accurate, and timely; compared with the task performance method it is universally applicable, and it overcomes the strong subjectivity and large lag of subjective scale methods.

[0051] (2) This invention obtains respiratory signals from temperature changes during the test subject's breathing using infrared thermal imaging technology. This eliminates the need for direct contact with the test subject, achieving a non-invasive, contactless measurement that overcomes the discomfort caused by contact-based acquisition methods. Compared to other non-contact acquisition methods, infrared thermal imaging technology can accurately measure in low-light environments, effectively protects facial privacy, and is relatively insensitive to the test subject's spontaneous movements.

[0052] (3) This invention solves the problem that existing methods cannot locate the nose region when the face is occluded or the angle of deflection is too large by introducing an instance segmentation network. Based on the nose region, the nostril region is determined by the inter-frame difference method, which reduces the influence of irrelevant pixels on the breathing signal and effectively improves the accuracy and robustness of non-contact breathing measurement based on infrared thermal imaging, providing data support for the realization of cognitive load assessment.

[0053] (4) This invention extracts respiratory features from multiple time and frequency domains, selects the best feature subset through the recursive feature elimination method, and establishes a mapping relationship between respiratory features and cognitive load level using random forest, thereby realizing cognitive load assessment.

Attached Figure Description

[0054] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this application, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:

[0055] Figure 1 is a flowchart of a non-contact cognitive load assessment method based on infrared thermal imaging disclosed in an embodiment of the present invention;

[0056] Figure 2 is a schematic diagram of a scenario in which an infrared thermal imager is used to collect the respiratory signals of a test subject in an embodiment of the present invention;

[0057] Figure 3 is an example image of marking the nose region in a self-acquired infrared thermal imaging facial image in an embodiment of the present invention. The black rectangle in the image represents the marked nose region.

[0058] Figure 4 is an example image of locating the nose region in a facial image using the YOLACT instance segmentation network in an embodiment of the present invention. The black filled area in the image is the located nose region.

[0059] Figure 5 is a waveform diagram of the original respiratory signal obtained from a facial video in an embodiment of the present invention;

[0060] Figure 6 is a waveform diagram of the respiratory signal after bandpass filtering in an embodiment of the present invention;

[0061] Figure 7 is a waveform diagram of the respiratory signal after removing baseline drift and the DC component in an embodiment of the present invention;

[0062] Figure 8 is a schematic diagram of respiratory signal related feature points in an embodiment of the present invention, where the area of the descending branch of the respiratory waveform is taken as the tidal volume;

[0063] Figure 9 is a result diagram of feature selection using a cross-validation recursive feature elimination algorithm based on random forest in an embodiment of the present invention.

Detailed Implementation

[0064] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0065] Example 1

[0066] Figure 1 is a flowchart of a non-contact cognitive load assessment method based on infrared thermal imaging provided by an embodiment of the present invention.

[0067] S1. The process of using an infrared thermal imager to capture facial videos of the test subject while seated and during the N-back task, and obtaining facial images by frame segmentation is as follows;

[0068] S101. Place an infrared thermal imager 1 meter directly in front of the test subject, ensuring the subject's face is within the imager's field of view and the nose is unobstructed. The N-back task is a commonly used cognitive experiment paradigm. In this embodiment, a 1-back task is used to induce a low cognitive load level in the test subject, and a 2-back task is used to induce a high cognitive load level. The infrared thermal imager captures facial videos of the test subject in a seated state, during the 1-back task, and during the 2-back task, respectively. Figure 2 shows a schematic diagram of the scenario in which the infrared thermal imager collects the test subject's respiratory signals in this embodiment; the infrared thermal imager is controlled via a laptop computer.

[0069] S102. Divide the captured facial video into a series of facial images in chronological order.

[0070] S2. The process of locating the nose region from the facial image and extracting the raw respiratory signal is as follows:

[0071] S201. On each self-acquired infrared thermal imaging facial image, the nose region is marked with a rectangle using the Labelme tool to construct a facial nose region dataset. Figure 3 is an example image of marking the nose region in a self-acquired infrared thermal imaging facial image in an embodiment of the present invention. The black rectangle in the image represents the marked nose region.

[0072] S202. Divide the facial nose region dataset into a training set and a test set in a 4:1 ratio. Use the training set to train the YOLACT instance segmentation network, iterating over the training set 50 times during training with a base learning rate of 0.02.

[0073] S203. For a series of facial images obtained after segmenting the same facial video into frames, use the trained instance segmentation network to automatically locate the nose region in the facial images. Figure 4 is an example image of locating the nose region in a facial image using the YOLACT instance segmentation network in an embodiment of the present invention. The black filled area in the image is the located nose region.

[0074] S204. On the located nose region images, compute the differences over 15 adjacent frames and take the absolute grayscale values, that is, m = 15 in formula (1), to obtain the difference image.

[0075] S205. The threshold is calculated using the Otsu thresholding method. The obtained difference image is then thresholded and then opened to eliminate background noise points. The two connected regions with the largest areas are retained as the nostril regions.

[0076] S206. For a series of facial images obtained after segmenting the same facial video, the average grayscale value of all pixels in the nostril region is used as the amplitude of the breathing signal. The original breathing signal is extracted from the change of the average grayscale value of the nostril region over time. The waveform of the original breathing signal obtained from the facial video in this embodiment is shown in Figure 5.

[0077] S3. The process of denoising the original respiratory signal and extracting respiratory features in the time and frequency domains is as follows:

[0078] S301. Design a Butterworth bandpass filter with a passband of 0.1-0.5Hz. Input the original respiratory signal obtained in step S206 into the Butterworth bandpass filter to remove noise. The waveform of the respiratory signal after bandpass filtering in this embodiment of the invention is shown in Figure 6;

[0079] S302. Locate the troughs of the bandpass filtered respiratory signal, fit the baseline between the troughs using cubic spline interpolation, subtract the baseline from the respiratory signal to obtain the respiratory signal after removing baseline drift.

[0080] S303. Subtract the mean value from the respiratory signal after removing baseline drift, and remove the DC component to obtain the denoised respiratory signal.

[0081] In this embodiment of the invention, the respiratory signal waveform after removing baseline drift and the DC component through steps S302 and S303 is shown in Figure 7;

[0082] S304. First, locate the peaks and troughs of the denoised respiratory signal. After determining each respiratory cycle, calculate the respiratory rate and tidal volume for each cycle. Calculate the mean tidal volume. Then, multiply the respiratory rate by the mean tidal volume to obtain the minute ventilation. Use the respiratory rate and minute ventilation as the overall characteristics of the respiratory signal. Subsequently, calculate the mean, standard deviation, root mean square, and coefficient of variation for five characteristic sequences: mean tidal volume, inspiratory to expiratory time ratio, inspiratory time percentage, inspiratory amplitude, and expiratory amplitude, respectively, as respiratory variability characteristics. Use the overall respiratory signal characteristics and respiratory variability characteristics as the time-domain characteristics of the respiratory signal. Figure 8 is a schematic diagram of respiratory signal related feature points in an embodiment of the present invention, where the area of the descending branch of the respiratory waveform is taken as the tidal volume;
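
The four statistics computed for each characteristic sequence can be sketched as follows. The source does not say whether the standard deviation is the population or the sample form, so the population form (ddof=0) used here is an assumption, as is the function name.

```python
import numpy as np

def variability_features(sequence):
    """Mean, standard deviation, root mean square and coefficient of variation."""
    x = np.asarray(sequence, dtype=np.float64)
    mean = x.mean()
    std = x.std(ddof=0)                       # population form (assumption)
    rms = np.sqrt(np.mean(x ** 2))
    cv = std / mean if mean != 0 else np.nan  # undefined for a zero mean
    return {"mean": float(mean), "std": float(std),
            "rms": float(rms), "cv": float(cv)}

# Example: a per-cycle sequence such as inspiratory amplitude (made-up numbers).
stats = variability_features([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Applied to each of the five sequences, this yields 20 variability features per recording.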

[0083] S305. Calculate the power spectrum of the denoised respiratory signal and extract the power spectrum in the ranges of 0.1-0.2Hz, 0.2-0.3Hz, 0.3-0.4Hz and 0.4-0.5Hz as the frequency domain features of the respiratory signal.

[0084] S306. Combine the time-domain features and frequency-domain features to form a respiratory feature vector. Subtract the respiratory feature vector under the sitting state from the respiratory feature vector under the cognitive task to obtain the baseline-removed respiratory feature sample.

[0085] S4. Based on the extracted respiratory features, the process of constructing a cognitive load level classification model after feature selection is as follows:

[0086] S401. Collect respiratory feature samples from several test subjects, label the baseline-removed respiratory feature samples, set the label of samples performing low-load cognitive tasks to 0, and set the label of samples performing high-load cognitive tasks to 1, and construct a respiratory feature set.

[0087] S402. Divide the respiratory feature set into a training set and a test set in a 4:1 ratio;

[0088] S403. Use the cross-validation recursive feature elimination algorithm based on random forest to filter the labeled respiratory feature set. Use the f1-weighted score as the evaluation index to select the feature subset with the highest f1-weighted score, which is called the best feature subset. Figure 9 shows the result of feature selection using the cross-validation recursive feature elimination algorithm based on random forest in this embodiment of the invention. The vertical axis is the f1-weighted score, and the horizontal axis is the number of selected features. The optimal feature subset contains a total of 18 features.

[0089] S404. Input the obtained best feature subset into the random forest model for training. The random forest used contains 100 decision trees, the random seed is set to 90, random sampling with replacement is used, and 5-fold cross-validation is used during training. Finally, the constructed cognitive load level classification model is obtained.
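
Steps S402 to S404 map naturally onto scikit-learn, which the source does not name, so its use here is an assumption. The sketch below runs on a small synthetic stand-in for the respiratory feature set (not real data) and takes the 4:1 split, 100 trees, random seed 90, sampling with replacement, 5-fold cross-validation, and f1-weighted scoring from this embodiment. The stratified split and the elimination step of one feature per round are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for labeled samples (0 = low load, 1 = high load)
X, y = make_classification(n_samples=120, n_features=12, n_informative=4,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)      # 4:1 split

def make_forest():
    # 100 trees, seed 90, bootstrap sampling (with replacement)
    return RandomForestClassifier(n_estimators=100, random_state=90, bootstrap=True)

# S403: cross-validated recursive feature elimination scored by f1-weighted
selector = RFECV(make_forest(), step=1, cv=5, scoring="f1_weighted")
selector.fit(X_train, y_train)
X_train_best = selector.transform(X_train)
X_test_best = selector.transform(X_test)

# S404: 5-fold cross-validation on the training set, then a final fit
cv_scores = cross_val_score(make_forest(), X_train_best, y_train,
                            cv=5, scoring="f1_weighted")
model = make_forest().fit(X_train_best, y_train)
test_f1 = f1_score(y_test, model.predict(X_test_best), average="weighted")
```

The number of features kept is available as selector.n_features_; on real data it would play the role of the 18 features reported above.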

[0090] S5. Extract the test subject's respiratory feature samples and input them into the cognitive load level classification model. The process of assessing the test subject's cognitive load level is as follows:

[0091] S501. Use an infrared thermal imager to capture facial video of the test subject during the N-back task and divide it into a series of facial images.

[0092] S502. Apply steps S2 and S3 to obtain the respiratory feature sample of the test subject;

[0093] S503. Input the obtained respiratory feature samples into the cognitive load level classification model constructed in step S404. Determine the level of cognitive load based on the result label output by the classification model. If the result label is 0, it represents a low cognitive load level. If the result label is 1, it represents a high cognitive load level.

[0094] Example 2

[0095] Building on the non-contact cognitive load assessment method based on infrared thermal imaging disclosed in Example 1, this example provides a further implementation of the method, as follows:

[0096] S1. The process of using an infrared thermal imager to capture facial videos of the test subject while sitting still and while performing mental arithmetic tasks, and of obtaining facial images by splitting the videos into frames, is as follows:

[0097] S101. Place an infrared thermal imager 1 meter directly in front of the test subject, ensuring that the test subject's face is within the field of view of the infrared thermal imager and that nothing obstructs the area around the nose. Use a one-digit decimal number addition task to induce a low cognitive load level in the test subject, and a two-digit decimal number addition task to induce a high cognitive load level. Use the infrared thermal imager to capture facial videos of the test subject in the seated state, during the one-digit addition task, and during the two-digit addition task, respectively.

[0098] S102. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0099] S2. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0100] S3. The process of denoising the original respiratory signal and extracting respiratory features in the time and frequency domains is as follows:

[0101] S301. Use a Hamming window to design an 80th-order FIR bandpass filter with a passband of 0.1-0.5Hz. Input the obtained raw breathing signal into the designed FIR bandpass filter to remove noise.
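One way to realize the filter in S301 is with SciPy's `firwin`, as sketched below. An 80th-order FIR filter has 81 taps. The sampling rate `fs`, the synthetic test signal, and the use of zero-phase filtering via `filtfilt` are assumptions; this passage does not state the frame rate or how the filter is applied.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

fs = 10.0    # assumed frame rate of the thermal video in Hz (not given here)
order = 80   # 80th-order FIR filter, i.e. 81 taps

# Hamming-window bandpass design with a 0.1-0.5 Hz passband.
taps = firwin(order + 1, [0.1, 0.5], window="hamming", pass_zero=False, fs=fs)

# Synthetic raw signal: a 0.3 Hz breathing component plus 3 Hz interference.
t = np.arange(1200) / fs
breath = np.sin(2 * np.pi * 0.3 * t)
raw = breath + 0.5 * np.sin(2 * np.pi * 3.0 * t)

# Zero-phase filtering (a design choice assumed here) avoids the group delay
# that a single forward pass through the FIR filter would introduce.
filtered = filtfilt(taps, [1.0], raw)
```

Note that the sharpness of an 81-tap filter depends strongly on the frame rate: the higher `fs` is, the wider the transition bands around 0.1 Hz and 0.5 Hz become.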

[0102] S302. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0103] S303. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0104] S304. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0105] S305. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0106] S306. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0107] S4. Based on the extracted respiratory features, the process of constructing a cognitive load level classification model after feature selection is as follows:

[0108] S401. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0109] S402. Divide the respiratory feature set into a training set and a test set in a 7:3 ratio;

[0110] S403. Refer to the corresponding steps in Example 1, which will not be repeated here;

[0111] S404. Input the obtained best feature subset into the random forest model for training. The random forest used contains 150 decision trees, the random seed is set to 120, random sampling with replacement is used, and 8-fold cross-validation is used during training. Finally, the trained cognitive load level classification model is obtained.

[0112] S5. Extract the test subject's respiratory feature samples and input them into the cognitive load level classification model. The process of assessing the test subject's cognitive load level is as follows:

[0113] S501. Use an infrared thermal imager to capture facial video of the test subject during a mental arithmetic task and divide it into a series of facial images.

[0114] S502. Apply steps S2 and S3 to obtain the respiratory feature sample of the test subject;

[0115] S503. The obtained respiratory feature samples are fed into the cognitive load level classification model trained in step S404. The cognitive load level is determined based on the label output by the classification model. A label of 0 represents a low cognitive load level, and a label of 1 represents a high cognitive load level. Based on the labels output by the random forest and the labels of the input samples, four results can be generated:

[0116] (1) The input sample label is 1 and the output label is 1: denoted as true high load (TP);

[0117] (2) The input sample label is 0 and the output label is 1: denoted as false high load (FP);

[0118] (3) The input sample label is 0 and the output label is 0: denoted as true low load (TN);

[0119] (4) The input sample label is 1 and the output label is 0: denoted as false low load (FN).

[0120] Based on the above results, accuracy can be calculated as an evaluation index for cognitive load assessment:

[0121] $$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$

[0122] Accuracy reflects the overall correctness of the random forest's evaluation of all samples. In this example, the random forest's evaluation accuracy is 82.2%.
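The four outcomes and the accuracy index can be computed as in the short sketch below, assuming the standard definition of accuracy as (TP + TN) / (TP + FP + TN + FN). The label lists are made-up illustration data and are unrelated to the 82.2% result reported above.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, TN, FN with label 1 = high load and label 0 = low load."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1   # true high load
        elif truth == 0 and pred == 1:
            fp += 1   # false high load
        elif truth == 0 and pred == 0:
            tn += 1   # true low load
        else:
            fn += 1   # false low load
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

# Hypothetical labels, for illustration only.
true_labels = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(true_labels, predicted)
acc = accuracy(tp, fp, tn, fn)
```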

[0123] The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the above embodiments. Any changes, modifications, substitutions, combinations, or simplifications made without departing from the spirit and principle of the present invention shall be considered equivalent substitutions and shall be included within the protection scope of the present invention.

Claims

1. A non-contact cognitive load assessment method based on infrared thermal imaging, characterized in that it includes the following steps: S1. Use an infrared thermal imager to capture facial videos of the test subject while seated and during cognitive tasks, and obtain facial images by splitting the videos into frames; S2. Locate the nostril region in the facial images and extract the raw respiratory signal; S3. Denoise the raw respiratory signal and extract respiratory features in the time and frequency domains; S4. Perform feature selection on the extracted respiratory features and construct a cognitive load level classification model; S5. Extract respiratory feature samples from the test subject and input them into the cognitive load level classification model to assess the cognitive load level of the test subject; wherein step S2, locating the nostril region in the facial images and extracting the raw respiratory signal, includes the following steps: S201. Mark the nose region with a rectangle on each self-collected infrared thermal facial image, and construct a facial nose region dataset; S202. Divide the constructed facial nose region dataset into a training set and a test set, use the training set to train the instance segmentation network, and use the test set to test the performance of the instance segmentation network; S203. For the series of facial images obtained by splitting the same facial video into frames, use the trained instance segmentation network to automatically locate the nose region in the facial images; S204. On the located nose region images, perform successive differencing over m adjacent frames and take the absolute value of the gray-level differences to obtain the difference image; S205. Threshold the difference image, then perform an opening operation and retain the two largest connected regions as the nostril regions; S206. For the series of facial images obtained by splitting the same facial video into frames, use the average gray value of all pixels in the nostril region as the amplitude of the respiratory signal, and extract the raw respiratory signal from the change of this average gray value over time.
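Steps S204 to S206 can be illustrated with NumPy and SciPy as below. This is a sketch of one possible reading, in which the absolute differences between adjacent frames are summed over the m frames; the threshold value, the 3x3 structuring element, the function names, and the synthetic frames are all assumptions.

```python
import numpy as np
from scipy import ndimage

def nostril_mask(nose_frames, threshold):
    """nose_frames: array of shape (m, H, W) holding grayscale nose-region crops."""
    frames = nose_frames.astype(np.float64)
    # S204: absolute gray-level differences of adjacent frames, accumulated.
    diff = np.abs(np.diff(frames, axis=0)).sum(axis=0)
    # S205: threshold, then morphological opening to remove small specks.
    binary = diff > threshold
    opened = ndimage.binary_opening(binary, structure=np.ones((3, 3), dtype=bool))
    labels, n_regions = ndimage.label(opened)
    if n_regions == 0:
        return np.zeros_like(opened)
    # Keep the two largest connected regions as the nostrils.
    sizes = np.bincount(labels.ravel())[1:]
    keep = np.argsort(sizes)[::-1][:2] + 1
    return np.isin(labels, keep)

def breathing_amplitude(frame, mask):
    # S206: mean gray value over the nostril pixels of one frame.
    return float(frame[mask].mean())

# Synthetic frames: two 4x4 "nostrils" whose gray level alternates, plus one noisy pixel.
m, H, W = 6, 20, 30
frames = np.full((m, H, W), 100.0)
for k in range(m):
    value = 100.0 + 10.0 * (k % 2)
    frames[k, 8:12, 5:9] = value
    frames[k, 8:12, 20:24] = value
    frames[k, 2, 2] = value   # isolated pixel, removed by the opening

mask = nostril_mask(frames, threshold=20.0)
raw_signal = [breathing_amplitude(frame, mask) for frame in frames]
```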

2. The non-contact cognitive load assessment method based on infrared thermal imaging according to claim 1, characterized in that step S1, using an infrared thermal imager to capture facial videos of the test subject while seated and during cognitive tasks and obtaining facial images by splitting the videos into frames, includes the following steps: S101. Place an infrared thermal imager directly in front of the test subject, ensuring that the test subject's face is within the field of view of the infrared thermal imager, and capture facial videos of the test subject in the sitting state, while performing a low-load cognitive task, and while performing a high-load cognitive task, respectively; S102. Divide each captured facial video into a series of facial images in chronological order.

3. The non-contact cognitive load assessment method based on infrared thermal imaging according to claim 1, characterized in that step S3, denoising the raw respiratory signal and extracting respiratory features in the time and frequency domains, includes the following steps: S301. Apply a bandpass filter to the raw respiratory signal obtained in step S206; S302. Remove the baseline drift from the bandpass-filtered respiratory signal; S303. Subtract the mean value from the respiratory signal after baseline drift removal to remove the DC component and obtain the denoised respiratory signal; S304. Locate the peaks and troughs of the denoised respiratory signal, and extract the respiratory rate and minute ventilation as the overall features of the respiratory signal; then calculate the mean, standard deviation, root mean square, and coefficient of variation of each of five feature sequences (mean tidal volume, ratio of inspiratory to expiratory time, proportion of inspiratory time, inspiratory amplitude, and expiratory amplitude) as respiratory variability features; use the overall features and the respiratory variability features as the time-domain features of the respiratory signal; S305. Calculate the power spectrum of the denoised respiratory signal and extract the power spectrum in the ranges of 0.1-0.2 Hz, 0.2-0.3 Hz, 0.3-0.4 Hz, and 0.4-0.5 Hz as the frequency-domain features of the respiratory signal; S306. Combine the time-domain features and frequency-domain features to form a respiratory feature vector, and subtract the respiratory feature vector obtained in the sitting state from the respiratory feature vector obtained under the cognitive task to obtain the baseline-removed respiratory feature sample.
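Two of the features named in S304 and S305, the respiratory rate and the power in the four frequency bands, can be sketched with SciPy as follows. The sampling rate, the minimum peak spacing, the use of a plain periodogram as the power spectrum estimate, and the synthetic signal are assumptions; the remaining features (minute ventilation and the variability statistics) are not covered here.

```python
import numpy as np
from scipy.signal import find_peaks, periodogram

BANDS = ((0.1, 0.2), (0.2, 0.3), (0.3, 0.4), (0.4, 0.5))   # Hz, from S305

def respiratory_rate(signal, fs):
    """Breaths per minute, estimated from the mean peak-to-peak interval."""
    peaks, _ = find_peaks(signal, distance=int(fs))   # assumes breaths at least 1 s apart
    intervals = np.diff(peaks) / fs                   # seconds between successive peaks
    return 60.0 / float(intervals.mean())

def band_powers(signal, fs, bands=BANDS):
    """Approximate power in each band by summing periodogram bins times bin width."""
    freqs, psd = periodogram(signal, fs=fs)
    df = freqs[1] - freqs[0]
    return [float(psd[(freqs >= lo) & (freqs < hi)].sum() * df) for lo, hi in bands]

# Synthetic denoised signal: 0.25 Hz breathing (15 breaths per minute) for 120 s.
fs = 10.0   # assumed sampling rate in Hz
t = np.arange(1200) / fs
signal = np.sin(2 * np.pi * 0.25 * t)

rate = respiratory_rate(signal, fs)
powers = band_powers(signal, fs)
```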

4. The non-contact cognitive load assessment method based on infrared thermal imaging according to claim 1, characterized in that step S4, performing feature selection on the extracted respiratory features and constructing a cognitive load level classification model, includes the following steps: S401. Collect respiratory feature samples from several test subjects, label the baseline-removed respiratory feature samples by setting the label of samples from low-load cognitive tasks to 0 and the label of samples from high-load cognitive tasks to 1, and construct a respiratory feature set; S402. Divide the respiratory feature set into a training set and a test set according to a preset ratio; S403. Use the recursive feature elimination algorithm to screen the labeled respiratory feature set and select the best feature subset; S404. Input the best feature subset into a random forest for training to construct a mapping between respiratory features and cognitive load level, thereby obtaining the constructed cognitive load level classification model.

5. The non-contact cognitive load assessment method based on infrared thermal imaging according to claim 1, characterized in that step S5, extracting respiratory feature samples from the test subject and inputting them into the cognitive load level classification model to assess the test subject's cognitive load level, includes the following steps: S501. Use an infrared thermal imager to capture facial video of the test subject during a cognitive task and split it into facial images; S502. Apply steps S2 and S3 to obtain the respiratory feature sample of the test subject; S503. Input the obtained respiratory feature sample into the cognitive load level classification model constructed in step S404, and determine the cognitive load level based on the label output by the classification model.