Facial expression and AI-based pain assessment model
By dynamically detecting the activity intensity of facial muscle regions and constructing a collaborative relationship matrix, combined with a pre-trained pain feature classifier, the problem of inaccurate identification of individual differences in muscle activity habits in existing technologies is solved, and efficient assessment of pain expressions is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- MEI HOSPITAL UNIV OF CHINESE ACAD OF SCI
- Filing Date
- 2026-02-10
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies cannot effectively distinguish between baseline muscle tension in an individual at rest and abnormal activity triggered by pain, and lack quantification of the dynamic interaction patterns of facial muscle regions during pain events, leading to inaccurate identification of subtle pain and individual differences in muscle activity habits.
By using a pain assessment model based on facial expressions and AI, the activity intensity of facial muscle regions is dynamically detected, a multi-region collaborative relationship matrix is constructed, and combined with a pre-trained pain feature classifier, pain probability distribution data is output.
It improves the ability and robustness to recognize complex, subtle, or transient pain expressions, and can more accurately depict the dynamic process of muscle activity, thereby enhancing the accuracy of pain assessment.
Figure CN122245798A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of intelligent medical monitoring technology, and in particular to a pain assessment model based on facial expressions and AI.
Background Technology
[0002] Currently, vision-based automatic pain assessment technologies mainly rely on the extraction and classification of overall features from static facial images or short sequences. Common methods include directly learning the mapping from facial images to pain levels through convolutional neural networks, or extracting predefined geometric and textural features of facial key points for pattern matching. These methods treat the face as a whole or a series of discrete points for feature learning, and their model decisions depend on appearance pattern associations inferred from massive amounts of data.
[0003] Such methods based on overall appearance classification have significant limitations. Facial expressions exhibit high individual variability and ambiguity: the same level of pain may produce different facial appearances, while different expressions may look very similar in the overall image, leading to insufficient model specificity and limited generalization ability. Pain, as a physiological stress response, is essentially a sequence of contractions and relaxations of specific muscle groups driven by the nervous system. This dynamic, coordinated pattern based on physiological mechanisms is difficult to fully characterize and analyze using static, holistic image features.
[0004] Current technologies lack the ability to model the origins of facial physiological activity, failing to distinguish between baseline muscle tension in a resting state and abnormal activity triggered by pain. Furthermore, they fail to effectively quantify the dynamic interactions between different muscle regions during pain events. This leads to inaccurate identification of subtle pain, transient pain expressions, and situations where individual muscle activity habits differ.
Summary of the Invention
[0005] The purpose of this invention is to address the shortcomings of existing technologies by proposing a pain assessment model based on facial expressions and AI.
[0006] To achieve the above objectives, the present invention employs the following technical solution: a pain assessment model based on facial expressions and AI, comprising: The facial data acquisition unit acquires a sequence of continuous facial images of the target object during the pain observation period from the image sensor. The region activity detection unit processes the continuous facial image sequence, dynamically detects whether the activity intensity of the facial muscle region exceeds the predefined basic activity intensity based on the pre-stored facial muscle region anatomical division, marks the facial muscle regions with abnormal activity intensity, and generates active region identification information. The activity trajectory quantization unit performs time-series tracking of the facial muscle regions with abnormal activity intensity based on the active region identification information, calculates the activity intensity vector of each facial muscle region with abnormal activity intensity at multiple consecutive time points, and calculates the feature intensity change across time points based on the activity intensity vector, thereby generating regional activity intensity evolution trajectory data. The collaborative relationship analysis unit receives the regional activity intensity evolution trajectory data, calculates the correlation of activity intensity changes among multiple different facial muscle regions with abnormal activity intensity, and constructs a multi-region collaborative relationship matrix that reflects the collaborative contraction or relaxation patterns of facial muscles. The feature fusion and determination unit fuses and encodes the regional activity intensity evolution trajectory data and the multi-regional collaborative relationship matrix, and processes them through a pre-trained pain feature classifier to output the pain probability distribution data of the target object within the current time window.
[0007] As a further aspect of the present invention, the regional activity detection unit includes: a facial muscle region segmentation subunit, which stores a partition map that divides the face into multiple independent and partially overlapping facial muscle regions, each facial muscle region being associated with the movement of a specific set of facial muscle groups; a basic activity intensity benchmark library, which stores the range of basic activity intensity corresponding to each facial muscle region in a pain-free state; an activity intensity calculation subunit, which extracts the statistical features of pixel intensity changes in each facial muscle region for each frame of the continuous facial image sequence and generates the current activity intensity value; and an abnormality marking subunit, which compares the current activity intensity value with the basic activity intensity range of the corresponding facial muscle region in the basic activity intensity benchmark library. If the current activity intensity value remains outside the basic activity intensity range for a number of consecutive frames that reaches a preset frame number threshold, the corresponding facial muscle region is marked as a facial muscle region with abnormal activity intensity, and the information of all marked facial muscle regions is summarized into the active region identification information.
[0008] As a further aspect of the present invention, the activity trajectory quantization unit includes: The region tracking subunit is used to locate the facial muscle region with abnormal activity intensity in the image sequence according to the active region identification information, and to track the positional change of the facial muscle region with abnormal activity intensity between adjacent frames using an image registration method. The intensity vector construction subunit is used to extract the texture complexity and motion amplitude measure of the tracked facial muscle region with abnormal activity intensity at each preset sampling time point, and combine the texture complexity and motion amplitude measure to form the activity intensity vector. The change calculation subunit is used to calculate the difference between the activity intensity vectors of the same facial muscle region with abnormal activity intensity at different sampling time points, and quantify the difference as the feature intensity change. The trajectory generation subunit is used to connect the characteristic intensity changes of a facial muscle region with abnormal activity intensity at all sampling time points into a curve in time sequence. The curve is used as the regional activity intensity evolution trajectory data of the facial muscle region.
[0009] As a further aspect of the present invention, the collaborative relationship analysis unit includes: The temporal alignment subunit is used to ensure that the regional activity intensity evolution trajectory data of facial muscle regions with different activity intensity anomalies used to calculate correlations are fully aligned on the time axis. The correlation calculation subunit is used to select the regional activity intensity evolution trajectory data of any two different facial muscle regions with abnormal activity intensity, and calculate the Pearson correlation coefficient or mutual information of these two trajectory data sequences as the strength value of the cooperative relationship between the two regions. The matrix construction sub-unit is used to fill the pairwise synergy strength values between all facial muscle regions with abnormal activity intensity into a symmetric matrix with region numbers as rows and columns to construct the initial synergy matrix. The matrix enhancement subunit is used to perform smoothing filtering on the initial collaborative relationship matrix and introduce a prior weight matrix based on the anatomical connectivity of facial muscles for element-wise weighting, ultimately generating the multi-region collaborative relationship matrix.
[0010] As a further aspect of the present invention, the feature fusion and determination unit includes: The trajectory data encoding subunit is used to process the regional activity intensity evolution trajectory data through a recurrent neural network and output a trajectory feature encoding vector of fixed length. The collaborative matrix encoding subunit is used to process the multi-region collaborative relationship matrix through a graph convolutional network, where the nodes of the graph correspond to facial muscle regions with abnormal activity intensity, the edge weights of the graph correspond to the collaborative relationship strength values, and output a fixed-length collaborative feature encoding vector. The feature splicing subunit is used to concatenate the trajectory feature encoding vector and the collaborative feature encoding vector into a fused feature vector; The pain classification subunit is used to input the fused feature vector into the pre-trained pain feature classifier, which is composed of fully connected layers and outputs a probability vector representing the likelihood of different pain levels, i.e., the pain probability distribution data.
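The data flow of the feature fusion and determination unit can be sketched as a forward pass. This is only an illustration under stated assumptions: the weights are random stand-ins, the recurrent network is a plain tanh recurrence, the per-region recurrent codes are reused as graph node features, mean pooling produces the fixed-length vectors, and absolute correlation values with self-loops serve as edge weights. None of these choices is specified in the source, and all function and parameter names are hypothetical.

```python
import numpy as np

def init_params(hidden=4, graph_dim=3, n_levels=4, seed=0):
    """Random stand-in weights; a real model would load trained values."""
    rng = np.random.default_rng(seed)
    return {
        "w_in": rng.normal(size=hidden),
        "w_h": rng.normal(size=(hidden, hidden)) * 0.5,
        "w_gcn": rng.normal(size=(hidden, graph_dim)),
        "w_out": rng.normal(size=(n_levels, hidden + graph_dim)),
        "b_out": np.zeros(n_levels),
    }

def rnn_encode(trajectory, w_in, w_h):
    """Plain tanh recurrence; the final hidden state is the fixed-length code."""
    h = np.zeros(w_h.shape[0])
    for x in trajectory:
        h = np.tanh(w_in * x + w_h @ h)
    return h

def gcn_encode(synergy, node_features, weight):
    """One graph-convolution layer followed by mean pooling over regions."""
    adj = np.abs(synergy) + np.eye(len(synergy))  # assumption: |weight| plus self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    adj_norm = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    hidden = np.maximum(adj_norm @ node_features @ weight, 0.0)  # ReLU
    return hidden.mean(axis=0)

def pain_distribution(trajectories, synergy, params):
    """Encode trajectories and the synergy matrix, concatenate, classify."""
    node_codes = np.stack(
        [rnn_encode(t, params["w_in"], params["w_h"]) for t in trajectories]
    )
    trajectory_code = node_codes.mean(axis=0)
    graph_code = gcn_encode(np.asarray(synergy, dtype=float), node_codes, params["w_gcn"])
    fused = np.concatenate([trajectory_code, graph_code])  # feature splicing
    logits = params["w_out"] @ fused + params["b_out"]     # fully connected layer
    exp = np.exp(logits - logits.max())                    # numerically stable softmax
    return exp / exp.sum()
```

The returned vector has one entry per pain level and sums to 1, which matches the role of the pain probability distribution data.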
[0011] As a further aspect of the present invention, the model also includes a basic activity intensity calibration unit, which is used to periodically update the basic activity intensity benchmark library in the regional activity detection unit. The basic activity intensity calibration unit receives clinically validated pain-free facial image data, then recalculates and updates the basic activity intensity range for each facial muscle region, so that the basic activity intensity benchmark library remains suitable for the target population.
[0012] As a further aspect of the present invention, the training process of the pre-trained pain feature classifier includes: Obtain a training dataset containing a large number of samples with labeled real pain levels. Each sample contains its corresponding regional activity intensity evolution trajectory data, multi-regional collaborative relationship matrix, and real pain level label. The training dataset is used to perform end-to-end joint training of the trajectory data encoding subunit, the collaborative matrix encoding subunit, and the pain classification subunit in the feature fusion and determination unit; The training objective is to minimize the cross-entropy loss between the pain probability distribution data output by the pain classification subunit and the actual pain level label.
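The training objective named above, cross-entropy between the predicted pain probability distribution and the true pain level label, can be written out directly. The sketch below only evaluates the loss for a batch. The function name and the small clamp value are illustrative assumptions, and the end-to-end joint training itself would normally rely on an automatic differentiation framework.

```python
import math

def cross_entropy(prob_rows, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true pain level.

    prob_rows: one probability vector per sample.
    labels: index of the true pain level for each sample.
    eps: clamp that avoids log(0); an illustrative choice.
    """
    total = 0.0
    for probs, label in zip(prob_rows, labels):
        total -= math.log(max(probs[label], eps))
    return total / len(labels)
```

A uniform prediction over two levels gives a loss of ln 2, and a prediction that puts all probability on the correct level gives a loss of 0.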
[0013] As a further embodiment of the present invention, the model also includes an abnormal state feedback unit; The abnormal state feedback unit receives the pain probability distribution data output by the pain feature classifier, and the active area identification information marked by the area activity detection unit. When the pain probability distribution data shows a high pain probability, but the active area identification information shows that the number of facial muscle areas with abnormal activity intensity is less than a preset threshold, the abnormal state feedback unit generates a model calibration signal. The model calibration signal is used to trigger the basic activity intensity calibration unit to start a targeted calibration process, or to increase the weight of the corresponding samples during the training process of the pre-trained pain feature classifier.
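The triggering rule of the abnormal state feedback unit can be sketched as a simple predicate. The probability threshold, the minimum region count, and the choice of which pain levels count as "high" are not given in the source; the values and names below are placeholders.

```python
def calibration_signal(pain_probs, active_regions, high_levels,
                       prob_threshold=0.7, min_regions=2):
    """Return True when a model calibration signal should be generated.

    pain_probs: probability per pain level from the classifier.
    active_regions: identifiers of regions marked as abnormally active.
    high_levels: indices of the levels treated as high pain (assumption).
    """
    high_pain_prob = sum(pain_probs[i] for i in high_levels)
    return high_pain_prob >= prob_threshold and len(active_regions) < min_regions
```

The signal fires only when the classifier reports a high pain probability while few facial muscle regions were marked as abnormal, which is the inconsistency the unit is meant to catch.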
[0014] As a further aspect of the present invention, the image registration method used by the region tracking subunit is a feature-point-based optical flow method, specifically: corner features are detected within the facial muscle region with abnormal activity intensity; the sparse optical flow field of the corner features is calculated between consecutive image frames; and the statistical characteristics of the sparse optical flow field are used to determine the estimated location of the facial muscle region with abnormal activity intensity in the next frame, thus completing region tracking.
[0015] As a further aspect of the present invention, the prior weight matrix based on facial muscle anatomical connectivity used in the matrix enhancement subunit is constructed in the following manner: From knowledge of facial muscle anatomy, obtain prior information about whether there is a direct fascial connection or synergistic movement between any two facial muscles; For regions corresponding to two muscles that have direct anatomical connections or known strong synergistic relationships, set high weight values at the corresponding prior weight matrix element positions. For regions without direct anatomical connections, a lower weight value is set; The prior weight matrix is multiplied element-wise with the initial collaborative relationship matrix to achieve enhanced collaborative relationships based on anatomical knowledge.
[0016] Compared with the prior art, the advantages and positive effects of the present invention are as follows: Based on pre-stored anatomical divisions of facial muscle regions, this technique dynamically detects whether the activity intensity of facial muscle regions exceeds a predefined baseline. It anchors facial expression analysis to specific physiological structural units, establishing a personalized muscle activity baseline for each target individual. By identifying and labeling "abnormally active" muscle regions that deviate from their baseline, the focus of analysis shifts from a constantly changing visual image to biologically significant signal events that are closer to the physiological source of pain. This process effectively removes noise from individual static facial features and habitual expressions, directly capturing abnormal activation events of local muscle groups that may be triggered by pain stimuli, providing a clean and interpretable initial physiological activity identifier for subsequent analysis.
[0017] Temporal tracking is performed on identified facial muscle regions exhibiting abnormal activity intensity, calculating their activity intensity vectors and feature intensity changes across time points to generate regional activity intensity evolution trajectory data. Furthermore, the correlation of activity intensity changes among different abnormal regions is calculated, constructing a multi-regional synergistic relationship matrix. This technique not only records the intensity of muscle activity but also precisely characterizes its dynamic evolution over time. Based on this, correlation analysis reveals the synergistic or antagonistic relationships between multiple abnormal muscle regions in terms of activity timing, thereby abstracting higher-order dynamic features describing the collaborative working patterns of muscle groups. This level of analysis, moving from single-point anomaly detection to dynamic analysis of multi-unit collaborative networks, can capture typical, time-dependent muscle contraction patterns in pain expression, enhancing the model's ability and robustness in recognizing complex, subtle, or transient pain expressions.
Attached Figure Description
[0018] Figure 1 is a timeline diagram of the pain assessment model based on facial expressions and AI described in this invention.
Figure 2 is a temporal variation diagram of the frontalis muscle activity intensity vector.
Figure 3 is a flowchart illustrating the operation of the collaborative relationship analysis unit.
Figure 4 is a heatmap of facial region synergistic relationships.
Figure 5 is a graph showing the change in loss of the classifier in the pain assessment model.
Detailed Implementation
[0019] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.
[0020] In the description of this invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating orientation or positional relationships, are based on the orientation or positional relationships shown in the accompanying drawings and are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of the invention. Furthermore, in the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.
[0021] Referring to Figure 1, the pain assessment model based on facial expression and AI includes a facial data acquisition unit, a region activity detection unit, an activity trajectory quantification unit, a collaborative relationship analysis unit, and a feature fusion and judgment unit. The facial data acquisition unit acquires a continuous sequence of facial images of the target object during the pain observation period from an image sensor. The region activity detection unit processes the continuous facial image sequence, dynamically detects whether the activity intensity of facial muscle regions exceeds a predefined baseline activity intensity based on pre-stored anatomical divisions of facial muscle regions, and marks facial muscle regions with abnormal activity intensity, generating active region identification information. The activity trajectory quantification unit performs temporal tracking of facial muscle regions with abnormal activity intensity based on the active region identification information. It calculates the activity intensity vector of each facial muscle region with abnormal activity intensity at multiple consecutive time points, and calculates the feature intensity change across time points based on the activity intensity vector, thereby generating regional activity intensity evolution trajectory data. The collaborative relationship analysis unit receives the regional activity intensity evolution trajectory data, calculates the correlation of activity intensity changes between multiple different facial muscle regions with abnormal activity intensity, and constructs a multi-region collaborative relationship matrix reflecting the collaborative contraction or relaxation pattern of facial muscles. The feature fusion and judgment unit fuses and encodes the regional activity intensity evolution trajectory data and the multi-region collaborative relationship matrix, and processes them through a pre-trained pain feature classifier to output the pain probability distribution data of the target object within the current time window.
[0022] In one embodiment of the present invention, the partition map stored in the facial muscle region segmentation subunit is constructed on the anatomical basis of the Facial Action Coding System. The partition map divides the face into multiple independent and partially overlapping facial muscle regions, such as the glabella, periorbital region, and nasolabial fold region. Each facial muscle region is associated with the movement of a specific set of facial muscle groups; for example, the periorbital region is mainly associated with the activity of the orbicularis oculi muscle. The basic activity intensity benchmark library is established by collecting facial image data of the target population in resting and neutral emotional states. The basic activity intensity benchmark library stores the basic activity intensity range corresponding to each facial muscle region in a pain-free state, and this range is expressed in terms of a mean $\mu_R$ and a standard deviation $\sigma_R$, as formally defined below. The activity intensity calculation subunit processes each frame of the continuous facial image sequence and extracts statistical features of pixel intensity changes in each facial muscle region. These features include the variance of pixel grayscale values and the sum of gradient magnitudes within the region, from which the current activity intensity value is generated. The current activity intensity value $A(R)$ is calculated by the following formula:
$$A(R) = \alpha \cdot \mathrm{Var}(R) + \beta \cdot G(R)$$
where $A(R)$ is the current activity intensity value; $R$ is the facial muscle region being calculated; $\mathrm{Var}(R)$ is the variance of pixel grayscale values in region $R$; $G(R)$ is the sum of the gradient magnitudes of all pixels within region $R$; and $\alpha$ and $\beta$ are preset weighting coefficients. In some embodiments, the abnormality marking subunit compares the current activity intensity value $A(R)$ generated by the activity intensity calculation subunit with the basic activity intensity range of the corresponding facial muscle region in the basic activity intensity benchmark library. The basic activity intensity range is expressed as an interval:
$$[\mu_R - k\sigma_R,\ \mu_R + k\sigma_R]$$
where $k$ is a preset scale constant. The abnormality marking subunit is configured with a preset frame number threshold $N_{th}$. If the current activity intensity value $A(R)$ remains outside the basic activity intensity range for a number of consecutive frames that reaches the preset frame number threshold $N_{th}$, the corresponding facial muscle region is marked as a facial muscle region with abnormal activity intensity. The abnormality marking subunit summarizes the information of all marked facial muscle regions. The summarized information includes the anatomical number of the facial muscle region with abnormal activity intensity, the start time of the abnormal state, and the average offset of the activity intensity, ultimately forming the active region identification information. Optionally, the basic activity intensity benchmark library can be subdivided into multiple sub-libraries according to different demographic characteristics. For example, independent benchmark libraries can be established for different age groups. Before the comparison, the corresponding basic activity intensity sub-library is selected based on the input target object attributes. Optionally, the preset frame number threshold $N_{th}$ is not a fixed value but can be dynamically adjusted based on the frame rate F of the continuous facial image sequence: the higher the frame rate F, the larger the frame number threshold $N_{th}$, so that the judgment of the temporal continuity of anomaly detection has a consistent physical time scale.
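As a rough illustration of the computation described in this paragraph, the sketch below forms an activity intensity value as a weighted combination of grayscale variance and summed gradient magnitude, then checks it against a mean-and-standard-deviation interval. The default weights, the scale constant, and all function names are assumptions for illustration; the source does not give numeric values.

```python
import numpy as np

def activity_intensity(region, alpha=1.0, beta=0.01):
    """Weighted sum of grayscale variance and total gradient magnitude.

    region: 2-D array of grayscale pixels for one facial muscle region.
    alpha, beta: preset weighting coefficients (placeholder values).
    """
    pixels = np.asarray(region, dtype=np.float64)
    variance = pixels.var()
    grad_rows, grad_cols = np.gradient(pixels)       # finite differences per axis
    gradient_sum = np.hypot(grad_rows, grad_cols).sum()
    return float(alpha * variance + beta * gradient_sum)

def baseline_interval(mean, std, k=2.0):
    """Basic activity range as mean plus or minus k standard deviations."""
    return mean - k * std, mean + k * std

def outside_baseline(value, mean, std, k=2.0):
    """True when the value falls outside the basic activity range."""
    low, high = baseline_interval(mean, std, k)
    return value < low or value > high
```

A perfectly uniform region yields an intensity of zero, while any texture or edge inside the region raises both terms.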
[0023] It can be understood that the "partial overlap" design of the partition map in the facial muscle region segmentation subunit reflects the mutual coverage and functional linkage of facial muscle groups in anatomical structure. This design allows the activity intensity calculation subunit to better capture the coordinated activity signals at the boundaries of muscle groups when extracting features. Similarly, the abnormality marking subunit uses a "sustained exceedance" judgment logic rather than single-frame judgment. This helps filter out instantaneous intensity changes caused by brief facial twitches or image noise, thereby improving the detection specificity for persistent muscle tension activities related to pain.
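The "sustained exceedance" logic, together with a frame number threshold that scales with frame rate, can be sketched as follows. The minimum duration of 0.5 seconds is an arbitrary placeholder, and the function names are hypothetical.

```python
def frame_threshold(frame_rate_hz, min_duration_s=0.5):
    """Frames needed so the persistence window spans a fixed physical time."""
    return max(1, round(frame_rate_hz * min_duration_s))

def first_sustained_exceedance(values, low, high, n_frames):
    """Index of the first frame of a run of at least n_frames consecutive
    out-of-range values, or None if no such run exists."""
    run = 0
    for idx, value in enumerate(values):
        if value < low or value > high:
            run += 1
            if run >= n_frames:
                return idx - n_frames + 1
        else:
            run = 0  # a single in-range frame resets the count
    return None
```

Doubling the frame rate doubles the required frame count, so the same physical duration is needed before a region is marked. An isolated spike shorter than the threshold is ignored.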
[0024] In one embodiment of the present invention, the region tracking subunit locks onto the corresponding facial muscle region with abnormal activity intensity in the image sequence based on the active region identification information generated by the region activity detection unit. The region tracking subunit uses an image registration method to track the positional change of the facial muscle region with abnormal activity intensity between adjacent frames. The image registration method used is a feature-point-based optical flow method: corner features are detected within the facial muscle region with abnormal activity intensity, the sparse optical flow field of the corner features is calculated between consecutive image frames, and the estimated position of the facial muscle region with abnormal activity intensity in the next frame is determined from the statistical characteristics of the sparse optical flow field to complete the region tracking. The intensity vector construction subunit extracts the texture complexity and motion amplitude metric of the tracked facial muscle region with abnormal activity intensity at each preset sampling time point. The texture complexity is calculated based on the contrast of the gray-level co-occurrence matrix within the region, and the motion amplitude metric is calculated based on the combination of the displacement vector magnitude of the centroid of the region between adjacent sampling time points and the rate of change of the region area. The texture complexity and motion amplitude metric are combined to construct the activity intensity vector. The construction at sampling time point t is defined by the following formula:
$$V_t = [\lambda_1 T_t,\ \lambda_2 M_t]$$
where $V_t$ is the activity intensity vector at time point t; $T_t$ is the texture complexity feature value extracted at time point t; $M_t$ is the motion amplitude metric measured at time point t; and $\lambda_1$ and $\lambda_2$ are preset coefficients used for dimensional normalization.
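The two components of the activity intensity vector can be illustrated with a small NumPy sketch. It makes several assumptions that are not in the source: the co-occurrence matrix uses only horizontal neighbours and 8 gray levels, the centroid displacement and area change rate are combined by a weighted sum, and the normalization coefficients default to 1. All names are hypothetical.

```python
import numpy as np

def glcm_contrast(region, levels=8):
    """Contrast of the horizontal-neighbour gray-level co-occurrence matrix."""
    pixels = np.asarray(region, dtype=np.float64)
    quantised = np.clip((pixels / 256.0 * levels).astype(int), 0, levels - 1)
    left, right = quantised[:, :-1].ravel(), quantised[:, 1:].ravel()
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (left, right), 1.0)   # count co-occurring level pairs
    glcm /= glcm.sum()                    # normalise to joint probabilities
    i, j = np.indices(glcm.shape)
    return float(((i - j) ** 2 * glcm).sum())

def motion_amplitude(centroid_prev, centroid_curr, area_prev, area_curr, w_area=1.0):
    """Centroid displacement magnitude plus weighted relative area change."""
    displacement = float(np.hypot(centroid_curr[0] - centroid_prev[0],
                                  centroid_curr[1] - centroid_prev[1]))
    area_rate = abs(area_curr - area_prev) / area_prev
    return displacement + w_area * area_rate

def activity_vector(texture, motion, scale_texture=1.0, scale_motion=1.0):
    """Two-component vector with per-component normalization coefficients."""
    return np.array([scale_texture * texture, scale_motion * motion])
```

A uniform patch has zero contrast, and strongly alternating columns give a large contrast value, which is the behaviour expected of a texture complexity measure.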
[0025] In some embodiments, the variation calculation subunit calculates the difference in activity intensity vectors between different sampling time points for the same facial muscle region with abnormal activity intensity. The variation calculation subunit uses Euclidean distance or cosine distance to measure the difference between the activity intensity vectors at consecutive time points, $V_t$ and $V_{t+1}$, and the calculated distance is taken as the feature intensity change. In some embodiments, the trajectory generation subunit connects the feature intensity changes of a facial muscle region with abnormal activity intensity at all sampling time points into a curve in temporal sequence. This curve serves as the regional activity intensity evolution trajectory data of the facial muscle region, which is mathematically represented as a discrete time series. Optionally, when calculating the sparse optical flow field, the region tracking subunit can employ the pyramid Lucas-Kanade optical flow method to improve the tracking robustness for larger muscle movements, and the Shi-Tomasi corner detection algorithm can be used for corner feature detection. Optionally, the motion amplitude measure in the intensity vector construction subunit can be further refined into a weighted sum of translational and deformation components. The translational component is determined by the displacement of the region's centroid, and the deformation component is determined by the rate of change of the region's axial length in the main motion direction.
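The step from activity intensity vectors to a trajectory can be shown in a few lines: each trajectory sample is the distance between the vectors at two consecutive sampling time points. The function names are illustrative only.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two activity intensity vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """One minus cosine similarity; returns 0.0 when a vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 0.0

def evolution_trajectory(vectors, distance=euclidean_distance):
    """Feature intensity changes between consecutive sampling time points."""
    return [distance(a, b) for a, b in zip(vectors, vectors[1:])]
```

For n sampled vectors the trajectory has n - 1 entries, one per pair of adjacent time points.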
[0026] It can be understood that the region tracking subunit uses a sparse optical flow method based on feature points instead of a dense optical flow method. This choice significantly reduces computational complexity while ensuring stable position tracking of facial muscle regions with abnormal activity intensity, which is beneficial for real-time model processing. It can also be understood that the feature intensity change is calculated from the activity intensity vector rather than from the original image intensity, which allows the regional activity intensity evolution trajectory data to represent the dynamic patterns of muscle activity more abstractly and to provide more discriminative temporal features for the subsequent collaborative relationship analysis unit.
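One way to turn a sparse optical flow field into a region position estimate is to shift the region by a robust statistic of the corner displacements. The sketch below uses the median, which is an assumption: the source only refers to "statistical characteristics". The matched corner positions are taken as given. In practice they could come from a corner detector and a pyramidal Lucas-Kanade tracker, for example OpenCV's `goodFeaturesToTrack` and `calcOpticalFlowPyrLK`.

```python
from statistics import median

def track_region(box, points_prev, points_next):
    """Shift a bounding box (x, y, w, h) by the median corner displacement.

    points_prev, points_next: matched (x, y) corner positions in two
    consecutive frames, in the same order.
    """
    dx = median(n[0] - p[0] for p, n in zip(points_prev, points_next))
    dy = median(n[1] - p[1] for p, n in zip(points_prev, points_next))
    x, y, w, h = box
    return (x + dx, y + dy, w, h)
```

The median keeps a single badly tracked corner from dragging the region away, which is useful when only a handful of sparse points are available.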
[0027] Figure 2 is a temporal variation graph of the frontalis muscle activity intensity vector, a core visualization result of the "muscle activity tracking phase" in the pain assessment model. The graph shows the dynamic response pattern of the frontalis muscle under painful stimuli and supports the pain assessment model in the following ways. It quantifies the temporal characteristics of muscle activity, providing basic data for subsequent calculations of feature intensity changes. It intuitively reflects the correlation between painful stimuli and muscle activity, helping to verify the effectiveness of the "active area tracking" in the model. Combined with the activity trajectories of other facial muscle regions, it allows further analysis of multi-regional coordinated contraction patterns, improving the accuracy of pain assessment. The peak value of the curve is highly correlated with the intensity of the painful stimulus, which can be used to establish the correspondence between muscle activity intensity and pain level, assisting in the calibration of clinical pain assessment.
[0028] In one embodiment of the present invention, see [reference] Figure 3In specific implementation, the temporal alignment subunit receives multiple regional activity intensity evolution trajectory data from the activity trajectory quantization unit. The temporal alignment subunit ensures that the regional activity intensity evolution trajectory data of different facial muscle regions with abnormal activity intensities used for correlation calculation are fully aligned on the time axis. The alignment operation uses a timestamp-based linear interpolation method to uniformly resample trajectory data sequences of inconsistent lengths to the same number of data points with aligned time points. The correlation calculation subunit selects regional activity intensity evolution trajectory data from any two different facial muscle regions with abnormal activity intensities. The correlation calculation subunit calculates the Pearson correlation coefficient or mutual information of these two trajectory data sequences as the strength value of the cooperative relationship between the two regions. For the trajectory data sequences of region i and region j... and The formula for calculating its Pearson correlation coefficient ρ_{ij} is: in: and These represent the changes in feature intensity for regions i and j at the k-th alignment time point, respectively. and These represent the average values of the two trajectory data. This represents the length of the aligned trajectory data sequence, calculated as follows. This is used as the strength value of the synergistic relationship. In some embodiments, the matrix construction subunit fills the synergistic relationship strength values between all pairs of facial muscle regions with abnormal activity intensity into a symmetric matrix. 
The row and column indices of the matrix correspond to the numbers of the facial muscle regions with abnormal activity intensity, and the element in the i-th row and j-th column of the matrix is the synergistic relationship strength value between region i and region j. This constructs an initial cooperative relationship matrix. In some embodiments, the matrix enhancement subunit performs smoothing filtering on the initial cooperative relationship matrix. The smoothing filtering process uses a moving average window to perform convolution operation on each row or column of the matrix to suppress the influence of random noise that may exist in the trajectory data on the correlation calculation.
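The alignment, correlation, and matrix construction steps described in this embodiment can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names (`resample`, `pearson`, `synergy_matrix`), the default of 50 grid points, the unit diagonal, and the choice to return 0 for a constant trajectory are all assumptions made for the example. It also assumes strictly increasing timestamps, overlapping observation windows, and at least two grid points.

```python
from bisect import bisect_right
from math import sqrt


def resample(times, values, grid):
    """Linearly interpolate (times, values) onto a shared time grid."""
    out = []
    for t in grid:
        # Clamp at the end points so nothing is extrapolated.
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            hi = bisect_right(times, t)   # first timestamp strictly after t
            lo = hi - 1
            frac = (t - times[lo]) / (times[hi] - times[lo])
            out.append(values[lo] + frac * (values[hi] - values[lo]))
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat trajectory contributes no synergy
    return cov / sqrt(var_x * var_y)


def synergy_matrix(trajectories, n_points=50):
    """Build the symmetric initial matrix from {region: (times, values)}."""
    regions = sorted(trajectories)
    # Shared grid over the time span that every trajectory covers.
    start = max(trajectories[r][0][0] for r in regions)
    end = min(trajectories[r][0][-1] for r in regions)
    step = (end - start) / (n_points - 1)
    grid = [start + k * step for k in range(n_points)]
    aligned = [resample(trajectories[r][0], trajectories[r][1], grid)
               for r in regions]
    size = len(regions)
    # Assumption: the diagonal (a region with itself) is set to 1.
    matrix = [[1.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = pearson(aligned[i], aligned[j])
    return regions, matrix
```

Because the matrix is filled once per unordered pair and mirrored, symmetry holds by construction. Mutual information, which the text names as an alternative strength measure, is not covered by this sketch.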
[0029] Optionally, the matrix enhancement subunit introduces a prior weight matrix based on facial muscle anatomical connectivity and applies it by element-wise weighting. This prior weight matrix is constructed as follows: prior information regarding the existence of direct fascial connections or synergistic movements between any two facial muscles is obtained from facial muscle anatomy. For regions corresponding to two muscles with direct anatomical connections or known strong synergistic movements, higher weight values are assigned to their corresponding elements in the prior weight matrix; for regions without direct anatomical connections, lower weight values are assigned. Optionally, the prior weight matrix is multiplied element-wise with the initial synergistic relationship matrix after smoothing filtering to achieve synergistic relationship enhancement based on anatomical knowledge, ultimately generating the multi-region synergistic relationship matrix. The element in the i-th row and j-th column of the multi-region synergistic relationship matrix is obtained using the formula:

$$M_{ij} = W_{ij} \cdot \tilde{S}_{ij}$$

where $\tilde{S}_{ij}$ represents the element of the smoothed initial synergistic relationship matrix, $W_{ij}$ represents the weight value at the corresponding position in the prior weight matrix, and $M_{ij}$ represents the element of the final multi-region synergistic relationship matrix. It can be understood that the temporal alignment subunit is necessary because the start time and sampling length of the regional activity intensity evolution trajectory data of different facial muscle regions with abnormal activity intensity may differ, since the regions are marked at different times; a unified timeline is a prerequisite for meaningful correlation calculations. It can also be understood that the introduction of the prior weight matrix in the matrix enhancement subunit integrates anatomical prior knowledge into the data-driven synergy calculation.
This helps enhance the model's ability to recognize pain-related muscle synergy patterns with a clear anatomical basis (such as the interaction between the frontalis and orbicularis oculi muscles in frowning), while suppressing accidental synergy signals that are anatomically unsupported.
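The two enhancement steps described above, moving-average smoothing followed by element-wise weighting with the anatomical prior, can be illustrated with the short sketch below. The function names, the window width of 3, and the shrunken window at the row edges are assumptions for the example; the source does not specify how the edges are treated or which weight values are used.

```python
def smooth_rows(matrix, window=3):
    """Moving-average filter along each row; edges use a shrunken window."""
    half = window // 2
    smoothed = []
    for row in matrix:
        new_row = []
        for j in range(len(row)):
            lo, hi = max(0, j - half), min(len(row), j + half + 1)
            new_row.append(sum(row[lo:hi]) / (hi - lo))
        smoothed.append(new_row)
    return smoothed


def apply_prior(smoothed, prior):
    """Element-wise product of the smoothed matrix and the prior weights."""
    return [[s * w for s, w in zip(s_row, w_row)]
            for s_row, w_row in zip(smoothed, prior)]
```

Note that smoothing only along rows does not in general preserve the symmetry of the matrix. If symmetry is required downstream, one option (not stated in the source) is to average the result with its transpose.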
[0030] In one embodiment of the present invention, the feature fusion and determination unit includes a trajectory data encoding subunit, a collaborative matrix encoding subunit, a feature splicing subunit, and a pain classification subunit. The trajectory data encoding subunit processes the regional activity intensity evolution trajectory data through a recurrent neural network. The recurrent neural network adopts a long short-term memory network structure; its input is the time series of feature intensity changes of each facial muscle region with abnormal activity intensity, and its output is a fixed-length trajectory feature encoding vector. The collaborative matrix encoding subunit processes the multi-region collaborative relationship matrix through a graph convolutional network. The graph is constructed using the facial muscle regions with abnormal activity intensity as nodes and the collaborative relationship strength values in the multi-region collaborative relationship matrix as edge weights between nodes. After multiple convolution and pooling operations, the graph convolutional network outputs a fixed-length collaborative feature encoding vector. The feature splicing subunit concatenates the trajectory feature encoding vector and the collaborative feature encoding vector directly along the feature dimension to form a fused feature vector. The pain classification subunit inputs the fused feature vector into the pre-trained pain feature classifier. The pre-trained pain feature classifier consists of multiple fully connected layers. The final output layer uses the Softmax activation function to output a probability vector representing the likelihood of different pain levels, i.e., the pain probability distribution data.
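The splicing and classification steps can be sketched as below. This is an illustration under simplifying assumptions: a single fully connected layer stands in for the "multiple fully connected layers" of the classifier, and the names `dense`, `softmax`, and `classify` as well as the weight layout are hypothetical.

```python
from math import exp


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(vector, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def classify(trajectory_code, synergy_code, weights, bias):
    """Concatenate both encodings and map them to pain-level probabilities."""
    fused = trajectory_code + synergy_code  # list concatenation = feature splicing
    return softmax(dense(fused, weights, bias))
```

With all weights and biases set to zero, every pain level receives the same probability, which is a convenient sanity check before any training has taken place.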
[0031] In some embodiments, the training process of the pre-trained pain feature classifier begins with acquiring a training dataset. This training dataset contains a large number of samples labeled with actual pain levels. Each sample includes its corresponding regional activity intensity evolution trajectory data, a multi-regional collaborative relationship matrix, and an actual pain level label. See Table 1 for the sample structure of the training dataset.
[0032] Table 1: Sample Structure of the Training Dataset
In some embodiments, the trajectory data encoding subunit, the collaborative matrix encoding subunit, and the pain classification subunit in the feature fusion and determination unit are jointly trained end-to-end using the training dataset. The training process employs a backpropagation algorithm to optimize the network parameters. The training objective is to minimize the cross-entropy loss between the pain probability distribution data output by the pain classification subunit and the true pain level label. The cross-entropy loss function L is calculated using the following formula:

$$L = -\sum_{c=1}^{C} y_c \log(p_c)$$

where L represents the cross-entropy loss value, C represents the total number of pain level categories, $y_c$ represents the one-hot encoded value of the true pain level label for category c, and $p_c$ represents the probability value corresponding to category c in the pain probability distribution data output by the pain classification subunit. Optionally, the recurrent neural network used in the trajectory data encoding subunit can employ a bidirectional long short-term memory network to capture both the forward and backward temporal dependencies of the regional activity intensity evolution trajectory data. Optionally, the number of layers of the graph convolutional network in the collaborative matrix encoding subunit can be adjusted dynamically based on the number of facial muscle regions with abnormal activity intensity; a deeper graph convolutional network is used when there are many regions, to enhance feature extraction capability. It can be understood that the feature splicing subunit concatenates the trajectory feature encoding vector and the collaborative feature encoding vector.
This fusion method allows the pain feature classifier to use the temporal dynamic information and the spatial collaborative information of facial muscle activity at the same time, thereby representing the multi-dimensional features of pain expressions comprehensively. It can be understood that using the cross-entropy loss function as the optimization objective during training directly drives the pain probability distribution data output by the pain classification subunit toward the distribution of the true pain level label, which supports the reliability of the model's classification results.
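As a small worked illustration of the training objective, the cross-entropy between a one-hot pain level label and a predicted probability vector can be computed as follows. The helper names and the `eps` guard against `log(0)` are assumptions added for the example.

```python
from math import log


def one_hot(level, num_levels):
    """Encode a pain level index as a one-hot list."""
    return [1.0 if c == level else 0.0 for c in range(num_levels)]


def cross_entropy(label, probs, eps=1e-12):
    """L = -sum_c y_c * log(p_c); eps avoids taking log(0)."""
    return -sum(y * log(max(p, eps)) for y, p in zip(label, probs))
```

For a three-level scale with true level 1 and predicted probabilities (0.2, 0.5, 0.3), only the term for the true level survives, so the loss equals -log(0.5), roughly 0.693.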
[0033] In its implementation, the trajectory data encoding subunit employs a Long Short-Term Memory (LSTM) network structure to process the regional activity intensity evolution trajectory data. The implementation of this network involves multiple steps. The regional activity intensity evolution trajectory data, as input, is organized into a sequence of T time steps. Each time step corresponds to the feature intensity change at one sampling point; this change is a scalar value. The LSTM network design includes an input gate, a forget gate, an output gate, and a cell state. The input gate calculates the weight of new information using a sigmoid function. The forget gate determines the proportion of information retained from the cell state of the previous time step. The output gate controls the output of the current hidden state. The network is typically configured as a bidirectional structure to capture both forward and backward temporal dependencies. The number of hidden units is set to 64 or 128 depending on the complexity of the input sequence and the available computational resources. The number of network layers is set to two to balance expressive power and training efficiency. The hidden state is passed sequentially from one time step to the next, and the hidden state at the last time step summarizes the sequence. This final hidden state passes through a fully connected layer with 256 neurons, which outputs a fixed-length trajectory feature encoding vector. During training, the network parameters are updated using the Adam optimizer and the backpropagation algorithm. Gradients flow back from the cross-entropy loss function of the pain classification subunit to achieve end-to-end learning.
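To make the gate mechanics concrete, the sketch below runs a single-layer LSTM over a scalar trajectory in plain Python, in both directions. It is a forward-pass illustration only: the two-layer stacking, the 256-neuron output layer, and Adam training described above are omitted, and the parameter layout, random initialization, and function names are assumptions made for the example.

```python
import random
from math import tanh


def sigmoid(z):
    """Logistic function written via tanh so large |z| cannot overflow."""
    return 0.5 * (1.0 + tanh(0.5 * z))


def init_params(hidden, seed=0):
    """Small random weights for input (i), forget (f), output (o), candidate (g)."""
    rng = random.Random(seed)

    def gate():
        return {
            "w_x": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
            "w_h": [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                    for _ in range(hidden)],
            "b": [0.0] * hidden,
        }

    return {name: gate() for name in ("i", "f", "o", "g")}


def preactivation(p, x, h):
    """w_x * x + w_h @ h + b for one gate; x is the scalar input."""
    return [p["w_x"][u] * x
            + sum(w * hv for w, hv in zip(p["w_h"][u], h))
            + p["b"][u]
            for u in range(len(h))]


def lstm_step(params, x, h, c):
    """One time step: update the cell state c and the hidden state h."""
    i = [sigmoid(z) for z in preactivation(params["i"], x, h)]  # new information
    f = [sigmoid(z) for z in preactivation(params["f"], x, h)]  # retained share
    o = [sigmoid(z) for z in preactivation(params["o"], x, h)]  # exposed share
    g = [tanh(z) for z in preactivation(params["g"], x, h)]     # candidate values
    c_new = [fu * cu + iu * gu for fu, cu, iu, gu in zip(f, c, i, g)]
    h_new = [ou * tanh(cu) for ou, cu in zip(o, c_new)]
    return h_new, c_new


def encode(params, sequence, hidden):
    """Run the sequence through the cell and return the last hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
    return h


def encode_bidirectional(forward, backward, sequence, hidden):
    """Concatenate the final states of a forward and a reversed pass."""
    return encode(forward, sequence, hidden) + encode(backward, sequence[::-1], hidden)
```

In practice a deep learning framework would supply this layer together with gradient computation; the point of the sketch is only to show how the forget gate scales the previous cell state and how the output gate scales what is exposed as the hidden state.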
[0034] The collaborative matrix encoding subunit uses a graph convolutional network to process the multi-region collaborative relationship matrix. The construction of the graph convolutional network begins with defining a graph structure, where nodes correspond to facial muscle regions with abnormal activity intensity. The initial features of each node are extracted from the regional activity intensity evolution trajectory data of the corresponding region, and its mean, standard deviation, and maximum value are calculated to form a feature vector. The edge weights are directly taken from the collaborative relationship strength values in the multi-region collaborative relationship matrix. The adjacency matrix is thus generated and undergoes symmetric normalization to stabilize the training process. The graph convolutional layer performs feature propagation operations, and the features of each node are weighted and aggregated with the features of its neighboring nodes. The weights are determined by the edge weights. After aggregation, the features are passed through a linear transformation layer and the ReLU activation function is applied. The network contains two to three graph convolutional layers, each followed by a graph pooling layer. The graph pooling layer adopts a node importance-based selection method, retaining key nodes according to the node feature norm or degree centrality, gradually compressing the graph size. Finally, global average pooling is used to fuse all node features into a global vector. This vector is input into a fully connected layer and outputs a 256-dimensional collaborative feature encoding vector. During training, the adjacency matrix remains fixed, and only the network weights and node features are updated. The optimization process is performed synchronously with the trajectory encoding network.
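The node features, symmetric normalization, and feature propagation described in this paragraph can be sketched as follows. Several choices here are assumptions rather than details from the source: self-loops are added before normalization, absolute values of the edge weights are used (Pearson coefficients can be negative, which would otherwise make the degree normalization ill-defined), the population standard deviation is used for the node features, and the graph pooling layers are left out. All function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def node_features(trajectory):
    """Initial node feature vector: mean, standard deviation, maximum."""
    return [fmean(trajectory), pstdev(trajectory), max(trajectory)]


def normalize_adjacency(weights):
    """Return D^{-1/2} (|A| + I) D^{-1/2} for a square weight matrix."""
    n = len(weights)
    # Assumption: absolute off-diagonal weights plus a unit self-loop per node.
    a = [[1.0 if i == j else abs(weights[i][j]) for j in range(n)]
         for i in range(n)]
    degree = [sum(row) for row in a]  # at least 1 because of the self-loop
    return [[a[i][j] / sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(norm_adj, features, weight):
    """One propagation step: ReLU(norm_adj @ features @ weight)."""
    n, d_in, d_out = len(features), len(weight), len(weight[0])
    aggregated = [[sum(norm_adj[i][k] * features[k][d] for k in range(n))
                   for d in range(d_in)]
                  for i in range(n)]
    return [[max(0.0, sum(aggregated[i][d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]


def global_average_pool(features):
    """Average every feature column over all nodes."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]
```

Stacking two or three calls to `gcn_layer` and finishing with `global_average_pool` yields a single vector per graph, which would then feed the fully connected layer that produces the 256-dimensional encoding.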
[0035] See Figure 4, which is a heatmap of facial region coordination relationships, a core visualization result of the "Coordination Relationship Analysis Unit" in the pain assessment model. It shows the strength of muscle activity coordination between different facial regions during pain responses. This map provides quantitative evidence of multi-region coordination for the pain assessment model and can be input directly into the "Coordination Matrix Encoding Subunit", where a graph convolutional network extracts high-order features to improve the accuracy of pain classification. It reveals the patterns of facial muscle linkage in pain expressions, which can help clinicians understand how pain manifests in different patients, especially in individuals who cannot express pain themselves (such as comatose patients and infants). It also supports the effectiveness of the Pearson correlation coefficient calculation in the "Coordination Relationship Analysis Unit" and the rationality of the enhancement based on the anatomical prior weight matrix.
[0036] In one embodiment of the present invention, the pain assessment model based on facial expression and AI further includes a basic activity intensity calibration unit. This unit periodically updates the basic activity intensity benchmark library in the regional activity detection unit. The basic activity intensity calibration unit receives clinically confirmed pain-free facial image data and recalculates and updates the basic activity intensity range for each facial muscle region. The basic activity intensity range is updated by statistically analyzing the distribution of activity intensity values in each facial muscle region using the newly collected pain-free facial image data. The updated basic activity intensity range is defined by a new mean $\mu$ and standard deviation $\sigma$ and stored in interval form, the interval being calculated using the following formula:

$$[\mu - k\sigma,\ \mu + k\sigma]$$

where $k$ is a preset scaling factor used to control the tolerance of the basic activity intensity range relative to the data distribution. In some embodiments, the model further includes an abnormal state feedback unit, which receives the pain probability distribution data output by the pain feature classifier in the feature fusion and determination unit and, at the same time, the active region identification information marked by the region activity detection unit. A pain probability threshold and a threshold for the number of abnormal activity intensity areas are preset within the abnormal state feedback unit. When the probability value corresponding to the highest pain level in the pain probability distribution data exceeds the pain probability threshold, and the number of facial muscle regions with abnormal activity intensity indicated by the active region identification information is less than the threshold for the number of abnormal activity intensity areas, the abnormal state feedback unit generates a model calibration signal.
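The baseline interval and the calibration trigger described in this embodiment can be illustrated with the sketch below. The default values (scaling factor 2.0, probability threshold 0.8, region threshold 2), the use of the population standard deviation, and the convention that the last entry of the probability vector is the highest pain level are assumptions for the example; the source leaves these values unspecified.

```python
from statistics import fmean, pstdev


def baseline_range(pain_free_values, k=2.0):
    """Interval [mean - k * std, mean + k * std] from pain-free intensities."""
    mu = fmean(pain_free_values)
    sigma = pstdev(pain_free_values)
    return (mu - k * sigma, mu + k * sigma)


def needs_calibration(pain_probs, n_abnormal_regions,
                      prob_threshold=0.8, region_threshold=2):
    """True when the highest pain level is likely but few regions are flagged."""
    # Assumption: pain_probs is ordered from lowest to highest pain level.
    return (pain_probs[-1] > prob_threshold
            and n_abnormal_regions < region_threshold)
```

For instance, pain-free intensities of 2.0 and 4.0 give a mean of 3.0 and a standard deviation of 1.0, so `k=2.0` yields the interval (1.0, 5.0). A prediction of 0.9 for the highest level with only one flagged region would raise the calibration signal.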
[0037] In some embodiments, the model calibration signal has two uses. One use is to trigger the basic activity intensity calibration unit to initiate a targeted calibration process, in which the unit prioritizes facial image data of the target object acquired before and after the triggering time for rapid calibration. The other use is to increase the weight of the corresponding samples during the training of the pre-trained pain feature classifier; that is, during offline iterative training of the model, when samples similar to the triggering condition are encountered, their weights in the loss function are increased accordingly. Optionally, the periodic update performed by the basic activity intensity calibration unit can be time-based, such as running automatically at fixed time intervals, or data-based, such as running automatically after a specific number of new pain-free state samples has accumulated. Optionally, the pain probability threshold and the threshold for the number of abnormal activity intensity areas used in the judgment conditions of the abnormal state feedback unit can be adjusted based on the model's performance on the validation set to balance the sensitivity and specificity of the calibration signal.
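The second use of the calibration signal, raising the loss weight of samples that resemble the triggering condition, could look like the following sketch. The boost factor of 2.0, the boolean `flagged` list, and the normalization by the sum of weights are assumptions; the source does not state how large the increase is or how the weighted loss is normalized.

```python
def weighted_mean_loss(losses, flagged, boost=2.0):
    """Average per-sample losses, counting flagged samples `boost` times as much."""
    weights = [boost if is_flagged else 1.0 for is_flagged in flagged]
    return sum(w * loss for w, loss in zip(weights, losses)) / sum(weights)
```

With per-sample losses of 1.0 and 3.0, flagging the second sample with `boost=3.0` moves the batch loss from 2.0 to 2.5, so the optimizer pays more attention to the flagged case.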
[0038] It can be understood that the basic activity intensity calibration unit enables the basic activity intensity benchmark library to adapt to changes in basic facial activity over time, both across different target populations and for the same target subject. This dynamic update mechanism helps maintain the accuracy of the regional activity detection unit in judging "abnormal activity." It can also be understood that the abnormal state feedback unit, by monitoring the logical consistency between the pain judgment result and the number of active facial regions, can proactively identify potential misjudgment patterns or performance degradation in the model and trigger a correction mechanism through the model calibration signal, thereby improving the robustness and adaptability of the entire assessment system.
[0039] See Figure 5, which is a graph of the classifier loss of the pain assessment model, showing how the model's cross-entropy loss value changes with training epochs before and after the "abnormal state feedback unit" triggers sample weight adjustment. Both curves decrease continuously, indicating that the model keeps converging during training and that the accuracy of pain level prediction gradually improves. The graph supports the calibration effect of sample weight adjustment in the "abnormal state feedback unit," indicating that increasing the loss weight of high-risk samples can optimize model performance and reduce the probability of misjudging pain levels. The convergence speed and final value of the loss curve provide a quantitative basis for the model training strategy and can guide subsequent adjustments to hyperparameters such as the number of training epochs and the learning rate, further improving model robustness. The reduction in model loss corresponds to an improvement in pain assessment accuracy, helping to reduce misjudgments of pain levels in clinical practice and to provide patients with more accurate analgesic interventions.
[0040] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention in any other way. Any person skilled in the art may make changes or modifications to the above-disclosed technical content to create equivalent embodiments that can be applied to other fields. However, any simple modifications, equivalent changes, and modifications made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the protection scope of the present invention.
Claims
1. A pain assessment model based on facial expressions and AI, characterized in that, The model includes: The facial data acquisition unit acquires a sequence of continuous facial images of the target object during the pain observation period from the image sensor. The region activity detection unit processes the continuous facial image sequence, dynamically detects whether the activity intensity of the facial muscle region exceeds the predefined basic activity intensity based on the pre-stored facial muscle region anatomical division, marks the facial muscle regions with abnormal activity intensity, and generates active region identification information. The activity trajectory quantization unit performs time-series tracking of the facial muscle regions with abnormal activity intensity based on the active region identification information, calculates the activity intensity vector of each facial muscle region with abnormal activity intensity at multiple consecutive time points, and calculates the feature intensity change across time points based on the activity intensity vector, thereby generating regional activity intensity evolution trajectory data. The collaborative relationship analysis unit receives the regional activity intensity evolution trajectory data, calculates the correlation of activity intensity changes among multiple different facial muscle regions with abnormal activity intensity, and constructs a multi-region collaborative relationship matrix that reflects the collaborative contraction or relaxation patterns of facial muscles. The feature fusion and determination unit fuses and encodes the regional activity intensity evolution trajectory data and the multi-regional collaborative relationship matrix, and processes them through a pre-trained pain feature classifier to output the pain probability distribution data of the target object within the current time window.
2. The pain assessment model based on facial expression and AI as described in claim 1, characterized in that, The regional activity detection unit includes: The facial muscle region segmentation subunit stores a partition map that divides the face into multiple independent and partially overlapping facial muscle regions, each facial muscle region being associated with the movement of a specific set of facial muscle groups. A baseline library of basic activity intensity stores the range of basic activity intensity corresponding to each facial muscle region in a pain-free state. The activity intensity calculation subunit is used to extract the statistical features of pixel intensity changes in each facial muscle region for each frame of the continuous facial image sequence and generate the current activity intensity value. An abnormality marking subunit is used to compare the current activity intensity value with the basic activity intensity range of the corresponding facial muscle region in the basic activity intensity benchmark library. If the current activity intensity value continues to exceed the basic activity intensity range to a preset frame number threshold, the corresponding facial muscle region is marked as a facial muscle region with abnormal activity intensity, and all marked facial muscle region information is summarized into the active region identification information.
3. The pain assessment model based on facial expression and AI as described in claim 2, characterized in that, The activity trajectory quantization unit includes: The region tracking subunit is used to locate the facial muscle region with abnormal activity intensity in the image sequence according to the active region identification information, and to track the positional change of the facial muscle region with abnormal activity intensity between adjacent frames using an image registration method. The intensity vector construction subunit is used to extract the texture complexity and motion amplitude measure of the tracked facial muscle region with abnormal activity intensity at each preset sampling time point, and combine the texture complexity and motion amplitude measure to form the activity intensity vector. The change calculation subunit is used to calculate the difference between the activity intensity vectors of the same facial muscle region with abnormal activity intensity at different sampling time points, and quantify the difference as the feature intensity change. The trajectory generation subunit is used to connect the characteristic intensity changes of a facial muscle region with abnormal activity intensity at all sampling time points into a curve in time sequence. The curve is used as the regional activity intensity evolution trajectory data of the facial muscle region.
4. The pain assessment model based on facial expression and AI as described in claim 3, characterized in that, The collaborative relationship analysis unit includes: The temporal alignment subunit is used to ensure that the regional activity intensity evolution trajectory data of the different facial muscle regions with abnormal activity intensity used to calculate correlations are fully aligned on the time axis. The correlation calculation subunit is used to select the regional activity intensity evolution trajectory data of any two different facial muscle regions with abnormal activity intensity, and calculate the Pearson correlation coefficient or mutual information of these two trajectory data sequences as the strength value of the collaborative relationship between the two regions. The matrix construction subunit is used to fill the pairwise collaborative relationship strength values between all facial muscle regions with abnormal activity intensity into a symmetric matrix with region numbers as rows and columns to construct the initial collaborative relationship matrix. The matrix enhancement subunit is used to perform smoothing filtering on the initial collaborative relationship matrix and introduce a prior weight matrix based on the anatomical connectivity of facial muscles for element-wise weighting, ultimately generating the multi-region collaborative relationship matrix.
5. The pain assessment model based on facial expression and AI as described in claim 4, characterized in that, The feature fusion and determination unit includes: The trajectory data encoding subunit is used to process the regional activity intensity evolution trajectory data through a recurrent neural network and output a trajectory feature encoding vector of fixed length. The collaborative matrix encoding subunit is used to process the multi-region collaborative relationship matrix through a graph convolutional network, where the nodes of the graph correspond to facial muscle regions with abnormal activity intensity, the edge weights of the graph correspond to the collaborative relationship strength values, and output a fixed-length collaborative feature encoding vector. The feature splicing subunit is used to concatenate the trajectory feature encoding vector and the collaborative feature encoding vector into a fused feature vector; The pain classification subunit is used to input the fused feature vector into the pre-trained pain feature classifier, which is composed of fully connected layers and outputs a probability vector representing the likelihood of different pain levels, i.e., the pain probability distribution data.
6. The pain assessment model based on facial expression and AI as described in claim 1, characterized in that, The model also includes: A basic activity intensity calibration unit is used to periodically update the basic activity intensity benchmark library in the regional activity detection unit; The basic activity intensity calibration unit receives clinically validated pain-free facial image data, and recalculates and updates the basic activity intensity range for each facial muscle region to ensure that the basic activity intensity benchmark library is suitable for the target population.
7. The pain assessment model based on facial expression and AI as described in claim 5, characterized in that, The training process of the pre-trained pain feature classifier includes: Obtain a training dataset containing a large number of samples with labeled real pain levels. Each sample contains its corresponding regional activity intensity evolution trajectory data, multi-regional collaborative relationship matrix, and real pain level label. The training dataset is used to perform end-to-end joint training of the trajectory data encoding subunit, the collaborative matrix encoding subunit, and the pain classification subunit in the feature fusion and determination unit; The training objective is to minimize the cross-entropy loss between the pain probability distribution data output by the pain classification subunit and the actual pain level label.
8. The pain assessment model based on facial expression and AI as described in claim 7, characterized in that, The model also includes an abnormal state feedback unit; The abnormal state feedback unit receives the pain probability distribution data output by the pain feature classifier, and the active area identification information marked by the area activity detection unit. When the pain probability distribution data shows a high pain probability, but the active area identification information shows that the number of facial muscle areas with abnormal activity intensity is less than a preset threshold, the abnormal state feedback unit generates a model calibration signal. The model calibration signal is used to trigger the basic activity intensity calibration unit to start a targeted calibration process, or to increase the weight of the corresponding samples during the training process of the pre-trained pain feature classifier.
9. The pain assessment model based on facial expression and AI as described in claim 3, characterized in that, The image registration method used by the region tracking subunit is a feature-point-based optical flow method, specifically: Corner features are detected within the facial muscle region with abnormal activity intensity; Between consecutive image frames, the sparse optical flow field of the corner features is calculated; Using the statistical characteristics of the sparse optical flow field, the location of the facial muscle region with abnormal activity intensity in the next frame is estimated, thus completing region tracking.
10. The pain assessment model based on facial expression and AI as described in claim 4, characterized in that, The prior weight matrix based on facial muscle anatomical connectivity used in the matrix enhancement subunit is constructed in the following way: From knowledge of facial muscle anatomy, obtain prior information about whether there is a direct fascial connection or synergistic movement between any two facial muscles; For regions corresponding to two muscles that have direct anatomical connections or known strong synergistic relationships, set high weight values at the corresponding prior weight matrix element positions. For regions without direct anatomical connections, a lower weight value is set; The prior weight matrix is multiplied element-wise with the initial collaborative relationship matrix to achieve enhanced collaborative relationships based on anatomical knowledge.