PPG recognition method based on biomechanical derivatives and KAN-xLSTM
By constructing a three-channel input tensor that includes hemodynamic priors and combining KAN-xLSTM with orthogonal decoupling, the method addresses three problems in PPG biometric recognition: the vector-state information bottleneck, the mismatch between the nonlinear manifold and linear activation functions, and the entanglement of motion artifacts with identity features. It thereby achieves efficient identity recognition and noise stripping.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- QILU UNIVERSITY OF TECHNOLOGY (SHANDONG ACADEMY OF SCIENCES)
- Filing Date
- 2026-04-14
- Publication Date
- 2026-06-19
AI Technical Summary
Existing PPG biometric recognition technologies suffer from problems such as vector state information bottlenecks, mismatch between nonlinear manifolds of blood vessels and linear activation functions, and entanglement between motion artifacts and identity features under motion interference and complex physiological conditions, making it difficult to improve the recognition rate.
We employ a method based on biomechanical derivatives and KAN-xLSTM. By constructing a three-channel input tensor, we distribute it to the temporal stream processing branch and the frequency stream gating branch. We use global gating weights to perform feature fusion and orthogonal decoupling, refine the identity embedding vector, and combine it with the angular margin classification head for identity recognition.
It improves the model's memory capacity for fine-grained identity features, enhances the physical interpretability and accuracy of feature extraction, achieves explicit noise removal, and strengthens recognition robustness in extreme environments.
Figure CN122004803B_ABST
Description
Technical Field
[0001] This invention relates to the field of PPG biometric recognition technology, and in particular to a PPG recognition method based on biomechanical derivatives and KAN-xLSTM.
Background Technology
[0002] Photoplethysmography (PPG)-based biometric identification technology holds significant value in wearable devices. However, in practical applications, especially under motion disturbances and complex physiological conditions, existing technologies suffer from the following shortcomings:
[0003] Firstly, existing lightweight models suffer from a bottleneck in vector state information. Current mainstream temporal models, including traditional Recurrent Neural Networks (RNNs) and the latest state-space models (such as Mamba, the technology primarily used in previous patents), face a fundamental theoretical limitation when processing long sequences: vector state compression. These models typically compress infinitely long historical information into a fixed-size vector state. This compression mechanism forces models to average historical information, resulting in the smoothing or loss of transient but highly discriminative high-frequency microscopic morphological details (such as the precise slope of the dicrotic notch and minute variations in the systolic peak). In biometric identification, these minute high-frequency details are crucial for distinguishing users with similar heart rates. Existing models, due to insufficient memory capacity, struggle to further improve recognition rates among highly similar populations.
[0004] Secondly, there is a mismatch between the nonlinear manifold of blood vessels and linear activation functions. The PPG signal is essentially a hydrodynamic waveform generated by the interaction between the heart's pumping action and the elastic recoil of blood vessels. Vascular compliance follows smooth, continuous nonlinear physical laws (stress-strain curves). Existing backbone networks (such as Convolutional Neural Networks (CNNs) and Transformers) mainly rely on piecewise linear activation functions (such as ReLU). Approximating a smooth physiological manifold with piecewise linear functions is extremely inefficient, often requiring very deep network layers to achieve high fidelity, resulting in a waste of computational resources.
[0005] Thirdly, motion artifacts and identity features are entangled. Under strong motion interference, the acquired PPG signal is a nonlinear mixture of identity essence and motion noise. Most existing methods employ end-to-end black-box learning and lack explicit mechanisms to distinguish physiological features from external interference in the signal. Because the features are so tightly coupled, the model is prone to misclassifying noise as identity features when faced with unseen noise patterns, and it lacks robustness in extreme environments.
Summary of the Invention
[0006] In view of this, the present invention provides a PPG recognition method based on biomechanical derivatives and KAN-xLSTM to improve the model's memory capacity for fine-grained identity features, enhance the physical interpretability and accuracy of feature extraction, and achieve explicit noise removal.
[0007] In a first aspect, the present invention provides a PPG recognition method based on biomechanical derivatives and KAN-xLSTM, the method comprising:
[0008] Step 1: Based on the acquired raw PPG signal, construct a three-channel input tensor that includes hemodynamic priors;
[0009] Step 2: Distribute the three-channel input tensor to the parallel time-domain stream processing branch and frequency-domain stream gating branch, and obtain the time-domain feature vector and the spectrum-based global gating weights respectively;
[0010] Step 3: Use global gating weights to perform element-wise weighting on the temporal feature vector to obtain the fused feature vector;
[0011] Step 4: Project and decompose the fused feature vector into an identity embedding vector and a noise embedding vector to purify the identity embedding vector;
[0012] Step 5: Input the purified identity embedding vector into the classification head based on angular margin, calculate the classification probability, and output the final identity recognition result.
[0013] Optionally, step 1 includes biomechanical derivative embedding and input construction:
[0014] Based on the acquired raw PPG signal, its first derivative is calculated as the velocity feature, and its second derivative is calculated as the acceleration feature. The raw PPG signal, velocity feature, and acceleration feature are concatenated in the channel dimension to construct a three-channel input tensor containing hemodynamic priors.
[0015] Based on the phase space reconstruction principle, the kinematic derivatives of the PPG signal are explicitly calculated to construct biomechanical features with translation invariance. Let the input PPG signal sequence be $X \in \mathbb{R}^{T}$, where $T$ is the number of time steps. An enhanced three-channel input tensor is constructed as follows:
[0016] Step 11, raw PPG signal $x$:
[0017] The input standardized time series is denoted as $x(t)$;
[0018] Step 12, velocity feature $x'$:
[0019] The first derivative of the raw PPG signal is calculated, which physically corresponds to the instantaneous velocity of blood flow within the blood vessel. Its expression is as follows:
[0020] $x'(t) = \frac{\mathrm{d}x(t)}{\mathrm{d}t}$;
[0021] Step 13, acceleration feature $x''$:
[0022] The second derivative of the raw PPG signal is calculated, which physically reflects the expansion and contraction capacity of the blood vessel wall under pulse wave impact. Its expression is as follows:
[0023] $x''(t) = \frac{\mathrm{d}^{2}x(t)}{\mathrm{d}t^{2}}$;
[0024] Step 14: Through the concatenation operation, a three-channel input tensor containing hemodynamic priors is obtained, the expression of which is:
[0025] $X_{in} = \mathrm{Concat}(x, x', x'') \in \mathbb{R}^{T \times 3}$.
[0026] Optionally, step 2 includes time-frequency dual-stream collaborative feature extraction:
[0027] The three-channel input tensor is distributed to two parallel branches. In the time-domain stream processing branch, the Kolmogorov-Arnold network KAN multi-scale feature extraction front-end is first used to capture waveform details at different scales. Then, it is input into the bidirectional extended long short-term memory network Bi-xLSTM module, which uses its matrix memory to perform lossless encoding of long sequence features and compresses the time series features into a time-domain feature vector through global average pooling (GAP). In the frequency-domain stream gating branch, the three-channel input tensor is subjected to Fast Fourier Transform (FFT), and global gating weights based on the spectrum are generated through the KAN projection layer and the sigmoid activation function.
[0028] The temporal stream processing branch is the core of feature extraction, used to capture fine-grained morphological features from the enhanced three-channel input tensor. It includes a KAN multi-scale feature extraction front-end and a bidirectional extended long short-term memory network module.
[0029] Step 21: KAN multi-scale feature extraction front-end;
[0030] KAN convolutional layers are used instead of linear convolutional layers; for an input vector u, a KAN convolution learns a nonlinear combination of B-spline basis functions; for a KAN convolution with kernel size k, its output y is expressed as:
[0031] $y = \sum_{j=1}^{k} \phi_j(u_j)$;
[0032] where $\phi_j(u) = \sum_i c_{j,i} B_i(u)$; $B_i$ is the i-th B-spline basis function, and $c_{j,i}$ are the corresponding learnable control coefficients;
[0033] Four parallel KAN convolutional branches are configured with kernel sizes k∈{1,3,7,11} to capture multi-scale features ranging from high-frequency noise filtering and sharp peak extraction to low-frequency waveform contour fitting. The outputs from each scale are concatenated to form a deep feature sequence $F_{ms}$;
[0034] Step 22: Bidirectional extended long short-term memory network (Bi-xLSTM) module;
[0035] The extended long short-term memory (xLSTM) network is employed, which introduces a matrix memory $C_t$ and a covariance update rule; at time step t, for input features $x_t$, the extended long short-term memory unit generates a query vector $q_t$, a key vector $k_t$, and a value vector $v_t$ through linear projection, and calculates the input gate $i_t$ and the forget gate $f_t$; exponential gating is used to support long-range gradient propagation;
[0036] The update rule for the matrix memory is as follows:
[0037] $C_t = f_t \odot C_{t-1} + i_t \odot (v_t \otimes k_t)$;
[0038] where $\odot$ denotes element-wise multiplication and $\otimes$ denotes the outer product;
[0039] A bidirectional extended long short-term memory network module is used to process the forward and reverse sequences separately. The outputs are concatenated and then passed through a global average pooling (GAP) layer to obtain the temporal feature vector $F_{time}$.
[0040] Optionally, step 2 further includes a frequency-domain stream gating branch, which uses a global spectral prior to cleanse the time-domain features:
[0041] Step 23, Fast Fourier Transform (FFT): Perform an FFT on the three-channel input tensor $X_{in}$ and take the modulus to obtain the frequency-domain amplitude spectrum $S = |\mathrm{FFT}(X_{in})|$;
[0042] Step 24, KAN projection and gating generation: The frequency-domain amplitude spectrum is mapped to the same channel dimension as the time-domain features using the KAN projection layer;
[0043] Step 25, Global gating weight generation: After passing through the Sigmoid activation function $\sigma$, spectral gating weights g in the range (0,1) are generated, representing the channel importance coefficients of different feature dimensions. The expression is:
[0044] $g = \sigma(\mathrm{KAN}(S))$.
[0045] Optionally, step 3 includes feature fusion:
[0046] The global gating weights are multiplied element-wise with the time-domain feature vector, and the product is concatenated with the frequency-domain feature vector projected by KAN to obtain the final fused feature vector, which is expressed as follows:
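[0047] $F_{fused} = \mathrm{Concat}(g \odot F_{time}, \mathrm{KAN}(S))$, where $g$ is the global gating weight vector, $F_{time}$ is the time-domain feature vector, and $\mathrm{KAN}(S)$ is the frequency-domain feature vector projected by KAN.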
[0048] Optionally, step 4 includes orthogonal decoupling and signal reconstruction:
[0049] The fused feature vector is projected and decomposed into an identity embedding vector and a noise embedding vector; the orthogonal constraint loss function is used to force the identity embedding vector and the noise embedding vector to be perpendicular to each other in the feature space; and the identity embedding vector and the noise embedding vector are added element by element and then input into the signal reconstruction decoder to restore the reconstructed PPG signal, thereby purifying the identity embedding vector.
[0050] A generative orthogonal decoupling mechanism is designed. (1) Feature projection and separation: after the concatenation operation, the fused feature vector $F_{fused}$ is mapped to two low-dimensional embedding vectors through two independent fully connected projection heads: an identity embedding vector $z_{id}$, which contains only user identity information, and a noise embedding vector $z_{noise}$, which is used to absorb motion artifacts and non-identity variations caused by loose wearing. (2) Orthogonal constraint loss function $\mathcal{L}_{orth}$: during training, the absolute value of the cosine similarity between $z_{id}$ and $z_{noise}$ is minimized, expressed as: $\mathcal{L}_{orth} = \left| \frac{z_{id} \cdot z_{noise}}{\|z_{id}\| \, \|z_{noise}\|} \right|$. (3) Signal reconstruction decoder: the separated identity embedding vector and noise embedding vector are added element by element: $z_{sum} = z_{id} + z_{noise}$; $z_{sum}$ is input into the signal reconstruction decoder to restore the reconstructed PPG signal, and a reconstruction loss $\mathcal{L}_{rec}$ is introduced.
[0051] Optionally, step 5 includes an output layer and joint optimization:
[0052] Step 51: Classification head based on angular margin. Only the purified identity embedding vector $z_{id}$ is input into the classification head; the ArcFace loss function $\mathcal{L}_{arc}$ is used: by adding an angular margin m to the angle between the feature vector and the weight vector, the inter-class distance is maximized on the hypersphere.
[0053] Step 52: Joint loss function. End-to-end training is performed using the following multi-task loss function:
[0054] $\mathcal{L} = \mathcal{L}_{arc} + \lambda_1 \mathcal{L}_{orth} + \lambda_2 \mathcal{L}_{rec}$;
[0055] where $\lambda_1$ and $\lambda_2$ are the balance coefficients.
[0056] In a second aspect, embodiments of the present invention provide a computer-readable storage medium comprising a stored program, wherein, when the program is executed, it controls the device where the computer-readable storage medium is located to execute the PPG recognition method based on biomechanical derivatives and KAN-xLSTM in the first aspect or any possible implementation thereof.
[0057] In a third aspect, embodiments of the present invention provide an electronic device, including: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and the one or more computer programs include instructions that, when executed by the device, cause the device to perform the PPG recognition method based on biomechanical derivatives and KAN-xLSTM in the first aspect or any possible implementation of the first aspect.
[0058] The technical solution provided by this invention includes a method that constructs a three-channel input tensor containing hemodynamic priors based on the acquired raw PPG signal; distributes the three-channel input tensor to parallel temporal stream processing and frequency stream gating branches, and obtains the temporal feature vector and the spectrum-based global gating weights respectively; uses the global gating weights to perform element-wise weighting on the temporal feature vector to obtain a fused feature vector; projects and decomposes the fused feature vector into an identity embedding vector and a noise embedding vector to purify the identity embedding vector; inputs the purified identity embedding vector into a classification head based on angular margin, calculates the classification probability, and outputs the final identity recognition result. This method improves the model's memory capacity for fine-grained identity features, enhances the physical interpretability and accuracy of feature extraction, and achieves explicit noise removal.
Attached Figure Description
[0059] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0060] Figure 1 is a flowchart of the PPG recognition method based on biomechanical derivatives and KAN-xLSTM provided in an embodiment of the present invention;
[0061] Figure 2 is a schematic diagram of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0062] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0063] The terminology used in the embodiments of this invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms “a,” “an,” and “the” used in the embodiments of this invention are also intended to include the plural forms unless the context clearly indicates otherwise.
[0064] It should be understood that the term "and / or" used in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " in this article generally indicates that the preceding and following related objects have an "or" relationship.
[0065] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," "in response to determining," or "in response to detecting." Similarly, depending on the context, the phrase "if it is determined" or "if (the stated condition or event) is detected" can be interpreted as "when it is determined," "in response to determining," "when (the stated condition or event) is detected," or "in response to detecting (the stated condition or event)."
[0066] Figure 1 is a flowchart of the PPG recognition method based on biomechanical derivatives and KAN-xLSTM provided in this embodiment of the invention. As shown in Figure 1, the method includes:
[0067] Step 1: Based on the acquired raw PPG signal, construct a three-channel input tensor that includes hemodynamic priors.
[0068] In this embodiment of the invention, step 1 includes biomechanical derivative embedding and input construction:
[0069] Based on the acquired raw PPG signal, its first derivative is calculated as the velocity feature, and its second derivative is calculated as the acceleration feature. The raw PPG signal, velocity feature, and acceleration feature are concatenated in the channel dimension to construct a three-channel input tensor containing hemodynamic priors.
[0070] Biomechanical derivatives specifically refer to the first derivative (velocity) and second derivative (acceleration) of the PPG signal, used to characterize the compliance of the blood vessel wall and the hemodynamic state.
[0071] Traditional methods directly extract features from the raw PPG signal waveform, which is easily affected by baseline drift and amplitude variations. This invention, based on the principle of phase space reconstruction, explicitly calculates the kinematic derivatives of the PPG signal to construct translation-invariant biomechanical features. Let the input PPG signal sequence be $X \in \mathbb{R}^{T}$, where $T$ is the number of time steps. An enhanced three-channel input tensor is constructed as follows:
[0072] Step 11, raw PPG signal $x$:
[0073] The input standardized time series is denoted as $x(t)$;
[0074] Step 12, velocity feature $x'$:
[0075] The first derivative of the raw PPG signal is calculated, which physically corresponds to the instantaneous velocity of blood flow within the blood vessel. Its expression is as follows:
[0076] $x'(t) = \frac{\mathrm{d}x(t)}{\mathrm{d}t}$;
[0077] Step 13, acceleration feature $x''$:
[0078] The second derivative of the raw PPG signal is calculated, which physically reflects the expansion and contraction capacity of the blood vessel wall under pulse wave impact. Its expression is as follows:
[0079] $x''(t) = \frac{\mathrm{d}^{2}x(t)}{\mathrm{d}t^{2}}$;
[0080] Step 14: Through the concatenation operation, a three-channel input tensor containing hemodynamic priors is obtained, the expression of which is:
[0081] $X_{in} = \mathrm{Concat}(x, x', x'') \in \mathbb{R}^{T \times 3}$.
[0082] The above design enables the model to lock onto a phase diagram trajectory composed of velocity and acceleration. The topology of this phase diagram trajectory is determined by the individual's cardiovascular physiological characteristics and is naturally robust to time axis scaling (heart rate changes) and baseline drift.
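As a minimal sketch of step 1, the three-channel tensor can be built with finite differences. The patent text does not state which difference scheme is used, so the central differences of `np.gradient`, the function name `build_three_channel_input`, and the sampling rate `fs` are all assumptions made for illustration.

```python
import numpy as np

def build_three_channel_input(ppg, fs=1.0):
    """Stack the standardized PPG signal with its first and second derivatives.

    Assumption: derivatives are approximated with np.gradient (central
    differences); the source only says "first" and "second derivative".
    """
    x = np.asarray(ppg, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-8)   # standardized time series
    dt = 1.0 / fs
    velocity = np.gradient(x, dt)           # first derivative
    acceleration = np.gradient(velocity, dt)  # second derivative
    return np.stack([x, velocity, acceleration], axis=-1)  # shape (T, 3)

# Synthetic pulse-like waveform, 4 s at a hypothetical 100 Hz sampling rate.
fs = 100.0
t = np.arange(400) / fs
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 2.4 * t)

x_in = build_three_channel_input(ppg, fs)
x_in_shifted = build_three_channel_input(ppg + 5.0, fs)  # constant baseline offset
```

In this sketch a constant baseline offset leaves all three channels unchanged, which illustrates the translation invariance mentioned above; slow drift would only be attenuated, not removed.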
[0083] Step 2: Distribute the three-channel input tensor to the parallel time-domain stream processing branch and frequency-domain stream gating branch, and obtain the time-domain feature vector and the spectrum-based global gating weights, respectively.
[0084] In this embodiment of the invention, step 2 includes time-frequency dual-stream collaborative feature extraction:
[0085] The three-channel input tensor is distributed to two parallel branches. In the time-domain stream processing branch, a Kolmogorov-Arnold Network (KAN) multi-scale feature extraction front-end is first used to capture waveform details at different scales. Then, the input is fed into a bidirectional extended long short-term memory (Bi-xLSTM) module, which uses its matrix memory to perform lossless encoding of long sequence features and compresses the time series features into a time-domain feature vector through global average pooling (GAP). In the frequency-domain stream gating branch, the three-channel input tensor is subjected to Fast Fourier Transform (FFT), and global gating weights based on the spectrum are generated through a KAN projection layer and the sigmoid activation function.
[0086] Kolmogorov-Arnold networks are neural networks that place learnable B-spline activation functions on the network edges, making them better at fitting physical nonlinear functions than multilayer perceptrons (MLPs).
[0087] The temporal stream processing branch is the core of feature extraction, used to capture fine-grained morphological features from the enhanced three-channel input tensor. It includes a KAN multi-scale feature extraction front-end and a bidirectional extended long short-term memory network module.
[0088] Step 21: KAN multi-scale feature extraction front-end;
[0089] KAN convolution layers are used instead of linear convolution layers. For an input vector u, standard convolution calculates a linear inner product, while KAN convolution calculates a nonlinear combination of B-spline basis functions. For a KAN convolution with a kernel size of k, the output y is expressed as:
[0090] $y = \sum_{j=1}^{k} \phi_j(u_j)$;
[0091] where $\phi_j(u) = \sum_i c_{j,i} B_i(u)$; $B_i$ is the i-th B-spline basis function, and $c_{j,i}$ are the corresponding learnable control coefficients;
[0092] As shown in Figure 1, four parallel KAN convolution branches are set up with convolution kernel sizes k∈{1,3,7,11}, namely KAN convolutional layer 1, KAN convolutional layer 3, KAN convolutional layer 7, and KAN convolutional layer 11, which are used to capture multi-scale features ranging from high-frequency noise filtering and sharp peak extraction to low-frequency waveform contour fitting; the outputs of each scale are concatenated to form a deep feature sequence $F_{ms}$;
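The KAN convolution described in step 21 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it uses degree-1 B-splines (hat functions) on a uniform grid instead of higher-order splines, handles a single input channel, and uses edge padding; `hat_basis`, `kan_conv1d`, `grid`, and `coef` are hypothetical names.

```python
import numpy as np

def hat_basis(u, grid):
    """Degree-1 B-spline (hat) basis on a uniform grid; returns (..., len(grid))."""
    h = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(u[..., None] - grid) / h, 0.0, None)

def kan_conv1d(u, coef, grid):
    """y[t] = sum_j phi_j(u[t + j]) with phi_j(u) = sum_i coef[j, i] * B_i(u).

    coef has shape (k, len(grid)): one learnable spline per kernel position.
    """
    k = coef.shape[0]
    pad = k // 2
    u_pad = np.pad(u, (pad, pad), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(u_pad, k)  # (T, k)
    basis = hat_basis(windows, grid)                              # (T, k, G)
    return np.einsum("tkg,kg->t", basis, coef)

rng = np.random.default_rng(0)
grid = np.linspace(-3.0, 3.0, 13)
u = np.clip(rng.standard_normal(200), -3.0, 3.0)  # keep inputs inside the grid

# Four parallel branches with kernel sizes 1, 3, 7, 11, concatenated per time step.
branches = [kan_conv1d(u, 0.1 * rng.standard_normal((k, grid.size)), grid)
            for k in (1, 3, 7, 11)]
features = np.stack(branches, axis=-1)            # shape (200, 4)

# Sanity check: if each spline is the straight line w_j * u, the KAN
# convolution reduces to an ordinary linear convolution (correlation).
w = np.array([0.5, -1.0, 0.25])
y_kan = kan_conv1d(u, np.outer(w, grid), grid)
y_lin = np.correlate(np.pad(u, (1, 1), mode="edge"), w, mode="valid")
```

The final check shows that a linear convolution is the special case in which every learned spline is a straight line; any other choice of coefficients gives a nonlinear response per kernel position.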
[0093] Step 22: Bidirectional extended long short-term memory network (Bi-xLSTM) module;
[0094] The bidirectional extended long short-term memory (Bi-xLSTM) network module forms the backbone network of this invention. To overcome the loss of high-frequency detail caused by Mamba or traditional RNNs compressing historical information into vector states, this invention employs an extended long short-term memory (xLSTM) network, which introduces a matrix memory $C_t$ and a covariance update rule: at time step t, for the input features $x_t$, the extended long short-term memory unit generates a query vector $q_t$, a key vector $k_t$, and a value vector $v_t$ through linear projection, and calculates the input gate $i_t$ and the forget gate $f_t$; exponential gating is used to support long-range gradient propagation;
[0095] The extended long short-term memory (xLSTM) network is a novel architecture that builds on the traditional LSTM by incorporating matrix memory and exponential gating. It stores high-order feature correlations through the covariance update rule. Matrix memory is the core storage unit of xLSTM and stores historical information in the form of a d×d matrix; compared with the d×1 vector states of Mamba models or RNNs, it has a larger memory capacity.
[0096] The update rule for the matrix memory is as follows:
[0097] $C_t = f_t \odot C_{t-1} + i_t \odot (v_t \otimes k_t)$;
[0098] where $\odot$ denotes element-wise multiplication and $\otimes$ denotes the outer product;
[0099] A traditional vector update can only store the accumulated amount of features, whereas xLSTM's outer-product update $v_t \otimes k_t$ generates a rank-1 matrix that can store pairwise correlations between the dimensions of the feature vectors. This raises the memory capacity from $O(d)$ to $O(d^2)$, which allows minute morphological differences in each heartbeat cycle within a long sequence (such as the depth of the dicrotic notch) to be memorized without loss.
[0100] A bidirectional extended long short-term memory network module is used to process the forward and reverse sequences separately. The outputs are concatenated and then passed through a global average pooling (GAP) layer to obtain the temporal feature vector $F_{time}$.
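The matrix-memory update and the bidirectional pooling can be illustrated with a short NumPy sketch. Several simplifications are assumptions made here rather than details from the text: the gates are one scalar per time step, the readout is the plain product of the memory with the query, and the normalization and stabilization used in full xLSTM implementations are omitted. The function names are hypothetical.

```python
import numpy as np

def matrix_memory_scan(q, k, v, i_gate, f_gate):
    """Apply C_t = f_t * C_{t-1} + i_t * outer(v_t, k_t) and read out C_t @ q_t.

    q, k, v: arrays of shape (T, d); i_gate, f_gate: arrays of shape (T,).
    """
    T, d = q.shape
    C = np.zeros((d, d))              # d x d matrix memory
    outputs = np.empty((T, d))
    for t in range(T):
        C = f_gate[t] * C + i_gate[t] * np.outer(v[t], k[t])
        outputs[t] = C @ q[t]         # simplified readout (assumption)
    return outputs, C

def bidirectional_gap(q, k, v, i_gate, f_gate):
    """Run the scan forward and backward, concatenate, then average over time."""
    fwd, _ = matrix_memory_scan(q, k, v, i_gate, f_gate)
    bwd, _ = matrix_memory_scan(q[::-1], k[::-1], v[::-1],
                                i_gate[::-1], f_gate[::-1])
    h = np.concatenate([fwd, bwd[::-1]], axis=-1)  # (T, 2d)
    return h.mean(axis=0)                          # global average pooling

rng = np.random.default_rng(0)
T, d = 6, 8
k = np.eye(d)[:T]                    # orthonormal keys make recall exact
v = rng.standard_normal((T, d))
q = k.copy()
ones = np.ones(T)

_, C = matrix_memory_scan(q, k, v, ones, ones)
recalled = C @ k[2]                  # retrieves the value stored under key 2
f_time = bidirectional_gap(q, k, v, ones, 0.9 * ones)
```

With orthonormal keys and no forgetting, each stored value is recovered exactly from the d×d memory, which is the associative-recall behavior that a single d-dimensional state vector cannot provide for several items at once.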
[0101] In this embodiment of the invention, step 2 further includes a frequency-domain stream gating branch. Motion artifacts typically manifest as specific interference in the spectrum (such as frequency components overlapping with the step frequency). This invention uses global spectral priors to cleanse the time-domain features:
[0102] Step 23, Fast Fourier Transform (FFT): Perform an FFT on the three-channel input tensor $X_{in}$ and take the modulus to obtain the frequency-domain amplitude spectrum $S = |\mathrm{FFT}(X_{in})|$;
[0103] Step 24, KAN projection and gating generation: The frequency-domain amplitude spectrum is mapped to the same channel dimension as the time-domain features using the KAN projection layer. The nonlinear characteristics of KAN help to capture complex spectral patterns.
[0104] Step 25, Global gating weight generation: After passing through the Sigmoid activation function $\sigma$, spectral gating weights g in the range (0,1) are generated, representing the channel importance coefficients of different feature dimensions. The expression is:
[0105] $g = \sigma(\mathrm{KAN}(S))$.
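Steps 23 to 25 can be sketched as below. To keep the example short, a plain linear map stands in for the KAN projection layer, the one-sided `np.fft.rfft` is used for the real-valued input, and the spectrum is normalized by its maximum; these choices and the names `spectral_gate`, `w`, and `b` are assumptions, not details given in the text.

```python
import numpy as np

def spectral_gate(x_in, w, b):
    """Amplitude spectrum -> projection -> sigmoid gate.

    x_in: (T, 3) input tensor; w: ((T // 2 + 1) * 3, D); b: (D,).
    Returns the projected frequency feature vector and the gate g in (0, 1).
    """
    spectrum = np.abs(np.fft.rfft(x_in, axis=0))   # (T // 2 + 1, 3) amplitudes
    s = spectrum.reshape(-1)
    s = s / (s.max() + 1e-8)                       # scale for numerical comfort
    f_freq = s @ w + b                             # linear stand-in for KAN projection
    g = 1.0 / (1.0 + np.exp(-f_freq))              # sigmoid
    return f_freq, g

rng = np.random.default_rng(0)
T, D = 256, 16
t = np.arange(T) / 64.0                            # hypothetical 64 Hz sampling
x = np.sin(2 * np.pi * 1.5 * t)
x_in = np.stack([x, np.gradient(x), np.gradient(np.gradient(x))], axis=-1)

w = 0.1 * rng.standard_normal(((T // 2 + 1) * 3, D))
b = np.zeros(D)
f_freq, g = spectral_gate(x_in, w, b)
```

The gate has one entry per feature channel, so `D` would have to match the dimension of the time-domain feature vector it is later multiplied with.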
[0106] Step 3: Use global gating weights to perform element-wise weighting on the temporal feature vector to suppress the feature channels in the noise-contaminated frequency bands and obtain the fused feature vector.
[0107] In this embodiment of the invention, step 3 includes feature fusion:
[0108] The global gating weights are multiplied element-wise with the time-domain feature vector, and the product is concatenated with the frequency-domain feature vector projected by KAN to obtain the final fused feature vector, which is expressed as follows:
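[0109] $F_{fused} = \mathrm{Concat}(g \odot F_{time}, \mathrm{KAN}(S))$, where $g$ is the global gating weight vector, $F_{time}$ is the time-domain feature vector, and $\mathrm{KAN}(S)$ is the frequency-domain feature vector projected by KAN.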
[0110] The above operation utilizes global spectrum information to dynamically suppress the characteristic channels corresponding to frequency bands contaminated by motion noise and enhance the channels containing effective physiological information.
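A toy example of the fusion in step 3, with hand-picked numbers so the effect of the gate is visible. The function name `fuse` and the example values are illustrative assumptions; the operation itself (sigmoid gate, element-wise product, concatenation) follows the description above.

```python
import numpy as np

def fuse(f_time, f_freq):
    """Gate the time-domain features, then append the frequency features."""
    g = 1.0 / (1.0 + np.exp(-f_freq))          # spectral gating weights in (0, 1)
    return np.concatenate([g * f_time, f_freq])

f_time = np.array([1.0, 1.0, 1.0, 1.0])        # time-domain feature vector
f_freq = np.array([-6.0, 0.0, 6.0, 0.0])       # projected frequency features

fused = fuse(f_time, f_freq)
# Channel 0 is almost switched off, channel 2 passes nearly unchanged,
# channels 1 and 3 are halved; the last four entries are f_freq itself.
```

A strongly negative projected value closes the gate for that channel, which is how a channel tied to a noise-contaminated band would be suppressed in this sketch.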
[0111] Step 4: Project and decompose the fused feature vector into an identity embedding vector and a noise embedding vector to purify the identity embedding vector.
[0112] In this embodiment of the invention, step 4 includes orthogonal decoupling and signal reconstruction:
[0113] The fused feature vector is projected and decomposed into an identity embedding vector and a noise embedding vector; the orthogonal constraint loss function is used to force the identity embedding vector and the noise embedding vector to be perpendicular to each other in the feature space; and the identity embedding vector and the noise embedding vector are added element by element and then input into the signal reconstruction decoder to restore the reconstructed PPG signal, thereby purifying the identity embedding vector.
[0114] To solve the problem of identity and noise entanglement, a generative orthogonal decoupling mechanism is designed. (1) Feature projection and separation: after the concatenation operation, the fused feature vector $F_{fused}$ is mapped to two low-dimensional embedding vectors through two independent fully connected projection heads: an identity embedding vector $z_{id}$, which contains only user identity information, and a noise embedding vector $z_{noise}$, which is used to absorb motion artifacts and non-identity variations caused by loose wearing. (2) Orthogonal constraint loss function $\mathcal{L}_{orth}$: during training, the absolute value of the cosine similarity between $z_{id}$ and $z_{noise}$ is minimized, expressed as: $\mathcal{L}_{orth} = \left| \frac{z_{id} \cdot z_{noise}}{\|z_{id}\| \, \|z_{noise}\|} \right|$. (3) Signal reconstruction decoder: the separated identity embedding vector and noise embedding vector are added element by element, i.e., feature superposition: $z_{sum} = z_{id} + z_{noise}$; $z_{sum}$ is input into the signal reconstruction decoder to restore the reconstructed PPG signal, and a reconstruction loss $\mathcal{L}_{rec}$ is introduced.
[0115] In this embodiment of the invention, relying solely on orthogonal constraints may lead to information loss. The reconstruction task forces $z_{noise}$ to actively absorb all components that are signal changes but not identity-related, thereby assisting the purification of $z_{id}$.
[0116] The orthogonal constraint loss function $\mathcal{L}_{orth}$ forces the two vectors $z_{id}$ and $z_{noise}$ to be perpendicular to each other in the feature space. If they are geometrically perpendicular, the information contained in $z_{noise}$ (such as the amplitude of motion) has zero projection onto $z_{id}$, thus mathematically guaranteeing the purity of the identity vector.
[0117] Orthogonal disentanglement is a feature representation learning strategy that forces the feature space to be decomposed into mutually perpendicular (unrelated) identity subspaces and noise subspaces through a loss function.
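The orthogonal constraint can be written in a few lines. This sketch assumes the two projection heads are bias-free linear maps with hypothetical weights `w_id` and `w_noise`; the loss itself is the absolute cosine similarity described above.

```python
import numpy as np

def decouple(fused, w_id, w_noise):
    """Two independent linear projection heads (biases omitted for brevity)."""
    return fused @ w_id, fused @ w_noise

def orthogonal_loss(z_id, z_noise, eps=1e-8):
    """Absolute cosine similarity between identity and noise embeddings."""
    dot = np.sum(z_id * z_noise, axis=-1)
    norms = np.linalg.norm(z_id, axis=-1) * np.linalg.norm(z_noise, axis=-1)
    return np.mean(np.abs(dot / (norms + eps)))

# Perpendicular embeddings give zero loss; (anti-)parallel ones give one.
loss_perp = orthogonal_loss(np.array([3.0, 4.0]), np.array([4.0, -3.0]))
loss_par = orthogonal_loss(np.array([1.0, 0.0]), np.array([-2.0, 0.0]))

# Projection heads on a random fused vector, then feature superposition.
rng = np.random.default_rng(0)
fused = rng.standard_normal(32)
z_id, z_noise = decouple(fused,
                         rng.standard_normal((32, 8)),
                         rng.standard_normal((32, 8)))
z_sum = z_id + z_noise               # input to the reconstruction decoder
loss_random = orthogonal_loss(z_id, z_noise)
```

Taking the absolute value penalizes anti-parallel embeddings as much as parallel ones, so the minimum is reached only when the two vectors are perpendicular.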
[0118] Step 5: Input the purified identity embedding vector into the classification head based on angular margin, calculate the classification probability, and output the final identity recognition result.
[0119] In this embodiment of the invention, step 5 includes the output layer and joint optimization:
[0120] Step 51: Classification head based on angular margin. Only the purified identity embedding vector $z_{id}$ is input into the classification head; the ArcFace loss function $\mathcal{L}_{arc}$ is used: by adding an angular margin m to the angle between the feature vector and the weight vector, the inter-class distance on the hypersphere is maximized, thereby improving recognition accuracy.
[0121] Step 52: Joint loss function. End-to-end training is performed using the following multi-task loss function:
[0122] $\mathcal{L} = \mathcal{L}_{arc} + \lambda_1 \mathcal{L}_{orth} + \lambda_2 \mathcal{L}_{rec}$;
[0123] where $\lambda_1$ and $\lambda_2$ are the balance coefficients.
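A minimal single-sample sketch of the angular-margin head and the joint loss. The scale `s`, the margin value, and the balance coefficients are illustrative defaults chosen here, not values from the patent, and the handling of angles that exceed π after adding the margin is omitted.

```python
import numpy as np

def arcface_loss(z, W, label, m=0.5, s=30.0):
    """Cross-entropy over scaled cosines, with margin m added to the target angle.

    z: (d,) identity embedding; W: (num_classes, d) class weight vectors.
    """
    z_n = z / np.linalg.norm(z)
    W_n = W / np.linalg.norm(W, axis=1, keepdims=True)
    cos = np.clip(W_n @ z_n, -1.0, 1.0)          # cosine to every class center
    theta = np.arccos(cos[label])                # angle to the true class
    logits = s * cos
    logits[label] = s * np.cos(theta + m)        # angular margin on the target
    logits = logits - logits.max()               # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum())
    return -log_prob[label]

def joint_loss(l_arc, l_orth, l_rec, lam1=0.1, lam2=0.5):
    """Weighted sum of classification, orthogonality, and reconstruction losses."""
    return l_arc + lam1 * l_orth + lam2 * l_rec

z = np.array([1.0, 0.2, 0.0])                    # embedding close to class 0
W = np.eye(3)                                    # three class centers
loss_margin = arcface_loss(z, W, label=0, m=0.5)
loss_plain = arcface_loss(z, W, label=0, m=0.0)
total = joint_loss(1.0, 0.5, 0.2)
```

For the same embedding, the margin version yields a larger loss than the plain cosine softmax, so training has to pull the embedding closer to its class center than the decision boundary alone would require.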
[0124] In this embodiment of the invention, the temporal stream processing branch demonstrates the process from the generation velocity and acceleration features of the original PPG signal, which are then spliced and fed into the KAN multi-scale feature extraction front end, then into the bidirectional xLSTM module (containing extended long short-term memory units), and finally output through the global average pooling layer.
[0125] The frequency domain flow-gated branch demonstrates the process of transforming a time-domain waveform into a frequency-domain amplitude spectrum via FFT, generating spectral gating weights through a KAN projection layer and a Sigmoid function, and then multiplying them with the time-domain features.
[0126] The orthogonal decoupling module demonstrates the process of projecting the feature fusion into identity embedding vectors and noise embedding vectors, which are constrained by an orthogonal constraint loss function and then superimposed and output by the signal reconstruction decoder to reconstruct the PPG signal.
[0127] In the output layer, the identity embedding vector is processed by the angle-margin-based classification head, which outputs the identity recognition result.
[0128] Alternatives to the derivative order: in addition to the first and second derivatives, the third derivative (jerk) or fractional derivatives can also be introduced to construct the input and capture higher-order dynamic features; all such variants fall within the scope of this invention. Alternatives to the backbone network: although this invention uses xLSTM, the biomechanical derivative embedding and orthogonal decoupling modules in this architecture are general-purpose and can also be combined with other improved Transformers or RNNs with matrix memory capabilities, as long as their core idea is to store historical information in matrix form. Alternatives to the decoupling method: in addition to orthogonal constraints, mutual information minimization or adversarial training can also be used to separate identity from noise, as long as the goal is to divide the feature space into an identity subspace and a noise subspace.
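What "storing historical information in matrix form" means for the backbone can be illustrated with a minimal sketch of a matrix memory updated by gated outer products. Scalar gate values and the function names are assumptions for illustration. The normalizer state and the gate stabilization used in full xLSTM implementations are omitted.

```python
def outer(value, key):
    """Outer product of a value vector and a key vector."""
    return [[v * k for k in key] for v in value]


def update_matrix_memory(memory, key, value, input_gate, forget_gate):
    """New memory = forget_gate * old memory + input_gate * outer(value, key)."""
    update = outer(value, key)
    return [[forget_gate * m + input_gate * u for m, u in zip(m_row, u_row)]
            for m_row, u_row in zip(memory, update)]


def read_memory(memory, query):
    """Retrieve stored content by multiplying the memory with a query vector."""
    return [sum(m * q for m, q in zip(row, query)) for row in memory]
```

When two key-value pairs with orthonormal keys are written into an initially zero memory, querying with either key returns the matching value exactly, which a single vector-valued state cannot do in general.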
[0129] The technical solution provided by this invention includes a method that constructs a three-channel input tensor containing hemodynamic priors based on the acquired raw PPG signal; distributes the three-channel input tensor to parallel temporal stream processing branches and frequency stream gating branches, and obtains temporal feature vectors and spectrum-based global gating weights respectively; uses the global gating weights to perform element-wise weighting on the temporal feature vectors to obtain a fused feature vector; projects and decomposes the fused feature vector into an identity embedding vector and a noise embedding vector to purify the identity embedding vector; inputs the purified identity embedding vector into a classification head based on angle margins, calculates the classification probability, and outputs the final identity recognition result. This method improves the model's memory capacity for fine-grained identity features, enhances the physical interpretability and accuracy of feature extraction, and achieves explicit noise removal.
[0130] The various steps in the embodiments of the present invention can be performed by an electronic device. This electronic device includes, but is not limited to, tablet computers, portable PCs, and desktop computers.
[0131] This invention provides a computer-readable storage medium including a stored program, wherein, when the program is running, it controls the electronic device containing the computer-readable storage medium to execute the above-described embodiment of the PPG recognition method based on biomechanical derivatives and KAN-xLSTM.
[0132] Figure 2 is a schematic diagram of an electronic device provided in an embodiment of the present invention. As shown in Figure 2, the electronic device 21 includes a processor 211, a memory 212, and a computer program 213 stored in the memory 212 and executable on the processor 211. When the computer program 213 is executed by the processor 211, it implements the PPG recognition method based on biomechanical derivatives and KAN-xLSTM of the embodiment. To avoid repetition, it is not described in detail here.
[0133] The electronic device 21 includes, but is not limited to, the processor 211 and the memory 212. Those skilled in the art will understand that Figure 2 is merely an example of the electronic device 21 and does not constitute a limitation on it. The electronic device 21 may include more or fewer components than shown, may combine certain components, or may use different components. For example, the electronic device may also include input/output devices, network access devices, buses, etc.
[0134] The processor 211 may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor may be a microprocessor or any conventional processor.
[0135] The memory 212 can be an internal storage unit of the electronic device 21, such as a hard disk or RAM of the electronic device 21. The memory 212 can also be an external storage device of the electronic device 21, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or FlashCard equipped on the electronic device 21. Furthermore, the memory 212 can include both internal and external storage units of the electronic device 21. The memory 212 is used to store computer programs and other programs and data required by network devices. The memory 212 can also be used to temporarily store data that has been output or will be output.
[0136] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0137] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A PPG recognition method based on biomechanical derivatives and KAN-xLSTM, characterized in that the method includes:
Step 1: Based on the acquired raw PPG signal, construct a three-channel input tensor containing hemodynamic priors; wherein the first derivative of the acquired raw PPG signal is calculated as the velocity feature and its second derivative as the acceleration feature, and the raw PPG signal, the velocity feature and the acceleration feature are concatenated along the channel dimension to construct the three-channel input tensor containing hemodynamic priors;
Step 2: Distribute the three-channel input tensor to the parallel time-domain stream processing branch and frequency-domain stream gating branch, and obtain the time-domain feature vector and the spectrum-based global gating weights respectively; in the time-domain stream processing branch, the Kolmogorov-Arnold network (KAN) multi-scale feature extraction front end is first used to capture waveform details at different scales; the result is then input into the bidirectional extended long short-term memory network (Bi-xLSTM) module, which uses its matrix memory to perform lossless encoding of long sequence features, and the time-series features are compressed into a time-domain feature vector through global average pooling (GAP); in the frequency-domain stream gating branch, the three-channel input tensor is subjected to a fast Fourier transform (FFT), and the spectrum-based global gating weights are generated through the KAN projection layer and the Sigmoid activation function;
Step 3: Use the global gating weights to perform element-wise weighting on the time-domain feature vector to obtain the fused feature vector;
Step 4: Project and decompose the fused feature vector into an identity embedding vector and a noise embedding vector to purify the identity embedding vector; wherein the orthogonal constraint loss function is used to force the identity embedding vector and the noise embedding vector to be perpendicular to each other in the feature space; and the identity embedding vector and the noise embedding vector are added element by element and then input into the signal reconstruction decoder to restore the reconstructed PPG signal, thereby purifying the identity embedding vector;
Step 5: Input the purified identity embedding vector into the classification head based on angle margin, calculate the classification probability, and output the final identity recognition result;
wherein Step 2 includes time-frequency dual-stream collaborative feature extraction, and the time-domain stream processing branch includes a KAN multi-scale feature extraction front end and a bidirectional extended long short-term memory network module:
Step 21: KAN multi-scale feature extraction front end; KAN convolutional layers are used instead of linear convolutional layers; for the input vector u, the KAN convolution is calculated based on a nonlinear combination of learnable B-spline basis functions; for a KAN convolution with a kernel size of k, the output y is expressed as: $y = \sum_{j=1}^{k} \phi_j(u_j)$; wherein $\phi_j(u_j) = \sum_i c_{j,i} B_i(u_j)$; $B_i$ is the i-th B-spline basis function and $c_{j,i}$ is the corresponding learnable control coefficient; four parallel KAN convolutional branches with kernel sizes k∈{1,3,7,11} are configured to capture multi-scale features ranging from high-frequency noise filtering and sharp peak extraction to low-frequency waveform contour fitting; the outputs from each scale are concatenated to form a deep feature sequence;
Step 22: Bidirectional extended long short-term memory network (Bi-xLSTM) module; the extended long short-term memory (xLSTM) network, which introduces a matrix memory $C_t$ and a covariance update rule, is employed; at time step t, for the input feature $x_t$, the extended long short-term memory unit generates a query vector $q_t$, a key vector $k_t$ and a value vector $v_t$ through linear projection, and calculates the input gate $i_t$ and the forget gate $f_t$; exponential gating is used to support long-range gradient propagation; the update rule for the matrix memory is: $C_t = f_t \odot C_{t-1} + i_t \odot (v_t \otimes k_t)$; wherein $\odot$ denotes element-wise multiplication and $\otimes$ denotes the outer product; the Bi-xLSTM module processes the forward and reverse sequences separately, and the outputs are concatenated and then passed through a global average pooling (GAP) layer to obtain the time-domain feature vector.
2. The method of claim 1, wherein Step 1 includes constructing the biomechanical derivative embedding module and its input: let the input PPG signal sequence be X, where t is the time step; an enhanced three-channel input tensor is constructed as follows: Step 11, raw PPG signal: the input normalized time series, denoted by $x(t)$; Step 12, velocity feature $v(t)$: the first derivative of the raw PPG signal, which physically corresponds to the instantaneous velocity of blood flow within the blood vessel, expressed as: $v(t) = \frac{dx(t)}{dt}$; Step 13, acceleration feature $a(t)$: the second derivative of the raw PPG signal, which physically reflects the expansion and contraction capacity of the blood vessel wall under pulse wave impact, expressed as: $a(t) = \frac{d^2 x(t)}{dt^2}$; Step 14: through the concatenation operation, a three-channel input tensor containing hemodynamic priors is obtained, expressed as: $X_{in} = \mathrm{Concat}(x(t), v(t), a(t))$.
3. The method of claim 1, wherein Step 2 also includes a frequency-domain stream gating branch, which uses a global spectral prior to cleanse the time-domain features: Step 23, fast Fourier transform (FFT): apply the FFT to the three-channel input tensor and take the modulus to obtain the frequency-domain magnitude spectrum $S$; Step 24, KAN projection and gating generation: the frequency-domain magnitude spectrum is mapped to the same channel dimension as the time-domain features using the KAN projection layer; Step 25, global gating weight generation: after passing through the Sigmoid activation function, spectral gating weights g in the range (0,1) are generated, representing the channel importance coefficients of the different feature dimensions, expressed as: $g = \sigma(\mathrm{KAN}(S))$.
4. The method of claim 3, wherein Step 3 includes feature fusion: the global gating weights are multiplied element-wise by the time-domain feature vector, and the product is concatenated with the frequency-domain feature vector projected by KAN to obtain the final fused feature vector, expressed as: $z = \mathrm{Concat}(g \odot h_{time}, h_{freq})$, where $h_{time}$ is the time-domain feature vector and $h_{freq}$ is the frequency-domain feature vector projected by KAN.
5. The method of claim 4, wherein Step 4 includes orthogonal decoupling and signal reconstruction: a generative orthogonal decoupling mechanism is designed; (1) Feature projection and separation: the fused feature vector obtained after the concatenation operation is mapped to two low-dimensional embedding vectors through two independent fully connected projection heads: the identity embedding vector $z_{id}$, which contains only user identity information, and the noise embedding vector $z_{noise}$, which is used to absorb motion artifacts and non-identity variations caused by loose wearing; (2) Orthogonal constraint loss function $L_{orth}$: during training, the absolute value of the cosine similarity between $z_{id}$ and $z_{noise}$ is forcibly minimized, expressed as: $L_{orth} = \left| \frac{z_{id} \cdot z_{noise}}{\|z_{id}\| \, \|z_{noise}\|} \right|$; (3) Signal reconstruction decoder: the separated identity embedding vector and noise embedding vector are added element by element: $z_{sum} = z_{id} + z_{noise}$, and $z_{sum}$ is input into the signal reconstruction decoder to restore the reconstructed PPG signal; a reconstruction loss $L_{rec}$ is introduced.
6. The method of claim 5, wherein Step 5 includes the output layer and joint optimization: Step 51, angle-margin-based classification head: only the purified identity embedding vector is input to the classification head; the ArcFace loss function $L_{arc}$ is adopted, which maximizes the inter-class distance on the hypersphere by adding an angular margin m to the angle between the feature vector and the weight vector; Step 52, joint loss function: end-to-end training is performed using the following multi-task loss function: $L_{total} = L_{arc} + \lambda_1 L_{orth} + \lambda_2 L_{rec}$; wherein $L_{orth}$ is the orthogonal constraint loss, $L_{rec}$ is the reconstruction loss, and $\lambda_1$ and $\lambda_2$ are the balance coefficients.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored program, wherein, when the program is executed, it controls the device on which the computer-readable storage medium is located to perform the PPG recognition method based on biomechanical derivatives and KAN-xLSTM as described in any one of claims 1 to 6.
8. An electronic device, comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs including instructions that, when executed by the device, cause the device to perform the PPG recognition method based on biomechanical derivatives and KAN-xLSTM as described in any one of claims 1 to 6.