Method for screening, predicting and monitoring paroxysmal atrial fibrillation based on heart rate variability features

By combining adaptive-scale entropy (ASE) decomposition with a support vector machine (SVM), the method addresses the nonlinearity and low signal-to-noise ratio of ECG signals in the screening and prediction of paroxysmal atrial fibrillation, achieves efficient HRV feature extraction and classification, and improves the accuracy of screening, prediction and monitoring of paroxysmal atrial fibrillation.

CN117357085B · Active · Publication Date: 2026-06-19 · CHANGZHI MEDICAL COLLEGE

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHANGZHI MEDICAL COLLEGE
Filing Date
2023-11-03
Publication Date
2026-06-19

Abstract

The application provides a paroxysmal atrial fibrillation screening, prediction and monitoring method based on heart rate variability characteristics, and belongs to the technical field of physiological signal processing; the technical problem to be solved is to provide an improved paroxysmal atrial fibrillation screening, prediction and monitoring method based on heart rate variability characteristics; the method comprises the following steps: HRV data acquisition and preprocessing; the ASE method is used to decompose the HRV to obtain a plurality of components on adaptive scales, and the entropy of each component is calculated as the extracted HRV feature; the Wilcoxon signed rank test is used, and the optimal feature subset is selected by forward selection; the optimal feature subset is input into the SVM, and the optimal classification model is obtained by using five-fold cross-validation; the processed HRV data is input into the optimal classification model to obtain the HRV feature extraction, selection and classification result; and the application is applied to paroxysmal atrial fibrillation screening, prediction and monitoring.

Description

Technical Field

[0001] This invention provides a method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics, belonging to the field of physiological signal processing technology.

Background Technology

[0002] Atrial fibrillation (AF) is a common type of cardiac arrhythmia. Paroxysmal atrial fibrillation (PAF) is an early, often asymptomatic form of AF, and long-term PAF will eventually develop into AF. Screening for PAF in the population and further predicting PAF episodes in patients allows timely and effective intervention and treatment using techniques such as radiofrequency ablation, preventing cardiac remodeling and the progression of PAF into AF.

[0003] Automatic screening and prediction of paroxysmal atrial fibrillation (PAF) are generally achieved through feature classification methods, using features derived from electrocardiograms (ECG) or heart rate variability (HRV). Clinically, PAF is typically diagnosed by detecting P-wave variations in ECG signals; however, the nonlinearity, non-stationarity, and low signal-to-noise ratio of ECG make typical P-wave variations difficult to detect. HRV, which represents the variation of RR intervals in ECG signals, is easier to obtain than ECG and is less affected by noise. In recent years, PAF identification using HRV features has been widely applied. Furthermore, some research indicates that HRV features contain potential information about the nervous system regulating heart rate; significant changes in these features suggest the occurrence of cardiovascular diseases associated with nervous system damage, so HRV has received considerable attention. HRV features for identifying PAF mainly include features in the time domain, frequency domain, and various state-space domains. Some HRV features extracted using nonlinear system theory provide deeper analysis of heart rate dynamics, potentially revealing subtle abnormalities in cardiac autoregulation caused by disease, and are therefore also used for PAF identification. Entropy can measure HRV complexity and reflect the information carried by the current state of a biological system. Approximate entropy (AE) and sample entropy (SE), in particular, are suitable for analyzing short, noisy data. However, AE and SE analyses sometimes assign high entropy values to diseased systems and low entropy values to healthy systems, which is inconsistent with the expected lower entropy of diseased systems relative to healthy ones. This inconsistency arises because these entropy measures are based on single-scale analysis and do not consider the multi-scale characteristics of physiological systems. To overcome this shortcoming, researchers have proposed multi-scale entropy (MSE), which quantifies system complexity at different scales. However, the multi-scale procedure in MSE is essentially linear smoothing followed by decimation of the data, which can easily lose high-frequency components. The frequency response of the mean filter it uses cannot prevent aliasing, so the extracted features cannot accurately quantify the complexity of HRV at multiple scales and therefore fail to identify PAF effectively.

[0004] Based on the Empirical Mode Decomposition (EMD) method, some researchers have proposed an adaptive orthogonal filtering technique for signals, which effectively reduces aliasing in the EMD method and is called the Integral Mean Mode Decomposition (IMMD) method. IMMD decomposes signals with completeness and orthogonality, overcoming the shortcomings of linear filters in decomposing nonlinear HRV. Therefore, combining IMMD with entropy to analyze HRV yields the complexity of HRV at multiple adaptive scales, overcoming the aforementioned shortcomings of MSE and extracting the characteristics of HRV complexity more accurately. We call this entropy measurement method at multiple adaptive scales adaptive-scale entropy (ASE).

[0005] Many classifiers have been applied to PAF identification, including Bayesian linear discrimination (BLC), K-Nearest Neighbor (KNN), Random Forest (RF), Extreme Learning Machine (ELM), Convolutional Neural Network (CNN), and Support Vector Machine (SVM). Among these, SVM, which is based on the structural risk minimization criterion of statistical learning theory, effectively solves classification problems with small sample sizes and exhibits superior generalization ability compared to other classifiers.

Summary of the Invention

[0006] In order to overcome the shortcomings of the prior art, the technical problem to be solved by the present invention is to provide an improved method for screening, predicting and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics.

[0007] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is: a method for screening, predicting and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics, comprising the following steps:

[0008] S1: HRV data acquisition and preprocessing;

[0009] S2: The HRV is decomposed using the ASE method to obtain multiple components at an adaptive scale, and the entropy of each component is calculated as the extracted HRV features.

[0010] S3: The Wilcoxon signed-rank test is used to select the optimal feature subset through forward selection;

[0011] S4: Input the optimal feature subset into the SVM and use five-fold cross-validation to obtain the optimal classification model;

[0012] S5: Input the processed HRV data into the optimal classification model to obtain the HRV feature extraction, selection, and classification results.

[0013] The HRV data in step S1 is obtained from the publicly available PhysioNet AFPDB database. HRV data labeled (n), from multiple normal subjects or people who have never experienced PAF, and HRV data labeled (p), from multiple PAF patients, are selected from the AFPDB database, with two HRV records selected for each subject.

[0014] Four types of HRV data were extracted from the selected HRV data, including:

[0015] Normal sinus rhythm HRV data, denoted as NSR: the first 5 minutes of the odd-numbered HRV data in (n);

[0016] HRV data distant from PAF, denoted as D-PAF: the first 5 minutes of the odd-numbered HRV data in (p). These data contain no PAF during the 45 minutes before their start or the 45 minutes after their end;

[0017] HRV data immediately preceding PAF, denoted as F-PAF: the last 5 minutes of the even-numbered HRV data in (p), immediately after which PAF occurs;

[0018] HRV data of PAF onset: the 5 minutes of HRV data in (p) immediately after PAF begins, i.e., the even-numbered HRV data marked with both p and c in the AFPDB database.

[0019] Each type of HRV data contains 25 HRV segments, each lasting 5 minutes.

[0020] Step S2 specifically includes:

[0021] First, IMMD adaptively decomposes HRV to obtain a set of its residual components at different scales;

[0022] Secondly, each element in the set is coarsened;

[0023] Next, calculate the scale of each element in the set;

[0024] Finally, the entropy of each element in the set is calculated as the HRV feature.

[0025] The entropy of each element in the set is calculated as follows:

[0026] The entropy of each element was calculated using the AE, SE, PE, FE and DE methods respectively, resulting in five different feature sets of HRV complexity.

[0027] Step S3 specifically includes:

[0028] The Wilcoxon signed-rank test was used to examine the differences in feature quantities between two types of HRV data: normal sinus rhythm and distant PAF type, distant PAF and adjacent PAF type, and adjacent PAF and PAF attack type. Features with statistically significant differences (p < 0.05) were selected.

[0029] The selected features are then subjected to Min-Max normalization.

[0030] The SFS method is used to select feature subsets. The specific steps are as follows:

[0031] 1) Input the selected n features into SVM and train n models in the training set. The highest ACC of these models on the test set is denoted as ACC0. The feature corresponding to the model with ACC0 is set as f0, and the feature subset is denoted as F0 = {f0}.

[0032] 2) Add each of the remaining n-1 features to F0 to form n-1 new feature subsets. Input the new feature subsets into SVM and train n-1 models on the training set. The highest ACC of these models on the test set is denoted as ACC1. If ACC1 > ACC0, the new feature corresponding to the model with ACC1 is denoted as f1, and the feature subset is denoted as F1 = {f0, f1}.

[0033] 3) Repeat step 2) iteratively until each of the remaining n-i features has been added to the feature subset and none of the n-i new feature subsets improves ACC, i.e., ACC(i+1) ≤ ACCi. At this point, stop the iteration. The feature subset with the optimal ACC is Fi = {f0, …, fi}.

[0034] In step S4, the Gaussian kernel function in the LIBSVM toolbox is used to implement SVM classification. Five-fold cross-validation of the data is used to train and test the SVM model. The results are evaluated using three widely used performance metrics: accuracy (ACC), sensitivity (SN), and specificity (SP).

[0035] In PAF screening, HRV identification is achieved by inputting the LF / HF feature into the SVM.

[0036] In PAF prediction and monitoring, HRV identification is achieved by inputting the SE feature at the third scale of ASE into the SVM.

[0037] The advantages of this invention compared to existing technologies are as follows: the paroxysmal atrial fibrillation (PAF) screening, prediction, and monitoring method based on heart rate variability (HRV) features provided by this invention inputs the LF / HF feature into the SVM to achieve HRV identification in PAF screening, and inputs the SE feature at the third scale of ASE into the SVM to achieve HRV identification in PAF prediction and monitoring. The ASE method proposed in this invention is based on adaptive filtering technology, which can obtain the complexity features of HRV at multiple autonomous frequencies, reflecting the subject's nervous system regulation of heart rhythm. These features help identify key differences between HRV sequences of PAF in different states. Furthermore, the optimal classification model trained by this invention exhibits superior performance.

Attached Figure Description

[0038] The present invention will be further described below with reference to the accompanying drawings:

[0039] Figure 1 is a flowchart of the method of the present invention;

[0040] Figure 2 is a schematic diagram of element coarsening;

[0041] Figure 3 is a schematic diagram of the 5-minute HRV (p20c) of a typical PAF episode;

[0042] Figure 4 is a schematic diagram of the components and scales of the HRV (p20c) in Figure 3 obtained after ASE adaptive filtering and coarsening;

[0043] Figure 5 shows the features (mean ± 2 standard deviations) obtained from the four different types of HRV data using the ASE method (n = 25);

[0044] Figure 6 shows the frequency-SE relationship curves for the four different types of HRV sequences. The gray area in the graph represents the LF band (0.04–0.15 Hz).

Detailed Implementation

[0045] As shown in Figure 1, this invention provides a method for screening, predicting, and monitoring paroxysmal atrial fibrillation (PAF) based on heart rate variability (HRV) characteristics. First, HRV data is acquired, segmented, selected, and preprocessed. Then, features are extracted from the HRV data, and finally, feature classification is performed. Specifically, the method includes: first, the HRV is decomposed using the ASE method to obtain multiple components at adaptive scales, and the entropy of each component is calculated as the extracted HRV features; second, the Wilcoxon signed-rank test is used to screen the features, and the optimal feature subset is selected through forward selection; finally, the optimal feature subset is input into an SVM, and five-fold cross-validation is used to obtain the optimal classification model, thereby achieving PAF screening, prediction, and monitoring.

[0046] The present invention will be further described below.

[0047] 1. Data Source

[0048] This invention uses the PhysioNet AFPDB database, from which HRV data of 48 subjects are selected. Among them, 50 30-minute HRV records from normal subjects or subjects who have never experienced PAF are labeled (n), and 50 30-minute HRV records from PAF patients are labeled (p). Two adjacent HRV records come from the same subject (e.g., n01 and n02, p15 and p16). From this database, the present invention selected four types of HRV data: normal sinus rhythm (denoted as NSR; the first 5 minutes of the odd-numbered HRV records in (n), e.g., n01); distant from PAF (denoted as D-PAF; the first 5 minutes of the odd-numbered HRV records in (p), e.g., p15; these data contain no PAF during the 45 minutes before their start or the 45 minutes after their end); immediately preceding PAF (denoted as F-PAF; the last 5 minutes of the even-numbered HRV records in (p), e.g., p16, immediately after which PAF occurs); and PAF episode (the 5 minutes of HRV data in (p) immediately after PAF begins, i.e., the even-numbered HRV records marked with both p and c in the AFPDB database, e.g., p16c, which immediately follows p16). Each type of HRV data contains 25 HRV segments, each 5 minutes long.
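As an illustration only, the 5-minute segments described above could be cut from an RR-interval series along the following lines. This is a minimal Python sketch under stated assumptions: the RR intervals are given in seconds, and the helper names `first_window` and `last_window` are hypothetical and do not come from the AFPDB tooling.

```python
def first_window(rr_s, window_s=300.0):
    """Return the leading RR intervals whose total duration fits in window_s seconds."""
    out, total = [], 0.0
    for rr in rr_s:
        if total + rr > window_s:
            break
        out.append(rr)
        total += rr
    return out


def last_window(rr_s, window_s=300.0):
    """Return the trailing RR intervals whose total duration fits in window_s seconds."""
    return first_window(rr_s[::-1], window_s)[::-1]


# Toy record: 10 beats of 1.0 s followed by 10 beats of 0.5 s, cut with a 5 s window.
record = [1.0] * 10 + [0.5] * 10
head = first_window(record, window_s=5.0)   # first 5 s of the record
tail = last_window(record, window_s=5.0)    # last 5 s of the record
```

Under this sketch, NSR and D-PAF segments would correspond to `first_window` applied to odd-numbered records, and F-PAF segments to `last_window` applied to even-numbered (p) records.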

[0049] The four types of HRV data from the data source were combined into three datasets with different classification significance:

[0050] Group 1 consists of 25 NSR segments and 25 D-PAF segments. This classification problem tests whether potential PAF patients can be screened from normal individuals.

[0051] Group 2 consists of 25 D-PAF segments and 25 F-PAF segments. This classification problem tests whether the imminent occurrence of PAF can be predicted.

[0052] Group 3 consists of 25 F-PAF segments and 25 PAF episode segments. This classification problem tests whether PAF episodes can be monitored.

[0053] 2. HRV data processing

[0054] 2.1 HRV Feature Extraction Method

[0055] In the ASE method: First, IMMD adaptively decomposes HRV to obtain a set of its residual components at different scales; second, each element in the set is coarsened; third, the scale of each element in the set is calculated; finally, the entropy of each element in the set is calculated as the HRV feature.

[0056] 2.1.1 Obtaining the component set

[0057] The method for obtaining the set of residual components from an HRV sequence through IMMD adaptive decomposition is as follows:

[0058] 1) Let the HRV sequence be denoted as $x(n)$. Define $X_{jk}$ ($j, k = 1, 2, \ldots$; $k > j$) as the local part of the data sequence $x(n) = [x_1, x_2, \ldots, x_i, \ldots, x_n]$ between two adjacent extreme values $x_j$ and $x_k$. The length of $X_{jk}$ is $\tau = k - j + 1$. The mean of the local sequence $X_{jk}$ is:

[0059] $$m_{jk} = \frac{1}{\tau} \sum_{i=j}^{k} x_i \tag{1}$$

[0060] where $m_{jk}$ is fixed at the local midpoint of $X_{jk}$. Based on equation (1), all local mean points of the data sequence are obtained.

[0061] 2) Construct the data mean sequence m(n) by using all local mean points of the data sequence through cubic spline interpolation.

[0062] 3) The prototype mode function (PMF) is:

[0063] PMF(n) = x(n) - m(n) (2);

[0064] The above steps 1) to 3) are called a mean screening process, and PMF is denoted as PMF1.

[0065] 4) The mean screening process is repeated on PMF1 a further $k-1$ ($k = 2, 3, \ldots$) times to obtain $\mathrm{PMF}_k$. When $\mathrm{PMF}_k$ satisfies the Cauchy screening stopping criterion, the screening stops and $\mathrm{PMF}_k$ is taken as IMF1.

[0066] 5) The remaining component $r_1(n) = x(n) - \mathrm{IMF}_1$ is used as the new signal. Repeating the above process yields IMF2 and $r_2(n)$. Similarly, the components $\mathrm{IMF}_k$ ($k = 1, 2, 3, \ldots, m$) and the remaining components $r_k(n)$ are obtained.

[0067] 6) All local data of each residual component are replaced by their local mean (Equation (1)). Finally, all residual components constitute a "trend" set $R = \{r_1(n), r_2(n), \ldots, r_m(n)\}$ of the original time series.

[0068] 2.1.2 Element coarsening

[0069] All local data of each element in R are replaced by their local mean (Equation (1)), resulting in a coarse-grained R. A schematic diagram of the coarsening of elements is shown in Figure 2.
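The coarsening step can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: the function names are hypothetical, the two end points of the sequence are treated as extrema, flat plateaus are not treated as extrema, and a sample shared by two adjacent local segments is assigned the mean of the later segment. All of these are assumptions made only for the example.

```python
def extrema_indices(x):
    """Indices of local maxima and minima of x, plus both end points (assumption)."""
    idx = [0]
    for i in range(1, len(x) - 1):
        left, right = x[i] - x[i - 1], x[i + 1] - x[i]
        if left * right < 0:  # slope changes sign: local maximum or minimum
            idx.append(i)
    idx.append(len(x) - 1)
    return idx


def coarsen(x):
    """Replace each local segment between adjacent extrema by its mean (Equation (1))."""
    idx = extrema_indices(x)
    out = [float(v) for v in x]
    for j, k in zip(idx, idx[1:]):
        segment = x[j:k + 1]                      # inclusive, length tau = k - j + 1
        mean = sum(segment) / len(segment)
        stop = k + 1 if k == len(x) - 1 else k    # shared extremum goes to the next segment
        for i in range(j, stop):
            out[i] = mean
    return out


coarse = coarsen([1, 3, 2, 5])
```

In this toy input every interior sample is an extremum, so each local segment contains two samples and the output is piecewise constant over those segments.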

[0070] 2.1.3 Calculating element scale

[0071] Define the i-th local scale (i = 1, 2, ...) as the length of the local portion between the i-th and (i+1)-th maxima in IMFk. Because the spectrum of IMFk is narrowband, all local scales of IMFk are approximately equal, and the average value of all local scales is defined as the time scale of IMFk. Its calculation formula is:

[0072]

[0073] According to the decomposition process of the IMMD method, this scale is the width of the adaptive filtering window that filters out IMFk. It is therefore also the scale of the k-th element $r_k(n)$ in R, and it adapts to the original data.
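A minimal Python sketch of this scale calculation is given below. It assumes the scale is measured in samples (beats) as the spacing between successive local maxima; the function names `maxima_indices` and `time_scale` are hypothetical, and the unit used in the original formula is not recoverable from the text.

```python
def maxima_indices(x):
    """Indices of strict local maxima of x."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]


def time_scale(imf):
    """Average spacing (in samples) between successive maxima of an IMF-like sequence."""
    peaks = maxima_indices(imf)
    if len(peaks) < 2:
        raise ValueError("at least two maxima are needed to define a local scale")
    local_scales = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(local_scales) / len(local_scales)


# Toy narrowband component with one maximum every 4 samples.
scale = time_scale([0, 1, 0, -1] * 5)
```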

[0074] 2.1.4 Calculate the entropy of an element

[0075] Because AE, SE, permutation entropy (PE), fuzzy entropy (FE), and dispersion entropy (DE) are suitable for analyzing the complexity of short and noisy HRVs, they have become common and popular entropy measures in recent years. This invention employs these five entropies to measure the complexity of each element in a coarse-grained set R. For a time series x(i) of length N, the methods for measuring the five entropies are as follows:

[0076] 1)AE

[0077] Given the sequence $\{x_1, x_2, \ldots, x_N\}$, reconstruction yields $N-m+1$ $m$-dimensional vectors $X_m(i) = [x_i, x_{i+1}, \ldots, x_{i+m-1}]$, $i = 1, 2, \ldots, N-m+1$. The distance $d[X_m(i), X_m(j)]$ between any two $m$-dimensional vectors is defined as the maximum absolute value of the difference between corresponding elements in these two vectors. If the distance is less than or equal to the threshold $r$, the two vectors are said to be template-matched, and the probability that a vector is template-matched with $X_m(i)$ is denoted as:

[0078] $$C_i^m(r) = \frac{\#\{\, j : d[X_m(i), X_m(j)] \le r \,\}}{N-m+1} \tag{5}$$

[0079] Define $\varphi^m(r)$ as:

[0080] $$\varphi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r) \tag{6}$$

[0081] Increase the dimension of the reconstructed vector to $m+1$, and repeat the above steps to obtain the template-matching probability $C_i^{m+1}(r)$ of the $(m+1)$-dimensional vectors $X_{m+1}(i)$ and the corresponding $\varphi^{m+1}(r)$. AE is then defined as:

[0082] $$AE(m, r, N) = \varphi^m(r) - \varphi^{m+1}(r) \tag{7}$$

[0083] m is typically set to 2, and r is typically set to 0.2 times the standard deviation of the sequence.
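The AE definition above can be sketched in plain Python as follows. This is an illustrative, unoptimized sketch: the function name is hypothetical, self-matches are counted (as is usual for approximate entropy), and the population standard deviation is assumed for the default threshold.

```python
import math
from statistics import pstdev


def approximate_entropy(x, m=2, r=None):
    """Approximate entropy AE(m, r, N) of the sequence x, counting self-matches."""
    n = len(x)
    if r is None:
        r = 0.2 * pstdev(x)  # assumption: population standard deviation

    def phi(dim):
        vectors = [x[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in vectors:
            # A template match means the maximum absolute difference is <= r.
            matches = sum(
                1 for b in vectors
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / len(vectors))
        return total / len(vectors)

    return phi(m) - phi(m + 1)


ae_periodic = approximate_entropy([0, 1] * 20)  # strictly periodic toy series
```

A strictly periodic series such as the toy input is highly regular, so its AE is close to zero.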

[0084] 2) SE

[0085] Given the sequence $\{x_1, x_2, \ldots, x_N\}$, reconstruction yields $N-m+1$ $m$-dimensional vectors $X_m(i) = [x_i, x_{i+1}, \ldots, x_{i+m-1}]$, $i = 1, 2, \ldots, N-m+1$. The distance $d[X_m(i), X_m(j)]$ between any two $m$-dimensional vectors is defined as the maximum absolute value of the difference between corresponding elements in the two vectors. If the distance is less than or equal to the threshold $r$, the two vectors are said to be template-matched, and the probability that a vector is template-matched with $X_m(i)$, excluding self-matches, is denoted as:

[0086] $$B_i^m(r) = \frac{\#\{\, j : j \ne i,\ d[X_m(i), X_m(j)] \le r \,\}}{N-m} \tag{8}$$

[0087] Define $B^m(r)$ as:

[0088] $$B^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} B_i^m(r) \tag{9}$$

[0089] Increase the dimension of the reconstructed vector to $m+1$, and repeat the above steps to obtain the template-matching probability $A_i^m(r)$ of the $(m+1)$-dimensional vectors and the corresponding $A^m(r)$. Then SE is defined as:

[0090] $$SE(m, r, N) = -\ln \frac{A^m(r)}{B^m(r)} \tag{10}$$

[0091] m is typically set to 2, and r is typically set to 0.2 times the standard deviation of the sequence.
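A short Python sketch of SE is given below for illustration. It follows a commonly used convention in which the same number of templates (N - m) is used for both lengths m and m + 1, which may differ in normalization from the formulas of this document; the function name is hypothetical, and the population standard deviation is assumed for the default threshold.

```python
import math
from statistics import pstdev


def sample_entropy(x, m=2, r=None):
    """Sample entropy SE(m, r, N) of the sequence x, excluding self-matches."""
    n = len(x)
    if r is None:
        r = 0.2 * pstdev(x)  # assumption: population standard deviation

    def count_matches(dim):
        # Count template pairs (i < j) whose maximum absolute difference is <= r.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(dim)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        raise ValueError("no template matches; SE is undefined for this input")
    return -math.log(a / b)


se_periodic = sample_entropy([0, 1] * 20)  # strictly periodic toy series
```

For the strictly periodic toy series every match of length m is also a match of length m + 1, so the ratio is 1 and SE is zero.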

[0092] 3) PE

[0093] Given the sequence $\{x_1, x_2, \ldots, x_N\}$, reconstruction yields $N-m+1$ $m$-dimensional vectors $X_m(i) = [x_i, x_{i+1}, \ldots, x_{i+m-1}]$, $i = 1, 2, \ldots, N-m+1$. The elements of $X_m(i)$ are sorted by their numerical values to obtain the sorting type $\pi_i$ (for $m$ elements, there are $m!$ possible sorting types $\pi$). The probability that the $N-m+1$ $m$-dimensional vectors have the same sorting type $\pi_i$ is:

[0094] $$p(\pi_i) = \frac{\#\{\, X_m(k) : X_m(k) \text{ has sorting type } \pi_i \,\}}{N-m+1} \tag{11}$$

[0095] Suppose that there are $j$ distinct sorting types among the $N-m+1$ $m$-dimensional vectors; then PE is defined as:

[0096] $$PE(m) = -\sum_{i=1}^{j} p(\pi_i) \ln p(\pi_i) \tag{12}$$

[0097] m is generally taken as 3 to 7, with 5 to 7 recommended.
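The PE calculation can be illustrated with the following Python sketch. The function name is hypothetical, a delay of one sample is assumed, and ties between equal values are broken by position, which is one common convention rather than something stated in this document.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3):
    """Permutation entropy PE(m) of the sequence x with unit delay."""
    total = len(x) - m + 1
    # The sorting type of a vector is the order of indices that sorts its elements.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k]))
        for i in range(total)
    )
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


pe_alternating = permutation_entropy([1, 2, 1, 2, 1, 2, 1], m=2)
```

The alternating toy series contains only the two sorting types "rising" and "falling" in equal proportion, so its PE for m = 2 equals ln 2, while a monotonic series has a single sorting type and a PE of zero.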

[0098] 4)FE

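[0099] Given the sequence $\{x_1, x_2, \ldots, x_N\}$, reconstruction yields $N-m+1$ $m$-dimensional vectors. The distance $d_{ij}^m$ between any two $m$-dimensional vectors $X_m(i)$ and $X_m(j)$ is the maximum absolute value of the difference between corresponding elements in the two vectors, and the similarity between the two vectors is defined as:

[0100] $$D_{ij}^m = \exp\!\left(-\frac{(d_{ij}^m)^n}{r}\right) \tag{13}$$

[0101] Define $\varphi^m(n, r)$ as:

[0102] $$\varphi^m(n, r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \left[ \frac{1}{N-m} \sum_{j=1,\, j \ne i}^{N-m+1} D_{ij}^m \right] \tag{14}$$

[0103] Therefore, FE is defined as:

[0104] $$FE(m, n, r) = \ln \varphi^m(n, r) - \ln \varphi^{m+1}(n, r) \tag{15}$$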

[0105] m is generally taken as 2, n is generally taken as 2, and r is generally taken as 0.2 times the standard deviation of the sequence.
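For illustration, FE could be computed along the lines of the Python sketch below. Several choices here are assumptions: the function name is hypothetical, the same number of templates (N - m) is used for both vector lengths, no baseline (local mean) is removed from the vectors although many published fuzzy-entropy formulations do so, and the population standard deviation is used for the default r.

```python
import math
from statistics import pstdev


def fuzzy_entropy(x, m=2, n=2, r=None):
    """Fuzzy entropy FE(m, n, r) of x with exponential similarity exp(-(d**n) / r)."""
    size = len(x)
    if r is None:
        r = 0.2 * pstdev(x)  # assumption: population standard deviation
    if r <= 0:
        raise ValueError("r must be positive (the sequence must not be constant)")

    def phi(dim):
        count = size - m  # same number of templates for dim = m and dim = m + 1
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:
                    d = max(abs(x[i + k] - x[j + k]) for k in range(dim))
                    total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


fe_irregular = fuzzy_entropy(
    [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.5, 0.4, 0.6, 0.0, 1.0, 0.35]
)
```

Because lengthening a vector can only increase the distance between two templates, the similarity at length m + 1 never exceeds that at length m, so the value returned by this sketch is non-negative.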

[0106] 5)DE

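[0107] Given the sequence $\{x_1, x_2, \ldots, x_N\}$, its values are mapped to $y = \{y_1, y_2, \ldots, y_N\}$ in the range 0 to 1 using the normal cumulative distribution function:

[0108] $$y_i = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x_i} \exp\!\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) dt \tag{16}$$

[0109] where $\mu$ and $\sigma$ represent the mean and standard deviation of the sequence. Each $y_i$ is mapped to an integer from 1 to $c$ via $z_i^c = \mathrm{round}(c\,y_i + 0.5)$, where $\mathrm{round}(\cdot)$ represents the rounding function. Reconstructing $z_i^c$ yields the $m$-dimensional vectors $z_j^{m,c} = [z_j^c, z_{j+1}^c, \ldots, z_{j+m-1}^c]$, $j = 1, 2, \ldots, N-m+1$. Each vector $z_j^{m,c}$ is mapped to a dispersion pattern $\pi_{v_0 v_1 \ldots v_{m-1}}$, where $z_j^c = v_0, z_{j+1}^c = v_1, \ldots, z_{j+m-1}^c = v_{m-1}$. The probability of a pattern is defined as:

[0110] $$p(\pi_{v_0 v_1 \ldots v_{m-1}}) = \frac{\#\{\, j : z_j^{m,c} \text{ has pattern } \pi_{v_0 v_1 \ldots v_{m-1}} \,\}}{N-m+1} \tag{17}$$

[0111] Then DE is defined as:

[0112] $$DE(m, c) = -\sum_{\pi = 1}^{c^m} p(\pi_{v_0 v_1 \ldots v_{m-1}}) \ln p(\pi_{v_0 v_1 \ldots v_{m-1}}) \tag{18}$$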

[0113] m is generally taken as 2, and c is generally taken as 4 - 8.
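A compact Python sketch of DE follows, for illustration only. The function name is hypothetical, a delay of one sample is assumed, the default c = 6 is simply a value inside the stated 4 to 8 range, class labels are clamped to 1..c as a safeguard, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the rounding function.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def dispersion_entropy(x, m=2, c=6):
    """Dispersion entropy DE(m, c) of the sequence x with unit delay."""
    mu, sigma = mean(x), pstdev(x)
    if sigma == 0:
        raise ValueError("the sequence must not be constant")
    # Map each value to (0, 1) with the normal cumulative distribution function.
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]
    # Map to integer classes 1..c (clamped as a safeguard at the boundaries).
    z = [min(c, max(1, round(c * yi + 0.5))) for yi in y]
    total = len(x) - m + 1
    patterns = Counter(tuple(z[j:j + m]) for j in range(total))
    return -sum((k / total) * math.log(k / total) for k in patterns.values())


de_alternating = dispersion_entropy([0, 1] * 20, m=2, c=4)
```

For the alternating toy series the values fall into classes 1 and 4, so only the two dispersion patterns (1, 4) and (4, 1) occur and DE is close to ln 2.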

[0114] 2.2 Selection Method for HRV Feature Subsets

[0115] For all the feature quantities obtained in Section 2.1:

[0116] First, the Wilcoxon signed-rank test method is used to test the difference between the feature quantities of two types of HRV data in each group, and the features with statistically significant differences (p < 0.05) are selected. The Wilcoxon signed-rank test is a non-parametric test method that does not require the verification of the normal distribution of data and is applicable to the small-sample data of the present invention.

[0117] Secondly, the selected features are subjected to Min-Max normalization. The normalized value $f'_i$ of the i-th value of feature $f$ is:

[0118] $$f'_i = \frac{f_i - \min(f)}{\max(f) - \min(f)} \tag{19}$$

[0119] where $f_i$ is the i-th value in $f$, $\min(f)$ is the minimum value in $f$, and $\max(f)$ is the maximum value in $f$.
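As a small worked example of Min-Max normalization, the Python sketch below rescales a feature to the range 0 to 1. The function name is hypothetical, and returning zeros for a constant feature is an assumption added only to avoid division by zero.

```python
def min_max_normalize(f):
    """Rescale the values of one feature to the range [0, 1]."""
    lo, hi = min(f), max(f)
    if hi == lo:
        return [0.0] * len(f)  # assumption: a constant feature maps to zeros
    return [(v - lo) / (hi - lo) for v in f]


normalized = min_max_normalize([2, 4, 6])  # -> [0.0, 0.5, 1.0]
```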

[0120] Finally, the Sequential Forward Selection (SFS) method is used to select the feature subset. SFS is a heuristic search method. First, the target feature set is defined as an empty set. According to the feature evaluation function, each time a feature that makes the evaluation function better is added, and finally the feature subset that makes the evaluation function optimal is obtained. In the present invention, the ACC of the SVM classifier is used as the feature evaluation function. The specific steps are as follows:

[0121] 1) The n features selected in Sections 2.1 - 2.2 are each input into the SVM, and n models are trained on the training set. The highest accuracy (ACC) of these models on the test set is denoted as ACC0. The feature corresponding to the model with ACC0 is set as f0, and the feature subset is denoted as F0 = {f0}.

[0122] 2) The remaining n - 1 features are respectively added to F0 to form n - 1 new feature subsets. The new feature subsets are respectively input into the SVM and n - 1 models are trained in the training set. The highest ACC of these models on the test set is denoted as ACC1. If ACC1 > ACC0, the newly added feature corresponding to the model with ACC1 is denoted as f1, and the feature subset is denoted as F1 = {f0, f1}.

[0123] 3) Repeat step 2) iteratively until each of the remaining n - i features has been added to the feature subset and none of the resulting n - i new feature subsets improves ACC (i.e., ACC(i+1) ≤ ACCi). At this point, stop the iteration. The feature subset with the optimal ACC is Fi = {f0, …, fi}.
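The SFS steps above can be sketched generically in Python as follows. In this sketch `score` is a hypothetical callback standing in for "train an SVM on the subset and return its test ACC", and the accuracy table used in the example is invented purely to exercise the loop; neither comes from this document.

```python
def sequential_forward_selection(features, score):
    """Greedy SFS: add the best feature each round while the score strictly improves."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate every candidate subset formed by adding one remaining feature.
        candidates = [(score(selected + [f]), f) for f in remaining]
        cand_score, cand_feature = max(candidates, key=lambda t: t[0])
        if cand_score <= best_score:  # ACC(i+1) <= ACCi: stop the iteration
            break
        selected.append(cand_feature)
        remaining.remove(cand_feature)
        best_score = cand_score
    return selected, best_score


# Hypothetical accuracy table for three features "a", "b", "c" (illustrative values).
toy_acc = {
    frozenset("a"): 0.70, frozenset("b"): 0.78, frozenset("c"): 0.62,
    frozenset("ab"): 0.80, frozenset("bc"): 0.84, frozenset("ac"): 0.70,
    frozenset("abc"): 0.84,
}
subset, acc = sequential_forward_selection("abc", lambda s: toy_acc[frozenset(s)])
```

With the toy table, "b" is chosen first, adding "c" raises the score, and adding "a" brings no further improvement, so the search stops with the subset ["b", "c"].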

[0124] 2.3 Feature Classification Methods

[0125] This invention employs the Radial Basis Function (RBF) kernel from the LIBSVM toolbox for SVM classification because it has fewer kernel parameters, lower training complexity, and better classification performance, which has made it widely used in classification. Furthermore, five-fold cross-validation of the data is used for training and testing the SVM model. The results are evaluated using three widely used performance metrics: accuracy (ACC), sensitivity (SN), and specificity (SP), whose expressions are as follows:

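[0126] $$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{20}$$

[0127] $$SN = \frac{TP}{TP + FN} \tag{21}$$

[0128] $$SP = \frac{TN}{TN + FP} \tag{22}$$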

[0129] In the above formula: TP (True Positive) represents a true positive, TN (True Negative) represents a true negative, FP (False Positive) represents a false positive, and FN (False Negative) represents a false negative.
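As a worked example of these three metrics, the Python sketch below evaluates them from confusion-matrix counts. The function name is hypothetical, and the counts are illustrative values chosen for a test set with 25 samples per class; they are not taken from the tables of this document.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (ACC, SN, SP) computed from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    return acc, sn, sp


# Illustrative counts: all 25 positives detected, 14 of 25 negatives correctly rejected.
acc, sn, sp = classification_metrics(tp=25, tn=14, fp=11, fn=0)
```

With these counts the sensitivity is 100%, the specificity is 56%, and the accuracy is 78%, which shows how a classifier can reach a perfect SN while its ACC remains moderate.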

[0130] The method of the present invention will now be described with reference to specific embodiments.

[0131] Feature extraction results

[0132] IMMD Adaptive Filtering and Coarsening Results

[0133] Figure 3 shows the 5-minute HRV (p20c) of a typical PAF episode. Figure 4 shows the set of components $R = \{r_1(n), r_2(n), \ldots, r_m(n)\}$ ($m = 11$) of this HRV, obtained after ASE adaptive filtering and coarsening, together with their scale values. These scales decrease successively, and calculating the entropy of $r_1$ to $r_{11}$ gives the quantified HRV complexity at each of these scales.

[0134] Extracted HRV features

[0135] All sample data were filtered and coarsened using IMMD, yielding at least 10 components each. The five entropies (AE, SE, PE, FE, DE) of these components were calculated, resulting in five different feature sets of HRV complexity. The results of feature extraction using the ASE method for the four different types of HRV data are shown in Figure 5. Because the AE calculation is a biased estimate, the AE values of NSR in Figure 5(a) become negative at the last two scales. At multiple scales, PAF has the highest AE and F-PAF the lowest AE; NSR has the highest SE and F-PAF the lowest SE; PAF has the highest FE and D-PAF the lowest FE; D-PAF has the highest DE and PAF the lowest DE; and F-PAF has the highest PE and NSR the lowest PE.

[0136] Feature selection results

[0137] Feature results selected through statistical testing methods

[0138] For Groups 1 to 3, the Wilcoxon signed-rank test was used to examine the differences between the features of the two classes of data in each group. Features with statistically significant differences (p<0.05) were selected, and the results are shown in Table 1. In addition, for comparison, this invention also extracts the following conventional HRV features: standard deviation of the NN intervals (SDNN), root mean square of successive differences of NN intervals (RMSSD), proportion of successive NN intervals that differ by more than 50 ms (PNN50), standard deviation of differences between adjacent NN intervals (SDSD), power in the ultra-low frequency range (ULF), power in the very low frequency range (VLF), power in the low frequency range (LF), power in the high frequency range (HF), total power of spectral analysis (TP), and the ratio of low-frequency to high-frequency power (LF / HF), as well as 20 HRV features extracted using the MSE method (generally with a filter window τ = 1 to 20). The features among these that were selected by the Wilcoxon signed-rank test are also given in Table 1.

[0139]

[0140] Table 1 shows the feature set with statistically significant differences (p<0.05). Note: The number n (n=1~20) represents the feature quantity at the nth scale.

[0141] Feature results selected by the SFS method

[0142] This invention uses the classification accuracy (ACC) of the SVM classifier as the feature evaluation function of SFS, and selects features using SFS to form the optimal feature subset for classification. The optimal feature subset obtained by SFS from the features in Table 1 is shown in Table 2.

[0143]

[0144] Table 2 shows the optimal feature subsets obtained through SFS, where the number n (n = 1 to 20) represents the feature quantity at the nth scale.

[0145] Classification results

[0146] The optimal feature subset is input into the SVM for classification. This invention uses the Radial Basis Function (RBF) kernel from the LIBSVM toolbox to implement SVM classification, and five-fold cross-validation is used to select the optimal parameters for the SVM. Corresponding to the optimal feature subsets in Table 2, the optimal kernel function parameter g and penalty parameter c, selected within the range $[2^{-10}, 2^{10}]$, are shown in Table 3.
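The parameter search over this range can be sketched generically in Python as follows. This is only an outline under assumptions: the grid uses integer exponents with a step of 1, which the text does not specify, and `score` is a hypothetical callback standing in for "run five-fold cross-validation of an RBF-kernel SVM with parameters c and g and return the mean ACC".

```python
import itertools


def grid_search(score, exponents=range(-10, 11)):
    """Return (best_score, c, g) over the grid c, g in {2**-10, ..., 2**10}."""
    best = None
    for exp_c, exp_g in itertools.product(exponents, repeat=2):
        c, g = 2.0 ** exp_c, 2.0 ** exp_g
        s = score(c, g)
        if best is None or s > best[0]:
            best = (s, c, g)
    return best


# Toy scoring function with its maximum at c = 2**3 and g = 2**-2 (illustrative only).
best_score, best_c, best_g = grid_search(lambda c, g: -abs(c - 8.0) - abs(g - 0.25))
```

In practice the callback would wrap the actual SVM training and validation; the toy function here only demonstrates that the grid point with the highest score is returned.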

[0147]

[0148] Table 3 shows the optimal parameters obtained using SVM.

[0149] The classification performance was evaluated using three widely used classification performance metrics: accuracy (ACC), sensitivity (SN), and specificity (SP). The results are shown in Table 4.

[0150]

[0151] Table 4 shows the SVM classification performance.

[0152] For Group 1, although the HRV features obtained by the linear method outperform those obtained by the nonlinear method, their classification performance is only average. The best classification performance is achieved by the LF / HF features in the frequency domain, with an ACC of only 78%, SN of 100%, and SP of 56%. The worst performance is achieved by the DE feature subsets at scales 6 and 9 of ASE(DE), with an ACC of 62%, SN of 80%, and SP of 44%.

[0153] For Group 2, the SE features of ASE(SE) at the third scale are excellent, with ACC of 98%, SN of 100%, and SP of 96%, which are far superior to the classification performance of the other feature subsets in Table 2; the feature subset of ASE(PE) has ACC of 78%, SN of 76%, and SP of 80%, which are slightly better than the time domain, frequency domain, and MSE feature subsets.

[0154] For Group 3, the best classification performance still comes from the SE features at the 2nd or 3rd scale of ASE(SE), with ACC reaching 96%, SN at 96%, and SP at 96%. The classification performance of the SE feature subsets at the 3rd, 8th, and 16th scales of the MSE method, along with the PNN50 feature in the time domain, is good, with ACC all at 84%, and SN and SP being very close. The feature classification performance of ASE(AE, FE, PE, DE) is generally average.

[0155] Table 5 lists comparable PAF screening (Group 1) and prediction (Group 2) methods from the literature, all of which use common, typical HRV features and are based on AFPDB data. No comparable work was found for PAF monitoring (Group 3). Since most patients experience asymptomatic PAF attacks, the proposed Group 3 classification problem has practical significance for monitoring PAF attacks. Group 3 classification can also be used to test the performance of PAF features and classification systems; this invention therefore suggests that PAF identification systems should also consider Group 3. For Group 1, the recognition performance of the proposed method is only moderate, with the best results achieved by ASE(FE) and ASE(PE). The proposed method outperforms the comparable works in Table 5 (P. Langley's method has an ACC of 71%, but it uses 30 minutes of data and its accuracy is verified only by single-fold validation), yet it is inferior to the SVM system based on the LF/HF feature (see Table 4). For Group 2, M. Surucu's method claims 100% ACC, SN, and SP. However, that method uses 48 features, which may lead to the curse of dimensionality in practical applications; its accuracy is verified only by single-fold validation, so its generalization ability needs further evaluation; and it uses 30 minutes of data, giving poorer real-time performance than the other methods. Therefore, the ASE(SE) method proposed in this invention is optimal for Group 2 classification. For Group 3, among the methods tested in this invention (Table 2), the best performance again comes from the ASE(SE) method.

[0156] Furthermore, most methods in Table 5 use at least four features, with higher-performing methods using even more features. As mentioned earlier, a larger number of features increases the likelihood of falling into the curse of dimensionality in practical applications, leading to poor generalization ability. Moreover, most methods use a significant number of time-domain and / or frequency-domain features, the medical implications of which are rarely interpretable. In contrast, the method proposed in this invention typically uses only one or two features, thus exhibiting good generalization ability; these features can be traced back to the original HRV sequence, making their medical significance easily traceable.

[0157]

[0158] Table 5 compares the present invention with other methods. In the table: a mRMR = minimal redundancy maximal relevance; b ILFS = infinite latent feature selection; c AVNN = the mean value of the NN intervals of the signal; d NN20 = the total number of consecutive HRV values whose difference is greater than 20 ms; e TINN = the triangular interpolation of the NN interval; f HH-P1 = normalized bispectral entropy; g HH-P2 = normalized bispectral squared entropy.

[0159] The medical implications of extracting HRV features using the ASE method will be discussed below.

[0160] The complexity of biological systems reflects their ability to adapt and function in a constantly changing environment; the complexity of biological systems is multi-scale; disease states, as well as aging, reduce an individual's adaptability, and the complexity of the system will decrease or even be lost.

[0161] In the method proposed in this invention, ASE(SE) exhibits the best performance, indicating that ASE(SE) extracts the most typical features of the HRV sequence; only these features are discussed here. In Figure 5(b), it can be observed that, on most time scales, the ASE(SE) values rank in decreasing order as NSR > D-PAF > PAF > F-PAF. This can be explained as follows. Because healthy individuals have the best adaptability and the most complex cardiac dynamics, NSR has the highest ASE(SE) value. The three PAF pathological states reduce an individual's ability to regulate cardiac rhythm to varying degrees, and thus reduce complexity to varying degrees. Although D-PAF is far from PAF onset, PAF inevitably causes permanent damage to the individual, so ASE(SE) decreases, but only to a small extent. F-PAF is adjacent to PAF, and the cardiac dynamics are about to transition from complexity to disorder, resulting in a lower ASE(SE) value than for D-PAF. When PAF occurs, ASE(SE) increases relative to F-PAF. At this time, the HRV values (RR interval values) are absolutely irregular, and ASE(SE) decreases exponentially, resembling white noise [27,50], which indicates a transition from complexity to disorder in the cardiac dynamics. Therefore, the ASE(SE) values of HRV during a PAF attack cannot be compared with those of the other three HRV types to judge changes in complexity.

[0162] On the other hand, after converting the adaptive scales into frequencies, frequency-SE curves are obtained for the four types of HRV sequences, as shown in Figure 6. Most scholars believe that the LF (0.04–0.15 Hz) component of HRV mainly reflects sympathetic regulation, while the HF (0.15–0.4 Hz) component reflects parasympathetic regulation [44, 51-52]. In Figure 6, within the LF band, the differences in SE (at scales 3 and 4) between NSR and D-PAF are statistically significant (see Table 1), indicating that PAF patients have more pronounced permanent sympathetic nerve damage and a decline in health; the differences between D-PAF and F-PAF are statistically significant for all corresponding features (see Table 1). This difference between LF and HF reflects the different degrees of change in vagal and sympathetic tone when PAF is about to occur, which disrupts their balance and is considered the main mechanism of PAF onset. During a PAF attack, multifocal abnormal electrical activity in the atria causes absolutely irregular RR intervals; at this point the patient's cardiac dynamics have changed from complex to completely disordered, and the HRV behaves like white noise.
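To make the band definitions concrete, the following minimal sketch computes an LF/HF ratio from a power spectral density (PSD) by trapezoidal integration over the LF (0.04–0.15 Hz) and HF (0.15–0.4 Hz) bands. It assumes the PSD has already been estimated on a frequency grid that includes the band edges; the spectral estimator is not specified in this section, and the flat PSD in the example is purely illustrative.

```python
def band_power(freqs, psd, lo, hi):
    """Trapezoidal integral of psd over grid segments lying inside [lo, hi]."""
    total = 0.0
    for k in range(len(freqs) - 1):
        f0, f1 = freqs[k], freqs[k + 1]
        if f0 >= lo and f1 <= hi:
            total += 0.5 * (psd[k] + psd[k + 1]) * (f1 - f0)
    return total


def lf_hf_ratio(freqs, psd):
    """Ratio of LF (0.04-0.15 Hz) power to HF (0.15-0.4 Hz) power."""
    lf = band_power(freqs, psd, 0.04, 0.15)
    hf = band_power(freqs, psd, 0.15, 0.40)
    return lf / hf


# Illustrative input: a flat PSD sampled every 0.01 Hz from 0 to 0.5 Hz.
freqs = [i / 100 for i in range(51)]
psd = [1.0] * len(freqs)
ratio = lf_hf_ratio(freqs, psd)  # band widths 0.11 Hz and 0.25 Hz, ratio near 0.44
```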

[0163] For PAF screening, this invention proposes inputting the LF/HF feature into an SVM for HRV identification, while for PAF prediction and monitoring the SE feature at the third scale of the ASE method is input into an SVM for HRV identification. The performance of the proposed PAF screening, prediction, and monitoring methods is comparable to that of leading similar PAF systems in the literature. Furthermore, the proposed ASE method, based on adaptive filtering techniques, can obtain complexity features of HRV at multiple autonomous frequencies, reflecting the nervous system's regulation of the subject's heart rhythm. These features help identify key differences between HRV sequences in PAF under different conditions, thereby assisting cardiologists in exploring the pathogenesis of PAF.

[0164] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics, characterized in that it includes the following steps: S1: HRV data acquisition and preprocessing; S2: the HRV is decomposed using the ASE method to obtain multiple components at adaptive scales, and the entropy of each component is calculated as the extracted HRV feature; S3: the Wilcoxon signed-rank test is applied, and the optimal feature subset is selected by forward selection; S4: the optimal feature subset is input into the SVM, and five-fold cross-validation is used to obtain the optimal classification model; S5: the processed HRV data are input into the optimal classification model to obtain the HRV feature extraction, selection, and classification results.

2. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to claim 1, characterized in that: the HRV data in step S1 are obtained from the publicly available PhysioNet AFPDB database. HRV data labeled (n), from multiple normal subjects or people who have never experienced PAF, and HRV data labeled (p), from multiple PAF patients, are selected from the AFPDB database, with two HRV records selected for each subject. Four types of HRV data are extracted from the selected records: HRV data of normal sinus rhythm, denoted NSR, taken as the first 5 minutes of the odd-numbered HRV records in (n); HRV data far from PAF, denoted D-PAF, taken as the first 5 minutes of the odd-numbered HRV records in (p), where these records have no PAF in the 45 minutes before their start or the 45 minutes after their end; HRV data immediately adjacent to PAF, denoted F-PAF, taken as the last 5 minutes of the even-numbered HRV records in (p), immediately after which PAF occurs; and HRV data of PAF onset, taken as the 5 minutes immediately after PAF occurs in (p), these data coming from the even-numbered HRV records marked with p and c in the AFPDB database. Each type of HRV data contains 25 HRV segments, each lasting 5 minutes.

3. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to claim 2, characterized in that step S2 specifically includes: first, IMMD adaptively decomposes the HRV to obtain a set of its residual components at different scales; second, each element in the set is coarse-grained; next, the scale of each element in the set is calculated; finally, the entropy of each element in the set is calculated as the HRV feature.
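As a non-limiting illustration of the coarse-graining and entropy steps recited above, the sketch below uses the coarse-graining procedure of conventional multiscale entropy (averaging non-overlapping windows) and a standard sample entropy (SE) estimate. The IMMD decomposition itself is not reproduced; the parameters m = 2 and r = 0.2 times the standard deviation, and the use of "less than or equal to r" as the match criterion, are common defaults assumed here rather than values taken from this document.

```python
import math


def coarse_grain(x, tau):
    """Average consecutive non-overlapping windows of length tau."""
    n_windows = len(x) // tau
    return [sum(x[i * tau:(i + 1) * tau]) / tau for i in range(n_windows)]


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy with tolerance r = r_factor * std(x) (assumed defaults)."""
    n = len(x)
    mean = sum(x) / n
    r = r_factor * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return math.inf        # undefined for too few matches
    return -math.log(a / b)
```

Under these assumptions, each decomposed component would be passed through `coarse_grain` and then `sample_entropy`; a strictly periodic series gives an SE of zero, while an irregular series gives a larger value.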

4. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to claim 3, characterized in that the entropy of each element in the set is calculated as follows: the entropy of each element is calculated using the AE, SE, PE, FE, and DE methods respectively, yielding five different feature sets of HRV complexity.

5. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to claim 4, characterized in that step S4 specifically includes: the Wilcoxon signed-rank test is used to examine the differences in feature values between two types of HRV data: normal sinus rhythm versus far-from-PAF, far-from-PAF versus adjacent-to-PAF, and adjacent-to-PAF versus PAF onset; features with statistically significant differences (p < 0.05) are selected; the selected features are then Min-Max normalized; the SFS method is used to select the feature subset, with the following specific steps: 1) each of the n selected features is input into an SVM separately and trained on the training set to obtain n models; the highest ACC achieved by these models on the test set is denoted ACC0, the feature corresponding to the model with ACC0 is denoted f0, and the feature subset is denoted F0 = {f0}; 2) the remaining n-1 features are each added to F0 to form n-1 new feature subsets, which are input into the SVM and trained on the training set to obtain n-1 models; the highest ACC achieved by these models on the test set is denoted ACC1; if ACC1 > ACC0, the new feature corresponding to the model with ACC1 is denoted f1, and the feature subset is denoted F1 = {f0, f1}; 3) step 2) is repeated iteratively until none of the n-i new feature subsets, formed by adding each of the remaining n-i features to the current feature subset, can improve ACC; iteration then stops, and the feature subset with the optimal ACC is Fi = {f0, …, fi}.

6. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to claim 5, characterized in that: in step S5, the Gaussian kernel function in the LIBSVM toolbox is used to implement SVM classification; five-fold cross-validation of the data is used to train and test the SVM model; and the results are evaluated using three widely used performance metrics: accuracy (ACC), sensitivity (SN), and specificity (SP).
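As a non-limiting illustration, the Min-Max normalization and sequential forward selection (SFS) steps recited in claim 5 can be sketched as follows. The callback `evaluate`, the feature names, and the accuracy values in the example are hypothetical; in practice `evaluate` would train an SVM on the given feature subset and return its test ACC.

```python
def min_max_normalize(column):
    """Rescale a list of feature values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in column]


def sequential_forward_selection(features, evaluate):
    """Greedily add the feature that most improves ACC; stop when none does.

    evaluate(subset) is a hypothetical callback returning the ACC of an
    SVM trained on the listed features.
    """
    selected, best_acc = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Score every candidate subset formed by adding one remaining feature.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        acc, feat = max(scored, key=lambda pair: pair[0])
        if acc <= best_acc:
            break  # no candidate improves ACC, so iteration stops
        selected.append(feat)
        remaining.remove(feat)
        best_acc = acc
    return selected, best_acc


# Hypothetical ACC values for illustration only.
toy_acc = {
    frozenset(["SE3"]): 0.90, frozenset(["LF/HF"]): 0.70, frozenset(["PE2"]): 0.60,
    frozenset(["SE3", "PE2"]): 0.94, frozenset(["SE3", "LF/HF"]): 0.88,
    frozenset(["SE3", "PE2", "LF/HF"]): 0.93,
}
subset, acc = sequential_forward_selection(
    ["SE3", "LF/HF", "PE2"], lambda s: toy_acc[frozenset(s)]
)
# subset == ["SE3", "PE2"], acc == 0.94
```

Because the search is greedy, it evaluates at most n + (n-1) + … subsets rather than all 2^n combinations, at the cost of possibly missing a better subset that does not contain the best single feature.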

7. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to any one of claims 1-6, characterized in that: in PAF screening, HRV identification is achieved by inputting the LF/HF feature into the SVM.

8. The method for screening, predicting, and monitoring paroxysmal atrial fibrillation based on heart rate variability characteristics according to any one of claims 1-6, characterized in that: in PAF prediction and monitoring, HRV identification is achieved by inputting the SE feature at the third scale of ASE into the SVM.