A method for identifying abnormal health management indicators
By collecting multi-dimensional health indicator data and applying entropy weighting and model fusion, the method addresses the neglected indicator correlations and subjective weight setting of traditional methods, thereby improving the accuracy and stability of health indicator anomaly identification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHENGDU MEIYOU MEDICAL TECH CO LTD
- Filing Date
- 2026-03-09
- Publication Date
- 2026-06-19
AI Technical Summary
Traditional methods for identifying abnormal health indicators ignore both the correlations between health indicators and the differences in their degree of influence, and they rely on clinical experience to set thresholds, resulting in low accuracy and poor robustness. Existing machine learning approaches lack targeted feature engineering and weight optimization, making it difficult to meet actual health management needs.
The method collects multi-dimensional health indicator data, calculates the weight of each health indicator using the entropy weight method, constructs an indicator correlation feature matrix, generates a comprehensive feature vector, and uses isolation forest and logistic regression models to determine the abnormality level, thereby capturing the correlation between indicators and quantifying the weights objectively.
It improves the accuracy and robustness of identifying abnormal health indicators, avoids the limitations of subjective experience-based weighting, and enhances the ability to identify early potential abnormalities.
Figure CN122245806A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a method for identifying abnormal health management indicators.
Background Technology
[0002] With the rapid development of the health management industry, anomaly identification based on multi-dimensional health indicators has become a core technology for chronic disease prevention and control and personal health monitoring. Traditional methods for identifying anomalies in health indicators often rely on a single threshold, directly determining whether an indicator is abnormal by comparing its measured value with a clinical standard threshold. However, this method has significant drawbacks: First, it ignores the correlation between different health indicators. For example, there is a significant correlation between blood glucose and glycated hemoglobin, and a single threshold cannot reflect the impact of this correlation on health status. Second, it does not consider the different degrees of influence of different indicators on health status. For example, blood pressure has a higher weighting for cardiovascular disease than body mass index, and traditional methods use the same judgment standard for all indicators, resulting in low accuracy. Third, it relies on clinical experience to set thresholds, lacks data-driven quantitative analysis, and is insufficient in identifying early potential anomalies.
[0003] In recent years, attempts have been made to use machine learning models for anomaly identification, but most of these attempts input raw indicator data directly into the model without targeted feature engineering or weight optimization. This results in poor robustness and generalization ability of the model, making it difficult to meet the needs of actual health management scenarios.
Summary of the Invention
[0004] The purpose of this invention is to overcome the shortcomings of the prior art and provide a method for identifying abnormal health management indicators.
[0005] The objective of this invention is achieved through the following technical solution: a method for identifying abnormal health management indicators, comprising the following steps:
[0006] S1: Collect multi-dimensional health indicator data of the target object;
[0007] S2: Perform missing value imputation, outlier removal, and standardization on the collected health indicator data to obtain a standardized dataset;
[0008] S3: Calculate the weight values of each health indicator based on the entropy weight method;
[0009] S4: Calculate the Pearson correlation coefficient between any two health indicators and construct the indicator correlation feature matrix;
[0010] S5: Perform weighted fusion of the standardized indicator data and the correlation feature matrix to generate a comprehensive feature vector;
[0011] S6: Input the comprehensive feature vector into the model, output the abnormality score of the health indicator and determine the abnormality level.
[0012] Preferably, in step S1, the health indicator data includes physiological indicators and biochemical indicators, wherein the physiological indicators include heart rate, systolic blood pressure, diastolic blood pressure, body temperature and blood oxygen saturation; and the biochemical indicators include fasting blood glucose, glycated hemoglobin, total cholesterol, triglycerides and uric acid.
[0013] Preferably, in step S2, the formula for filling missing values is:
[0014] $\hat{x}_{ij} = \sum_{k=1}^{K} \omega_k x_{kj}$;
[0015] $\omega_k = \dfrac{1/d_{ik}}{\sum_{l=1}^{K} 1/d_{il}}$;
[0016] where $\hat{x}_{ij}$ is the fill value of the $j$-th indicator of the $i$-th sample, $K$ is the number of nearest neighbor samples, $\omega_k$ is the weight of the $k$-th nearest neighbor sample, $d_{ik}$ is the Euclidean distance between the $i$-th sample and the $k$-th nearest neighbor sample, and $x_{kj}$ is the original value of the $j$-th indicator of the $k$-th nearest neighbor sample.
[0017] Preferably, in step S2, the formula for standardization is:
[0018] $z_{ij} = \dfrac{x_{ij} - \mu_j}{\sigma_j}$;
[0019] where $z_{ij}$ is the standardized value of the $j$-th indicator of the $i$-th sample, $\mu_j$ is the mean of the $j$-th indicator, and $\sigma_j$ is the standard deviation of the $j$-th indicator.
[0020] Preferably, step S3 further includes the following step:
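[0021] S31: Construct a standardized decision matrix $Z = (z_{ij})_{n \times m}$, where $n$ is the sample size and $m$ is the number of indicators;
[0022] S32: Calculate the proportion $p_{ij}$ of the $i$-th sample under the $j$-th indicator,
[0023] $p_{ij} = \dfrac{z_{ij}}{\sum_{i=1}^{n} z_{ij}}$;
[0024] S33: Calculate the entropy value $e_j$ of the $j$-th indicator,
[0025] $e_j = -\dfrac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$;
[0026] if $p_{ij} = 0$, then $p_{ij} \ln p_{ij} = 0$;
[0027] S34: Calculate the coefficient of difference $g_j$ of the $j$-th indicator,
[0028] $g_j = 1 - e_j$;
[0029] S35: Calculate the weight $w_j$ of the $j$-th indicator,
[0030] $w_j = \dfrac{g_j}{\sum_{j=1}^{m} g_j}$.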
[0031] Preferably, in step S4, the formula for calculating the Pearson correlation coefficient is:
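[0032] $r_{jl} = \dfrac{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)(z_{il} - \bar{z}_l)}{\sqrt{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)^2}\,\sqrt{\sum_{i=1}^{n} (z_{il} - \bar{z}_l)^2}}$;
[0033] where $r_{jl}$ is the correlation coefficient between the $j$-th indicator and the $l$-th indicator, and $\bar{z}_j$ and $\bar{z}_l$ are the means of the respective indicators.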
[0034] Preferably, in step S5, the formula for calculating the weighted fusion is:
[0035] [formula missing];
[0036] where $F_{ij}$ is the fusion feature value of the $j$-th indicator of the $i$-th sample, and $\lambda$ is the associated feature weight coefficient, with a value range of [value range missing].
[0037] Preferably, step S6 further includes the following step:
[0038] S61: Input the comprehensive feature vector into the isolation forest model to obtain the first anomaly score $S_1$, [value range missing];
[0039] S62: Input the comprehensive feature vector into the logistic regression model to obtain the second anomaly score $S_2$, [value range missing];
[0040] S63: Calculate the fusion anomaly score $S$,
[0041] [formula missing];
[0042] where $\alpha$ is the fusion weighting coefficient, with a value range of [value range missing];
[0043] when [condition missing], the result is determined to be normal;
[0044] when [condition missing], the result is determined to be a mild abnormality;
[0045] when [condition missing], the result is determined to be a moderate abnormality;
[0046] when [condition missing], the result is determined to be a severe abnormality.
[0047] The present invention has the following advantages: By collecting multi-dimensional health indicator data and calculating the weight of each health indicator based on the entropy weight method, the present invention objectively quantifies the degree of influence of the indicators and avoids the limitations of weighting by subjective experience. At the same time, it generates a comprehensive feature vector and inputs it into the model to determine the anomaly level, thereby mining the potential correlation between indicators, solving the problem that traditional methods ignore indicator correlation, and improving the accuracy and robustness of anomaly identification.
Attached Figure Description
[0048] Figure 1 is a schematic diagram of the process for identifying abnormal health management indicators.
Detailed Implementation
[0049] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.
[0050] Therefore, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.
[0051] It should be noted that, unless otherwise specified, the embodiments and features described in this invention can be combined with each other.
[0052] It should be noted that similar labels and letters in the following figures indicate similar items. Therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.
[0053] In the description of this invention, it should be noted that the terms "center," "upper," "lower," "left," "right," "vertical," "horizontal," "inner," and "outer," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, or the orientation or positional relationship commonly used when the product of this invention is in use, or the orientation or positional relationship commonly understood by those skilled in the art. They are only used for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of this invention. In addition, the terms "first," "second," etc., are only used to distinguish descriptions and should not be construed as indicating or implying relative importance.
[0054] In the description of this invention, it should also be noted that, unless otherwise explicitly specified and limited, the terms "set," "install," "connect," and "link" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in this invention based on the specific circumstances.
[0055] In this embodiment, as shown in Figure 1, a method for identifying abnormal health management indicators includes the following steps:
[0056] S1: Collect multi-dimensional health indicator data of the target object; preferably, in step S1, the health indicator data includes physiological indicators and biochemical indicators, wherein the physiological indicators include heart rate, systolic blood pressure, diastolic blood pressure, body temperature and blood oxygen saturation; the biochemical indicators include fasting blood glucose, glycated hemoglobin, total cholesterol, triglycerides and uric acid.
[0057] S2: Perform missing value imputation, outlier removal, and standardization on the collected health indicator data to obtain a standardized dataset;
[0058] S3: Calculate the weight values of each health indicator based on the entropy weight method;
[0059] S4: Calculate the Pearson correlation coefficient between any two health indicators and construct the indicator correlation feature matrix;
[0060] S5: Perform weighted fusion of the standardized indicator data and the correlation feature matrix to generate a comprehensive feature vector;
[0061] S6: Input the comprehensive feature vector into the model, output anomaly scores for health indicators, and determine the anomaly level. By collecting multi-dimensional health indicator data and calculating the weight values of each health indicator based on the entropy weight method, the objective quantification of the influence of indicators is achieved, avoiding the limitations of subjective experience-based weighting. At the same time, a comprehensive feature vector is generated and input into the model to determine the anomaly level, thereby uncovering the potential correlation between indicators, solving the problem of traditional methods ignoring indicator correlation, and improving the accuracy and robustness of anomaly identification.
[0062] Furthermore, in step S2, the formula for filling missing values is:
[0063] $\hat{x}_{ij} = \sum_{k=1}^{K} \omega_k x_{kj}$;
[0064] $\omega_k = \dfrac{1/d_{ik}}{\sum_{l=1}^{K} 1/d_{il}}$;
[0065] where $\hat{x}_{ij}$ is the fill value of the $j$-th indicator of the $i$-th sample, $K$ is the number of nearest neighbor samples, $\omega_k$ is the weight of the $k$-th nearest neighbor sample, $d_{ik}$ is the Euclidean distance between the $i$-th sample and the $k$-th nearest neighbor sample, and $x_{kj}$ is the original value of the $j$-th indicator of the $k$-th nearest neighbor sample. Further, in step S2, the standardization formula is:
[0066] $z_{ij} = \dfrac{x_{ij} - \mu_j}{\sigma_j}$;
[0067] where $z_{ij}$ is the standardized value of the $j$-th indicator of the $i$-th sample, $\mu_j$ is the mean of the $j$-th indicator, and $\sigma_j$ is the standard deviation of the $j$-th indicator. Specifically, missing value imputation uses distance-weighted assignment based on sample similarity to improve imputation accuracy; outlier filtering is based on the $3\sigma$ principle (under a normal distribution, the probability of extreme values is extremely low) and eliminates statistically significant outliers; standardization normalizes the data using the mean and standard deviation, eliminates the differences in units between indicators, and enables horizontal comparison between indicators.
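The three preprocessing operations of step S2 can be illustrated with a minimal Python sketch. It is not the patent's implementation. It assumes normalized inverse-distance weights for the nearest neighbors, distances measured only on the indicators that are present in the incomplete sample, removal of whole samples that violate the 3-sigma rule, and the population standard deviation. The function names `knn_impute`, `drop_3sigma_outliers`, and `standardize` are hypothetical, and at least one complete sample is assumed to exist.

```python
import math
from statistics import mean, pstdev


def knn_impute(rows, k=3):
    """Fill None entries with an inverse-distance-weighted mean of the
    k nearest complete rows (assumed weighting scheme)."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]
        # Euclidean distance computed on the indicators observed in this row.
        neighbours = sorted(
            ((math.sqrt(sum((row[j] - c[j]) ** 2 for j in observed)), c)
             for c in complete),
            key=lambda pair: pair[0],
        )[:k]
        eps = 1e-12  # avoids division by zero for an identical neighbour
        inv = [1.0 / (d + eps) for d, _ in neighbours]
        total = sum(inv)
        new_row = list(row)
        for j, v in enumerate(row):
            if v is None:
                new_row[j] = sum(
                    w * c[j] for w, (_, c) in zip(inv, neighbours)
                ) / total
        filled.append(new_row)
    return filled


def drop_3sigma_outliers(rows):
    """Drop samples with any indicator more than 3 standard deviations
    from that indicator's mean (assumed: the whole sample is removed)."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    return [
        r for r in rows
        if all(sd[j] == 0 or abs(r[j] - mu[j]) <= 3 * sd[j]
               for j in range(len(r)))
    ]


def standardize(rows):
    """Z-score each indicator column; constant columns map to 0.0."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    return [
        [(r[j] - mu[j]) / sd[j] if sd[j] > 0 else 0.0 for j in range(len(r))]
        for r in rows
    ]
```

A typical call order would be `standardize(drop_3sigma_outliers(knn_impute(rows)))`, which matches the order in which step S2 lists the operations.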
[0068] In this embodiment, step S3 further includes the following step:
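[0069] S31: Construct a standardized decision matrix $Z = (z_{ij})_{n \times m}$, where $n$ is the sample size and $m$ is the number of indicators;
[0070] S32: Calculate the proportion $p_{ij}$ of the $i$-th sample under the $j$-th indicator,
[0071] $p_{ij} = \dfrac{z_{ij}}{\sum_{i=1}^{n} z_{ij}}$;
[0072] S33: Calculate the entropy value $e_j$ of the $j$-th indicator,
[0073] $e_j = -\dfrac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$;
[0074] if $p_{ij} = 0$, then $p_{ij} \ln p_{ij} = 0$;
[0075] S34: Calculate the coefficient of difference $g_j$ of the $j$-th indicator,
[0076] $g_j = 1 - e_j$;
[0077] S35: Calculate the weight $w_j$ of the $j$-th indicator,
[0078] $w_j = \dfrac{g_j}{\sum_{j=1}^{m} g_j}$.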
[0079] Specifically, the entropy method determines weights based on the dispersion of indicator data. The higher the dispersion, the stronger the indicator's ability to distinguish health status, and the greater the weight. The analytic hierarchy process (AHP) transforms qualitative knowledge into quantitative weights by constructing a hierarchical model and an expert judgment matrix. At the same time, consistency checks ensure the rationality of the weights. The coupling formula uses product normalization to integrate subjective and objective weights, which avoids the subjectivity of human experience and compensates for the neglect of professional knowledge by pure data-driven approaches. This quantifies the degree of influence of different indicators on health status assessment, takes into account both the objective characteristics of the data and professional knowledge, avoids assessment bias caused by a single weight, and provides a basis for accurately calculating the deviation of indicators.
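The entropy weight steps S31 to S35 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's code. The proportions only make sense for non-negative inputs, whereas z-scores can be negative, so the sketch includes a hypothetical `min_max` helper that rescales each column to [0, 1] first; the source does not say how this is handled. The function name `entropy_weights` and the equal-weight fallback for fully constant data are also assumptions.

```python
import math


def min_max(matrix):
    """Rescale each indicator column to [0, 1] (assumed pre-step so that
    the proportions are non-negative)."""
    cols = list(zip(*matrix))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(row[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 1.0
         for j in range(len(row))]
        for row in matrix
    ]


def entropy_weights(matrix):
    """Entropy weight method on an n-samples x m-indicators matrix of
    non-negative values."""
    n, m = len(matrix), len(matrix[0])
    diffs = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        if total <= 0:
            raise ValueError("each indicator column needs a positive sum")
        p = [v / total for v in col]                 # S32: proportions
        # S33: entropy, using the convention 0 * ln 0 = 0
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        diffs.append(1.0 - e)                        # S34: coefficient of difference
    s = sum(diffs)
    if s <= 0:
        return [1.0 / m] * m                         # assumed fallback: equal weights
    return [g / s for g in diffs]                    # S35: normalized weights
```

An indicator whose values are identical across samples has maximum entropy and receives a weight near zero, while a more dispersed indicator receives a larger weight, which matches the description above.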
[0080] Furthermore, in step S4, the formula for calculating the Pearson correlation coefficient is:
[0081] $r_{jl} = \dfrac{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)(z_{il} - \bar{z}_l)}{\sqrt{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)^2}\,\sqrt{\sum_{i=1}^{n} (z_{il} - \bar{z}_l)^2}}$;
[0082] where $r_{jl}$ is the correlation coefficient between the $j$-th indicator and the $l$-th indicator, and $\bar{z}_j$ and $\bar{z}_l$ are the means of the respective indicators. Specifically, the Pearson correlation coefficient characterizes the degree of linear correlation between two indicators, with a value range of $[-1, 1]$. The larger the absolute value, the stronger the correlation. Potential correlations between indicators can be discovered through the correlation feature matrix. For example, the correlation coefficient between fasting blood glucose and glycated hemoglobin is about 0.8, indicating that the two have a strong positive correlation.
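The indicator correlation feature matrix of step S4 can be illustrated as follows. This is a small stdlib-only sketch of the standard Pearson coefficient. The function names `pearson` and `correlation_matrix` are hypothetical, and returning 0.0 for a constant indicator (where the coefficient is undefined) is an assumption made only to keep the sketch total.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumed convention for a constant indicator
    return cov / math.sqrt(vx * vy)


def correlation_matrix(matrix):
    """m x m matrix of pairwise correlations between indicator columns of an
    n-samples x m-indicators matrix."""
    cols = list(zip(*matrix))
    return [[pearson(a, b) for b in cols] for a in cols]
```

Because the coefficient is unchanged by shifting and positive rescaling of either variable, the matrix is the same whether it is computed on raw or on z-score standardized indicator values.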
[0083] Furthermore, in step S5, the formula for calculating the weighted fusion is:
[0084] [formula missing];
[0085] where $F_{ij}$ is the fusion feature value of the $j$-th indicator of the $i$-th sample, and $\lambda$ is the associated feature weight coefficient, with a value range of [value range missing]. Specifically, the main function of the associated feature weight coefficient is to balance the contributions of the original indicator features and the associated features, and to prevent the associated features from having an excessive influence on the fusion result.
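The fusion formula of step S5 did not survive in this text, so the following Python sketch shows only one hypothetical way to combine the three ingredients the step names: standardized values, entropy weights, and the correlation matrix. The form used here, each indicator's weighted value plus a coefficient `lam` times the correlation-weighted contribution of the other indicators, is an assumption for illustration and should not be read as the patented formula. The name `fuse_features` and the default `lam=0.3` are likewise hypothetical.

```python
def fuse_features(z_row, weights, corr, lam=0.3):
    """Hypothetical fusion for one sample:
    F_j = w_j * z_j + lam * sum over l != j of (r_jl * w_l * z_l).

    z_row   : standardized indicator values of one sample
    weights : entropy weights, one per indicator
    corr    : m x m Pearson correlation matrix
    lam     : assumed coefficient limiting the influence of associated features
    """
    m = len(z_row)
    fused = []
    for j in range(m):
        own = weights[j] * z_row[j]
        associated = sum(
            corr[j][l] * weights[l] * z_row[l] for l in range(m) if l != j
        )
        fused.append(own + lam * associated)
    return fused
```

With `lam=0` the sketch reduces to plain entropy-weighted indicator values, and increasing `lam` lets correlated indicators contribute more, which is the balancing role the text attributes to the coefficient.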
[0086] In this embodiment, step S6 further includes the following step:
[0087] S61: Input the comprehensive feature vector into the isolation forest model to obtain the first anomaly score $S_1$, [value range missing];
[0088] S62: Input the comprehensive feature vector into the logistic regression model to obtain the second anomaly score $S_2$, [value range missing];
[0089] S63: Calculate the fusion anomaly score $S$,
[0090] [formula missing];
[0091] where $\alpha$ is the fusion weighting coefficient, with a value range of [value range missing];
[0092] when [condition missing], the result is determined to be normal;
[0093] when [condition missing], the result is determined to be a mild abnormality;
[0094] when [condition missing], the result is determined to be a moderate abnormality;
[0095] when [condition missing], the result is determined to be a severe abnormality. Specifically, the comprehensive feature vectors labeled with normal / abnormal tags are divided into a training set and a test set, with the training set accounting for 70% and the test set accounting for 30%. The isolation forest model and the logistic regression model are each trained on the training set. The isolation forest model is used to capture isolated points in the data, and the logistic regression model is used to optimize the recognition accuracy based on the classification labels.
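The data split, score fusion, and grading of step S6 can be sketched as follows. The fusion formula and the level thresholds are not legible in this text, so the sketch assumes a convex combination of the two scores with a coefficient `alpha` in [0, 1], both scores already scaled to [0, 1], and placeholder cut-offs of 0.3, 0.6, and 0.8 with each cut-off belonging to the higher level. None of these values come from the source. The two model scores are taken as inputs; in practice they might come from an isolation forest and a logistic regression trained on the 70% split.

```python
import random
from bisect import bisect_right

LEVELS = ("normal", "mild", "moderate", "severe")


def split_70_30(samples, seed=0):
    """Shuffle and split labeled samples into 70% training and 30% test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fuse_scores(s1, s2, alpha=0.5):
    """Assumed fusion: convex combination of the isolation forest score s1
    and the logistic regression score s2."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * s1 + (1.0 - alpha) * s2


def grade(score, thresholds=(0.3, 0.6, 0.8)):
    """Map a fused score to a level using placeholder thresholds
    (hypothetical values; the source thresholds are missing)."""
    return LEVELS[bisect_right(thresholds, score)]
```

For example, `grade(fuse_scores(0.9, 0.7, alpha=0.5))` evaluates a fused score of about 0.8 and returns one of the four level names. The thresholds and `alpha` would need to be tuned on the held-out test set.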
[0096] Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for identifying abnormal health management indicators, characterized in that it includes the following steps: S1: Collect multi-dimensional health indicator data of the target object; S2: Perform missing value imputation, outlier removal, and standardization on the collected health indicator data to obtain a standardized dataset; S3: Calculate the weight values of each health indicator based on the entropy weight method; S4: Calculate the Pearson correlation coefficient between any two health indicators and construct the indicator correlation feature matrix; S5: Perform weighted fusion of the standardized indicator data and the correlation feature matrix to generate a comprehensive feature vector; S6: Input the comprehensive feature vector into the model, output the abnormality score of the health indicator and determine the abnormality level.
2. The method for identifying abnormal health management indicators according to claim 1, characterized in that: In step S1, the health indicator data includes physiological indicators and biochemical indicators. The physiological indicators include heart rate, systolic blood pressure, diastolic blood pressure, body temperature, and blood oxygen saturation. The biochemical indicators include fasting blood glucose, glycated hemoglobin, total cholesterol, triglycerides, and uric acid.
3. The method for identifying abnormal health management indicators according to claim 2, characterized in that: In step S2, the formula for filling in missing values is: $\hat{x}_{ij} = \sum_{k=1}^{K} \omega_k x_{kj}$; $\omega_k = \dfrac{1/d_{ik}}{\sum_{l=1}^{K} 1/d_{il}}$; where $\hat{x}_{ij}$ is the fill value of the $j$-th indicator of the $i$-th sample, $K$ is the number of nearest neighbor samples, $\omega_k$ is the weight of the $k$-th nearest neighbor sample, $d_{ik}$ is the Euclidean distance between the $i$-th sample and the $k$-th nearest neighbor sample, and $x_{kj}$ is the original value of the $j$-th indicator of the $k$-th nearest neighbor sample.
4. The method for identifying abnormal health management indicators according to claim 3, characterized in that: In step S2, the formula for standardization is: $z_{ij} = \dfrac{x_{ij} - \mu_j}{\sigma_j}$; where $z_{ij}$ is the standardized value of the $j$-th indicator of the $i$-th sample, $\mu_j$ is the mean of the $j$-th indicator, and $\sigma_j$ is the standard deviation of the $j$-th indicator.
5. The method for identifying abnormal health management indicators according to claim 4, characterized in that: Step S3 further includes the following steps: S31: Construct a standardized decision matrix $Z = (z_{ij})_{n \times m}$, where $n$ is the sample size and $m$ is the number of indicators; S32: Calculate the proportion $p_{ij}$ of the $i$-th sample under the $j$-th indicator, $p_{ij} = \dfrac{z_{ij}}{\sum_{i=1}^{n} z_{ij}}$; S33: Calculate the entropy value $e_j$ of the $j$-th indicator, $e_j = -\dfrac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$; if $p_{ij} = 0$, then $p_{ij} \ln p_{ij} = 0$; S34: Calculate the coefficient of difference $g_j$ of the $j$-th indicator, $g_j = 1 - e_j$; S35: Calculate the weight $w_j$ of the $j$-th indicator, $w_j = \dfrac{g_j}{\sum_{j=1}^{m} g_j}$.
6. The method for identifying abnormal health management indicators according to claim 5, characterized in that: In step S4, the formula for calculating the Pearson correlation coefficient is as follows: $r_{jl} = \dfrac{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)(z_{il} - \bar{z}_l)}{\sqrt{\sum_{i=1}^{n} (z_{ij} - \bar{z}_j)^2}\,\sqrt{\sum_{i=1}^{n} (z_{il} - \bar{z}_l)^2}}$; where $r_{jl}$ is the correlation coefficient between the $j$-th indicator and the $l$-th indicator, and $\bar{z}_j$ and $\bar{z}_l$ are the means of the respective indicators.
7. The method for identifying abnormal health management indicators according to claim 6, characterized in that: In step S5, the calculation formula for weighted fusion is: [formula missing]; where $F_{ij}$ is the fusion feature value of the $j$-th indicator of the $i$-th sample, and $\lambda$ is the associated feature weight coefficient, with a value range of [value range missing].
8. The method for identifying abnormal health management indicators according to claim 7, characterized in that: Step S6 further includes the following steps: S61: Input the comprehensive feature vector into the isolation forest model to obtain the first anomaly score $S_1$, [value range missing]; S62: Input the comprehensive feature vector into the logistic regression model to obtain the second anomaly score $S_2$, [value range missing]; S63: Calculate the fusion anomaly score $S$, [formula missing]; where $\alpha$ is the fusion weighting coefficient, with a value range of [value range missing]; when [condition missing], the result is determined to be normal; when [condition missing], the result is determined to be a mild abnormality; when [condition missing], the result is determined to be a moderate abnormality; when [condition missing], the result is determined to be a severe abnormality.