Adaptive diagnosis method, device, equipment and medium for aquatic ecological problems
By employing an adaptive diagnostic method that combines the biological integrity index and a dynamic threshold mechanism, the problem of inaccurate identification of influencing factors in damaged aquatic ecosystem areas has been solved, thus improving the accuracy and reliability of the diagnosis.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA NAT ENVIRONMENTAL MONITORING CENT
- Filing Date
- 2026-04-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies fail to consider the differences in characteristics of areas with varying degrees of damage when identifying influencing factors in aquatic ecosystems, leading to inaccurate identification and a high rate of false positives, especially in areas with high heterogeneity and low sensitivity where there are missed detections.
An adaptive diagnostic approach was adopted, and a statistical analysis model matching the degree of damage was constructed by calculating the biological integrity index. Combining global statistics and local marginal contribution values, a dynamic threshold mechanism was used to screen influencing factors, which were divided into key, potential and insignificant factors.
It improves the accuracy and stability of influencing factor identification, reduces the false positive rate in highly heterogeneous areas and the false negative rate in low-sensitivity areas, and enhances the accuracy and reliability of water ecological problem diagnosis.
Figure CN122242973A_ABST
Abstract
Description
Technical Field
[0001] This application belongs to the field of environmental protection technology, and specifically relates to an adaptive diagnostic method, device, equipment and medium for water ecological problems.
Background Technology
[0002] Degradation of aquatic ecosystems is a core challenge in current watershed management. Its causes are complex and spatially heterogeneous. Accurately identifying and classifying the influencing factors of damaged areas can effectively guide the protection and restoration of the aquatic ecological environment.
[0003] In existing technologies, the characteristics of areas with different degrees of damage are not considered when identifying the influencing factors of damaged areas. For example, severely damaged areas exhibit nonlinear dose-response effects that linear models cannot capture, while complex models applied to lightly damaged areas are susceptible to noise interference. Furthermore, the identification of key factors relies on fixed thresholds, which ignores the dynamic characteristics of data quality and damage intensity in different water areas. This increases the misjudgment rate in highly heterogeneous areas and causes weak signals to be missed in low-sensitivity areas. As a result, the influencing factors and key factors of damaged areas are identified inaccurately.
Summary of the Invention
[0004] Based on the above analysis, the embodiments of the present invention aim to provide an adaptive diagnostic method, apparatus, equipment, and medium for water ecological problems, in order to solve the technical problem of inaccurate identification of influencing factors and key factors of damaged areas in the prior art.
[0005] The objective of this invention is achieved as follows. A first aspect of the present invention provides an adaptive diagnostic method for water ecological problems, comprising:
collecting multi-source water environment data of the water area to be diagnosed, the multi-source water environment data including aquatic biological data and environmental factor data;
calculating the biological integrity index based on the aquatic organism data;
classifying the degree of damage to the water area to be diagnosed according to the biological integrity index;
constructing, based on the degree of damage, a statistical analysis model matching the damage characteristics, and using the statistical analysis model to screen the influencing factors of each damaged area;
quantitatively evaluating the contribution of the influencing factors based on global statistics and local marginals, respectively, to obtain a first contribution value and a second contribution value, and fusing the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factors; and
constructing an adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data, comparing the comprehensive contribution value with the dynamic threshold, and dividing the influencing factors into key influencing factors, potential influencing factors, and insignificant influencing factors.
[0006] Furthermore, the step of calculating the biological integrity index based on the aquatic organism data includes: selecting evaluation indicators from a pre-constructed aquatic organism evaluation indicator library; determining the expected value of the evaluation indicators based on historical monitoring data; comparing the expected value with the measured value of each evaluation indicator in the aquatic organism data, and uniformly quantifying each evaluation indicator into a standard score; summarizing and calculating the standard scores of each evaluation indicator to obtain the biological integrity index of each sampling point.
[0007] Furthermore, the step of classifying the degree of damage to the water area to be diagnosed based on the biological integrity index includes: determining a threshold for distinguishing between healthy and damaged areas based on the distribution of the biological integrity index at each sampling point; dividing the area to be diagnosed into multiple sub-regions according to the biological integrity index; marking sub-regions with an average biological integrity index lower than the threshold as damaged areas; and classifying the damaged areas into damage levels of severely damaged, moderately damaged, and slightly damaged based on the biological integrity index of the damaged areas.
[0008] Furthermore, the step of constructing a statistical analysis model matching the damage characteristics based on the degree of damage, and using the statistical analysis model to screen the influencing factors of each damaged area, includes: constructing a statistical analysis model using the biological integrity index as the response variable and the environmental factor data as the predictor variable; for severely damaged areas, the statistical analysis model is a generalized additive model; for moderately damaged areas, the statistical analysis model is a mixed-effects model; and for slightly damaged areas, the statistical analysis model is a generalized linear model; parameter fitting is performed on each of the statistical analysis models to obtain the regression coefficients and significance test probabilities of each environmental factor; and screening rules are set based on the regression coefficients and significance test probabilities, with environmental factors that meet the screening rules being selected as influencing factors.
[0009] Further, the method of quantifying and evaluating the contribution of the influencing factors based on global statistics and local marginals to obtain a first contribution value and a second contribution value includes: constructing a random forest model using the influencing factors as input variables and the biological integrity index or damage level as output variables; normalizing the global importance of each influencing factor based on the average decrease in the Gini coefficient caused by node splitting during the training of the random forest model, which is used as the first contribution value; constructing a gradient boosting model based on the same input and output variables as the random forest model, and using the TreeSHAP algorithm to calculate the marginal contribution value of each influencing factor to the prediction result during the training of the gradient boosting model; and averaging the absolute values of the marginal contribution values of each influencing factor to obtain the local importance of each influencing factor, which is used as the second contribution value.
[0010] Furthermore, the step of fusing the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factors includes: normalizing the first contribution value and the second contribution value respectively; and weighting and summing the normalized first contribution value and second contribution value according to preset weights to obtain the comprehensive contribution value of each influencing factor; wherein the preset weights are optimized by a genetic algorithm using historical data.
[0011] Further, the step of constructing an adaptively adjusted dynamic threshold based on the degree of damage and the volatility of the environmental factor data includes: mapping the degree of damage to a numerical damage index; calculating a coefficient of variation reflecting the overall volatility based on the environmental factor data; and constructing a dynamic threshold as a function of the damage index and the coefficient of variation, in which DI denotes the damage index and CV denotes the coefficient of variation, and whose parameters are a baseline threshold, an adjustment coefficient for the damage index, an adjustment coefficient for the coefficient of variation, and a data quality benchmark.
[0012] A second aspect of the present invention provides an adaptive diagnostic device for water ecological problems, comprising: The data collection module is used to collect multi-source water environment data of the water area to be diagnosed; the multi-source water environment data includes aquatic biological data and environmental factor data; The biological integrity assessment module is used to calculate the biological integrity index based on the aquatic organism data. The damage severity grading module is used to grade the degree of damage to the water area to be diagnosed based on the biological integrity index. The impact factor screening module is used to construct a statistical analysis model that matches the damage characteristics based on the degree of damage, and to screen the impact factors of each damaged area using the statistical analysis model. The contribution value evaluation module is used to quantify the contribution of the influencing factor based on global statistics and local margins respectively, to obtain a first contribution value and a second contribution value, and to fuse the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factor. The impact factor classification module is used to construct an adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data, compare the comprehensive contribution value with the dynamic threshold, and classify the impact factors into key impact factors, potential impact factors, and insignificant impact factors.
[0013] A third aspect of the present invention provides an electronic device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, it implements the adaptive diagnosis method for water ecological problems described in any embodiment.
[0014] A fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the adaptive diagnosis method for water ecological problems described in any embodiment.
[0015] Compared with the prior art, the present invention can achieve at least one of the following beneficial effects: The adaptive diagnostic method for water ecological problems provided by this invention employs a model adaptive selection mechanism based on the degree of damage, which can match the optimal statistical analysis model for different damaged areas, thereby improving the accuracy of influencing factor identification. By combining global statistical importance with local marginal contribution for fusion analysis, it can simultaneously take into account the overall trend and local differences, improving the stability of key factor identification. By constructing a dynamic threshold mechanism that considers the degree of damage and data volatility, it can effectively reduce the false positive rate in highly heterogeneous areas and reduce missed detections in low-sensitivity areas, thereby improving the overall accuracy and reliability of water ecological problem diagnosis.
Attached Figure Description
[0016] To more clearly illustrate the technical solutions in the embodiments of this specification or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in the embodiments of this specification. For those skilled in the art, other drawings can be obtained based on these drawings.
[0017] Figure 1 is a flowchart of the adaptive diagnosis method for water ecological problems provided in Embodiment 1 of the present invention; Figure 2 is a schematic diagram of the adaptive diagnostic device for water ecological problems provided in Embodiment 2 of the present invention; Figure 3 is a schematic diagram of the electronic device architecture provided in Embodiment 3 of the present invention.
Detailed Implementation
[0018] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. It should be noted that, unless otherwise specified, the implementation methods and features in the implementation methods in this disclosure can be combined, separated, interchanged, and / or rearranged. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0019] Embodiment 1
A specific embodiment of the present invention, as shown in Figure 1, discloses an adaptive diagnostic method for water ecological problems, including the following steps:
S1. Collect multi-source water environment data of the water area to be diagnosed.
[0020] In this embodiment, the multi-source water environment data includes aquatic biological data and environmental factor data.
[0021] For example, aquatic biological data includes the species composition and abundance of plankton, benthic animals, fish or algae; environmental factor data includes water quality physicochemical data, aquatic habitat data and hydrological data; water quality physicochemical data includes conventional and toxic pollutants such as TN, TP, COD, heavy metals, etc.; aquatic habitat data includes riparian vegetation coverage, substrate type and water connectivity; and hydrological data includes flow rate, flow velocity, water temperature and water depth.
[0022] S2. Calculate the biological integrity index based on the aquatic organism data.
[0023] In this embodiment, step S2 includes the following steps: evaluation indicators are selected from a pre-constructed library of aquatic organism evaluation indicators; the expected values of the evaluation indicators are determined based on historical monitoring data; the expected value is compared with the measured value of each evaluation indicator in the aquatic organism data, and each evaluation indicator is uniformly quantified into a standard score; the standard scores of the evaluation indicators are aggregated to obtain the biological integrity index of each sampling point.
[0024] Specifically, when selecting evaluation indicators, priority is given to indicators that are sensitive to environmental gradient responses and have low correlation redundancy, and redundant indicators are eliminated through correlation analysis or stepwise regression methods. When determining expected values, reference area data with less human interference or better ecological conditions are selected as benchmarks, and the expected values of each evaluation indicator are determined using the quantile method or the mean method. In the process of calculating standard scores, the scoring functions of positive or negative indicators are used for normalization according to the type of evaluation indicator to ensure the comparability of indicators with different dimensions.
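The scoring in step S2 can be illustrated with a minimal sketch. The patent text does not give the scoring functions, so the ratio-based scores, the clipping to [0, 1], the averaging of scores, and all indicator names below are assumptions made only for illustration.

```python
from typing import Dict


def standard_score(measured: float, expected: float, positive: bool) -> float:
    """Scale one indicator to [0, 1] relative to its expected (reference) value.

    Assumed scoring functions (not taken from the source):
    positive indicator -> measured / expected,
    negative indicator -> expected / measured.
    """
    if expected <= 0:
        raise ValueError("expected value must be positive")
    if positive:
        ratio = measured / expected
    else:
        # A non-positive measurement of a negative indicator gets the full score.
        ratio = expected / measured if measured > 0 else 1.0
    return max(0.0, min(1.0, ratio))


def biological_integrity_index(
    measured: Dict[str, float],
    expected: Dict[str, float],
    positive: Dict[str, bool],
) -> float:
    """Aggregate the standard scores of one sampling point (assumed: plain mean)."""
    scores = [
        standard_score(measured[name], expected[name], positive[name])
        for name in expected
    ]
    return sum(scores) / len(scores)


# Hypothetical sampling point with one positive and one negative indicator.
site = {"taxa_richness": 12.0, "tolerant_taxa_pct": 40.0}
reference = {"taxa_richness": 24.0, "tolerant_taxa_pct": 20.0}
direction = {"taxa_richness": True, "tolerant_taxa_pct": False}
ibi = biological_integrity_index(site, reference, direction)
```

Both indicators are scored on the same dimensionless scale, which is what makes indicators with different units comparable before aggregation.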
[0025] S3. Classify the degree of damage to the water area to be diagnosed according to the biological integrity index.
[0026] In this embodiment, step S3 includes the following steps: based on the distribution of the biological integrity index at each sampling point, a threshold for distinguishing between healthy and damaged areas is determined; the area to be diagnosed is divided into multiple sub-regions based on the biological integrity index; sub-regions whose average biological integrity index is lower than the threshold are marked as damaged areas; based on the biological integrity index of the damaged areas, the damaged areas are classified into three damage levels: severely damaged, moderately damaged, and slightly damaged.
[0027] Specifically, the distribution of the biological integrity index at each sampling point within the reference area is first analyzed statistically, and the 25th percentile is used as the threshold for distinguishing healthy from damaged conditions. The damaged samples are then clustered or gridded according to their spatial location to form multiple sub-regions, and sub-regions with an average biological integrity index below the threshold are marked as damaged areas. The biological integrity index values I of the damaged areas are then sorted, and their maximum value, minimum value, and interval span are calculated. Based on these quantities, the damaged areas are divided into three level ranges: severely damaged, moderately damaged, and slightly damaged.
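A minimal sketch of step S3 follows. The 25th-percentile threshold comes from the description; the cut points between damage levels are not legible in this text, so dividing the span between the minimum and maximum index into equal thirds is an assumption, as are the function and sub-region names.

```python
from statistics import mean, quantiles
from typing import Dict, List


def damage_threshold(reference_ibi: List[float]) -> float:
    """25th percentile of the reference-area index values."""
    return quantiles(reference_ibi, n=4, method="inclusive")[0]


def damaged_subregions(
    subregion_ibi: Dict[str, List[float]], threshold: float
) -> Dict[str, float]:
    """Return the mean index of every sub-region whose mean is below the threshold."""
    means = {name: mean(values) for name, values in subregion_ibi.items()}
    return {name: m for name, m in means.items() if m < threshold}


def damage_level(index: float, i_min: float, i_max: float) -> str:
    """Assumed rule: split the span [i_min, i_max] into three equal ranges."""
    span = i_max - i_min
    if index < i_min + span / 3:
        return "severe"
    if index < i_min + 2 * span / 3:
        return "moderate"
    return "slight"


# Hypothetical data: reference-area values and four sub-regions.
reference = [0.6, 0.7, 0.8, 0.9, 1.0]
subregions = {
    "A": [0.2, 0.3],
    "B": [0.5, 0.6],
    "C": [0.8, 0.9],
    "D": [0.35, 0.45],
}
threshold = damage_threshold(reference)
damaged = damaged_subregions(subregions, threshold)
i_min, i_max = min(damaged.values()), max(damaged.values())
levels = {name: damage_level(m, i_min, i_max) for name, m in damaged.items()}
```

Sub-region C stays above the threshold and is treated as healthy; only the remaining sub-regions are graded.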
[0028] In some embodiments, a special event triggering mechanism is also included, which marks the corresponding area as severely damaged when an extreme hydrological event or a sudden pollution event is detected, such as a mass biological mortality event.
[0029] S4. Based on the degree of damage, construct a statistical analysis model that matches the damage characteristics, and use the statistical analysis model to screen the influencing factors of each damaged area.
[0030] In this embodiment, step S4 includes the following steps: a statistical analysis model is constructed using the biological integrity index as the response variable and the environmental factor data as the predictor variables; for severely damaged areas, the statistical analysis model is a generalized additive model; for moderately damaged areas, the statistical analysis model is a mixed-effects model; and for slightly damaged areas, the statistical analysis model is a generalized linear model; the parameters of each statistical analysis model are fitted to obtain the regression coefficient and significance test probability of each environmental factor; screening rules are set based on the regression coefficients and significance test probabilities, and environmental factors that meet the screening rules are taken as influencing factors.
[0031] Specifically, environmental factor data from each sampling point are used as input variables, and the corresponding biological integrity index is used as the output variable to construct a training dataset. Before model construction, missing value imputation, outlier removal, and standardization are performed on the environmental factor data. When the damage level is severe, complex nonlinear effects such as threshold mutations exist within the region; therefore, a generalized additive model (GAM) is used to fit the nonlinear relationship. When the damage level is moderate, spatial heterogeneity needs to be considered; therefore, a generalized linear mixed-effects model (GLMM) is used to introduce random effects. When the damage level is mild, a generalized linear model (GLM), which is robust to noise, is used for fitting. During model training, model parameters are determined through cross-validation, and the regression coefficient and significance level of each environmental factor are output. Environmental factors that have a significant impact on the biological integrity index are selected based on the absolute value of the regression coefficient and the significance level.
[0032] For example, separate screening rules can be set for severely damaged areas, moderately damaged areas, and slightly damaged areas, each expressed as a condition on the regression coefficient of the environmental factor and on p, the significance test probability.
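The damage-dependent model choice and screening of step S4 can be sketched as a lookup plus a filter. The numeric cut-offs in the rules are not legible in this text, so every threshold below is a placeholder, and the fitted coefficients and p-values are assumed to come from whichever GAM, mixed-effects, or GLM implementation is used.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ScreeningRule:
    model: str           # model family matched to the damage level
    min_abs_coef: float  # placeholder cut-off on |regression coefficient|
    max_p: float         # placeholder cut-off on the significance probability


# Model families follow the description; all numeric values are hypothetical.
RULES: Dict[str, ScreeningRule] = {
    "severe": ScreeningRule("generalized additive model", 0.05, 0.10),
    "moderate": ScreeningRule("mixed-effects model", 0.10, 0.05),
    "slight": ScreeningRule("generalized linear model", 0.20, 0.01),
}


def screen_factors(level: str, fits: Dict[str, Tuple[float, float]]) -> List[str]:
    """Keep factors whose (coefficient, p-value) pair satisfies the level's rule."""
    rule = RULES[level]
    return [
        name
        for name, (coef, p) in fits.items()
        if abs(coef) >= rule.min_abs_coef and p <= rule.max_p
    ]


# Hypothetical fit results: factor -> (regression coefficient, p-value).
fits = {"TN": (-0.42, 0.003), "TP": (-0.08, 0.04), "flow_velocity": (0.30, 0.20)}
selected_severe = screen_factors("severe", fits)
selected_moderate = screen_factors("moderate", fits)
```

Keeping the rules in a table makes the per-level criteria easy to adjust without touching the screening logic.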
[0033] S5. Quantitatively evaluate the contribution of the influencing factor based on global statistics and local marginals respectively to obtain a first contribution value and a second contribution value. Merge the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factor.
[0034] In this embodiment, step S5 includes the following steps: a random forest model is constructed using the influencing factors as input variables and the biological integrity index or damage level as the output variable; based on the average decrease in the Gini coefficient caused by node splitting during the training of the random forest model, the global importance of each influencing factor is obtained after normalization and used as the first contribution value; a gradient boosting model is constructed based on the same input and output variables as the random forest model, and the TreeSHAP algorithm is used to calculate the marginal contribution value of each influencing factor to the prediction result of the gradient boosting model; the absolute values of the marginal contribution values of each influencing factor are averaged to obtain the local importance of each influencing factor, which is used as the second contribution value; the first contribution value and the second contribution value are normalized respectively; the normalized first and second contribution values are weighted and summed according to preset weights to obtain the comprehensive contribution value of each influencing factor, wherein the preset weights are optimized by a genetic algorithm using historical data.
[0035] Specifically, the selected influencing factors are used as feature variables, and the biological integrity index or damage level is used as the target variable to construct a random forest model. During model training, multiple decision trees are generated through Bootstrap resampling, and the model error is evaluated based on out-of-bag data. For each decision tree, the Gini index or mean squared error reduction caused by each influencing factor when splitting at a node is calculated, and the results of all decision trees are summed and averaged to obtain the global importance of each influencing factor, i.e., the first contribution value. At the same time, a gradient boosting tree model is constructed based on the same training data, and the prediction result of each sample is decomposed using the TreeSHAP algorithm to obtain the marginal contribution value of each influencing factor to the prediction result. The absolute value of the SHAP value of each influencing factor in all samples is taken and the mean or median is calculated to obtain the second contribution value reflecting the degree of local influence. Finally, the first and second contribution values are normalized and weighted and fused according to preset weights to obtain the comprehensive contribution value.
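The fusion at the end of step S5 can be sketched as follows, assuming the random forest importances and the per-sample SHAP values have already been computed elsewhere. The sum-to-one normalization and the weight of 0.5 are assumptions; the description only states that the values are normalized and that the weights are preset.

```python
from typing import Dict, List


def normalize(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative importances so they sum to 1 (assumed normalization)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in values.items()}


def mean_abs_shap(factors: List[str], shap_rows: List[List[float]]) -> Dict[str, float]:
    """Second contribution value: mean |SHAP| of each factor over all samples."""
    n = len(shap_rows)
    return {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(factors)
    }


def fuse(
    first: Dict[str, float], second: Dict[str, float], weight_first: float = 0.5
) -> Dict[str, float]:
    """Weighted sum of the two normalized contribution values."""
    a, b = normalize(first), normalize(second)
    return {name: weight_first * a[name] + (1 - weight_first) * b[name] for name in a}


# Hypothetical inputs: impurity-based importances and a 2-sample SHAP matrix.
factors = ["TN", "TP", "COD"]
global_importance = {"TN": 0.6, "TP": 0.3, "COD": 0.1}
shap_values = [[0.2, -0.1, 0.0], [-0.4, 0.1, 0.0]]
local_importance = mean_abs_shap(factors, shap_values)
combined = fuse(global_importance, local_importance, weight_first=0.5)
```

Taking absolute values before averaging prevents positive and negative SHAP contributions of the same factor from cancelling out.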
[0036] S6. Based on the degree of damage and the fluctuation of the environmental factor data, construct an adaptively adjusted dynamic threshold, compare the comprehensive contribution value with the dynamic threshold, and classify the influencing factors into key influencing factors, potential influencing factors, and insignificant influencing factors.
[0037] In this embodiment, step S6 includes the following steps: the degree of damage is mapped to a numerical damage index; the coefficient of variation, reflecting the overall degree of fluctuation, is calculated based on the environmental factor data; a dynamic threshold is constructed as a function of the damage index and the coefficient of variation, in which DI denotes the damage index and CV denotes the coefficient of variation, and whose parameters are a baseline threshold, an adjustment coefficient for the damage index, an adjustment coefficient for the coefficient of variation, and a data quality benchmark; based on the ratio of the comprehensive contribution value to the dynamic threshold, the influencing factors are divided into key influencing factors, potential influencing factors, and insignificant influencing factors.
[0038] Specifically, the values of DI follow the principle of equal-interval mapping and increase monotonically with the degree of damage. The value range is [0.3, 1.0]; the larger the value, the more severe the damage to the system. For example, severe damage is assigned a value of 1, moderate damage a value of 0.6, and mild damage a value of 0.3. The coefficient of variation CV characterizes the overall fluctuation of all influencing factors across the observed values at each sampling point and is calculated as $CV = \sigma / \mu$, where $\mu$ and $\sigma$ represent the mean and standard deviation of the observed values, respectively; the larger CV is, the stronger the overall volatility of the data. The baseline threshold is the intercept term of the threshold function and is determined through grid search optimization (search interval [0.1, 0.5]), ensuring that the threshold stays within a reasonable range. The data quality benchmark is a predetermined constant (default value 1.0) set according to data quality indicators such as data integrity, outlier ratio, and sampling specification compliance. When the data quality is high (e.g., no missing values, no outliers), the fluctuation range of CV is relatively controllable, and introducing the data quality benchmark keeps the corresponding term positive in most cases, so that CV adjusts the threshold in the expected direction. The two adjustment coefficients of the dynamic threshold function control the influence of the damage index and the coefficient of variation on the threshold; both are determined through grid search optimization, with search intervals of [0.2, 0.6] and [0.1, 0.4], respectively. The dynamic threshold adapts to the degree of damage and to data fluctuation. The higher DI is, the more severe the damage and the lower the threshold; that is, in severely damaged areas the requirement on factor contribution is appropriately relaxed to avoid missing potential key influencing factors because of an excessively high threshold. The higher CV is, the greater the data fluctuation and the lower the threshold; that is, when the data quality is poor or the spatial heterogeneity is high, the threshold is actively lowered to prevent the true signal from being obscured by data noise.
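Step S6 can be sketched as below. The exact threshold formula is not legible in this text, so the multiplicative form used here is purely an assumption chosen to reproduce the stated behaviour (the threshold falls as DI or CV rises). The names `t0`, `alpha`, `beta`, `q`, the floor value, and the ratio cut-offs of 1.0 and 0.5 are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def coefficient_of_variation(values: List[float]) -> float:
    """CV = standard deviation / mean of the observed values."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("mean must be non-zero")
    return pstdev(values) / abs(mu)


def dynamic_threshold(
    di: float,
    cv: float,
    t0: float = 0.3,     # baseline threshold, search interval [0.1, 0.5]
    alpha: float = 0.4,  # adjustment coefficient for DI, interval [0.2, 0.6]
    beta: float = 0.2,   # adjustment coefficient for CV, interval [0.1, 0.4]
    q: float = 1.0,      # data quality benchmark, default 1.0
    floor: float = 0.01, # hypothetical lower bound keeping the threshold positive
) -> float:
    """Assumed form: t0 * (1 - alpha * DI) * (1 - beta * CV / q), clipped at floor."""
    raw = t0 * (1 - alpha * di) * (1 - beta * cv / q)
    return max(floor, raw)


def classify_factor(
    contribution: float,
    threshold: float,
    key_ratio: float = 1.0,        # hypothetical cut-off for key factors
    potential_ratio: float = 0.5,  # hypothetical cut-off for potential factors
) -> str:
    """Classify by the ratio of comprehensive contribution to dynamic threshold."""
    ratio = contribution / threshold
    if ratio >= key_ratio:
        return "key"
    if ratio >= potential_ratio:
        return "potential"
    return "insignificant"


cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
severe_threshold = dynamic_threshold(di=1.0, cv=cv)
slight_threshold = dynamic_threshold(di=0.3, cv=cv)
label = classify_factor(0.2, severe_threshold)
```

With identical data volatility, the severely damaged area receives the lower threshold, so a factor with a given contribution is more likely to be retained there.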
[0039] Compared with existing technologies, the adaptive diagnostic method for water ecological problems provided in this embodiment improves the accuracy of influencing factor identification by employing a model adaptive selection mechanism based on the degree of damage, which can match the optimal statistical analysis model for different damaged areas; by combining global statistical importance and local marginal contribution for fusion analysis, it can simultaneously take into account the overall trend and local differences, thus improving the stability of key factor identification; and by constructing a dynamic threshold mechanism that considers the degree of damage and data volatility, it can effectively reduce the misjudgment rate of highly heterogeneous areas and reduce the missed detection of low-sensitivity areas, thereby improving the overall accuracy and reliability of water ecological problem diagnosis.
[0040] In some embodiments, a rationality verification step is also included. Specifically, a prior knowledge base integrates the USEPA EC50 database, the Species Sensitivity Distribution (SSD) model, and localized tolerance threshold experimental data, and prior knowledge such as EC50 values and species tolerance thresholds is stored in a structured manner. An automated verification module is then applied: if a factor is identified as a key influencing factor on the basis of its comprehensive contribution value but significantly conflicts with known ecological mechanisms (|comprehensive contribution value - normalized value of prior knowledge| > 0.3), a secondary verification is initiated. If the conflict originates from data anomalies (such as sampling errors or extreme events), the key influencing factor determination is maintained, but the output result is marked "needs to be reviewed in conjunction with the on-site situation". If matching against the preset expert rule base confirms that the conflict is unreasonable at the mechanism level (for example, a factor is determined to be positively correlated while ecological prior knowledge clearly indicates a negative correlation), the factor is downgraded from a key influencing factor to a potential influencing factor, and the reason for the downgrade is explained in the diagnostic report. The preset expert rule base is constructed from literature surveys and the experience of domain experts, and stores common ecological mechanism relationships (such as the positive correlation between TP and chlorophyll a, or the positive correlation between dissolved oxygen and fish diversity) in the form of rules so that the verification process can be automated.
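The secondary verification can be sketched as a small decision function. The 0.3 conflict tolerance and the two outcomes come from the description; the order in which the two causes are checked, the handling of a conflict with neither cause, and all names are assumptions.

```python
from typing import Optional, Tuple

REVIEW_NOTE = "needs to be reviewed in conjunction with the on-site situation"


def verify_key_factor(
    contribution: float,
    prior_normalized: float,
    data_anomaly: bool,
    mechanism_conflict: bool,
    tolerance: float = 0.3,
) -> Tuple[str, Optional[str]]:
    """Return (final class, note) for a factor first classified as key."""
    if abs(contribution - prior_normalized) <= tolerance:
        return "key", None  # no significant conflict with prior knowledge
    if data_anomaly:
        # Conflict traced to sampling errors or extreme events: keep, but flag.
        return "key", REVIEW_NOTE
    if mechanism_conflict:
        # Expert rule base confirms the conflict: downgrade and record why.
        return "potential", "downgraded: conflicts with expert rule base"
    # Assumed fallback when neither cause is established: keep and flag.
    return "key", REVIEW_NOTE


consistent = verify_key_factor(0.8, 0.6, data_anomaly=False, mechanism_conflict=False)
flagged = verify_key_factor(0.8, 0.3, data_anomaly=True, mechanism_conflict=False)
downgraded = verify_key_factor(0.8, 0.3, data_anomaly=False, mechanism_conflict=True)
```

Returning the note alongside the class lets the diagnostic report explain why a factor was flagged or downgraded.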
[0041] Embodiment 2
This embodiment provides an adaptive diagnostic device for water ecological problems which, as shown in Figure 2, includes: a data collection module, used to collect multi-source water environment data of the water area to be diagnosed, the multi-source water environment data including aquatic biological data and environmental factor data; a biological integrity assessment module, used to calculate the biological integrity index based on the aquatic organism data; a damage severity grading module, used to grade the degree of damage to the water area to be diagnosed based on the biological integrity index; an impact factor screening module, used to construct a statistical analysis model that matches the damage characteristics based on the degree of damage, and to screen the impact factors of each damaged area using the statistical analysis model; a contribution value evaluation module, used to quantify the contribution of the influencing factors based on global statistics and local marginals respectively, to obtain a first contribution value and a second contribution value, and to fuse the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factors; and an impact factor classification module, used to construct an adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data, compare the comprehensive contribution value with the dynamic threshold, and classify the impact factors into key impact factors, potential impact factors, and insignificant impact factors.
[0042] Example 3. This embodiment provides an electronic device which, as shown in Figure 3, includes a memory and a processor. The memory stores a computer program which, when executed by the processor, implements the adaptive diagnostic method for water ecological problems described in any of the above embodiments.
[0043] Example 4. This embodiment provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the adaptive diagnostic method for water ecological problems described in any of the above embodiments.
[0044] Computer-readable storage media include both permanent and non-permanent, removable and non-removable media that can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0045] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.
[0046] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented in hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0047] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. An adaptive diagnostic method for water ecological problems, characterized in that it comprises: collecting multi-source water environment data of the water area to be diagnosed, the multi-source water environment data including aquatic organism data and environmental factor data; calculating a biological integrity index based on the aquatic organism data; grading the degree of damage of the water area to be diagnosed according to the biological integrity index; constructing, based on the degree of damage, a statistical analysis model matching the damage characteristics, and screening the influencing factors of each damaged area using the statistical analysis model; quantitatively evaluating the contribution of each influencing factor based on global statistics and on local marginal contributions, respectively, to obtain a first contribution value and a second contribution value, and fusing the first contribution value and the second contribution value to obtain a comprehensive contribution value of the influencing factor; and constructing an adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data, comparing the comprehensive contribution value with the dynamic threshold, and classifying the influencing factors into key influencing factors, potential influencing factors, and insignificant influencing factors.
2. The adaptive diagnostic method for water ecological problems according to claim 1, characterized in that calculating the biological integrity index based on the aquatic organism data includes: selecting evaluation indicators from a pre-constructed database of aquatic organism evaluation indicators; determining the expected value of each evaluation indicator based on historical monitoring data; uniformly quantifying each evaluation indicator into a standard score by comparing its expected value with its measured value in the aquatic organism data; and aggregating the standard scores of the evaluation indicators to obtain the biological integrity index of each sampling point.
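By way of non-limiting illustration, the scoring and aggregation steps of claim 2 can be sketched as follows. The claim does not fix a scoring rule, so this sketch assumes the commonly used ratio method (measured value divided by expected value, clipped to the range 0 to 1) for indicators that decrease as damage increases, and assumes aggregation by summation. The indicator names and all numbers are hypothetical.

```python
def standard_score(measured, expected):
    """Ratio score in [0, 1]; assumes the indicator declines with damage."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return max(0.0, min(measured / expected, 1.0))


def biological_integrity_index(measured, expected):
    """Sum the standard scores of all evaluation indicators at one point.

    measured, expected -- dicts keyed by indicator name (hypothetical names).
    """
    return sum(standard_score(measured[name], expected[name])
               for name in expected)


# Hypothetical expected values (from historical monitoring) and measurements.
expected = {"taxa_richness": 20.0, "ept_taxa": 10.0}
measured = {"taxa_richness": 10.0, "ept_taxa": 12.0}

print(biological_integrity_index(measured, expected))
```

Indicators that increase with damage would need an inverted score; that case is left out of the sketch because the claim does not describe it.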
3. The adaptive diagnostic method for water ecological problems according to claim 1, characterized in that grading the degree of damage of the water area to be diagnosed based on the biological integrity index includes: determining, based on the distribution of the biological integrity index across the sampling points, a determination threshold for distinguishing healthy from damaged conditions; dividing the water area to be diagnosed into multiple sub-regions based on the biological integrity index; marking sub-regions whose average biological integrity index is lower than the determination threshold as damaged areas; and classifying each damaged area, based on its biological integrity index, into one of three damage levels: severely damaged, moderately damaged, and lightly damaged.
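By way of non-limiting illustration, the marking and grading steps of claim 3 can be sketched as follows. The determination threshold is taken as a given input, and the split of the range below it into three equal bands is an assumption of this sketch, since the claim does not state the band boundaries. The sub-region names and values are hypothetical.

```python
def grade_subregions(bii_by_subregion, threshold):
    """Grade each sub-region from the mean BII of its sampling points.

    bii_by_subregion -- dict: sub-region name -> list of BII values
    threshold        -- determination threshold separating healthy from damaged
    """
    grades = {}
    for name, values in bii_by_subregion.items():
        mean_bii = sum(values) / len(values)
        if mean_bii >= threshold:
            grades[name] = "not damaged"
        elif mean_bii >= threshold * 2 / 3:   # assumed upper band
            grades[name] = "lightly damaged"
        elif mean_bii >= threshold / 3:       # assumed middle band
            grades[name] = "moderately damaged"
        else:                                 # assumed lower band
            grades[name] = "severely damaged"
    return grades


# Hypothetical sub-regions with two sampling points each.
subregions = {"A": [3.5, 3.1], "B": [2.4, 2.6], "C": [1.2, 1.6], "D": [0.4, 0.6]}
grades = grade_subregions(subregions, threshold=3.0)
print(grades)
```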
4. The adaptive diagnostic method for water ecological problems according to claim 1, characterized in that constructing, based on the degree of damage, the statistical analysis model matching the damage characteristics and screening the influencing factors of each damaged area using the statistical analysis model includes: constructing the statistical analysis model with the biological integrity index as the response variable and the environmental factor data as the predictor variables, wherein the statistical analysis model is a generalized additive model for severely damaged areas, a mixed-effects model for moderately damaged areas, and a generalized linear model for lightly damaged areas; fitting the parameters of each statistical analysis model to obtain the regression coefficient and significance test probability of each environmental factor; and setting screening rules based on the regression coefficients and significance test probabilities, and taking the environmental factors that satisfy the screening rules as the influencing factors.
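By way of non-limiting illustration, the model selection and the screening rule of claim 4 can be sketched as follows. The mapping from damage level to model family follows the claim. The cutoffs `p_max=0.05` and `min_abs_coef=0.0` are assumptions of this sketch, because the claim leaves the screening rule open, and the fitted values shown are hypothetical rather than results.

```python
# Model family per damage level, as stated in the claim.
MODEL_BY_DAMAGE_LEVEL = {
    "severely damaged": "generalized additive model",
    "moderately damaged": "mixed-effects model",
    "lightly damaged": "generalized linear model",
}


def screen_factors(fit_results, p_max=0.05, min_abs_coef=0.0):
    """Keep factors that pass an assumed coefficient and p-value rule.

    fit_results -- dict: factor name -> (regression coefficient, p-value),
                   as produced by whichever model was fitted for the area.
    """
    return [name for name, (coef, p_value) in fit_results.items()
            if p_value < p_max and abs(coef) > min_abs_coef]


# Hypothetical fitted values for one lightly damaged area.
fit_results = {"TP": (-0.42, 0.003), "DO": (0.31, 0.020), "pH": (0.05, 0.400)}

print(MODEL_BY_DAMAGE_LEVEL["lightly damaged"])
print(screen_factors(fit_results))
```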
5. The adaptive diagnostic method for water ecological problems according to claim 2, characterized in that quantitatively evaluating the contribution of each influencing factor based on global statistics and on local marginal contributions, respectively, to obtain the first contribution value and the second contribution value includes: constructing a random forest model with the influencing factors as input variables and the biological integrity index or the damage level as the output variable; obtaining, from the mean decrease in Gini impurity caused by node splitting during training of the random forest model, the normalized global importance of each influencing factor, which is used as the first contribution value; constructing a gradient boosting model with the same input and output variables as the random forest model; calculating, with the TreeSHAP algorithm, the marginal contribution of each influencing factor to the prediction result of the gradient boosting model; and averaging the absolute values of the marginal contributions of each influencing factor to obtain the local importance of each influencing factor, which is used as the second contribution value.
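By way of non-limiting illustration, the two aggregation steps of claim 5 can be sketched as follows. The sketch does not train any model: it assumes that the raw Gini decreases have already been read from a fitted random forest and that the per-sample marginal contributions have already been produced by a TreeSHAP implementation. The function names and all numbers are hypothetical.

```python
def global_importance(raw_gini_decrease):
    """Normalize raw mean Gini decreases so they sum to 1 (first value)."""
    total = sum(raw_gini_decrease.values())
    if total == 0:
        return {name: 0.0 for name in raw_gini_decrease}
    return {name: value / total for name, value in raw_gini_decrease.items()}


def local_importance(shap_rows, factor_names):
    """Average the absolute marginal contributions per factor (second value).

    shap_rows -- one list per sample, ordered like factor_names
    """
    n_samples = len(shap_rows)
    return {name: sum(abs(row[j]) for row in shap_rows) / n_samples
            for j, name in enumerate(factor_names)}


# Hypothetical inputs: two factors, two samples.
first = global_importance({"TP": 3.0, "DO": 1.0})
second = local_importance([[0.2, -0.1], [-0.4, 0.3]], ["TP", "DO"])

print(first)
print(second)
```

Taking absolute values before averaging keeps positive and negative marginal contributions from cancelling, which is why a factor with strong effects in both directions still receives a high local importance.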
6. The adaptive diagnostic method for water ecological problems according to claim 5, characterized in that fusing the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factor includes: normalizing the first contribution value and the second contribution value separately; and computing a weighted sum of the normalized first and second contribution values according to preset weights to obtain the comprehensive contribution value of each influencing factor, wherein the preset weights are optimized by a genetic algorithm using historical data.
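By way of non-limiting illustration, the fusion step of claim 6 can be sketched as follows. The sketch assumes sum-to-one normalization and takes the weight as a given input. The value `w_global=0.5` is a placeholder rather than a weight obtained from the genetic algorithm optimization, and the contribution values are hypothetical.

```python
def normalize(values):
    """Scale a dict of non-negative values so they sum to 1 (assumed scheme)."""
    total = sum(values.values())
    if total == 0:
        return {name: 0.0 for name in values}
    return {name: value / total for name, value in values.items()}


def fuse(first, second, w_global=0.5):
    """Weighted sum of the normalized first and second contribution values.

    w_global -- preset weight of the first (global) contribution value;
                the second (local) value receives 1 - w_global.
    """
    first_n, second_n = normalize(first), normalize(second)
    return {name: w_global * first_n[name] + (1.0 - w_global) * second_n[name]
            for name in first_n}


# Hypothetical contribution values for two influencing factors.
combined = fuse({"TP": 3.0, "DO": 1.0}, {"TP": 1.0, "DO": 1.0}, w_global=0.5)
print(combined)
```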
7. The adaptive diagnostic method for water ecological problems according to claim 6, characterized in that constructing the adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data includes: mapping the degree of damage to a numerical damage index; calculating, from the environmental factor data, a coefficient of variation reflecting the overall degree of fluctuation; and constructing the dynamic threshold based on the damage index and the coefficient of variation, the dynamic threshold being expressed as follows: [formula not reproduced], where DI denotes the damage index, CV denotes the coefficient of variation, and the remaining terms denote the baseline threshold, the adjustment coefficients for the damage index and the coefficient of variation, respectively, and the data quality benchmark.
8. An adaptive diagnostic device for water ecological problems, characterized in that the device includes: a data collection module, used to collect multi-source water environment data of the water area to be diagnosed, the multi-source water environment data including aquatic organism data and environmental factor data; a biological integrity assessment module, used to calculate the biological integrity index based on the aquatic organism data; a damage severity grading module, used to grade the degree of damage of the water area to be diagnosed based on the biological integrity index; an influencing factor screening module, used to construct a statistical analysis model that matches the damage characteristics based on the degree of damage, and to screen the influencing factors of each damaged area using the statistical analysis model; a contribution value evaluation module, used to quantify the contribution of each influencing factor based on global statistics and on local marginal contributions, respectively, to obtain a first contribution value and a second contribution value, and to fuse the first contribution value and the second contribution value to obtain the comprehensive contribution value of the influencing factor; and an influencing factor classification module, used to construct an adaptively adjusted dynamic threshold based on the degree of damage and the fluctuation of the environmental factor data, to compare the comprehensive contribution value with the dynamic threshold, and to classify the influencing factors into key influencing factors, potential influencing factors, and insignificant influencing factors.
9. An electronic device, characterized in that it includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements the adaptive diagnostic method for water ecological problems according to any one of claims 1-7.
10. A storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the adaptive diagnostic method for water ecological problems according to any one of claims 1-7.