An intelligent recommendation system and method for microbial agents used in contaminated site remediation
By constructing a high-fidelity sampling and reconstruction subsystem, a remediation area pollution attribution subsystem, and a site remediation decision-making subsystem, and by combining low-disturbance drilling with multi-machine learning, the invention achieves intelligent remediation of contaminated sites. This addresses the insufficient accuracy of pollution identification and the lack of intelligent microbial agent matching in existing technologies, and enables efficient, accurate remediation that is coordinated with production.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TIANJIN UNIV
- Filing Date
- 2026-02-09
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies for contaminated site remediation suffer from problems such as insufficient accuracy in contamination identification, lack of intelligence in microbial agent matching, failure to achieve multi-objective synergy in remediation decision-making, and poor synergy between production and remediation. As a result, it is difficult to achieve efficient and accurate remediation while ensuring production continuity.
A high-fidelity sampling and reconstruction subsystem, a pollution attribution factor system for the remediation area, and a site remediation decision-making subsystem are constructed. Through low-disturbance drilling, deep learning, and multi-machine learning, combined with a multi-dimensional knowledge graph, the system achieves accurate extraction of pollution information, intelligent recommendation and precise delivery of microbial agents, and establishes a fully closed-loop intelligent remediation system.
It enables precise remediation of contaminated sites during the production process, reduces interference with production facilities, improves the compatibility and degradation efficiency of microbial agents, reduces resource waste and remediation costs, and ensures the stability and sustainability of remediation results.
Figure CN122243231A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of contaminated site remediation and intelligent control technology, specifically to an intelligent recommendation system and method for remediation of contaminated sites using microbial agents.
Background Technology
[0002] With the continuous operation of industrial production, contaminated sites are prone to complex soil and groundwater pollution due to factors such as material leaks and equipment aging. These sites are characterized by complex pollutant components and a strong correlation between pollutant migration and production conditions, necessitating efficient and precise remediation while ensuring production continuity. Although existing technologies have been explored in the field of pollution remediation, they still have many limitations and cannot meet the remediation needs of complex contaminated sites.
[0003] For example, patent application publication number CN118218390A proposes an "In-situ Indigenous Microbial Remediation Device and Method for Contaminated Soil". This technology constructs an in-situ indigenous microbial propagation system and an in-situ remediation system. It involves excavating pits in the contaminated area and injecting stimulating solutions to propagate indigenous microorganisms, then injecting the bacterial solution into the area to be remediated. This solves problems such as high remediation costs of exogenous bacteria, biosafety risks, and significant losses of in-situ stimulating solutions. However, the remediation decision-making lacks precision. It does not construct a three-dimensional reconstruction model of the contaminated space, making it impossible to accurately depict the distribution and diffusion trends of contamination. Furthermore, the selection of microbial agents in this patent relies on experience, lacking an intelligent matching mechanism between microbial agents and contamination characteristics and site conditions. It also fails to consider the constraints of production conditions on the remediation process, making it difficult to achieve synergistic optimization of remediation efficiency, cost, and production interference.
[0004] Patent application publication number CN120928883A proposes "A method and system for optimizing refining process efficiency and real-time temperature control based on artificial intelligence." This technology collects process and raw material data through a multimodal sensor array, combines XGBoost model to predict yield, DQN model to optimize temperature curves, and a three-level efficiency compliance mechanism, solving problems such as data silos, experience-based decision-making, and rigid fixed curves in refining processes. However, its core focus is on temperature control and efficiency optimization in refining processes. It does not address the core needs of contaminated site remediation, failing to cover key aspects such as accurate contamination identification, intelligent recommendation of microbial agents, and multi-media remediation boundary delineation, thus failing to achieve intelligent decision-making and dynamic adaptation for contaminated site remediation.
[0005] Patent application publication number CN118429471A proposes "an analysis method and system for the remediation range of soil pollution." This technology solves the problems of inaccurate pollution distribution simulation and large deviations in the definition of remediation range caused by traditional interpolation methods by dividing pollution boundary sub-lines, dynamically updating sliding windows to correct boundaries, and using image deduplication models. However, the patent only focuses on the analysis and determination of the remediation range and does not establish a collaborative mechanism with production conditions. It cannot avoid carrying out remediation operations in core production areas and during peak periods, and it lacks an iterative optimization mechanism for remediation schemes, making it difficult to cope with the dynamic changes of contaminated sites. In summary, existing technologies in contaminated site remediation still have problems such as insufficient accuracy in pollution identification, lack of intelligence in microbial agent matching, failure to achieve multi-objective collaboration in remediation decision-making, and poor coordination between production and remediation.
[0006] To address the above shortcomings, the applicant proposes an intelligent recommendation system and method for remediation of contaminated sites using microbial agents.
Summary of the Invention
[0007] The purpose of this invention is to solve the technical problems of large sampling disturbances, ambiguous pollution attribution, blind delivery of microbial agents, and insufficient coordination between production and remediation during the process of simultaneous production and remediation at contaminated sites. The invention aims to achieve efficient remediation of contamination through intelligent recommendation and delivery of microbial agents while ensuring production continuity.
[0008] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is as follows: An intelligent recommendation system for microbial agents to remediate contaminated sites includes a high-fidelity sampling and reconstruction subsystem, a remediation area pollution attribution subsystem, and a site remediation decision-making subsystem. The output of the high-fidelity sampling and reconstruction subsystem is connected to the inputs of both the remediation area pollution attribution subsystem and the site remediation decision-making subsystem; the output of the remediation area pollution attribution subsystem is connected to the input of the site remediation decision-making subsystem; and the output of the site remediation decision-making subsystem is connected back to the input of the high-fidelity sampling and reconstruction subsystem.
[0009] The high-fidelity sampling and reconstruction subsystem is used to collect soil and groundwater samples from contaminated sites, complete the three-dimensional reconstruction of the subsurface and the preliminary extraction of pollution information, and output the subsurface information to the remediation area pollution attribution subsystem. At the same time, it receives remediation information feedback instructions from the site remediation decision-making subsystem and dynamically optimizes the sampling strategy. The remediation area pollution attribution subsystem is used to receive subsurface information, optimize the remediation area boundary, attribute the major pollutants, and output pollution information to the site remediation decision-making subsystem. Based on subsurface information and pollution information, the site remediation decision-making subsystem intelligently recommends microbial agents, optimizes the delivery plan, executes precise delivery, and outputs remediation information to the high-fidelity sampling and reconstruction subsystem to update the heterogeneous data fusion.
[0010] The aforementioned high-fidelity sampling and reconstruction subsystem includes: a sample point confirmation module, a low-disturbance drilling module, a high-fidelity sampling module, a sample quality control and screening module, a rapid detection module, a heterogeneous data fusion module, and a subsurface 3D reconstruction module. The output of the sample point confirmation module is connected to the input of the low-disturbance drilling module, the output of the low-disturbance drilling module is connected to the input of the high-fidelity sampling module, the output of the high-fidelity sampling module is connected to the input of the sample quality control and screening module, the output of the sample quality control and screening module is connected to the input of the rapid detection module, the output of the rapid detection module is connected to the input of the heterogeneous data fusion module, and the output of the heterogeneous data fusion module is connected to the input of the subsurface 3D reconstruction module.
[0011] The aforementioned sample point confirmation module formulates a sampling plan based on production condition data, historical site pollution data, and previous monitoring data issued by the production database. The sampling plan includes sampling locations, sampling time periods, sampling depths, and sampling frequencies.
The aforementioned low-disturbance drilling module follows the sampling plan output by the sample point confirmation module and conducts drilling operations using low-disturbance drilling technology. Equipped with anti-disturbance drilling tools and pressure feedback devices, it reduces the interference of drilling with the site's soil structure, groundwater runoff, and production facilities, and accurately obtains soil and groundwater samples at different depths.
The aforementioned high-fidelity sampling module collects and preserves, with high fidelity, the soil and groundwater samples obtained by the low-disturbance drilling module, avoiding component volatilization, oxidation, or cross-contamination during collection and ensuring the authenticity of the samples.
The aforementioned sample quality control and screening module performs purity testing, contamination screening, and validity determination on the samples transported to the testing area by the high-fidelity sampling module. It eliminates interfered, damaged, or invalid samples through appearance observation, purity analysis, and blank control experiments to ensure the reliability of the test data.
The aforementioned rapid detection module rapidly determines the pollutant types and concentrations and the site physicochemical parameters for the valid samples selected by the sample quality control and screening module. Pollutant detection covers heavy metals, volatile organic compounds, and semi-volatile organic compounds; physicochemical parameter detection includes soil moisture content, porosity, pH value, redox potential, organic matter content, etc. A detection data report is generated at the same time.
The aforementioned heterogeneous data fusion module integrates the detection data output by the rapid detection module with historical pollution data and site geological and hydrological data from the multi-source soil database. The data are processed through data cleaning, redundancy removal, spatiotemporal alignment, and format standardization.
The aforementioned subsurface 3D reconstruction module combines the data output by the heterogeneous data fusion module with the site geological stratification characteristics and groundwater runoff patterns, and uses 3D modeling technology to depict the subsurface spatial distribution, concentration gradient, and diffusion trend of pollutants and their correlation with site geochemical parameters. It generates a visualized subsurface 3D reconstruction model that clarifies the core pollution area, diffusion path, and potential impact range.
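The preprocessing chain named for the heterogeneous data fusion module (data cleaning, redundancy removal, standardization) can be illustrated with a minimal sketch. The function names, the z-score outlier rule, the correlation cutoff `r_max`, and min-max scaling are all assumptions chosen for illustration; the source does not specify which cleaning or standardization rules are used, and spatiotemporal alignment is omitted here.

```python
from statistics import mean, pstdev


def remove_outliers(rows, key, z_max=3.0):
    """Data cleaning: drop rows whose `key` value is more than z_max std devs from the mean."""
    values = [r[key] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(rows)
    return [r for r in rows if abs(r[key] - mu) / sigma <= z_max]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def drop_redundant(rows, keys, r_max=0.98):
    """Redundancy removal: keep a parameter only if it is not highly correlated with one already kept."""
    kept = []
    for key in keys:
        col = [r[key] for r in rows]
        if all(abs(pearson(col, [r[k] for r in rows])) < r_max for k in kept):
            kept.append(key)
    return kept


def min_max_scale(rows, keys):
    """Standardization: rescale each kept parameter to the range [0, 1]."""
    scaled = [{k: r[k] for k in keys} for r in rows]
    for key in keys:
        lo = min(r[key] for r in rows)
        span = max(r[key] for r in rows) - lo
        for r in scaled:
            r[key] = 0.0 if span == 0 else (r[key] - lo) / span
    return scaled


def fuse(rows, keys, outlier_key, z_max=3.0):
    """Run cleaning, redundancy removal, and standardization in sequence."""
    clean = remove_outliers(rows, outlier_key, z_max)
    kept = drop_redundant(clean, keys)
    return kept, min_max_scale(clean, kept)
```

Each row here is a plain dictionary of measured parameters for one sample; a real pipeline would also carry sample coordinates and timestamps so that records can be aligned before fusion.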
[0012] The aforementioned pollution attribution system for the remediation area includes: a deep learning module, a remediation area boundary optimization module, and a pollution attribution module; the output of the deep learning module is connected to the input of the remediation area boundary optimization module, and the output of the remediation area boundary optimization module is connected to the inputs of the deep learning module and the pollution attribution module. The aforementioned deep learning module is used for deep learning spatial correlation modeling. It uses pollution concentration data and field geochemical parameters from underground information as node features and key production factors as correlation edge weights. Through model training, it mines the potential correlation patterns between pollution distribution and production activities. Based on the output results of the deep learning model, a spatial clustering algorithm is used to perform cluster analysis on the pollution concentration data, identify high-value pollution concentration clusters, pollution diffusion path clusters, and isolated pollution points, and preliminarily define the scope of pollution impact. The above-mentioned remediation area boundary optimization module, based on the data from the deep learning module and combined with spatial clustering results and site topography, initially delineates the remediation area boundary to ensure that the boundary covers all pollution cluster areas and potential diffusion paths. Combined with the production facility layout issued by the production database, the initially defined remediation area boundary is adjusted and optimized to ensure that the remediation area boundary covers the actual pollution range without interfering with production operations. 
The pollution attribution module takes the output of the remediation area boundary optimization module and calculates the pollution contribution of each pollutant through multi-factor comprehensive evaluation and similar methods, which clarifies the impact weight of each pollutant on site pollution. Based on the pollution contribution ranking, pollutant environmental risk thresholds, and remediation feasibility, it selects the main pollutants, namely those with the greatest impact on the site environment, the highest risk, and the highest remediation priority, and determines their concentration range and spatial distribution.
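The contribution ranking and main-pollutant screening performed by the pollution attribution module might look like the sketch below. It assumes, purely for illustration, that a pollutant's raw score is the sum over sampling points of concentration × impact area × production-activity weight, normalized across pollutants; the exact formula is not recoverable from this text. The threshold values, the `min_share` cutoff, and all function names are hypothetical.

```python
def pollution_contributions(conc, area, weight):
    """Normalized contribution per pollutant.

    conc[name][k]   : concentration of the pollutant at sampling point k
    area[k]         : pollution impact area of sampling point k
    weight[name][k] : production-activity influence weight at sampling point k
    The product-and-normalize form is an assumption, not the source's formula.
    """
    raw = {
        name: sum(c * s * w for c, s, w in zip(conc[name], area, weight[name]))
        for name in conc
    }
    total = sum(raw.values())
    return {name: (v / total if total else 0.0) for name, v in raw.items()}


def select_main_pollutants(contrib, max_conc, risk_threshold, min_share=0.1):
    """Rank by contribution (descending) and keep pollutants that also exceed their risk threshold."""
    ranked = sorted(contrib, key=contrib.get, reverse=True)
    return [
        p for p in ranked
        if contrib[p] >= min_share and max_conc[p] > risk_threshold[p]
    ]
```

In practice the risk thresholds would come from the applicable standard for the site's land use rather than from hard-coded values, and remediation feasibility would add a further filter that is not modeled here.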
[0013] The aforementioned site remediation decision-making subsystem includes: a knowledge graph module, an intelligent recommendation algorithm module, a microbial agent efficacy testing module, a multi-machine learning fusion module, a multi-objective optimization module, a remediation scheme implementation module, and a remediation effect detection module. The output of the knowledge graph module is connected to the input of the intelligent recommendation algorithm module; the output of the intelligent recommendation algorithm module is connected to the input of the microbial agent efficacy testing module; the output of the microbial agent efficacy testing module is connected to the inputs of the knowledge graph module and the multi-machine learning fusion module; the data end of the multi-machine learning fusion module is bidirectionally connected to the data end of the multi-objective optimization module; the output of the multi-objective optimization module is connected to the input of the remediation scheme implementation module; the output of the remediation scheme implementation module is connected to the input of the remediation effect detection module; and the output of the remediation effect detection module is connected to the input of the knowledge graph module.
[0014] The aforementioned knowledge graph module is used to construct a multi-dimensional structured network in the field of microbial inoculant remediation. With "entity-relationship-weight" as the core architecture, it realizes semantic association between inoculants and information such as pollution characteristics, site conditions, and production constraints. The aforementioned intelligent recommendation algorithm module is used to call the microbial agent remediation knowledge graph constructed by the knowledge graph module based on pollution information, mine the matching relationship between microbial agents and pollution characteristics through intelligent recommendation algorithm, initially recommend candidate microbial agents, and generate a candidate agent list by sorting according to the degree of matching. The aforementioned microbial agent efficacy testing module is used to conduct small-scale laboratory tests on microbial agents from the candidate microbial agent list recommended by the intelligent recommendation algorithm module. It simulates actual site environmental conditions, sets up experimental groups with different microbial agent concentrations and application ratios, and simultaneously sets up a blank control group. During the experimental period, it regularly monitors the pollutant degradation rate, microbial agent survival concentration, and changes in soil physicochemical parameters for each group, screens the optimal microbial agent with the best degradation effect, good microbial agent survival status, and no negative impact on the soil environment, and determines the initial application parameters. The aforementioned multi-machine learning fusion module is used to construct multi-machine learning fusion models, including basic models such as Random Forest (RF), Gradient Boosting Tree (GBT), Support Vector Regression (SVR), and Deep Neural Network (DNN). 
These models are integrated into a fusion model through combination strategies such as dynamic weight allocation. The decision analysis dataset is then input into the fusion model to predict the pollutant degradation effect, microbial agent survival status, and potential impact on production under different application schemes (location, type, dosage). The aforementioned multi-objective optimization module, working with the multi-machine learning fusion module, aims to maximize remediation efficiency, minimize remediation cost, and minimize the risk of production interference. It determines the optimal location, type, and dosage for each delivery point, taking into account the characteristics of the microbial agent and site constraints. The aforementioned remediation scheme implementation module delivers microbial agents to the target area through a fixed-point delivery device according to the optimization scheme output by the multi-objective optimization module, and records data such as delivery location coordinates, delivery amount, delivery time, and environmental parameters in real time. The aforementioned remediation effect detection module samples and tests different points in the remediation area at preset intervals after the remediation scheme implementation module has completed delivery, obtains data on changes in pollutant concentration, microbial agent survival concentration, and soil physicochemical parameters, and evaluates the remediation effect.
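A minimal sketch of how dynamic weight allocation and the three-objective trade-off could fit together is given below. It assumes that each base model (RF, GBT, SVR, DNN) has already produced a prediction and a validation error, that weights are set inversely proportional to that error, and that the three objectives are combined by a simple weighted sum. None of these choices is stated in the source, which names neither the weighting rule nor the optimization solver; all function names and field names are hypothetical.

```python
def dynamic_weights(val_errors, eps=1e-9):
    """Assumed rule: weight each base model inversely to its validation error."""
    inv = {name: 1.0 / (err + eps) for name, err in val_errors.items()}
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}


def fused_prediction(predictions, weights):
    """Weighted average of the base-model predictions for one application scheme."""
    return sum(weights[name] * predictions[name] for name in predictions)


def best_scheme(schemes, w_eff=1.0, w_cost=1.0, w_risk=1.0):
    """Assumed scalarization: reward predicted efficiency, penalize cost and production risk."""
    return max(
        schemes,
        key=lambda s: w_eff * s["efficiency"] - w_cost * s["cost"] - w_risk * s["risk"],
    )
```

A weighted sum is the simplest way to rank candidate schemes, but it returns only one point on the trade-off surface; a Pareto-based solver would expose the full set of non-dominated delivery schemes.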
[0015] This invention also includes a method for intelligent recommendation of microbial agents for the remediation of contaminated sites. This method can be used in conjunction with the above intelligent recommendation system, or it can be implemented independently. The method is applicable to sites of in-production chemical enterprises where soil and groundwater are jointly contaminated (pollutants include organic matter and heavy metals) and remediation must be carried out while production equipment continues to operate. The method includes the following steps:
Step 1: Receive real-time operating data from the production database, combine it with historical site pollution information, formulate a low-disturbance sampling plan, specify the sampling points, depth, time period, and safety protection measures, and ensure that sampling operations avoid the core production area and peak hours to reduce interference with production operations.
Step 2: Perform low-disturbance drilling and high-fidelity sampling according to the sampling plan, complete sample quality control screening, obtain pollution data and physicochemical parameters through rapid detection and data calibration, construct a three-dimensional underground reconstruction model based on multi-source heterogeneous data fusion, accurately depict the spatial distribution of pollution, and output underground information to the remediation area pollution attribution subsystem.
Step 3: Based on underground information, identify the pollution range through deep learning modeling and spatial clustering, combine production database analysis to optimize the boundary of the remediation area, determine the main pollutants by pollution attribution, and output pollution information, including the boundary of the remediation area and the pollutant types and concentrations, to the site remediation decision-making subsystem.
Step 4: Call the intelligent recommendation algorithm based on the microbial agent remediation knowledge graph, combine the pollution information to recommend candidate microbial agents, and screen the optimal microbial agent through small-scale laboratory experiments.
Step 5: Based on underground information, integrate multi-source data to train a multi-machine learning fusion model that predicts the remediation effect. With the goal of "maximizing remediation efficiency, minimizing remediation cost, and minimizing production interference risk", construct an optimization model and solve for the optimal delivery scheme (location, type, dosage).
Step 6: During the preset safe delivery period, use precision delivery equipment to perform fixed-point, quantitative delivery according to the optimized plan, record delivery data in real time, and ensure that the delivery process is coordinated with the production conditions.
Step 7: Sample and test the remediation area according to the preset cycle to obtain data such as pollutant concentration and microbial agent survival status, evaluate the remediation effect, and determine whether the preset remediation target has been achieved. If the target is met, remediation is complete and the process moves to the later supervision stage; if the target is not met, the plan adjustment process is initiated.
Step 8: For the remediation areas that do not meet the standards, feed back the reconstruction parameters, update the remediation information in the knowledge graph and the 3D reconstruction model, reanalyze the pollution status of the remediation area, optimize the microbial agent delivery plan (adjust the location, type, and dosage), and repeat steps 6-7 to perform the next round of remediation until the remediation meets the standards.
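The closed loop formed by steps 6 to 8 (deliver, measure, re-plan until the target is met) can be sketched as a simple control loop. The callables `deliver`, `measure`, and `replan`, the `max_rounds` cap, and the single scalar residual are hypothetical stand-ins for the delivery equipment, the effect detection module, and the plan adjustment process; the source does not define such interfaces.

```python
def run_remediation(deliver, measure, replan, plan, target, max_rounds=10):
    """Repeat delivery and testing until the residual concentration meets the target.

    deliver(plan)          -> applies the delivery plan (Step 6)
    measure()              -> returns the residual pollutant concentration (Step 7)
    replan(plan, residual) -> returns an adjusted plan (Step 8)
    Returns (target_met, history), where history lists (round, residual) pairs.
    """
    history = []
    for round_no in range(1, max_rounds + 1):
        deliver(plan)
        residual = measure()
        history.append((round_no, residual))
        if residual <= target:
            return True, history  # target met: hand over to later-stage supervision
        plan = replan(plan, residual)
    return False, history  # round cap reached without meeting the target
```

The round cap is an added safeguard so that the sketch always terminates; the method as described simply repeats until the remediation meets the standards.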
[0016] As shown in Figure 2, steps 1-2 constitute the core execution flow of the high-fidelity sampling and reconstruction subsystem, a six-level linkage of "sample point confirmation - low-disturbance sampling - quality control screening - rapid detection - data fusion - three-dimensional reconstruction" that achieves accurate extraction of pollution information. Step 2 includes the following sub-steps:
Step 2-1: According to the sampling plan, use low-disturbance drilling equipment to carry out drilling operations, collect soil samples and groundwater samples with high fidelity, and record the location coordinates, depth, sampling time, and corresponding production condition parameters (such as production load and equipment operating status) in real time during sampling.
Step 2-2: Verify the samples through the sample quality control screening module: appearance verification removes damaged or contaminated samples, purity verification eliminates cross-contamination through blank control experiments, and consistency verification removes samples that exceed the reasonable deviation range. Valid samples are retained, numbered, and marked.
Step 2-3: Using the rapid detection module, with equipment including portable GC-MS, a photoionization detector (PID), an X-ray fluorescence spectrometer (XRF), and a membrane interface probe (MIP), rapidly detect the pollutant types and concentrations in the valid samples together with the physicochemical parameters (soil moisture content, porosity, pH value, redox potential, and organic matter content), and generate a detection data report.
Step 2-4: The heterogeneous data fusion module integrates rapid detection data, production condition data, site geological and hydrological data, etc., and performs data preprocessing to form a structured dataset: data cleaning removes outliers, redundancy removal deletes highly correlated duplicate parameters, spatiotemporal alignment associates the various types of data, and standardization unifies the data range.
Step 2-5: Determine whether the dataset meets the requirements for 3D reconstruction. If not, supplement the sample points accordingly and repeat steps 2-1 to 2-4. If it does, perform 3D reconstruction. The underground 3D reconstruction module works from the structured, standardized dataset, combines the site's geological stratification characteristics and groundwater runoff patterns, and adopts a three-level technical route of "CT scanning - image segmentation - model construction" to characterize the spatial distribution of pollutants. It generates an underground 3D reconstruction model that clarifies the core pollution area, pollution plume range, and potential impact area, and synchronously outputs underground information to the remediation area pollution attribution subsystem. Step 2-5 includes the following steps:
Step 2-5-1: Perform CT tomography on the reconstructed soil sample.
First, remove image noise using the non-local means method, then apply a Gaussian high-pass filter to enhance the contrast between pores and the skeleton. The filtering formula is as follows:
$$g(x, y) = \sum_{k}\sum_{l} K(k, l)\, f(x + k,\ y + l)$$
In the formula, $g(x, y)$ is the filtered pixel value at coordinates $(x, y)$; $f(x, y)$ is the original pixel value; $K$ is the 3×3 filter kernel matrix; $k$ is the row index of the filter kernel matrix; $l$ is the column index of the filter kernel matrix; $K(k, l)$ is the kernel coefficient at position $(k, l)$ of the filter kernel matrix; and $f(x + k, y + l)$ is the pixel value of the original image at coordinates $(x + k, y + l)$.
Step 2-5-2: Based on the trough threshold between the two peaks of the gray-level histogram, segment the soil into three parts: pores, organic matter, and soil skeleton. Use the marching cubes algorithm to generate a pore structure model, classify pores into three classes according to pore volume quantiles, and calculate the equivalent diameter of each pore class:
$$D_i = \left(\frac{6\,\bar{V}_i}{\pi}\right)^{1/3}$$
In the formula, $D_i$ is the equivalent diameter of the $i$-th pore class and $\bar{V}_i$ is the mean pore volume of the $i$-th pore class.
Step 2-5-3: Conduct compression tests on reconstructed soil samples, plot load-void ratio curves, estimate the in-situ formation load, and define the compression coefficient $\eta$ from the in-situ porosity $\phi_0$ and the reconstructed porosity $\phi_r$; multiply the pore volume by $\eta$ to restore the original pore morphology.
Step 2-5-4: Integrate pollutant concentration data with geological and hydrological parameters, and construct a spatial distribution model of pollutant concentration using the Kriging interpolation method. The formula is as follows:
$$Z(x_0) = \sum_{i=1}^{n} \lambda_i\, Z(x_i)$$
In the formula, $Z(x_0)$ is the pollutant concentration at the unknown point; $Z(x_i)$ is the concentration at the $i$-th known sampling point; $\lambda_i$ is the interpolation weight, determined by $d_i$, the distance between the unknown point and the $i$-th sampling point, and $b$, the distance attenuation coefficient; and $n$ is the number of samples.
Step 2-5-5: Simulate pollution diffusion trends under different production conditions, generate a visualized 3D underground reconstruction model, identify the core pollution area, pollution plume range, and potential impact areas, and simultaneously output the underground information to the remediation area pollution attribution subsystem. The construction of the underground 3D reconstruction model requires parameter correction using a composite machine learning model. The specific steps are as follows:
Step 2-5-5-1: Using the measured values of porosity and organic matter volume fraction at preset sample points as training labels, and using multi-color space features and soil resistance parameters as inputs, construct an SVR+Lasso composite model.
Step 2-5-5-2: Use the mean squared error (MSE) as the loss function, as shown in the following formula:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
In the formula, $Y_i$ is the measured value of the $i$-th sample, $\hat{Y}_i$ is the model prediction for the $i$-th sample, and $n$ is the number of samples.
Step 2-5-5-3: Combine the CNN model with the inverse distance weighting method to perform secondary correction, and output the porosity correction value and the organic matter volume fraction correction value to optimize the accuracy of 3D reconstruction.
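Three of the calculations in Steps 2-5-2 to 2-5-5 are small enough to sketch directly: a distance-weighted concentration estimate, the MSE loss, and the equivalent diameter of a pore class. The interpolation below is inverse-distance weighting with weights proportional to $d_i^{-b}$, which matches the quantities the text defines (distance and a distance attenuation coefficient) but is an assumption; it is not ordinary kriging, which derives its weights from a fitted variogram. The equivalent diameter assumes a sphere of equal volume. All function names are hypothetical.

```python
import math


def idw_estimate(target, samples, b=2.0):
    """Estimate Z(x0) as a weighted sum of known concentrations.

    samples: list of (point, concentration) pairs; points are coordinate tuples.
    Weights are proportional to distance ** -b (assumed form) and sum to 1.
    """
    num = den = 0.0
    for point, z in samples:
        d = math.dist(target, point)
        if d == 0.0:
            return z  # target coincides with a sampling point
        w = d ** -b
        num += w * z
        den += w
    return num / den


def mse(measured, predicted):
    """Mean squared error between measured and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(measured, predicted)) / len(measured)


def equivalent_diameter(mean_volume):
    """Diameter of the sphere whose volume equals the mean pore volume."""
    return (6.0 * mean_volume / math.pi) ** (1.0 / 3.0)
```

A larger `b` makes the estimate more local, since distant sampling points lose influence faster.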
[0017] As shown in Figure 3, step 3 is the core execution process of the remediation area pollution attribution subsystem. Through a three-level analysis of "model building - boundary optimization - pollution attribution", it achieves accurate definition of remediation areas and pollution source tracing. The specific sub-steps are as follows:
Step 3-1: Receive the underground 3D reconstruction model information and construct a deep learning spatial correlation model. The specific steps are as follows:
Step 3-1-1: Extract pollution data from the underground 3D reconstruction model. Each sampling point is treated as a node, and the node feature vector is defined as follows:
$$\mathbf{x}_k = \left[c_{1,k}, \dots, c_{M,k},\ s_{1,k}, \dots, s_{T,k},\ \mathbf{p}_k\right]$$
where $c_{m,k}$ is the concentration of the $m$-th pollutant at the $k$-th node, $s_{t,k}$ is the value of the $t$-th soil physicochemical parameter at the $k$-th node, and $\mathbf{p}_k$ is the spatial coordinates of the $k$-th node (for $M$ pollutants and $T$ parameters). The features of all nodes form a feature matrix $X \in \mathbb{R}^{n \times d}$, where $n$ is the number of nodes and $d$ is the feature dimension.
Step 3-1-2: Construct edge weights based on node spatial distance and pollution concentration correlation. First, calculate the Euclidean distance between nodes:
$$d_{ij} = \lVert \mathbf{p}_i - \mathbf{p}_j \rVert_2$$
where $d_{ij}$ is the three-dimensional Euclidean distance between the $i$-th and $j$-th nodes, and $\mathbf{p}_i$ and $\mathbf{p}_j$ are the three-dimensional spatial coordinates of the $i$-th and $j$-th nodes. Then combine it with the concentration correlation:
$$\rho_{ij} = \frac{\operatorname{Cov}(c_i, c_j)}{\sigma_i\, \sigma_j}$$
where $\rho_{ij}$ is the correlation coefficient of pollutant concentration between the $i$-th and $j$-th nodes, $\operatorname{Cov}(c_i, c_j)$ is the covariance of pollutant concentration between the two nodes, and $\sigma_i$ and $\sigma_j$ are the standard deviations of pollutant concentration at the $i$-th and $j$-th nodes. The edge weight $w_{ij}$ between the $i$-th and $j$-th nodes is computed from the Euclidean distance $d_{ij}$, the maximum node distance $d_{\max}$, and the absolute value of the concentration correlation coefficient $|\rho_{ij}|$. Construct the adjacency matrix $A = [w_{ij}] \in \mathbb{R}^{n \times n}$ ($A$ is the adjacency matrix between nodes and $n$ is the total number of nodes), then perform symmetric normalization:
$$\tilde{A} = D^{-1/2} A D^{-1/2}, \qquad D_{ii} = \sum_{j} A_{ij}$$
where $D$ is the degree matrix and $D_{ii}$ is its diagonal element in row $i$, column $i$.
Step 3-1-3: Use two graph convolutional layers to extract the spatial correlation features of pollution. The first layer is computed with the core formula of Step 3-1, and the second layer outputs the final node feature matrix. Training uses a contrastive loss to enhance the feature discrimination of areas with high pollution concentrations. The contrastive loss is defined over the original feature vector of the $i$-th node, the enhanced feature vector of the $i$-th node, and the cosine similarity between them, with $n$ the total number of nodes.
Step 3-2: Based on the output feature matrix, use the K-means++ algorithm for clustering (the spatial clustering algorithm performs cluster analysis on the pollution concentration data with the clustering parameters set accordingly) to identify high-value pollution concentration clusters, pollution diffusion path clusters, and isolated pollution points, and to preliminarily define the scope of pollution impact.
Step 3-3: Combining the spatial clustering results and site topography information, initially delineate the boundary of the remediation area using a polygon fitting method.
Step 3-4: Use data from the production database, such as the scope of the core production area, the routing of key pipelines, and the boundary of the explosion-proof area, to optimize the initial boundary, eliminate overlapping parts, and expand the edge areas with a high risk of pollution spread.
Step 3-5: Verify the validity of the boundary to ensure that the boundary of the remediation area covers the actual pollution range. If the boundary is valid, proceed to attribute the pollutants in the remediation area; if it is invalid, readjust the deep learning parameters and repeat steps 3-1 to 3-5.
Step 3-6: Based on the attribution analysis dataset, calculate the pollution contribution using a multi-factor comprehensive evaluation method (formula not reproduced in this text). In the formula, $C_i$ is the pollution contribution of the $i$-th pollutant, $c_{ik}$ is the concentration of the $i$-th pollutant at the $k$-th sampling point, $S_k$ is the pollution impact area corresponding to the $k$-th sampling point, $w_{ik}$ is the influence weight of production activities on the $i$-th pollutant at the $k$-th sampling point, $m$ is the total number of pollutant types, and $n$ is the number of sampling points. Rank the pollutants by pollution contribution in descending order and, combined with the environmental risk thresholds and remediation feasibility in standards such as the "Soil Environmental Quality Construction Land Soil Pollution Risk Control Standard (GB36600-2018)" and the "Groundwater Quality Standard (GB/T14848-2017)", determine the main pollutant types, concentration ranges, and pollution priorities, and output the pollution information to the site remediation decision subsystem.
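The graph construction in Step 3-1-2 can be sketched in plain Python as follows. The Pearson correlation and the symmetric normalization $D^{-1/2} A D^{-1/2}$ follow the definitions in the text. The edge-weight form used here, $(1 - d_{ij}/d_{\max}) \cdot |r_{ij}|$, is only an assumption chosen so that near, strongly correlated nodes get large weights, since the source formula is not available. The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length concentration vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def build_adjacency(coords, conc):
    """Build a weighted adjacency matrix from node coordinates and
    per-node pollutant concentration vectors.

    ASSUMPTION: w_ij = (1 - d_ij / d_max) * |r_ij|; the source formula is lost.
    """
    n = len(coords)
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in dist) or 1.0
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                r = abs(pearson(conc[i], conc[j]))
                adj[i][j] = (1.0 - dist[i][j] / d_max) * r
    return adj


def sym_normalize(adj):
    """Symmetric normalization D^-1/2 A D^-1/2 with D_ii = sum_j A_ij."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    conc = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
    for row in sym_normalize(build_adjacency(coords, conc)):
        print([round(v, 3) for v in row])
```

The normalized matrix is what a graph convolutional layer would multiply with the node feature matrix; isolated nodes with zero degree are left with zero rows rather than raising a division error.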
[0018] As shown in Figure 4, steps 4-8 are the core execution process of the site remediation decision-making subsystem. Through the four-level linkage of "microbial agent recommendation - effect verification - model prediction - optimized delivery", the optimal microbial agent delivery plan is determined. As shown in Figure 5, step 4 is the execution flow for recommending and verifying the microbial agent, which includes the following sub-steps:
Step 4-1: Construct a knowledge graph of microbial agent remediation. The specific steps are as follows:
Step 4-1-1: Select core nodes, including microbial agent nodes (with attributes such as agent type, degradation type, and suitable pH value), pollutant nodes (with attributes such as pollutant type, chemical structure, and concentration threshold), site nodes (with attributes such as soil texture, porosity, and moisture content), production constraint nodes (with attributes such as production load periods and safety windows), and other nodes;
Step 4-1-2: Establish core relationship edges, including the "degradation relationship" (the ability of a microbial agent to degrade a pollutant), the "adaptation relationship" (the degree to which a microbial agent suits the site conditions), and the "compatibility relationship" (the compatibility of a microbial agent with production conditions);
Step 4-1-3: Based on historical and experimental data, calculate edge weights using correlation analysis (formula not reproduced in this text). The weight of a relationship between node $i$ and node $j$ in the knowledge graph is computed from the frequency of occurrence of that relationship and its attribute similarity, which takes values in $[0, 1]$;
Step 4-1-4: Combine nodes and relationship edges into weighted triples (head, relation, tail, weight), for example ($a_1$, $s_1$, $b_3$, 0.85), forming a structured knowledge graph $G = (V, S, W)$ ($V$ is the set of nodes, $S$ is the set of relationship edges, and $W$ is the set of weights);
Step 4-1-5: Use a supervised random walk algorithm to mine potential associations and define path features. The path feature value $X_{a,p(b)}$ is the probability of reaching the pollutant node $b$ from the microbial agent node $a$ along the path $p$, and it is updated by a recursive rule (formula not reproduced in this text). The rule is expressed in terms of a node $e$ in the path, the set of all nodes covered by the sub-path, an indicator that is 1 if a corresponding edge exists between a node and $e$ and 0 otherwise, and the total number of nodes that a node can be associated with;
Step 4-2: Based on the microbial agent remediation knowledge graph, use an intelligent recommendation algorithm to mine the compatibility between microbial agents and pollution characteristics, and generate a candidate microbial agent list sorted by degree of compatibility. The specific steps are as follows:
Step 4-2-1: Use the TransE algorithm to map all nodes in the knowledge graph to low-dimensional embedding vectors. The embedding vectors of the microbial agent node, the pollutant node, and the relationship edge satisfy the TransE translation constraint, in which the head embedding plus the relation embedding approximates the tail embedding;
Step 4-2-2: For the main pollutant node in the current pollution information, calculate the relevance score between each microbial agent node and that pollutant node (formula not reproduced in this text). The relevance score of the $i$-th microbial agent node is computed from the low-dimensional embedding vector of the microbial agent node, the transpose of an embedding vector, the embedding vector of the "degradation relationship", and the embedding vector dimension $k$; the denominator is the normalization term over the relevance scores of all microbial agent nodes.
Step 4-2-3: Using the site node (current site conditions) and the production constraint node as constraint preferences, perform a second round of preference propagation to calculate the constraint fit score (formula not reproduced in this text). The constraint fit score of the $i$-th microbial agent is computed from cosine similarities, combined using a site adaptation weight and a production compatibility weight.
Step 4-2-4: Combine the relevance score and the constraint fit score to obtain the overall fitness of the $i$-th microbial agent (formula not reproduced in this text), in which a degradation capacity weight is applied to the relevance score between the $i$-th microbial agent and the main pollutant. Sort the microbial agents by overall fitness in descending order and select the top 3-5 agents to generate the candidate microbial agent list;
Step 4-3: Conduct small-scale laboratory tests on the microbial agents in the candidate list, simulating the actual environmental conditions of the site, setting up experimental groups with different agent concentrations and application ratios, and setting up a blank control group at the same time;
Step 4-4: During the experimental period, periodically test the pollutant degradation rate, the survival concentration of the microbial agent, and the changes in soil physicochemical parameters for each group, and screen out the 1-2 optimal microbial agents that show the best degradation effect, good survival, and no negative impact on the soil environment.
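The two-stage ranking in Steps 4-2-2 to 4-2-4 can be sketched as below. Because the scoring formulas are not available in this text, the sketch makes three labeled assumptions: the relevance score is a softmax over scaled dot products of (agent + relation) with the pollutant embedding, the constraint fit score is a weighted sum of cosine similarities with the site and production constraint embeddings, and the overall fitness is a convex combination of the two. All function names, parameter names (`alpha`, `w_site`, `w_prod`), default values, and embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def relevance_scores(agents, relation, pollutant):
    """ASSUMED form: softmax over (agent + relation) . pollutant / sqrt(k)."""
    k = len(pollutant)
    logits = {
        name: sum((a + r) * p for a, r, p in zip(vec, relation, pollutant)) / math.sqrt(k)
        for name, vec in agents.items()
    }
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(v - peak) for name, v in logits.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def rank_agents(agents, relation, pollutant, site, production,
                alpha=0.6, w_site=0.5, w_prod=0.5, top_n=3):
    """Rank agents by ASSUMED fitness = alpha * relevance + (1 - alpha) * fit."""
    rel = relevance_scores(agents, relation, pollutant)
    ranked = []
    for name, vec in agents.items():
        fit = w_site * cosine(vec, site) + w_prod * cosine(vec, production)
        ranked.append((name, alpha * rel[name] + (1.0 - alpha) * fit))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    agents = {"agent_A": [1.0, 0.0], "agent_B": [0.0, 1.0]}
    print(rank_agents(agents, [0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]))
```

In practice the embeddings would come from a trained TransE model; here they are hand-written two-dimensional vectors so that the ranking is easy to follow.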
[0019] As shown in Figure 6, step 5 is the execution flow of model prediction, which includes the following sub-steps:
Step 5-1: Integrate the pollution information, site parameters, small-scale test data, and other data, and form a structured decision analysis dataset by unifying the data format and aligning the data in time and space. Divide the dataset into a training set and a test set according to a reasonable ratio.
Step 5-2: Construct a multi-machine learning fusion model, selecting Random Forest (RF), Gradient Boosting Tree (GBT), Support Vector Regression (SVR), and Deep Neural Network (DNN) as base models. Using the training set of the decision analysis dataset as input, optimize the hyperparameters of each base model by grid search. Employ a dynamic weight allocation ensemble strategy, in which weights are allocated according to the coefficient of determination of each base model on the validation set: $\hat{y} = \sum_{m=1}^{M} w_m \hat{y}_m$, where $\hat{y}$ is the final predicted value of the fusion model, $w_m$ is the dynamic weight of the $m$-th base model (determined by its coefficient of determination $R_m^2$), $\hat{y}_m$ is the output of the $m$-th base model, and $M$ is the total number of base models. The prediction accuracy of the model is verified on the test set and evaluated using the coefficient of determination and the mean absolute percentage error. The model inputs are "pollutant concentration, site parameters, microbial agent type, delivery point coordinates, and delivery amount", and the outputs are "degradation efficiency, microbial agent survival time, and production interference level";
Step 5-3: Construct and solve a multi-objective optimization model with the objectives of "maximizing remediation efficiency, minimizing remediation cost, and minimizing production interference risk". Decision variables: delivery point coordinates, microbial agent type ($T$), and delivery amount ($q_i$). Objective functions (formulas not reproduced in this text): the overall remediation efficiency is expressed in terms of $c_{i0}$ (the initial concentration at the $i$-th sampling point), $c_{if}$ (the concentration after remediation), the effective microbial agent dosage, and $q_i$ (the delivery amount); the total remediation cost $C_{total}$ is expressed in terms of $C_T$ (the unit cost of the microbial agent), $C_D$ (the unit cost of delivery), $d_i$ (the delivery distance to the $i$-th delivery point), and $C_F$ (the fixed cost of delivery equipment); the production interference risk $R$ is expressed in terms of $d_{i\text{-}pro}$ (the minimum distance between the $i$-th delivery point and the production facilities), $t_{overlap}$ (the proportion of time in which delivery periods overlap with production peaks), and two weighting coefficients; $N$ is the number of delivery points. Constraints (formulas not reproduced in this text): they are expressed in terms of $q_i$ (the amount of microbial agent applied at the $i$-th delivery point), $q_{max}$ and $q_{min}$ (the upper and lower limits of the delivery amount at a single point), the three-dimensional spatial coordinates of the $i$-th delivery point, $D_{forbid}$ (the restricted production area), the minimum remediation efficiency threshold, $S_i$ (the pollution impact coefficient of the $i$-th delivery point), and $Q_{total}$ (the upper limit of the total amount of microbial agent). The NSGA-III algorithm is used to solve the model and obtain the Pareto optimal solution set, and the scheme with the best overall performance is selected as the initial remediation scheme in combination with expert decision-making.
Step 5-4: Refine the initial remediation scheme into a detailed plan that specifies "delivery point coordinates, microbial agent type, and delivery amount", and review it to form the final remediation plan.
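The dynamic weight allocation in Step 5-2 can be illustrated with a short sketch. It assumes that each weight is the base model's validation $R^2$ divided by the sum of all $R^2$ values, with negative scores clipped to zero; the text only states that weights depend on the coefficient of determination, so the exact normalization is an assumption. The function names and the numeric values are hypothetical, and the base models themselves are represented only by their prediction lists.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination (y_true must not be constant)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def dynamic_weights(r2_scores):
    """ASSUMED rule: w_m = R2_m / sum_j R2_j, negative scores clipped to 0."""
    clipped = [max(r, 0.0) for r in r2_scores]
    total = sum(clipped)
    if total == 0.0:
        # No model is better than the mean predictor; fall back to equal weights.
        return [1.0 / len(clipped)] * len(clipped)
    return [r / total for r in clipped]


def fuse(predictions, weights):
    """Weighted sum of base model outputs: y_hat = sum_m w_m * y_hat_m."""
    return [sum(w * p for w, p in zip(weights, column)) for column in zip(*predictions)]


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


if __name__ == "__main__":
    weights = dynamic_weights([0.9, 0.6])
    print(weights)
    print(fuse([[10.0, 20.0], [20.0, 30.0]], weights))
```

Clipping negative $R^2$ values keeps a badly performing base model from receiving a negative weight, which is one simple design choice among several possible ones.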
[0020] Step 7 includes the following sub-steps:
Step 7-1: Formulate a remediation effect testing plan based on the boundary of the remediation area, the characteristics of the contamination distribution, and the delivery plan;
Step 7-2: According to the testing plan, adopt low-disturbance drilling and high-fidelity sampling to avoid interfering with production, and conduct rapid testing on the collected samples to obtain data on the test indicators;
Step 7-3: Based on the test data report, assess the degradation effect by calculating the remediation efficiency = (initial concentration - current concentration) / initial concentration × 100%; assess the survival of the microbial agent by analyzing the correlation between its survival concentration and the remediation efficiency, and determine the suitability of the microbial agent; assess production interference by confirming that the remediation operation has had no negative impact on production conditions or equipment operation; and compare the results with the remediation target to determine whether the remediation effect meets the standard;
Step 7-4: If the standard is met, the current stage of remediation is deemed qualified, a remediation effect report is generated, and the process moves on to later-stage environmental monitoring; if the standard is not met, the reasons for non-compliance are analyzed, the next round of remediation is carried out, and the feedback optimization process in step 8 is initiated.
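The remediation efficiency formula in Step 7-3 and the pass/fail decision in Step 7-4 can be sketched as follows. The efficiency formula is taken from the text; the per-point data layout, the function names, and the threshold value used in the example are hypothetical.

```python
def remediation_efficiency(initial, current):
    """Remediation efficiency in percent:
    (initial concentration - current concentration) / initial concentration * 100.
    """
    if initial <= 0:
        raise ValueError("initial concentration must be positive")
    return (initial - current) / initial * 100.0


def assess(points, target_efficiency):
    """Compute the efficiency per monitoring point and list the points below target.

    points: {point_name: (initial_concentration, current_concentration)}
    target_efficiency: required efficiency in percent (hypothetical threshold).
    """
    results = {
        name: remediation_efficiency(c0, c) for name, (c0, c) in points.items()
    }
    failing = sorted(name for name, eff in results.items() if eff < target_efficiency)
    return results, failing


if __name__ == "__main__":
    readings = {"P1": (200.0, 50.0), "P2": (100.0, 60.0)}
    results, failing = assess(readings, target_efficiency=70.0)
    print(results)
    print("points below target:", failing)
```

An empty `failing` list corresponds to the "standard is met" branch of Step 7-4; any listed point would trigger the feedback optimization in step 8.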
[0021] Step 8 includes the following sub-steps:
Step 8-1: Update the remediation information to the microbial agent remediation knowledge graph and the 3D reconstruction model, and repeat steps 2-5 to obtain a new, iteratively optimized microbial agent delivery scheme;
Step 8-2: Following the iteratively optimized scheme, repeat steps 6-7 to carry out the next round of remediation and remediation effect testing;
Step 8-3: Verify the results: if the standard is met, proceed to subsequent environmental monitoring; if the standard is still not met, repeat steps 8-1 to 8-2 until the remediation effect meets the standard.
[0022] Compared with the prior art, the present invention has the following technical effects:
1) The invention constructs a closed-loop intelligent system of "high-fidelity sampling reconstruction - pollution attribution in the remediation area - site remediation decision-making - dynamic feedback optimization". It uses a knowledge graph of microbial agent remediation to match agents intelligently, and combines a multi-machine learning fusion model with a multi-objective optimization model to improve the accuracy of remediation effect prediction. It overcomes the reliance of traditional remediation on experience-based selection and matches agents to pollution characteristics, site conditions, and production conditions, which improves the compatibility and degradation efficiency of the agents.
2) Adopting a collaborative model of "production and remediation at the same time", the invention reduces interference with production facilities by using low-disturbance drilling and high-fidelity sampling. It also dynamically optimizes the sampling and delivery times according to production conditions, avoiding core production areas and peak periods. This resolves the conflict between traditional remediation and production operations, ensuring production continuity while reducing the impact of remediation on site operations.
3) By integrating deep learning spatial clustering, multi-machine learning fusion prediction, and multi-objective optimization, the invention achieves precise delineation of remediation area boundaries, targeted attribution of major pollutants, and delivery of the recommended microbial agents at precise points and in precise amounts, overcoming the vague remediation scope and blind delivery of traditional methods and reducing resource waste and remediation costs.
4) The invention establishes a dynamic optimization mechanism of "monitoring - feedback - iteration". By monitoring the remediation effect and updating the parameters of the knowledge graph and the 3D reconstruction model, the delivery plan is iteratively optimized and adapts dynamically. Combined with routine supervision and risk warning in the later stage, this keeps the remediation effect stable and free of rebound, filling the gap left by the lack of full-cycle dynamic control in the prior art.
5) The system is highly compatible and can be extended to different types of contaminated sites (such as production enterprises and industrial legacy sites) and different types of pollutants (heavy metals, organic matter, etc.). It does not require large-scale modification of production facilities, and its core technology modules can be adapted to different site conditions, giving it broad prospects for engineering application.
Description of the Drawings
[0023] The present invention will be further described below with reference to the accompanying drawings and embodiments:
Figure 1 is a block diagram of the overall system of the present invention;
Figure 2 is a schematic diagram of the high-fidelity sampling and reconstruction subsystem according to an embodiment of the present invention;
Figure 3 is a schematic diagram of the pollution attribution subsystem for the remediation area according to an embodiment of the present invention;
Figure 4 is a schematic diagram of the site remediation decision-making subsystem according to an embodiment of the present invention;
Figure 5 is a schematic diagram of the process of recommending microbial agents and verifying their effect according to an embodiment of the present invention;
Figure 6 is a flowchart of the multi-machine learning fusion model and the multi-objective optimization model according to an embodiment of the present invention.
Detailed Description
[0024] As shown in Figure 1, an intelligent microbial agent recommendation and remediation system and method for contaminated sites is applicable to sites with combined soil and groundwater contamination (pollutants include organic matter and heavy metals) at operating chemical enterprises, where remediation must be carried out while the production facilities remain in continuous operation. The system is based on a dynamic linkage architecture of "production conditions - pollution data - remediation decision". It connects a high-fidelity sampling and reconstruction subsystem, a pollution attribution subsystem for the remediation area, and a site remediation decision subsystem, forming a fully closed-loop operation system of "sampling and reconstruction - pollution attribution - decision and delivery - feedback optimization".
[0025] The intelligent microbial agent recommendation and remediation system for contaminated sites includes a high-fidelity sampling and reconstruction subsystem, a pollution attribution subsystem for the remediation area, and a site remediation decision-making subsystem. The output of the high-fidelity sampling and reconstruction subsystem is connected to the input of the pollution attribution subsystem, and the output of the pollution attribution subsystem is connected to the input of the site remediation decision-making subsystem. The output of the high-fidelity sampling and reconstruction subsystem is also connected to the input of the site remediation decision-making subsystem, and the output of the site remediation decision-making subsystem is connected back to the input of the high-fidelity sampling and reconstruction subsystem.
[0026] The high-fidelity sampling and reconstruction subsystem collects soil and groundwater samples from the contaminated site, completes the three-dimensional reconstruction of the subsurface and the preliminary extraction of pollution information, and outputs the subsurface information to the pollution attribution subsystem of the remediation area. It also receives remediation information fed back from the site remediation decision subsystem and dynamically optimizes the sampling strategy. The pollution attribution subsystem receives the subsurface information, optimizes the remediation area boundary, attributes the major pollutants, and outputs pollution information to the site remediation decision subsystem. Based on the subsurface information and the pollution information, the site remediation decision subsystem recommends microbial agents, optimizes the delivery plan, executes precise delivery, and outputs remediation information to the high-fidelity sampling and reconstruction subsystem to update the heterogeneous data fusion.
[0027] The high-fidelity sampling and reconstruction subsystem includes: a sampling point confirmation module, a low-disturbance drilling module, a high-fidelity sampling module, a sample quality control and screening module, a rapid detection module, a heterogeneous data fusion module, and an underground 3D reconstruction module. These modules are connected in series, with the output of each connected to the input of the next: sampling point confirmation module, low-disturbance drilling module, high-fidelity sampling module, sample quality control and screening module, rapid detection module, heterogeneous data fusion module, and underground 3D reconstruction module.
[0028] The sampling point confirmation module formulates a sampling plan based on production condition data, historical site pollution data, and previous monitoring data provided by the production database. The sampling plan includes sampling points, sampling time periods, sampling depth, and sampling frequency.
The low-disturbance drilling module carries out drilling with low-disturbance drilling technology according to the sampling plan output by the sampling point confirmation module. It is equipped with anti-disturbance drilling tools and pressure feedback devices to reduce the interference of drilling with the site soil structure, groundwater runoff, and production facilities, and to obtain soil and groundwater samples accurately at different depths.
The high-fidelity sampling module collects and preserves, with high fidelity, the soil and groundwater samples obtained by the low-disturbance drilling module, avoiding volatilization, oxidation, or cross-contamination of components during collection and ensuring that the samples are representative.
The sample quality control and screening module performs purity testing, contamination screening, and validity determination on the samples transported to the detection area by the high-fidelity sampling module. It eliminates interfered, damaged, or invalid samples through visual inspection, purity analysis, and blank control experiments to ensure the reliability of the detection data.
The rapid detection module rapidly determines the pollutant types and concentrations and the site physicochemical parameters of the valid samples selected by the sample quality control and screening module. Pollutant detection covers heavy metals, volatile organic compounds, and semi-volatile organic compounds; physicochemical parameter detection includes soil moisture content, porosity, pH value, redox potential, organic matter content, etc. The module also generates a detection data report.
The heterogeneous data fusion module integrates the detection data output by the rapid detection module with historical pollution data and site geological and hydrological data from the multi-source soil database. The data are processed through data cleaning, redundancy removal, spatiotemporal alignment, and format standardization.
The underground 3D reconstruction module combines the site geological stratification characteristics and groundwater runoff patterns output by the heterogeneous data fusion module. Using 3D modeling, it depicts the underground spatial distribution, concentration gradient, and diffusion trend of pollutants and their correlation with site geochemical parameters, generating a visualized underground 3D reconstruction model that identifies the core pollution area, the diffusion paths, and the potential impact range.
[0029] The pollution attribution subsystem for the remediation area includes: a deep learning module, a remediation area boundary optimization module, and a pollution attribution module. The output of the deep learning module is connected to the input of the remediation area boundary optimization module, and the output of the remediation area boundary optimization module is connected to the inputs of the deep learning module and the pollution attribution module.
The deep learning module performs deep learning spatial correlation modeling. It uses the pollution concentration data and site geomorphological parameters in the underground information as node features and key production factors as correlation edge weights, and through model training it explores the potential correlation between pollution distribution and production activities. Based on the output of the deep learning model, a spatial clustering algorithm is used to cluster the pollution concentration data, identify high-value pollution concentration clusters, pollution diffusion path clusters, and isolated pollution points, and preliminarily define the scope of pollution impact.
The remediation area boundary optimization module initially delineates the remediation area boundary based on data from the deep learning module, combined with the spatial clustering results and site topography, ensuring that the boundary covers all pollution cluster areas and potential diffusion paths. It then adjusts and optimizes the initial boundary according to the production facility layout provided by the production database, so that the boundary covers the actual pollution area without interfering with production operations.
The pollution attribution module, using the output of the remediation area boundary optimization module, calculates the pollution contribution of each pollutant through multi-factor comprehensive evaluation and related methods, and determines the weight of each pollutant's impact on site pollution. Based on the pollution contribution ranking, the environmental risk thresholds of the pollutants, and remediation feasibility, it selects the main pollutants that have the greatest impact on the site environment, pose the highest risk, and require priority remediation, and determines their concentration ranges and spatial distribution.
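The pollution contribution ranking performed by the pollution attribution module can be sketched as below. The sketch assumes that the contribution of pollutant $i$ is the sum over sampling points of $c_{ik} \cdot S_k \cdot w_{ik}$, normalized by the same sum taken over all pollutants; this form is inferred from the variable definitions in Step 3-6 and is an assumption, since the source formula is not available. The pollutant names and all numeric values are hypothetical.

```python
def pollution_contributions(conc, area, weight):
    """ASSUMED form:
        C_i = sum_k c_ik * S_k * w_ik / sum_i sum_k c_ik * S_k * w_ik

    conc[i][k]:   concentration of pollutant i at sampling point k
    area[k]:      pollution impact area of sampling point k
    weight[i][k]: production-activity influence weight for pollutant i at point k
    """
    raw = [
        sum(c * s * w for c, s, w in zip(conc_i, area, weight_i))
        for conc_i, weight_i in zip(conc, weight)
    ]
    total = sum(raw)
    return [r / total if total > 0 else 0.0 for r in raw]


def rank_pollutants(names, contributions):
    """Sort pollutants by contribution in descending order."""
    return sorted(zip(names, contributions), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    names = ["lead", "benzene"]              # hypothetical pollutants
    conc = [[1.0, 1.0], [3.0, 3.0]]          # two sampling points each
    area = [1.0, 1.0]
    weight = [[1.0, 1.0], [1.0, 1.0]]
    contributions = pollution_contributions(conc, area, weight)
    print(rank_pollutants(names, contributions))
```

The ranked list would then be filtered against the risk thresholds in GB36600-2018 and GB/T14848-2017 to pick the main pollutants; that screening step is not modeled here.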
[0030] The site remediation decision-making subsystem includes: a knowledge graph module, an intelligent recommendation algorithm module, a microbial agent effect testing module, a multi-machine learning fusion module, a multi-objective optimization module, a remediation scheme implementation module, and a remediation effect detection module. The output of the knowledge graph module is connected to the input of the intelligent recommendation algorithm module; the output of the intelligent recommendation algorithm module is connected to the input of the microbial agent effect testing module; the output of the microbial agent effect testing module is connected to the inputs of the knowledge graph module and the multi-machine learning fusion module; the data end of the multi-machine learning fusion module is bidirectionally connected to the data end of the multi-objective optimization module; the output of the multi-objective optimization module is connected to the input of the remediation scheme implementation module; the output of the remediation scheme implementation module is connected to the input of the remediation effect detection module; and the output of the remediation effect detection module is connected to the input of the knowledge graph module.
[0031] The knowledge graph module constructs a multi-dimensional structured association network for microbial agent remediation. With "entity - relationship - weight" as its core architecture, it establishes semantic associations between microbial agents and information such as pollution characteristics, site conditions, and production constraints.
The intelligent recommendation algorithm module, based on the pollution information, queries the microbial agent remediation knowledge graph constructed by the knowledge graph module, mines the matching relationship between microbial agents and pollution characteristics through the intelligent recommendation algorithm, makes an initial recommendation of candidate microbial agents, and generates a candidate agent list sorted by degree of matching.
The microbial agent effect testing module conducts small-scale laboratory tests on the agents in the candidate list recommended by the intelligent recommendation algorithm module. It simulates the actual environmental conditions of the site, sets up experimental groups with different agent concentrations and application ratios, and sets up a blank control group at the same time. During the experimental period, it regularly monitors the pollutant degradation rate, the survival concentration of the microbial agent, and the changes in soil physicochemical parameters in each group, screens out the optimal microbial agent with the best degradation effect, good survival, and no negative impact on the soil environment, and determines the initial application parameters.
The multi-machine learning fusion module constructs a multi-machine learning fusion model from base models such as Random Forest (RF), Gradient Boosting Tree (GBT), Support Vector Regression (SVR), and Deep Neural Network (DNN), which are combined through strategies such as dynamic weight allocation. The decision analysis dataset is then input into the fusion model to predict the pollutant degradation effect, the survival of the microbial agent, and the potential impact on production under different application schemes (location, type, dosage).
The multi-objective optimization module, working with the multi-machine learning fusion module, aims to maximize remediation efficiency, minimize remediation cost, and minimize production interference risk. Taking into account the characteristics of the microbial agent and the site constraints, it determines the optimal location, agent type, and delivery amount for each delivery point.
The remediation scheme implementation module delivers microbial agents to the target area through a fixed-point delivery device according to the optimized scheme output by the multi-objective optimization module, and records data such as delivery location coordinates, delivery amount, delivery time, and environmental parameters in real time.
The remediation effect detection module samples and tests different points in the remediation area at a preset cycle after the remediation scheme has been implemented, obtains data on changes in pollutant concentration, microbial agent survival concentration, and soil physicochemical parameters, and evaluates the remediation effect.
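The three objectives handled by the multi-objective optimization module (maximize efficiency, minimize cost, minimize production interference risk) can be illustrated with a brute-force non-dominated filter. This is not NSGA-III, which the source names as the solver; it only shows what a Pareto optimal set means for a handful of already evaluated candidate schemes. The scheme names and objective values are hypothetical.

```python
def dominates(a, b):
    """Return True if objective tuple a dominates b.

    Each tuple is (efficiency, cost, risk): efficiency is maximized,
    cost and risk are minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(schemes):
    """Keep the schemes that no other scheme dominates.

    schemes: list of dicts with keys "name" and "obj".
    """
    return [
        s for s in schemes
        if not any(dominates(o["obj"], s["obj"]) for o in schemes if o is not s)
    ]


if __name__ == "__main__":
    candidates = [
        {"name": "A", "obj": (0.90, 100.0, 0.20)},
        {"name": "B", "obj": (0.80, 120.0, 0.30)},  # worse than A on every objective
        {"name": "C", "obj": (0.70, 60.0, 0.10)},   # cheaper and safer, less efficient
    ]
    print([s["name"] for s in pareto_front(candidates)])
```

The surviving schemes represent the trade-offs among which the expert decision step would choose the initial remediation scheme.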
[0032] This invention also includes a method for intelligent recommendation of microbial agents for the remediation of contaminated sites. The method can serve as the operating method of the intelligent recommendation system described above, or it can be implemented independently. It is intended for operating chemical plant sites with combined soil and groundwater contamination (pollutants including organic matter and heavy metals), where remediation is carried out while the production facilities remain in continuous operation. The steps of this method are as follows:
Step 1: Receive real-time operating data from the production database, combine it with historical site pollution information, and formulate a low-disturbance sampling plan that specifies the sampling points, depth, time period, and safety protection measures, ensuring that sampling operations avoid the core production area and peak hours to reduce interference with production operations.
Step 2: Perform low-disturbance drilling and high-fidelity sampling according to the sampling plan, complete sample quality control screening, obtain pollution data and physicochemical parameters through rapid detection and data calibration, construct a three-dimensional underground reconstruction model based on multi-source heterogeneous data fusion, accurately depict the spatial distribution of pollution, and output underground information.
Step 3: Based on the underground information, identify the pollution range through deep learning modeling and spatial clustering, optimize the boundary of the remediation area in combination with production database analysis, determine the main pollutants by pollution attribution, and output pollution information including the boundary of the remediation area, pollutant type, and concentration.
Step 4: Based on the microbial agent remediation knowledge graph, recommend candidate agents in combination with the pollution information, and screen the optimal agent through small-scale laboratory tests.
Step 5: Construct an optimization model with the goal of "maximizing remediation efficiency, minimizing remediation cost, and minimizing production disruption risk" and solve for the optimal delivery scheme (location, type, dosage).
Step 6: During the preset safe delivery period, use precision delivery equipment to perform fixed-point, quantitative delivery according to the optimized plan, record delivery data in real time, and ensure that the delivery process is coordinated with the production conditions.
Step 7: Sample and test the remediation area at the preset cycle to obtain data such as pollutant concentration and microbial agent survival status, evaluate the remediation effect, and determine whether the preset remediation target has been achieved. If the target is met, the remediation is complete and the process moves to later-stage supervision; if the target is not met, the plan adjustment process is initiated.
Step 8: For remediation areas that do not meet the standard, feed back the reconstruction parameters, update the remediation information to the knowledge graph and 3D reconstruction model, reanalyze the pollution status of the remediation area, optimize the microbial agent delivery plan (adjust the location, type, and dosage), and repeat Steps 6 and 7 to perform the next round of remediation until the remediation meets the standard.
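The closed loop formed by Steps 6 to 8 can be illustrated with a minimal sketch. The function names, the `max_rounds` cap, and the stand-in round function are assumptions introduced purely for illustration; in practice each round would cover delivery, detection, and (on later rounds) re-planning.

```python
from typing import Callable, List


def run_remediation_rounds(
    initial_concentration: float,
    target_concentration: float,
    run_round: Callable[[float, int], float],
    max_rounds: int = 10,
) -> List[float]:
    """Repeat delivery and detection until the target is met or rounds run out.

    run_round(concentration, round_number) stands in for Steps 6-7 (plus the
    Step 8 re-planning on later rounds) and returns the newly measured
    concentration. The returned history starts with the initial value.
    """
    history = [initial_concentration]
    current = initial_concentration
    for round_number in range(1, max_rounds + 1):
        if current <= target_concentration:
            break  # target met: hand over to later-stage supervision
        current = run_round(current, round_number)
        history.append(current)
    return history


def halve_each_round(concentration: float, round_number: int) -> float:
    """Hypothetical stand-in: each round removes half of what remains."""
    return concentration / 2


history = run_remediation_rounds(6000.0, 1300.0, halve_each_round)
print(history)
```

The `max_rounds` cap is a design choice of this sketch, not part of the method as described; it only prevents an endless loop when a stand-in round function never reaches the target.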
[0033] Step 2 includes the following sub-steps:
Step 2-1: According to the sampling plan, use low-disturbance drilling equipment to carry out drilling operations, collect soil and groundwater samples with high fidelity, and record in real time the location coordinates, depth, sampling time, and corresponding production condition parameters (including production load and equipment operating status) during sampling.
Step 2-2: Verify the samples. This includes appearance verification to remove damaged or contaminated samples, purity verification to rule out cross-contamination through blank control experiments, and consistency verification to remove samples that exceed the reasonable deviation range. Retain the valid samples and label them.
Step 2-3: Use equipment including portable GC-MS, a photoionization detector (PID), an X-ray fluorescence spectrometer (XRF), and a membrane interface probe (MIP) to rapidly detect the pollutant types and concentrations in the valid samples, together with physicochemical parameters such as soil moisture content, porosity, pH value, redox potential, and organic matter content, and generate a test data report.
Step 2-4: Integrate the rapid detection data, production condition data, site geological and hydrological data, and similar sources, and preprocess them: data cleaning to remove outliers, redundancy removal to delete highly correlated duplicate parameters, spatiotemporal alignment to associate the different types of data, and standardization to unify the data range. The result is a structured dataset.
Step 2-5: Determine whether the dataset meets the requirements for 3D reconstruction. If not, supplement the sampling points accordingly and repeat Steps 2-1 to 2-4. If it does, perform 3D reconstruction.
Based on the structured dataset, combined with information such as the site's geological stratification characteristics and groundwater runoff patterns, a three-level technical route of "CT scanning - image segmentation - model building" is adopted to characterize the spatial distribution of pollutants. A three-dimensional underground reconstruction model is generated to identify the core pollution area, pollution plume range, and potential impact area, and underground information is output synchronously. The reconstruction includes the following steps:
Step 2-5-1: Perform CT tomography on the reconstructed soil sample. First remove image noise using the non-local means method, then apply a Gaussian high-pass filter to enhance the contrast between pores and the soil skeleton. The filtering formula is as follows:
$$g(x, y) = \sum_{k=-1}^{1} \sum_{l=-1}^{1} K(k, l)\, f(x + k,\, y + l)$$
In the formula, $g(x, y)$ is the filtered pixel value at coordinates $(x, y)$; $f(x, y)$ is the original pixel value; $K$ is the 3×3 filter kernel matrix; $k$ is the row index of the filter kernel matrix; $l$ is the column index of the filter kernel matrix; $K(k, l)$ is the kernel coefficient at position $(k, l)$ of the filter kernel matrix; and $f(x + k, y + l)$ is the pixel value of the original image at coordinates $(x + k, y + l)$.
Step 2-5-2: Using the trough between the two peaks of the gray-level histogram as the threshold, segment the soil into three parts: pores, organic matter, and soil skeleton. Use the marching cubes algorithm to generate a pore structure model, classify the pores into three classes according to pore volume quantiles, and calculate the equivalent diameter of each pore class:
$$d_i = \left( \frac{6 \bar{V}_i}{\pi} \right)^{1/3}$$
In the formula, $d_i$ is the equivalent diameter of the $i$-th pore class and $\bar{V}_i$ is the mean volume of the $i$-th pore class.
Step 2-5-3: Conduct compression tests on the reconstructed soil samples, plot load-void ratio curves, estimate the in-situ formation load, and define the compression coefficient $\alpha$ from the in-situ porosity $n_0$ and the reconstructed porosity $n_r$. Multiply the pore volume by $\alpha$ to restore the original pore morphology.
Step 2-5-4: Integrate the pollutant concentration data with the geological and hydrological parameters, and construct a spatial distribution model of pollutant concentration using the Kriging interpolation method. The formula is as follows:
$$Z(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i), \qquad \lambda_i = \frac{d_i^{-b}}{\sum_{j=1}^{n} d_j^{-b}}$$
In the formula, $Z(x_0)$ is the pollutant concentration at the unknown point; $Z(x_i)$ is the concentration at the $i$-th known sampling point; $\lambda_i$ is the interpolation weight ($d_i$ is the distance between the unknown point and the $i$-th sampling point, and $b$ is the distance attenuation coefficient); and $n$ is the number of samples.
Step 2-5-5: Simulate pollution diffusion trends under different production conditions, generate a visualized 3D underground reconstruction model, identify the core pollution area, pollution plume range, and potential impact areas, and output underground information synchronously.
The construction of the underground 3D reconstruction model requires parameter correction using a composite machine learning model. The specific steps are as follows:
Step 2-5-5-1: Using the measured porosity and organic matter volume fraction of preset sample points as training labels, and multi-color-space features and soil resistance parameters as inputs, construct an SVR+Lasso composite model.
Step 2-5-5-2: Use the mean squared error (MSE) as the loss function, as shown in the following formula:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2$$
In the formula, $Y_i$ is the measured value of the $i$-th sample, $\hat{Y}_i$ is the model prediction for the $i$-th sample, and $n$ is the number of samples.
Step 2-5-5-3: Combine the CNN model with the inverse distance weighting method to perform a secondary correction, and output the corrected porosity and the corrected organic matter volume fraction to improve the accuracy of the 3D reconstruction.
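The interpolation in Step 2-5-4 defines its weights through the distance to each sampling point and a distance attenuation coefficient b. A minimal sketch of that weighting is given below, assuming the common inverse-distance form (weight proportional to distance raised to the power -b). This is a simplification and not a full Kriging solver with a fitted variogram; the function name and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def interpolate_concentration(
    target: Point,
    samples: Sequence[Tuple[Point, float]],
    b: float = 2.0,
) -> float:
    """Estimate the concentration at `target` from (point, concentration) pairs.

    Assumed weighting: w_i = d_i ** (-b), normalised so the weights sum to 1.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, concentration in samples:
        distance = math.dist(target, point)
        if distance == 0.0:
            return concentration  # target coincides with a sampling point
        weight = distance ** (-b)
        weighted_sum += weight * concentration
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sampling points (x, y, depth in metres) and concentrations (mg/kg).
samples = [((1.0, 0.0, 0.0), 100.0), ((-1.0, 0.0, 0.0), 300.0)]
midpoint_value = interpolate_concentration((0.0, 0.0, 0.0), samples)
print(midpoint_value)
```

A larger b makes nearby sampling points dominate the estimate; the value would normally be tuned against held-out measurements.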
[0034] The sub-steps of Step 3 are as follows:
Step 3-1: Receive the underground 3D reconstruction model information and construct a deep learning spatial correlation model. The specific steps are as follows:
Step 3-1-1: Extract pollution data from the underground 3D reconstruction model. Each sampling point is treated as a node, and the feature vector of the $k$-th node is defined as $\mathbf{f}_k = [c_{1k}, \dots, c_{mk}, p_{1k}, \dots, p_{tk}, x_k, y_k, z_k]$, where $c_{mk}$ is the concentration of the $m$-th pollutant at the $k$-th node, $p_{tk}$ is the value of the $t$-th soil physicochemical parameter at the $k$-th node, and $(x_k, y_k, z_k)$ are the spatial coordinates of the $k$-th node. The features of all nodes form a feature matrix $X \in \mathbb{R}^{n \times d}$ ($n$ is the number of nodes and $d$ is the feature dimension).
Step 3-1-2: Construct edge weights based on the spatial distance between nodes and the correlation of their pollution concentrations. First, calculate the Euclidean distance between nodes, $d_{ij} = \lVert \mathbf{s}_i - \mathbf{s}_j \rVert_2$ ($d_{ij}$ is the three-dimensional Euclidean distance between the $i$-th node and the $j$-th node, $\mathbf{s}_i$ is the three-dimensional spatial coordinate of the $i$-th node, and $\mathbf{s}_j$ is that of the $j$-th node). Then combine it with the concentration correlation $\rho_{ij} = \mathrm{cov}(c_i, c_j) / (\sigma_i \sigma_j)$ ($\rho_{ij}$ is the correlation coefficient of pollutant concentration between the $i$-th and $j$-th nodes, $\mathrm{cov}(c_i, c_j)$ is the covariance of their pollutant concentrations, and $\sigma_i$ and $\sigma_j$ are the standard deviations of pollutant concentration at the $i$-th and $j$-th nodes). The edge weight $w_{ij}$ between the $i$-th and $j$-th nodes is determined from the Euclidean distance $d_{ij}$, the maximum node distance $d_{\max}$, and the absolute value of the concentration correlation coefficient $|\rho_{ij}|$. Construct the adjacency matrix $A = [w_{ij}] \in \mathbb{R}^{n \times n}$ ($A$ is the adjacency matrix between nodes and $n$ is the total number of nodes), and then perform symmetric normalization $\tilde{A} = D^{-1/2} A D^{-1/2}$ ($D$ is the degree matrix, whose diagonal element in row $i$ and column $i$ is $D_{ii} = \sum_j A_{ij}$).
Step 3-1-3: Use two graph convolutional layers to extract the spatial correlation features of the pollution. The first layer is computed from the normalized adjacency matrix and the feature matrix obtained in the preceding sub-steps of Step 3-1, and the second layer outputs the final node feature matrix. Training uses a contrastive loss to enhance the feature discrimination of areas with high pollution concentrations. The contrastive loss $L_{con}$ is computed over all $n$ nodes from the cosine similarity $\mathrm{sim}(\cdot, \cdot)$ between the original feature vector $\mathbf{z}_i$ of the $i$-th node and its enhanced feature vector $\mathbf{z}_i'$.
Step 3-2: Based on the output feature matrix, use the K-means++ algorithm for clustering (that is, perform spatial cluster analysis on the pollution concentration data with the clustering parameters set accordingly) to identify high-value pollution concentration clusters, pollution diffusion path clusters, and isolated pollution points, and to preliminarily define the scope of pollution impact.
Step 3-3: Combining the spatial clustering results with site topography information, initially delineate the boundary of the remediation area using a polygon fitting method.
Step 3-4: Use data from the production database, such as the extent of the core production area, the routing of key pipelines, and the boundary of the explosion-proof area, to optimize the initial boundary, eliminate overlapping parts, and expand the edge areas with a high risk of pollution spread.
Step 3-5: Verify the validity of the boundary to ensure that the boundary of the remediation area covers the actual pollution range. If the boundary is valid, proceed to pollution attribution for the remediation area; if the boundary is invalid, readjust the deep learning parameters and repeat Steps 3-1 to 3-5.
Step 3-6: Based on the attribution analysis dataset, calculate the pollution contribution using a multi-factor comprehensive evaluation method. The calculation formula is as follows:
$$C_i = \frac{\sum_{k=1}^{n} c_{ik}\, S_k\, w_{ik}}{\sum_{i=1}^{m} \sum_{k=1}^{n} c_{ik}\, S_k\, w_{ik}} \times 100\%$$
In the formula, $C_i$ is the pollution contribution of the $i$-th pollutant; $c_{ik}$ is the concentration of the $i$-th pollutant at the $k$-th sampling point; $S_k$ is the pollution impact area corresponding to the $k$-th sampling point; $w_{ik}$ is the influence weight of production activities for the $i$-th pollutant at the $k$-th sampling point; $m$ is the total number of pollutant types; and $n$ is the number of sampling points. The pollutants are ranked by pollution contribution in descending order and, combined with the environmental risk thresholds and remediation feasibility in standards such as the "Soil Environmental Quality Construction Land Soil Pollution Risk Control Standard (GB 36600-2018)" and the "Groundwater Quality Standard (GB/T 14848-2017)", the main pollutant types, concentration ranges, and pollution priorities are determined, and pollution information is output.
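The symmetric normalization of the adjacency matrix in Step 3-1-2 can be sketched as follows. This assumes the standard form in which each entry is divided by the square roots of the two node degrees; the example edge weights are hypothetical, and nodes with zero degree are left as all-zero rows, which is a choice made for this sketch only.

```python
import math
from typing import List

Matrix = List[List[float]]


def symmetric_normalize(adjacency: Matrix) -> Matrix:
    """Return D^(-1/2) * A * D^(-1/2), where D holds the row sums of A."""
    size = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    # Isolated nodes (degree 0) get a zero scaling factor instead of dividing by zero.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degrees]
    return [
        [inv_sqrt[i] * adjacency[i][j] * inv_sqrt[j] for j in range(size)]
        for i in range(size)
    ]


# Hypothetical edge weights between three sampling-point nodes plus one isolated node.
adjacency = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
normalized = symmetric_normalize(adjacency)
for row in normalized:
    print([round(value, 4) for value in row])
```

The normalized matrix stays symmetric when the input is symmetric, which keeps the aggregation in a graph convolutional layer from being dominated by high-degree nodes.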
[0035] As shown in Figure 5, Step 4 is the execution flow for recommending and validating the microbial agent, which includes the following sub-steps:
Step 4-1: Construct a knowledge graph of microbial agent remediation. The specific steps are as follows:
Step 4-1-1: Select core nodes, including microbial agent nodes $a$ (with attributes such as agent type, degradation type, and suitable pH value), pollutant nodes $b$ (with attributes such as pollutant type, chemical structure, and concentration threshold), site nodes (with attributes such as soil texture, porosity, and moisture content), production constraint nodes (with attributes such as production load periods and safety windows), and other nodes.
Step 4-1-2: Establish core relationship edges, including the "degradation relationship" (the ability of a microbial agent to degrade a pollutant), the "adaptation relationship" (the degree to which a microbial agent suits the site conditions), and the "compatibility relationship" (the compatibility of a microbial agent with production conditions).
Step 4-1-3: Based on historical and experimental data, calculate edge weights using correlation analysis. The weight $w_{ij}^{s}$ of relation $s$ between node $i$ and node $j$ of the knowledge graph is computed from the frequency of occurrence $f_s$ of the relation and the attribute similarity $\mathrm{sim}_s$ of the relation, with values in the range [0, 1].
Step 4-1-4: Combine nodes and relation edges into tuples (head, relation, tail, weight), for example ($a_1$, $s_1$, $b_3$, 0.85), forming a structured knowledge graph $G = (V, S, W)$ ($V$ is the node set, $S$ is the relation edge set, and $W$ is the weight set).
Step 4-1-5: Use a supervised random walk algorithm to mine potential associations and define path features.
Calculate the path feature value $X_{a,p}(b)$, the probability of reaching pollutant node $b$ from microbial agent node $a$ along path $p$, which is updated according to the following rule:
$$X_{a,p}(e) = \sum_{e' \in \mathrm{range}(p')} X_{a,p'}(e') \cdot \frac{I(e', e)}{N(e')}$$
In the formula, $e$ is a node in the path; $p'$ is the sub-path of $p$; $\mathrm{range}(p')$ is the set of all nodes covered by sub-path $p'$; $I(e', e)$ takes the value 1 if there is a corresponding edge between node $e'$ and $e$, and 0 otherwise; and $N(e')$ is the total number of nodes that can be associated with node $e'$.
Step 4-2: Based on the microbial agent remediation knowledge graph, use an intelligent recommendation algorithm to mine the compatibility between microbial agents and pollution characteristics, and generate a candidate microbial agent list sorted by degree of compatibility. The specific steps are as follows:
Step 4-2-1: Use the TransE algorithm to map all nodes in the knowledge graph to low-dimensional embedding vectors. The embedding vector of a microbial agent node is $\mathbf{v}_a$, the embedding vector of a pollutant node is $\mathbf{v}_b$, and the embedding vector of a relation edge is $\mathbf{r}$, satisfying the constraint $\mathbf{v}_a + \mathbf{r} \approx \mathbf{v}_b$.
Step 4-2-2: Taking the main pollutant node $b^{*}$ in the current pollution information as the target, calculate the relevance score between each microbial agent node and $b^{*}$. The relevance score $\mathrm{Rel}_i$ of the $i$-th microbial agent node $a_i$ with the main pollutant node $b^{*}$ is computed from the low-dimensional embedding vector $\mathbf{v}_{a_i}$ of the agent node and its transpose $\mathbf{v}_{a_i}^{\top}$, the embedding vector $\mathbf{r}_d$ of the "degradation relationship", and the embedding vector dimension $k$; the denominator of the formula is the normalization term over the relevance scores of all microbial agent nodes.
Step 4-2-3: Using the site node (current site conditions) and the production constraint node as constraint preferences, perform a second round of preference propagation to calculate the constraint fit score:
$$\mathrm{Fit}_i = \alpha \cos\left(\mathbf{v}_{a_i}, \mathbf{v}_{site}\right) + \beta \cos\left(\mathbf{v}_{a_i}, \mathbf{v}_{prod}\right)$$
In the formula, $\mathrm{Fit}_i$ is the constraint fit score of the $i$-th microbial agent; $\cos(\cdot, \cdot)$ is the cosine similarity; $\mathbf{v}_{a_i}$, $\mathbf{v}_{site}$, and $\mathbf{v}_{prod}$ are the embedding vectors of the $i$-th agent node, the site node, and the production constraint node; $\alpha$ is the site adaptation weight; and $\beta$ is the production compatibility weight.
Step 4-2-4: Combine the relevance score and the constraint fit score to obtain the overall fitness of the microbial agent:
$$\mathrm{Score}_i = \gamma\, \mathrm{Rel}_i + (1 - \gamma)\, \mathrm{Fit}_i$$
In the formula, $\mathrm{Score}_i$ is the overall fitness of the $i$-th microbial agent, $\gamma$ is the degradation capacity weight, and $\mathrm{Rel}_i$ is the relevance score between the $i$-th microbial agent and the main pollutant. Sort the microbial agents by overall fitness in descending order and select the top 3-5 agents to generate the candidate microbial agent list.
Step 4-3: Conduct small-scale laboratory tests on the microbial agents in the candidate list, simulating the actual environmental conditions of the site, setting up experimental groups with different agent concentrations and application ratios, and setting up a blank control group at the same time.
Step 4-4: During the experimental period, periodically test the pollutant degradation rate, the survival concentration of the microbial agent, and changes in soil physicochemical parameters for each group, and screen out the 1-2 optimal microbial agents that show the best degradation effect, good survival status, and no negative impact on the soil environment.
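The embedding-based ranking in Step 4-2 can be illustrated with a small sketch built on the TransE constraint that the agent embedding plus the relation embedding should land close to the pollutant embedding. Here the plain TransE distance is used as the ranking score, which is a stand-in and not the relevance formula of Step 4-2-2; all node names and embedding values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def transe_distance(head: Vector, relation: Vector, tail: Vector) -> float:
    """Euclidean norm of (head + relation - tail); smaller means a better fit."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_agents(
    agents: Dict[str, Vector],
    relation: Vector,
    pollutant: Vector,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k agents with the smallest TransE distance to the pollutant."""
    scored = [
        (name, transe_distance(vector, relation, pollutant))
        for name, vector in agents.items()
    ]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Hypothetical 2-D embeddings; real embeddings would be learned from the graph.
agents = {"agent_A": [0.0, 0.0], "agent_B": [1.0, 1.0], "agent_C": [3.0, 4.0]}
degradation_relation = [1.0, 0.0]
main_pollutant = [1.0, 0.0]

candidates = rank_agents(agents, degradation_relation, main_pollutant, top_k=2)
print(candidates)
```

In a fuller pipeline this distance-based score would be combined with the site and production constraint scores before the candidate list is cut to the top 3-5 agents.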
[0036] As shown in Figure 6, Step 5 is the execution flow of model prediction, which includes the following sub-steps:
Step 5-1: Integrate pollution information, site parameters, small-scale test data, and similar sources, and form a structured decision analysis dataset by unifying the data format and aligning the data in time and space. Divide the dataset into a training set and a test set according to a reasonable ratio.
Step 5-2: Construct a multi-machine learning fusion model, selecting Random Forest (RF), Gradient Boosting Tree (GBT), Support Vector Regression (SVR), and Deep Neural Network (DNN) as base models. Using the training set of the decision dataset as input, optimize the hyperparameters of each base model using a grid search method. Employ a dynamic weight allocation ensemble strategy that allocates weights according to the coefficient of determination of each base model on the validation set, as shown in the following formula:
$$\hat{y} = \sum_{m=1}^{M} w_m\, \hat{y}_m, \qquad w_m = \frac{R_m^2}{\sum_{j=1}^{M} R_j^2}$$
In the formula, $\hat{y}$ is the final predicted value of the fusion model; $w_m$ is the dynamic weight of the $m$-th base model ($R_m^2$ is the coefficient of determination of the $m$-th base model); $\hat{y}_m$ is the output value of the $m$-th base model; and $M$ is the total number of base models. The prediction accuracy of the model is verified on the test set and evaluated using the coefficient of determination and the mean absolute percentage error. The model inputs are "pollutant concentration, site parameters, microbial agent type, delivery point coordinates, and delivery amount", and the outputs are "degradation efficiency, microbial agent survival time, and production interference level".
Step 5-3: Construct and solve a multi-objective optimization model with the objectives of "maximizing remediation efficiency, minimizing remediation cost, and minimizing production disruption risk".
Decision variables: delivery point coordinates $(x_i, y_i, z_i)$, microbial agent type ($T$), and delivery amount ($q_i$).
Objective functions:
$$\max E, \qquad \min C_{total} = \sum_{i=1}^{N} \left( C_T\, q_i + C_D\, d_i \right) + C_F, \qquad \min R$$
In the formula, $E$ is the overall remediation efficiency (determined from the initial concentration $c_{i0}$ at the $i$-th sampling point, the concentration after remediation $c_{if}$, the effective bacterial dosage, and the delivery amount $q_i$); $C_{total}$ is the total remediation cost ($C_T$ is the unit cost of the microbial agent, $C_D$ is the unit delivery cost, $d_i$ is the delivery distance to the $i$-th delivery point, and $C_F$ is the fixed cost of the delivery equipment); $R$ is the production disruption risk (determined from the minimum distance $d_{i\text{-}pro}$ between the $i$-th delivery point and the production facility, the proportion of time $t_{overlap}$ during which delivery periods overlap with peak production times, and the associated weighting coefficients); and $N$ is the number of delivery points.
Constraints: the delivery amount at each point satisfies $q_{\min} \le q_i \le q_{\max}$; no delivery point lies in the production restricted area, $(x_i, y_i, z_i) \notin D_{forbid}$; the remediation efficiency is not lower than the minimum remediation efficiency threshold; and the total amount of microbial agent does not exceed $Q_{total}$. In these constraints, $q_i$ is the amount of microbial agent applied at the $i$-th delivery point; $q_{\max}$ and $q_{\min}$ are the upper and lower limits of the amount delivered at a single point; $(x_i, y_i, z_i)$ are the three-dimensional spatial coordinates of the $i$-th delivery point; $D_{forbid}$ is the production restricted area; $S_i$ is the pollution impact coefficient of the $i$-th delivery point; and $Q_{total}$ is the upper limit of the total amount of microbial agent. The NSGA-Ⅲ algorithm is used to solve the model and obtain the Pareto optimal solution set, and the scheme with the best overall performance is selected as the initial remediation scheme in combination with expert decision-making.
Step 5-4: Refine the initial remediation scheme into a detailed plan that includes "delivery point coordinates, microbial agent type, and delivery amount", and review it to form the final remediation plan.
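The dynamic weight allocation in Step 5-2 can be sketched as below, assuming each base model's weight is its validation coefficient of determination divided by the sum over all base models. Clipping negative scores to zero is an extra assumption of this sketch, and the scores and predictions are hypothetical numbers, not results from the embodiments.

```python
from typing import List, Sequence


def dynamic_weights(r2_scores: Sequence[float]) -> List[float]:
    """Turn validation R^2 scores into weights that sum to 1.

    Negative scores are clipped to 0 (assumption) so that a model doing worse
    than a constant predictor contributes nothing.
    """
    clipped = [max(score, 0.0) for score in r2_scores]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("at least one base model needs a positive R^2")
    return [score / total for score in clipped]


def fuse_predictions(predictions: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the base-model outputs."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical validation R^2 for RF, GBT, SVR, DNN and their predicted
# degradation efficiencies (%) for one delivery scheme.
r2_scores = [0.5, 0.25, 0.25, -0.1]
base_predictions = [80.0, 60.0, 40.0, 100.0]

weights = dynamic_weights(r2_scores)
fused = fuse_predictions(base_predictions, weights)
print(weights, fused)
```

Because the weights are recomputed whenever the validation scores change, retraining after a new remediation round automatically shifts influence toward the base models that currently fit best.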
[0037] Step 7 includes the following sub-steps:
Step 7-1: Based on the boundary of the remediation area, the characteristics of the contamination distribution, and the delivery plan, formulate a remediation effect testing plan.
Step 7-2: According to the testing plan, adopt low-disturbance drilling and high-fidelity sampling to avoid interfering with production, and conduct rapid testing on the collected samples to obtain data on the various test indicators.
Step 7-3: Based on the test data report, assess the degradation effect by calculating the remediation efficiency = (initial concentration - current concentration) / initial concentration × 100%; assess the survival of the microbial agent by analyzing the correlation between its survival concentration and the remediation efficiency, and determine the suitability of the microbial agent; assess production interference by confirming that the remediation operation has had no negative impact on production conditions or equipment operation; and compare the results with the remediation target to determine whether the remediation effect meets the standard.
Step 7-4: If the standard is met, the current stage of remediation is deemed qualified, a remediation effect report is generated, the remediation is completed, and the process moves to later-stage environmental monitoring. If the standard is not met, the reasons for non-compliance are analyzed, the next round of remediation continues, and the feedback optimization process in Step 8 is initiated.
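The efficiency calculation and target check in Step 7-3 can be written out as a short sketch. The function names and the sample concentrations are hypothetical and chosen only to show the arithmetic.

```python
def remediation_efficiency(initial_concentration: float, current_concentration: float) -> float:
    """Return (initial - current) / initial as a percentage."""
    if initial_concentration <= 0:
        raise ValueError("initial concentration must be positive")
    # Multiply before dividing to keep round numbers exact in floating point.
    return (initial_concentration - current_concentration) * 100 / initial_concentration


def meets_target(current_concentration: float, target_concentration: float) -> bool:
    """The standard is met once the measured concentration is at or below the target."""
    return current_concentration <= target_concentration


# Hypothetical measurements in mg/kg.
efficiency = remediation_efficiency(6000.0, 1200.0)
print(efficiency, meets_target(1200.0, 4500.0))
```

A negative result from `remediation_efficiency` would indicate that the measured concentration rose, for example through rebound or continued leakage, which is one of the situations the Step 7-4 cause analysis is meant to catch.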
[0038] Step 8 includes the following sub-steps:
Step 8-1: Update the remediation information to the microbial agent remediation knowledge graph and the 3D reconstruction model, and repeat Steps 2 to 5 to obtain an iteratively optimized microbial agent delivery scheme for the next round.
Step 8-2: Following the iteratively optimized scheme, repeat Steps 6 and 7 to perform the next round of remediation and remediation effect detection.
Step 8-3: Verify the results. If the standard is met, proceed to subsequent environmental monitoring; if the standard is still not met, repeat Steps 8-1 and 8-2 until the remediation effect meets the standard.
[0039] To verify the effectiveness of the system and method of the present invention, two embodiments will be described in further detail below, using actual contaminated sites. It should be noted that, unless otherwise specified, the embodiments and features described herein can be combined with each other.
[0040] The main difference between Example 2 and Example 1 is that, unlike Example 1 which treats contaminated soil, Example 2 treats contaminated groundwater.
[0041] Example 1: This example concerns a large-scale petrochemical production facility where point-source leaks from petrochemical product pipelines during long-term production have contaminated the soil with petroleum hydrocarbons (C10-C40, including n-alkanes and polycyclic aromatic hydrocarbons). The soil type is silty clay (permeability coefficient 1.2 × 10⁻⁶ m³/s). The contaminated area is adjacent to the tank foundation, and remediation must be carried out without affecting the normal operation of the tank while bringing all petroleum hydrocarbon components into compliance with standards. The specific implementation process is as follows: S1 receives real-time operating data from the production database and, combined with historical petroleum hydrocarbon pollution data from the site, formulates a low-disturbance sampling plan: sampling points are set up to avoid the foundations of storage tanks and oil pipelines, with a sampling depth of 4 m; sampling is conducted on weekends at a frequency of once every 7 days; and anti-disturbance drilling tools are used to reduce damage to the soil structure.
[0042] S2 performed low-disturbance drilling and high-fidelity sampling according to the sampling plan. After sample quality control screening, one damaged sample was removed, and five valid samples were retained. The rapid detection module determined the total concentration of petroleum hydrocarbons (C10-C40) to be 5800-6500 mg / kg (of which n-alkanes accounted for 72%, with a concentration of 3956-4460 mg / kg; polycyclic aromatic hydrocarbons accounted for 28%, with a concentration of 1844-2040 mg / kg), which met the engineering premise of "remediation is required for values exceeding the screening value". The physicochemical parameters were: soil porosity 38%, pH 7.1-7.3, organic matter content 2.1%, and redox potential 230 mV. The heterogeneous data fusion module integrated the detection data and site geological data, and formed a structured dataset after redundancy removal and standardization. The underground 3D reconstruction module generated a model, which identified a core pollution area of 150 m², with a high degree of spatial overlap between n-alkanes and polycyclic aromatic hydrocarbons. The diffusion path extended longitudinally along the soil pores, and the potential impact range did not involve the tank foundation.
[0043] Based on underground information, the S3 deep learning modeling module uses the total concentration of petroleum hydrocarbons, the concentration of each component, and soil physicochemical parameters as node features, and the crude oil leakage amount as the associated edge weight. Through spatial clustering, it identifies one cluster area for the pollution diffusion path. The remediation area boundary definition and optimization module, combined with the tank layout, avoids the tank foundation protection zone and optimizes the remediation area boundary to 138m², ensuring coverage of the pollution range of all components of petroleum hydrocarbons. The pollution attribution module calculates that n-alkanes contribute 72% to the pollution and polycyclic aromatic hydrocarbons contribute 28%, determining n-alkanes as the main pollutant and polycyclic aromatic hydrocarbons as the secondary pollutant. Both components are concentrated in the area 8-12m around the leak point. The degradation characteristics and spatial distribution of each component are considered simultaneously when formulating the comprehensive remediation plan.
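The contribution shares produced by the pollution attribution module can be illustrated with a small sketch. It assumes that each pollutant's contribution is the sum over sampling points of concentration × impact area × production-activity weight, normalised across all pollutants. The input numbers below are hypothetical and are not the measurements of this example.

```python
from typing import List, Sequence


def pollution_contributions(
    concentrations: Sequence[Sequence[float]],
    impact_areas: Sequence[float],
    production_weights: Sequence[Sequence[float]],
) -> List[float]:
    """Return each pollutant's share of the total weighted pollution load.

    concentrations[i][k] and production_weights[i][k] refer to pollutant i at
    sampling point k; impact_areas[k] is the area attributed to point k.
    """
    loads = [
        sum(c * s * w for c, s, w in zip(conc_row, impact_areas, weight_row))
        for conc_row, weight_row in zip(concentrations, production_weights)
    ]
    total = sum(loads)
    if total == 0:
        raise ValueError("total pollution load is zero")
    return [load / total for load in loads]


# Hypothetical data: two pollutants, two sampling points.
concentrations = [[30.0, 30.0], [10.0, 10.0]]
impact_areas = [1.0, 1.0]
production_weights = [[1.0, 1.0], [1.0, 1.0]]

shares = pollution_contributions(concentrations, impact_areas, production_weights)
print(shares)
```

Sorting the shares in descending order gives the main and secondary pollutants in the sense used above.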
[0044] S4 invokes the microbial agent remediation knowledge graph and recommends three candidate agent combinations based on the characteristics of the petroleum hydrocarbon complex pollution, as shown in the table below (all satisfying a viable bacteria concentration ≥ 1 × 10⁹ CFU/mL):
[0045] A small-scale laboratory experiment simulated the soil environment of the site (22% moisture content, 38% porosity, and 25°C), monitoring for 18 days to obtain the degradation and survival effects of three candidate microbial agents: Combination 1 showed the highest degradation rates for both petroleum hydrocarbons and n-alkanes, making it suitable for the pollution characteristics of n-alkanes as the main pollutant in this embodiment; although Combination 2 had a slightly lower degradation rate than Combination 1, it had the highest survival concentration of the microbial agent, demonstrating better environmental adaptability potential; Combination 3 showed weaker degradation and survival performance than the other two. Combinations 1 and 2 were selected as the optimal candidate microbial agents, providing data support for subsequent multi-machine learning models.
[0046] S5 integrates multi-source data to train a multi-machine fusion model to predict the degradation effect of each component, the survival of microbial agents, and the risk of production interference under different delivery schemes. An optimization model is constructed with the goal of "maximizing the remediation efficiency of all components of petroleum hydrocarbons, minimizing the remediation cost, and minimizing the risk of production interference", and the NSGA-Ⅲ algorithm is used to solve it.
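The idea of a Pareto optimal solution set for the three objectives can be shown with a plain non-dominated filter. This is not the NSGA-Ⅲ algorithm itself, only the dominance test that any Pareto-based solver relies on; the candidate schemes and their objective values are hypothetical.

```python
from typing import List, Tuple

# (remediation efficiency in %, cost, production interference risk)
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b on every objective and better on one.

    Efficiency is maximised; cost and risk are minimised.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(schemes: List[Objectives]) -> List[Objectives]:
    """Keep only the schemes that no other scheme dominates."""
    return [
        scheme
        for scheme in schemes
        if not any(dominates(other, scheme) for other in schemes)
    ]


# Hypothetical predicted outcomes for four delivery schemes.
schemes = [
    (80.0, 100.0, 0.2),
    (70.0, 120.0, 0.3),  # worse than the first scheme on all three objectives
    (85.0, 150.0, 0.1),
    (60.0, 90.0, 0.25),
]
front = pareto_front(schemes)
print(front)
```

The remaining schemes are trade-offs between efficiency, cost, and risk, and choosing among them is left to the expert decision-making step.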
[0047] Prediction results show that, with a dosage of 150 L / point, combination 1 can reduce the concentration of petroleum hydrocarbons to 1150 mg / kg in 135 days, demonstrating better overall performance than combination 2 (combination 2 requires 150 days to reduce to 1250 mg / kg). Combination 1 was ultimately selected as the final inoculant, with the optimal delivery scheme being 5 delivery points at a dosage of 150 L / point.
[0048] During the preset safe delivery period, S6 performs fixed-point, quantitative delivery through precision delivery equipment and records delivery data in real time: the coordinate deviation of the delivery points is ≤ 0.3 m, the actual delivery amount is 148-151 L/point, and there is no production interference during the delivery process.
[0049] S7 conducted sampling and testing of the remediation area according to a preset cycle, simultaneously monitoring the total concentration of petroleum hydrocarbons and the concentration of each component. The data on the changes in pollutant concentrations during the remediation process are shown in the table below:
[0050] During the initial remediation period (0-45 days), the microorganisms adapt and their degradation efficiency is relatively low. The middle period (45-90 days) is the active microbial period, with the largest decrease in pollutant concentration. In the later period (90-135 days), the pollutant concentration stabilizes, decreasing to 1100-1300 mg / kg, reaching the preset risk control target (below the screening value of 4500 mg / kg), and the concentration of the microbial agent remains at an effective level, inhibiting pollutant rebound. The remediation is then deemed to have met engineering control requirements, and the process transitions to post-remediation monitoring.
[0051] Example 2: This implementation involves a contaminated site in operation where groundwater is contaminated with a combination of vinyl chloride and chloroform. In-situ remediation is required while ensuring continuous production. The contaminated area is close to the production water supply system, necessitating strict control over groundwater runoff during the remediation process, while simultaneously achieving the efficient degradation of both pollutants. The specific implementation process is as follows: S1 receives real-time operating data from the production database and, combined with historical chlorinated hydrocarbon pollution data from the site, formulates a low-disturbance sampling plan: sampling points avoid core production units and water supply wells, five sampling points are set up, the sampling depth is 5m, the sampling period is selected from 2:00 AM to 5:00 AM, the sampling frequency is once every three days, and it is equipped with anti-disturbance drilling tools and pressure feedback devices.
[0052] S2 employed low-disturbance drilling equipment to conduct drilling operations according to the sampling plan, collecting high-fidelity groundwater samples. After sample quality control screening, two interfering samples were removed, retaining three valid samples. Portable GC-MS and PID rapid detection revealed: vinyl chloride concentration 130-170 μg / L, chloroform concentration 280-330 μg / L; physicochemical parameters: pH 6.8-7.2, redox potential 280-310 mV, water content 26%, and groundwater runoff velocity 0.3 m / d. A heterogeneous data fusion module integrated the detection data with site hydrogeological data, forming a standardized dataset through data cleaning and spatiotemporal alignment. A groundwater 3D reconstruction module generated a visual model, clearly identifying a core pollution area of approximately 200 m², with the diffusion path extending along the groundwater runoff direction, and the potential impact area covering three production auxiliary facilities.
[0053] Based on the underground 3D reconstruction information, the S3 deep learning modeling module uses the concentration data and physicochemical parameters of the two pollutants as node features and the discharge intensity of production wastewater as the associated edge weight. Through spatial clustering, two high-value pollution clusters are identified. The remediation area boundary definition and optimization module, taking the layout of production facilities into account, avoids production water supply wells and pipelines and adjusts the remediation area boundary to 180 m² to ensure coverage of all polluted areas and diffusion paths of the two pollutants. The pollution attribution module calculates that vinyl chloride pollution contributes 68% and chloroform pollution contributes 32%, and identifies chloroform as the primary pollutant (concentrated in the area 10-15 m downstream of the production wastewater collection pond) and vinyl chloride as the secondary pollutant (distributed around the chloroform pollution area). Simultaneously, the spatial overlap range and independent diffusion areas of the two pollutants are clarified, providing a basis for the formulation of a comprehensive remediation plan.
[0054] S4 invokes the microbial agent remediation knowledge graph and recommends three candidate agents based on the characteristics of the combined chlorinated hydrocarbon pollution, as shown in the table below (all satisfying a viable bacteria concentration ≥ 1 × 10⁹ CFU / mL):
[0055] A small-scale laboratory experiment simulated the groundwater environment of the site (pH 7.0, temperature 22℃, dissolved oxygen 1.2 mg / L). Three experimental groups and one control group were set up. Monitoring for 14 days revealed that combination 2 exhibited a slightly higher degradation rate for vinyl chloride than combination 1, but combination 1 showed a superior degradation rate for chloroform and the highest bacterial concentration, demonstrating stronger overall adaptability to the degradation of both pollutants. Combination 3 showed the lowest degradation rates for both pollutants. Based on the characteristics of the combined vinyl chloride and chloroform pollution in this embodiment, combinations 1 and 2 were selected as the optimal candidate bacterial agents, providing fundamental data for subsequent multi-objective optimization models.
[0056] S5 integrates multi-source data to train a multi-machine learning fusion model, inputs a decision analysis dataset, and predicts the degradation effect of two pollutants, the survival status of the microbial agents, and the risk of production interference under different microbial agents and different delivery schemes. A multi-objective optimization model is constructed with the goal of "maximizing remediation efficiency, minimizing remediation cost, and minimizing production interference risk", and the NSGA-Ⅲ algorithm is used to solve it.
[0057] The prediction results show that, under the same dosage, combination 1 has a 12% higher overall degradation efficiency for both pollutants than combination 2, a 15% improvement in the survival stability of the microbial agent, and an 8% reduction in remediation costs. Combination 1 was ultimately selected as the final microbial agent, and a comprehensive remediation plan was developed: 3 application points, with a dosage of 200L / point, ensuring simultaneous coverage of the pollution range of both pollutants.
[0058] During the preset safe delivery period, S6 uses precision delivery equipment to perform fixed-point, quantitative delivery according to the optimized plan and records delivery data in real time: the coordinate deviation of each point is ≤0.5 m, the actual delivery amount is 198-203 L / point, there are no abnormal fluctuations in production during delivery, and the delivery does not interfere with groundwater runoff.
[0059] S7 conducted sampling and testing of the remediation area according to a preset cycle, simultaneously monitoring the concentrations of two pollutants and the survival status of the microbial agent. The data on the changes in pollutant concentrations during the remediation process are shown in the table below:
[0060] During the remediation process, the concentrations of both pollutants decreased in a stepwise manner, consistent with the phased degradation process of microorganisms. After 90 days, the concentration of vinyl chloride dropped to 3-5 μg / L and that of chloroform to 45-58 μg / L, both meeting the Class III standard limits of the Groundwater Quality Standard (GB / T14848-2017). Furthermore, the microbial agent maintained an effective survival concentration, preventing pollution rebound. The remediation of both pollutants was deemed satisfactory, and the process transitioned to post-remediation monitoring.
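The compliance judgment in this paragraph amounts to comparing the upper end of each reported concentration range against the corresponding limit. The sketch below shows that check. The limit values of 5.0 μg/L for vinyl chloride and 60 μg/L for chloroform are assumed Class III limits and should be verified against the text of GB/T 14848-2017 before any use.

```python
# Assumed Class III limits in ug/L; verify against GB/T 14848-2017 before use.
CLASS_III_LIMITS_UG_L = {"vinyl chloride": 5.0, "chloroform": 60.0}

# Day-90 concentration ranges (low, high) in ug/L, as reported in paragraph [0060].
day90_ranges = {"vinyl chloride": (3.0, 5.0), "chloroform": (45.0, 58.0)}


def meets_limit(upper_bound: float, limit: float) -> bool:
    """A range complies only if its upper bound does not exceed the limit."""
    return upper_bound <= limit


results = {
    name: meets_limit(high, CLASS_III_LIMITS_UG_L[name])
    for name, (_low, high) in day90_ranges.items()
}
all_met = all(results.values())
```

Checking the upper bound rather than the mean is the conservative choice: every sampling point in the range must be at or below the limit.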
Claims
1. A smart recommendation system for microbial inoculants to remediate contaminated sites, characterized in that it includes a high-fidelity sampling and reconstruction subsystem, a remediation area pollution attribution factor system, and a site remediation decision-making subsystem; the output of the high-fidelity sampling and reconstruction subsystem is connected to the input of the remediation area pollution attribution factor system, and the output of the remediation area pollution attribution factor system is connected to the input of the site remediation decision-making subsystem; the output of the high-fidelity sampling and reconstruction subsystem is also connected to the input of the site remediation decision-making subsystem, and the output of the site remediation decision-making subsystem is connected to the input of the high-fidelity sampling and reconstruction subsystem.
2. The system according to claim 1, characterized in that, The high-fidelity sampling and reconstruction subsystem is used to collect soil and groundwater samples from contaminated sites, complete the three-dimensional reconstruction of underground and the preliminary extraction of pollution information, output underground information to the pollution attribution factor system of the remediation area, and simultaneously receive remediation information feedback instructions from the site remediation decision subsystem and dynamically optimize the sampling strategy. The pollution attribution system for the remediation area is used to receive underground information, optimize the boundary of the remediation area and attribute major pollutants, and output pollution information to the site remediation decision subsystem. The site remediation decision subsystem, based on underground information and pollution information, intelligently recommends microbial agents, optimizes delivery plans and executes precise delivery, and outputs remediation information to the high-fidelity sampling and reconstruction subsystem to update heterogeneous data fusion.
3. The system according to claim 2, characterized in that, The high-fidelity sampling and reconstruction subsystem includes a sample point confirmation module; the output of the sample point confirmation module is connected to the input of the low-disturbance drilling module, the output of the low-disturbance drilling module is connected to the input of the high-fidelity sampling module, the output of the high-fidelity sampling module is connected to the input of the sample quality control and screening module, the output of the sample quality control and screening module is connected to the input of the rapid detection module, the output of the rapid detection module is connected to the input of the heterogeneous data fusion module, and the output of the heterogeneous data fusion module is connected to the input of the underground three-dimensional reconstruction module.
4. The system according to claim 3, characterized in that, The sampling point confirmation module formulates a sampling plan based on production condition data, historical site pollution data, and previous monitoring data issued by the production database. The sampling plan includes sampling points, sampling time periods, sampling depth, and sampling frequency. The low-disturbance drilling module conducts drilling operations using low-disturbance drilling technology according to the sampling plan output by the sample point confirmation module. It is equipped with anti-disturbance drilling tools and pressure feedback devices to reduce the interference of the drilling process on the site soil structure, groundwater runoff and production facilities, and accurately obtain soil and groundwater samples at different depths. The high-fidelity sampling module is used to collect and preserve soil and groundwater samples obtained by the low-disturbance drilling module with high fidelity, avoiding component volatilization, oxidation or cross-contamination during the collection process and ensuring the authenticity of the samples. The sample quality control and screening module performs purity testing, contamination screening, and validity determination on the samples transported to the detection area by the high-fidelity sampling module. It eliminates interfered, damaged, or invalid samples through sample appearance observation, purity analysis, and blank control experiments to ensure the reliability of the detection data. The rapid detection module is used to rapidly determine the pollutant type, concentration, and site physicochemical parameters of valid samples selected by the sample quality control screening module. The pollutant detection covers heavy metals, volatile organic compounds, and semi-volatile organic compounds, while the physicochemical parameter detection includes soil moisture content, porosity, pH value, redox potential, and organic matter content. 
The module also generates a detection data report simultaneously. The heterogeneous data fusion module is used to integrate the detection data output by the rapid detection module with historical pollution data and site geological and hydrological data from the multi-source soil database. The data is processed through data cleaning, redundancy removal, spatiotemporal alignment, and format standardization. The underground 3D reconstruction module combines the site geological stratification characteristics and groundwater runoff patterns output by the heterogeneous data fusion module. Through 3D modeling technology, it depicts the spatial distribution, concentration gradient, diffusion trend, and correlation with site geochemical parameters of pollutants underground, generating a visualized underground 3D reconstruction model that clarifies the core pollution area, diffusion path, and potential impact range.
5. The system according to claim 2, characterized in that, The pollution attribution system for the remediation zone includes a deep learning module; the output of the deep learning module is connected to the input of the remediation zone boundary optimization module, and the output of the remediation zone boundary optimization module is connected to the inputs of both the deep learning module and the pollution attribution module. The deep learning module is used for deep learning spatial correlation modeling. It uses pollution concentration data and field geomorphological parameters in underground information as node features and key production factors as correlation edge weights. Through model training, it explores the potential correlation between pollution distribution and production activities. Based on the output of the deep learning model, a spatial clustering algorithm is used to perform cluster analysis on the pollution concentration data, identify high-value pollution concentration clusters, pollution diffusion path clusters, and pollution isolated points, and preliminarily define the scope of pollution impact. The remediation zone boundary optimization module initially delineates the remediation zone boundary based on data from the deep learning module, combined with spatial clustering results and site topography, ensuring that the boundary covers all pollution cluster areas and potential diffusion paths. It then adjusts and optimizes the initially defined remediation zone boundary based on the production facility layout provided by the production database, ensuring that the remediation zone boundary covers the actual pollution area without interfering with production operations. 
The pollution attribution module, combined with the output of the remediation zone boundary optimization module, calculates the pollution contribution of different pollutants through a multi-factor comprehensive evaluation method, clarifies the impact weight of various pollutants on site pollution, and, based on the pollution contribution ranking results, pollutant environmental risk thresholds and remediation feasibility, selects the main pollutants that have the greatest impact on the site environment, the highest risk, and require priority remediation, and clarifies the concentration range and spatial distribution of the main pollutants.
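The spatial clustering step of claim 5 can be sketched as follows. The claim does not name a specific clustering algorithm, so this example assumes a simple distance-threshold grouping of sampling points whose concentration exceeds a cutoff; the function name, the threshold, the linking distance `eps`, and all coordinates are hypothetical.

```python
from math import dist
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x in m, y in m, concentration)


def cluster_high_values(points: List[Point], threshold: float, eps: float) -> List[List[Point]]:
    """Group points at or above `threshold` that are chained by gaps of at most `eps` metres."""
    high = [p for p in points if p[2] >= threshold]
    unvisited = list(range(len(high)))
    clusters: List[List[Point]] = []
    while unvisited:
        stack = [unvisited.pop(0)]
        members: List[Point] = []
        while stack:
            i = stack.pop()
            members.append(high[i])
            # Pull in every unvisited point within eps of the current point.
            near = [j for j in unvisited if dist(high[i][:2], high[j][:2]) <= eps]
            for j in near:
                unvisited.remove(j)
                stack.append(j)
        clusters.append(members)
    return clusters


# Hypothetical sampling points, not data from the embodiments.
samples = [
    (0.0, 0.0, 300.0), (1.0, 0.0, 320.0), (1.5, 0.5, 290.0),
    (10.0, 10.0, 310.0), (11.0, 10.0, 305.0),
    (30.0, 30.0, 400.0),
    (5.0, 5.0, 50.0),  # below threshold, ignored
]
clusters = cluster_high_values(samples, threshold=280.0, eps=2.0)
cluster_sizes = [len(c) for c in clusters]
isolated_points = [c[0] for c in clusters if len(c) == 1]
```

Clusters of size one correspond to the "pollution isolated points" of the claim, while larger groups correspond to high-value concentration clusters.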
6. The system according to any one of claims 1 to 5, characterized in that the site remediation decision-making subsystem includes a knowledge graph module; the output of the knowledge graph module is connected to the input of the intelligent recommendation algorithm module, the output of the intelligent recommendation algorithm module is connected to the input of the microbial agent effect testing module, the output of the microbial agent effect testing module is connected to the input of the knowledge graph module and the multi-machine learning fusion module, the data end of the multi-machine learning fusion module is bidirectionally connected to the data end of the multi-objective optimization module, the output of the multi-objective optimization module is connected to the input of the remediation scheme implementation module, the output of the remediation scheme implementation module is connected to the input of the remediation effect detection module, and the output of the remediation effect detection module is connected to the input of the knowledge graph module.
7. The system according to claim 6, characterized in that, The knowledge graph module is used to construct a multi-dimensional structured association network in the field of microbial inoculant remediation. With "entity-relationship-weight" as the core architecture, it realizes the semantic association between inoculants and information such as pollution characteristics, site conditions, and production constraints. The intelligent recommendation algorithm module is used to call the microbial agent remediation knowledge graph constructed by the knowledge graph module based on pollution information, mine the matching relationship between microbial agents and pollution characteristics through the intelligent recommendation algorithm, initially recommend candidate microbial agents, and generate a candidate agent list by sorting according to the degree of matching. The microbial agent effect testing module is used to conduct small-scale laboratory tests on microbial agents from the candidate microbial agent list recommended by the intelligent recommendation algorithm module. It simulates the actual environmental conditions of the site, sets up experimental groups with different microbial agent concentrations and application ratios, and simultaneously sets up a blank control group. During the experimental period, it regularly monitors the pollutant degradation rate, microbial agent survival concentration, and changes in soil physicochemical parameters in each group, screens the optimal microbial agent with the best degradation effect, good microbial agent survival status, and no negative impact on the soil environment, and determines the initial application parameters. The multi-machine learning fusion module integrates into a fusion model through strategies such as dynamic weight allocation. 
It inputs a decision analysis dataset to predict the pollutant degradation effect, microbial agent survival status and potential impact on production under different deployment schemes, including location, type and dosage. The multi-objective optimization module, combined with a multi-machine learning fusion module, aims to maximize repair efficiency, minimize repair costs, and minimize production interference risks. It optimizes and determines the location, type, and amount of each delivery point by taking into account the characteristics of the microbial agent and site constraints. The remediation scheme implementation module is used to deliver microbial agents to the target area through a fixed-point delivery device according to the optimization scheme output by the multi-objective optimization module, and to record data such as delivery location coordinates, delivery amount, delivery time, and environmental parameters in real time. The remediation effect detection module is used to sample and detect different points in the remediation area at a preset cycle after the remediation scheme implementation module is implemented, to obtain data on changes in pollutant concentration, bacterial agent survival concentration, and soil physicochemical parameters, and to evaluate the remediation effect.
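The "dynamic weight allocation" strategy of the multi-machine learning fusion module can be sketched as a weighted average of the individual model predictions. The claim does not specify how the weights are computed, so this example assumes that each weight is inversely proportional to the model's validation error; the error values and predictions are hypothetical.

```python
from typing import List


def fusion_weights(validation_errors: List[float]) -> List[float]:
    """Assumed rule: weight each model by the inverse of its validation error, normalized to sum to 1."""
    inverse = [1.0 / e for e in validation_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(predictions: List[float], weights: List[float]) -> float:
    """Weighted average of the individual model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical validation errors and predicted degradation rates for three models.
val_errors = [0.10, 0.20, 0.40]
weights = fusion_weights(val_errors)
fused_prediction = fuse([0.80, 0.70, 0.60], weights)
```

Recomputing the weights whenever new monitoring data changes the validation errors is one way such an allocation could be made dynamic.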
8. The system according to claim 1, 2, 3, 4, 5, or 7, characterized in that, When the system is in operation, the specific steps include: Step 1: Receive real-time operating data from the production database, combine it with historical site pollution information, formulate a low-disturbance sampling plan, specify the sampling points, depth, time period and safety protection measures, and ensure that sampling operations avoid the core production area and peak hours to reduce interference with production operations; Step 2: Perform low-disturbance drilling and high-fidelity sampling according to the sampling plan, complete sample quality control screening, obtain pollution data and physicochemical parameters through rapid detection and data calibration, construct a three-dimensional underground reconstruction model based on multi-source heterogeneous data fusion, accurately depict the spatial distribution of pollution, and output underground information to the pollution attribution factor system; Step 3: Based on underground information, identify the pollution range through deep learning modeling and spatial clustering, combine production database analysis to optimize the boundary of the remediation area, determine the main pollutants by pollution attribution, and output pollution information including the boundary of the remediation area, pollutant type and concentration to the remediation decision subsystem. Step 4: Call the intelligent recommendation algorithm based on the knowledge graph of microbial agent remediation, combine the pollution information to recommend candidate microbial agents, and screen the optimal microbial agent through laboratory small-scale experiments; Step 5: Based on underground information, integrate multi-source data to train a multi-machine learning fusion model to predict the repair effect. 
With the goal of "maximizing repair efficiency, minimizing repair cost, and minimizing production interference risk", construct an optimization model and solve for the optimal delivery scheme. Step 6: During the preset safe delivery period, use precision delivery equipment to perform fixed-point and quantitative delivery according to the optimized plan, record delivery data in real time, and ensure that the delivery process is coordinated and adapted to the production conditions; Step 7: Conduct sampling and testing of the remediation area according to the preset cycle to obtain data such as pollutant concentration and bacterial agent survival status, evaluate the remediation effect, and determine whether the preset remediation target has been achieved; if the target is met, the remediation is completed and the process is transferred to the later stage of supervision; if the target is not met, the plan adjustment process is initiated. Step 8: For the remediation areas that do not meet the standards, provide feedback on the reconstruction parameters, update the remediation information to the knowledge graph and 3D reconstruction model, reanalyze the contamination status of the remediation area, optimize the microbial agent delivery plan, and repeat steps 6-7 to perform the next round of remediation until the remediation meets the standards.
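The closed loop formed by Steps 6 to 8 can be sketched as a simple control loop. The callables `deliver`, `monitor`, and `adjust` are hypothetical stand-ins for the remediation scheme implementation module, the remediation effect detection module, and the plan adjustment process; the toy site model and its numbers are invented purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, float]


def run_remediation_loop(
    deliver: Callable[[Plan], None],
    monitor: Callable[[], float],
    adjust: Callable[[Plan, float], Plan],
    plan: Plan,
    target: float,
    max_rounds: int = 5,
) -> Tuple[str, List[float]]:
    """Repeat delivery and monitoring until the target is met or rounds run out."""
    history: List[float] = []
    for _ in range(max_rounds):
        deliver(plan)                       # Step 6: fixed-point delivery
        concentration = monitor()           # Step 7: sampling and testing
        history.append(concentration)
        if concentration <= target:
            return "post-remediation monitoring", history
        plan = adjust(plan, concentration)  # Step 8: feed back and re-plan
    return "target not reached", history


# Toy stand-ins (hypothetical): each delivery removes a dose-dependent fraction.
site = {"conc": 300.0}


def toy_deliver(plan: Plan) -> None:
    site["conc"] *= 1.0 - min(0.9, plan["dose_l"] / 500.0)


def toy_monitor() -> float:
    return site["conc"]


def toy_adjust(plan: Plan, concentration: float) -> Plan:
    return {"dose_l": plan["dose_l"] + 50.0}


status, history = run_remediation_loop(
    toy_deliver, toy_monitor, toy_adjust, {"dose_l": 200.0}, target=60.0
)
```

The `max_rounds` cap is an added safeguard that the claim does not mention; it prevents an endless loop if the target cannot be reached.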
9. The system according to claim 8, characterized in that Step 2 includes the following sub-steps: Step 2-1: According to the sampling plan, use low-disturbance drilling equipment to carry out drilling operations, collect soil samples and groundwater samples with high fidelity, and record the location coordinates, depth, sampling time and corresponding production condition parameters in real time during the sampling process. Step 2-2: Verify the samples through the sample quality control screening module, including appearance verification to remove damaged or contaminated samples, purity verification to eliminate cross-contamination through blank control experiments, and consistency verification to remove samples that exceed the reasonable deviation range, retaining valid samples and marking them with numbers; Step 2-3: Using the rapid detection module, with devices including a portable GC-MS, a photoionization detector (PID), an X-ray fluorescence spectrometer (XRF), and a membrane interface probe (MIP), rapidly detect the pollutant type, concentration, and physicochemical parameters such as soil moisture content, porosity, pH value, redox potential, and organic matter content in valid samples, and generate a detection data report. Step 2-4: The heterogeneous data fusion module integrates rapid detection data, production condition data, and site geological and hydrological data, and performs data preprocessing, including data cleaning to remove outliers, redundancy removal to delete highly correlated duplicate parameters, spatiotemporal alignment to associate the various types of data, and standardization to unify the data range, forming a structured dataset. Step 2-5: Determine whether the dataset meets the requirements for 3D reconstruction. If not, supplement the sample points accordingly and repeat steps 2-1 to 2-4. If it meets the requirements, perform 3D reconstruction. 
The underground 3D reconstruction module is based on a standardized dataset and combines the site's geological stratification characteristics and groundwater runoff patterns. It adopts a three-level technical route of "CT scanning-image segmentation-model construction" to characterize the spatial distribution of pollutants.
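The preprocessing operations named in Step 2-4 (outlier removal, redundancy removal, and standardization) can be sketched with the Python standard library. The claim does not fix the specific methods, so this example assumes an interquartile-range rule for outliers, a Pearson correlation cutoff for redundant parameters, and min-max scaling for standardization; all sample values are hypothetical.

```python
import statistics
from math import sqrt
from typing import List


def remove_outliers_iqr(values: List[float], k: float = 1.5) -> List[float]:
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical pH readings with one implausible value.
ph_raw = [6.8, 6.9, 7.0, 7.1, 7.2, 12.0]
ph_clean = remove_outliers_iqr(ph_raw)
ph_scaled = min_max_scale(ph_clean)

# Hypothetical duplicate parameter: the same redox potential in mV and in V.
redox_mv = [280.0, 290.0, 300.0, 310.0]
redox_v = [0.280, 0.290, 0.300, 0.310]
is_redundant = abs(pearson(redox_mv, redox_v)) > 0.95
```

Spatiotemporal alignment is omitted here because it depends on the coordinate and timestamp conventions of the individual data sources.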
10. The system according to claim 9, characterized in that Step 2-5 includes the following sub-steps:
Step 2-5-1: Perform CT tomography on the reconstructed soil sample. First, remove image noise using the non-local means method, and then use a Gaussian high-pass filter to enhance the contrast between pores and the skeleton. The filtering formula is as follows: $g(x, y) = \sum_{k=-1}^{1} \sum_{l=-1}^{1} K(k, l)\, f(x + k, y + l)$; in the formula, $g(x, y)$ is the filtered image pixel value at coordinates $(x, y)$; $f$ denotes the original pixel values; $K$ is the 3×3 filter kernel matrix; $k$ is the row index of the filter kernel matrix; $l$ is the column index of the filter kernel matrix; $K(k, l)$ is the kernel coefficient at position $(k, l)$ in the filter kernel matrix; and $f(x + k, y + l)$ is the pixel value at original image coordinates $(x + k, y + l)$;
Step 2-5-2: Based on the trough threshold between the two peaks of the gray-level histogram, segment the soil into three parts: pores, organic matter, and soil skeleton; use the marching cubes algorithm to generate a pore structure model, classify pores into three categories according to pore volume quantiles, and calculate the equivalent diameter of each pore type: $d_i = \sqrt[3]{6 \bar{V}_i / \pi}$; in the formula, $d_i$ is the equivalent diameter of the $i$-th pore class and $\bar{V}_i$ is the mean volume of the $i$-th pore class;
Step 2-5-3: Conduct compression tests on reconstructed soil samples, plot load-void ratio curves, estimate in-situ formation loads, and define the compression coefficient $\alpha = \phi_0 / \phi_r$, where $\phi_0$ is the in-situ porosity and $\phi_r$ is the reconstructed porosity; multiply the pore volume by $\alpha$ to restore the original pore morphology;
Step 2-5-4: Integrate pollutant concentration data with geological and hydrological parameters, and construct a spatial distribution model of pollutant concentration using the Kriging interpolation method. The formula is as follows: $Z(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$; in the formula, $Z(x_0)$ represents the concentration of pollutants at an unknown point; $Z(x_i)$ represents the concentration at the $i$-th known sampling point; $\lambda_i$ is the interpolation weight; $d_i$ is the distance between the unknown point and the $i$-th sampling point; $b$ is the distance attenuation coefficient; and $n$ is the number of sampling points;
Step 2-5-5: Simulate the pollution diffusion trend under different production conditions, generate a visualized underground 3D reconstruction model, and identify the core pollution area, pollution plume range, and potential impact area.
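Three of the calculations in claim 10 can be sketched in a few lines: the 3×3 kernel filter, the sphere-equivalent pore diameter, and the weighted interpolation of concentration. The sketch rests on several assumptions. The kernel shown is a generic high-pass kernel rather than the specific Gaussian high-pass kernel of the claim. The diameter is taken as that of a sphere with the same mean volume. The interpolation weights are assumed to follow an inverse-distance power form in the distance and the attenuation coefficient `b`, which is simpler than a variogram-based Kriging solve.

```python
from math import dist, pi
from typing import List, Sequence, Tuple


def filter3x3(image: List[List[float]], kernel: List[List[float]]) -> List[List[float]]:
    """Apply a 3x3 kernel sum to interior pixels; border pixels are left unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            out[x][y] = sum(
                kernel[k + 1][l + 1] * image[x + k][y + l]
                for k in (-1, 0, 1)
                for l in (-1, 0, 1)
            )
    return out


def equivalent_diameter(mean_volume: float) -> float:
    """Diameter of a sphere whose volume equals the mean pore volume."""
    return (6.0 * mean_volume / pi) ** (1.0 / 3.0)


def interpolate(
    x0: Sequence[float],
    samples: List[Tuple[Sequence[float], float]],
    b: float = 2.0,
) -> float:
    """Weighted sum of sample concentrations with assumed weights proportional to d**(-b)."""
    numerator = denominator = 0.0
    for location, concentration in samples:
        d = dist(x0, location)
        if d == 0.0:
            return concentration  # the query point coincides with a sample
        weight = d ** -b
        numerator += weight * concentration
        denominator += weight
    return numerator / denominator


# Hypothetical high-pass kernel (coefficients sum to zero) on a flat image.
high_pass = [[-1.0, -1.0, -1.0], [-1.0, 8.0, -1.0], [-1.0, -1.0, -1.0]]
flat = [[5.0] * 3 for _ in range(3)]
filtered_center = filter3x3(flat, high_pass)[1][1]

unit_diameter = equivalent_diameter(pi / 6.0)
midpoint_value = interpolate((0.0, 0.0), [((1.0, 0.0), 100.0), ((-1.0, 0.0), 200.0)])
```

On a uniform image the high-pass response is zero, which is why such a filter emphasizes pore and skeleton boundaries rather than flat regions.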