Machine learning based regional pollution investigation intelligent decision sampling system and method
By employing a machine learning-based intelligent decision sampling method, utilizing geographic environmental data and data fusion technology, sampling points are iteratively deployed at each level, solving the problems of insufficient resolution and high cost in existing technologies, and achieving accurate and efficient pollution investigation and management support.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA CONSTRUCTION EIGHTH BUREAU ENVIRONMENTAL PROTECTION TECHNOLOGY CO LTD
- Filing Date
- 2026-04-24
- Publication Date
- 2026-06-19
AI Technical Summary
In existing regional pollution surveys, the grid method for setting up sampling points has insufficient resolution, is prone to missing hotspots, is difficult to accurately identify risk areas that need to be focused on, and is also costly and inefficient.
A machine learning-based intelligent decision sampling method is adopted. By constructing a machine learning model, the pollution risk is initially screened using geographical environmental data. Sampling points are deployed in a step-by-step manner. Combined with data fusion technology, a pollution distribution prediction map is generated. Suspected pollution hotspots are identified through three-level iterative sampling, and the deployment of sampling points is dynamically adjusted.
It enables precise sampling of high-risk pollution areas, reduces survey costs, improves survey efficiency and targeting, and the output results support environmental planning and risk management, thus having high application value.
[Figure: abstract drawing of CN122241378A]
Description
Technical Field
[0001] This invention relates to the field of machine learning-based sampling technology, and specifically to a machine learning-based intelligent decision-making sampling system and method for regional pollution investigation.
Background Technology
[0002] Current regional pollution surveys rely mainly on standards such as the "Technical Specification for Soil Environmental Monitoring" (HJ/T 166-2004) and deploy sampling points with a systematic grid method. This "uniform distribution" model has inherent defects when applied to large, highly heterogeneous areas. First, its resolution is insufficient: hotspots are easily missed, and risk areas that need priority control are difficult to identify accurately. Second, it is costly and inefficient: simply reducing the grid spacing to improve survey accuracy makes the number of sampling points grow geometrically (on a square grid, halving the spacing roughly quadruples the point count), which drives up the cost and time of sample collection, transportation, and laboratory analysis and makes the survey cumbersome and uneconomical.
[0003] In recent years, machine learning has made significant progress in site environmental management and offers a new technical path for intelligent environmental monitoring. However, existing techniques focus mostly on data analysis rather than on sampling design, so a new technical solution for sampling is needed.
Summary of the Invention
[0004] The purpose of this invention is to overcome the shortcomings of the prior art and provide a machine learning-based intelligent decision-making sampling system and method for regional pollution investigation. This addresses the problems of insufficient resolution, easy omission of hotspots, difficulty in accurately identifying risk areas requiring key control, high economic costs, and low efficiency associated with existing grid-based sampling point deployment methods.
[0005] The technical solution for achieving the above objective is as follows. This invention provides a machine learning-based intelligent decision-making sampling method for regional pollution surveys, comprising the following steps:
Build a machine learning model;
Collect geographic environmental data for the target area;
Input the collected geographic environmental data of the target area into the machine learning model for prediction to obtain a preliminary pollution risk screening map of the target area;
Deploy first-level sampling points on the pollution risk screening map of the target area, and collect on-site detection data at the locations of the corresponding first-level sampling points within the target area;
Use data fusion technology to fuse the on-site detection data at the locations of the corresponding first-level sampling points with the pollution risk screening map of the target area to generate a first-level pollution distribution prediction map;
Deploy second-level sampling points in the medium- and high-risk areas of the first-level pollution distribution prediction map, and collect on-site detection data at the locations of the corresponding second-level sampling points within the target area;
Use data fusion technology to fuse the on-site detection data at the locations of the corresponding second-level sampling points with the first-level pollution distribution prediction map to generate a second-level pollution distribution prediction map;
Display the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
[0006] A further improvement of the machine learning-based intelligent decision-making sampling method for regional pollution surveys in this invention is that it further includes:
On the second-level pollution distribution prediction map, delineate suspected pollution hotspots using the on-site detection data at the locations of the corresponding first-level sampling points and the on-site detection data at the locations of the corresponding second-level sampling points;
Deploy third-level sampling points in the delineated suspected pollution hotspots, and collect on-site detection data at the locations of the corresponding third-level sampling points within the target area;
Use data fusion technology to fuse the on-site detection data at the locations of the corresponding third-level sampling points with the second-level pollution distribution prediction map to generate a third-level pollution distribution prediction map.
[0007] A further improvement of the machine learning-based intelligent decision-making sampling method for regional pollution surveys in this invention is that it further includes: After collecting the field detection data at the location of the corresponding first-level sampling point, the machine learning model is retrained using the collected field detection data at the location of the corresponding first-level sampling point. After collecting the field detection data at the location of the corresponding second-level sampling point, the machine learning model is retrained using the collected field detection data at the location of the corresponding second-level sampling point. After collecting the field detection data at the location of the corresponding third-level sampling point, the machine learning model is retrained using the collected field detection data at the location of the corresponding third-level sampling point.
[0008] A further improvement of the machine learning-based intelligent decision sampling method for regional pollution surveys in this invention is that the collected geographic environmental data of the target area includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
[0009] A further improvement of the machine learning-based intelligent decision sampling method for regional pollution investigation in this invention is that the selected data fusion technology includes at least one of Kriging interpolation and Bayesian maximum entropy method.
[0010] This invention also provides a regional survey intelligent decision sampling system based on machine learning, comprising: The predictive modeling module is used to build a machine learning model. The data acquisition module is used to collect geographic environmental data of the target area; The risk screening module is connected to the data acquisition module and the prediction modeling module. The risk screening module is used to input the collected geographic environmental data of the target area into the machine learning model for prediction, so as to obtain a pollution risk screening map of the target area. An intelligent decision-making module, connected to the risk screening module and the data acquisition module, is used to deploy first-level sampling points on the pollution risk screening map of the target area, and collect on-site detection data at the locations corresponding to the first-level sampling points within the target area through the data acquisition module. The intelligent decision-making module is also used to fuse the collected on-site detection data at the locations corresponding to the first-level sampling points with the pollution risk screening map of the target area using data fusion technology to generate a first-level pollution distribution prediction map. Furthermore, the intelligent decision-making module is used to deploy second-level sampling points in the medium-to-high-risk areas of the first-level pollution distribution prediction map, and collect on-site detection data at the locations corresponding to the second-level sampling points within the target area through the data acquisition module. The intelligent decision-making module is also used to fuse the collected on-site detection data at the locations corresponding to the second-level sampling points with the first-level pollution distribution prediction map using data fusion technology to generate a second-level pollution distribution prediction map. 
The visualization platform module, connected to the intelligent decision-making module, is used to display the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
[0011] A further improvement of the machine learning-based regional survey intelligent decision sampling system of the present invention is that the intelligent decision module is also used to delineate suspected pollution hotspots on the second-level pollution distribution prediction map using the on-site detection data at the location of the corresponding first-level sampling point and the on-site detection data at the location of the corresponding second-level sampling point. Third-level sampling points are set up in the identified suspected pollution hotspot areas, and on-site detection data at the locations of the corresponding third-level sampling points within the target area are collected through the data acquisition module. It is also used to combine the on-site detection data at the location of the corresponding third-level sampling point with the second-level pollution distribution prediction map using data fusion technology to generate a third-level pollution distribution prediction map.
[0012] A further improvement of the machine learning-based regional survey intelligent decision sampling system of the present invention is that the prediction modeling module is also used to receive the field detection data of the corresponding first-level sampling point sent by the data acquisition module, and to retrain the machine learning model using the received field detection data of the corresponding first-level sampling point. The predictive modeling module is also used to receive the field detection data at the location of the corresponding second-level sampling point sent by the data acquisition module, and to retrain the machine learning model using the received field detection data at the location of the corresponding second-level sampling point. The predictive modeling module is also used to receive the field detection data at the location of the corresponding third-level sampling point sent by the data acquisition module, and to retrain the machine learning model using the received field detection data at the location of the corresponding third-level sampling point.
[0013] A further improvement of the machine learning-based regional survey intelligent decision sampling system of the present invention is that the geographical environmental data collected by the data acquisition module includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
[0014] A further improvement of the machine learning-based regional survey intelligent decision sampling system of the present invention is that the data fusion technology selected by the intelligent decision module includes at least one of Kriging interpolation and Bayesian maximum entropy method.
[0015] The beneficial effects of the machine learning-based regional survey intelligent decision-making sampling system and method of this invention are as follows. The sampling system and method of this invention are precise, efficient, and targeted. Starting from the preliminary screening produced by the machine learning model, sampling points are deployed over at least two iterative levels. This shifts limited survey resources from casting a uniform net to fishing in targeted spots, enabling precise sampling of high-risk pollution areas and improving the targeting and efficiency of the survey.
[0016] The sampling system and method of this invention are cost-effective. This invention dynamically adjusts the sampling points based on actual on-site detection data, avoiding oversampling in clean areas and undersampling in risk areas. It can obtain richer and more accurate spatial information on contamination with a much smaller sample size than traditional methods, significantly reducing the total cost of the survey.
[0017] The sampling system and method of this invention construct a machine learning model with self-learning ability. The model improves continuously as data accumulates and adapts better to complex areas, making the survey results more scientific and reliable.
[0018] The sampling system and method of this invention significantly increase the value of the results: the final output is not only discrete sample data but also a digital, visualized set of results that includes spatial prediction, risk classification, and uncertainty. These results can directly serve refined management decisions such as environmental planning, risk management, and land reuse, and therefore have high application value.
Attached Figure Description
[0019] Figure 1 is a system diagram of the machine learning-based regional survey intelligent decision sampling system of the present invention.
[0020] Figure 2 is a flowchart of the machine learning-based regional survey intelligent decision sampling method of the present invention.
[0021] Figure 3 is an architecture diagram of the machine learning-based regional survey intelligent decision sampling system of the present invention.
[0022] Figure 4 is a flowchart illustrating the model optimization and intelligent decision-making process of the machine learning-based regional survey intelligent decision sampling system and method of the present invention.
Detailed Implementation
[0023] The present invention will be further described below with reference to the accompanying drawings and specific embodiments.
[0024] Referring to Figure 1, this invention provides a machine learning-based intelligent decision-making sampling system and method for regional surveys. By collecting multi-source geographic environmental data and applying dynamic prediction with a machine learning model, it generates and updates a preliminary pollution risk screening map. Using a two- or three-level iterative sampling strategy, it progressively focuses on high-risk pollution areas from the macro level to the micro level. Combined with dynamic feedback from real-time monitoring data, it optimizes the sampling point layout and makes the survey process adaptive and intelligent. This effectively solves the problems inherent in traditional surveys with uniform point distribution: insufficient resolution, easily missed hotspots, difficulty in accurately identifying risk areas that need priority control, high economic cost, and low efficiency. The machine learning-based intelligent decision-making sampling system and method for regional surveys of this invention are described below with reference to the accompanying drawings.
[0025] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other.
[0026] Figure 1 shows the system diagram of the machine learning-based regional survey intelligent decision sampling system of this invention. The system is described below with reference to Figure 1.
[0027] As shown in Figure 1, the machine learning-based regional survey intelligent decision-making sampling system of the present invention includes a predictive modeling module 21, a data acquisition module 22, a risk screening module 23, an intelligent decision-making module 24, and a visualization platform module 25. The risk screening module 23 is connected to the data acquisition module 22 and the predictive modeling module 21; the intelligent decision-making module 24 is connected to the risk screening module 23 and the data acquisition module 22; and the visualization platform module 25 is connected to the intelligent decision-making module 24. The predictive modeling module 21 is used to construct a machine learning model; the data acquisition module 22 is used to collect geographic environmental data of the target area; the risk screening module 23 is used to input the collected geographic environmental data of the target area into the machine learning model for prediction to obtain a preliminary pollution risk screening map of the target area; and the intelligent decision-making module 24 is used to deploy first-level sampling points on the pollution risk screening map of the target area and, through the data acquisition module 22, to collect on-site detection data at the locations of the corresponding first-level sampling points within the target area. The intelligent decision-making module 24 is also used to fuse the collected on-site detection data at the locations of the corresponding first-level sampling points with the preliminary pollution risk screening map of the target area using data fusion technology to generate a first-level pollution distribution prediction map.
The intelligent decision-making module 24 is also used to deploy second-level sampling points in the medium- and high-risk areas of the first-level pollution distribution prediction map, and to collect on-site detection data at the locations of the corresponding second-level sampling points within the target area through the data acquisition module 22. The intelligent decision-making module 24 is also used to fuse the collected on-site detection data at the locations of the corresponding second-level sampling points with the first-level pollution distribution prediction map using data fusion technology to generate a second-level pollution distribution prediction map. The visualization platform module 25 is used to display the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
[0028] The sampling system of this invention first uses the machine learning model to generate a preliminary pollution risk screening map from the geographic environmental data of the target area, and then deploys first-level sampling points on this map. The principle for first-level deployment is a small number of evenly distributed points. These points provide a macro-level background survey of the target area and ensure that the survey is comprehensive; together they form a sparse, systematic base grid covering the entire target area. For example, the system can preset the number of sampling points and distribute them evenly within the target area according to its actual size. Alternatively, it can preset the spacing between sampling points (which can be a large value) and deploy the points across the target area at that spacing. As a further alternative, sampling points can be deployed according to the risk level of each sub-region shown in the preliminary pollution risk screening map: low-risk or clean sub-regions receive fewer or more widely spaced sampling points, while medium- and high-risk sub-regions receive more or more closely spaced sampling points (in both cases the specific number and spacing can be preset in the system).
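The first two deployment options in paragraph [0028] (a preset spacing, or a preset number of points) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a rectangular bounding box in metres and a square grid, and the function names `grid_by_spacing` and `spacing_for_count` are hypothetical.

```python
import math

def grid_by_spacing(xmin, ymin, xmax, ymax, spacing):
    """Return (x, y) points centred in each spacing-sized cell of the bounding box."""
    nx = max(1, math.floor((xmax - xmin) / spacing))
    ny = max(1, math.floor((ymax - ymin) / spacing))
    return [(xmin + (i + 0.5) * spacing, ymin + (j + 0.5) * spacing)
            for j in range(ny) for i in range(nx)]

def spacing_for_count(xmin, ymin, xmax, ymax, n_points):
    """Spacing that yields roughly n_points on a square grid over the box."""
    area = (xmax - xmin) * (ymax - ymin)
    return math.sqrt(area / n_points)

# Hypothetical 2000 m x 1000 m target area (assumed values, not from the patent).
first_level = grid_by_spacing(0, 0, 2000, 1000, spacing=500)
print(len(first_level), first_level[0])
```

A real target area is rarely rectangular, so in practice the generated points would also be clipped to the site boundary.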
[0029] Then, based on the layout plan of the first-level sampling points, corresponding on-site testing is conducted in the target area to obtain on-site testing data. This on-site testing data can be obtained through methods such as rapid on-site testing, laboratory analysis of soil samples, and multispectral or hyperspectral remote sensing inversion data. This on-site testing data is actual soil or soil sample data, possessing high authenticity. Data fusion technology is used to fuse the on-site testing data with the initial pollution risk screening map, generating an updated version of the initial pollution risk screening map, i.e., the first-level pollution distribution prediction map. This first-level pollution distribution prediction map has higher accuracy and lower uncertainty compared to the initial pollution risk screening map. Based on the medium- and high-risk areas of the first-level pollution distribution prediction map, second-level sampling points are then deployed. The deployment rule for the second-level sampling points is focused sampling in medium- and high-risk areas, i.e., denser deployment of sampling points in medium- and high-risk areas. The deployment density of the second-level sampling points is greater than that of the first-level sampling points, used for focused verification and characterization of risk areas within the target area. Preferably, the deployment density of the second-level sampling points is 2 to 5 times that of the first-level sampling points. Then, based on the layout plan of the second-level sampling points, corresponding on-site detection is carried out in the target area to obtain on-site detection data. The detection method for this on-site detection data is the same as that for the on-site detection data of the first-level sampling points. 
Data fusion technology is used to fuse the on-site detection data of the second-level sampling points with the first-level pollution distribution prediction map to generate an updated version of the first-level pollution distribution prediction map, namely the second-level pollution distribution prediction map. Compared with the first-level pollution distribution prediction map, the second-level pollution distribution prediction map has higher accuracy and less uncertainty.
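The second-level rule in paragraph [0029] (denser points only in medium- and high-risk areas, preferably 2 to 5 times the first-level density) might be sketched like this. The risk threshold of 0.5, the gridded risk map, and the function name are assumptions for illustration; the patent does not specify them. On a square grid, a density factor k corresponds to dividing the spacing by the square root of k.

```python
import math

def second_level_points(risk, cell, first_spacing, density_factor=4.0, threshold=0.5):
    """Place denser points inside grid cells whose risk score meets the threshold.

    risk is a 2D list indexed as risk[row][col]; cell is the cell size in metres.
    The threshold and density_factor defaults are assumed values.
    """
    # k times the density on a square grid means spacing / sqrt(k).
    spacing = first_spacing / math.sqrt(density_factor)
    points = []
    for r, row in enumerate(risk):
        for c, score in enumerate(row):
            if score < threshold:
                continue  # low-risk cell: no additional second-level points
            # Rounding means the density is only approximate when the
            # spacing does not divide the cell size evenly.
            n = max(1, round(cell / spacing))
            step = cell / n
            for j in range(n):
                for i in range(n):
                    points.append((c * cell + (i + 0.5) * step,
                                   r * cell + (j + 0.5) * step))
    return points

# Hypothetical first-level prediction map on 500 m cells.
risk_map = [[0.1, 0.7],
            [0.2, 0.9]]
second_level = second_level_points(risk_map, cell=500, first_spacing=500)
print(len(second_level))
```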
[0030] This invention performs a second-level iteration on the risk screening map predicted by the machine learning model, and optimizes and updates the risk screening map using on-site detection data to generate a more accurate risk prediction map.
[0031] In one specific embodiment of the present invention, the intelligent decision-making module 24 is further used to delineate suspected pollution hotspots on the second-level pollution distribution prediction map using the on-site detection data at the locations of the corresponding first-level sampling points and the on-site detection data at the locations of the corresponding second-level sampling points. It deploys third-level sampling points in the delineated suspected pollution hotspots and collects on-site detection data at the locations of the corresponding third-level sampling points within the target area through the data acquisition module 22. It is also used to fuse the on-site detection data at the locations of the corresponding third-level sampling points with the second-level pollution distribution prediction map using data fusion technology to generate a third-level pollution distribution prediction map.
[0032] The third-level sampling points are deployed for high-density verification of suspected pollution hotspots. Their density is greater than that of the first-level sampling points, preferably at least 10 times as great. Based on the on-site detection data from the first- and second-level sampling points, one or more suspected pollution hotspots are delineated. Targeted, high-density sampling, that is, high-density deployment of third-level sampling points, is then conducted in these areas to ultimately confirm or rule out the pollution hotspots. Fusing the on-site detection data from the third-level sampling points with the second-level pollution distribution prediction map yields a third-level pollution distribution prediction map with higher accuracy and lower uncertainty.
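The patent does not say how suspected hotspots are delineated. One plausible sketch, under the assumption that the second-level prediction map is a grid and that a hotspot is a group of 4-connected cells at or above a chosen threshold, is a flood fill. The threshold value and the function name `delineate_hotspots` are hypothetical.

```python
from collections import deque

def delineate_hotspots(grid, threshold):
    """Group 4-connected cells with value >= threshold into hotspot regions."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    hotspots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited exceedance cell.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] >= threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            hotspots.append(region)
    return hotspots

# Hypothetical second-level prediction map (assumed values).
prediction = [
    [0.1, 0.2, 0.9, 0.8],
    [0.1, 0.1, 0.7, 0.2],
    [0.9, 0.1, 0.1, 0.1],
]
hotspots = delineate_hotspots(prediction, threshold=0.7)
print([len(h) for h in hotspots])
```

Each returned region can then receive third-level points. On a square grid, the preferred tenfold density corresponds to a spacing of at most the first-level spacing divided by the square root of 10, roughly 0.32 times the first-level spacing.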
[0033] In one specific embodiment of the present invention, the prediction modeling module 21 is further configured to receive the field detection data at the location corresponding to the first-level sampling point sent by the data acquisition module 22, and to retrain the machine learning model using the received field detection data at the location corresponding to the first-level sampling point. The predictive modeling module 21 is also used to receive the field detection data at the location of the corresponding second-level sampling point sent by the data acquisition module 22, and to retrain the machine learning model using the received field detection data at the location of the corresponding second-level sampling point. The predictive modeling module 21 is also used to receive the field detection data at the location of the corresponding third-level sampling point sent by the data acquisition module 22, and to retrain the machine learning model using the received field detection data at the location of the corresponding third-level sampling point.
[0034] In the initial construction of the machine learning model, a large amount of multi-source geographic environmental data of the target area is collected and processed to construct a feature dataset for model training. Then, the predictive modeling module 21 uses the feature dataset to train the selected machine learning model, enabling the machine learning model to generate a preliminary pollution risk screening map based on the geographic environmental data of the target area. This geographic environmental data includes, but is not limited to, at least three of the following: basic site information and surrounding conditions (geographical location, land use planning, production history, production process, production facilities, raw and auxiliary materials), distribution and intensity of industrial pollution sources (historical environmental monitoring data, on-site pollution auxiliary detection), soil type, geological background, road traffic network data, and multispectral or hyperspectral remote sensing inversion data.
[0035] In the subsequent on-site testing at sampling points, the on-site testing data is used as new training samples, allowing the predictive modeling module 21 to retrain the machine learning model using the training samples, thereby optimizing the model and improving the accuracy of the pollution screening map generated by the model.
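The retraining loop in paragraph [0035] amounts to appending each level's field measurements to the training set and refitting. The sketch below shows only that pattern. The `NearestNeighbourRisk` class is a toy stand-in introduced for illustration so the example runs without third-party libraries; it is not one of the model families the patent names, and the feature vectors are assumed values.

```python
import math

class NearestNeighbourRisk:
    """Toy stand-in model: averages the targets of the k nearest training samples."""

    def __init__(self, k=1):
        self.k = k
        self.features, self.targets = [], []

    def fit(self, features, targets):
        self.features, self.targets = list(features), list(targets)
        return self

    def predict(self, x):
        nearest = sorted(zip(self.features, self.targets),
                         key=lambda ft: math.dist(ft[0], x))[: self.k]
        return sum(t for _, t in nearest) / len(nearest)

def retrain(model, train_x, train_y, new_x, new_y):
    """Append one level's field measurements and refit on the full set."""
    train_x.extend(new_x)
    train_y.extend(new_y)
    return model.fit(train_x, train_y)

# Hypothetical feature vectors and risk labels.
train_x, train_y = [(0.0, 0.0), (1.0, 1.0)], [0.1, 0.2]
model = NearestNeighbourRisk(k=1).fit(train_x, train_y)
before = model.predict((0.9, 0.9))

# First-level field data arrives and becomes a new training sample.
retrain(model, train_x, train_y, [(0.9, 0.9)], [0.8])
after = model.predict((0.9, 0.9))
print(before, after)
```

With a real model, the same `retrain` step would be repeated after the second- and third-level sampling rounds.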
[0036] Preferably, the machine learning model is a random forest, gradient boosting decision tree, or neural network model.
[0037] Furthermore, the data fusion technology selected by the intelligent decision-making module includes at least one of the following: Kriging interpolation and Bayesian maximum entropy method.
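As an illustration of the Kriging option, the following is a minimal ordinary kriging sketch in plain Python. The exponential variogram, its sill and range, and the zero nugget are assumptions; the patent does not specify a variogram model, and a production system would fit these parameters to the data or use a dedicated geostatistics library.

```python
import math

def variogram(h, sill=1.0, range_=500.0):
    """Exponential variogram with zero nugget (assumed parameters)."""
    return sill * (1.0 - math.exp(-3.0 * h / range_))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, values, target):
    """Return (estimate, weights) at target from sampled points and values."""
    n = len(points)
    # Kriging system: variogram between samples, plus the unbiasedness
    # constraint (weights sum to 1) enforced via a Lagrange multiplier.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights

# Hypothetical sampling points (metres) and measured concentrations.
pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
vals = [1.0, 2.0, 3.0, 4.0]
estimate, weights = ordinary_kriging(pts, vals, (50.0, 50.0))
print(round(estimate, 6), round(sum(weights), 6))
```

To fuse measurements with an existing prediction map rather than interpolate them alone, one common approach is to krige the residuals (measured value minus map value at each sampling point) and add the interpolated residual surface back to the map. Whether the patent intends this variant is not stated.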
[0038] In one specific embodiment of the present invention, the geographic environmental data collected by the data acquisition module includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
[0039] The data acquisition module acquires geographic environmental data through multi-source sensors and remote sensing equipment. The geographic environmental data includes, but is not limited to, at least three of the following: basic site information and surrounding conditions (geographical location, land use planning, production history, production process, production facilities, raw and auxiliary materials), distribution and intensity of industrial pollution sources (historical environmental monitoring data, on-site pollution auxiliary detection), soil type, geological background, road traffic network data, and multispectral or hyperspectral remote sensing inversion data.
[0040] The visualization platform module is used to display the distribution of pollution risks, sampling locations, and survey results.
[0041] After completing the three-level iterative sampling, the intelligent decision-making module 24 integrates the sampling data and prediction results from all levels and outputs a final report that includes a high-resolution spatial distribution map of soil pollution, a risk level zoning map, an uncertainty assessment, and comprehensive conclusions.
[0042] The system of this invention is deployed on a cloud server or mobile terminal, and supports wireless data transmission and remote collaborative operation.
[0043] As shown in Figure 3 and Figure 4, the intelligent decision-making sampling system of this invention includes a data acquisition module that acquires geographic environmental data through multi-source sensors and remote sensing equipment; a predictive modeling module that generates a preliminary pollution risk screening map using a machine learning model; an intelligent decision-making module that formulates a three-level iterative sampling plan and continuously optimizes the prediction and sampling strategies through data fusion; and a visualization platform module that provides full-process visual monitoring and results display. All data and operation interfaces are managed uniformly through a cloud-based visualization platform that supports multi-user collaboration and mobile access.
[0044] This invention also provides a machine learning-based intelligent decision-making sampling method for regional pollution surveys. The sampling method is described below.
[0045] like Figure 2 As shown, the sampling method of the present invention includes the following steps: Perform step S11 to build a machine learning model; then perform step S12. Execute step S12 to collect geographic environmental data of the target area; then execute step S13. Step S13 is executed, in which the collected geographic environmental data of the target area is input into the machine learning model for prediction to obtain a preliminary pollution risk screening map of the target area; then step S14 is executed. Execute step S14, set up first-level sampling points on the pollution risk screening map of the target area, and collect on-site detection data at the locations corresponding to the first-level sampling points within the target area; then execute step S15; Step S15 is executed, where data fusion technology is used to fuse the on-site detection data collected at the corresponding first-level sampling point with the pollution risk screening map of the target area to generate a first-level pollution distribution prediction map; then step S16 is executed. Execute step S16, set up second-level sampling points in the medium- and high-risk areas of the first-level pollution distribution prediction map, and collect on-site detection data at the locations of the corresponding second-level sampling points within the target area; then execute step S17. Step S17 is executed, where data fusion technology is used to fuse the on-site detection data at the location of the corresponding second-level sampling point with the first-level pollution distribution prediction map to generate the second-level pollution distribution prediction map; then step S18 is executed. Execute step S18 to display the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
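The control flow of steps S13 to S17 is a loop of "place points on the current map, measure, fuse", repeated once per level. The skeleton below sketches that loop only. Every callable passed in is a hypothetical toy stand-in: the map is a dictionary of cell names, and the "fusion" simply overwrites predicted values with measured ones, which is far simpler than Kriging or Bayesian maximum entropy.

```python
def run_survey(initial_map, place_points, measure, fuse, levels=2):
    """Iterate: place points on the current map, measure them, fuse into a new map."""
    current, history = initial_map, []
    for level in range(1, levels + 1):
        points = place_points(current, level)      # S14 / S16: deploy sampling points
        data = {p: measure(p) for p in points}     # S14 / S16: collect field data
        current = fuse(current, data)              # S15 / S17: update the prediction map
        history.append((level, points))
    return current, history

# Toy stand-ins (all hypothetical): the map is a dict of cell name -> risk score.
truth = {"A": 0.9, "B": 0.1, "C": 0.6}
initial = {"A": 0.4, "B": 0.5, "C": 0.7}   # plays the role of the S13 screening map

def place_points(risk_map, level):
    cells = sorted(risk_map)
    if level == 1:
        return cells[::2]                                # sparse, even coverage
    return [c for c in cells if risk_map[c] >= 0.5]      # medium/high risk only

def measure(cell):
    return truth[cell]

def fuse(risk_map, data):
    return {**risk_map, **data}   # toy fusion: measured value overrides prediction

final_map, history = run_survey(initial, place_points, measure, fuse, levels=2)
print(final_map, history)
```

Passing `levels=3` with a `place_points` that handles the hotspot case would cover the optional third-level iteration in the same loop.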
[0046] In one specific embodiment of the present invention, the sampling method further includes: delineating suspected pollution hotspots on the second-level pollution distribution prediction map using the on-site detection data at the locations of the first-level sampling points and the on-site detection data at the locations of the second-level sampling points; setting up third-level sampling points in the delineated suspected pollution hotspots and collecting on-site detection data at the locations corresponding to the third-level sampling points within the target area; and using data fusion technology to fuse the on-site detection data at the third-level sampling points with the second-level pollution distribution prediction map to generate a third-level pollution distribution prediction map.
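The description does not state how a suspected hotspot is delineated. One plausible reading, sketched below in Python under stated assumptions, is to group adjacent high-risk cells of the second-level map into regions and keep only those regions confirmed by at least one measured exceedance. The threshold value, the 4-neighbour adjacency rule, and the function names `delineate_hotspots` and `third_level_points` are all hypothetical.

```python
def delineate_hotspots(risk_map, samples, threshold=0.7):
    """Group 4-connected cells with risk >= threshold into regions.

    risk_map : dict mapping (x, y) cell -> fused risk value
    samples  : dict mapping (x, y) cell -> measured value from the
               first- and second-level sampling points
    A region is kept only if it contains at least one measurement at or
    above the threshold (an assumed confirmation rule).
    """
    hot = {c for c, v in risk_map.items() if v >= threshold}
    regions, seen = [], set()
    for start in sorted(hot):
        if start in seen:
            continue
        region, stack = [], [start]
        seen.add(start)
        while stack:  # depth-first flood fill over neighbouring hot cells
            x, y = stack.pop()
            region.append((x, y))
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in hot and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if any(samples.get(c, 0.0) >= threshold for c in region):
            regions.append(sorted(region))
    return regions


def third_level_points(regions, samples):
    """Propose the not-yet-sampled cells of each hotspot as third-level points."""
    return [c for region in regions for c in region if c not in samples]


# Toy 3 x 3 map with made-up values.
level2_map = {
    (0, 0): 0.9, (1, 0): 0.8, (2, 0): 0.1,
    (0, 1): 0.2, (1, 1): 0.1, (2, 1): 0.1,
    (0, 2): 0.1, (1, 2): 0.1, (2, 2): 0.75,
}
measured = {(0, 0): 0.9, (2, 0): 0.1}
hotspots = delineate_hotspots(level2_map, measured)
print(hotspots, third_level_points(hotspots, measured))
```

The isolated high cell at (2, 2) is predicted but not confirmed by any measurement, so this rule leaves it out. A survey that prefers to investigate unconfirmed predictions would drop the confirmation test.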
[0047] In one specific embodiment of the present invention, the sampling method further includes: after the on-site detection data at the locations of the first-level sampling points is collected, retraining the machine learning model using that data; after the on-site detection data at the locations of the second-level sampling points is collected, retraining the machine learning model using that data; and after the on-site detection data at the locations of the third-level sampling points is collected, retraining the machine learning model using that data.
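The retraining after each sampling level can be sketched as follows. The description does not name a model type, so this example assumes a deliberately small one: a single-feature least-squares line relating a hypothetical proximity index to measured risk. The class `RiskModel` and its methods are illustrative names only; the point of the sketch is that each level's field data is appended to the training set and the model is refitted.

```python
class RiskModel:
    """Single-feature linear model refitted whenever new field data arrives."""

    def __init__(self):
        self.features = []   # e.g. a proximity index per sampled location
        self.targets = []    # measured pollution value per sampled location
        self.slope = 0.0
        self.intercept = 0.0

    def retrain(self, features, targets):
        """Append one level's field data and refit by ordinary least squares."""
        self.features.extend(features)
        self.targets.extend(targets)
        n = len(self.features)
        if n == 0:
            return
        mean_x = sum(self.features) / n
        mean_y = sum(self.targets) / n
        sxx = sum((x - mean_x) ** 2 for x in self.features)
        sxy = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(self.features, self.targets))
        self.slope = sxy / sxx if sxx else 0.0
        self.intercept = mean_y - self.slope * mean_x

    def predict(self, x):
        return self.intercept + self.slope * x


model = RiskModel()
model.retrain([0.0, 1.0], [0.0, 1.0])   # after first-level sampling
model.retrain([2.0], [5.0])             # after second-level sampling
print(model.slope, model.intercept)
```

A production system would more likely use a multi-feature model, but the append-and-refit pattern per sampling level would look the same.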
[0048] In one specific embodiment of the present invention, the collected geographic environmental data of the target area includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
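The "at least three of the following" condition lends itself to a simple input check before modelling starts. The sketch below assumes each data category arrives as a named layer; the category keys and the function name `validate_layers` are hypothetical labels chosen for this example.

```python
REQUIRED_MINIMUM = 3
CATEGORIES = frozenset({
    "site_info",              # basic site information
    "industrial_sources",     # distribution of industrial pollution sources
    "soil_type",
    "geological_background",
    "road_traffic",
    "remote_sensing",         # remote sensing images
})


def validate_layers(layers):
    """Return the sorted names of supplied layers, or raise ValueError.

    layers : dict mapping category name -> data object, or None if missing
    """
    unknown = set(layers) - CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    present = sorted(name for name, data in layers.items() if data is not None)
    if len(present) < REQUIRED_MINIMUM:
        raise ValueError(
            f"need at least {REQUIRED_MINIMUM} categories, got {len(present)}")
    return present


supplied = validate_layers({
    "soil_type": "clay",
    "road_traffic": [120, 340],
    "remote_sensing": "scene-placeholder",
    "site_info": None,
})
print(supplied)
```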
[0049] In one specific embodiment of the present invention, the data fusion technology used includes at least one of Kriging interpolation and the Bayesian maximum entropy method.
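As an illustration of the first of these techniques, the following Python sketch implements ordinary Kriging at a single target location. It is a minimal example under stated assumptions: the semivariogram is an exponential model with hand-picked parameters rather than one fitted to field data, and the helper names `variogram`, `solve` and `ordinary_kriging` are hypothetical. The description does not specify how Kriging is combined with the model's map; one common option is to interpolate the residuals between measurements and predictions and add them back to the map.

```python
from math import exp, hypot


def variogram(h, sill=1.0, rng=3.0, nugget=0.0):
    """Exponential semivariogram; the parameters are assumed, not fitted."""
    return 0.0 if h == 0 else nugget + sill * (1.0 - exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ordinary_kriging(points, values, target):
    """Estimate the value at target from measurements at distinct points."""
    n = len(points)
    # Kriging system: semivariances between samples plus the unbiasedness
    # constraint (weights sum to 1) enforced by a Lagrange multiplier.
    a = [[variogram(hypot(p[0] - q[0], p[1] - q[1])) for q in points] + [1.0]
         for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(hypot(p[0] - target[0], p[1] - target[1])) for p in points]
    b.append(1.0)
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values))


points = [(0.0, 0.0), (2.0, 0.0)]
values = [1.0, 3.0]
midpoint_estimate = ordinary_kriging(points, values, (1.0, 0.0))
print(midpoint_estimate)
```

With a zero nugget the estimator reproduces the measured value at each sampled location, which is the behaviour one would want when field data is treated as ground truth.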
[0050] The present invention has been described in detail above with reference to the accompanying drawings and embodiments. Those skilled in the art can make various modifications to the present invention based on the above description. Therefore, certain details in the embodiments should not be construed as limiting the present invention, and the scope of protection of the present invention shall be defined by the appended claims.
Claims
1. A machine learning-based intelligent decision-making sampling method for regional pollution surveys, characterized in that it includes the following steps: building a machine learning model; collecting geographic environmental data of the target area; inputting the collected geographic environmental data of the target area into the machine learning model for prediction to obtain a preliminary pollution risk screening map of the target area; setting up first-level sampling points on the pollution risk screening map of the target area, and collecting on-site detection data at the locations corresponding to the first-level sampling points within the target area; using data fusion technology to fuse the on-site detection data at the locations of the first-level sampling points with the pollution risk screening map of the target area to generate a first-level pollution distribution prediction map; setting up second-level sampling points in the medium- and high-risk areas of the first-level pollution distribution prediction map, and collecting on-site detection data at the locations corresponding to the second-level sampling points within the target area; using data fusion technology to fuse the on-site detection data at the locations of the second-level sampling points with the first-level pollution distribution prediction map to generate a second-level pollution distribution prediction map; and displaying the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
2. The intelligent decision-making sampling method for regional pollution investigation based on machine learning as described in claim 1, characterized in that it further includes: delineating suspected pollution hotspots on the second-level pollution distribution prediction map using the on-site detection data at the locations of the first-level sampling points and the on-site detection data at the locations of the second-level sampling points; setting up third-level sampling points in the delineated suspected pollution hotspots and collecting on-site detection data at the locations corresponding to the third-level sampling points within the target area; and using data fusion technology to fuse the on-site detection data at the locations of the third-level sampling points with the second-level pollution distribution prediction map to generate a third-level pollution distribution prediction map.
3. The intelligent decision-making sampling method for regional pollution investigation based on machine learning as described in claim 2, characterized in that it further includes: after the on-site detection data at the locations of the first-level sampling points is collected, retraining the machine learning model using that data; after the on-site detection data at the locations of the second-level sampling points is collected, retraining the machine learning model using that data; and after the on-site detection data at the locations of the third-level sampling points is collected, retraining the machine learning model using that data.
4. The intelligent decision-making sampling method for regional pollution investigation based on machine learning as described in claim 1, characterized in that the collected geographic environmental data of the target area includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
5. The intelligent decision-making sampling method for regional pollution investigation based on machine learning as described in claim 1, characterized in that the data fusion technology used includes at least one of Kriging interpolation and the Bayesian maximum entropy method.
6. A regional survey intelligent decision-making sampling system based on machine learning, characterized in that it includes: a predictive modeling module, used to build a machine learning model; a data acquisition module, used to collect geographic environmental data of the target area; a risk screening module, connected to the data acquisition module and the predictive modeling module, and used to input the collected geographic environmental data of the target area into the machine learning model for prediction to obtain a pollution risk screening map of the target area; an intelligent decision-making module, connected to the risk screening module and the data acquisition module, and used to deploy first-level sampling points on the pollution risk screening map of the target area and to collect, through the data acquisition module, on-site detection data at the locations corresponding to the first-level sampling points within the target area, wherein the intelligent decision-making module is also used to fuse the collected on-site detection data at the locations of the first-level sampling points with the pollution risk screening map of the target area using data fusion technology to generate a first-level pollution distribution prediction map, to deploy second-level sampling points in the medium- and high-risk areas of the first-level pollution distribution prediction map and collect, through the data acquisition module, on-site detection data at the locations corresponding to the second-level sampling points within the target area, and to fuse the collected on-site detection data at the locations of the second-level sampling points with the first-level pollution distribution prediction map using data fusion technology to generate a second-level pollution distribution prediction map; and a visualization platform module, connected to the intelligent decision-making module, and used to display the first-level sampling points, the second-level sampling points, and the second-level pollution distribution prediction map.
7. The machine learning-based regional survey intelligent decision sampling system as described in claim 6, characterized in that the intelligent decision-making module is also used to delineate suspected pollution hotspots on the second-level pollution distribution prediction map using the on-site detection data at the locations of the first-level sampling points and the on-site detection data at the locations of the second-level sampling points; to set up third-level sampling points in the delineated suspected pollution hotspots and collect, through the data acquisition module, on-site detection data at the locations corresponding to the third-level sampling points within the target area; and to fuse the on-site detection data at the locations of the third-level sampling points with the second-level pollution distribution prediction map using data fusion technology to generate a third-level pollution distribution prediction map.
8. The machine learning-based regional survey intelligent decision sampling system as described in claim 7, characterized in that the predictive modeling module is also used to receive the on-site detection data at the locations of the first-level sampling points sent by the data acquisition module and to retrain the machine learning model using that data; to receive the on-site detection data at the locations of the second-level sampling points sent by the data acquisition module and to retrain the machine learning model using that data; and to receive the on-site detection data at the locations of the third-level sampling points sent by the data acquisition module and to retrain the machine learning model using that data.
9. The machine learning-based regional survey intelligent decision sampling system as described in claim 6, characterized in that the geographic environmental data collected by the data acquisition module includes at least three of the following: basic site information, distribution of industrial pollution sources, soil type, geological background, road traffic data, and remote sensing images.
10. The machine learning-based regional survey intelligent decision sampling system as described in claim 6, characterized in that the data fusion technology used in the intelligent decision-making module includes at least one of Kriging interpolation and the Bayesian maximum entropy method.