High-risk pollution source classification forecasting method based on principal component analysis and random forest

A method based on principal component analysis and random forest, applied to the classification and prediction of high-risk pollution sources. It addresses three problems: classification accuracy that cannot be effectively improved, correlation among input variables being ignored, and reduced modeling efficiency. Its effects are fewer input index factors, higher prediction accuracy and result quality, and lower operational complexity.

Inactive Publication Date: 2017-12-15
SHENZHEN POWERDATA INFO TECH CO LTD
View PDF0 Cites 15 Cited by

AI Technical Summary

Problems solved by technology

However, the classification problem is, in theory, a complex function extension problem, so no single classification model suits every situation. Many classification methods have emerged, but the main problems are as follows. In the field of data analysis, although many classification prediction methods are available, few have been applied to predicting high-risk pollution sources among enterprises.
The many artificial intelligence classification algorithms that have appeared offer a highly nonlinear mapping ability, which overcomes the shortcomings of many traditional statistical classification algorithms. In practical applications, however, many of them ignore the correlation between input variables, and in actual modeling too many input variables also reduce modeling efficiency.
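The correlation between input variables mentioned above can be checked before modeling. The following is a minimal sketch, not taken from the patent: the indicator names, the toy values, and the 0.9 threshold are all assumptions chosen for illustration. It computes the Pearson correlation for every pair of indicators and flags the strongly correlated pairs.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every indicator pair with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical indicator values for five enterprises (made up for illustration).
indicators = {
    "wastewater_volume": [10.0, 20.0, 30.0, 40.0, 50.0],
    "cod_discharge": [1.1, 2.0, 3.2, 3.9, 5.1],
    "complaint_count": [3.0, 0.0, 4.0, 1.0, 2.0],
}

flagged = correlated_pairs(indicators)
for a, b, r in flagged:
    print(f"{a} and {b} are strongly correlated (r = {r:.3f})")
```

Pairs flagged this way carry largely redundant information, which is the situation in which a dimensionality reduction step such as principal component analysis is typically useful.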
[0004] Usually, the modeler applies several different single classification methods to the same classification problem under different assumptions, builds multiple classification models, selects the one with the best classification accuracy, and discards the others. This does not effectively improve classification accuracy.
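As a hedged illustration of why discarding all but one model can waste information, the sketch below combines three classifiers by majority vote instead. The predictions are toy values constructed so that the models make errors on different samples; they are not results from the patent, and real gains depend on how diverse the individual errors are. A random forest applies the same voting idea to many decision trees trained on bootstrap samples.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several per-model label lists into one list by majority vote."""
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy labels: 1 = illegal pollution source, 0 = compliant (made up for illustration).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 0, 1, 1]  # wrong on samples 3 and 7
model_b = [1, 1, 1, 1, 0, 0, 0, 0]  # wrong on samples 1 and 6
model_c = [0, 0, 1, 1, 0, 1, 1, 0]  # wrong on samples 0 and 5

best_single = max(accuracy(m, actual) for m in (model_a, model_b, model_c))
voted = majority_vote([model_a, model_b, model_c])
voted_accuracy = accuracy(voted, actual)

print("best single model:", best_single)
print("majority vote:", voted_accuracy)
```

With an odd number of binary classifiers there are no ties; other settings would need an explicit tie-breaking rule.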



Examples


Embodiment Construction

[0060] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0061] In this embodiment, the flow chart of the high-risk pollution source classification and prediction method based on principal component analysis and random forest is shown in Figure 1. In Figure 1, the classification and prediction method of high-risk pollution sources based on prin...


Abstract

The invention discloses a high-risk pollution source classification and forecasting method based on principal component analysis and random forest. The method includes the following steps: collecting and integrating the environmental pollution source behavior data of enterprises into preliminary indexes, and screening out the illegal pollution source behavior indexes that influence pollution sources to form a high-risk pollution source index system; cleaning and normalizing the environmental pollution source behavior data; determining the functional relationship between the high-risk pollution source index system and whether a pollution source is illegal, and building a random forest model; training the model and evaluating the precision of the random forest model after training; ranking the pollution source behavior indexes by importance; conducting principal component analysis to obtain principal components, and weighting the principal components to compute comprehensive scores; and, according to the comprehensive scores, determining the risk score coefficient of each enterprise, automatically ranking the risk score coefficients, and generating a TOP enterprise list, where each risk score coefficient indicates the probability that the corresponding enterprise commits illegal behavior. The method can reduce operational complexity and improve forecasting precision and the quality of results.
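The scoring and ranking steps of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes min-max normalization, weights each principal component by its share of the retained variance (the excerpt does not specify the weights), finds the components by power iteration, and uses hypothetical enterprise names and toy indicator values. The random forest training and importance-ranking steps are omitted, and the resulting composite score is a relative ranking value, not a calibrated probability.

```python
from math import sqrt


def min_max_normalize(rows):
    """Scale each column of a row-major table to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def top_components(cov, k, iters=500):
    """Leading k (eigenvalue, eigenvector) pairs via power iteration with deflation.

    Starting from an all-positive vector orients each component so that its
    loadings have a non-negative sum (an assumed sign convention).
    """
    d = len(cov)
    work = [row[:] for row in cov]
    components = []
    for _ in range(k):
        v = [1.0 / sqrt(d)] * d
        for _ in range(iters):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append((lam, v))
        # Deflate: remove the component just found before searching for the next.
        work = [[work[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return components


def risk_ranking(names, rows, k=2, top_n=3):
    """Rank enterprises by a variance-weighted sum of their principal component scores."""
    cov, centered = covariance_matrix(min_max_normalize(rows))
    components = top_components(cov, k)
    total = sum(lam for lam, _ in components) or 1.0
    scores = []
    for name, c in zip(names, centered):
        score = sum((lam / total) * sum(ci * vi for ci, vi in zip(c, v))
                    for lam, v in components)
        scores.append((name, score))
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical enterprises with three made-up behavior indicators each.
names = ["A", "B", "C", "D"]
rows = [[1, 10, 0], [2, 20, 1], [3, 35, 1], [5, 50, 4]]
top_list = risk_ranking(names, rows, k=2, top_n=3)
for name, score in top_list:
    print(name, round(score, 3))
```

In practice a numerical library would normally replace the hand-written power iteration; it is written out here only to keep the sketch free of third-party dependencies.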

Description

Technical field

[0001] The invention relates to the field of prediction of high-risk pollution sources, and in particular to a classification and prediction method for high-risk pollution sources based on principal component analysis and random forest.

Background

[0002] With the development of environmental informatization in recent years, environmental protection departments at all levels have established a large number of environmental business application systems. In this process, however, environmental big data has become seriously departmentalized, localized, and scattered, and an efficient, scientific, and clear management mechanism is lacking. Pollution source data is the core foundation of environmental management. Predicting in advance the high-risk pollution sources that may cause environmental pollution risks and illegal activities is of great significance for more targeted pollution control.

[0003] Prediction of h...


Application Information

IPC(8): G06Q10/04, G06Q10/06, G06Q50/26, G06F17/30
CPC: G06Q10/04, G06F16/215, G06Q10/0635, G06Q10/06393, G06Q50/265
Inventors: 康庆, 罗艳, 唐文超, 庞东博, 王登优
Owner: SHENZHEN POWERDATA INFO TECH CO LTD