Improved LR-Bagging algorithm based on characteristic selection

A feature selection based algorithm, applied in computing and data processing, that addresses the problems of too few model variables, models that are not direct enough, and unsatisfactory prediction results. It improves the diversity of classification results and reduces the possibility of multicollinearity and overfitting.

Status: Inactive
Publication Date: 2016-12-21
GUIZHOU POWER GRID INFORMATION & TELECOMM

AI Technical Summary

Problems solved by technology

[0003] Domestic researchers began studying electricity arrears relatively late. Their work has focused mainly on theoretical analysis of the current situation, influencing factors, evaluation, and countermeasures for electricity charge recovery risk, and it lacks the support of quantitative models built on real data. Although many studies predict arrears risk by modeling the credit rating of power customers, such models address the risk only indirectly. With the rapid development of big data mining, algorithms based on logistic regression and on decision trees have appeared in recent years. However, the features selected by the former are binary variables, so its applicability is low and the number of variables is small; the model variables selected by the latter are more diverse, but its prediction results are unsatisfactory.


Examples

Embodiment 1

[0034] This embodiment applies the improved algorithm of this description to predict which residential customers in Guiyang are at high risk of electricity arrears. Following the model building and solution process shown in Figure 2, the specific steps are as follows:

[0035] Step 1: Determine the initial data set from the original data; the correlation between each independent variable and the dependent variable must not be too low;

[0036] Step 2: Apply WEO coding to the discrete independent variables;

[0037] Step 3: Train and test the base LR models and integrate them into the combined model;

[0038] Step 4: Repeat step 3 iteratively until the combined model performs better;

[0039] Step 5: Use the best combined model for prediction and evaluation.
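
Step 2 does not spell out the encoding. Assuming "WEO" refers to the weight-of-evidence (WOE) encoding commonly paired with logistic regression, a minimal sketch could look like the following. The function name `woe_encode`, the smoothing constant, the sign convention (positive values mean a larger share of arrears cases), and the payment-channel categories are all illustrative assumptions, not taken from the patent.

```python
import math
from collections import Counter

def woe_encode(categories, labels, smoothing=0.5):
    """Map each category to ln(share of positives / share of negatives).

    `smoothing` is an assumed additive constant that keeps the logarithm
    finite when a category has no positive or no negative records.
    """
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    levels = set(categories)
    pos_total = sum(pos.values()) + smoothing * len(levels)
    neg_total = sum(neg.values()) + smoothing * len(levels)
    table = {}
    for c in levels:
        p = (pos[c] + smoothing) / pos_total   # smoothed share of arrears cases
        n = (neg[c] + smoothing) / neg_total   # smoothed share of non-arrears cases
        table[c] = math.log(p / n)
    return table

# Hypothetical discrete variable: payment channel; label 1 = customer in arrears.
table = woe_encode(["cash", "cash", "bank", "bank", "bank", "cash"],
                   [1, 1, 0, 0, 1, 0])
encoded = [table[c] for c in ["cash", "bank"]]
```

Each discrete value is replaced by a single number, so the base LR models receive one numeric column per discrete variable rather than a set of binary indicators.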

[0040] Step 1 is described in detail as follows:

[0041] The application data used in the present invention come from the arrears records of residential grid customers in Guiyang City, ...
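
The requirement in step 1 that the correlation "cannot be too low" can be illustrated with a simple filter. This is only a sketch: the text does not name a correlation measure or a cutoff, so the Pearson coefficient, the threshold `min_abs_corr`, and the column names below are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    # A constant column carries no information; treat its correlation as 0.
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def select_features(columns, target, min_abs_corr=0.1):
    """Keep the names of columns whose |correlation| with the target reaches the cutoff."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= min_abs_corr]

# Hypothetical columns; target 1 = customer in arrears.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "late_payments": [0, 1, 0, 1, 3, 4, 2, 5],
    "noise_flag": [0, 1, 0, 1, 0, 1, 0, 1],
}
kept = select_features(columns, target, min_abs_corr=0.3)
```

Here `noise_flag` is uncorrelated with the target and is dropped, while `late_payments` is kept.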

Abstract

The invention discloses an improved LR-Bagging algorithm based on feature selection, which comprises the following steps. First, the initial data set is determined from the original data, with the requirement that the correlation between each independent variable and the dependent variable not be too low. Second, the discrete independent variables of the initial data set are encoded with WEO. Then random sampling is used to draw a certain number of records and feature fields to form a training sample, an LR (logistic regression) model is trained on the sample, and a normal significance test is performed on its coefficients. If the coefficients are not significant, the model is discarded; otherwise it is added to the combined model. These iterations are repeated until the combined model performs better. Finally, the better combined model is used for prediction and grouping. The algorithm can improve the diversity of classification results, the extent to which variable information is extracted, and the prediction results. It can also effectively reduce the possibility of multicollinearity and overfitting caused by too many variables in the base LR model.
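
The sampling, significance test, and combination described in the abstract can be sketched in plain Python as follows. Several choices here are assumptions rather than details from the patent: records are drawn with replacement, a base model is accepted only if every non-intercept coefficient has a two-sided normal p-value below `alpha = 0.05`, the combined model averages the accepted models' probabilities, and a fixed number of rounds stands in for the unspecified stopping rule. The data are synthetic.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def hessian(X, beta, ridge=1e-6):
    """Return fitted probabilities and X^T W X (plus a tiny ridge for stability)."""
    p = [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]
    k = len(beta)
    H = [[ridge if a == c else 0.0 for c in range(k)] for a in range(k)]
    for row, pi in zip(X, p):
        w = pi * (1.0 - pi)
        for a in range(k):
            for c in range(k):
                H[a][c] += w * row[a] * row[c]
    return p, H

def fit_lr(X, y, iters=15):
    """Newton-Raphson logistic regression; returns (coefficients, p-values)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        p, H = hessian(X, beta)
        grad = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
                for j in range(k)]
        beta = [b + s for b, s in zip(beta, solve(H, grad))]
    _, H = hessian(X, beta)
    pvals = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se = math.sqrt(solve(H, unit)[j])  # j-th diagonal entry of H^-1
        pvals.append(math.erfc(abs(beta[j] / se) / math.sqrt(2.0)))  # two-sided normal test
    return beta, pvals

def lr_bagging(X, y, rounds=40, n_records=200, n_features=2, alpha=0.05, seed=0):
    """Keep only base models whose non-intercept coefficients are all significant."""
    rng = random.Random(seed)
    models = []
    for _ in range(rounds):
        rows = rng.choices(range(len(X)), k=n_records)            # sample records
        feats = sorted(rng.sample(range(len(X[0])), n_features))  # sample feature fields
        Xs = [[1.0] + [X[i][f] for f in feats] for i in rows]     # prepend intercept
        ys = [y[i] for i in rows]
        beta, pvals = fit_lr(Xs, ys)
        if all(p < alpha for p in pvals[1:]):
            models.append((feats, beta))
    return models

def predict(models, x):
    """Average the probabilities of the accepted base models."""
    probs = [sigmoid(b[0] + sum(c * x[f] for c, f in zip(b[1:], feats)))
             for feats, b in models]
    return sum(probs) / len(probs)

# Synthetic data: features 0 and 1 drive the label, feature 2 is pure noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(400)]
y = [1 if rng.random() < sigmoid(1.5 * r[0] - 1.5 * r[1]) else 0 for r in X]
models = lr_bagging(X, y)
accuracy = sum((predict(models, r) >= 0.5) == (t == 1) for r, t in zip(X, y)) / len(X)
```

Because each base model sees only a small random subset of feature fields, it has few coefficients to estimate, which is how the approach limits multicollinearity and overfitting; the significance filter then discards base models built on uninformative fields.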

Description

Technical field

[0001] The invention relates to the field of electricity risk probability classification and prediction, in particular to an improved LR-Bagging algorithm based on feature selection.

Background

[0002] The deepening reform of China's electric power system has introduced a market mechanism into the power industry. While this effectively optimizes the allocation of power resources and improves the efficiency of power production and transmission, it also brings greater market risk to electric power companies. The risk of recovering electricity fees from customers in arrears has always been one of the major risks in electricity marketing. As an effective way for power companies to recover funds, electricity bills sustain the normal operation of the economic chain of supply, production, and sales in the power system, yet arrears occur again and again. It is very important for electric power c...

Application Information

IPC(8): G06Q50/06
CPC: G06Q50/06
Inventor 吴漾朱州谭驰曾路王鹏宇王玮罗念华吴忠张克贤郭仁超杨箴方继宇龙娜钱俊凤王倩冰陆岫昶
Owner GUIZHOU POWER GRID INFORMATION & TELECOMM