Data feature screening method and device and computer equipment

A data feature screening technology, applied in computing, data processing applications, and other database retrieval. It addresses problems such as inaccurate measurement of a single feature, heavy resource usage, and long running time, and achieves reduced model complexity, faster processing, and higher efficiency.

Active Publication Date: 2021-01-08
SHANGHAI ICEKREDIT INC

AI Technical Summary

Problems solved by technology

[0004] However, the above techniques take too long to screen feature combinations, occupy too many resources, and measure a single feature inaccurately.



Embodiment Construction

[0053] To make the above technical solutions easier to understand, they are described in detail below with reference to the accompanying drawings and specific examples. It should be understood that the embodiments of the present invention, and the specific features in those embodiments, are detailed descriptions of the technical solutions of the present invention and do not limit them. Where no conflict arises, the embodiments and the technical features within them can be combined with each other.

[0054] The inventor found through investigation and research that the main steps of the prior art are as follows.

[0055] 1) Computer equipment obtains data with binary labels from text files or databases. The two label values are generally called positive examples and negative examples. Normal repayment is a negative example. The computer obtains all the ...


Abstract

In the data feature screening method, device, and computer equipment provided by the embodiments of the invention, most features with relatively large trend fluctuations are eliminated during screening by means of a correlation coefficient. The whole process involves only correlation calculations and requires no visualization, so it is faster and more efficient overall. Screening throughout the process reduces the number of variables that finally enter the model, which simplifies the model, lowers the overall service cost, and improves the interpretability of the model. Specifically, the binning result table produced during the IV calculation is used to compute the correlation between each bin value and the corresponding positive example ratio, which measures the trend of the feature. The feature trend is then combined with the IV value to screen features, so that the features entering the model have good trends, are measured accurately, and are highly interpretable, while less time and fewer computer resources are consumed.
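The screening idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which correlation coefficient is used, so Pearson correlation is assumed here, and the function names, the 0.5 smoothing term, and the thresholds `iv_min` and `corr_min` are hypothetical choices made only for illustration. Each bin is given as a `(bin_value, n_pos, n_neg)` row, standing in for the binning result table.

```python
import math


def iv_and_pos_rates(bins):
    """bins: list of (bin_value, n_pos, n_neg), one row per non-empty bin.

    Returns (iv, bin_values, pos_rates).
    """
    total_pos = sum(pos for _, pos, _ in bins)
    total_neg = sum(neg for _, _, neg in bins)
    k = len(bins)
    iv = 0.0
    values, rates = [], []
    for value, pos, neg in bins:
        # Assumption: 0.5 additive smoothing to avoid log(0) in sparse bins.
        pos_share = (pos + 0.5) / (total_pos + 0.5 * k)
        neg_share = (neg + 0.5) / (total_neg + 0.5 * k)
        woe = math.log(pos_share / neg_share)
        iv += (pos_share - neg_share) * woe
        values.append(value)
        rates.append(pos / (pos + neg))  # positive example ratio of the bin
    return iv, values, rates


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def screen_feature(bins, iv_min=0.02, corr_min=0.8):
    """Keep a feature only if it is predictive (IV) and has a steady trend.

    iv_min and corr_min are hypothetical thresholds, not values from the source.
    """
    iv, values, rates = iv_and_pos_rates(bins)
    trend = pearson(values, rates)
    keep = iv >= iv_min and abs(trend) >= corr_min
    return keep, iv, trend


# A feature whose positive ratio rises steadily across bins is kept;
# one whose ratio zigzags is rejected even though its IV is non-trivial.
steady = [(1, 10, 90), (2, 20, 80), (3, 30, 70), (4, 40, 60)]
zigzag = [(1, 10, 90), (2, 40, 60), (3, 10, 90), (4, 40, 60)]
print(screen_feature(steady))
print(screen_feature(zigzag))
```

Because the bin counts are already available from the IV calculation, the trend check adds only one correlation per feature and needs no plotting, which matches the efficiency argument made above.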

Description

Technical field

[0001] The present invention relates to the technical field of data feature processing, in particular, to a data feature screening method, device and computer equipment.

Background technique

[0002] In the process of building a risk control model, an important step is feature engineering. Feature engineering refers to the process of using professional background knowledge and skills to process data to generate features that can better describe the data, and using these features to make machine learning algorithms play a better role. The process includes modules such as feature extraction, feature construction, and feature screening.

[0003] The linear model represented by logistic regression is widely used in the industry as a model with strong interpretability. In the feature screening module of linear model modeling, the usual practice is to first filter the features through EDA (exploratory data analysis), such as filtering out high missing rate, const...
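The EDA pre-filter mentioned in the background can be sketched for the one criterion the text states in full, a high missing rate. This is an illustrative sketch only: the function names, the column-dictionary layout, the use of `None` for missing values, and the `max_missing` threshold are all assumptions and do not come from the source.

```python
def missing_rate(values):
    """Fraction of entries that are missing (None); empty columns count as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def drop_high_missing(columns, max_missing=0.9):
    """Keep only columns whose missing rate does not exceed max_missing.

    columns: dict mapping feature name to a list of values.
    max_missing is a hypothetical threshold chosen for illustration.
    """
    return {
        name: vals
        for name, vals in columns.items()
        if missing_rate(vals) <= max_missing
    }


data = {
    "age": [30, None, 45, 50],
    "fax_number": [None, None, None, None],
}
kept = drop_high_missing(data, max_missing=0.5)
print(sorted(kept))
```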


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/901, G06F16/906, G06Q10/06
CPC: G06Q10/06393, G06F16/9027, G06F16/906
Inventor: 顾凌云, 谢旻旗, 段湾, 陶雨婕, 张涛, 潘峻
Owner: SHANGHAI ICEKREDIT INC