Method and device for screening machine learning features

A machine learning feature screening technology, applied in the fields of instruments, computer components, and character and pattern recognition, that addresses problems such as the heavy workload of manual feature selection, prolonged modeling cycles, and degraded model training results.

Active Publication Date: 2017-12-15
ZHEJIANG TMALL TECH CO LTD

AI Technical Summary

Problems solved by technology

[0002] At present, financial models are typically built by first collecting a large number of features and then applying machine learning algorithms to classify or regress over large data sets. To identify the features that actually contribute to the model, all candidate features must be sorted through. In the related art, features are selected manually on the basis of business experience and then optimized across dimensions such as interpretability and performance indicators. The features that may affect a financial model are numerous and complex: according to preliminary statistics, thousands or even tens of thousands of features can be used for training. Every intermediate step requires substantial manual intervention, so the workload of manual screening is very large, and the resulting lengthening of the modeling cycle has become the bottleneck of the entire model development cycle. Moreover, feature selection directly affects how well the model trains, so modelers need strong business experience, which greatly reduces work efficiency.

Examples

Embodiment 1

[0145] As shown in Figure 3, this embodiment of the invention illustrates the steps of the machine learning feature screening task:

[0146] In the first step, the features are screened with a machine learning algorithm, and features that are obviously irrelevant to the target variable are filtered out.

[0147] In the second step, the remaining features are used to train a model and its performance is evaluated. A logistic regression algorithm is trained on the remaining features, the AUC and KS indicators used to evaluate model performance are calculated, and w1, w2, ..., wn are recorded as the weight of each feature in the model.

[0148] In the third step, each feature is removed in turn for model training and evaluation. Each feature is removed separately and the logistic regression model is retrained to obtain the evaluation indicators. The performance indicators of the model with the i-th feature removed are AUC_i and KS_i.

[0149] The fourth s...
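
The second and third steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names (`auc`, `ks`, `fit_logistic`, `evaluate`, `leave_one_out`), the plain gradient-descent logistic regression, its learning rate and epoch count, and the toy data are all hypothetical, and the model is scored on its own training rows only for brevity.

```python
import math


def auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks(y_true, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(1 for y, s in zip(y_true, scores) if y == 1 and s <= t) / n_pos
        cdf_neg = sum(1 for y, s in zip(y_true, scores) if y == 0 and s <= t) / n_neg
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def evaluate(X, y, columns):
    """Train on the given column indices; return (AUC, KS, weights)."""
    sub = [[row[j] for j in columns] for row in X]
    w, b = fit_logistic(sub, y)
    # The linear score is monotone in the predicted probability,
    # so AUC and KS are the same for either.
    scores = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in sub]
    return auc(y, scores), ks(y, scores), w


def leave_one_out(X, y):
    """Second step: fit on all features. Third step: refit with each feature removed."""
    all_cols = list(range(len(X[0])))
    baseline = evaluate(X, y, all_cols)  # (AUC, KS, [w1, ..., wn])
    per_feature = {}
    for i in all_cols:
        kept = [j for j in all_cols if j != i]
        auc_i, ks_i, _ = evaluate(X, y, kept)
        per_feature[i] = (auc_i, ks_i)  # (AUC_i, KS_i)
    return baseline, per_feature


# Hypothetical toy data: column 0 separates the classes, column 1 is mostly noise.
X = [[0.1, 0.5], [0.2, 0.1], [0.3, 0.9], [0.7, 0.4], [0.8, 0.6], [0.9, 0.2]]
y = [0, 0, 0, 1, 1, 1]
baseline, per_feature = leave_one_out(X, y)
```

Retraining once per feature costs n + 1 model fits for n features, which is presumably why the first step prunes clearly irrelevant features before this loop runs.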

Abstract

The invention provides a method and a device for screening machine learning features. The method and the device relate to the field of machine learning models. The method comprises the steps of: performing preliminary screening on collected features by using a machine learning algorithm, so as to obtain a training feature set; carrying out model training by utilizing all features in the training feature set, and acquiring a performance result of total features of the training feature set; removing each feature in the training feature set separately, carrying out model training by utilizing the remaining features in the training feature set, and acquiring a performance result of the training feature set after the feature is removed; comparing the performance result of the total features and the performance result after the feature is removed, so as to obtain a performance result attenuation rate after the feature is removed; and determining the features meeting a preset condition to be a screened feature set according to the attenuation rate. The method and the device minimize the complexity of the model, thereby greatly reducing labor cost and time cost, and improving working efficiency.
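
The comparison and selection steps might look like the sketch below. The abstract does not give a formula for the attenuation rate or state the preset condition, so the relative-drop definition, the `min_rate` threshold, the function names, and the feature names and AUC values are all assumptions made up for illustration.

```python
def attenuation_rate(full, reduced):
    """Assumed definition: relative drop in a metric after one feature is removed."""
    return (full - reduced) / full


def screen(full_metric, metric_without, min_rate=0.01):
    """Keep features whose removal lowers the metric by at least min_rate (assumed condition)."""
    rates = {
        name: attenuation_rate(full_metric, value)
        for name, value in metric_without.items()
    }
    kept = [name for name, rate in rates.items() if rate >= min_rate]
    return kept, rates


# Illustrative numbers only, not results from the patent.
full_auc = 0.80
auc_without = {"income": 0.72, "age": 0.80, "zip_code": 0.796}
kept, rates = screen(full_auc, auc_without)
```

Under this assumed rule, a feature whose removal barely changes the metric is treated as redundant and dropped, which is one way to read the abstract's goal of minimizing model complexity.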

Description

Technical field

[0001] The invention relates to the field of machine learning models, and in particular to a method and device for screening machine learning features.

Background

[0002] At present, financial models are typically built by first collecting a large number of features and then applying machine learning algorithms to classify or regress over large data sets. To identify the features that actually contribute to the model, all candidate features must be sorted through. In the related art, features are selected manually on the basis of business experience and then optimized across dimensions such as interpretability and performance indicators. The features that may affect a financial model are numerous and complex: according to preliminary statistics, thousands or even tens of thousands of features can be used for training. Each step in the middle require...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06Q40/00
CPC: G06Q40/00, G06F18/2155
Inventor: 张柯, 褚巍, 施兴, 姜晓燕
Owner: ZHEJIANG TMALL TECH CO LTD