
Model establishing method and system based on feature importance

A model-building and feature-modeling technology, applied in character and pattern recognition, instruments, computer components, etc. It addresses problems such as differences in data retrieval rates, the order of data-source access times, and the limited performance of traditional algorithms.

Active Publication Date: 2020-10-30
深圳无域科技技术有限公司

AI Technical Summary

Problems solved by technology

[0002] In big data risk control modeling, many data sources are usually used. Because the sources differ in data retrieval rate and in the order of their access times, the quality of the data available for risk control modeling ends up low. Faced with such data, traditional algorithms perform poorly and require a large amount of data preprocessing, so they cannot meet the risk control needs of the business. Choosing a higher-performing model is therefore the primary goal. XGBOOST is one of the commonly used algorithms, and this patent is based on it.
[0003] The XGBOOST algorithm uses an ensemble learning scheme to build trees and finally learns N trees. The importance of a feature is the proportion of tree nodes on which it appears. Because overfitting on the test set, model stability, and other factors must be considered during tree building, parameters such as the maximum tree depth and the node-splitting conditions need to be restricted, which ultimately limits the features used across the whole forest. As a result, among a large number of modeling feature variables, most have a feature importance of zero. This is not conducive to subsequent feature screening, and a feature importance of zero does not mean that the feature is useless.
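The node-count definition of importance in [0003] can be illustrated with a minimal sketch. The forest below is hypothetical: each tree is reduced to the list of features used at its split nodes, and the feature names and the helper `split_count_importance` are assumptions made for illustration, not part of the patent or of any library.

```python
from collections import Counter


def split_count_importance(trees, all_features):
    """Importance of a feature = its share of all split nodes in the forest."""
    counts = Counter(feature for tree in trees for feature in tree)
    total = sum(counts.values())
    return {f: (counts[f] / total if total else 0.0) for f in all_features}


# Hypothetical forest: each inner list holds the features used at one tree's split nodes.
trees = [
    ["income", "age", "income"],
    ["income", "tenure"],
    ["age", "income"],
]
all_features = ["income", "age", "tenure", "zip_code", "device_type"]

importance = split_count_importance(trees, all_features)

# Features that never appear on a split node receive an importance of exactly zero,
# even though nothing has been shown about whether they are useful.
zero_features = [f for f, value in importance.items() if value == 0.0]
```

With shallow trees and strict splitting conditions, only a few features ever reach a split node, so `zero_features` grows with the number of candidate variables. This is the screening problem the paragraph describes.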



Examples


Embodiment Construction

[0045] Preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0046] In order to further understand the present invention, the preferred embodiments of the present invention are described below in conjunction with examples, but it should be understood that these descriptions are only to further illustrate the features and advantages of the present invention, rather than limiting the claims of the present invention.

[0047] The description in this part is only for several typical embodiments, and the present invention is not limited to the scope of the description of the embodiments. The mutual replacement of the same or similar prior art means and some technical features in the embodiments is also within the scope of the description and protection of the present invention.

[0048] The present invention discloses a method for building a model based on feature importance. FIG. 1 is a flowchart of ...


Abstract

The invention discloses a model establishing method and system based on feature importance. The method comprises the following steps: S1, initializing the feature data; S2, sampling the feature data to form a plurality of feature data combinations, each of which serves as a sub-model; S3, setting the model parameters of each sub-model, with every feature combination given the same model parameter range; S4, training the models and calculating the feature importance within each sub-model; S5, calculating a comprehensive weighted feature importance across all the sub-models; S6, sorting the features by importance to obtain a new importance ranking; and S7, modeling again according to the new feature ranking. The method and system reduce the fluctuation in the relative ranking of the calculated feature importance and improve its credibility.
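Steps S2 to S7 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not give the weighting used in S5, so the sketch assumes a plain mean over the sub-models that contained each feature. The names `aggregate_importance`, `toy_fit_importance`, and `SIGNAL` are hypothetical, and the toy trainer stands in for an XGBOOST fit by using only the two strongest features it is given, mimicking a depth-limited forest.

```python
import random
from collections import defaultdict

# Hypothetical "true" signal strength of each feature (S1: initialized feature data).
SIGNAL = {"f1": 5.0, "f2": 4.0, "f3": 3.0, "f4": 2.0, "f5": 1.0, "f6": 0.0}


def toy_fit_importance(subset, max_used=2):
    """Stand-in for training a sub-model: only the strongest features get used."""
    used = sorted(subset, key=SIGNAL.get, reverse=True)[:max_used]
    total = sum(SIGNAL[f] for f in used)
    return {f: (SIGNAL[f] / total if f in used and total else 0.0) for f in subset}


def aggregate_importance(features, fit_importance, n_submodels=200, subset_size=3, seed=0):
    """S2-S6: sample feature combinations, score each sub-model, combine, and rank."""
    rng = random.Random(seed)
    totals = defaultdict(float)      # summed importance per feature
    appearances = defaultdict(int)   # number of sub-models that sampled the feature
    for _ in range(n_submodels):
        subset = rng.sample(features, subset_size)   # S2: one feature combination
        scores = fit_importance(subset)              # S3-S4: same settings for every sub-model
        for f in subset:
            totals[f] += scores.get(f, 0.0)
            appearances[f] += 1
    # S5 (assumed weighting): mean importance over the sub-models containing the feature.
    combined = {f: (totals[f] / appearances[f] if appearances[f] else 0.0) for f in features}
    # S6: new importance order, highest first.
    ranking = sorted(features, key=lambda f: combined[f], reverse=True)
    return combined, ranking


features = list(SIGNAL)
baseline = toy_fit_importance(features)                      # one model on all features
combined, ranking = aggregate_importance(features, toy_fit_importance)
top_features = ranking[:4]                                   # S7: candidates for re-modeling
```

In the single `baseline` model, every feature except the two strongest receives zero importance. In `combined`, weaker features receive non-zero scores because some sampled combinations omit the features that would otherwise crowd them out, which gives the later screening step a usable ordering.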

Description

Technical field

[0001] The invention belongs to the technical field of big data processing and relates to a model building method, in particular to a feature importance-based model building method and system.

Background

[0002] In big data risk control modeling, a variety of data sources are usually used. Because the sources differ in data retrieval rate and in the order of their access times, the quality of the data available for risk control modeling ends up low. Faced with such data, traditional algorithms perform poorly and require a large amount of data preprocessing, so they cannot meet the risk control needs of the business. Choosing a higher-performing model is therefore the primary goal. XGBOOST is one of the commonly used algorithms, and this patent is based on it.

[0003] The XGBOOST algorithm uses an ensemble learning scheme to build trees, and finally learns N t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/214
Inventor: 林建明
Owner: 深圳无域科技技术有限公司