Method for high-dimensional variable screening of credit variable data

A variable screening technique for credit data, applicable to data processing, complex mathematical operations, instruments, etc. It addresses the problem that high-dimensional variables cannot be adequately screened, and it improves validity and stability, optimizes the credit data subset, and improves discrimination accuracy.

Pending Publication Date: 2022-03-01
武汉众邦银行股份有限公司

AI Technical Summary

Problems solved by technology

This method also considers the KS value and the feature contribution of each variable to ensure a certain degree of accuracy. However, the KS value only represents a variable's ability to distinguish good customers from bad ones, so relying on it alone has limitations.
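To make the KS value concrete, the following is a minimal sketch of how a single variable's KS statistic is commonly computed: the largest gap between the cumulative distributions of the variable among bad and good customers. The function name `ks_statistic` and the 0/1 label encoding (1 = default, 0 = normal repayment) are assumptions for illustration and are not taken from the patent text.

```python
def ks_statistic(values, labels):
    """Largest gap between the cumulative distributions of a variable
    among bad (label 1) and good (label 0) customers.
    The 0/1 encoding is an assumption made for this sketch."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both classes must be present")

    pairs = sorted(zip(values, labels))
    cum_bad = 0
    cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all samples that share the same value before comparing,
        # so that ties do not create an artificial gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


# A variable that separates the classes perfectly, and one that carries no signal.
separating = ks_statistic([1, 2, 3, 4], [0, 0, 1, 1])
uninformative = ks_statistic([1, 1, 1, 1], [0, 1, 0, 1])
```

A KS value near 1 indicates strong separation and a value near 0 indicates none, which is why KS alone says little about a variable's stability or its redundancy with other variables.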

Examples

Embodiment Construction

[0035] Customer application data, lending data, loan overdue data, and People's Bank of China credit-report data are extracted to form the high-dimensional credit variable data, which serve as the basic credit data variables. A user's transaction record data has the following form:

[0036]
Field                   Value
Time                    Xxxx
Name                    Xxxx
ID number               Xxxx
Residence address       Xxxx
Application amount      Xxxx
Loan amount             Xxxx
Days overdue            Xxxx
...                     Xxxx
Amount repaid           Xxxx
Total amount of loans   Xxxx

[0037] The basic format of a single record is shown in the table above. A single user's behavior sequence consists of a series of such transaction records.

[0038] The process is as follows:

[0039] Step 1: obtain all relevant application data, lending data, overdue data, and credit-report data. The credit data is labeled, and the labels distinguish normal-repayment customers from default customers. The data file is saved as a 100000 × 430 matrix, which gives the credit variable data...
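As an illustration of the labeled matrix described in Step 1, the sketch below models a small dataset with one row per customer, one column per variable, and one tag per row. The class name `CreditDataset`, the variable names, and the encoding (0 = normal repayment, 1 = default) are hypothetical; the source only states that the data is stored as a labeled 100000 × 430 matrix.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CreditDataset:
    """Hypothetical container: rows are customers, columns are variables."""
    feature_names: List[str]
    rows: List[List[float]]
    tags: List[int]  # assumed encoding: 0 = normal repayment, 1 = default

    def __post_init__(self):
        # Basic consistency checks on the matrix and its tags.
        if len(self.rows) != len(self.tags):
            raise ValueError("each row needs exactly one tag")
        for row in self.rows:
            if len(row) != len(self.feature_names):
                raise ValueError("row width must match the number of variables")
        if any(t not in (0, 1) for t in self.tags):
            raise ValueError("tags must be 0 or 1")

    def shape(self):
        """(number of customers, number of variables)."""
        return (len(self.rows), len(self.feature_names))

    def column(self, name):
        """All values of one variable, in row order."""
        idx = self.feature_names.index(name)
        return [row[idx] for row in self.rows]

    def default_rate(self):
        """Share of customers tagged as default."""
        return sum(self.tags) / len(self.tags)


# Toy data with made-up values; the real matrix would be 100000 x 430.
toy = CreditDataset(
    feature_names=["application_amount", "days_overdue"],
    rows=[[5000.0, 0.0], [12000.0, 35.0], [8000.0, 2.0]],
    tags=[0, 1, 0],
)
```

Keeping the tags separate from the variable columns makes it straightforward to evaluate each variable against the same label vector in the later screening steps.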

Abstract

The invention relates to the field of credit scoring modeling and provides a method for high-dimensional variable screening of credit variable data. It addresses the problem of how to fully screen high-dimensional variables by constructing a new variable screening method that assists credit scoring modeling and helps ensure the accuracy of the resulting model. The main scheme comprises the following steps: obtaining application data, loan data, overdue data, and credit-report data; taking the data from a given period and performing a preliminary screening on it according to the cumauc induction method, selecting the data with large AUC values; performing chi-square binning on the preliminarily screened variables, which gives each piece of data an independent weight, introduces nonlinearity for the subsequent scoring model, and reduces the risk of model over-fitting, and then selecting the data with high weight; and performing stepwise regression analysis on the screened data to finally obtain all data that meet the screening conditions.
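The screening steps named in the abstract can be sketched roughly as follows. This is not the patent's cumauc procedure, whose details are not given in this excerpt; it is a plain single-variable AUC filter followed by a standard ChiMerge-style binning, shown only to make the two ideas concrete. The function names, the threshold of 0.6, and the 0/1 label encoding (1 = default) are assumptions.

```python
def auc(values, labels):
    """AUC of one variable used directly as a score (pairwise definition,
    ties count one half). Quadratic in sample size, fine for a sketch."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def screen_by_auc(columns, labels, threshold=0.6):
    """Keep variables whose AUC is far from 0.5 in either direction.
    `threshold` is a hypothetical cut-off chosen for illustration."""
    kept = {}
    for name, values in columns.items():
        a = auc(values, labels)
        strength = max(a, 1.0 - a)
        if strength >= threshold:
            kept[name] = strength
    return kept


def chi2_pair(bin_a, bin_b):
    """Chi-square statistic of two adjacent bins, each (good_count, bad_count)."""
    total = sum(bin_a) + sum(bin_b)
    chi2 = 0.0
    for row in (bin_a, bin_b):
        for k in (0, 1):
            expected = sum(row) * (bin_a[k] + bin_b[k]) / total
            if expected > 0:
                chi2 += (row[k] - expected) ** 2 / expected
    return chi2


def chi_merge(bins, max_bins):
    """Repeatedly merge the most similar adjacent bins until max_bins remain."""
    bins = [tuple(b) for b in bins]
    while len(bins) > max_bins:
        scores = [chi2_pair(bins[i], bins[i + 1]) for i in range(len(bins) - 1)]
        i = scores.index(min(scores))
        merged = (bins[i][0] + bins[i + 1][0], bins[i][1] + bins[i + 1][1])
        bins[i:i + 2] = [merged]
    return bins


# Toy data with made-up values.
labels = [0, 0, 1, 1]
columns = {"days_overdue": [0, 1, 30, 45], "noise": [1, 1, 1, 1]}
kept = screen_by_auc(columns, labels)
merged_bins = chi_merge([(10, 0), (10, 0), (0, 10), (0, 10)], max_bins=2)
```

In this toy run the uninformative variable is dropped by the AUC filter, and the four initial bins collapse into one all-good and one all-bad bin. The stepwise regression step would then operate on the binned variables that survive both filters.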

Description

Technical field

[0001] The present invention relates to the field of credit score modeling, and specifically to a method for high-dimensional variable screening of credit variable data.

Background

[0002] Risk control technology is one of the cornerstones of modern finance, and the risk control model plays a key role in it. In the era of Internet finance, because the customers served are individuals and small and micro enterprises, automated risk control is the only way to reduce risk costs, and risk control models are widely used in automated risk control.

[0003] The effectiveness of a risk control model depends on its underlying data, so the modeling sample set and the model input variables have a critical impact on the model's validity. Before building a risk control model, financial companies acquire a large number of basic and derived variables and screen them to select the model input variables. Usually, the original model input varia...

Application Information

IPC(8): G06Q40/02, G06F17/18
CPC: G06F17/18, G06Q40/03
Inventor: 钟磊田羽刘银龙段笑游江珊
Owner: 武汉众邦银行股份有限公司