Interception strategy derivation method and system for credit anti-fraud

A strategy derivation technology for credit anti-fraud, applied in data processing applications, finance, instruments, etc. It addresses problems such as the inability of traditional anti-fraud methods to adapt, and achieves high accuracy and strong business interpretability.

Pending Publication Date: 2022-06-17
江苏城乡建设职业学院

AI Technical Summary

Problems solved by technology

However, in the face of the complex and changeable business environment and the terabyte or even petabyte data scale in the era of big data, the traditiona

Examples

Example Embodiment

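[0056] An interception strategy derivation method for credit anti-fraud according to the present invention includes at least the following steps: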

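[0057] Step 1: Obtain sample data, extract user-related feature variables from the sample data, perform data preprocessing, and bin the feature variables to obtain variable bins;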

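[0058] Step 2: Calculate the WOE value of each feature variable in each variable bin, and calculate the IV value of the feature variable from the WOE values, where the WOE value represents the weight of the variable and the IV value represents the information value of the variable; eliminate feature variables whose IV value is less than a set value;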

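[0059] Step 3: Apply WOE encoding to the retained feature variables, replacing the value of each feature variable with the WOE value calculated for the variable bin it falls into;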

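[0060] Step 4: Divide the sample data into a training set and a test set, build a prediction model based on a logistic regression model, and train it on the training set to obtain a trained prediction model; test the trained prediction model on the test set, evaluate it with the AUC evaluation metric, and adjust the model parameters to obtain the optimal prediction model;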

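[0061] Step 5: Use the optimal prediction model to calculate the scores of all variable bins, and cross the variable bins whose scores are not greater than a set threshold to generate interception strategies; verify whether each interception strategy meets the go-live conditions and retain all interception strategies that do. The interception strategies are used to identify users with a high overdue risk.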
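Steps 1 to 3 can be illustrated with a minimal Python sketch. The excerpt does not give the WOE or IV formulas, so the sketch assumes the common convention WOE = ln(share of white samples in the bin / share of black samples in the bin), under which a lower WOE marks a riskier bin. The function names, the smoothing constant `eps`, the cut point, and the toy data are all hypothetical and are not taken from the patent.

```python
import bisect
import math
from collections import defaultdict

def assign_bin(value, edges):
    """Map a numeric value to a bin index using sorted cut points.

    Missing values (None) get their own bin, an assumption of this sketch.
    """
    if value is None:
        return "missing"
    return bisect.bisect_right(edges, value)

def woe_iv(bins, labels, eps=0.5):
    """Return ({bin: woe}, iv) for one binned feature.

    bins   -- bin label of each sample
    labels -- 1 for a black (overdue) sample, 0 for a white (normal) sample
    eps    -- smoothing count added to every cell so that an empty cell
              does not produce log(0); an assumption of this sketch
    """
    bad, good = defaultdict(float), defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1
    keys = set(bad) | set(good)
    bad_total = sum(bad[k] + eps for k in keys)
    good_total = sum(good[k] + eps for k in keys)
    woe, iv = {}, 0.0
    for k in keys:
        p_good = (good[k] + eps) / good_total
        p_bad = (bad[k] + eps) / bad_total
        woe[k] = math.log(p_good / p_bad)
        iv += (p_good - p_bad) * woe[k]
    return woe, iv

def woe_encode(bins, woe):
    """Step 3: replace each sample's bin with that bin's WOE value."""
    return [woe[b] for b in bins]

# Hypothetical toy data: applicant ages and black/white labels.
ages = [19, 22, 24, 23, 35, 41, 52, 38]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
age_bins = [assign_bin(a, [30]) for a in ages]  # bin 0: under 30, bin 1: 30 and over
age_woe, age_iv = woe_iv(age_bins, labels)

MIN_IV = 0.02  # hypothetical cut-off; the excerpt only says "a set value"
if age_iv >= MIN_IV:
    encoded_age = woe_encode(age_bins, age_woe)
```

With this sign convention the under-30 bin in the toy data receives a negative WOE because it holds most of the black samples. The opposite convention (black share over white share) is also in use and would flip the sign.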

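[0062] The interception strategy derivation method for credit anti-fraud provided by this embodiment of the present invention can quickly learn, from massive data, the feature variables that identify bad customers, and cross these variables to derive black-sample interception strategies with strong business interpretability.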
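The feature crossing in Step 5 might look like the following sketch. It assumes the per-bin scores have already been produced by the trained model, and it crosses low-scoring bins pairwise across different features. The excerpt does not state the go-live conditions, so the minimum hit count and minimum bad rate used here are hypothetical placeholders, as are the function name and the toy data.

```python
from itertools import combinations

def derive_rules(samples, labels, bin_scores, score_threshold,
                 min_hits=2, min_bad_rate=0.8):
    """Cross low-scoring bins pairwise and keep the rules that pass validation.

    samples    -- list of dicts mapping feature name to bin label
    labels     -- 1 for black (overdue), 0 for white (normal)
    bin_scores -- {(feature, bin): score}; how the scores are computed from
                  the model is outside this sketch
    min_hits, min_bad_rate -- hypothetical go-live conditions
    """
    # Bins whose score is not greater than the threshold are treated as risky.
    # Sorting assumes feature names and bin labels are mutually comparable.
    risky = sorted(fb for fb, s in bin_scores.items() if s <= score_threshold)
    rules = []
    for (f1, b1), (f2, b2) in combinations(risky, 2):
        if f1 == f2:
            continue  # only cross bins that belong to different features
        hits = [y for x, y in zip(samples, labels)
                if x.get(f1) == b1 and x.get(f2) == b2]
        if len(hits) < min_hits:
            continue
        bad_rate = sum(hits) / len(hits)
        if bad_rate >= min_bad_rate:
            rules.append({"rule": ((f1, b1), (f2, b2)),
                          "hits": len(hits), "bad_rate": bad_rate})
    return rules

# Hypothetical toy data.
samples = [
    {"age": "18-25", "device": "shared"},
    {"age": "18-25", "device": "shared"},
    {"age": "18-25", "device": "own"},
    {"age": "26-40", "device": "shared"},
    {"age": "26-40", "device": "own"},
    {"age": "18-25", "device": "shared"},
]
labels = [1, 1, 0, 0, 0, 1]
bin_scores = {("age", "18-25"): -12.0, ("age", "26-40"): 8.0,
              ("device", "shared"): -9.5, ("device", "own"): 6.0}
rules = derive_rules(samples, labels, bin_scores, score_threshold=0.0)
```

Each retained rule reads as a plain conjunction of two bins, for example "age in 18-25 AND device is shared", which is one way such strategies stay interpretable for business reviewers.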

[0063] Specifically, in an embodiment of the present invention, Step 1 of the method includes:

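[0064] Step 101: Take normal users of the credit business as white samples and overdue users as black samples, and label the users to obtain the sample data. Specifically, in one embodiment, for loan-fraud scenarios in the credit business, users who are more than 30 days overdue in the very first period are usually selected as black samples, and users with no overdue of more than 30 days within three periods are selected as white samples.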
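The labeling rule in Step 101 can be sketched as a small function. The input representation (a list holding the maximum days overdue in each repayment period) and the handling of users who fit neither definition or who have fewer than three observed periods are assumptions of this sketch, not details from the patent.

```python
def label_user(max_overdue_days_by_period):
    """Label one user from the maximum days overdue in each repayment period.

    Returns 1 for a black sample, 0 for a white sample, and None when the
    user fits neither definition and is left out of the sample data.
    """
    first_three = max_overdue_days_by_period[:3]
    if not first_three:
        return None  # no repayment history observed yet
    if first_three[0] > 30:
        return 1  # more than 30 days overdue in the first period
    if len(first_three) == 3 and all(d <= 30 for d in first_three):
        return 0  # no overdue of more than 30 days within three periods
    return None  # e.g. first overdue in period 2, or fewer than 3 periods

# Hypothetical users.
print(label_user([45, 0, 0]))  # black sample
print(label_user([0, 5, 0]))   # white sample
print(label_user([0, 40, 0]))  # neither definition applies
```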

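[0065] Step 102: Extract user-related feature variables from the sample data. Some of these feature variables will inevitably be dominated by a single category. The purpose of the present invention is to quickly find combinations of feature variables that are both business-interpretable and highly discriminative when there are many feature variables, so the input feature variables are not fixed. The feature variables include numerical feature variables and categorical feature variables;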

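[0066] Step 103: Preprocess the sample data by handling its outliers and missing values, and eliminate feature variables that do not meet the data missing-rate requirement or that exhibit outliers;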
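The missing-rate part of Step 103 can be sketched as follows. The excerpt does not give the missing-rate requirement, so the threshold is a hypothetical parameter, and outlier handling is not covered by this sketch.

```python
def missing_rate(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)

def filter_by_missing_rate(columns, max_missing_rate=0.8):
    """Keep only the feature columns whose missing rate is within the limit.

    columns          -- {feature name: list of values, None for missing}
    max_missing_rate -- hypothetical threshold, not taken from the patent
    """
    return {name: vals for name, vals in columns.items()
            if vals and missing_rate(vals) <= max_missing_rate}

# Hypothetical feature columns.
columns = {
    "age": [20, None, 35, 41],
    "income": [None, None, None, 5000],
    "city": [None, None, None, None],
}
kept = filter_by_missing_rate(columns)
print(sorted(kept))  # the fully missing "city" column is dropped
```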

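[0067] Specifically, in one embodiment, missing values in the data are handled first. Anti-fraud scenarios usually face an extreme imbalance between black and white samples, and the corresponding feature variables also tend to be dominated by a single category, but this does not...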

Abstract

The invention provides an interception strategy derivation method and system for credit anti-fraud, which can quickly derive effective interception strategies from massive data and identify users with a high overdue risk. The method comprises the following steps: acquiring sample data, extracting user-related feature variables from the sample data, performing data preprocessing, and binning the feature variables to obtain variable bins; calculating the WOE value of each feature variable in each variable bin and the IV value of the feature variable; performing WOE encoding, in which the value of each feature variable is replaced with the WOE value of its variable bin; establishing and training a prediction model, evaluating it with the AUC evaluation metric, and adjusting the model parameters to obtain an optimal prediction model; calculating the scores of all variable bins with the optimal prediction model, and crossing the variable bins whose scores are not greater than a set threshold to generate interception strategies; and verifying whether the interception strategies meet the go-live conditions and retaining all interception strategies that do.

Description

Technical field

[0001] The present invention relates to the technical field of big data credit risk control, and in particular to an interception strategy derivation method, system, computer device and computer-readable storage medium for credit anti-fraud.

Background art

[0002] With the rapid development of the Internet, more and more industries and companies are actively adopting the Internet model of the new century. Among them, Internet finance is gradually reaching ordinary households, and credit lending based on personal and corporate big data is quietly changing traditional bank lending. However, as technology develops and Internet lending becomes simpler, more and more new loan fraud methods have appeared, causing immeasurable losses to current Internet and bank credit businesses.

[0003] The rise of big data has made it easier to obtain information, and it has given a direction to fight against fraud and loan ga...

Application Information

IPC(8): G06Q40/02, G06F21/56, G06K9/62
CPC: G06F21/56, G06Q40/03, G06F18/29, G06F18/214
Inventor 季爽陈良顾志文李剑许磊磊
Owner 江苏城乡建设职业学院