Taxpayer credit evaluation method based on distributed automatic feature combination

A credit evaluation and feature combination technology, applied in data processing applications, instruments, finance, etc., that improves evaluation accuracy and reduces the cumbersome, complicated manual feature construction process.

Active Publication Date: 2020-02-21
CHINA NAT SOFTWARE & SERVICE

AI Technical Summary

Problems solved by technology

The scorecard model can only handle features that have already been processed. Achieving more accurate credit scoring therefore requires a large number of professionals to construct carefully calculated indicators.

Embodiment Construction

[0042] This part describes in detail the specific implementation of the invention.

[0043] The training process of the credit evaluation model of distributed automatic combination features can be mainly divided into five steps S1-S5.

[0044] In step S1, the training samples for the credit evaluation model are constructed. Each training sample corresponds to one taxpayer and contains that taxpayer's basic features in the main fields of basic information, declaration information, tax information, invoice information, and relationship network, where each field includes a rich set of basic features. In addition, a risk label is constructed for each taxpayer from the taxpayer's historical risk record: taxpayers with risky behavior in their historical records are used as black samples, and taxpayers without risky behavior are used as white samples for subsequent model training.
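
The sample construction in step S1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `TrainingSample`, the helper `build_samples`, the field and feature names, and the convention that label 1 marks a black sample and label 0 a white sample are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    taxpayer_id: str
    features: dict  # basic features merged across all fields
    label: int      # assumed convention: 1 = black sample, 0 = white sample


def build_samples(domain_features, risky_ids):
    """Merge per-field features into one row per taxpayer and attach a label.

    domain_features: {field_name: {taxpayer_id: {feature_name: value}}}
    risky_ids: set of taxpayer ids with risky behavior in historical records
    """
    merged = {}
    for domain, per_taxpayer in domain_features.items():
        for tid, feats in per_taxpayer.items():
            row = merged.setdefault(tid, {})
            for name, value in feats.items():
                # Prefix with the field name so feature names stay unique.
                row[f"{domain}.{name}"] = value
    return [
        TrainingSample(tid, feats, 1 if tid in risky_ids else 0)
        for tid, feats in sorted(merged.items())
    ]


# Hypothetical toy data covering two of the fields.
domains = {
    "basic": {"T001": {"years_registered": 6}, "T002": {"years_registered": 1}},
    "invoice": {"T001": {"void_rate": 0.01}, "T002": {"void_rate": 0.35}},
}
samples = build_samples(domains, risky_ids={"T002"})
```

Prefixing each feature with its field name is a design choice of this sketch; it keeps features from different fields distinguishable once they are merged into a single row.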

[0045] In step S2, a distributed random forest model is u...

Abstract

The invention discloses a taxpayer credit assessment method based on distributed automatic feature combination. The method comprises the following steps: 1) training a random forest model by using a training sample and adopting a MapReduce distributed computing framework to obtain a distributed random forest model; 2) inputting the training samples into the distributed random forest model, and generating a plurality of input combined features of each training sample; 3) combining the generated combined features with the feature information of the corresponding taxpayers; 4) training a score card model by using the combined features; and 5) for a taxpayer to be subjected to credit assessment, generating a combined feature of the taxpayer by using the distributed random forest model, combining the combined feature with the feature information of the taxpayer, and then inputting the combined feature of the taxpayer into the trained score card model to predict the credit score of the taxpayer. According to the invention, accurate credit assessment of taxpayers can be carried out.
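
Steps 2), 3) and 5) of the abstract can be illustrated with a small Python sketch. The abstract does not state how a tree ensemble is turned into combined features; the sketch assumes one common approach, one-hot encoding the leaf that a sample reaches in each tree. The forest `FOREST`, its thresholds, the feature names, the weights, and the points-to-double-odds scaling (`base_score`, `pdo`) are all hypothetical values chosen for illustration, and the scorecard is assumed to be a logistic model.

```python
import math

# Hypothetical, already-trained forest. Internal nodes hold a feature name and
# threshold; leaves hold their index within the tree.
FOREST = [
    {"feature": "void_rate", "threshold": 0.2,
     "left": {"leaf": 0},
     "right": {"feature": "years_registered", "threshold": 3,
               "left": {"leaf": 1}, "right": {"leaf": 2}}},
    {"feature": "late_filings", "threshold": 2,
     "left": {"leaf": 0}, "right": {"leaf": 1}},
]
LEAVES_PER_TREE = [3, 2]


def leaf_index(tree, features):
    """Walk one tree and return the index of the leaf the sample reaches."""
    node = tree
    while "leaf" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def combined_features(features):
    """One-hot encode the reached leaf of every tree (assumed encoding)."""
    encoded = []
    for tree, n_leaves in zip(FOREST, LEAVES_PER_TREE):
        one_hot = [0] * n_leaves
        one_hot[leaf_index(tree, features)] = 1
        encoded.extend(one_hot)
    return encoded


def model_input(features, raw_order):
    """Append the combined features to the taxpayer's own features."""
    return [features[name] for name in raw_order] + combined_features(features)


def credit_score(x, weights, bias, base_score=600.0, pdo=20.0):
    """Map the log-odds of risk to a score; higher risk gives a lower score."""
    log_odds_risk = bias + sum(w * v for w, v in zip(weights, x))
    return base_score - (pdo / math.log(2)) * log_odds_risk


RAW_ORDER = ["void_rate", "years_registered", "late_filings"]
taxpayer = {"void_rate": 0.35, "years_registered": 1, "late_filings": 4}
x = model_input(taxpayer, RAW_ORDER)

# Hypothetical scorecard weights: zero on raw features, nonzero on leaves.
weights = [0.0, 0.0, 0.0, -0.5, 1.2, 0.3, -0.4, 0.8]
score = credit_score(x, weights, bias=-1.0)
```

Each leaf corresponds to a conjunction of threshold conditions along its path, so a one-hot leaf indicator acts as an automatically constructed feature combination that a linear scorecard can then weight.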

Description

Technical field

[0001] The invention relates to a credit evaluation model and a taxpayer credit evaluation method, specifically a credit evaluation model and a taxpayer credit evaluation method that perform automatic feature combination through a distributed random forest, and belongs to the field of computer big data processing.

[0002] Technical background

[0003] Credit evaluation has been developed in the field of bank credit for decades. It is mainly used to evaluate the personal credit of loan applicants and to support loan issuance decisions, reducing the bank's capital losses and the risk in capital recovery.

[0004] Taxpayer credit assessment in the tax field has only emerged in recent years, and it is mainly based on expert experience. Tax experts select indicators that can represent tax risks based on their professional experience, and assign different scores to different indicators. For each taxpayer, a lot of manual analysis and investigation are r...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06Q40/00
CPC: G06Q40/10
Inventor: 刘宗前武锦王彦李雪峰韩佶兴付婷婷郭乐乐
Owner: CHINA NAT SOFTWARE & SERVICE