
Cost-sensitive dynamic clustering method for carrying out rapid feature learning on unbalanced data

A cost-sensitive feature learning technique, applied in the field of financial transaction risk control, that achieves stable and robust model learning, fast learning, and reduced training time.

Active Publication Date: 2020-05-19
ZHEJIANG UNIV
Cites: 0; Cited by: 2

AI Technical Summary

Problems solved by technology

[0007] The present invention overcomes the deficiencies of the prior art and provides a cost-sensitive dynamic clustering method that, while reducing time complexity, achieves fast feature learning on unbalanced data.



Examples


Embodiment Construction

[0025] A cost-sensitive dynamic clustering method for fast feature learning on imbalanced data, comprising the following steps:

[0026] 1) Construct a benchmark feedforward neural network;

[0027] Prepare a binary (two-class) unbalanced data set. The training set contains N samples, each with a d-dimensional feature vector. Construct a benchmark feedforward neural network with three layers (input, hidden, and output) containing d, 2d, and 1 neurons, respectively. The weight matrices between consecutive layers are denoted $W_0$ and $W_1$. The hidden layer uses the ReLU activation, $f(x) = \max(x, 0)$, and the output layer uses the Sigmoid function, $f(x) = 1 / (1 + e^{-x})$. Denoting the input sample feature by x, the hidden-layer representation by h, and the output-layer value by o, we have $h = \mathrm{ReLU}(W_0 x)$ and $o = \mathrm{Sigmoid}(W_1 h)$.
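The forward pass of the benchmark network in [0027] can be sketched in a few lines of NumPy. This is only an illustration: the function names, the random initialisation, and the choice d = 4 are assumptions made for the example and are not taken from the patent text.

```python
import numpy as np

def relu(x):
    # ReLU activation: f(x) = max(x, 0), applied element-wise.
    return np.maximum(x, 0.0)

def sigmoid(x):
    # Sigmoid activation: f(x) = 1 / (1 + e^{-x}).
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W0, W1):
    """Forward pass of the d -> 2d -> 1 network.

    x:  input features, shape (d,)
    W0: input-to-hidden weights, shape (2d, d)
    W1: hidden-to-output weights, shape (1, 2d)
    Returns the hidden representation h and the output o.
    """
    h = relu(W0 @ x)      # h = ReLU(W0 x)
    o = sigmoid(W1 @ h)   # o = Sigmoid(W1 h)
    return h, o

# Illustrative setup (assumed values, not from the source).
d = 4
rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.1, size=(2 * d, d))
W1 = rng.normal(scale=0.1, size=(1, 2 * d))
x = rng.normal(size=d)
h, o = forward(x, W0, W1)
```

The hidden vector h is the "sample representation in front of the output layer" on which the clustering step described in the abstract operates.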

[0028] 2) Relabel the sample lab...



Abstract

The invention discloses a cost-sensitive dynamic clustering method for rapid feature learning on an unbalanced data set. The method comprises the following steps: first, the whole training set of unbalanced data is passed through a feedforward neural network to obtain each sample's representation just before the output layer; the number of clusters K is set, the representations of the samples belonging to the large (majority) class are taken out, and this batch of samples is clustered into K classes using the K-Means method; the resulting cluster labels are used as class labels for training, the neural network loss is calculated under a cost-sensitivity coefficient, and the network is trained by back propagation; the representations of the next batch of samples are then calculated iteratively, with K-Means initialized from the K-Means labels that the large-class samples received in the previous round, and training continues until convergence. The method better addresses the problem of model bias when training on unbalanced data sets, performs well on both the large-class and small-class classification results, and is used for classification learning of unbalanced data in finance.
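One round of the relabeling described in the abstract might look like the following NumPy sketch. It is a hedged illustration, not the patented procedure: the helper names (`kmeans_warm_start`, `relabel`, `class_weights`), the convention that minority samples keep label 0 while majority samples receive labels 1..K, and the inverse-frequency form of the cost coefficient are all assumptions, since the source does not spell these details out.

```python
import numpy as np

def kmeans_warm_start(H, K, prev_labels=None, n_iter=10, seed=0):
    """Lloyd's K-Means on the rows of H, optionally initialised from
    the cluster labels of the previous round (warm start)."""
    rng = np.random.default_rng(seed)
    if prev_labels is None:
        # Cold start: pick K distinct samples as initial centroids.
        centroids = H[rng.choice(len(H), size=K, replace=False)].copy()
    else:
        # Warm start: centroids are the means of last round's clusters;
        # an empty cluster falls back to a randomly chosen sample.
        centroids = np.stack([
            H[prev_labels == k].mean(axis=0) if np.any(prev_labels == k)
            else H[rng.integers(len(H))]
            for k in range(K)
        ])
    labels = np.zeros(len(H), dtype=int)
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid (squared distance).
        dists = ((H[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = H[labels == k].mean(axis=0)
    return labels

def relabel(H, y, K, prev_labels=None):
    """Assumed layout: minority samples (y == 1) keep label 0,
    majority samples (y == 0) receive cluster labels 1..K."""
    major = y == 0
    cluster = kmeans_warm_start(H[major], K, prev_labels)
    new_y = np.zeros(len(y), dtype=int)
    new_y[major] = cluster + 1
    return new_y, cluster

def class_weights(new_y, n_classes):
    """Inverse-frequency weights: one hypothetical choice of
    cost-sensitivity coefficient, not specified by the source."""
    counts = np.bincount(new_y, minlength=n_classes).astype(float)
    return len(new_y) / (n_classes * np.maximum(counts, 1.0))
```

In a training loop, `relabel` would be called once per round on the current hidden representations, passing the `cluster` array returned by the previous round as `prev_labels`; the resulting weights would then scale each class's term in the loss before back propagation.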

Description

Technical field

[0001] The invention belongs to the field of risk control of financial transactions. In financial transaction risk control, fraud cases are very rare compared with normal cases, which leads to problems such as unbalanced machine learning. To address these problems, a cost-sensitive dynamic clustering method for fast feature learning on unbalanced data is proposed.

Background

[0002] With traditional finance moving online and the rapid development of Internet finance, the Internet's underground (black-market) industry chain has grown rapidly and is becoming increasingly organized and industrialized. To resist card theft, counterfeit cards, fleecing (promotion abuse), cash-out, illegal fund-raising, and other behaviors in the black industry chain, the financial industry combines big data platforms with the expert experience of business personnel to build a central risk control system for financial business. However, the black industry chain has a varie...


Application Information

IPC(8): G06Q20/40, G06K9/62
CPC: G06Q20/4016, G06F18/232, G06F18/2411
Inventors: 宋明黎 (Song Mingli), 郑铜亚 (Zheng Tongya)
Owner ZHEJIANG UNIV