Integrated transfer learning method for classification of unbalance samples

A transfer learning technique for unbalanced samples, applied in the field of machine learning. It addresses problems such as declining classification accuracy, class imbalance, and differences in sample importance, and it improves classification efficiency and accuracy while raising the contribution rate of the minority samples.

Status: Inactive. Publication date: 2012-06-27

BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY

Cites: 2. Cited by: 53.


Problems solved by technology

In the real world, however, the distribution of samples across the two classes can be extremely unbalanced, and the classes can also differ greatly in importance.

[0008] In addition, the auxiliary data often contains a large amount of redundant data that may differ greatly from the target data set. Its presence not only slows model training but also lowers classification accuracy.



Embodiment Construction

[0046] The integrated transfer learning method for unbalanced sample classification provided by the present invention (referred to as UBITLA) proceeds as follows (see Figure 1):

[0047] 1. Input: The input data comes from two sources: the transfer auxiliary data set $A$ and the target data set $O$. Part of the data is extracted from each source and mixed in proportion to form the training data set $C = \{(X_1, Y_1), (X_2, Y_2), \ldots, (X_N, Y_N)\}$, where $(X_i, Y_i)$, $i = 1, 2, \ldots, N$, is a training sample composed of a feature attribute vector and a class label. The first $n$ samples in $C$ come from $A$, and the remaining $m$ samples come from $O$ ($n + m = N$). The predetermined number of iterations is $T$. Here $X_i \in X$, where $X$ is the input sample data; $X_i$ is the feature attribute vector of the sample, with dimension $q$; and $Y_i \in \{0, +1\}$ is the class label of the sample.
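The input step can be illustrated with a minimal Python sketch. It assumes that "extracted and mixed in proportion" means drawing n samples from A and m samples from O without replacement; the text does not state the sampling rule. The function name `build_training_set` and the toy data are hypothetical.

```python
import random


def build_training_set(aux, target, n, m, seed=0):
    """Draw n samples from auxiliary set A and m samples from target set O.

    Each sample is an (x, y) pair: x is a feature vector of length q,
    y is a class label in {0, 1}. The returned list C holds the n
    auxiliary samples first, then the m target samples (n + m = N).
    Sampling without replacement is an assumption, not stated in the text.
    """
    rng = random.Random(seed)
    from_aux = rng.sample(aux, n)
    from_target = rng.sample(target, m)
    return from_aux + from_target


# Toy data with q = 2 features per sample (illustrative values only).
A = [((float(i), 0.0), 1) for i in range(8)] + [((float(i), 1.0), 0) for i in range(2)]
O = [((float(i), 0.5), i % 2) for i in range(6)]

C = build_training_set(A, O, n=5, m=3)
print(len(C))  # N = n + m
```

Keeping the auxiliary samples in the first n positions matches the ordering described in step 1, so later steps can tell the two sources apart by index.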

[0048] 2. Initialize sample weights:

[0049]

[0050] in, is the ini...
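The initialization formula is not reproduced in this extract, so the sketch below is not the patent's rule. It only illustrates the stated goal that the rarer negative samples start with larger weights, under two assumptions: label 0 marks the negative (minority) class, and each class receives half of the total weight. The function name `initial_weights` is hypothetical.

```python
def initial_weights(labels, minority_label=0):
    """Hypothetical class-aware initialization (not the patent's formula).

    Each class receives half of the total weight, split evenly among its
    samples, so every sample of the smaller class starts with a larger
    weight than a sample of the larger class.
    """
    n_min = sum(1 for y in labels if y == minority_label)
    n_maj = len(labels) - n_min
    if n_min == 0 or n_maj == 0:
        # Only one class present: fall back to uniform weights.
        return [1.0 / len(labels)] * len(labels)
    return [0.5 / n_min if y == minority_label else 0.5 / n_maj for y in labels]


labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]  # 8 positive, 2 negative samples
w = initial_weights(labels)
print(w[0], w[-1])  # per-sample weight of a positive and of a negative sample
```

This sketch ignores the split between auxiliary and target samples, which the full method may also take into account.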


Abstract

The invention relates to an integrated transfer learning method for the classification of unbalanced samples, comprising the following steps. During initialization, positive and negative samples are given different weights, so that the negative samples, which make up a small share of the total but carry a large amount of information, receive larger initial weights. In each training round, part of the samples is extracted at a certain ratio and used as a training subset; after training, the classifier with the smallest error among several simple classifiers is selected as the weak classifier, and the training data set is adjusted according to a dynamic redundant-data elimination algorithm. After T rounds of iteration, a sequence of weak classifiers is obtained, and these are combined into a strong classifier. The invention uses the classification rules of old data to find the classification rules of new data whose distribution is similar to that of the old data. In particular, it provides a new method for classifying data with an unbalanced class distribution: it preserves the influence of the small number of negative samples during classification training, effectively raises their contribution rate, and improves classification efficiency and accuracy.
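The round structure in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the simple classifiers are taken to be one-feature threshold rules (decision stumps), the weight update is the standard AdaBoost rule, and the redundant-data elimination step is omitted because its rule is not given in this extract. All function names and the `subset_ratio` parameter are hypothetical.

```python
import math
import random


def stump_predict(stump, x):
    """A stump (feature, threshold, polarity) predicts 1 on one side of the threshold."""
    feature, threshold, polarity = stump
    return 1 if polarity * (x[feature] - threshold) >= 0 else 0


def weighted_error(stump, samples, weights):
    wrong = sum(w for (x, y), w in zip(samples, weights) if stump_predict(stump, x) != y)
    return wrong / sum(weights)


def candidate_stumps(samples):
    """Enumerate the simple classifiers: every feature, observed threshold, and polarity."""
    for f in range(len(samples[0][0])):
        for t in sorted({x[f] for x, _ in samples}):
            for p in (1, -1):
                yield (f, t, p)


def train(samples, weights, rounds, subset_ratio=0.8, seed=0):
    rng = random.Random(seed)
    weights = list(weights)
    ensemble = []
    for _ in range(rounds):
        # Extract part of the samples at a fixed ratio as the training subset.
        k = max(1, int(subset_ratio * len(samples)))
        idx = rng.sample(range(len(samples)), k)
        subset = [samples[i] for i in idx]
        sub_w = [weights[i] for i in idx]
        # Keep the simple classifier with the smallest weighted error on the subset.
        best = min(candidate_stumps(subset), key=lambda s: weighted_error(s, subset, sub_w))
        err = min(max(weighted_error(best, samples, weights), 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Standard AdaBoost reweighting (assumption): raise weights of misclassified samples.
        weights = [w * math.exp(alpha if stump_predict(best, x) != y else -alpha)
                   for (x, y), w in zip(samples, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # The redundant-data elimination step would adjust `samples` here; omitted.
    return ensemble


def predict(ensemble, x):
    """Strong classifier: weighted vote of the weak classifiers."""
    score = sum(a * (1 if stump_predict(s, x) == 1 else -1) for a, s in ensemble)
    return 1 if score >= 0 else 0


# Toy data: 3 negative (label 0) and 7 positive (label 1) samples, one feature.
samples = [((float(i),), 1 if i >= 3 else 0) for i in range(10)]
weights = [0.5 / 3 if y == 0 else 0.5 / 7 for _, y in samples]
model = train(samples, weights, rounds=5)
print([predict(model, x) for x, _ in samples])
```

Selecting the stump on the subset while computing its vote weight on the full set is one possible design choice; the abstract does not say which set the error is measured on.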

Description

Technical field

[0001] The invention belongs to the field of machine learning. For auxiliary training data that contains a large amount of redundant data and has unbalanced positive and negative samples, an improved integrated transfer learning algorithm is proposed, which transfers knowledge from the auxiliary training data to help classify the target data.

Background technique

[0002] Transfer learning has been a hot topic in machine learning in recent years. It addresses the small amount of labeled data available in new tasks by making effective use of outdated data: although the old and new data differ, some of the old data will certainly help with the new classification problem. To find this useful data, a small amount of new data that has already been labeled is used to mine valuable information from the old data. Finally, a more efficient classification model is trained based on all the useful information in the two parts of the data to realize k...


Application Information


IPC(8): G06N5/00

Inventors: 于重重, 谭励, 田蕊, 刘宇, 吴子珺

Owner: BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY
