
Classifier construction method based on active learning

An active-learning technique for constructing classifiers, applied in the field of data classification. It addresses the problem that training results fail to meet the expected goal, and it achieves low time complexity, high classification performance, and an optimized calculation method.

Active Publication Date: 2014-05-14
苏州飞宇互娱信息科技有限公司

AI Technical Summary

Problems solved by technology

However, the BvSB algorithm considers only the samples about which the current classifier is uncertain. In actual use, once samples are added to the training sample set, the uncertainty of the added samples also affects the updated classifier, so the training results may fail to reach the expected goal.
The classifier constructed by the BvSB algorithm therefore has certain defects.



Examples


Embodiment 1

[0039] Example 1: As shown in Figure 1, a method for constructing a classifier based on active learning, which generates a training sample set from unlabeled samples and their data features and trains a classifier, includes the following steps:

[0040] (1) Randomly select 20-50 samples from the unlabeled sample set for manual labeling to construct the initial training sample set, and then build the initial classifier $H^{(0)}$ from the data features of the initial training sample set;
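Step (1) can be sketched as follows. This is only an illustration of drawing the initial batch for manual labeling; the function name `draw_initial_set`, the default of 30 samples, and the fixed seed are assumptions and do not come from the patent. Building $H^{(0)}$ itself is left out because the source does not name a classifier type.

```python
import random
from typing import List, Tuple


def draw_initial_set(n_unlabeled: int, n_initial: int = 30,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Pick indices to label by hand and return (chosen, remaining pool).

    Hypothetical helper: the 20-50 range follows step (1); everything
    else (name, default size, seeding) is an assumption.
    """
    if not 20 <= n_initial <= 50:
        raise ValueError("step (1) draws between 20 and 50 samples")
    if n_initial > n_unlabeled:
        raise ValueError("not enough unlabeled samples")
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    chosen = sorted(rng.sample(range(n_unlabeled), n_initial))
    chosen_set = set(chosen)
    remaining = [i for i in range(n_unlabeled) if i not in chosen_set]
    return chosen, remaining


to_label, pool = draw_initial_set(n_unlabeled=200, n_initial=30)
```

The indices in `to_label` would go to a human annotator, and `pool` stays unlabeled for the later selection rounds.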

[0041] (2) Use the classifier $H^{(t)}$ obtained in the previous step to calculate the BvSB value of each unlabeled sample. The BvSB value is calculated as:

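[0042] $\mathrm{BvSB}^{(t)}(x) = p^{(t)}(y_{\mathrm{Best}} \mid x) - p^{(t)}(y_{\mathrm{Second}} \mid x)$,

[0043] where $x$ is a sample in the unlabeled sample set $U$, $p^{(t)}(y_{\mathrm{Best}} \mid x)$ is the posterior probability that the sample belongs to the optimal category, $p^{(t)}(y_{\mathrm{Second}} \mid x)$ is the posterior probability that the sample belongs to the sub-optimal category, and $t$ is the number of cycles of steps (2) to (6);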
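A minimal sketch of step (2), assuming the BvSB value is the gap between the two largest posterior probabilities, which is the usual best-versus-second-best definition and matches the two terms described above. The function names are hypothetical, and the posterior vectors are assumed to come from whatever classifier plays the role of $H^{(t)}$.

```python
from typing import List, Sequence


def bvsb_value(posteriors: Sequence[float]) -> float:
    """Gap between the best and second-best class posteriors of one sample.

    Assumption: BvSB is taken as this difference; a small gap means the
    classifier is unsure which of its top two classes is right.
    """
    if len(posteriors) < 2:
        raise ValueError("need posteriors for at least two classes")
    best, second = sorted(posteriors, reverse=True)[:2]
    return best - second


def rank_by_uncertainty(all_posteriors: Sequence[Sequence[float]]) -> List[int]:
    """Sample indices ordered from smallest gap (most uncertain) to largest."""
    return sorted(range(len(all_posteriors)),
                  key=lambda i: bvsb_value(all_posteriors[i]))


pool_posteriors = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # top two classes nearly tied
    [0.60, 0.30, 0.10],
]
order = rank_by_uncertainty(pool_posteriors)
```

Here `order` lists the nearly tied sample first, since its gap is the smallest of the three.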

[0044] (3) According to step (2), select h unla...



Abstract

The invention discloses a classifier construction method based on active learning. Both the current value and the prospective value of unlabeled samples are considered, so that high-value samples are selected. First, the current value of each sample is calculated from its optimal-class and suboptimal-class information, and the samples with the highest current value form a candidate sample set. Next, the prospective value of each sample in the candidate set is calculated and combined with its current value to obtain the sample's total value. Finally, the unlabeled samples with the highest total value are selected for labeling and added to the training sample set, and the classifier is updated. Experimental results on different data sets show that, for the same number of selected samples, the method yields a classifier with higher classification accuracy.
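The two-stage selection described in the abstract can be sketched as below. Several parts are assumptions, because this record does not give the formulas: the current value is taken as one minus the best-versus-second-best gap, the prospective value is supplied by the caller as a hypothetical callable `prospective_value`, and the total value is assumed to be a plain sum of the two.

```python
from typing import Callable, List, Sequence


def select_for_labeling(posteriors: Sequence[Sequence[float]],
                        prospective_value: Callable[[int], float],
                        candidate_size: int,
                        batch_size: int) -> List[int]:
    """Two-stage pick: shortlist by current value, then rank by total value."""

    def current_value(p: Sequence[float]) -> float:
        # Assumption: higher value for a smaller best/second-best gap.
        best, second = sorted(p, reverse=True)[:2]
        return 1.0 - (best - second)

    current = [current_value(p) for p in posteriors]

    # Stage 1: candidate set = samples with the highest current value.
    candidates = sorted(range(len(posteriors)),
                        key=lambda i: current[i], reverse=True)[:candidate_size]

    # Stage 2: total value = current + prospective (the sum is an assumption).
    total = {i: current[i] + prospective_value(i) for i in candidates}
    return sorted(candidates, key=lambda i: total[i], reverse=True)[:batch_size]


pool = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.50, 0.50]]
toy_prospective = {1: 0.5}  # made-up scores for illustration only
picked = select_for_labeling(pool, lambda i: toy_prospective.get(i, 0.0),
                             candidate_size=3, batch_size=2)
```

Computing the prospective value only for the shortlist keeps the more expensive second stage away from the full unlabeled pool, which fits the low time complexity the record claims.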

Description

Technical field

[0001] The invention relates to a method for data classification by using a computer, in particular to a method for selecting and generating a training sample set from a large number of samples based on an active learning method, and constructing a trained data classifier.

Background technique

[0002] Automatic data classification is an important technology in computer automatic processing, which is widely used in data mining, medical diagnosis, traffic management, human body feature recognition and other fields. The data classification method in computer processing usually includes constructing a classifier model and training the classifier model with a training sample set to obtain a trained data classifier.

[0003] In the data classification method, training the classifier model is the key difficulty, mainly because the classifier model requires users to label a large number of data training samples, and labeling a large number of data samples requires a lot of...


Application Information

IPC(8): G06F17/30
CPC: G06F16/285, G06F16/35
Inventor: 吴健, 张宇, 徐在俊
Owner: 苏州飞宇互娱信息科技有限公司