Active learning sample selection strategy integrated with confidence criterion and diversity criterion

An active learning sample selection technology, applied in character and pattern recognition, instruments, biological neural network models, etc., that addresses problems in existing active learning methods such as ignoring sample diversity and relying on model performance.

Inactive Publication Date: 2018-11-23
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to address the defects of the prior art by proposing an active learning sample selection strategy that combines confidence and diversity, a...

Examples

Embodiment Construction

[0033] Specific embodiments of the present invention are described in detail below, but it should be understood that the protection scope of the present invention is not limited by the specific embodiments.

[0034] This embodiment provides an active learning sample selection strategy that combines confidence and diversity criteria and is applied within a continuous learning framework. The invention is described below using an example from the field of audio recognition. Its flow is shown in Figure 1 and includes the following steps:

[0035] Step 1. Train the model M_t based on the existing labeled data.

[0036] Because the strategy is computed from the output of the model, the model M_t must first be built from the existing labeled training set D_L before the strategy can be used to select samples. The initial training set D_L consists of preprocessed feature data. For example, in the field of audio, the input data of the mo...

Abstract

The invention relates to an active learning sample selection strategy that integrates a confidence criterion and a diversity criterion. The strategy comprises the following steps: training a model M_t on the existing labeled data set D_L; predicting the current unlabeled data set D_U with M_t to obtain a set of prediction vectors P_t; calculating the information entropy of each sample from P_t and selecting the K samples with the largest entropy; extracting feature representations of these K unlabeled samples with M_t to obtain a feature vector set F_t; performing density peaks clustering on F_t and selecting samples, in corresponding proportions and numbers, from the cluster centers, the cluster edge points, and the outliers produced by the clustering; handing the selected samples to an expert for labeling, adding them to the labeled data set D_L, and deleting them from the unlabeled data set D_U; updating M_t with the current labeled data set D_L to obtain M_{t+1}; and repeating the above steps until all samples are labeled or a designated number of iterations is reached, which completes the algorithm.
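The entropy filter and the density peaks quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the patent's implementation. The function names, the cutoff distance `d_c`, the cutoff-kernel density, and the ranking of candidate centers by the product of density and distance are all assumptions made for illustration. The proportions used to pick centers, edge points, and outliers are not given in this excerpt, so edge-point and outlier selection are left out.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy of each row of an (n_samples, n_classes) matrix P_t."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def top_k_uncertain(probs, k):
    """Indices of the k samples with the largest predictive entropy."""
    return np.argsort(-entropy(probs), kind="stable")[:k]

def density_peaks(features, d_c):
    """Return (rho, delta) for the rows of a feature matrix F_t.

    rho[i]   : number of other points closer than d_c (cutoff kernel, assumed).
    delta[i] : distance to the nearest point of higher density; the densest
               point gets its largest distance to any other point.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    rho = (dist < d_c).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    order = np.argsort(-rho, kind="stable")  # densest first; ties broken by index
    delta = np.zeros(len(features))
    delta[order[0]] = dist[order[0]].max()
    for pos in range(1, len(order)):
        i = order[pos]
        delta[i] = dist[i, order[:pos]].min()
    return rho, delta

def candidate_centers(rho, delta, n_centers):
    """Rank points by rho * delta and return the top n_centers (assumed rule)."""
    gamma = rho * delta
    return np.argsort(-gamma, kind="stable")[:n_centers]

if __name__ == "__main__":
    # Toy prediction vectors: rows 0 and 3 are the least confident.
    P_t = np.array([[0.5, 0.5], [0.9, 0.1], [1.0, 0.0], [0.6, 0.4]])
    print(top_k_uncertain(P_t, 2))

    # Toy features: two tight groups plus one isolated point.
    F_t = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
                    [10.0, 0.0]])
    rho, delta = density_peaks(F_t, d_c=0.5)
    print(candidate_centers(rho, delta, 2))
```

In this sketch a point with low `rho` and high `delta`, such as the isolated point in the toy data, is a natural outlier candidate, while points with both values high are candidate cluster centers.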

Description

Technical field

[0001] The invention relates to an active learning sample selection strategy, in particular to an active learning sample selection strategy that combines a confidence criterion and a diversity criterion, and belongs to the technical field of computer applications.

Background

[0002] Statistical learning techniques have been widely used in recent years. When traditional supervised learning methods are used for classification, a larger training set generally gives better classification results. In many real-world scenarios, however, labeled samples are hard to obtain: experts in the field must label them manually, which costs considerable time and money. Moreover, if the training set is too large, training itself takes a long time. Is there a way to obtain a well-performing classifier with fewer training samples? Active learning offers this possibility. ...

Application Information

IPC(8): G06K9/62, G06N3/04
CPC: G06N3/045, G06F18/23213
Inventors: 王晓军, 潘龙飞
Owner: NANJING UNIV OF POSTS & TELECOMM