
Training sample annotation cost reduction method for transfer learning

A technology of training samples and transfer learning, applied in the field of transfer learning in artificial intelligence. It addresses the problems that the choice of common feature space affects the final model's learning, that retrieved substitute samples may not truly represent the target task samples, and that dissimilar samples interfere with target task learning; it achieves a high-quality labeled sample set and increases the number of labeled samples without increasing the labeling cost.

Pending Publication Date: 2020-12-01
GUIZHOU NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

In this way, a large number of labeled samples can be obtained at once, but the following problems remain: (1) the information in these source task samples is already contained in the models and parameters that the target model reuses from the source task, so using them again for training is, to some extent, a waste of computing resources; (2) it does not solve the problem of wasted, repeated labeling when new target task samples are annotated; (3) because of the difference between the source task and the target task, a large number of dissimilar samples (the samples in region 1 of figure 1) will interfere with learning the real target task; (4) when a sample in region 2 is retrieved as a substitute for the current target task sample, the retrieved sample may not truly represent it.

[0010] The choice of common feature space obviously affects the final model's learning, but finding the best common feature space is a concern of transfer learning in general rather than of the present invention.

Method used



Examples


Embodiment 1

[0039] In this embodiment, the technical solution provided by the present invention is applied to obtain labeled data for training a model for a machine-vision task: recognizing pictures of cats and dogs.

[0040] Preparation and notes before implementing the scheme:

[0041] Source task: CIFAR-10 classification.

[0042] The source task's annotated sample set: the open dataset CIFAR-10, which contains 10 object categories (including cat and dog), each with 6000 color images of size 32×32.

[0043] Source task model: a 10-class classification model trained on the CIFAR-10 dataset; the model is a deep neural network with the VGG16 structure.

[0044] Target task: to accurately determine whether the animal in an input picture from an Internet pet website is a cat, a dog, or something else (3 categories).

[0045] Target task unlabeled sample set: 50,000 color pet pictures of different sizes downloaded from the pet website, most of w...

Embodiment approach

[0051] Start the implementation (as shown in figure 2):

[0052] (1) The pictures of cats and dogs in the labeled sample set of the source task are taken out to form the alternative sample set S. Another feasible approach is to remove the non-animal categories (airplane, automobile, ship, truck) from the labeled sample set of the source task, and then uniformly relabel the remaining non-cat/dog categories (bird, deer, frog, horse) as "Other", forming the alternative sample set S.
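The second variant of step (1) can be sketched as follows. This is an illustrative sketch, not the patent's code: the class indices are CIFAR-10's standard ones, and the function name and output label scheme (0 = cat, 1 = dog, 2 = other) are assumptions.

```python
import numpy as np

# CIFAR-10 class indices (the dataset's fixed order):
# 0 airplane, 1 automobile, 2 bird, 3 cat, 4 deer,
# 5 dog, 6 frog, 7 horse, 8 ship, 9 truck
CAT, DOG = 3, 5
NON_ANIMAL = {0, 1, 8, 9}  # airplane, automobile, ship, truck -> dropped

def build_alternative_set(images, labels):
    """Sketch of step (1), second variant: drop the non-animal classes
    and relabel the remaining non-cat/dog animals to a single 'Other'
    class. Returns (S_images, S_labels), labels 0=cat, 1=dog, 2=other."""
    keep = ~np.isin(labels, list(NON_ANIMAL))
    imgs, labs = images[keep], labels[keep]
    new_labels = np.where(labs == CAT, 0, np.where(labs == DOG, 1, 2))
    return imgs, new_labels
```

The first variant (keeping only cats and dogs) would simply use `keep = np.isin(labels, [CAT, DOG])` instead.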

[0053] (2) Use the common feature mapping function to map all samples in the alternative sample set to the common feature space. That is, each sample in the alternative sample set S is fed to the source task model, the corresponding output of the model's feature-extraction network module is extracted, and all these outputs together form a new set.
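Step (2) can be sketched as follows. In the embodiment the feature extractor is the convolutional part of the trained VGG16 source model; here a fixed random linear projection with a ReLU stands in for it, purely for illustration, and all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the source model's feature-extraction module (in the
# embodiment: the convolutional layers of a VGG16 trained on CIFAR-10;
# here: a fixed random projection, for illustration only).
FEATURE_DIM, INPUT_DIM = 128, 32 * 32 * 3
W = rng.standard_normal((INPUT_DIM, FEATURE_DIM)) / np.sqrt(INPUT_DIM)

def common_feature_map(batch):
    """Map a batch of samples to the common feature space: feed each
    sample to the source task model and keep the output of its
    feature-extraction module."""
    flat = batch.reshape(len(batch), -1)
    return np.maximum(flat @ W, 0.0)  # ReLU, as in a typical CNN

S = rng.standard_normal((10, 32, 32, 3))  # toy alternative sample set
F_S = common_feature_map(S)               # feature set of S
```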

[0054] (3) Select N = ρ*T samples from the target task's unlabeled sample set to form the set U of target task samples to be labeled....
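The text is truncated after step (3), so the following is only a hypothetical sketch of the later steps as described in the Abstract: target samples outside the source task sample space are sent for manual annotation, while those inside it inherit the label of their optimal substitute from the labeled source set. The nearest-neighbour rule, the Euclidean metric, and the threshold tau are all assumptions, not taken from the patent.

```python
import numpy as np

def label_or_substitute(F_U, F_S, S_labels, tau):
    """Hypothetical sketch: a target sample whose nearest neighbour in
    the common feature space lies within distance tau is treated as
    'inside the source sample space' and inherits that neighbour's
    label; anything farther away is queued for manual annotation."""
    # pairwise Euclidean distances between target and source features
    d = np.linalg.norm(F_U[:, None, :] - F_S[None, :, :], axis=-1)
    nn = d.argmin(axis=1)                              # nearest source sample
    near = d[np.arange(len(F_U)), nn] <= tau           # inside source space?
    auto_labels = np.where(near, S_labels[nn], -1)     # -1 = needs annotation
    to_annotate = np.flatnonzero(~near)
    return auto_labels, to_annotate

# toy 2-D usage
F_S = np.array([[0.0, 0.0], [10.0, 10.0]])
S_labels = np.array([0, 1])
F_U = np.array([[0.1, 0.0], [9.9, 10.0], [5.0, 5.0]])
auto, todo = label_or_substitute(F_U, F_S, S_labels, tau=1.0)
```

Only the samples in `todo` would incur labeling cost, which is how the method reduces the annotation budget.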



Abstract

The invention discloses a training sample annotation cost reduction method for transfer learning. The method distinguishes among the to-be-annotated samples of a target task: a target task sample not located in the source task sample space is annotated, while for a target task sample located in the source task sample space, the optimal substitute is found automatically in the labeled sample set of the source task. The method reduces the number of training samples to be annotated, avoids repeated annotation of samples, increases the number of labeled samples without increasing the annotation cost, and allows the actual annotation cost to be controlled by adjusting parameters. In addition, the method keeps model performance stable while greatly compressing the annotation cost, and it can be combined with any sample selection method. It is also suitable for application scenarios in which the model needs to be updated, or in which a personalized model is built on the basis of a universal model.

Description

technical field

[0001] The invention relates to the field of transfer learning in artificial intelligence, and in particular to a method for reducing the labeling cost of target task samples by utilizing source task knowledge and existing labeled data.

Background technique

[0002] Current successful machine learning models attach great importance to data: their high performance relies on a large amount of labeled data. Machine learning models have been successfully applied to tasks that have accumulated, or can easily obtain, a large amount of labeled data (such as computer vision, text translation, and speech recognition); however, many more application scenarios have not accumulated and cannot easily obtain such data. To cope with the long tail of the distribution of application scenarios, the task knowledge that has already been learned needs to be transferable to new tasks.

[0003] The goal of transfer learning is to transfer the knowledge learned from the s...

Claims


Application Information

IPC(8): G06K9/00; G06K9/62; G06N20/00
CPC: G06N20/00; G06V40/10; G06F18/214
Inventor: 曹永锋, 刘大鹏, 苏彩霞, 王鹏举
Owner GUIZHOU NORMAL UNIVERSITY