
Training sample annotation cost reduction method for transfer learning

A technology of training samples and transfer learning, applied in the field of transfer learning in artificial intelligence. It addresses the problems that the choice of common feature space affects the final model's learning, that retrieved substitute samples may not truly represent the target task samples, and that dissimilar samples interfere with target task learning; it achieves a high-quality labeled sample set and increases the number of labeled samples without increasing the labeling cost.

Pending Publication Date: 2020-12-01
GUIZHOU NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

In this way, a large number of labeled samples can be obtained at once, but the following problems remain: (1) the information in these source task samples is already contained in the models and parameters that the target model reuses from the source task, so using them again for training is, to some extent, a waste of computing resources; (2) it does not solve the problem of wasted, repeated labeling when new target task samples are annotated; (3) because of the difference between the source task and the target task, a large number of dissimilar samples (the samples in region 1 of figure 1) will interfere with learning the real target task; (4) when a sample in region 2 is retrieved as a substitute for the current target task sample, the retrieved sample may not truly represent it.

[0010] The choice of common feature space obviously affects the final model's learning, but finding the best common feature space is a concern of transfer learning in general rather than of the present invention.

Method used



Examples


Embodiment 1

[0039] In this embodiment, the technical solution provided by the present invention is applied to obtain labeled data for training a model for a machine-vision task: recognizing pictures of cats and dogs.

[0040] Preparation and notes before implementing the scheme:

[0041] Source task: CIFAR-10 classification.

[0042] The source task's annotated sample set: the open dataset CIFAR-10, which contains 10 object categories (including cat and dog), each with 6000 color images of size 32×32.

[0043] Source task model: a 10-class classification model trained on the CIFAR-10 dataset; the model is a deep neural network with the VGG16 structure.

[0044] Target task: to accurately determine whether the animal in an input picture from an Internet pet website is a cat, a dog, or something else (3 categories).

[0045] Target task unlabeled sample set: 50,000 color pet pictures of different sizes downloaded from the pet website, most of w...

Embodiment approach

[0051] Start the implementation (as shown in figure 2):

[0052] (1) The pictures of cats and dogs in the labeled sample set of the source task are taken out to form the alternative sample set S. Another feasible approach is to remove the non-animal categories (airplane, automobile, ship, truck) from the labeled sample set of the source task, and then uniformly relabel the remaining non-cat/dog categories (bird, deer, frog, horse) as "Other", forming the alternative sample set S.
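The second variant of step (1) can be sketched as follows. This is an illustrative sketch, not the patent's code: the class indices are CIFAR-10's standard ones, and the function name and output label scheme (0 = cat, 1 = dog, 2 = other) are assumptions.

```python
import numpy as np

# CIFAR-10 class indices (the dataset's fixed order):
# 0 airplane, 1 automobile, 2 bird, 3 cat, 4 deer,
# 5 dog, 6 frog, 7 horse, 8 ship, 9 truck
CAT, DOG = 3, 5
NON_ANIMAL = {0, 1, 8, 9}  # airplane, automobile, ship, truck -> dropped

def build_alternative_set(images, labels):
    """Sketch of step (1), second variant: drop the non-animal classes
    and relabel the remaining non-cat/dog animals to a single 'Other'
    class. Returns (S_images, S_labels), labels 0=cat, 1=dog, 2=other."""
    keep = ~np.isin(labels, list(NON_ANIMAL))
    imgs, labs = images[keep], labels[keep]
    new_labels = np.where(labs == CAT, 0, np.where(labs == DOG, 1, 2))
    return imgs, new_labels
```

The first variant (keeping only cats and dogs) would simply use `keep = np.isin(labels, [CAT, DOG])` instead.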

[0053] (2) Use the common feature mapping function to map all samples in the alternative sample set to the common feature space. That is, each sample in the alternative sample set S is fed to the source task model, the corresponding output of the model's feature-extraction network module is extracted, and all these outputs together form a new set.
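Step (2) can be sketched as follows. In the embodiment the feature extractor is the convolutional part of the trained VGG16 source model; here a fixed random linear projection with a ReLU stands in for it, purely for illustration, and all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the source model's feature-extraction module (in the
# embodiment: the convolutional layers of a VGG16 trained on CIFAR-10;
# here: a fixed random projection, for illustration only).
FEATURE_DIM, INPUT_DIM = 128, 32 * 32 * 3
W = rng.standard_normal((INPUT_DIM, FEATURE_DIM)) / np.sqrt(INPUT_DIM)

def common_feature_map(batch):
    """Map a batch of samples to the common feature space: feed each
    sample to the source task model and keep the output of its
    feature-extraction module."""
    flat = batch.reshape(len(batch), -1)
    return np.maximum(flat @ W, 0.0)  # ReLU, as in a typical CNN

S = rng.standard_normal((10, 32, 32, 3))  # toy alternative sample set
F_S = common_feature_map(S)               # feature set of S
```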

[0054] (3) Select N = ρ*T samples from the target task's unlabeled sample set to form the set U of target task samples to be labeled....
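The text is truncated after step (3), so the following is only a hypothetical sketch of the later steps as described in the Abstract: target samples outside the source task sample space are sent for manual annotation, while those inside it inherit the label of their optimal substitute from the labeled source set. The nearest-neighbour rule, the Euclidean metric, and the threshold tau are all assumptions, not taken from the patent.

```python
import numpy as np

def label_or_substitute(F_U, F_S, S_labels, tau):
    """Hypothetical sketch: a target sample whose nearest neighbour in
    the common feature space lies within distance tau is treated as
    'inside the source sample space' and inherits that neighbour's
    label; anything farther away is queued for manual annotation."""
    # pairwise Euclidean distances between target and source features
    d = np.linalg.norm(F_U[:, None, :] - F_S[None, :, :], axis=-1)
    nn = d.argmin(axis=1)                              # nearest source sample
    near = d[np.arange(len(F_U)), nn] <= tau           # inside source space?
    auto_labels = np.where(near, S_labels[nn], -1)     # -1 = needs annotation
    to_annotate = np.flatnonzero(~near)
    return auto_labels, to_annotate

# toy 2-D usage
F_S = np.array([[0.0, 0.0], [10.0, 10.0]])
S_labels = np.array([0, 1])
F_U = np.array([[0.1, 0.0], [9.9, 10.0], [5.0, 5.0]])
auto, todo = label_or_substitute(F_U, F_S, S_labels, tau=1.0)
```

Only the samples in `todo` would incur labeling cost, which is how the method reduces the annotation budget.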



Abstract

The invention discloses a training sample annotation cost reduction method for transfer learning. The method distinguishes among the to-be-annotated samples of a target task: a target task sample not located in the source task sample space is annotated, while for a target task sample located in the source task sample space, the optimal substitute is found automatically in the labeled sample set of the source task. The method reduces the number of training samples to be annotated, avoids repeated annotation of samples, increases the number of labeled samples without increasing the annotation cost, and allows the actual annotation cost to be controlled by adjusting parameters. In addition, the method keeps model performance stable while greatly compressing the annotation cost, and it can be combined with any sample selection method. It is also suitable for application scenarios in which the model needs to be updated, or in which a personalized model is built on the basis of a universal model.

Description

technical field

[0001] The invention relates to the field of transfer learning in artificial intelligence, and in particular to a method for reducing the labeling cost of target task samples by utilizing source task knowledge and existing labeled data.

Background technique

[0002] Current successful machine learning models attach great importance to data: their high performance relies on a large amount of labeled data. Machine learning models have been successfully applied to tasks that have accumulated, or can easily obtain, a large amount of labeled data (such as computer vision, text translation, and speech recognition); however, many more application scenarios have not accumulated and cannot easily obtain such data. To cope with the long tail of the distribution of application scenarios, the task knowledge that has already been learned needs to be transferable to new tasks.

[0003] The goal of transfer learning is to transfer the knowledge learned from the s...

Claims


Application Information

IPC(8): G06K9/00; G06K9/62; G06N20/00
CPC: G06N20/00; G06V40/10; G06F18/214
Inventor: 曹永锋, 刘大鹏, 苏彩霞, 王鹏举
Owner GUIZHOU NORMAL UNIVERSITY