Data sampling method and device

A data sampling technology, applied in the field of image recognition, that addresses the problems of incomplete feature learning and low image recognition accuracy and achieves a good network training effect.

Active Publication Date: 2020-01-21
NANJING KUANYUN TECH CO LTD +2
8 cites, 4 cited by

AI Technical Summary

Problems solved by technology

With the existing data sampling methods for the long-tail problem, the sampled samples are the same in every network iteration, from the beginning of network training to the end. The training samples therefore never change, which leads to incomplete feature learning by the network and low recognition accuracy.



Embodiment Construction

[0035] The principle and spirit of the present invention will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only to enable those skilled in the art to better understand and implement the present invention, rather than to limit the scope of the present invention in any way.

[0036] It should be noted that although expressions such as "first" and "second" are used herein to describe different modules, steps, data, etc. of the embodiments of the present invention, these expressions serve only to distinguish between different modules, steps, data, etc., without implying a particular order or degree of importance. In fact, expressions such as "first" and "second" can be used interchangeably.

[0037] At present, the main way to solve the long-tail problem is to uniformly sample samples of each category in the training data set, or to set a fixed weight for each sample in...


Abstract

The invention relates to the technical field of image recognition and provides a data sampling method and a data sampling device that address the long-tail problem of training data sets in the prior art. It aims to solve the incomplete network feature learning caused by the training samples being the same in every network iteration, from the beginning to the end of network training. The method comprises the following steps: acquiring the current iteration round of a network; updating the sample weight of each class of samples based on the current iteration round of the network and the number of samples in each class, such that the weight of the tail class samples gradually increases as the current iteration round increases; and, according to the updated sample weight of each class of samples, sampling the samples that meet a preset condition as target samples. As the iteration round increases, the sample weights in the training data set all increase, but the weights of the tail class samples increase by a larger amount, so the probability that tail class samples are selected as training samples rises. The long-tail problem can thus be effectively alleviated, and the network feature learning effect is good.
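The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the weight formula or the preset condition, so both are assumptions here: the hypothetical schedule `w_c = 1 + (t / T) * (n_max / n_c)` is chosen only because every weight grows with the round `t` while tail classes (small `n_c`) grow fastest, and the "preset condition" is assumed to be selection by a weighted random draw. The names `class_weights` and `sample_batch` are illustrative, not taken from the patent.

```python
import random
from collections import Counter


def class_weights(class_counts, current_round, total_rounds):
    """Per-class weight for the current round (hypothetical schedule).

    Assumed formula, not taken from the patent:
        w_c = 1 + (t / T) * (n_max / n_c)
    All weights grow with t; classes with few samples grow fastest.
    """
    progress = current_round / total_rounds
    n_max = max(class_counts.values())
    return {c: 1.0 + progress * (n_max / n) for c, n in class_counts.items()}


def sample_batch(labels, current_round, total_rounds, batch_size, rng):
    """Draw sample indices with probability proportional to class weight.

    The weighted random draw stands in for the unspecified
    "preset condition" (an assumption).
    """
    counts = Counter(labels)  # number of samples in each class
    weights = class_weights(counts, current_round, total_rounds)
    per_sample = [weights[y] for y in labels]
    return rng.choices(range(len(labels)), weights=per_sample, k=batch_size)


if __name__ == "__main__":
    labels = ["head"] * 100 + ["tail"] * 10
    rng = random.Random(0)
    for t in (0, 5, 10):
        batch = sample_batch(labels, t, 10, 1000, rng)
        tail_share = sum(labels[i] == "tail" for i in batch) / len(batch)
        print(f"round {t}: tail share of batch = {tail_share:.2f}")
```

Under this assumed schedule the weights are equal at round 0, so sampling follows the raw class frequencies; by the final round the tail class carries a much larger weight per sample, so the sampled batches change from round to round instead of staying fixed.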

Description

Technical field

[0001] The present invention generally relates to the technical field of image recognition, and in particular to a data sampling method and device.

Background

[0002] Data sampling means selecting some or all pictures from the training data set as training samples for image recognition network training. Most of the training data sets currently used for image recognition network training have a long-tail problem: a small number of classes each contain a large number of samples in the training data set, and their samples are called head class samples, while most classes each contain only a small number of samples, and their samples are called tail class samples. When data sampling is performed on a training data set with a long-tail problem, head class samples are often sampled as the training samples for image recognition network training. Therefore, the trained network tends to pre...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/214
Inventors: 周博言, 崔权, 宋仁杰, 赵博睿, 陈钊民, 谢烟平, 魏秀参
Owner: NANJING KUANYUN TECH CO LTD