Annotation data generation method and device and computer readable storage medium

A technology for labeled data and computer programs, applied in the field of data models. It addresses problems such as insufficiently dispersed features, reduced recognition rates, and model overfitting, and achieves faster training and higher accuracy, a larger and richer data set, and strong data randomness.

Active Publication Date: 2018-12-07
BLACKSHARK TECH NANCHANG CO LTD

AI Technical Summary

Problems solved by technology

[0004] 1. For some models, the original images are small in length and width, so random cropping can expand the data only to a limited extent.
[0005] 2. When the original sample data set is small, the data obtained by these methods tends to overfit the model because its features are not sufficiently dispersed.
[0006] 3. Some models are sensitive to data stretching, and the recognition rate drops significantly after stretching.
[0007] 4. Manually collecting and labeling data consumes a great deal of manpower and effort.
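
The limit described in problem 1 can be illustrated with a short count. The following is a minimal sketch, not part of the patent: it assumes axis-aligned crops on a pixel grid, and the image and crop sizes are hypothetical values chosen only for illustration.

```python
def count_distinct_crops(width: int, height: int, crop_w: int, crop_h: int) -> int:
    """Count the distinct top-left positions for a crop_w x crop_h crop.

    Assumption (not from the source): crops are axis-aligned and placed
    on whole-pixel offsets inside a width x height image.
    """
    if crop_w > width or crop_h > height:
        return 0
    return (width - crop_w + 1) * (height - crop_h + 1)


# Hypothetical sizes: a small source image versus a larger one.
small = count_distinct_crops(32, 32, 28, 28)        # 5 * 5 = 25
large = count_distinct_crops(256, 256, 224, 224)    # 33 * 33 = 1089
print(small, large)
```

Under these assumptions the small image yields only 25 distinct crops, most of them overlapping heavily, which is why cropping alone expands such data only slightly.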



Examples


Embodiment 1

[0076] Referring to FIG. 3, in a preferred embodiment, the annotation data generation method further includes the following steps:

[0077] S700: Determine whether the amount of data in the labeled data set A' is greater than or equal to an expected amount of data;

[0078] First, it is necessary to judge whether the amount of data in the currently obtained labeled data set A', and the quality of the training model, meet the expected amount of data or the expected quality. In other words, whether the training model can accurately label data when analyzing and discriminating the data in the complete data set U can be verified by experiment, which shows how well the training model fits the labeled data.

[0079] S800: When the amount of data in the labeled dataset A is less than the expected amount of data, take the union of the training dataset T and the labeled dataset A, and execute steps S500-S600 again.

[0080] When it is determined that t...
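
The S700/S800 loop of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `train` and `annotate` are hypothetical callables standing in for steps S500 and S600, data sets are modeled as dictionaries from item to label, and `max_rounds` is an added safeguard that the source does not mention.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional


def grow_labeled_set(
    universe: Iterable[Hashable],
    labeled: Dict[Hashable, str],
    training: Dict[Hashable, str],
    expected_size: int,
    train: Callable[[Dict[Hashable, str]], object],
    annotate: Callable[[object, Hashable], Optional[str]],
    max_rounds: int = 10,  # assumption: guard against endless looping
) -> Dict[Hashable, str]:
    labeled = dict(labeled)
    training = dict(training)
    items = list(universe)
    for _ in range(max_rounds):
        # S700: stop once the labeled set reaches the expected amount.
        if len(labeled) >= expected_size:
            break
        # S800: take the union of the training set T and the labeled set A.
        training.update(labeled)
        # S500: retrain on the enlarged training set.
        model = train(training)
        # S600: label remaining items of the universe that the model accepts.
        for item in items:
            if item not in labeled:
                label = annotate(model, item)
                if label is not None:
                    labeled[item] = label
    return labeled


# Toy stand-ins (hypothetical): the "model" is a copy of the training data,
# and an integer item is labeled only if a training item lies within 2 of it.
def toy_train(data):
    return dict(data)


def toy_annotate(model, item):
    nearest = min(model, key=lambda known: abs(known - item))
    return model[nearest] if abs(nearest - item) <= 2 else None


result = grow_labeled_set(range(10), {0: "a"}, {}, 5, toy_train, toy_annotate)
print(sorted(result))
```

In the toy run each round extends the labeled set by the items close to what is already trusted, so the set grows from one item to five over two rounds and the loop then stops at S700.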

Embodiment 2

[0082] In another embodiment, the annotation data generation method further includes the following steps:

[0083] S700: Determine whether the amount of data in the labeled data set A' is greater than or equal to an expected amount of data;

[0084] First, it is necessary to judge whether the amount of data in the currently obtained labeled data set A', and the quality of the training model, meet the expected amount of data or the expected quality. In other words, whether the training model can accurately label data when analyzing and discriminating the data in the complete data set U can be verified by experiment, which shows how well the training model fits the labeled data.

[0085] S800': When the amount of data in the labeled dataset A is less than the expected amount of data, replace the data in the fake dataset F with the data in the labeled dataset A, and execute steps S300-S600 again.

[0086] Different from the trust in the training res...
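
The S800' variant, which restarts from step S300 instead of S500, can be sketched in the same spirit. Again this is a hedged illustration only: `expand` is a hypothetical stand-in for the GAN-based expansion of S300, `screen`, `train`, and `annotate` stand in for S400 to S600, labels are reduced to set membership for brevity, and `max_rounds` is an added assumption.

```python
from typing import Callable, Hashable, Iterable, Set


def regenerate_from_labeled(
    universe: Iterable[Hashable],
    labeled: Set[Hashable],
    expected_size: int,
    expand: Callable[[Set[Hashable]], Set[Hashable]],
    screen: Callable[[Set[Hashable]], Set[Hashable]],
    train: Callable[[Set[Hashable]], object],
    annotate: Callable[[object, Hashable], bool],
    max_rounds: int = 10,  # assumption: guard against endless looping
) -> Set[Hashable]:
    labeled = set(labeled)
    items = list(universe)
    for _ in range(max_rounds):
        # S700: stop once the labeled set reaches the expected amount.
        if len(labeled) >= expected_size:
            break
        # S800': the fake data set F is replaced with the labeled set A.
        fake = set(labeled)
        expanded = expand(fake)      # S300: expansion (GAN in the source)
        training = screen(expanded)  # S400: keep usable, annotated data
        model = train(training)      # S500: train a model
        # S600: add universe items that the model accepts.
        labeled |= {x for x in items if x not in labeled and annotate(model, x)}
    return labeled


# Toy stand-ins (hypothetical): expansion jitters each integer by +/-1.
result = regenerate_from_labeled(
    universe=range(20),
    labeled={10},
    expected_size=5,
    expand=lambda fake: {x + d for x in fake for d in (-1, 0, 1)},
    screen=lambda data: {x for x in data if 0 <= x < 20},
    train=lambda data: frozenset(data),
    annotate=lambda model, x: x in model,
)
print(sorted(result))
```

The difference from the S800 sketch is where the newly labeled data re-enters the pipeline: here it seeds the expansion step rather than being merged directly into the training set.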


Abstract

The invention provides an annotation data generation method and device and a computer-readable storage medium. The method comprises the following steps: S100: acquiring a data universal set and the annotated annotation data set included in the data universal set; S200: analyzing the data characteristics of the annotation data set and, according to those characteristics, producing a pseudo-data set that satisfies them; S300: expanding the pseudo-data set based on a GAN neural network to form an expanded data set; S400: identifying whether data in the expanded data set needs to be annotated, and screening the annotated data to form a training data set; S500: performing neural network training on the training data set to form a training model; and S600: based on the training model, cleaning the data in the data universal set other than the data in the annotation data set, annotating the data that satisfies the training model, and placing that data in the annotation data set. Thus, starting from a small amount of data, a training data set that closely matches the sample data and is highly random can be generated quickly and efficiently, and the quantity of annotation data is increased.
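
Step S200 can be pictured with a small sketch. The abstract does not say which data characteristics are analyzed, so the choice below is purely an assumption for illustration: each feature is summarized by its mean and standard deviation, and pseudo-samples are drawn from independent normal distributions with those parameters. The function name and the sample values are hypothetical.

```python
import random
import statistics
from typing import List, Sequence


def make_pseudo_dataset(
    labeled: Sequence[Sequence[float]], n: int, seed: int = 0
) -> List[List[float]]:
    """Draw n pseudo-samples matching per-feature mean and spread.

    Assumption (not from the source): the "data characteristics" are the
    per-feature mean and population standard deviation.
    """
    rng = random.Random(seed)
    columns = list(zip(*labeled))  # one tuple of values per feature
    means = [statistics.mean(col) for col in columns]
    spreads = [statistics.pstdev(col) for col in columns]
    return [
        [rng.gauss(mean, spread) for mean, spread in zip(means, spreads)]
        for _ in range(n)
    ]


# Hypothetical labeled samples with two features each.
labeled_samples = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
pseudo = make_pseudo_dataset(labeled_samples, n=100)
print(len(pseudo), len(pseudo[0]))
```

In the method described above, a pseudo-data set of this kind would then be handed to the GAN-based expansion of step S300.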

Description

Technical field

[0001] The present invention relates to the field of data models, in particular to a method, device and computer-readable storage medium for generating labeled data.

Background technique

[0002] With the rapid development of applications on smart terminals, application-based artificial intelligence technology has entered people's lives more and more widely. Whether for daily use, games, or work, it needs to learn from original sample data to understand usage habits in the relevant field and thereby make intelligent judgments.

[0003] Deep neural network technology can be used to learn from original sample data. It has developed rapidly in recent years, has achieved accuracy far beyond expectations in the field of image recognition, and has been applied successfully in many fields. However, in practical engineering applications, many special image recognition requirements lack training data s...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06K9/62
CPC: G06N3/045, G06F18/214
Inventor: 郑斌, 徐晖
Owner: BLACKSHARK TECH NANCHANG CO LTD