Optimization method and device for data annotation

An optimization method and device for data annotation, applied in the field of data analysis. It addresses low data quality, low data labeling accuracy, and labeled data that has no value for model training, and thereby improves the accuracy of data labeling.

Active Publication Date: 2018-07-03
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] At present, data is mainly sorted and labeled manually: data editors label every item that needs labeling, one by one. When the amount of data is large, this consumes enormous manpower and material resources. In addition, some manually labeled data turns out to be of low quality or useless when the model is trained, which results in low data labeling accuracy.




Embodiment Construction

[0025] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0026] As shown in Figure 1, an optimization method for data labeling provided by an embodiment of the present invention includes:

[0027] 101. Select the data to be labeled and perform feature vectorization processing on it.

[0028] Specifically, selecting the data to be labeled may mean selecting part of the data from the data set to be labeled, and the feature vectorization process may be performing feat...
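Step 101 can be illustrated with a minimal sketch. The source text is truncated before it names a specific vectorization technique, so the bag-of-words term counts used here are an assumption, and the function names `build_vocabulary` and `vectorize` and the sample texts are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Turn one text into a term-count vector over the vocabulary."""
    counts = Counter(text.lower().split())
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical data set to be labeled; only part of it is selected.
dataset = [
    "good phone good price",
    "bad battery",
    "good battery",
    "bad price bad phone",
]
selected = dataset[:3]

vocab = build_vocabulary(selected)
vectors = [vectorize(text, vocab) for text in selected]

for text, vec in zip(selected, vectors):
    print(text, "->", vec)
```

Each selected item becomes a fixed-length numeric vector, which is the form that the later clustering and classification steps operate on.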



Abstract

The invention discloses an optimization method and device for data annotation, relates to the technical field of data analysis, and solves the problem of low data annotation accuracy. The method comprises the following steps: first, selecting data to be annotated and performing feature vectorization processing on it; then clustering the vectorized data; temporarily annotating the data according to the clustering result; classifying the temporarily annotated data; and, according to the classification result and a preset condition, determining which of the temporarily annotated data is to be used for model training and annotating it accordingly. The method is suitable for data annotation.
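The steps after vectorization can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not name a clustering algorithm, a classifier, or the preset condition, so plain k-means, a leave-one-out k-nearest-neighbor vote, and the rule "the classifier agrees with the temporary label with a vote share of at least `min_share`" are all assumptions. The function names and the sample points are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain k-means. Simplification: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def knn_vote(index, points, labels, n_neighbors=3):
    """Leave-one-out k-NN: (predicted label, vote share) for points[index]."""
    others = [i for i in range(len(points)) if i != index]
    nearest = sorted(
        others, key=lambda i: math.dist(points[index], points[i])
    )[:n_neighbors]
    label, count = Counter(labels[i] for i in nearest).most_common(1)[0]
    return label, count / len(nearest)


def select_training_data(points, k=2, min_share=1.0):
    """Cluster, label temporarily, classify, then keep confident samples."""
    temp_labels = kmeans(points, k)  # cluster id serves as the temporary label
    kept = []
    for i, p in enumerate(points):
        predicted, share = knn_vote(i, points, temp_labels)
        # Assumed preset condition: agreement plus a minimum vote share.
        if predicted == temp_labels[i] and share >= min_share:
            kept.append((p, temp_labels[i]))
    return temp_labels, kept


# Hypothetical vectorized data: two tight groups and one ambiguous point.
points = [
    (0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
    (10, 11), (11, 10), (11, 11), (5, 6.5),
]
temp_labels, kept = select_training_data(points, k=2, min_share=1.0)
print("temporary labels:", temp_labels)
print("kept for training:", kept)
```

In this toy data the point `(5, 6.5)` receives a temporary label from clustering, but its neighbors do not vote unanimously for that label, so it is left out of the training set while the eight clearly grouped points are kept.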

Description

Technical field

[0001] The invention relates to the technical field of data analysis, in particular to an optimization method and device for data labeling.

Background art

[0002] In recent years, with the rapid development of the Internet, training models on data has become more and more widespread. Data labeling is the basis of many supervised machine learning techniques. Users select a certain amount of suitable labeled data to train a model and obtain a model that can be used. Labeled data can be obtained by crawling and collecting data from the Internet according to actual needs and then sorting and labeling the captured data.

[0003] At present, data is mainly sorted and labeled manually: data editors label every item that needs labeling, one by one. When the amount of data is large, this consumes enormous manpower and material resources. In addition, some data after manual labeling wil...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/00
Inventor: 王天祎
Owner: BEIJING GRIDSUM TECH CO LTD