Data annotation method and terminal

A data labeling technology, applied in the computer field, that addresses the high cost and long duration of repeated data labeling. It reduces cost, improves labeling quality, and thereby improves model performance.

Active Publication Date: 2018-09-07
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

In the prior art, repeated labeling of data requires a lot of manpower and material resources, which is costly and time-consuming.


Examples

Embodiment Construction

[0028] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0029] In the prior art, repeatedly labeling data in order to improve labeling quality consumes a lot of manpower and material resources, is costly, and takes a long time. To address this problem, each embodiment of the present invention proposes a data labeling method.

[0030] The data labeling method provided by the embodiment of the present invention first obtains the labeled data set D_all, where ...


Abstract

The invention relates to a data annotation method and a terminal. The data annotation method comprises the steps of: obtaining an annotated data set D_all, wherein D_all includes N pieces of data and the annotation labels respectively corresponding to the N pieces of data, and N is a positive integer; dividing the N pieces of data into K parts so as to generate K first training samples, wherein each first training sample includes (K-1) parts of the data, and no two first training samples contain identical data; training on each first training sample to generate K first classification models; performing predicted annotation on data J by using the first classification model Mi, and determining a predicted label of the data J; and performing repeated annotation on the data J if the predicted label of the data J is determined to be inconsistent with its annotation label. The method thereby reduces the cost and the time of repeatedly annotating already annotated data and improves the quality of data annotation, which in turn helps improve model performance.
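The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes that Mi is the model trained on the i-th first training sample and that data J is the part left out of that sample, which the abstract does not state explicitly. The nearest-centroid classifier is a toy stand-in for whatever classification model is actually used, and all function names and the example data are hypothetical.

```python
from collections import defaultdict


def split_into_folds(n, k):
    """Divide indices 0..n-1 into k parts (round-robin assignment)."""
    return [list(range(i, n, k)) for i in range(k)]


def train_centroid_model(features, labels):
    """Toy classifier (assumption): store the mean feature vector per label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        for d, value in enumerate(x):
            sums[y][d] += value
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: squared_distance(model[y]))


def find_suspect_labels(features, labels, k):
    """Return indices whose existing label disagrees with the prediction
    of the model trained on the other (k - 1) parts."""
    folds = split_into_folds(len(features), k)
    suspects = []
    for held_out in folds:
        held = set(held_out)
        # Each training sample consists of the (k - 1) parts not held out.
        train_idx = [j for j in range(len(features)) if j not in held]
        model_i = train_centroid_model(
            [features[j] for j in train_idx],
            [labels[j] for j in train_idx],
        )
        # Predict the held-out part; a mismatch marks it for re-annotation.
        for j in held_out:
            if predict(model_i, features[j]) != labels[j]:
                suspects.append(j)
    return sorted(suspects)


# Hypothetical data: two well-separated clusters; index 3 is mislabeled.
example_features = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (0.0, 0.0),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2),
]
example_labels = ["a", "a", "a", "b", "b", "b", "b", "b"]

suspects = find_suspect_labels(example_features, example_labels, k=4)
print(suspects)  # [3]
```

Only the flagged items are sent back for repeated annotation, rather than the whole data set, which is where the cost and time savings described in the abstract would come from.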

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a data labeling method and a terminal.

Background

[0002] As neural networks have made breakthroughs in various fields in recent years, more and more machine learning tasks have moved to neural network-related models. These models need labeled data as training data, so the quality of data labeling has an important impact on model performance.

[0003] In the prior art, data can be labeled manually or automatically. However, no matter how the data is labeled, errors are inevitable. To improve the quality of data labeling, the data must be labeled repeatedly. In the prior art, repeated labeling of data requires a lot of manpower and material resources, which is costly and time-consuming.

Summary of the invention

[0004] The present invention aims to sol...

Application Information

IPC(8): G06K9/62, G06N3/08
CPC: G06N3/08, G06F18/24, G06F18/214
Inventor: 谭翊章, 王兴光
Owner: TENCENT TECH (SHENZHEN) CO LTD