Annotated data processing method and device

A data labeling and processing technique in the field of data annotation. It addresses reduced labeling efficiency, excessive volumes of labeling information, and unbalanced data to be labeled, with the aim of improving the training effect, ensuring accuracy, and improving the prediction effect.

Pending Publication Date: 2019-12-24
大箴(杭州)科技有限公司

AI Technical Summary

Problems solved by technology

[0004] Existing labeling systems and methods support real-time labeling by multiple people, but all of them import the labeling data in a single batch and hand it to the labelers. Because the distribution of the labeling data is not known before labeling, the extracted data to be labeled i...



Example Embodiment

[0056] Hereinafter, exemplary embodiments of the present disclosure will be described in more detail with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. On the contrary, these embodiments are provided to enable a more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0057] An embodiment of the present invention provides a method for processing annotated data that updates the data to be annotated based on annotation results counted in real time, improving the training effect of the model. As shown in Figure 1, the method includes:

[0058] 101. Obtain sample data under each category randomly selected from machine data as data to be labeled.
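Step 101 can be illustrated with a minimal sketch of per-category random sampling. It assumes each machine-data record already carries a provisional category; the function name `sample_per_category`, the `(category, item)` record shape, and the `per_category` and `seed` parameters are hypothetical and do not come from the patent text.

```python
import random
from collections import defaultdict


def sample_per_category(records, per_category, seed=None):
    """Randomly draw up to `per_category` items from every category.

    `records` is an iterable of (category, item) pairs. This record shape
    is an assumption made for illustration only.
    """
    rng = random.Random(seed)

    # Group the machine data by its category.
    by_category = defaultdict(list)
    for category, item in records:
        by_category[category].append(item)

    # Sample without replacement; small categories are taken whole.
    return {
        category: rng.sample(items, min(per_category, len(items)))
        for category, items in by_category.items()
    }


if __name__ == "__main__":
    machine_data = [("billing", f"msg-{i}") for i in range(10)]
    machine_data += [("refund", f"msg-{i}") for i in range(10, 13)]
    to_be_labeled = sample_per_category(machine_data, per_category=5, seed=0)
    for category, items in to_be_labeled.items():
        print(category, len(items))
```

Capping the draw at the category size keeps rare categories in the batch instead of raising an error when fewer than `per_category` items exist.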

[0059] Among ...


Abstract

The invention discloses a method and device for processing annotated data, together with computer equipment and a computer storage medium. It relates to the technical field of data annotation and aims to adjust the distribution of the data to be labeled and improve the effect of the labeled data on model training. The method comprises: obtaining sample data under each category, randomly extracted from machine data, as the data to be labeled; while the data to be labeled is being labeled on a labeling platform, counting the labeled data under each category and judging whether the labeled data under each category reaches a training standard preset for a classification prediction model; if so, taking the labeled data of the categories that reach the training standard as training data and inputting it into a network model for training to obtain the classification prediction model; and updating the data to be labeled according to the prediction probabilities that the classification prediction model assigns to test data.
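The counting, threshold check, and update steps in the abstract can be sketched roughly as follows. The abstract does not say how the training standard is defined or how prediction probabilities drive the update, so this sketch assumes a simple minimum count per category and assumes that test items whose highest predicted probability falls below a confidence threshold are added to the labeling queue. All names (`categories_ready`, `select_training_data`, `update_to_be_labeled`, `predict_proba`, `confidence_threshold`) are hypothetical.

```python
from collections import Counter


def categories_ready(labeled, min_per_category):
    """Return the categories whose labeled count meets the assumed standard.

    `labeled` is a list of (item, category) pairs.
    """
    counts = Counter(category for _, category in labeled)
    return {c for c, n in counts.items() if n >= min_per_category}


def select_training_data(labeled, ready):
    """Keep only labeled pairs whose category has reached the standard."""
    return [(item, c) for item, c in labeled if c in ready]


def update_to_be_labeled(queue, test_items, predict_proba,
                         confidence_threshold=0.6):
    """Append low-confidence test items to the labeling queue.

    `predict_proba(item)` is a hypothetical callable returning a dict of
    category -> probability. The low-confidence rule is an assumption,
    not the update rule stated in the patent.
    """
    updated = list(queue)
    for item in test_items:
        probs = predict_proba(item)
        if max(probs.values()) < confidence_threshold and item not in updated:
            updated.append(item)
    return updated


if __name__ == "__main__":
    labeled = [("x1", "a"), ("x2", "a"), ("x3", "b")]
    ready = categories_ready(labeled, min_per_category=2)
    print(select_training_data(labeled, ready))
```

A real system would train the network model on the selected data between the second and third steps; that part is omitted here because the source does not describe the model.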

Description

Technical field

[0001] The present invention relates to the technical field of data labeling, and in particular to a method, device, computer equipment and computer storage medium for processing labeled data.

Background

[0002] In recent years, with the continuous development of computer and Internet technology, intelligent applications have emerged one after another, and tools such as big data and artificial intelligence have gradually been put into practice. Natural language processing is a branch of artificial intelligence that enables computers to understand human language, including the content, thoughts and emotions it expresses.

[0003] Mainstream natural language processing techniques are based largely on statistical machine learning and depend on two things: statistical models and optimization algorithms for different tasks, and correspondingly large-scale corpora. Th...


Application Information

IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/241
Inventor: 刘逸哲
Owner: 大箴(杭州)科技有限公司