Data label correcting method and device and storage medium

A data label correction technology, applied in the fields of electrical digital data processing, special data processing applications, natural language data processing, etc. It addresses the loss of data information and the low reliability of data labels, thereby improving label reliability and the prediction accuracy of the resulting model.

Active Publication Date: 2018-11-27
GUANGZHOU DUOYI NETWORK TECH +2
7 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0004] In implementing the embodiments of the present invention, the inventors found that each existing approach has a drawback: manual labeling depends on the knowledge and energy of the staff; cross-entropy screening loses part of the data information; information retrieval depends on a particular test set, and the quality of the retrieved data is difficult to guarantee; and data discarding risks losing important data. As a result, the data labels of existing data have low reliability, and the prediction accuracy of the resulting machine learning model is not high.



Detailed Description of the Embodiments

[0043] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0044] Referring to Figure 1, which is a schematic flowchart of a data label correction method provided by an embodiment of the present invention, the correction method provided in Embodiment 1 of the present invention includes steps S110 to S150.

[0045] S110. Load the data set to be corrected, wherein the data set to be corrected includes a training set and a test set, and the data in both the training set and the test set are marked with preset data labels.
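As a rough illustration of step S110, the following Python sketch wraps raw (text, label) pairs and splits them into a training set and a test set. The names `LabeledItem` and `load_dataset_to_correct`, the split ratio, and the seeded shuffle are all assumptions made for this example; the source only states that both sets exist and that their data carry preset labels.

```python
import random
from dataclasses import dataclass


@dataclass
class LabeledItem:
    """One piece of data with its preset (possibly unreliable) label.

    Hypothetical type, introduced only for this illustration.
    """
    item_id: int
    text: str
    preset_label: str


def load_dataset_to_correct(records, test_fraction=0.25, seed=0):
    """Wrap (text, label) pairs and split them into (training_set, test_set).

    The split ratio and the shuffle are assumptions, not taken from the source.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    items = [LabeledItem(i, text, label)
             for i, (text, label) in enumerate(records)]
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    # Keep at least one item in the test set.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up sample records, for illustration only.
records = [
    ("great game", "positive"), ("love it", "positive"),
    ("too many bugs", "negative"), ("crashes often", "negative"),
    ("fun with friends", "positive"), ("boring quests", "negative"),
    ("nice graphics", "positive"), ("slow servers", "negative"),
]
training_set, test_set = load_dataset_to_correct(records)
```

Every item keeps its preset label after the split, so later steps can compare it with the labels produced by a model.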

[0046] W...


Abstract

The invention discloses a data label correction method and relates to the field of machine learning. The method comprises: loading a data set to be corrected; training a machine learning model to obtain a matching model; feeding the data of the test set into the matching model as input data and obtaining the matching results output by the matching model, so as to update the data label of each piece of input data; when the number of matching results has not reached a preset value, constructing a new training set and a new test set based on the data of the data set to be corrected; and when the number of matching results reaches the preset value, combining the obtained matching results with the preset data labels to calculate, for each piece of data in the data set to be corrected, a correction confidence for each label, so as to correct the data label of each piece of data in the data set to be corrected. The invention further provides a data label correction device and a storage medium. The reliability of the data labels can be effectively improved, thereby increasing the prediction accuracy of the resulting machine learning model.
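The loop described in the abstract can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the patented implementation: the "machine learning model" is a toy one-dimensional nearest-centroid classifier, "the number of matching results reaches a preset value" is read as every item having at least `results_required` predictions, and the correction confidence is taken as a normalized vote count in which the preset label contributes `preset_weight` votes. All function and parameter names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_centroids(training_set):
    """Toy stand-in for the matching model: mean feature value per label."""
    sums, counts = defaultdict(float), Counter()
    for x, label in training_set:
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict(model, x):
    """Matching result: the label whose centroid lies closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def correct_labels(data, results_required=3, test_fraction=0.3,
                   preset_weight=1.0, seed=0, max_rounds=10_000):
    """Return one (corrected_label, confidence) pair per item of `data`.

    `data` is a list of (feature, preset_label) pairs. The stopping rule and
    the confidence formula are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    votes = [Counter() for _ in data]
    for _ in range(max_rounds):
        # Stop once every item has collected enough matching results.
        if min(sum(v.values()) for v in votes) >= results_required:
            break
        # Otherwise construct a new training set and a new test set.
        indices = list(range(len(data)))
        rng.shuffle(indices)
        n_test = max(1, int(len(indices) * test_fraction))
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = train_centroids([data[i] for i in train_idx])
        for i in test_idx:
            votes[i][predict(model, data[i][0])] += 1
    corrected = []
    for (_, preset), item_votes in zip(data, votes):
        # Combine the matching results with the preset label.
        scores = Counter(item_votes)
        scores[preset] += preset_weight
        total = sum(scores.values())
        confidence = {label: s / total for label, s in scores.items()}
        best = max(confidence, key=confidence.get)
        corrected.append((best, confidence[best]))
    return corrected


# Made-up data: two clusters near 0 and 10; item 3 carries a wrong preset label.
data = [(0.0, "a"), (0.5, "a"), (1.0, "a"), (0.8, "b"), (9.0, "b"),
        (9.5, "b"), (10.0, "b"), (10.5, "b"), (0.2, "a"), (9.8, "b")]
result = correct_labels(data)
```

Because each item is only predicted in rounds where it sits in the test set, the model never sees an item's own preset label when producing that item's matching result; the preset label enters only at the confidence stage.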

Description

Technical field

[0001] The present invention relates to the field of machine learning, in particular to a data label correction method, device and storage medium.

Background art

[0002] In supervised learning tasks, training a machine learning or deep learning system requires large amounts of data annotated with corresponding data labels. Generally speaking, the more high-quality labeled data is used to train the model, the better the trained model reflects the real situation, and the more reliable its predictions on unknown data are.

[0003] To improve the labeling quality of data, it is necessary to find labels that match the data. In the prior art, commonly used technical means for this purpose include manual labeling, cross-entropy screening, information retrieval, and data discarding. Among them, manual labeling labels the data by hand; cross-entropy screening divides the origina...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/21
CPC: G06F40/117
Inventor: 徐波
Owner: GUANGZHOU DUOYI NETWORK TECH