Intelligent auxiliary annotation method and system for text data

Applied in the computer field, this text data annotation technology addresses the high labor cost, poor reliability, and low efficiency and accuracy of data labeling, and achieves low labor cost, high reliability, and reduced labeling complexity.

Pending Publication Date: 2022-07-29
中译语通信息科技(上海)有限公司

AI Technical Summary

Problems solved by technology

[0005] The present invention provides an intelligent auxiliary labeling method and system for text data, aiming to solve the problems of prior-art data labeling methods: high labor cost, poor reliability, and low labeling efficiency and accuracy.

Examples

Embodiment Construction

[0062] It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0063] The main technical problems solved by the embodiments of the present invention are as follows:

[0064] Existing data labeling is usually done manually. This approach requires manually extracting text features from a large amount of data and manually labeling those features to obtain a training set and a verification set. Only after a model has been trained on these sets can the preprocessed original files be labeled. The process has a high labor cost and poor reliability, and the efficiency and accuracy of the neural network model for data labeling are relatively low.

[0065] In order to solve the above problems, the following embodiments of the present invention provide an intelligent auxiliary labeling scheme for data. By acquiring a data set of a related item, and then according to a pre-def...

Abstract

The invention discloses an intelligent auxiliary annotation method and system for text data. The method comprises the following steps: obtaining a data set to be annotated; automatically pre-annotating the text data in the data set according to a data annotation rule to obtain a pseudo-annotated data set; obtaining manual modification behavior information for the pseudo-annotated data set, and performing binary classification training on the pseudo-annotated data set according to that information to obtain an evaluation result for the pseudo-annotated data set; and filtering the pseudo-annotated data set according to the evaluation result to obtain high-quality annotation data. This technical scheme addresses the high labor cost, poor reliability, and low text data annotation efficiency and accuracy of prior-art data annotation.
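The four steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it reads the training step as binary classification, and the regex rules in `RULES`, the three features in `features`, the logistic-regression model, and the 0.5 threshold are all assumptions introduced here, since the source does not specify them.

```python
import math
import re
from dataclasses import dataclass, field

# Hypothetical annotation rules (regex pattern -> label); the source does not list its rules.
RULES = {
    r"\b\d{4}-\d{2}-\d{2}\b": "DATE",
    r"\b[A-Z][a-z]+ (?:Inc|Ltd)\b": "ORG",
}


@dataclass
class Sample:
    text: str
    spans: list = field(default_factory=list)  # (start, end, label) tuples


def pre_annotate(texts, rules=RULES):
    """Rule-based automatic pre-annotation, producing a pseudo-annotated data set."""
    samples = []
    for text in texts:
        spans = [(m.start(), m.end(), label)
                 for pattern, label in rules.items()
                 for m in re.finditer(pattern, text)]
        samples.append(Sample(text, sorted(spans)))
    return samples


def features(sample):
    """Illustrative features only: bias term, span count, fraction of characters covered."""
    covered = sum(end - start for start, end, _ in sample.spans)
    return [1.0, float(len(sample.spans)), covered / max(len(sample.text), 1)]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_binary_classifier(samples, accepted, epochs=500, lr=0.5):
    """Fit logistic regression with SGD.

    accepted[i] is 1 if the annotator left sample i unchanged and 0 if they
    modified it (a simplified stand-in for "manual modification behavior").
    """
    weights = [0.0] * len(features(samples[0]))
    for _ in range(epochs):
        for sample, y in zip(samples, accepted):
            x = features(sample)
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            weights = [w + lr * (y - p) * xi for w, xi in zip(weights, x)]
    return weights


def score(sample, weights):
    """Evaluation result: estimated probability that the pseudo-annotation is acceptable."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, features(sample))))


def filter_high_quality(samples, weights, threshold=0.5):
    """Keep only pseudo-annotated samples whose score reaches the threshold."""
    return [s for s in samples if score(s, weights) >= threshold]


if __name__ == "__main__":
    data = pre_annotate([
        "Acme Inc signed on 2021-03-04",
        "Report dated 2020-01-15",
        "nothing to tag here",
        "call me later",
    ])
    w = train_binary_classifier(data, accepted=[1, 1, 0, 0])
    for s in filter_high_quality(data, w):
        print(s.text, s.spans)
```

In practice the classifier would be trained on one batch of reviewed samples and applied to later, unreviewed batches; the sketch scores its own training data only to stay short.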

Description

Technical field

[0001] The present invention relates to the field of computers, in particular to a method and system for intelligent auxiliary annotation of text data.

Background technique

[0002] Data annotation is one of the necessary processes in the construction of computer algorithm models, and it also determines the upper limit of the algorithm model. Existing data annotation is often manual, and the cost of labeling is relatively high. To reduce the cost of data labeling, the prior art usually combines manual work with a neural network to construct a machine learning model that labels the data.

[0003] Specifically, the existing data labeling methods usually use pre-built machine learning models to automatically label the preprocessed original files, while the building process of the machine learning model requires manual extraction of text features from a large amount of data, and manual adjustment of text features. The text featur...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/279, G06F40/216, G06F16/35
CPC: G06F40/279, G06F40/216, G06F16/35
Inventor: 杨万征蔡超武学敏董乐乐
Owner: 中译语通信息科技(上海)有限公司