
Text data labeling method, device, electronic equipment and storage medium

The technology relates to text data labeling and is applied in the field of artificial intelligence. It addresses problems such as mislabeling and aims to ensure the accuracy and comprehensiveness of the labels.

Active Publication Date: 2022-05-31
北京创新乐知网络技术有限公司 (+1 other)
13 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Existing text data annotation methods are becoming increasingly accurate, but they still produce obviously wrong labels for some text data.



Detailed Description of Embodiments

[0049] Step 102: obtain the text data to be labeled.

[0050] In this embodiment, the text data to be labeled includes a text type and a text title. The text data to be labeled is required

[0061] Further, the first quantity and the second quantity corresponding to different text types may be the same or different. That is

[0062] Label extraction from the text title mainly relies on the word2vec word-vector model, which is used to calculate
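The excerpt of paragraph [0062] is cut off before it says what word2vec calculates. As a minimal sketch only, the Python below assumes a common choice: cosine similarity between the averaged word vectors of the title and the vector of each tag in a tag vocabulary, keeping the top `top_n` tags as second candidates. The function names, the three-dimensional toy vectors and the tag vocabulary are all hypothetical and are not taken from the patent; a real system would load vectors from a trained word2vec model.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def title_candidate_tags(title_tokens, tag_vocab, embeddings, top_n):
    """Rank tags by similarity to the title and keep the top_n (assumed scheme)."""
    known = [embeddings[t] for t in title_tokens if t in embeddings]
    if not known:
        return []  # no title word has a vector, so nothing can be scored
    title_vec = mean_vector(known)
    scored = [
        (tag, cosine(title_vec, embeddings[tag]))
        for tag in tag_vocab
        if tag in embeddings
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical toy vectors standing in for a trained word2vec model.
TOY_EMBEDDINGS = {
    "python": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.2, 0.1],
    "dataframe": [0.85, 0.15, 0.05],
    "java": [0.1, 0.9, 0.0],
    "cooking": [0.0, 0.1, 0.9],
}

candidates = title_candidate_tags(
    ["pandas", "dataframe"], ["python", "java", "cooking"], TOY_EMBEDDINGS, top_n=2
)
print(candidates)
```

Here `top_n` plays the role of the second quantity mentioned in [0061], which the source says may differ between text types.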

[0070] The number of retained labels need not be limited; that is, if the correlation of all the candidate labels reaches

[0075] The tag-related word can be the tag word itself or a word with a similar meaning. For example,

[0077] The electronic device sums the correlations corresponding to all m times (k = 1, 2, ..., m) to obtain a non-normalized correlation
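The summation in [0077] can be sketched as follows. The excerpt does not show how each of the m per-term correlations is computed, nor how the non-normalized value is later normalized, so the per-term scores below are hypothetical inputs and the divide-by-maximum normalization is purely an assumption for illustration.

```python
def unnormalized_correlation(per_term_scores):
    """Sum the m per-term correlations r_k, k = 1..m, as described in [0077]."""
    return sum(per_term_scores)

def normalize_by_max(raw_by_tag):
    """Assumed normalization (not from the source): divide by the largest raw value."""
    peak = max(raw_by_tag.values(), default=0.0)
    if peak <= 0.0:
        return {tag: 0.0 for tag in raw_by_tag}
    return {tag: value / peak for tag, value in raw_by_tag.items()}

# Hypothetical per-term correlations for two candidate tags.
scores = {"python": [0.5, 0.25, 0.25], "java": [0.5]}

raw = {tag: unnormalized_correlation(values) for tag, values in scores.items()}
normalized = normalize_by_max(raw)
print(raw, normalized)
```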

[0079] In one embodiment, for the same label, the weights corresponding to different text types are n...


Abstract

The present application discloses a text data labeling method, device, electronic device and storage medium. A tag prediction model generates first candidate tags for the text data, and a word vector model generates second candidate tags from the text title of the text data. The correlation of each first and second candidate tag with respect to the text data is then calculated, and the candidate tags whose correlation exceeds the corresponding correlation threshold are extracted as the machine-labeled data of the text data. The application can improve the accuracy and comprehensiveness of text data labeling.
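The selection step described in the abstract can be sketched in Python as below. This is an illustration under stated assumptions, not the patented implementation: the candidate lists, the `correlation` callable, the per-tag threshold table and the `default_threshold` fallback are all hypothetical, and the source does not say whether the "corresponding" threshold is keyed by tag, by text type or by something else.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def select_machine_labels(
    first_candidates: Sequence[str],
    second_candidates: Sequence[str],
    correlation: Callable[[str], float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,  # hypothetical fallback, not from the source
) -> List[Tuple[str, float]]:
    """Keep every candidate tag whose correlation exceeds its threshold."""
    # Merge both candidate lists, dropping duplicates while preserving order.
    merged = list(dict.fromkeys([*first_candidates, *second_candidates]))
    kept = []
    for tag in merged:
        score = correlation(tag)
        if score > thresholds.get(tag, default_threshold):
            kept.append((tag, score))
    return kept  # no cap on how many tags are retained

# Hypothetical inputs standing in for the two models and the correlation measure.
model_tags = ["python", "pandas"]   # from the tag prediction model
title_tags = ["pandas", "java"]     # from the word vector model on the title
toy_scores = {"python": 0.9, "pandas": 0.6, "java": 0.2}

labels = select_machine_labels(
    model_tags, title_tags, toy_scores.__getitem__, thresholds={"pandas": 0.55}
)
print(labels)
```

The function places no limit on the number of retained tags: every candidate that clears its threshold is kept.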

Description

Technical field

The application belongs to the technical field of artificial intelligence and, specifically, relates to a text data labeling method, device, electronic device and storage medium.

Background

[0002] Text information exchange platforms usually host a large amount of creative text data such as blog posts and questions and answers. To facilitate user navigation, retrieval and classification, the content created on the platform is tagged. Text data is generally labeled automatically using artificial intelligence or machine learning algorithms, for example the TextCNN model or the BERT model. The accuracy of these labeling methods on text data keeps improving, but they still produce obviously wrong labels for some...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/258, G06F40/279, G06F16/35, G06N3/08
Inventors: 陈龙, 范飞龙
Owner: 北京创新乐知网络技术有限公司