Text labeling method and device based on teacher supervision

A teacher-supervised text annotation technique in the field of natural language processing. It addresses three problems: deep learning models do not meet accuracy requirements, their recall is insufficient or even zero, and they generalize poorly. Its effects are to avoid label curing (labels that become fixed) and to improve recall, accuracy, and reliability.

Active Publication Date: 2019-08-16
CHINANETCENT TECH
Cites: 5; Cited by: 6

AI Technical Summary

Problems solved by technology

As natural language processing technology continues to develop, existing deep learning models based on character granularity no longer meet the rising accuracy requirements for text annotation.
Moreover, when a well-trained deep learning model is applied to a new domain, its recall is insufficient or even zero, so the model generalizes poorly and the labels at word edges tend to become fixed ("cured").

Examples

Embodiment Construction

[0064] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0065] An embodiment of the present invention provides a teacher-supervised text labeling method. The method may be executed by a text labeling device. The device may use a deep learning model based on character granularity (which may be called a character labeling model) to label a large number of texts to be labeled in a text labeling task. It then performs word segmentation on the same texts through a language model based on word granularity (which may be called a word segmentation model), and the result of that segmentation (which may be called the word segmentation result) is then used to check and correct the preliminary t...
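The first step above, turning per-character labels into labeled words, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the BIO tag scheme, the function name `extract_labeled_words`, and the example sentence are all assumptions, since the text shown here does not specify how the character labeling model encodes its output.

```python
from typing import List, Tuple

# (start, end_exclusive, word, label); this tuple layout is an assumption.
Span = Tuple[int, int, str, str]


def extract_labeled_words(text: str, tags: List[str]) -> List[Span]:
    """Collect the labeled words implied by per-character BIO tags.

    Assumes one tag per character, using "B-X", "I-X", and "O".
    """
    if len(text) != len(tags):
        raise ValueError("one tag per character is required")
    spans: List[Span] = []
    start = None
    label = None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, text[start:i], label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


if __name__ == "__main__":
    # Hypothetical character-level output for a six-character sentence.
    print(extract_labeled_words("我在北京工作",
                                ["O", "O", "B-LOC", "I-LOC", "O", "O"]))
```

An "I-" tag that does not continue an open span of the same label is ignored in this sketch; a production decoder might instead treat it as the start of a new span.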

Abstract

The embodiments of the invention provide a text labeling method and device based on teacher supervision, belonging to the technical field of natural language processing. The method comprises: using a character labeling model to label a text to be labeled, generating a character labeling result that contains labeled words; performing word segmentation on the text to be labeled through a preset word segmentation model, generating a word segmentation result that contains segmented words; and, according to the similarity between each labeled word and each segmented word, re-labeling the characters of the character labeling result based on the segmented words to obtain a fused labeling result, and outputting the fused labeling result. The method and device can improve the accuracy and recall of text labeling.
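The fusion step in the abstract can be sketched as follows. This is a hedged illustration under stated assumptions, not the claimed method: the abstract does not say how similarity is computed, so this sketch uses the Jaccard overlap of character index ranges with a hypothetical threshold of 0.5. The names `word_offsets`, `jaccard`, and `fuse`, the BIO output format, and the example sentence are likewise illustrative.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open character range [start, end)


def word_offsets(words: List[str]) -> List[Range]:
    """Turn a segmentation (words that concatenate to the text) into ranges."""
    offsets, pos = [], 0
    for w in words:
        offsets.append((pos, pos + len(w)))
        pos += len(w)
    return offsets


def jaccard(a: Range, b: Range) -> float:
    """Overlap of two character ranges divided by the size of their union."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


def fuse(text_len: int,
         labeled: List[Tuple[int, int, str]],
         words: List[str],
         threshold: float = 0.5) -> List[str]:
    """Re-label characters, snapping labeled words to similar segmented words.

    `labeled` holds (start, end_exclusive, label) spans from the character
    labeling model. The threshold value is an assumption.
    """
    tags = ["O"] * text_len
    segs = word_offsets(words)
    for start, end, label in labeled:
        best = max(segs, key=lambda s: jaccard((start, end), s), default=None)
        if best is not None and jaccard((start, end), best) >= threshold:
            start, end = best  # adopt the segmenter's word boundary
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


if __name__ == "__main__":
    # The character model labeled only "北京"; the segmenter found "北京市".
    print(fuse(7, [(2, 4, "LOC")], ["我", "在", "北京市", "工作"]))
    # ['O', 'O', 'B-LOC', 'I-LOC', 'I-LOC', 'O', 'O']
```

When no segmented word is similar enough, the sketch keeps the character model's original boundary. It does not resolve conflicts between overlapping spans; later spans simply overwrite earlier tags.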

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a teacher-supervised text labeling method and device.

Background

[0002] Natural language processing (NLP) technology can systematically and efficiently analyze, understand, and extract information from text data, so that computers can understand and generate natural language. This enables effective natural-language interaction between humans and computers (for example, in applications such as automatic message reply and voice assistants). Within NLP, text annotation technology provides a basis for the industrial application of natural language processing.

[0003] Traditional machine learning (ML) can learn from a certain amount of text data and, combined with keywords (seed words), mine the associated features between texts to obtain a traditional machine learning model, and use the traditiona...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06N3/04, G06N3/08
CPC: G06N3/08, G06F40/295, G06F40/289, G06N3/044, G06N3/045, G06F40/284, G06F40/53, G06N20/00, G06N7/01, G06F40/58, G06F18/251, G06F18/2148
Inventor: 蔡子健, 李金锋
Owner: CHINANETCENT TECH