Multi-label text classification processing method and system and information data processing terminal

A multi-label text classification processing method and system, and an information data processing terminal, in the field of text classification. The technology addresses problems such as increased training error, low correlation, and the inability to flexibly adjust the length of the context text, with the effect of improving accuracy and avoiding errors.

Pending Publication Date: 2020-07-17
XIDIAN UNIV +1

AI Technical Summary

Problems solved by technology

The existing methods have the following technical problems: (1) When extracting text semantic information and representing the semantics of words in a text sequence, the length of the context text cannot be flexibly adjusted according to the length of the sentence. Even when context is considered, word order rarely is; (2) During model training, the correlation between tags and certain keywords in the text sequence often plays a very important role, yet most existing models ignore this relationship; (3) When predicting tags, it is



Embodiment Construction

[0037] To make the object, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the examples. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0038] Aiming at the problems existing in the prior art, the present invention provides a multi-label text classification processing method and system, and an information data processing terminal. The present invention will be described in detail below with reference to the accompanying drawings.

[0039] As shown in Figure 1, the multi-label text classification processing method provided by the embodiment of the present invention includes the following steps:

[0040] S101: Obtain a data set including a text sequence and a label space;

[0041] S102: Preprocessing the data, removing meaningless words, converting tra...
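The part of the preprocessing step that is visible here (removing meaningless words), together with the train/test division mentioned in the abstract, can be illustrated with a minimal sketch. The stop-word list, the lowercasing, the whitespace tokenization, the 80/20 ratio, and the function names `preprocess` and `train_test_split` are all assumptions made for illustration; the source does not specify them, and the truncated remainder of S102 is not covered.

```python
import random

# Hypothetical stop-word list; the source does not say which words count as meaningless.
STOP_WORDS = {"the", "a", "of", "is"}


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop stop words (all assumed choices)."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in stop_words]


def train_test_split(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data set: each sample pairs a text sequence with a set of labels.
dataset = [
    ("The price of oil is rising", {"finance", "energy"}),
    ("A storm is approaching the coast", {"weather"}),
    ("The team won the final", {"sports"}),
    ("Rain delayed the match", {"sports", "weather"}),
    ("The bank raised rates", {"finance"}),
]

cleaned = [(preprocess(text), labels) for text, labels in dataset]
train_set, test_set = train_test_split(cleaned, test_ratio=0.2)
```

Each sample keeps its full label set through preprocessing, which is what distinguishes the multi-label setting from single-label classification.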


Abstract

The invention belongs to the technical field of natural language processing, and discloses a multi-label text classification processing method and system and an information data processing terminal. The method comprises the steps of: obtaining a data set; preprocessing the data set and dividing it into a training set and a test set; extracting global feature vectors of the words in the text sequence by fine-tuning a BERT pre-trained model, and aggregating the global feature vectors with a convolutional neural network to obtain semantic vectors of the words in the text sequence; constructing an attention weight coefficient matrix, and weighting the semantic vector of each word by the corresponding weight coefficient vector in the optimal weight coefficient matrix to obtain an attention vector for each label; and normalizing the attention vectors of the labels to obtain the probability of each label, and selecting the several labels with the highest probability as the categories of the text. The method extracts both global and local features of the text sequence and takes into account the influence of keywords in the text on label categories, which improves classification accuracy.
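The attention and label-selection steps in the abstract can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the patented implementation: the word semantic vectors `H` stand in for the BERT and CNN outputs, the attention weight matrix `A` is given rather than learned, the per-label output vectors `W` used to reduce each attention vector to a scalar score are hypothetical, and softmax is assumed as the normalization (the source only says "normalizing").

```python
import math


def label_attention(H, A):
    """One attention vector per label: c_l = sum_i A[l][i] * H[i]."""
    dim = len(H[0])
    return [
        [sum(a_i * h_i[j] for a_i, h_i in zip(row, H)) for j in range(dim)]
        for row in A
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(H, A, W, label_names, k=2):
    """Score each label, normalize the scores, and return the k most probable labels."""
    C = label_attention(H, A)
    # Hypothetical scoring step: dot each label's attention vector with its own output vector.
    scores = [sum(c * w for c, w in zip(c_l, w_l)) for c_l, w_l in zip(C, W)]
    probs = softmax(scores)
    ranked = sorted(zip(label_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy inputs (all values invented for illustration).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                   # 3 words, 2-dim semantic vectors
A = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]    # 3 labels x 3 words
W = [[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]                 # one output vector per label
labels = ["sports", "finance", "weather"]

result = top_k_labels(H, A, W, labels, k=2)
```

With these toy values the two highest-probability labels are "sports" and "finance". Each row of `A` weights the words differently for each label, which is how a keyword can influence one label more than another.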

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing, and in particular relates to a multi-label text classification processing method and system, and an information data processing terminal.

Background technique

[0002] Text classification is the process of assigning texts to one or more categories according to a given classification system or standard, and is of great significance in natural language processing and text mining. At present, text classification is widely used in fields such as information retrieval, automatic classification of Web documents, automatic summarization, and text filtering. Multi-label text classification differs from traditional binary or multi-class classification in that it handles real-world texts that belong to multiple categories at once; it is a complex and challenging task in natural language processing.

[0003] At present, multi-label te...


Application Information

IPC(8): G06F16/35, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/35, G06N3/08, G06N3/045, G06F18/2411, G06F18/25, G06F18/214
Inventor: 裴庆祺, 王玉燕, 马立川, 肖阳
Owner: XIDIAN UNIV