A text classification method based on a local and global mutual attention mechanism

A text classification and attention technology, applied to text database clustering/classification, text database querying, unstructured text data retrieval, and similar areas. It addresses problems such as deepened models, vanishing gradients, and the lack of any attempt to learn the interaction between global and local features.

Active Publication Date: 2019-06-18
Applicant: SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

C-LSTM can capture global long-term dependencies and local semantic features, but these two kinds of information are connected in a cascading manner, which makes the model deeper, prone to gradient vanishing, and makes no attempt to learn the interaction between the two kinds of features.



Examples


Embodiment

[0074] As shown in Figure 1, this embodiment discloses a text classification method based on a local and global mutual attention mechanism. The method includes the following steps:

[0075] Step S1. Obtain a text data set, preprocess the data, and map each word in the text sequence into a word vector.

[0076] Sixteen datasets are taken from benchmark text classification corpora such as SUBJ, TREC, CR, 20Newsgroups, MovieReview, and Amazon product reviews. Given a dataset $D = \{(W_n, y_n)\}_{n=1}^{N}$, where $W_n = w_1, w_2, \ldots, w_T$ is a text sequence, $y_n$ is its corresponding label, $T$ is the length of the text sequence, and $N$ is the number of samples in the dataset. Let $x_i \in \mathbb{R}^d$ be the $d$-dimensional word vector corresponding to the $i$-th word $w_i$ of the text sequence; 300-dimensional pre-trained word2vec vectors are used here. The input text sequence can then be expressed as the embedding matrix:

[0077] $x_{1:T} = x_1 \oplus x_2 \oplus \cdots \oplus x_T$

[0078] where $\oplus$ is the concatenation operation, and $x_{1:T} \in \mathbb{R}^{T \times d}$.
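As a concrete illustration of Step S1, the sketch below builds the embedding matrix $x_{1:T}$ from a toy vocabulary. The tokenizer, the toy word vectors, and the handling of out-of-vocabulary words are illustrative assumptions; the patent itself uses 300-dimensional pre-trained word2vec vectors.

```python
import numpy as np

# Hypothetical stand-in for pre-trained word2vec vectors (the patent uses
# d = 300; d = 4 here only to keep the example readable).
d = 4
rng = np.random.default_rng(0)
word2vec = {w: rng.normal(size=d) for w in ["the", "movie", "was", "great"]}

def embed(sequence):
    """Map a text sequence w_1..w_T to the embedding matrix x_{1:T} in R^{T x d}."""
    tokens = sequence.lower().split()          # assumed whitespace tokenizer
    vectors = [word2vec.get(w, np.zeros(d))    # assumed: zero vector for OOV words
               for w in tokens]
    return np.stack(vectors)                   # concatenate word vectors row-wise

x = embed("The movie was great")
print(x.shape)  # (T, d) = (4, 4)
```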

[0079] Step S2. Use a long short-term memory network to capture the global long-term dependencies of the text sequence, and use a multi-scale convolutional neural network to acquire the local semantic features of the text sequence.
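The following sketch illustrates the two parallel encoders of Step S2 in PyTorch. The hidden size, the channel count, and the convolution kernel widths (odd widths 3, 5, 7 are chosen here so all scales keep the sequence length aligned) are illustrative assumptions; the patent excerpt here does not fix these hyperparameters.

```python
import torch
import torch.nn as nn

class ParallelEncoders(nn.Module):
    """LSTM for global long-term dependencies, multi-scale CNN for local features."""
    def __init__(self, d=300, hidden=128, channels=100, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.lstm = nn.LSTM(d, hidden, batch_first=True)
        # One Conv1d per scale; padding k // 2 keeps the sequence length T.
        self.convs = nn.ModuleList(
            nn.Conv1d(d, channels, k, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):                      # x: (batch, T, d)
        g, _ = self.lstm(x)                    # global features: (batch, T, hidden)
        c = x.transpose(1, 2)                  # Conv1d expects (batch, d, T)
        local = [torch.relu(conv(c)) for conv in self.convs]
        l = torch.cat(local, dim=1).transpose(1, 2)  # (batch, T, channels * scales)
        return g, l

enc = ParallelEncoders()
g, l = enc(torch.randn(2, 20, 300))
print(g.shape, l.shape)  # torch.Size([2, 20, 128]) torch.Size([2, 20, 300])
```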



Abstract

The invention discloses a text classification method based on a local and global mutual attention mechanism. The method comprises the following steps: obtaining text data, preprocessing it, and representing text words with pre-trained word vectors; capturing the global long-term dependencies of the text sequence with a long short-term memory network, and acquiring the local semantic features of the text sequence with a multi-scale convolutional neural network; feeding the global long-term dependencies and the local semantic features into a local and global mutual attention mechanism to obtain weighted global long-term dependencies and weighted local semantic features; applying weighted pooling to obtain the final global representation vector and the final local representation vector; and feeding these vectors into a fully connected layer that fuses the global and local representation vectors, followed by a classification layer that performs the classification. The method captures global long-term dependencies and local semantic features in parallel, learns the interaction between the two kinds of features explicitly, obtains better global and local feature representations of the text, and thereby improves text classification accuracy.
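The abstract describes the mutual attention step only at a high level. The sketch below shows one plausible reading, assuming a learned bilinear affinity matrix between the global features G and the local features L, mean-based attention scoring, weighted pooling, and a fusion layer before classification; the exact formulation in the patent's claims may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MutualAttention(nn.Module):
    """Hypothetical local-global mutual attention with weighted pooling and fusion."""
    def __init__(self, g_dim=128, l_dim=300, fused=128, num_classes=2):
        super().__init__()
        self.affinity = nn.Parameter(torch.empty(g_dim, l_dim))
        nn.init.xavier_uniform_(self.affinity)
        self.fuse = nn.Linear(g_dim + l_dim, fused)    # fully connected fusion
        self.classify = nn.Linear(fused, num_classes)  # classification layer

    def forward(self, g, l):            # g: (B, T, g_dim), l: (B, T', l_dim)
        # Affinity between every global position and every local position.
        m = g @ self.affinity @ l.transpose(1, 2)      # (B, T, T')
        # Assumed scoring: each position is weighted by its mean affinity with
        # the positions of the other feature map, then softmax-normalized.
        g_w = F.softmax(m.mean(dim=2), dim=1)          # (B, T)
        l_w = F.softmax(m.mean(dim=1), dim=1)          # (B, T')
        # Weighted pooling -> final global / local representation vectors.
        g_vec = (g_w.unsqueeze(2) * g).sum(dim=1)      # (B, g_dim)
        l_vec = (l_w.unsqueeze(2) * l).sum(dim=1)      # (B, l_dim)
        fused = torch.relu(self.fuse(torch.cat([g_vec, l_vec], dim=1)))
        return self.classify(fused)

att = MutualAttention()
logits = att(torch.randn(2, 20, 128), torch.randn(2, 20, 300))
print(logits.shape)  # torch.Size([2, 2])
```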

Description

Technical Field

[0001] The invention relates to the technical field of text classification, and in particular to a text classification method based on a local and global mutual attention mechanism.

Background Technique

[0002] Text classification is a fundamental problem in natural language processing that requires assigning one or more predetermined categories to a text sequence. The core of text classification is learning a sequence representation for tasks such as sentiment analysis, question classification, and topic classification.

[0003] At present, the common approach to learning a sequence representation is to model either the long-term dependencies of the sequence or its local semantic features. A convolutional neural network can extract the local semantic features of a text sequence well through its convolution kernels. Y. Kim proposed a multi-channel convolutional neural network, using static word vectors from word2vec and...


Application Information

IPC(8): G06F17/27, G06F16/33, G06F16/35, G06N3/04
Inventor: 马千里 (Ma Qianli), 余柳红 (Yu Liuhong), 陈子鹏 (Chen Zipeng), 田帅 (Tian Shuai)
Owner: SOUTH CHINA UNIV OF TECH