Text classification method based on a recurrent convolutional network

A text classification technology based on a recurrent convolutional network, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as sensitivity to window size and noise that adversely affects results, reducing data sparsity and improving performance.

Active Publication Date: 2015-04-29
INST OF AUTOMATION CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

For example, when a recursive neural network constructs text semantics, it must first build a tree structure, a step that may depend on the accuracy of the syntax tree. A model that weights the end of a text most heavily is likewise limited, because not all the key information of a text is in its last part. A convolutional neural network requires a manually set window to capture the context.

Embodiment Construction

[0029] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0030] The basic idea of the present invention is to construct a better context representation, so that words can be disambiguated, and then a better text representation can be obtained for text classification.

[0031] For text classification, the core problem is text representation. Traditional methods often lose word order information, and their improved variants suffer from data sparsity. To address both problems, this method uses a recurrent network to model the context, retaining word order information over as long a span as possible and refining the representation of the current word. It then uses max pooling to extract the words and phrases most useful for text classification.
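The idea in [0031] can be written down as a short set of equations. This is one plausible formulation only, not the patent's own notation, since the detailed paragraphs are truncated here. All symbols are assumptions introduced for illustration: e(w_i) is the word vector of the i-th word, c_l and c_r are its left and right context vectors, f is a nonlinearity, and the W and b terms are learned parameters.

```latex
\begin{aligned}
c_l(w_i) &= f\big(W^{(l)}\, c_l(w_{i-1}) + W^{(sl)}\, e(w_{i-1})\big)
  && \text{left context, scanned forward} \\
c_r(w_i) &= f\big(W^{(r)}\, c_r(w_{i+1}) + W^{(sr)}\, e(w_{i+1})\big)
  && \text{right context, scanned backward} \\
x_i &= \big[\, c_l(w_i);\ e(w_i);\ c_r(w_i) \,\big]
  && \text{representation of the current word} \\
y_i &= \tanh\big(W x_i + b\big) \\
t_k &= \max_{1 \le i \le n} \, (y_i)_k
  && \text{max pooling over all } n \text{ positions}
\end{aligned}
```

Because each component t_k of the text vector is taken from whichever position has the largest value, the pooled vector keeps the strongest signal from anywhere in the text rather than only from its end.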

[0032] Accordi...

Abstract

The invention discloses a text classification method based on a recurrent convolutional network. The method comprises the following steps: step 1, representing the context vectors of all words using a bidirectional recurrent network; step 2, combining the context vectors and the word vector of the current word into the representation of the current word; step 3, extracting the most important context information using max pooling to obtain the text representation; step 4, performing text classification using the text representation. The method retains more of the word order information in a text and captures long-distance dependencies. It describes the semantics of words more accurately, and max pooling finds the words and phrases that have the greatest influence on classification, so that text classification accuracy is effectively improved. Tests show that the method improves performance by 1% on average across several text classification data sets.
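The four steps of the abstract can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the parameter names (`W_l`, `W_sl`, `W_r`, `W_sr`, `W_2`, `W_4`), the tanh nonlinearity, the zero initial context vectors, the softmax output layer, and the random untrained weights are all hypothetical choices made for the example.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def classify(embeddings, p):
    """Forward pass over one text given as a list of word vectors."""
    n = len(embeddings)
    c_dim = len(p["W_l"])
    # Step 1: context vectors from a bidirectional recurrence.
    # Assumption: the first left and last right context are zero vectors.
    left = [[0.0] * c_dim for _ in range(n)]
    for i in range(1, n):
        left[i] = tanh_vec(add(matvec(p["W_l"], left[i - 1]),
                               matvec(p["W_sl"], embeddings[i - 1])))
    right = [[0.0] * c_dim for _ in range(n)]
    for i in range(n - 2, -1, -1):
        right[i] = tanh_vec(add(matvec(p["W_r"], right[i + 1]),
                                matvec(p["W_sr"], embeddings[i + 1])))
    # Step 2: word representation = [left context; word vector; right context].
    words = [left[i] + embeddings[i] + right[i] for i in range(n)]
    hidden = [tanh_vec(add(matvec(p["W_2"], x), p["b_2"])) for x in words]
    # Step 3: element-wise max over all positions gives the text vector.
    text_vec = [max(column) for column in zip(*hidden)]
    # Step 4: linear layer plus softmax over the classes.
    scores = add(matvec(p["W_4"], text_vec), p["b_4"])
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps], text_vec

# Toy usage with random (untrained) parameters and a six-word "text".
rng = random.Random(0)
e_dim, c_dim, h_dim, n_classes = 4, 3, 5, 2
params = {
    "W_l": rand_matrix(c_dim, c_dim, rng),
    "W_sl": rand_matrix(c_dim, e_dim, rng),
    "W_r": rand_matrix(c_dim, c_dim, rng),
    "W_sr": rand_matrix(c_dim, e_dim, rng),
    "W_2": rand_matrix(h_dim, 2 * c_dim + e_dim, rng),
    "b_2": [0.0] * h_dim,
    "W_4": rand_matrix(n_classes, h_dim, rng),
    "b_4": [0.0] * n_classes,
}
text = [[rng.uniform(-1, 1) for _ in range(e_dim)] for _ in range(6)]
probs, text_vec = classify(text, params)
print(probs)
```

The text vector has a fixed length regardless of how many words the text contains, which is what allows a single output layer to classify texts of different lengths.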

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a text classification method based on a recurrent convolutional network.

Background

[0002] Text classification is an important technology in natural language processing, and it is a key step in tasks such as web page retrieval, sentiment analysis, and spam identification. Given a set of labeled texts, the goal of text classification is to learn a classification method from them and assign other texts to the known categories.

[0003] The key issue in text classification is feature representation, and the most commonly used feature representation method is the bag-of-words model. In the bag-of-words model, the most commonly used features are words, bigrams, n-grams, and some manually extracted template features. After feature representation, traditional models often use methods such as word frequen...
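The bag-of-words features described in the background can be illustrated with a short sketch. The helper name `ngram_counts` and the two example sentences are hypothetical, chosen only to show that single-word counts discard word order while bigram counts recover some of it.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count the n-grams (as tuples of lowercase tokens) in a text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

a = "the dog bit the man"
b = "the man bit the dog"

# With single words as features, the two sentences are indistinguishable.
print(ngram_counts(a, 1) == ngram_counts(b, 1))

# With bigrams, local word order is visible again.
print(ngram_counts(a, 2) == ngram_counts(b, 2))
```

Longer n-grams capture more order, but the number of distinct features grows quickly with n, so each feature is observed less often in the training data.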

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/353
Inventor: 徐立恒, 刘康, 赵军, 来斯惟
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI