Text classification algorithm based on a recurrent neural network variant and a convolutional neural network

The invention applies convolutional neural network and recurrent neural network techniques to text classification, combining a recurrent neural network variant with a convolutional neural network. It addresses problems such as the difficulty of extracting key semantic features and the resulting poor classification effect.

Status: Inactive
Publication Date: 2019-02-22
XI'AN POLYTECHNIC UNIVERSITY

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a text classification algorithm based on a recurrent neural network variant and a convolutional neural network. By combining the recurrent neural network variant with the convolutional neural network, it addresses a problem of prior-art text classification: for long texts, key semantic features are difficult to extract and the classification effect is poor.

Examples

Embodiment Construction

[0063] The present invention will be described in detail below with reference to the drawings and specific embodiments.

[0064] The text classification algorithm of the present invention, based on a recurrent neural network variant and a convolutional neural network, follows the process shown in Figure 1 and proceeds in the following steps:

[0065] Step 1. Preprocess the data sets SogouC and THUCNews, divide each preprocessed data set into a training set and a test set, and train the text data in each training set and test set into sentence vectors;

[0066] Specifically: segment the data sets SogouC and THUCNews with the jieba Chinese word segmenter and remove stop words and punctuation. Then divide each preprocessed data set into a training set and a test set, with the amount of text data in the training set and in the test set in a ratio of 7:3, and then throug...
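The preprocessing in Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the helper names `preprocess` and `split_train_test`, the shuffling before the split, the fixed seed, and the toy data are all assumptions. The segmenter is passed in as a function; with jieba installed, `jieba.lcut` could be supplied, while a plain whitespace split stands in here so the sketch runs with the standard library alone.

```python
import random

def preprocess(text, segment, stop_words):
    """Segment text, then drop stop words and tokens with no letter or digit
    (punctuation, whitespace). `segment` is any str -> list[str] function."""
    tokens = segment(text)
    return [t for t in tokens
            if t not in stop_words and any(ch.isalnum() for ch in t)]

def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of the samples and split it 7:3 by default.
    Shuffling and the seed are assumptions; the source only gives the ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]

# Toy, pre-spaced sentence; str.split is a stand-in for a real segmenter.
stop_words = {"的", "很"}
tokens = preprocess("今天 的 比赛 ， 很 精彩 ！", str.split, stop_words)

# Hypothetical labelled corpus of ten (text, label) pairs.
corpus = [("doc %d" % i, i % 2) for i in range(10)]
train, test = split_train_test(corpus)
```

Passing the segmenter as a parameter keeps the cleaning and splitting logic independent of any particular word segmentation library.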

Abstract

The invention discloses a text classification algorithm based on a recurrent neural network variant and a convolutional neural network. The algorithm includes: step 1, preprocessing the data set SogouC and the data set THUCNews, dividing each of the two data sets into a training set and a test set, and training the text data in the respective training sets and test sets into sentence vectors; step 2, using the training set text of the two data sets from step 1 to establish the BGRU-CNN hybrid model; step 3, establishing an objective function and using stochastic gradient descent to train the BGRU-CNN hybrid model established in step 2; step 4, inputting the test set sentence vectors of the two data sets into the BGRU-CNN hybrid model trained in step 3 to obtain the classification results. The invention solves the problems of the prior art that, when long text is classified, key semantic features are difficult to extract and the classification effect is poor.
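To make the idea of a BGRU-CNN hybrid more concrete, the forward pass of one possible arrangement is sketched below in plain Python. The visible text does not give the layer details, so everything here is an assumption for illustration: the ordering (bidirectional GRU first, then a one-dimensional convolution with ReLU and max-over-time pooling, then softmax), the GRU update convention, the omission of biases, the random weights, and all dimensions and names.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """GRU cell with update (z), reset (r) and candidate (n) weights; no biases."""
    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.W = {g: rand_matrix(hid_dim, in_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hid_dim, hid_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # One common convention: interpolate between old state and candidate.
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

def run_gru(cell, xs):
    h, out = [0.0] * cell.hid_dim, []
    for x in xs:
        h = cell.step(x, h)
        out.append(h)
    return out

def bgru(fwd, bwd, xs):
    """Concatenate forward states with backward states at each position."""
    f = run_gru(fwd, xs)
    b = run_gru(bwd, xs[::-1])[::-1]
    return [hf + hb for hf, hb in zip(f, b)]

def conv_maxpool(states, filters, width):
    """1-D convolution over time with ReLU, then max-over-time pooling."""
    feats = []
    for w in filters:
        acts = []
        for t in range(len(states) - width + 1):
            window = [v for s in states[t:t + width] for v in s]
            acts.append(max(0.0, sum(a * b for a, b in zip(w, window))))
        feats.append(max(acts))
    return feats

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
emb_dim, hid_dim, width, n_filters, n_classes, seq_len = 8, 4, 3, 5, 3, 6
xs = [[rng.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(seq_len)]
fwd, bwd = GRUCell(emb_dim, hid_dim, rng), GRUCell(emb_dim, hid_dim, rng)
states = bgru(fwd, bwd, xs)                       # seq_len vectors of size 2 * hid_dim
filters = rand_matrix(n_filters, width * 2 * hid_dim, rng)
features = conv_maxpool(states, filters, width)   # one pooled value per filter
W_out = rand_matrix(n_classes, n_filters, rng)
probs = softmax(matvec(W_out, features))          # class probabilities
```

In this arrangement the bidirectional GRU supplies context from both directions at every position, and the convolution with pooling picks out the strongest local patterns from those states. Training with an objective function and stochastic gradient descent (step 3) is not shown.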

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing methods, and relates to a text classification algorithm based on a recurrent neural network variant and a convolutional neural network.

Background art

[0002] The Internet is developing rapidly and produces a large amount of text information every moment. How to classify and manage this text effectively, and so quickly grasp the value of the information, is a focus of many researchers. Long texts in particular contain many different keywords. In text classification, therefore, keeping the structure of a long text intact, preserving the order of its words, and learning its contextual semantics can improve the classification effect for long texts.

[0003] Text classification mainly includes text representation, selection and training of classifiers, and evaluation and feedback of classification results. The text representation is a key...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/332, G06N3/04, G06N3/08, G06F17/27
CPC: G06N3/084, G06F40/30, G06F40/289, G06N3/045
Inventor: 李云红, 梁思程, 汤汶, 慕兴, 张轩, 张欢欢, 聂梦瑄
Owner: XI'AN POLYTECHNIC UNIVERSITY