A text classification method based on a bidirectional recurrent attention neural network

A bidirectional recurrent neural network technique, applied in the field of natural language processing and learning, that addresses character interference, insufficient processing of the text corpus, and unreliable calculation accuracy, with the aim of improving classification accuracy and model performance.

Active Publication Date: 2019-03-15
ANHUI UNIVERSITY OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

The weakness of the existing approach is: (1) although the collected text corpus is preprocessed, it is applied directly after being divided, without any further processing; in later use, the characters that are not effecti...



Examples


Embodiment 1

[0082] This embodiment provides a text classification method based on a bidirectional recurrent attention neural network. Figure 1 is the flow chart of this embodiment. As shown in Figure 1, the process includes the following steps:

[0083] (1) Data preprocessing, the specific process is as follows:

[0084] (1.1) Data cleaning to remove noise and irrelevant data.

[0085] (1.2) Data integration, combining multi-source data and storing it in a unified data warehouse.

[0086] (1.3) Construct the experimental data set: select 80% of the data as the training set and the remaining 20% of the data as the test set.

[0087] (1.4) Perform word segmentation on the data set. In this embodiment, the open-source jieba word segmentation tool is used for Chinese word segmentation. Suppose a text D is composed of n words; the word sequence after segmentation is $D = \{w_1, w_2, \ldots, w_n\}$.

[0088] (1.5) Remove stop words, and remove w...
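Steps (1.3) to (1.5) above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names `split_dataset` and `segment_and_filter`, the shuffling before the split, the fixed seed, and the stop-word set are all assumptions. The default tokenizer simply splits on whitespace so the sketch runs without extra dependencies; for Chinese text, a segmenter such as jieba's `lcut` could be passed in as `tokenize` instead.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Step (1.3): split samples into training and test sets.

    Shuffling with a fixed seed is an assumption of this sketch;
    the source only states the 80% / 20% proportions.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def segment_and_filter(text, stop_words, tokenize=str.split):
    """Steps (1.4) and (1.5): segment a text into words, then drop stop words.

    `tokenize` is any callable mapping a string to a list of words.
    """
    return [word for word in tokenize(text) if word not in stop_words]


if __name__ == "__main__":
    # Hypothetical toy corpus of (text, label) pairs.
    corpus = [("sample text number %d" % i, i % 2) for i in range(10)]
    train, test = split_dataset(corpus)
    print(len(train), len(test))
    print(segment_and_filter("the cat sat on the mat", {"the", "on"}))
```

Passing the tokenizer as a parameter keeps the split and filtering logic independent of the segmentation library that is actually used.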



Abstract

The invention discloses a text classification method based on a bidirectional recurrent attention neural network, belonging to the technical field of learning and natural language processing. The method comprises the following steps: step 1, preprocessing the data; step 2, generating and training a word vector for each word from the preprocessed data using the Word2vec method; step 3, extracting text semantic features from the word vectors by fusing an attention mechanism with a bidirectional recurrent neural network, calculating the overall weight of each word, and converting the weights into the model's output value Y(4); and step 4, feeding the feature vector Y(4) into a softmax classifier for classification and identification. By fusing the attention mechanism into the text feature learning model, the method highlights the effect of keywords, which improves the performance of the model and, in turn, the accuracy of text classification.
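To make steps 3 and 4 more concrete, the following is a minimal forward-pass sketch in plain Python. It is an illustration under stated assumptions, not the patented model: the plain tanh recurrent cell, the attention scoring function (a learned vector `u` applied to `tanh` of each hidden state), the parameter names, and the random untrained weights are all hypothetical, and training and the Word2vec step are omitted. The attention-weighted sum here stands in for the feature vector that the abstract calls Y(4).

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_pass(inputs, w_in, w_rec):
    """Plain tanh RNN; returns the hidden state at every time step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def classify(word_vectors, params):
    """Return (attention weight per word, class probabilities)."""
    # Bidirectional pass: forward over the sequence, backward over its reverse.
    fwd = rnn_pass(word_vectors, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_pass(word_vectors[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    states = [f + b for f, b in zip(fwd, bwd)]  # concatenate per word
    # Attention (assumed form): one score per word, normalised to weights.
    scores = [sum(u * math.tanh(h) for u, h in zip(params["u"], s)) for s in states]
    weights = softmax(scores)
    # Attention-weighted sum of hidden states gives the text feature vector.
    dim = len(states[0])
    feature = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    # Softmax classifier on top of the feature vector.
    return weights, softmax(matvec(params["w_out"], feature))


def init_params(embed_dim, hidden_dim, num_classes, seed=0):
    """Random, untrained parameters, for demonstration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in_f": mat(hidden_dim, embed_dim), "w_rec_f": mat(hidden_dim, hidden_dim),
        "w_in_b": mat(hidden_dim, embed_dim), "w_rec_b": mat(hidden_dim, hidden_dim),
        "u": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden_dim)],
        "w_out": mat(num_classes, 2 * hidden_dim),
    }


if __name__ == "__main__":
    vectors = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.5], [0.3, 0.3, -0.2, 0.1]]
    word_weights, class_probs = classify(vectors, init_params(4, 3, 2))
    print(word_weights, class_probs)
```

The per-word attention weights sum to 1, so a word with a larger weight contributes more to the feature vector, which is how a mechanism of this kind can emphasise keywords.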

Description

Technical field

[0001] The invention belongs to the technical field of learning and natural language processing, and in particular relates to a text classification method based on a bidirectional recurrent attention neural network.

Background art

[0002] In recent years, with the rapid development of the Internet, more and more information has been generated, such as text, image, audio, and video information. Text accounts for the largest share of this data, so processing text data has become increasingly important. How to classify these massive volumes of text quickly has become an urgent problem, which gave rise to text classification technology. Text classification technology aims to classify text information quickly and automatically, thereby providing an effective method for organizing text information.

[0003] The research of traditional text-based classification methods is ...


Application Information

IPC(8): G06F17/27, G06F16/35, G06N3/04, G06N3/08
CPC: G06N3/084, G06F40/289, G06F40/30, G06N3/048
Inventor: 秦锋, 杨照辉, 洪旭东, 郑啸
Owner: ANHUI UNIVERSITY OF TECHNOLOGY