Long and short hybrid text classification optimization method based on integrated neural network

A text classification technology based on an integrated neural network, applied to long-short mixed text classification. It addresses the lack of a general high-accuracy classification algorithm for such data.

Status: Inactive
Publication Date: 2020-06-19
BEIJING UNIV OF TECH
5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] To address the lack of a general high-accuracy classification algorithm for long-short mixed text data, the present invention proposes a long-short mixed text classification optimization method based on an integrated neural network.



Examples


Embodiment Construction

[0103] The long-short mixed text classification optimization method based on the integrated neural network is described step by step below. For illustration, some sample data are simulated, as shown in Table 1:

[0104] Table 1 sample data

[0105]

[0106] Step 1: Initialize

[0107] Initialize the algorithm parameters: dictionary table size $N_v = 10000$, number of categories $C = 5$, truncation threshold $r = 200$, word embedding dimension $k = 8$, maximum number of iterations $P = 20$, number of training rounds before early stopping $E_s = 3$, batch size $S_b = 1$, convolution window range $S_c = \{2, 3, 4\}$, number of convolution kernels $N_c = 8$, number of neurons in the recurrent layer $N_r = 8$, number of neurons in the fully connected layer $N_f = 8$, Dropout ratio $D_r = 0.0$, expected accuracy 0.9, and window increment 2. Initialize the data structures required by the algorithm: the text data set $D = \{\}$, the dictionary...
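The parameter values listed in [0107] can be gathered into a single configuration object. The following is a minimal sketch, not the patent's implementation: the class name `HyperParams` and all field names are hypothetical, and only the numeric values come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HyperParams:
    """Values copied from paragraph [0107]; field names are illustrative only."""
    vocab_size: int = 10000                    # N_v, dictionary table size
    num_classes: int = 5                       # C, number of categories
    truncation_threshold: int = 200            # r
    embedding_dim: int = 8                     # k, word embedding dimension
    max_iterations: int = 20                   # P
    early_stop_rounds: int = 3                 # E_s
    batch_size: int = 1                        # S_b
    conv_windows: Tuple[int, ...] = (2, 3, 4)  # S_c, convolution window range
    num_kernels: int = 8                       # N_c
    recurrent_units: int = 8                   # N_r
    dense_units: int = 8                       # N_f
    dropout: float = 0.0                       # D_r
    expected_accuracy: float = 0.9
    window_increment: int = 2


params = HyperParams()
dataset: List[str] = []  # D = {} in the text: the data set starts empty
```

Freezing the dataclass keeps the initial values from being changed by accident during later steps; a run with different settings would create a new instance, for example `HyperParams(batch_size=32)`.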


Abstract

A long-short hybrid text classification optimization method based on an integrated neural network belongs to the field of natural language processing and comprises six steps: initialization, preprocessing, construction of a long text classification algorithm, construction of a short text classification algorithm, construction of an integrated classification algorithm, and stopping of iteration. First, a dual-channel representation of the text data is constructed from a prediction-based pre-trained word vector and a statistics-based pre-trained word vector. Second, on the basis of the dual-channel text representation, a convolution optimization algorithm that fuses channel features is proposed, improving the spatial feature extraction capacity of the traditional convolution algorithm on text data. Third, independent algorithms suited to long text classification and short text classification are designed on the basis of the optimized convolution algorithm. Finally, an integration strategy automatically evaluates the independent algorithms and fuses them by weighting. The integrated algorithm performs well in mixed text data classification scenarios, with higher classification accuracy and classification stability than existing classical algorithms.
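The weighted fusion of the independent long-text and short-text algorithms can be illustrated with a small sketch. The abstract does not state how the weights are derived, so this example assumes they are proportional to each model's validation accuracy. That rule, the function names, and the sample numbers are all hypothetical.

```python
from typing import List, Sequence


def fusion_weights(val_accuracies: Sequence[float]) -> List[float]:
    """Assumed rule: weight each model in proportion to its validation accuracy."""
    total = sum(val_accuracies)
    if total <= 0:
        raise ValueError("at least one model needs a positive accuracy")
    return [acc / total for acc in val_accuracies]


def fuse(prob_rows: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of the per-model class probability vectors."""
    num_classes = len(prob_rows[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_rows))
        for c in range(num_classes)
    ]


def predict(fused: Sequence[float]) -> int:
    """Index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Made-up outputs of a long-text model and a short-text model for one sample.
long_probs = [0.6, 0.3, 0.1]
short_probs = [0.2, 0.7, 0.1]

weights = fusion_weights([0.9, 0.6])          # made-up validation accuracies
fused = fuse([long_probs, short_probs], weights)
label = predict(fused)
```

With these made-up numbers the weights are about 0.6 and 0.4, and the fused vector favours class 1 even though the more heavily weighted long-text model alone would pick class 0.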

Description

Technical field

[0001] The invention belongs to the field of natural language processing and its applications, and in particular relates to a long-short mixed text classification optimization method based on an integrated neural network.

Background

[0002] Neural network algorithms can automatically extract data features and are widely used in text classification tasks, where they achieve good classification performance. Classic neural network algorithms commonly used in text classification include the one-dimensional convolutional neural network (CNN) and the recurrent neural network (RNN). CNN-based algorithms are good at extracting local spatial features of text and suit short text classification problems with sparse features and weak sequential dependence. RNN-based algorithms are good at extracting global sequential features of text and suit long text classification problems with rich context information and strong sequential dependence. Due to the fact that long and short texts are often mix...
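To make the idea of local feature extraction by a one-dimensional CNN concrete, the sketch below slides a single kernel over a sequence of token embeddings and keeps the strongest response (max-over-time pooling). It illustrates the classic technique only, not the optimized convolution of the invention. The ReLU activation, the function name, and the toy vectors are assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def conv1d_max_pool(embeddings: Sequence[Vector], kernel: Sequence[Vector]) -> float:
    """Slide one kernel spanning len(kernel) tokens over the text and
    return the maximum ReLU-activated response."""
    window = len(kernel)
    if len(embeddings) < window:
        raise ValueError("text is shorter than the convolution window")
    responses: List[float] = []
    for start in range(len(embeddings) - window + 1):
        total = 0.0
        for offset in range(window):
            token_vec = embeddings[start + offset]
            kernel_vec = kernel[offset]
            total += sum(t * k for t, k in zip(token_vec, kernel_vec))
        responses.append(max(total, 0.0))  # ReLU (assumed activation)
    return max(responses)


# Four tokens with 2-dimensional toy embeddings and one kernel of window size 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
feature = conv1d_max_pool(tokens, kernel)
```

Each response depends only on the tokens inside one window, so the feature captures a local pattern regardless of where it occurs in the text. A recurrent layer, by contrast, carries state across the whole sequence.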


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06N3/04, G06N3/08
CPC: G06F16/355, G06F16/353, G06N3/08, G06N3/044, G06N3/045
Inventor: 韩永鹏, 苏航, 陈彩, 梁毅, 孙志冉
Owner: BEIJING UNIV OF TECH