Hybrid neural network text classification method capable of blending abstract with main characteristics

A hybrid neural network text classification technology, applied in the fields of natural language processing and data mining, that addresses the failure of existing models to take the organizational structure of a text into account and make good use of it.

Active Publication Date: 2018-09-28
FUZHOU UNIV

AI Technical Summary

Problems solved by technology

[0003] At present, document-level deep neural network models generally build the network on a hierarchical structure in which words form sentences and sentences form the document. However, these models do not take into account the distinct organizational structure that some kinds of documents have. For example, a text can usually be divided into parts such as the abstract and the main body, and different parts affect the text category differently: the abstract is a high-level summary of the text content and contains key information such as the subject of the event and its outcome; the main body describes the content in detail, including its causes, and has contextual and sequential (time-ordered) characteristics.
In addition, document-level deep neural network models generally feed the entire text into the network for unified processing, so they cannot make good use of the different roles played by the different parts of the text.

Examples

Embodiment Construction

[0074] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0075] The present invention provides a hybrid neural network text classification method that combines abstract and main-body features. As shown in Figure 1, it includes the following steps:

[0076] Step A: Extract an abstract from each text in the training set. This specifically includes the following steps:

[0077] Step A1: For any text D, perform sentence segmentation and word segmentation, and use a word embedding tool to convert the words in the text into word vectors. The calculation formula is as follows:

[0078] v=W·v'

[0079] Here, each word in the text is randomly initialized as a d'-dimensional real-valued vector v'; W is the word embedding matrix, W ∈ R^{d×d'}, which is obtained by training a neural network language model on a large-scale corpus and is used to project d'-dimensional real number ve...
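Step A1 can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the patent does not name a segmentation tool or an embedding tool in this excerpt, so the regex-based `split_sentences` and `tokenize` helpers, the sizes `d` and `d_prime`, and the randomly filled `W` (standing in for a pretrained embedding matrix) are all hypothetical. The only part taken from the text is the projection v = W·v'.

```python
import random
import re

def split_sentences(text):
    """Naive sentence segmentation on '.', '!' and '?' (an assumed stand-in for a real tool)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Naive word segmentation: lowercase runs of letters and digits (also an assumption)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def project(W, v_prime):
    """Compute v = W · v', where W has shape d x d' and v' has length d'."""
    return [sum(w * x for w, x in zip(row, v_prime)) for row in W]

rng = random.Random(0)
d, d_prime = 3, 5  # hypothetical dimensions, chosen only to keep the example small

# W stands in for the pretrained word embedding matrix (here just random numbers).
W = [[rng.uniform(-1, 1) for _ in range(d_prime)] for _ in range(d)]

text = "The court ruled on Monday. The appeal was rejected!"
sentences = [tokenize(s) for s in split_sentences(text)]

# Each distinct word is randomly initialised as a d'-dimensional vector v'.
v_prime = {}
for sentence in sentences:
    for word in sentence:
        v_prime.setdefault(word, [rng.uniform(-1, 1) for _ in range(d_prime)])

# Project every word of every sentence to its d-dimensional vector v.
embedded = [[project(W, v_prime[word]) for word in sentence] for sentence in sentences]
```

After this runs, `embedded` holds one list per sentence, each containing one d-dimensional vector per word. A real system would replace the regex helpers with a proper segmenter for the target language and load W from a trained language model.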

Abstract

The invention relates to a hybrid neural network text classification method that combines abstract and main-body features. The method comprises the following steps. Step A: extract an abstract from each text in a training set. Step B: use a convolutional neural network to learn the key local features of the abstract obtained in step A. Step C: use a long short-term memory network to learn contextual, time-sequence features from the main content of each text in the training set. Step D: concatenate the two types of features obtained in step B and step C to obtain the overall features of the text, input the overall features of each text in the training set into a fully connected layer, and use a classifier to calculate the probability that each text belongs to each category in order to train the network and obtain a deep neural network model. Step E: use the trained deep neural network model to predict the category of a text to be classified, and output the category with the highest probability as the predicted category. The method helps improve the accuracy of text classification based on deep neural networks.
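The data flow of steps B through E can be sketched as a forward pass in plain Python. This is a rough illustration, not the patented implementation: the weights are random and untrained, the training loop of step D is omitted, biases are left out, and the ReLU activation, max-over-time pooling, use of the last LSTM hidden state, and all sizes are assumptions that the abstract does not specify. All function and variable names are hypothetical.

```python
import math
import random

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random weights (or random word vectors) standing in for trained values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cnn_features(words, filters, width):
    """Step B: 1-D convolution over word windows, ReLU, then max-over-time pooling."""
    windows = [sum(words[i:i + width], []) for i in range(len(words) - width + 1)]
    return [max(max(0.0, sum(f * x for f, x in zip(filt, win))) for win in windows)
            for filt in filters]

def lstm_features(words, params, hidden):
    """Step C: single-layer LSTM over the body; returns the last hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in words:
        z = x + h  # concatenation [x_t; h_{t-1}]
        i = [sigmoid(a) for a in matvec(params["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(params["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(params["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(params["g"], z)]  # candidate cell state
        c = [ft * ct + it * gt for ft, ct, it, gt in zip(f, c, i, g)]
        h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(abstract, body, model):
    local = cnn_features(abstract, model["filters"], model["width"])
    context = lstm_features(body, model["lstm"], model["hidden"])
    fused = local + context                      # Step D: concatenate both feature types
    probs = softmax(matvec(model["fc"], fused))  # fully connected layer + softmax classifier
    return probs.index(max(probs)), probs        # Step E: most probable category

d, width, n_filters, hidden, n_classes = 4, 2, 3, 5, 3  # hypothetical sizes
model = {
    "width": width,
    "hidden": hidden,
    "filters": rand_matrix(n_filters, width * d),
    "lstm": {gate: rand_matrix(hidden, d + hidden) for gate in "ifog"},
    "fc": rand_matrix(n_classes, n_filters + hidden),
}
abstract = rand_matrix(6, d)  # 6 abstract words as d-dimensional vectors
body = rand_matrix(20, d)     # 20 body words as d-dimensional vectors
label, probs = classify(abstract, body, model)
```

The point of the sketch is the split input: the abstract and the body go through different sub-networks, and only their concatenated features reach the classifier. In practice the same structure would be built in a deep learning framework so that step D's training can be done by backpropagation.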

Description

Technical field

[0001] The invention relates to the fields of natural language processing and data mining, and in particular to a hybrid neural network text classification method that combines abstract and main-body features.

Background

[0002] Text classification (text categorization) is an important foundation for information retrieval and text mining. Its main task is to determine the category of a text from its content, given a predefined set of category labels. Text classification has a wide range of applications in natural language processing and understanding, information organization and management, content filtering, and other fields. In recent years, the approach of using deep learning to build language models has gradually matured and has greatly improved the quality of text features. Some scholars first proposed a sentence classification model based on a convolutional neural network, which extracts features from ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/241, G06F18/253
Inventor: 陈羽中, 张伟智, 郭昆, 林剑
Owner: FUZHOU UNIV