A Text Classification Method Based on Graph Kernel and Convolutional Neural Network

A text classification technology using a convolutional neural network, applied in the fields of data mining and information retrieval. It addresses the loss of text semantic structure information, yields more reasonable node information, and simplifies the complex and tedious processing pipeline.

Inactive Publication Date: 2021-09-28
BEIJING INSTITUTE OF TECHNOLOGY
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] For text classification, existing technology mainly represents the text as a vector space model, which loses the semantic structure information of the text. The present invention proposes a text classification method based on a graph kernel and a convolutional neural network, which can effectively preserve the semantic structure of the text and improve classification accuracy.

Examples

Embodiment

[0051] As shown in Figure 1, this embodiment comprises five steps, as follows:

[0052] Step A: convert the text into a graph structure, as shown in Figure 2.

[0053] A.1 First, word segmentation is performed on the text. In Chinese text, words are written consecutively, unlike Western text, where words are naturally separated, so a Chinese article must first be divided into a sequence of words. Mainstream Chinese word segmentation algorithms include the forward maximum matching method, the reverse maximum matching method, the best matching method, the word-by-word traversal method, and the optimal path method. The algorithm used here is maximum string matching, a statistics-based segmentation method: when the adjacent co-occurrence probability of two words is higher than a threshold, the group is considered a candidate word.
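For illustration, forward maximum matching, one of the mainstream methods listed above, can be sketched as follows. This is the dictionary-based variant, not necessarily the statistics-based method the embodiment uses. The function name, the toy dictionary `vocab`, and the `max_len` limit are assumptions made for this example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
vocab = {"文本", "分类", "方法", "卷积", "神经网络"}
print(forward_max_match("文本分类方法", vocab))
```

Characters not covered by the dictionary fall through as single-character tokens, which is why practical systems combine matching with statistical evidence such as the co-occurrence threshold described above.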

[0054] A.2 Remove stop words, punctuation, and numbers in the text, ...
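A minimal sketch of the filtering named in this step, assuming the text has already been segmented into tokens. The stop-word list and punctuation set are hypothetical placeholders; the patent does not specify them in this excerpt.

```python
import string

# Hypothetical stop-word list and punctuation set, for illustration only.
STOP_WORDS = {"the", "a", "of", "is", "的", "了", "是"}
PUNCT = set(string.punctuation) | set("，。！？；：、（）")


def clean_tokens(tokens):
    """Drop stop words, punctuation-only tokens, and digit-only tokens."""
    kept = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        # Tokens made up entirely of punctuation (or empty tokens) are removed.
        if all(ch in PUNCT for ch in tok):
            continue
        # Tokens made up entirely of digits are removed.
        if tok.isdigit():
            continue
        kept.append(tok)
    return kept


print(clean_tokens(["文本", "的", "分类", "，", "2021", "方法"]))
```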

Abstract

The invention relates to a text classification method based on a graph kernel and a convolutional neural network, belonging to the technical fields of data mining and information retrieval. The core idea is as follows. First, the text is preprocessed into a graph structure representation in which the nodes correspond to the words in the text. Next, the node weights are calculated from the graph structure, and a community discovery algorithm decomposes the graph into multiple subgraphs. Graph kernel technology then maps the graph to a high-dimensional space to obtain a tensor representation of the graph. Finally, the tensor representation is fed to a convolutional neural network, which mines the graph features in depth and outputs the text category. Compared with the prior art, the invention makes full use of the internal structure and contextual semantics of the text, so that the text content is fully expressed; the node information is more reasonable; and the complex, cumbersome processing in text classification is effectively simplified.
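The first two stages of this pipeline (text to graph, then node weights) might be sketched as below. The abstract states only that nodes correspond to words; the sliding-window co-occurrence edges, the `window` parameter, and the normalized weighted-degree weighting are assumptions chosen for illustration and are not the patent's stated formulas. Community discovery, the graph kernel, and the CNN are omitted.

```python
from collections import defaultdict


def build_word_graph(tokens, window=2):
    """Undirected graph: graph[u][v] counts how often u and v occur within `window` tokens."""
    graph = defaultdict(lambda: defaultdict(int))
    for i, u in enumerate(tokens):
        graph[u]  # touching the key makes every word a node, even without edges
        for v in tokens[i + 1:i + 1 + window]:
            if u != v:
                graph[u][v] += 1
                graph[v][u] += 1
    return graph


def node_weights(graph):
    """Normalized weighted degree, used here as a stand-in for the patent's node weighting."""
    strength = {u: sum(nbrs.values()) for u, nbrs in graph.items()}
    total = sum(strength.values()) or 1  # avoid division by zero for edgeless graphs
    return {u: s / total for u, s in strength.items()}


tokens = ["text", "graph", "kernel", "graph"]
g = build_word_graph(tokens, window=1)
w = node_weights(g)
print(sorted(w.items(), key=lambda kv: -kv[1]))
```

With these assumptions, a word that co-occurs with many neighbors receives a larger weight, which gives the later stages a simple notion of node importance to work with.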

Description

Technical field

[0001] The invention relates to a text classification method, in particular to a text classification method based on a graph kernel and a convolutional neural network, and belongs to the technical fields of data mining and information retrieval.

Background

[0002] With the advent of the era of big data, the amount of information has exploded, and information processing has gradually shifted from traditional manual processing to automated processing. As an important information processing task, text classification aims to automatically assign unlabeled documents to a predetermined set of categories, which can largely resolve information clutter and thereby enable efficient management of massive amounts of information. Text classification technology has been widely used in information filtering, information retrieval, topic detection and tracking, and other fields.

[0003] There are three main types of text clas...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/901, G06F16/36, G06F40/289
CPC: G06F16/353, G06F16/355, G06F16/36, G06F16/9024, G06F40/289
Inventors: 郭平, 张璐璐, 辛欣
Owner: BEIJING INSTITUTE OF TECHNOLOGY