
Chinese short text classification method based on graph attention network

A Chinese short text classification method based on a graph attention network. It addresses two problems: the limited information available in short texts, and the failure of existing classification methods to focus on the features of high classification value, which leaves many redundant features. The effect achieved is improved classification accuracy.

Pending Publication Date: 2021-03-02
JINAN UNIVERSITY
Cites: 0 · Cited by: 12

AI Technical Summary

Problems solved by technology

[0005] The main purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a Chinese short text classification method based on a graph attention network. The invention compensates for the scarcity of information in short texts by constructing a graph over the texts to be classified, and uses the graph attention mechanism to focus on the features most valuable for classification, addressing the problem that existing methods retain many redundant features, thereby overcoming the limitations of current Chinese short text classification methods.



Examples


Embodiment

[0054] As shown in figure 1, the main steps of the Chinese short text classification method based on a graph attention network of the present invention are: text data preprocessing; text feature extraction; building a heterogeneous graph with texts and words as nodes; inputting the graph into the graph attention network classification model for category classification; and outputting the text category.
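The excerpt does not publish implementation details for the heterogeneous graph. As an illustrative sketch only (not the patent's actual code), a TextGCN-style graph with documents and words as nodes can be built as follows; the function name and the TF-IDF edge weighting are assumptions, and the patent may weight document-word edges differently:

```python
from collections import Counter
import math

def build_hetero_graph(docs):
    """Build a heterogeneous graph over documents and words.

    Nodes: one per document ("d0", "d1", ...) and one per vocabulary word.
    Edges: document-word edges weighted by TF-IDF (a common choice; an
    assumption here, not necessarily the patent's scheme).
    `docs` is a list of pre-segmented documents (lists of words).
    """
    n_docs = len(docs)
    # Document frequency of each word, for the IDF term.
    df = Counter(w for doc in docs for w in set(doc))
    edges = {}  # (doc_node, word_node) -> weight
    for i, doc in enumerate(docs):
        tf = Counter(doc)
        for w, count in tf.items():
            tfidf = (count / len(doc)) * math.log(n_docs / df[w])
            edges[(f"d{i}", w)] = tfidf
    nodes = [f"d{i}" for i in range(n_docs)] + sorted(df)
    return nodes, edges

# Tiny usage example with already-segmented Chinese short texts.
docs = [["手机", "屏幕", "很", "好"],
        ["手机", "电池", "耐用"],
        ["电影", "剧情", "很", "好"]]
nodes, edges = build_hetero_graph(docs)
print(len(nodes))                          # 3 document nodes + 8 word nodes
print(round(edges[("d1", "电池")], 3))     # → 0.366
```

In the patent's scheme each short text and each word it contains become graph nodes, so a trained model can propagate word-level information into the text node before classification.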

[0055] The steps are described in detail below:

[0056] The first step, text data preprocessing

[0057] The preprocessing process of text data mainly includes noise information removal, word segmentation processing and stop word processing.

[0058] S1.1 Noise information removal

[0059] For the short Chinese texts to be classified, obtained from social platforms and e-commerce platforms, the text data is likely to contain noise information unrelated to classification, such as user nicknames, URLs, and garbled characters. Regular expressions are used to preprocess the t...
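The specific regular expressions are not given in the excerpt. A minimal sketch of this preprocessing step follows; the @nickname pattern, the URL pattern, and the character-range filter (a crude stand-in for garbled characters) are my illustrative assumptions, not the patent's patterns:

```python
import re

def remove_noise(text):
    """Strip classification-irrelevant noise from a Chinese short text.

    The patterns below are illustrative assumptions: @user nicknames,
    URLs, and characters outside the CJK / word-character / whitespace
    ranges (a rough proxy for garbled symbols).
    """
    text = re.sub(r"@\S+", "", text)                    # user nicknames
    text = re.sub(r"https?://\S+|www\.\S+", "", text)   # URLs
    text = re.sub(r"[^\u4e00-\u9fa5\w\s]", "", text)    # stray symbols
    return text.strip()

clean = remove_noise("@小明 这个手机真不错！详情见 https://t.cn/abc ※")
print(clean)
```

After this step, word segmentation (typically with a tool such as jieba) and stop word removal would produce the word list used for feature extraction.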



Abstract

The invention discloses a Chinese short text classification method based on a graph attention network, comprising the following steps: preprocessing the text data to obtain the word list corresponding to each text; text feature extraction, in which a feature embedding tool performs word embedding on the word list to obtain the corresponding word vectors; building a heterogeneous graph with the texts, and the words they contain, as graph nodes; establishing a graph attention network text classification model; training the model on the heterogeneous graph, using open-source Chinese short text data sets with category annotations as the training corpus; and outputting the category to which each text belongs, where the node features are processed through a softmax classification layer to obtain the final classification category. The method can fully extract text features even when the amount of information in the short text is insufficient, focuses on information of high value to text classification, and effectively improves classification accuracy.
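The abstract names a graph attention network with a softmax classification layer but gives no equations. The sketch below shows one GAT-style attention head in the standard formulation of Veličković et al., which models of this kind build on; the pure-Python implementation, dimensions, and random initialization are placeholders, not the patent's model:

```python
import math
import random

def gat_layer(h, adj, W, a):
    """One graph attention head (standard GAT formulation).

    h:   list of n node feature vectors, each of dim f_in
    adj: adjacency list, adj[i] = neighbours of node i
    W:   f_in x f_out weight matrix;  a: attention vector, len 2*f_out
    Returns updated node features (n vectors of dim f_out).
    """
    def matvec(W, x):  # x^T W, giving a vector of dim f_out
        return [sum(W[r][c] * x[r] for r in range(len(x)))
                for c in range(len(W[0]))]

    z = [matvec(W, x) for x in h]               # Wh_i for every node
    out = []
    for i in range(len(h)):
        nbrs = adj[i] + [i]                     # include a self loop
        # e_ij = LeakyReLU(a . [Wh_i || Wh_j])
        e = []
        for j in nbrs:
            s = sum(av * zv for av, zv in zip(a, z[i] + z[j]))
            e.append(s if s > 0 else 0.2 * s)   # LeakyReLU, slope 0.2
        # alpha_ij = softmax of e_ij over the neighbourhood of i
        m = max(e)
        exp_e = [math.exp(v - m) for v in e]
        alpha = [v / sum(exp_e) for v in exp_e]
        # h_i' = sum_j alpha_ij * Wh_j
        out.append([sum(al * z[j][k] for al, j in zip(alpha, nbrs))
                    for k in range(len(z[0]))])
    return out

# Usage on a tiny 3-node path graph with random placeholder weights.
random.seed(0)
f_in, f_out, n = 4, 3, 3
h = [[random.random() for _ in range(f_in)] for _ in range(n)]
W = [[random.random() for _ in range(f_out)] for _ in range(f_in)]
a = [random.random() for _ in range(2 * f_out)]
adj = [[1], [0, 2], [1]]
h2 = gat_layer(h, adj, W, a)
print(len(h2), len(h2[0]))  # prints: 3 3
```

The attention coefficients alpha let high-value neighbours dominate the update, which is the mechanism the patent relies on to downweight redundant features; a final softmax layer over the text-node features would then yield the class probabilities.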

Description

Technical field

[0001] The invention relates to the field of computer natural language processing, and in particular to a Chinese short text classification method based on a graph attention network.

Background technique

[0002] In recent years, with the rapid development of computer technology, the Internet, and its affiliated industries, vast amounts of text data are generated on the Internet every day, exhibiting the characteristics of big data. How to quickly classify and analyze these massive, disordered texts is an urgent problem. Text classification is an important natural language processing task: it organizes and categorizes text resources, is a key link in solving the problem of text information overload, and is widely used in digital libraries, information retrieval, and other fields. Using correct text classification technology to extract the effective semantic information contained in large amounts of text data, and then mini...

Claims


Application Information

IPC(8): G06K 9/62; G06F 40/289; G06N 3/04; G06N 3/08
CPC: G06F 40/289; G06N 3/08; G06N 3/045; G06F 18/2415
Inventor: 黄斐然, 贝元琛
Owner: JINAN UNIVERSITY