Document classification method based on hierarchical multi-attention network

A document classification and attention technology, applied in text database clustering/classification, biological neural network models, unstructured text data retrieval, etc. It addresses weight distribution and related issues.
CN109558487A · Inactive · Publication Date: 2019-04-02 · SOUTH CHINA NORMAL UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
SOUTH CHINA NORMAL UNIVERSITY
Publication Date
2019-04-02
Estimated Expiration
Not applicable · inactive patent

Abstract

The invention discloses a document classification method based on a hierarchical multi-attention network. The method comprises the following steps: a Bi-GRU sequence model is used to model the document from words to sentences and from sentences to the document; the Bi-GRU sequence model encodes each word to capture its context within the sentence, and soft attention assigns a weight to each word; for the step from sentences to the document, CNN attention is introduced, in which a CNN model extracts the local correlation features between the sentences within a window, from which the attention weight of each sentence is obtained. Modeling thus proceeds from words to sentences and from sentences to documents in accordance with the document's characteristics, so the hierarchical structure of the document is fully taken into account. At the same time, different attention mechanisms are applied at the word level and the sentence level to distribute weights appropriately over the relevant content, which improves document classification accuracy.
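
The two attention steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the Bi-GRU encoder is omitted and the inputs are stand-in hidden states, and the dot-product scoring against a context vector, the single convolution filter, the zero padding, and all function and parameter names (`soft_attention`, `cnn_attention`, `context`, `kernel`) are assumptions made for illustration, since the text available here does not give the exact formulas.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(vectors, weights):
    """Combine vectors into one vector using the attention weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) for d in range(dim)]


def soft_attention(word_states, context):
    """Word level: score each encoded word against a context vector.

    Assumption: plain dot-product scoring; the patent's exact scoring
    function is not stated in the available text.
    """
    scores = [sum(h * c for h, c in zip(state, context)) for state in word_states]
    weights = softmax(scores)
    return weighted_sum(word_states, weights), weights


def cnn_attention(sentence_vecs, kernel, bias=0.0):
    """Sentence level: a 1-D convolution over a window of neighbouring
    sentences yields one score per sentence; softmax gives the weights.

    Assumption: one filter, zero padding at the document boundaries.
    `kernel` holds one weight vector per window position.
    """
    window = len(kernel)
    pad = window // 2
    zero = [0.0] * len(sentence_vecs[0])
    padded = [zero] * pad + sentence_vecs + [zero] * pad
    scores = []
    for i in range(len(sentence_vecs)):
        score = bias
        for k in range(window):
            score += sum(a * b for a, b in zip(kernel[k], padded[i + k]))
        scores.append(score)
    weights = softmax(scores)
    return weighted_sum(sentence_vecs, weights), weights


# Toy inputs standing in for Bi-GRU outputs (hypothetical values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_vec, word_weights = soft_attention(words, context=[1.0, 1.0])

sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernel = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]  # window of 3 sentences
document_vec, sentence_weights = cnn_attention(sentences, kernel)
```

In this sketch the word-level weights depend only on each word's own state, whereas the sentence-level weights depend on the neighbouring sentences inside the window, which is the distinction the abstract draws between the two levels.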
Description

Technical field

[0001] The invention belongs to the field of natural language processing and sentiment analysis, and in particular relates to a document classification method based on a hierarchical multi-attention network.

Background art

[0002] Text classification is one of the important topics in natural language processing. With the continuous growth of data volumes and hardware computing power, the theory and methods of text classification are playing an increasingly important role and have received widespread attention. Early text classification research was mainly based on knowledge engineering, which required experts in a given field to hand-craft classification rules for texts in that field; this approach demands a great deal of manpower to extend or modify the rules and to maintain them. Later, with the development of machine learning technology, text classification methods based on machine learning grad...

Claims
