BERT-based multi-feature fusion fuzzy text classification model

A text classification model technology, applied in text database clustering/classification, biological neural network models, unstructured text data retrieval, etc. It can solve problems such as insufficient semantic understanding and incomplete feature acquisition, with the effects of eliminating polysemy, improving representation ability, and improving classification accuracy.

Active Publication Date: 2021-04-30
HEBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] To classify fuzzy text more accurately and to solve the problems of insufficient semantic understanding and incomplete feature acquisition in fuzzy text classification, the present invention proposes a BERT-based multi-feature fusion fuzzy text classification model (Multi-feature Fusion Fuzzy Text Classification Model Based on BERT, BERT_MFFM).


Examples

Embodiment Construction

[0026] In order to illustrate the technical solutions of the present invention more clearly, the present invention will be further described in detail below in conjunction with the accompanying drawings and examples. The embodiments of the present invention and their descriptions are only for explaining the present invention, and are not intended to limit the present invention.

[0027] The structure of the BERT-based multi-feature fusion fuzzy text classification model in the embodiment of the present invention is shown in Figure 1. The specific implementation steps are as follows:

[0028] S1: Collect the abstracts of similar papers from HowNet and, after data preprocessing, use them as the fuzzy text classification dataset.

[0029] Within a large category (under the same theme), find similar subcategories belonging to that category. The number of samples in each subcategory is almost equal, and the difference in the number of samples between different subcategories do...


Abstract

The invention discloses a BERT-based multi-feature fusion fuzzy text classification model. The method comprises the following steps: preparing an original fuzzy text classification dataset, and constructing a BERT_MFFM model comprising a BERT model, a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and a self-attention module. The input of the BERT model is a fuzzy text, and its output is fed to the CNN, the BiLSTM, and the self-attention module, which extract the local features, sentence semantic features, and syntactic structure features of the fuzzy text, respectively. The output of the BERT model is also concatenated with the output of the BiLSTM, and a max-pooling operation then selects the optimal sentence semantic features. The local features, the optimal sentence semantic features, and the syntactic structure features are fused by parallel concatenation, and the fused result is classified through a SoftMax function, which completes the construction of the BERT_MFFM model. This solves the problem of incomplete feature collection and thereby improves classification accuracy.
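The fusion stage described in the abstract can be sketched in plain Python to make the data flow concrete. This is a toy illustration and not the patented implementation: it assumes the BERT and BiLSTM outputs are already available as per-token vectors, and that the CNN and self-attention branches have each been reduced to a fixed-length vector. All function names, and the linear layer placed before the softmax, are hypothetical choices made for this sketch.

```python
import math


def concat_per_token(a, b):
    """Concatenate two per-token vector sequences along the feature axis."""
    if len(a) != len(b):
        raise ValueError("both sequences must have the same number of tokens")
    return [x + y for x, y in zip(a, b)]


def max_pool_over_tokens(seq):
    """For each feature dimension, keep the maximum value over all tokens."""
    return [max(column) for column in zip(*seq)]


def linear(weights, bias, x):
    """Hypothetical dense layer mapping the fused vector to class scores."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(bert_out, bilstm_out, local_feat, syntax_feat,
                      weights, bias):
    """Pool BERT+BiLSTM, concatenate the three feature vectors, classify."""
    sentence_feat = max_pool_over_tokens(concat_per_token(bert_out, bilstm_out))
    fused = local_feat + sentence_feat + syntax_feat  # parallel concatenation
    return fused, softmax(linear(weights, bias, fused))


# Made-up numbers: 3 tokens, 2-dim BERT output, 1-dim BiLSTM output, 2 classes.
bert_out = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.3]]
bilstm_out = [[0.5], [0.8], [0.6]]
local_feat = [1.0, 0.0]   # stands in for the CNN branch output
syntax_feat = [0.5]       # stands in for the self-attention branch output
weights = [[0.2, -0.1, 0.3, 0.0, 0.1, 0.4],
           [-0.3, 0.2, 0.1, 0.5, -0.2, 0.0]]
bias = [0.0, 0.1]

fused, probs = fuse_and_classify(bert_out, bilstm_out, local_feat,
                                 syntax_feat, weights, bias)
print(len(fused), probs)
```

In a real model each branch would be a trained neural network layer and the fused vector would be far wider. The sketch only shows the order of operations: per-token concatenation, max pooling over tokens, concatenation of the three feature vectors, then softmax.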

Description

Technical field

[0001] The technical solution of the present invention relates to the technical field of natural language processing, specifically a BERT-based multi-feature fusion fuzzy text classification model.

Background art

[0002] With the development of network technology, the volume of information, and of text data in particular, has exploded. The objective world contains a large amount of text information, such as journal documents, current news, emails, text messages, chat messages, and e-books. Because Chinese texts are diverse and complex, the number of fuzzy texts, which have overlapping or similar content, high similarity between categories, and unclear boundaries, has also grown sharply. Efficiently managing and analyzing this mass of fuzzy text, and quickly extracting useful information from it, has become an important task in the field of text classification.

[0003] In text classification, text representation and feature extract...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/211, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/35, G06F40/211, G06F40/30, G06N3/08, G06N3/043, G06N3/044, G06N3/045
Inventor: 梁艳红, 张萌萌, 李欣泽, 刘芃辰
Owner: HEBEI UNIV OF TECH