Chinese text classification method based on BERT and CNN hierarchical connection

A Chinese text classification technology, applied in the fields of text database clustering/classification, unstructured text data retrieval, semantic analysis, etc. It can address problems such as time-consuming and labor-intensive processing and the effect of quantization on results.

Active Publication Date: 2020-05-19
DONGHUA UNIV
Cites: 1 | Cited by: 18

AI Technical Summary

Problems solved by technology

Traditional deep learning models rely on quantifying (vectorizing) the words or characters in a sentence as model input, but this method is sometimes affected by quan...



Examples


Embodiment Construction

[0023] The present invention is further illustrated below in conjunction with specific embodiments. It should be understood that these embodiments only illustrate the present invention and are not intended to limit its scope. It should also be understood that, after reading the teachings of the present invention, those skilled in the art can make various changes or modifications to it, and these equivalent forms also fall within the scope defined by the appended claims of the present application.

[0024] The specific embodiment of the present invention relates to a Chinese text classification method based on a hierarchical connection between BERT and CNN. The method includes: pre-training the BERT model on a Chinese Wikipedia text dataset, then obtaining and saving all parameters of the BERT model; hierarchically connecting the CNN model to the BERT model to o...


Abstract

The invention relates to a Chinese text classification method based on a hierarchical connection between BERT and CNN. The method mainly addresses text classification problems for Chinese text, such as sentiment analysis, core sentence recognition, and relation recognition. A CNN model is hierarchically connected to a BERT model to obtain a new model, BERT-CNN. Because the CNN is stacked on top of BERT, the sentence features extracted by the BERT model can be further refined, yielding a more effective semantic representation of the sentence. As a result, a better classification effect can be obtained in text classification tasks.
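The idea of letting a CNN further extract features from BERT's output can be sketched as follows. This is a minimal illustration, not the patented architecture: the source does not give the CNN layout, so the sketch assumes a common text-CNN design (convolution over windows of consecutive tokens, ReLU, max-over-time pooling, then a linear softmax classifier). The token vectors are random stand-ins for BERT's per-token hidden states, and all function names, sizes, and weights are hypothetical.

```python
import math
import random


def conv_max_pool(token_vectors, filters, biases):
    """Apply each filter over sliding token windows, then ReLU and max-over-time.

    token_vectors: list of per-token feature vectors (stand-in for BERT output).
    filters: list of filters; each filter is a list of `width` weight vectors.
    Returns one pooled feature per filter.
    """
    pooled = []
    for filt, bias in zip(filters, biases):
        width = len(filt)  # number of consecutive tokens the filter covers
        best = 0.0  # max over ReLU outputs is never below 0
        for start in range(len(token_vectors) - width + 1):
            window = token_vectors[start:start + width]
            act = bias + sum(
                w * x
                for w_row, x_row in zip(filt, window)
                for w, x in zip(w_row, x_row)
            )
            best = max(best, act)
        pooled.append(best)
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(token_vectors, filters, biases, out_weights, out_bias):
    """CNN head on top of token features: pooled features -> class probabilities."""
    features = conv_max_pool(token_vectors, filters, biases)
    logits = [
        b + sum(w * f for w, f in zip(row, features))
        for row, b in zip(out_weights, out_bias)
    ]
    return softmax(logits)


if __name__ == "__main__":
    rng = random.Random(0)
    seq_len, hidden, num_classes = 6, 4, 2  # assumed toy sizes
    # Random vectors standing in for BERT's per-token hidden states.
    tokens = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(seq_len)]
    # Two filters of width 2 and one of width 3 (assumed).
    filters = [
        [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(width)]
        for width in (2, 2, 3)
    ]
    biases = [0.0] * len(filters)
    out_weights = [[rng.uniform(-1, 1) for _ in filters] for _ in range(num_classes)]
    out_bias = [0.0] * num_classes
    print(classify(tokens, filters, biases, out_weights, out_bias))
```

In a real system the stand-in vectors would come from the pre-trained BERT model and the filter and classifier weights would be learned during fine-tuning; here they are random only to keep the sketch self-contained.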

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing, and in particular relates to a Chinese text classification method based on a hierarchical connection between the deep learning models BERT and CNN.

Background

[0002] With the rapid development of the economy and the Internet, more and more people choose to express their opinions online. Given the large amount of text data on the Internet, how to efficiently extract valuable information from it has become a research hotspot. Question-answering robots, search, machine translation, and sentiment analysis are all key application areas of natural language processing, and all of them depend on text classification as their foundation. Precisely because text classification is foundational, the accuracy requirements for it are relatively high. Therefore, over the years, text classification technology has been a resear...


Application Information

IPC(8): G06F16/35, G06F40/30, G06N3/04
CPC: G06F16/35, G06N3/045
Inventors: 马强, 赵鸣博, 孔维健, 王晓峰, 孙嘉瞳, 邓开连
Owner: DONGHUA UNIV