Text enhancement semantic classification method and system based on convolutional neural network

This convolutional neural network classification technique, applied in the field of natural language processing, addresses problems such as low analysis accuracy and an unbalanced number of samples per label, and achieves the effects of expanding the sample size, improving accuracy, and improving robustness.

Pending Publication Date: 2020-03-24
USTC SINOVATE SOFTWARE

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is: how to solve the problems of low analysis accuracy and unbalanced number of label samples in the classification and analysis of government affairs and pub

Method used



Examples


Embodiment 1

[0057] This embodiment provides a text-enhanced semantic classification method based on a convolutional neural network, comprising the following steps:

[0058] The present invention solves the above technical problems through the following technical solution, which comprises the following steps:

[0059] S1: Collect training samples

[0060] Crawl the articles on the target website with a web crawler, manually classify and label them based on their content, and store the labels, article titles, and article texts in a database as training samples;
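
The storage half of step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not name a database, so SQLite, the `samples` table, and the helper names `create_store`, `add_sample`, and `load_samples` are all assumptions, and the crawling and manual labeling are taken as already done.

```python
import sqlite3


def create_store(conn):
    """Create the (hypothetical) table holding one labeled article per row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY, "
        "label TEXT NOT NULL, "
        "title TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )


def add_sample(conn, label, title, body):
    """Store one manually labeled article as a training sample."""
    conn.execute(
        "INSERT INTO samples (label, title, body) VALUES (?, ?, ?)",
        (label, title, body),
    )
    conn.commit()


def load_samples(conn):
    """Return all stored samples as (label, title, body) tuples, oldest first."""
    rows = conn.execute("SELECT label, title, body FROM samples ORDER BY id")
    return rows.fetchall()


# In-memory database so the sketch runs without touching the file system.
conn = sqlite3.connect(":memory:")
create_store(conn)
add_sample(conn, "Taxation", "Example title", "Example article text.")
samples = load_samples(conn)
```

Keeping the label, title, and body in separate columns lets the later steps process the title and the body independently.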

[0061] S2: Preprocessing

[0062] Preprocess the article titles and body text in the database. Preprocessing includes removing the irrelevant short title links that were picked up alongside some samples during crawling and, on that basis, removing noise words, including punctuation marks, English letters, pers...
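
A rough sketch of the noise-removal part of S2 is given below. It covers only the two noise categories that are fully legible in the text above (punctuation marks and English letters); the remaining categories are cut off in the source and are deliberately left out, as is the removal of short title links. The function names `remove_noise` and `preprocess` are hypothetical.

```python
import re
import unicodedata

# ASCII English letters; full-width variants are not handled in this sketch.
_ENGLISH_LETTERS = re.compile(r"[A-Za-z]+")


def remove_noise(text):
    """Drop English letters, Unicode punctuation, and whitespace from text."""
    text = _ENGLISH_LETTERS.sub("", text)
    # Unicode general categories starting with "P" cover both ASCII and
    # Chinese punctuation such as the full-width comma and full stop.
    kept = [ch for ch in text if not unicodedata.category(ch).startswith("P")]
    # Remove any whitespace left behind by the deletions.
    return "".join("".join(kept).split())


def preprocess(title, body):
    """Apply the same cleaning to an article title and its body text."""
    return remove_noise(title), remove_noise(body)


cleaned = remove_noise("政务公开，OK！依法行政。")
```

Using Unicode categories instead of a hand-written character list avoids having to enumerate every Chinese punctuation mark.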

Embodiment 2

[0099] As shown in Figure 1, this embodiment provides a text-enhanced semantic classification method based on a convolutional neural network, comprising the following steps:

[0100] S1: Collect training text.

[0101] Specifically, the text content of the Chinese articles on the target webpage is crawled using web crawler technology and then manually classified. Optionally, in this embodiment, the classification labels include 'engineering construction', 'administration according to law', 'government procurement and bidding', 'cultural, sports, education, scientific research intellectual property rights', 'public security credibility', 'circulation field', 'Taxation', 'Production Field', 'Civil Service Integrity', 'Environmental Protection and Energy Conservation', 'Trust and Procuracy', 'Court Procuratorate Credibility', 'Price', 'Tourism', 'Social Security', 'Financial Sector', 'Medical and Health', 'Social Security', 'Social Civilizati...
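
Because the method targets an unbalanced number of samples per label, a first practical check after manual classification is to count how many samples each label received. The sketch below is an illustration only: the 0.5 threshold, the toy label counts, and the helper name `minority_labels` are assumptions, and only three of the labels listed above are used.

```python
from collections import Counter


def minority_labels(labels, ratio=0.5):
    """Return labels whose sample count is below ratio * the largest count."""
    counts = Counter(labels)
    if not counts:
        return []
    largest = max(counts.values())
    return sorted(name for name, n in counts.items() if n < ratio * largest)


# Toy label column, as it might come out of the manual labeling step.
labels = ["Taxation"] * 6 + ["Price"] * 5 + ["Tourism"] * 2
counts = Counter(labels)
needs_enhancement = minority_labels(labels)
```

Labels flagged this way are the natural candidates for the data enhancement step.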



Abstract

The invention discloses a text-enhanced semantic classification method and system based on a convolutional neural network, belonging to the technical field of natural language processing. The method comprises the following steps: S1, collecting training samples; S2, preprocessing; S3, word segmentation; S4, constructing a word segmentation matrix; S5, data enhancement; S6, training the model. A new text word vector matrix with the same label can be generated, so that labels with only a small amount of data in the original data set are greatly enhanced, the sample size is expanded, and the robustness, accuracy, precision, and recall of the subsequent model are improved. A model is trained with the improved convolutional neural network, so that Chinese government-affairs public-opinion texts can be effectively classified under their labels. The invention is suitable for semantic category classification of Chinese text and also for other classification problems such as binary sentiment classification.
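
The idea of generating a new text word vector matrix that keeps the same label can be illustrated with the sketch below. The abstract does not say how the new matrix is produced, so the Gaussian jitter used here is a stand-in technique chosen for illustration, not the patented method. The function names, the toy two-dimensional embeddings, and `noise_std` are all hypothetical, and the tokens are assumed to be already segmented.

```python
import random


def build_matrix(tokens, embeddings, max_len, dim):
    """Stack word vectors into a max_len x dim matrix, zero-padded at the end.

    Tokens missing from the embedding table are mapped to a zero vector.
    """
    rows = [list(embeddings.get(tok, [0.0] * dim)) for tok in tokens[:max_len]]
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows


def augment(matrix, rng, noise_std=0.01):
    """Return a jittered copy of matrix; all-zero rows are left untouched.

    All-zero rows are padding or unknown words, so adding noise to them
    would only inject spurious signal.
    """
    out = []
    for row in matrix:
        if any(v != 0.0 for v in row):
            out.append([v + rng.gauss(0.0, noise_std) for v in row])
        else:
            out.append(list(row))
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable
embeddings = {"政务": [0.1, 0.2], "公开": [0.3, 0.4]}  # toy vectors
original = build_matrix(["政务", "公开"], embeddings, max_len=4, dim=2)
enhanced = augment(original, rng)
# The enhanced matrix is paired with the same label as the original.
sample, new_sample = ("Taxation", original), ("Taxation", enhanced)
```

The small noise scale keeps each new matrix close to the original, which is why the sketch reuses the original label for it.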

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, in particular to a text-enhanced semantic classification method and system based on a convolutional neural network.

Background art

[0002] With the rapid popularization of the Internet and smartphones, the speed and breadth of information dissemination have increased exponentially in just a few years. News media, one of the important carriers of information, has developed rapidly on the Internet with the emergence of new technologies such as WeChat Moments, Weibo We-Media, and Toutiao Push. The development of network media has promoted people's acquisition and discussion of news events, which makes network media one of the important carriers reflecting public opinion. Analyzing the text data of online media can help people better obtain the information behind the news, such as people's opinions and emotions, and help people grasp the trend of public...

Claims


Application Information

IPC(8): G06F16/35, G06F16/951, G06N3/04
CPC: G06F16/35, G06F16/951, G06N3/045
Inventor 王正宇王平平王周焱丁磊杨鹏飞钱伟韦贾计
Owner USTC SINOVATE SOFTWARE