Webpage text classification method based on enhanced capsule network and storage medium

Text classification and capsule network technology, applied in network data retrieval, network data indexing, neural learning methods, etc. It addresses the problems of large information loss and low overall accuracy, with the effects of improving robustness, improving learning ability, and eliminating the vanishing gradient problem.

Active Publication Date: 2020-07-28
CHINESE ACAD OF SURVEYING & MAPPING
Cites: 13 | Cited by: 8

AI Technical Summary

Problems solved by technology

[0006] To overcome the low overall accuracy of the existing technology and its loss of a large amount of important information during feature extraction, the present invention proposes a text classification method for social public safety event web pages based on an enhanced capsule network.

Embodiment Construction

[0047] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention. In addition, it should be noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings but not all structures.

[0048] Specifically, figure 1 shows the basic flow chart of the web page text classification method based on the enhanced capsule network of the present invention, which includes the following steps:

[0049] Data acquisition and processing step S110:

[0050] Crawl the text data of social public safety event webpages from domestic mainstream media websites such as Sina News, Netease News, Sina Weibo, etc., clean and structure the obtained text data, and finally obtain the experimental corpus. The training set an...
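The cleaning and structuring in step S110 could look roughly like the following minimal sketch. The source does not say how the pages are cleaned or how the corpus is divided, so the helper names (`clean_text`, `build_corpus`, `split_corpus`), the tag-stripping rules, the deduplication step, and the 80/20 split ratio are all assumptions made for illustration, not the patent's procedure.

```python
import html
import random
import re

TAG_RE = re.compile(r"<[^>]+>")   # crude HTML tag matcher (assumption)
SPACE_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip HTML tags, unescape entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)
    unescaped = html.unescape(no_tags)
    return SPACE_RE.sub(" ", unescaped).strip()


def build_corpus(pages):
    """Turn (raw_html, label) pairs into records, dropping empty and duplicate texts."""
    seen = set()
    corpus = []
    for raw, label in pages:
        text = clean_text(raw)
        if text and text not in seen:
            seen.add(text)
            corpus.append({"text": text, "label": label})
    return corpus


def split_corpus(corpus, test_ratio=0.2, seed=0):
    """Shuffle deterministically and split into (train, test); the ratio is hypothetical."""
    shuffled = corpus[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    pages = [
        ("<p>Flood warning issued &amp; roads closed</p>", "natural disaster"),
        ("<div>Factory fire under investigation</div>", "accident disaster"),
    ]
    print(build_corpus(pages))
```

A real crawler would also need to handle scripts, navigation text, and encoding detection, which this sketch leaves out.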

Abstract

The invention discloses a webpage text classification method based on an enhanced capsule network and a storage medium thereof. The method comprises the following steps: crawling webpage text data in a specific field, cleaning and structuring the obtained text data, and finally obtaining an experimental corpus; setting up the architecture of an enhanced capsule network, which sequentially comprises a dense convolutional network, a main (primary) capsule layer and a digital (digit) capsule layer; and training the enhanced capsule network with the training data of the training set as its input to obtain a classifier, then verifying the accuracy of the classifier with the test data of the test set. The dense convolutional network is introduced to extract feature information, so that the features are more discriminative and the model's learning ability on the data set is improved. The output of the main capsule layer is further encoded by a dynamic routing mechanism, so that the obtained features are more directional and the capsule network is more robust.
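The abstract names a dynamic routing mechanism between the main capsule layer and the digital capsule layer but gives no details here. As an illustration only, the sketch below follows the routing-by-agreement procedure commonly described for capsule networks. The function names, the three routing iterations, and the plain-list data layout are assumptions; the patent's actual routing may differ.

```python
import math


def squash(s):
    """Squash nonlinearity: keep the direction of s, map its length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Route lower-capsule predictions to upper capsules.

    u_hat[i][j] is the prediction vector from lower capsule i for upper capsule j.
    Returns one output vector per upper capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(num_iters):
        # Coupling coefficients: each lower capsule distributes over upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the upper capsule output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


if __name__ == "__main__":
    # Two lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
    u_hat = [[[1.0, 0.0], [0.0, 1.0]],
             [[1.0, 0.0], [0.0, -1.0]]]
    for vec in dynamic_routing(u_hat):
        print(vec, math.sqrt(sum(x * x for x in vec)))
```

In this toy input the predictions for the first upper capsule agree and reinforce each other, while those for the second cancel out, so the first output vector ends up longer. In capsule networks the length of an output vector is typically read as the score for the corresponding class.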

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and specifically to a web page text classification method and storage medium based on an enhanced capsule network. The method is particularly suitable for related fields such as social public security incidents.

Background technique

[0002] With the development of Internet technology, the amount of data related to social public security events on the Internet has exploded. Public security incidents are incidents that endanger the life, health, and property of the majority (not all people, nor individuals) and may cause a series of public problems, which in turn lead to the collapse of the value system and the disruption of social order. Public safety incidents are usually divided into natural disasters, accident disasters, public health, and social security. Collecting a large amount of relevant webpages and information of social public security event data from t...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/289, G06F16/951, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/951, G06N3/08, G06N3/045, G06F18/24, Y02D10/00
Inventor: 石丽红, 朱鹏, 赵习枝, 张福浩, 仇阿根
Owner: CHINESE ACAD OF SURVEYING & MAPPING