Network harmful information keyword extraction method and harmful keyword library construction method

A keyword extraction method for harmful network information, applied in digital data information retrieval, text database indexing, unstructured text data retrieval, etc. It addresses the problem that harmful information on the Internet affects residents' daily life and social stability.

Inactive Publication Date: 2022-07-22
西安知了科技有限公司

AI Technical Summary

Problems solved by technology

At present, China's monitoring and grading of Internet information is still in the development stage, and there are still many loopholes in the official supervision system.
Faced with massive amounts of data on the Internet, ordinary residents, especially minors who lack social experience, are poorly equipped to distinguish reliable information from harmful information and are easily affected by the latter. A large amount of harmful information on the Internet will poison residents' daily life and affect social stability.



Embodiment Construction

[0051] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0052] In order to accurately identify and extract keywords from harmful information in network content, and thereby assist official agencies in inspecting harmful information, embodiments of the present invention provide a method for extracting keywords from harmful network information and a method for constructing a harmful keyword database.

[0053] It should be noted that, the execution subject of a method for extracting ...


Abstract

The invention discloses a method for extracting keywords of harmful network information and a method for constructing a harmful keyword library. The keyword extraction method comprises the following steps: acquiring initial text data from the Internet; dividing the text hierarchically with the segmented word as the minimum unit, and giving each unit at each level a standardized attribute description corresponding to that level, to obtain judgment words; matching the judgment words against an original harmful keyword library using their attributes, and determining the harmfulness of each judgment word; and, for each segmented word determined to be harmful, searching the initial text data for segmented words that co-occur with it and whose occurrence frequency meets a preset requirement, and extracting them as suspected harmful segmented words. In addition, the suspected harmful segmented words can be imported into the keyword library and the library's performance checked to determine which of them are judged harmful; the updated harmful keyword library is then obtained from the judged harmful segmented words and the original harmful keyword library. According to the disclosure, the method can accurately identify harmful words in the Internet environment, and the keyword library construction converges quickly with a low misjudgment rate.
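The co-occurrence step of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that texts are already segmented into word lists, that "co-occurrence" means appearing in the same text unit as a known harmful word, and that the "preset requirement" is a simple minimum count. The function name `suspected_harmful_words` and the parameter `min_cooccurrence` are hypothetical, and the hierarchical division and attribute matching steps are not modeled.

```python
from collections import Counter


def suspected_harmful_words(segmented_texts, harmful_words, min_cooccurrence=2):
    """Return words that co-occur with known harmful words often enough.

    segmented_texts: iterable of word lists (one list per text unit).
    harmful_words: set of words already judged harmful.
    min_cooccurrence: assumed form of the "preset requirement"; a word must
        share a text unit with a harmful word at least this many times.
    """
    harmful_words = set(harmful_words)
    counts = Counter()
    for words in segmented_texts:
        unique = set(words)
        # Only text units containing at least one harmful word contribute.
        if unique & harmful_words:
            # Count each candidate once per unit, skipping known harmful words.
            counts.update(unique - harmful_words)
    return {word for word, n in counts.items() if n >= min_cooccurrence}


if __name__ == "__main__":
    texts = [
        ["buy", "pills", "cheap"],
        ["cheap", "pills", "offer"],
        ["weather", "cheap", "today"],
        ["pills", "offer"],
    ]
    print(sorted(suspected_harmful_words(texts, {"pills"}, min_cooccurrence=2)))
```

In this toy data, "cheap" and "offer" each share two text units with the known harmful word "pills", so both are returned as suspected words; "buy" co-occurs only once and is dropped. The abstract's later verification step (checking library performance before accepting a suspected word) would sit downstream of this function.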

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to a method for extracting keywords of harmful network information and a method for constructing a harmful keyword database.

Background art

[0002] With the rapid development of Internet technology, China has fully entered the information age. Information is cheaper to store and easier to disseminate, which makes the amount of network information increase exponentially.

[0003] Abundant information not only brings a lot of convenience to our life, but also provides a channel for the birth and dissemination of harmful information. At present, China's monitoring and classification of Internet information are still in the development stage, and there are still many loopholes in the official supervision system. In the face of massive data on the Internet, ordinary residents, especially minors who lack social ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/216, G06F16/332, G06F16/31
CPC: G06F40/289, G06F40/216, G06F16/332, G06F16/31
Inventor: 赵舰波, 李帅, 刘怀亮, 杨斌, 张善庄
Owner: 西安知了科技有限公司