Bad webpage detection method and device

A detection method and device, applied in the field of network security, that address the false detection of websites and the poor detection effect of existing bad website detection, thereby improving the detection effect.

Active Publication Date: 2012-06-27
CHINA INTERNET NETWORK INFORMATION CENTER

AI Technical Summary

Problems solved by technology

However, relying purely on keywords to detect bad websites causes false detection of many websites, and the detection effect is poor.

Examples

Example Embodiment

[0019] Embodiment one

[0020] Figure 1 is a flow chart of the bad webpage detection method provided by Embodiment 1 of the present invention. As shown in Figure 1, the method provided in this embodiment can be applied to the detection of bad websites, which may include pornographic, gambling, violent and reactionary websites. The method can be carried out by a bad webpage detection device, which can be implemented in software and/or hardware.

[0021] The bad webpage detection method provided in this embodiment specifically includes:

[0022] Step 10. Obtain the suspected bad webpages corresponding to a bad keyword, obtain the original address of each suspected bad webpage, and generate a list of bad URLs that includes these original addresses;

[0023] Specifically, the bad keywords may include bad information such as p...
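
Step 10 can be illustrated with a minimal sketch. It assumes the page texts have already been fetched into a dictionary keyed by original address; the function name `build_bad_url_list`, the sample addresses and the sample keyword are hypothetical and do not come from the patent, which does not say how the suspected pages are retrieved.

```python
def build_bad_url_list(pages, bad_keywords):
    """Return the original addresses of pages whose text contains any bad keyword.

    pages: dict mapping original address -> page text (assumed already fetched).
    bad_keywords: iterable of keyword strings.
    """
    lowered = [kw.lower() for kw in bad_keywords]
    bad_urls = []
    for url, text in pages.items():
        body = text.lower()
        # A page is only *suspected* at this stage: one keyword hit is enough.
        if any(kw in body for kw in lowered):
            bad_urls.append(url)
    return bad_urls


# Hypothetical sample data for illustration only.
pages = {
    "http://a.example/": "Online casino bonus, play now",
    "http://b.example/": "Weather forecast for the weekend",
}
bad_url_list = build_bad_url_list(pages, ["casino"])
```

A plain substring match like this is exactly the kind of keyword-only filter that produces false detections, which is why the later steps prune the list.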

Example Embodiment

[0029] Embodiment two

[0030] Figure 2 is a flow chart of the bad webpage detection method provided by Embodiment 2 of the present invention. As shown in Figure 2, the method provided in this embodiment builds on Embodiment 1. After the suspected bad webpage is analyzed and the analysis result is generated in step 20, the following step may also be included:

[0031] Step 40. When hidden cheating is identified in the suspected bad webpage according to the analysis result, delete the original address corresponding to the suspected bad webpage from the list of bad URLs.

[0032] Specifically, hidden cheating refers to text in the suspected bad webpage that cannot be seen directly by the human eye. Sites that use hidden cheating are usually not pornographic, gambling or similar sites. The suspected bad webpage is analyzed, and if there is hidden cheating in the suspected bad webpage, the original address of the suspected ba...
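
One possible way to approximate step 40 is sketched below. It treats text inside an element whose inline `style` sets `display:none` or `visibility:hidden` as hidden text. This heuristic is an assumption made for illustration: the patent does not specify how hidden text is recognized, and real pages can also hide text through stylesheets, matching foreground and background colors, or off-screen positioning, none of which this sketch covers. The names `HiddenTextFinder`, `has_hidden_text` and `remove_hidden_cheaters` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns treated as "hidden" (an illustrative assumption).
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
# Void elements never get an end tag, so they must not affect the depth counter.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # > 0 while inside a hidden element
        self.hidden_text = []   # non-empty text found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        # Once inside a hidden element, every nested element is hidden too.
        if self.hidden_depth or HIDDEN_STYLE.search(style):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth and data.strip():
            self.hidden_text.append(data.strip())


def has_hidden_text(html):
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return bool(finder.hidden_text)


def remove_hidden_cheaters(bad_urls, html_by_url):
    """Drop addresses whose page contains hidden text (step 40)."""
    return [u for u in bad_urls if not has_hidden_text(html_by_url.get(u, ""))]


# Hypothetical sample pages for illustration only.
html_by_url = {
    "http://a.example/": '<p>News</p><div style="display:none">casino casino</div>',
    "http://b.example/": "<p>casino games</p>",
}
remaining = remove_hidden_cheaters(list(html_by_url), html_by_url)
```

The depth counter assumes reasonably well-formed markup; badly nested tags could leave it out of step with the real document structure.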

Example Embodiment

[0050] Embodiment Three

[0051] Figure 3 is a schematic structural diagram of a device for detecting bad webpages provided by Embodiment 3 of the present invention. As shown in Figure 3, the device provided in this embodiment can implement each step of the bad webpage detection method provided in any embodiment of the present invention, which will not be repeated here.

[0052] The device for detecting bad webpages provided in this embodiment specifically includes a bad URL list generation module 11, an analysis module 12 and a first deletion module 13. The bad URL list generation module 11 is used to obtain the suspected bad webpages corresponding to the bad keywords, obtain the original addresses corresponding to the suspected bad webpages, and generate a bad URL list including the original addresses. The analysis module 12 is used for parsing suspected bad web pages and generating pars...
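
The three-module structure can be sketched as three small classes. This is an illustration under stated assumptions, not the patented implementation: the class names, the `ParseResult` type and the sample data are hypothetical. The first deletion module follows the redirect-based deletion described in the abstract, and the test used here (a `<meta http-equiv="refresh">` tag pointing to a different host) is only an assumed stand-in for whatever criterion the patent actually applies.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Matches the target of a <meta http-equiv="refresh" ... url=...> tag.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*url=([^"\'>\s]+)', re.I)


@dataclass
class ParseResult:
    url: str
    redirect_target: Optional[str]  # None when no meta refresh was found


class BadUrlListGenerationModule:  # corresponds to module 11
    def run(self, pages, bad_keywords):
        keywords = [k.lower() for k in bad_keywords]
        return [u for u, html in pages.items()
                if any(k in html.lower() for k in keywords)]


class AnalysisModule:  # corresponds to module 12
    def run(self, url, html):
        match = META_REFRESH.search(html)
        return ParseResult(url, match.group(1) if match else None)


class FirstDeletionModule:  # corresponds to module 13
    @staticmethod
    def _redirects_off_site(result):
        if result.redirect_target is None:
            return False
        target_host = urlparse(result.redirect_target).hostname
        # Relative targets have no host and stay on the same site.
        return target_host is not None and target_host != urlparse(result.url).hostname

    def run(self, bad_urls, results):
        flagged = {r.url for r in results if self._redirects_off_site(r)}
        return [u for u in bad_urls if u not in flagged]


# Hypothetical sample pages for illustration only.
pages = {
    "http://a.example/": '<meta http-equiv="refresh" content="0; url=http://evil.example/x">casino',
    "http://b.example/": "<p>casino games</p>",
    "http://c.example/": "<p>weather</p>",
}
bad_urls = BadUrlListGenerationModule().run(pages, ["casino"])
results = [AnalysisModule().run(u, pages[u]) for u in bad_urls]
final_list = FirstDeletionModule().run(bad_urls, results)
```

Keeping the modules separate mirrors the device description: the list is generated first, each suspected page is analyzed once, and deletion works only from the analysis results.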

Abstract

The invention provides a bad webpage detection method and device. The bad webpage detection method comprises the following steps: acquiring suspected bad webpages corresponding to bad keywords, acquiring the original address corresponding to each suspected bad webpage, and generating a bad URL list containing the original addresses; analyzing the suspected bad webpages to generate an analysis result; and, when malicious redirects are identified in the suspected bad webpages according to the analysis result, deleting the original addresses corresponding to those suspected bad webpages from the bad URL list. Because the suspected bad webpages acquired according to the bad keywords are further analyzed and the webpages with malicious redirects are deleted from the list, the detection effect on bad webpages is improved.

Description

Technical field

[0001] The invention relates to network security technology, in particular to a method and device for detecting bad webpages.

Background

[0002] The rapid development of Internet technology has promoted the continuous development of the information society, and the Internet has become an indispensable part of social activities. However, the Internet has also become a medium for the dissemination of pornography and other bad information, seriously affecting the normal use of the Internet by netizens, especially teenagers, and hindering the healthy and orderly development of the Internet.

[0003] For the detection of bad websites such as pornographic websites, keyword filtering is a widely used technique for detecting pornographic information on the Internet: it is simple, easy to implement, and can be processed in a distributed manner. However, purely relying on keywords to detect bad websites will cause false detection of many websites, and the detection effect is not go...

Application Information

IPC(8): H04L12/26, H04L29/06
Inventor: 王利明, 耿光刚, 洪博
Owner: CHINA INTERNET NETWORK INFORMATION CENTER