Method and device for filtering the uniform resource locator (URL) of a webpage

A filtering technique applied in the field of uniform resource locator (URL) filtering. It addresses the problem that URLs of spam webpages cannot be filtered out, and improves efficiency and accuracy.

Active Publication Date: 2019-06-07
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a method and device for filtering the uniform resource locators (URLs) of webpages, to at least solve the technical problem in the prior art that URLs of spam webpages cannot be filtered out.



Examples


Embodiment 1

[0022] According to an embodiment of the present invention, a method for filtering the uniform resource locators (URLs) of webpages is provided. The method can be applied in the hardware environment shown in Figure 1, in which the filtering server 102 used to filter webpage URLs can establish a connection over a network with the webpage server 104 where the webpages are located, and filter the to-be-processed URLs sent by the webpage server 104. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network.

[0023] Optionally, as shown in Figure 2, the method for filtering webpage URLs in this embodiment comprises:

[0024] S202. Obtain a set of URLs to be processed, where the set of URLs to be processed includes URLs of multiple webpages ...
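As a rough illustration of step S202, the sketch below merges URLs from several sources into one de-duplicated collection. The function name `build_url_set`, the sample sources, and the whitespace-trimming rule are all assumptions made for illustration; the text above does not specify how the set is assembled.

```python
from typing import Iterable, List


def build_url_set(*sources: Iterable[str]) -> List[str]:
    """Merge URL sources into one ordered, de-duplicated collection.

    Assumption: duplicates are exact string matches after trimming
    whitespace; the source text does not define a normalization rule.
    """
    seen = set()
    merged = []
    for source in sources:
        for url in source:
            url = url.strip()
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical inputs standing in for two different collection channels.
crawled = ["http://example.com/a.cgi", "http://example.com/b.cgi"]
from_traffic = ["http://example.com/b.cgi ", "http://example.org/c.cgi"]
pending_urls = build_url_set(crawled, from_traffic)
```

Keeping insertion order is a convenience here, so that later filtering results stay easy to compare against the input.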

Embodiment 2

[0082] According to an embodiment of the present invention, a device for filtering the uniform resource locators (URLs) of webpages is provided. The device can be applied in the hardware environment shown in Figure 1, in which the device is located in the filtering server 102 used to filter webpage URLs. The filtering server 102 can establish a connection over a network with the webpage server 104 where the webpages are located, and filter the to-be-processed URLs sent by the webpage server 104. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network.

[0083] According to an embodiment of the present invention, a device for filtering the uniform resource locators (URLs) of webpages is also provided. As shown in Figure 7, the device includes...

Embodiment 3

[0142] According to an embodiment of the present invention, a server for filtering the uniform resource locators (URLs) of webpages as described above is also provided. As shown in Figure 8, the server includes:

[0143] 1) The memory 802, configured to store the configuration file used for filtering webpage URLs and the URLs that remain after filtering.

[0144] Optionally, in this embodiment, the content stored in the memory 802 may also be obtained from servers other than the filtering server 102, which is not limited in this embodiment.

[0145] Optionally, in this embodiment, the memory 802 may also be used to store other data from the filtering process in Embodiment 1 above.
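To make the role of the memory 802 more concrete, the following sketch writes a configuration file and a list of filtered URLs to disk and reads them back. It is only an illustration: the JSON layout, the key names `filter_identifiers` and `filter_fields`, and the file names are hypothetical, since the text does not define the configuration format.

```python
import json
import os
import tempfile

# Hypothetical configuration layout: the patent names a "filtering
# identifier" and a "filtering field" but does not define their format.
config = {
    "filter_identifiers": ["example.com"],
    "filter_fields": ["/404", "error.html"],
}


def save_json(path, data):
    """Write data to path as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, ensure_ascii=False, indent=2)


def load_json(path):
    """Read UTF-8 JSON from path."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


# A temporary directory stands in for the server's storage.
workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "filter_config.json")
result_path = os.path.join(workdir, "filtered_urls.json")

save_json(config_path, config)
save_json(result_path, ["http://example.com/index.cgi"])

loaded_config = load_json(config_path)
loaded_urls = load_json(result_path)
```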

[0146] 2) The processor 804, configured to perform the following operations of the modules in the above device for filtering webpage URLs:

[0147] S1. Obtain a set of URLs to be processed, where the set of URLs to be proc...


Abstract

The present invention discloses a web page uniform resource locator URL filtering method and apparatus. The method comprises: obtaining a to-be-processed URL set, wherein the to-be-processed URL set comprises a plurality of to-be-processed web page URLs; and performing the following filtering operation on each URL in the to-be-processed URL set, wherein a URL currently subjected to the filtering operation in the to-be-processed URL set is a current URL: determining whether the current URL is a to-be-tested URL according to a filtering identifier in a preset configuration file; if the URL is a to-be-tested URL, matching the current URL according to a filtering field in the configuration file; and if the current URL is successfully matched according to the filtering field, filtering the current URL out of the to-be-processed URL set. The method and apparatus provided by the present invention solve a technical problem in the prior art that URLs of spam web pages cannot be filtered, so that Web security scanning is performed after URLs of spam web pages are filtered out, thereby improving the efficiency of Web security scanning.
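The filtering operation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say what a filtering identifier or a filtering field looks like, so here the identifier is assumed to be a host name and the filtering fields are assumed to be plain substrings. The names `CONFIG`, `is_to_be_tested`, `matches_filter_field`, and `filter_urls` are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical configuration: the abstract does not define the format.
CONFIG = {
    # Assumed "filtering identifier": hosts whose URLs must be tested.
    "filter_identifiers": {"example.com"},
    # Assumed "filtering fields": substrings that mark a spam page.
    "filter_fields": ["/404", "notfound"],
}


def is_to_be_tested(url, config):
    """Decide, from the filtering identifier, whether the URL is tested."""
    host = urlparse(url).hostname or ""
    return host in config["filter_identifiers"]


def matches_filter_field(url, config):
    """Return True if any filtering field occurs in the URL."""
    return any(field in url for field in config["filter_fields"])


def filter_urls(urls, config):
    """Return the URLs that remain after the filtering operation."""
    kept = []
    for current_url in urls:
        # Only to-be-tested URLs are matched against the filtering fields;
        # a successful match removes the URL from the set.
        if is_to_be_tested(current_url, config) and matches_filter_field(
            current_url, config
        ):
            continue
        kept.append(current_url)
    return kept


urls = [
    "http://example.com/cgi-bin/login.cgi",
    "http://example.com/404.html",
    "http://other.example.org/notfound.cgi",
]
remaining = filter_urls(urls, CONFIG)
```

In this sketch the third URL is kept even though it contains a filtering field, because its host is not marked by a filtering identifier and so it is never tested.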

Description

Technical field

[0001] The invention relates to the computer field, and in particular to a method and device for filtering the uniform resource locators (URLs) of webpages.

Background art

[0002] When performing a web security scan on Common Gateway Interface (CGI) pages, it is usually necessary to collect as many CGIs as possible and to filter out the junk pages among them, so as to improve the efficiency of the web security scan. At present, those skilled in the art mainly collect CGIs in two ways: one is to crawl URLs on the Internet with a web crawler; the other is to obtain CGIs by bypassing WAF traffic. However, both ways of obtaining CGIs inevitably collect a large number of spam webpages, where a spam webpage may be a webpage that cannot be visited or does not exist. These spam webpages are meaningless to web security scanning, and even affect the efficiency of We...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/9535, G06F16/955
Inventors: 何双宁, 董昭, 马杰
Owner: TENCENT TECH (SHENZHEN) CO LTD