Dynamic URL filtering method and device

A filtering method and device, applied in fields such as special data processing applications, web data retrieval using information identifiers, and instruments. It addresses the long process, slow speed, and resource consumption of existing approaches, and achieves faster processing, reduced storage, and savings in processing time and computing resources.

Status: Inactive
Publication Date: 2015-04-29
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT
Cites: 4; Cited by: 11

AI Technical Summary

Problems solved by technology

The traditional method of classifying static and dynamic URLs by MD5 values is relatively slow, involves a relatively long process, and consumes more resources.

Examples

Example 1

[0039] In the first embodiment of the present invention, a dynamic URL filtering method, as shown in Figure 1, includes the following specific steps:

[0040] Step S101: create an information dictionary based on the URL annotation set. The information dictionary contains two types of content: character string features and statistical features.

[0041] Specifically, the statistical features and the character string features are derived from all URLs in the URL annotation set. The statistical features include the normalized value of at least one of the following items: the number of occurrences of the set punctuation marks, the path depth, the number of digits in the domain name and/or path, the length of the longest character string in the domain name and/or path, the length of the suffix, and the frequency of transitions between digits and characters. For example: the method of determining the normalized value of the number of occurrences of the set punctuation m...
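The statistical features listed in [0041] can be sketched roughly as follows. This is an illustration only, not the patent's procedure: the punctuation set `PUNCTUATION`, the function names `raw_statistics` and `normalize`, and the divide-by-maximum normalization are all assumptions, since the excerpt is cut off before it describes how the normalized values are actually determined.

```python
from urllib.parse import urlsplit
import re

# Assumed set of "set punctuation marks"; the excerpt does not list them.
PUNCTUATION = "?=&%"

def raw_statistics(url):
    """Compute un-normalized statistics for one URL (hypothetical helper)."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    domain, path = parts.netloc, parts.path
    rest = url.split("://", 1)[-1]          # URL without the scheme
    text = domain + path                    # domain name and path only
    last_segment = path.rsplit("/", 1)[-1]
    suffix = last_segment.rsplit(".", 1)[-1] if "." in last_segment else ""
    letter_runs = re.findall(r"[A-Za-z]+", text)
    # Count adjacent positions where a digit meets a letter or vice versa.
    switches = sum(
        1 for a, b in zip(text, text[1:])
        if (a.isdigit() and b.isalpha()) or (a.isalpha() and b.isdigit())
    )
    return {
        "punctuation_count": sum(rest.count(c) for c in PUNCTUATION),
        "path_depth": len([s for s in path.split("/") if s]),
        "digit_count": sum(ch.isdigit() for ch in text),
        "longest_string": max((len(r) for r in letter_runs), default=0),
        "suffix_length": len(suffix),
        "digit_letter_switches": switches,
    }

def normalize(rows):
    """Scale each statistic by its maximum over the set (assumed scheme)."""
    keys = list(rows[0].keys())
    maxima = {k: max(r[k] for r in rows) or 1 for k in keys}
    return [[r[k] / maxima[k] for k in keys] for r in rows]

if __name__ == "__main__":
    annotated = [
        "http://example.com/news/2014/item123.html?id=5&page=2",
        "http://a.cn/index.html",
    ]
    for vector in normalize([raw_statistics(u) for u in annotated]):
        print(vector)
```

Each row returned by `normalize` is one candidate feature vector; stacking the rows for every URL in the annotation set gives a feature matrix of the kind the abstract describes.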

Example 3

[0061] The third embodiment of the present invention builds on the embodiments above. Taking the dynamic/static classification of a URL collection with the linear logistic regression classification algorithm as an example, an application example of the present invention is introduced with reference to Figures 3-7.

[0062] Unlike the traditional method of classifying static/dynamic URLs by MD5 values, the application example of the present invention classifies URLs based on a linear logistic regression classification algorithm and a new feature set. The overall classification flow is shown in Figure 3.

[0063] In the application example of the present invention, the linear logistic regression classification algorithm is applied to the dynamic URL filtering problem. In addition, although the present invention follows the idea of classification by logistic regression, the feature extraction st...
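As a rough illustration of how a linear logistic regression classifier could yield a weight vector and a binary threshold, the sketch below fits the model by plain batch gradient descent. The excerpt does not describe the training procedure, so the optimizer, the hyperparameters `lr` and `epochs`, the function name `train_logistic`, and the choice to report the negated bias as the threshold (the point where the predicted probability equals 0.5) are all assumptions.

```python
import math

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent (illustrative only).

    features: list of equal-length feature vectors with values in [0, 1]
    labels:   list of 0/1 labels (here 1 is taken to mean "dynamic")
    Returns (weights, threshold) where w.x > threshold predicts label 1.
    """
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid probability
            err = p - y                      # gradient of the log loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    # Assumption: fold the bias into the threshold, so that
    # w.x > -b is equivalent to a predicted probability above 0.5.
    return w, -b

if __name__ == "__main__":
    X = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.8]]   # toy feature matrix
    y = [0, 0, 1, 1]
    weights, threshold = train_logistic(X, y)
    print(weights, threshold)
```

Folding the bias into the threshold keeps the prediction step to a single weighted sum and one comparison, which matches the shape of the prediction step in the abstract, though the patent may derive its threshold differently.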

Abstract

The invention provides a dynamic URL filtering method and device. The method comprises the steps of: creating an information dictionary based on a URL annotation set; generating a feature vector for each URL in the URL annotation set according to the information dictionary; forming a feature matrix from the feature vectors of all URLs in the URL annotation set; performing classification on the URL feature matrix to obtain a feature weight vector and a binary classification threshold; performing feature extraction on the URL to be predicted based on the information dictionary; generating the feature vector of the URL to be predicted from the extracted features; multiplying the feature vector of the URL to be predicted element by element with the feature weight vector and summing the products to obtain a target value; and comparing the target value with the binary classification threshold to determine whether the URL to be predicted is a dynamic URL or a static URL. With this method, processing can be performed offline without accessing the network, so storage is reduced and processing time and computing resources are saved.
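The prediction step described in the abstract (multiply the feature vector by the weight vector element by element, sum the products, compare with the threshold) can be sketched as below. The function name `classify`, the example numbers, and the convention that a target value above the threshold means "dynamic" are assumptions; the abstract does not state which side of the threshold corresponds to which class.

```python
def classify(feature_vector, weights, threshold):
    """Return "dynamic" or "static" for one URL feature vector (sketch)."""
    if len(feature_vector) != len(weights):
        raise ValueError("feature vector and weight vector differ in length")
    # Multiply corresponding entries and add them up to get the target value.
    target = sum(x * w for x, w in zip(feature_vector, weights))
    # Assumed convention: target above the threshold means dynamic.
    return "dynamic" if target > threshold else "static"

if __name__ == "__main__":
    weights = [2.0, -1.0]      # made-up weights for illustration
    threshold = 1.0            # made-up threshold for illustration
    print(classify([1.0, 0.5], weights, threshold))
    print(classify([0.2, 0.5], weights, threshold))
```

Because this step needs only the stored weight vector and threshold, it involves no network access, which is consistent with the offline processing the abstract claims.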

Description

Technical field

[0001] The invention relates to the technical field of URL filtering, in particular to a dynamic URL filtering method and device.

Background

[0002] On January 16, 2014, the China Internet Network Information Center (CNNIC) released the 33rd "Statistical Report on Internet Development in China" in Beijing. The report shows that as of December 2013, the number of Chinese internet users had reached 618 million, with an Internet penetration rate of 45.8%. Among them, mobile Internet users numbered 500 million and continued to grow steadily. Daily Internet activity therefore generates a large amount of data, the vast majority of which comes from web browsing; that is, HTTP (Hypertext Transfer Protocol) carries a very large volume of traffic. This inevitably produces URLs on a massive scale. However, often only a portion of these URLs is meaningful. A certain number of URLs (Unifor...

Application Information

IPC(8): G06F17/30
CPC: G06F16/955, G06F16/9566
Inventor: 钮艳, 易立, 段东圣, 赵淳璐, 鲁睿, 刘晓辉, 王晶, 翟羽佳, 潘进
Owner NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT