Webpage classifying method, system and device

A webpage classification technology, applied in the field of network security, that addresses problems of existing approaches such as high development costs, dependence on search engine performance, and the inability to detect in real time, and achieves improved classification accuracy.

Active Publication Date: 2018-05-15
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

Among existing methods, the blacklist- and whitelist-based method requires constant manual maintenance of the lists; the search-engine-based method is often limited by the performance of the search engine and cannot achieve real-time detection; the visual-similarity-based method is easily affected by the accuracy of target recognition; and the method using webpage DNS information requires a third-party service to provide that information, so its development cost is relatively high.



Embodiment Construction

[0040] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0041] The embodiment of the present invention discloses a webpage classification method, which improves the accuracy of webpage classification without relying on search engines or third-party services.

[0042] Referring to Figure 1, a flow chart of a webpage classification method disclosed in an embodiment of the present invention, the method includes:

[0043] S101: Obtain N-dimensional current features of a webpage to be clas...


Abstract

The invention discloses a webpage classification method. The method includes: obtaining N-dimensional current features of a webpage to be classified, where N is a positive integer; inputting the N-dimensional current features into a trained stacking model for feature expansion to obtain (N+n)-dimensional features of the webpage, where the stacking model is formed by stacking p basic classification models in q layers, n is the product of p and q, and n, p and q are positive integers; and obtaining a classification result for the webpage by applying a classification algorithm to the (N+n)-dimensional features. Because the N-dimensional current features are expanded through the stacking model, the accuracy of webpage classification is improved without depending on a search engine or a third-party service. The invention further discloses a webpage classification system, a webpage classification device and a computer-readable storage medium, which achieve the same technical effects.
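The feature-expansion step in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the source: each of the q layers trains p base models on the features accumulated so far and appends their p scores, so the width grows from N to N + p*q; the two toy base models (`CentroidModel`, `StumpModel`), the class `StackingExpander`, the synthetic data, and the final centroid classifier are all hypothetical stand-ins rather than the patent's models.

```python
import math
import random

def _mean(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

class CentroidModel:
    """Toy base model: score near 1 when a sample is closer to the class-1 centroid."""
    def fit(self, X, y):
        self.c0 = _mean([x for x, t in zip(X, y) if t == 0])
        self.c1 = _mean([x for x, t in zip(X, y) if t == 1])
        return self

    def score(self, x):
        d0, d1 = math.dist(x, self.c0), math.dist(x, self.c1)
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

class StumpModel:
    """Toy base model: thresholds one feature at the midpoint of the class means."""
    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        m0 = _mean([x for x, t in zip(X, y) if t == 0])[self.feature]
        m1 = _mean([x for x, t in zip(X, y) if t == 1])[self.feature]
        self.threshold = (m0 + m1) / 2
        self.sign = 1.0 if m1 >= m0 else -1.0
        return self

    def score(self, x):
        return 1.0 if self.sign * (x[self.feature] - self.threshold) > 0 else 0.0

class StackingExpander:
    """Assumed reading of 'q-layer stacking of p base models': every layer
    appends p scores, so N features become N + p*q features."""
    def __init__(self, make_models, q):
        self.make_models = make_models  # callable returning p fresh, unfitted models
        self.q = q
        self.layers = []

    def fit(self, X, y):
        self.layers = []
        current = [list(x) for x in X]
        for _ in range(self.q):
            # Simplification: models are fitted and scored on the same rows.
            # Practical stacking usually uses out-of-fold predictions instead.
            models = [m.fit(current, y) for m in self.make_models()]
            self.layers.append(models)
            current = [row + [m.score(row) for m in models] for row in current]
        return self

    def transform(self, x):
        row = list(x)
        for models in self.layers:
            row = row + [m.score(row) for m in models]
        return row

def make_models():
    return [CentroidModel(), StumpModel(feature=0)]  # p = 2

# Synthetic, well-separated two-class data (purely illustrative).
rng = random.Random(0)
N, P, Q = 3, 2, 2
X = [[rng.gauss(0, 0.5) for _ in range(N)] for _ in range(20)] + \
    [[rng.gauss(5, 0.5) for _ in range(N)] for _ in range(20)]
y = [0] * 20 + [1] * 20

expander = StackingExpander(make_models, Q).fit(X, y)
final = CentroidModel().fit([expander.transform(x) for x in X], y)

def classify(x):
    """Expand the N features to N + p*q, then apply the final classifier."""
    return 1 if final.score(expander.transform(x)) > 0.5 else 0
```

With N = 3, p = 2 and q = 2, `expander.transform` returns a 7-dimensional vector, matching the N + n count in the abstract. In a real setting the base models and the final classifier would be replaced by properly trained learners, and the webpage features would come from the extraction step rather than from random data.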

Description

Technical field

[0001] The present invention relates to the technical field of network security, and more specifically, to a webpage classification method and system, a webpage classification device and a computer-readable storage medium.

Background

[0002] Phishing is a form of online fraud in which criminals imitate the URL and page content of a genuine website to trick users into revealing private information such as account numbers, bank or credit card numbers, and passwords. Criminals usually design the page of a phishing website to look identical to the interface of the genuine website, enticing visitors to submit their account and password.

[0003] In recent years, many researchers have designed practical solutions to the anti-phishing problem. These solutions mainly fall into the following categories: (1) blacklist- and whitelist-based methods; (2) methods that extract text, image or URL characteristics from web pages, and use sea...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9566, G06F16/957, G06F16/958
Inventor: 刘文印, 黎宇坤, 陈旭, 袁华平, 杨振国
Owner: GUANGDONG UNIV OF TECH