Phishing webpage detection method based on machine learning

A phishing webpage detection method based on machine learning technology, applied in network data retrieval, website content management, and other database retrieval. It addresses the problems that existing methods extract few features and do not consider search-engine accuracy. Its stated effects are a reduced number of detections for legitimate webpages and good generalization.

Status: Inactive
Publication Date: 2019-12-13
HANGZHOU ANHENG INFORMATION TECH CO LTD
4 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

The existing method based on webpage relevance treats the webpage as an inseparable whole, working from the link relevance, search relevance, and text relevance of what is embedded in the webpage and from the overall relevance of the webpage, and compresses it. That detection method extracts few features and does not take search-engine accuracy into account.


Examples


Embodiment 1

[0041] Embodiment 1: the phishing webpage detection method based on machine learning, as shown in Figures 1-2, includes the following steps:

[0042] S1) Filtering legitimate webpages using a search engine:

[0043] First, the title tag of the webpage is extracted as the search keyword. The Title tag, located inside the Head tag of the HTML, defines the title of the webpage document. Because the title a search engine shows for indexed content is usually the content of the webpage's Title tag, the Title tag often contains the page's most important keywords. A search engine's internal strategy can help us quickly find the most relevant legitimate webpages, but it cannot filter phishing webpages effectively. We therefore use the search engine only to determine whether a webpage is legitimate, and do not use it to judge whether a webpage is a phishing webpage. With the help of search engines, we can reduce the detection time of legitima...
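The title-extraction step described in [0043] can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation. The names `TitleExtractor` and `extract_search_keyword` are hypothetical, and the source does not say how the extracted keyword is submitted to a search engine or how the results are compared, so that part is left out.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TitleExtractor(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> also matches.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> Optional[str]:
        # Collapse runs of whitespace; return None when no title text exists.
        text = " ".join("".join(self._parts).split())
        return text or None


def extract_search_keyword(html: str) -> Optional[str]:
    """Return the page title to use as the search keyword, or None."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return parser.title


page = "<html><head><TITLE> Example  Bank - Login </TITLE></head><body></body></html>"
keyword = extract_search_keyword(page)
```

A page without a Title tag yields `None`; how such a page should be handled is not specified in the visible text, so a caller would have to decide that separately.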


Abstract

The invention provides a phishing webpage detection method based on machine learning, which comprises the following steps: S1, judging whether a webpage to be detected is a legitimate webpage, and if not, executing step S2; S2, extracting the URL of the webpage obtained in step S1; and S3, judging whether the webpage obtained in step S2 is a legitimate webpage or a phishing webpage using a phishing webpage detection method based on a logistic regression algorithm. The detection algorithm combines a webpage feature set construction technique, a webpage filtering technique, and a logistic regression classification algorithm to detect phishing webpages. The method can effectively reduce the number of legitimate webpages that must be detected, and achieves good detection of phishing webpages that use evasion techniques.
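The three steps in the abstract can be sketched as a small pipeline. This is only an illustration under stated assumptions: the visible text does not list the feature set, so the four URL features in `url_features` are hypothetical, the search-engine filter of S1 is passed in as a caller-supplied function, and `TinyLogisticRegression` is a plain stochastic-gradient-descent stand-in rather than the patent's trained model.

```python
import math
from typing import Callable, List, Sequence
from urllib.parse import urlparse


def url_features(url: str) -> List[float]:
    """S2: hypothetical URL features (the source does not list its own)."""
    host = urlparse(url).hostname or ""
    return [
        len(url) / 100.0,                                  # scaled URL length
        float(host.count(".")),                            # dots in the host
        1.0 if "@" in url else 0.0,                        # '@' in the URL
        1.0 if host.replace(".", "").isdigit() else 0.0,   # numeric (IP) host
    ]


def sigmoid(z: float) -> float:
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinyLogisticRegression:
    """Logistic regression trained by per-sample gradient descent."""

    def __init__(self, n_features: int, lr: float = 0.1, epochs: int = 200):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.epochs = epochs

    def predict_proba(self, x: Sequence[float]) -> float:
        return sigmoid(self.b + sum(w * v for w, v in zip(self.w, x)))

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> None:
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                err = self.predict_proba(xi) - yi  # gradient of the log loss
                self.w = [w - self.lr * err * v for w, v in zip(self.w, xi)]
                self.b -= self.lr * err


def detect(url: str,
           passes_search_filter: Callable[[str], bool],
           model: TinyLogisticRegression,
           threshold: float = 0.5) -> str:
    # S1: pages the search-engine filter accepts are treated as legitimate.
    if passes_search_filter(url):
        return "legitimate"
    # S2 + S3: classify the remaining pages from their URL features.
    p = model.predict_proba(url_features(url))
    return "phishing" if p >= threshold else "legitimate"


# Toy training data, invented purely for illustration (label 1 = phishing).
train_urls = ["http://10.0.0.1/a", "http://example.com/"]
model = TinyLogisticRegression(n_features=4)
model.fit([url_features(u) for u in train_urls], [1, 0])
```

Running the filter first means the classifier only sees pages that S1 could not confirm as legitimate, which is how the abstract describes reducing the number of legitimate webpages that need detection.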

Description

Technical field

[0001] The invention relates to a method for detecting phishing webpages, in particular to a method for detecting phishing webpages based on machine learning.

Background technique

[0002] The existing phishing webpage detection methods include: 1. a detection method and device for jumping (redirecting) phishing webpages; 2. a phishing detection method based on webpage relevance.

[0003] A detection method and device for jumping phishing webpages. This detection method considers only the characteristics of the webpage's URL and of the URL after the jump. It detects phishing webpages by checking whether the clustering entity set corresponding to the URL set under test shares a cluster entity with a preset cluster information base. The method works along a single dimension and does not identify new phishing attack webpages well.

[0004] A phishing detection method based on webpage relevan...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, G06F16/955, G06F16/958
CPC: G06F16/9566, G06F16/986, H04L63/1483
Inventor: 范如, 范渊
Owner: HANGZHOU ANHENG INFORMATION TECH CO LTD