Method and device for detecting web page dark chain based on machine learning

A machine-learning-based dark link detection technology, applied in the field of network security, that addresses the high false alarm rate, narrow scope of application, and time-consuming manual detection of existing methods.

Status: Inactive. Publication date: 2019-01-15
HANGZHOU ANHENG INFORMATION TECH CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] Keyword detection has a high false alarm rate, and manual detection is time-consuming; HTTP header access detection can only detect particular kinds of dark links, so its scope of application is narrow.

Examples

Embodiment 1

[0062] Figure 1 is a flowchart of a machine-learning-based method for detecting dark links in web pages according to the first embodiment of the present invention.

[0063] Referring to Figure 1, the method includes the following steps:

[0064] Step S101: obtain webpage source code data. The webpage source code data includes first source code data and second source code data; the first source code data contains dark links, and the second source code data does not.

[0065] Here, a web page detection task can be entered in the user interface of the client, and the client sends the task to the server. On receiving the task, the server returns response information that includes the webpage source code data. The webpage source code data serves as the training set for the classification model. The webpage source code data is fed into the following steps to obtain ...
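Step S101 can be pictured as assembling one labeled training set from the two kinds of source code data. The following is a minimal sketch only: the `LabeledPage` type, the `build_training_set` helper, the URL-to-source dictionaries, and the sample pages are all hypothetical names and data chosen for illustration, not part of the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledPage:
    url: str
    source: str          # raw webpage source code returned by the server
    has_dark_link: bool  # True = first source code data, False = second


def build_training_set(first_source, second_source):
    """Merge both kinds of source code data into one labeled list.

    Both arguments map a URL to its page source (an assumed input shape).
    """
    pages = [LabeledPage(u, s, True) for u, s in first_source.items()]
    pages += [LabeledPage(u, s, False) for u, s in second_source.items()]
    return pages


# Hypothetical sample data: one page with a hidden link, one clean page.
first = {
    "http://a.example/": '<div style="display:none">'
                         '<a href="http://bet.example/">casino</a></div>',
}
second = {"http://b.example/": "<p>Welcome to our library.</p>"}

training_set = build_training_set(first, second)
for page in training_set:
    print(page.url, page.has_dark_link)
```

Keeping the label next to the source makes it straightforward to hand the same list to whatever classifier training follows.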

Embodiment 2

[0095] Figure 5 is a schematic diagram of a machine-learning-based device for detecting dark links in web pages according to the second embodiment of the present invention.

[0096] Referring to Figure 5, the device includes: an acquisition unit 10, a generation unit 20, a preprocessing unit 30, a matching unit 40, a webpage negative score calculation unit 50, a determination unit 60, a dark link division score calculation unit 70, and a comparison unit 80.

[0097] The acquisition unit 10 is configured to obtain webpage source code data. The webpage source code data includes first source code data and second source code data; the first source code data contains dark links, and the second source code data does not;

[0098] The generation unit 20 is configured to generate a dark link negative text vocabulary from the first source code data, according to the degree of negative sentiment of its text;

[0099] The preprocessing unit 30 is used to preprocess the webpage so...
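As a rough illustration of how the generation, preprocessing, and matching units might fit together, here is a minimal sketch. It rests on loud assumptions: the excerpt does not say how the negative sentiment degree is measured or what preprocessing involves, so `generate_vocabulary` simply keeps words that occur in dark-link pages and never in clean pages, and `preprocess` just strips tags and splits lowercase words. All function names and sample pages are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of a page, discarding the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def preprocess(source):
    """Assumed preprocessing: strip tags, lowercase, split into words."""
    parser = _TextExtractor()
    parser.feed(source)
    parser.close()
    return re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower())


def generate_vocabulary(first_sources, second_sources):
    """Stand-in for the generation unit (NOT the patent's sentiment scoring):
    keep words seen in dark-link pages but never in clean pages."""
    dark = {w for s in first_sources for w in preprocess(s)}
    clean = {w for s in second_sources for w in preprocess(s)}
    return dark - clean


def match(source, vocabulary):
    """Matching step: words of the page that appear in the vocabulary."""
    return [w for w in preprocess(source) if w in vocabulary]


# Hypothetical sample pages.
first_sources = ['<div style="display:none"><a href="http://x.example/">'
                 'casino betting bonus</a></div><p>city news</p>']
second_sources = ["<p>city news and weather</p>"]

vocabulary = generate_vocabulary(first_sources, second_sources)
matched = match('<p>Weather today</p><a href="#">Casino bonus</a>', vocabulary)
print(sorted(vocabulary), matched)
```

The matched words would then be the input to the negative score calculation performed by the later units.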

Abstract

The invention provides a machine-learning-based method and device for detecting dark links in web pages. The method works from webpage source code data that contains dark links and webpage source code data that does not; when the negative score of a web page is greater than the dark link division score, the source code data of that web page is determined to contain a dark link. The approach recognizes highly obfuscated dark link code well, replaces traditional manual detection, and automates dark link recognition.
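The decision rule in the abstract (a page is flagged when its negative score exceeds the dark link division score) can be sketched as follows. The scoring formula is an assumption made for illustration, since the excerpt does not define it: here the negative score is the fraction of page tokens found in a negative vocabulary. The vocabulary, token lists, and `DIVISION_SCORE` value are hypothetical.

```python
def negative_score(tokens, negative_vocabulary):
    """Assumed scoring: fraction of tokens found in the negative vocabulary."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in negative_vocabulary)
    return hits / len(tokens)


def contains_dark_link(tokens, negative_vocabulary, division_score):
    """Rule from the abstract: flag when negative score > division score."""
    return negative_score(tokens, negative_vocabulary) > division_score


# Hypothetical inputs.
vocab = {"casino", "betting"}
suspicious = ["welcome", "casino", "betting", "casino"]
clean = ["welcome", "to", "the", "library"]
DIVISION_SCORE = 0.5  # hypothetical threshold, not a value from the patent

print(contains_dark_link(suspicious, vocab, DIVISION_SCORE))
print(contains_dark_link(clean, vocab, DIVISION_SCORE))
```

Note the strict inequality: a page whose score equals the division score is not flagged under this reading of the abstract.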

Description

Technical field

[0001] The present invention relates to the technical field of network security, and in particular to a method and device for detecting dark links in web pages based on machine learning.

Background

[0002] As the number of websites increases, there are more and more web pages to check for dark links. At present, dark links are detected either by crawling a website and running keyword detection on the crawled pages, or by visiting the website with different HTTP headers and checking whether the content returned by the two visits is consistent.

[0003] Keyword detection has a high false positive rate, and manual detection is time-consuming; HTTP header access detection can only detect particular kinds of dark links, so its scope of application is narrow.

Summary of the invention

[0004] In view of this, the purpose of the present invention is to provide a method and device for detecting dark links in webpages based on machine learning, which can recogn...
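The two conventional approaches described in the background can be sketched roughly as below, which also makes their weaknesses easier to see. This is an illustrative sketch, not the method of the invention: the function names, the choice of `User-Agent` as the header that varies, and the byte-for-byte comparison are all assumptions.

```python
import urllib.request


def keyword_hits(page_text, keywords):
    """Keyword detection: return the keywords found in the page text."""
    lowered = page_text.lower()
    return [k for k in keywords if k.lower() in lowered]


def fetch_body(url, headers, timeout=10):
    """Fetch a page with the given HTTP headers (performs a network request)."""
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def responses_differ(url, headers_a, headers_b, fetch=fetch_body):
    """HTTP header access detection: visit twice with different headers and
    report whether the returned content is inconsistent."""
    return fetch(url, headers_a) != fetch(url, headers_b)


# A legitimate news report still triggers the keyword check (false positive).
print(keyword_hits("Report: police close illegal casino", ["casino", "lottery"]))
```

`responses_differ` accepts a `fetch` callable so the comparison can be exercised without network access. It only catches pages that serve different content depending on the request headers, which matches the narrow scope of application noted in [0003].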

Application Information

IPC(8): G06F16/953, G06F16/955, G06F17/27
CPC: G06F40/289, G06F40/30
Inventor: 史卓颖, 范渊, 曾建东, 金海俊, 王世晋, 王世有, 王辉, 徐丽丽
Owner: HANGZHOU ANHENG INFORMATION TECH CO LTD