Webpage recognition method, device and system

A webpage recognition technology, applied in the computer field, that addresses the low recognition rate of fraudulent webpages and improves the recognition rate, recognition efficiency, and network security.

Active Publication Date: 2012-07-18
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0007] The purpose of the embodiments of the present invention is to provide a webpage identification method, device and system, aiming to solve the problem that the prior art, which identifies fraudulent webpages by webpage matching (for example, against manually entered malicious webpages), has a low identification rate for fraudulent webpages.


Examples

Embodiment 1

[0037] A hotlinking webpage is a kind of fraudulent webpage. Hotlinking means that a webpage provider that does not itself provide a service uses technical means to present, on its own website, selected service content of other service providers to end users, in order to fraudulently gain page views and click-through rates. A large number of phishing webpages, such as fake Taobao and fake online banking pages, obtain the CSS style sheets, images, Flash and other elements of official webpages through hotlinking and output them in webpages that counterfeit the official websites. Their content closely resembles the official webpage and lures users; the hyperlinks behind the payment and submit buttons jump to pages set up to steal the user's private information, such as account numbers and passwords. The counterfeited sites are typically those from which economic benefit can be obtained, such as payment websites and online banking.
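As a rough illustration of the hotlinking signal described in [0037], the following Python sketch counts the embedded resources that a page loads from hosts other than its own. The names `ResourceCollector` and `hotlinked_hosts`, the choice of tags, and the example URLs are assumptions made for illustration; they are not taken from the patent text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collects URLs of embedded resources (images, style sheets, scripts, Flash)."""

    # Hypothetical choice of tags and the attribute that carries the resource URL.
    RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href", "embed": "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attr_name = self.RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if value:
            # Resolve relative URLs against the page URL.
            self.resources.append(urljoin(self.base_url, value))


def hotlinked_hosts(page_url, html):
    """Return {foreign host: number of embedded resources served from it}."""
    page_host = urlparse(page_url).hostname
    collector = ResourceCollector(page_url)
    collector.feed(html)
    counts = {}
    for url in collector.resources:
        host = urlparse(url).hostname
        if host and host != page_host:
            counts[host] = counts.get(host, 0) + 1
    return counts
```

A page whose style sheets and images come mostly from a single well-known host that is not its own would be a candidate for closer inspection; the count alone is not proof of fraud, since ordinary pages also embed third-party resources.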

[0038] Figure 1 shows the implementation flow of the webpage identification method ...

Embodiment 2

[0045] The embodiment of the present invention calculates, according to the fraudulent webpage categories preset by the user, the probability that the input webpage belongs to a fraudulent webpage category, and judges from it whether the input webpage is a fraudulent webpage. This realizes the identification of fraudulent webpages and effectively improves both the identification rate of fraudulent webpages and network security.
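The text available here does not say how the category probability is computed. Purely as an assumed stand-in, the sketch below uses a multinomial naive Bayes model with add-one smoothing over page terms; the function names, the `"fraud"` label, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter


def train(docs_by_class):
    """docs_by_class: {category label: list of token lists}. Returns a simple model."""
    vocab = {t for docs in docs_by_class.values() for d in docs for t in d}
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    model = {"vocab": vocab, "classes": {}}
    for label, docs in docs_by_class.items():
        counts = Counter(t for d in docs for t in d)
        model["classes"][label] = {
            "log_prior": math.log(len(docs) / total_docs),
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model


def class_probabilities(model, tokens):
    """Posterior probability of each category for a tokenized page."""
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, c in model["classes"].items():
        score = c["log_prior"]
        for t in tokens:
            if t in model["vocab"]:  # ignore terms never seen in training
                score += math.log((c["counts"][t] + 1) / (c["total"] + vocab_size))
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    exp = {k: math.exp(s - top) for k, s in log_scores.items()}
    z = sum(exp.values())
    return {k: e / z for k, e in exp.items()}


def is_fraudulent(model, tokens, fraud_label="fraud", threshold=0.5):
    """Judge the page fraudulent when the fraud probability reaches the threshold."""
    return class_probabilities(model, tokens)[fraud_label] >= threshold
```

The categories would be the ones preset by the user; any classifier that yields a per-category probability could replace this one.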

[0046] Figure 2 shows the implementation flow of the webpage identification method provided by the second embodiment of the present invention, detailed as follows:

[0047] In step S201, the characteristic entries of the input webpage and the characteristic entries of the link webpage corresponding to the hyperlink in the input webpage are obtained.
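A minimal sketch of step S201, assuming the characteristic entries are simply lowercase word tokens of the visible text. The patent excerpt does not define the extraction, and Chinese pages would need a word segmenter instead of the regular expression used here. `PageParser`, `feature_entries`, `collect_entries`, and the injected `fetch` callable are hypothetical names.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Gathers visible text and hyperlink targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip = 0  # depth inside <script>/<style>, whose text is not visible

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_parts.append(data)


def feature_entries(html):
    """Return (list of term tokens, list of raw hyperlink targets)."""
    parser = PageParser()
    parser.feed(html)
    text = " ".join(parser.text_parts).lower()
    return re.findall(r"[a-z0-9]+", text), parser.links


def collect_entries(page_url, fetch):
    """fetch(url) -> HTML string. Returns entries of the page and of each linked page."""
    entries, links = feature_entries(fetch(page_url))
    linked = {}
    for href in links:
        target = urljoin(page_url, href)
        linked[target] = feature_entries(fetch(target))[0]
    return entries, linked
```

Passing `fetch` as a parameter keeps the sketch independent of whether the page came from a browser request or a crawler, the two sources mentioned in [0048].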

[0048] In the embodiment of the present invention, when a client (such as a browser) requests to access a webpage, or grabs a webpage through a web crawler program (Crawler), the webpage that is requ...

Embodiment 3

[0065] In the embodiment of the present invention, the hotlink analyzer is used to analyze and judge whether the webpage is a fraudulent webpage of the hotlinking type, such as a phishing webpage. [...] list) to judge the similarity between the input webpage and the legitimate webpage, so as to determine whether the webpage is a hotlinking webpage.

[0066] In the embodiment of the present invention, when the analyzer is a hotlink analyzer, hotlink analysis of the input webpage identifies hotlink-type fraudulent webpages such as phishing webpages. By analyzing the information on the hotlinking webpage, the correct legitimate webpage (the official webpage) is determined, and its information is provided to the user.
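One way to picture the hotlink analyzer of [0065] and [0066] is the sketch below. It assumes a hypothetical list that maps the resource hosts of legitimate sites to their official pages, and treats the share of a page's resources drawn from one such host as the similarity measure. The list contents, `analyze_hotlinks`, and the `min_share` threshold are illustrative assumptions, not details from the patent.

```python
from urllib.parse import urlparse

# Hypothetical list of legitimate sites: resource host -> official page URL.
LEGITIMATE_SITES = {
    "img.bank.example": "https://www.bank.example/",
    "static.shop.example": "https://www.shop.example/",
}


def analyze_hotlinks(page_url, resource_urls, legit=LEGITIMATE_SITES, min_share=0.5):
    """Flag a page as hotlinking and report the official page it appears to imitate."""
    negative = {"hotlink": False, "official": None}
    page_host = urlparse(page_url).hostname
    if not resource_urls:
        return negative

    # Count resources that come from a listed legitimate host.
    hits = {}
    for url in resource_urls:
        host = urlparse(url).hostname
        if host in legit and host != page_host:
            hits[host] = hits.get(host, 0) + 1
    if not hits:
        return negative

    best = max(hits, key=hits.get)
    official = legit[best]
    # The official site embedding its own resources is not hotlinking.
    if urlparse(official).hostname == page_host:
        return negative

    share = hits[best] / len(resource_urls)
    if share < min_share:
        return negative
    return {"hotlink": True, "official": official}
```

Returning the official URL alongside the verdict mirrors the stated goal of pointing the user to the correct legitimate webpage.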

[0067] Figure 3 shows the implementation flow of the webpage identification method provided by the third embodiment of the present invention, detailed as follows:

[0068] In step S301, the information of the input ...

Abstract

The invention belongs to the field of computer technology and provides a webpage recognition method, device and system. The method comprises acquiring the page information of an input webpage, analyzing the page information with a pre-constructed analyzer within a preset time period, and outputting information indicating whether the webpage is a fraudulent webpage. By analyzing the page information with the pre-constructed analyzer and outputting this information, the method addresses the low recognition rate and low recognition efficiency that result from recognizing fraudulent webpages by webpage matching in the prior art, improving the recognition rate and recognition efficiency and thereby enhancing network security.

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to a webpage identification method, device and system.

Background

[0002] As the value of the Internet continues to increase, Internet security has become a focus of users' attention. At present, information theft by fraudulent websites, typified by phishing, has become a focus of Internet security products. However, the existing technology mainly identifies fraudulent webpages by matching against malicious webpages (for example, manually entered malicious webpages), using malicious seed page matching based on cosine similarity or webpage deduplication algorithms (such as the shingle algorithm), keyword matching, and the like. This existing identification technology has the following problems:

[0003] (1) Similarity matching needs to continuously add a large numb...

Application Information

Patent Type & Authority Applications (China)
IPC IPC(8): G06F21/00, G06F17/30
Inventor 孙炜, 冯庆磊, 黄利华, 刘松
Owner TENCENT TECH (SHENZHEN) CO LTD