Detection method and system for web pages with maliciously injected scripts

A detection method and web page technology, applied in the field of computer networks, that can solve the problem of being unable to find dynamic content web pages, etc.

Inactive Publication Date: 2009-07-01
BEIJING VENUS INFORMATION TECH

AI Technical Summary

Problems solved by technology

The method and system for detecting web pages with maliciously injected scripts described in the present invention overcome a limitation of traditional web security vulnerability scanning methods and systems, which can only find web pages containing script injection vulnerabilities and cannot discover t



Examples


Embodiment 1

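[0059] Embodiment 1 is an example of the cluster analysis performed on a set of dynamic content web pages. Assume that the web crawler has downloaded 8 web pages from the scanned website, with the following URLs:

[0060] 1) /cgi-bin/bbs/printpost.asp?pid=123

[0061] 2) /cgi-bin/bbs/printpost.asp?pid=140

[0062] 3) /documents/teaching/chapterl.htm

[0063] 4) /cgi-bin/authors/authorsdetail?aid=1400

[0064] 5) /documents/teaching/chapter3.html

[0065] 6) /images/teaching/logo.TIF

[0066] 7) /cgi-bin/authors/authorsdetail.asp?aid=1450

[0067] 8) /documents/pdf/introduction.pdf

[0068] First, web pages associated with requests for static Web objects are filtered out according to the file extension of the Web object that each URL requests. Here, the Web objects requested by URLs with the file extensions ".pdf", ".htm", ".TIF" and ".html" are clearly static Web objects, so URL3, URL5, URL6 and URL8 are filtered out. Only the web pages corresponding to URL1, URL2, URL4 and URL7 remain as dynamic content web pages.

[0069] Next, these four URLs are clustered by directory structure and file name, which yields two initial web page clusters: web page cluster 1 is {URL1, URL2}; web page cluster 2 is {URL4, URL7}.

[0070] Finally, each initial web page cluster is clustered again by URL parameter format. It is easy to see that the parameters of the two URLs in web page cluster 1 are "pid=123" and "pid=140", which share the URL parameter format "pid=integer", so the dynamic content web pages corresponding to URL1 and URL2 belong to the same web page cluster. The parameters of the two URLs in web page cluster 2 are "aid=1400" and "aid=1450", which share the URL parameter format "aid=integer", so the dynamic content web pages corresponding to URL4 and URL7 belong to the same web page cluster. The dynamic content web page clusters constructed in this way are shown in Figure 4, where the printpost.asp node 450 represents dynamic content web page cluster 1, which contains URL1 and URL2, and the Adetail.asp node 460 represents dynamic content web page cluster 2, which contains URL4 and URL7.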
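The three steps of Embodiment 1 (extension filtering, clustering by path, clustering by parameter format) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The names `STATIC_EXTENSIONS`, `parameter_format` and `cluster_urls` are hypothetical, the static extension list contains only the four extensions named in the example, and URL 4 is written with the `.asp` extension carried by URL 7 so that the two share a file name, which is an assumption.

```python
import os
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit

# Assumption: only the static extensions named in Embodiment 1.
STATIC_EXTENSIONS = {".pdf", ".htm", ".html", ".tif"}

EXAMPLE_URLS = [
    "/cgi-bin/bbs/printpost.asp?pid=123",
    "/cgi-bin/bbs/printpost.asp?pid=140",
    "/documents/teaching/chapterl.htm",
    # Assumption: ".asp" added so URL 4 shares a file name with URL 7.
    "/cgi-bin/authors/authorsdetail.asp?aid=1400",
    "/documents/teaching/chapter3.html",
    "/images/teaching/logo.TIF",
    "/cgi-bin/authors/authorsdetail.asp?aid=1450",
    "/documents/pdf/introduction.pdf",
]


def is_dynamic(url):
    """Step 1: treat a URL as dynamic unless its extension is known to be static."""
    extension = os.path.splitext(urlsplit(url).path)[1].lower()
    return extension not in STATIC_EXTENSIONS


def parameter_format(url):
    """Step 3 key: replace each parameter value by a coarse type name."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return "&".join(
        f"{name}={'integer' if value.isdigit() else 'string'}"
        for name, value in pairs
    )


def cluster_urls(urls):
    """Group dynamic URLs by (directory and file name, parameter format)."""
    clusters = defaultdict(list)
    for url in urls:
        if is_dynamic(url):
            key = (urlsplit(url).path, parameter_format(url))
            clusters[key].append(url)
    return dict(clusters)


if __name__ == "__main__":
    for key, members in cluster_urls(EXAMPLE_URLS).items():
        print(key, members)
```

Using the pair (path, parameter format) as a single dictionary key merges the second and third steps into one pass; a two-level grouping would give the same clusters for this example.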

[0071] As shown in Figure 5, the ...

Embodiment 2

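[0077] Figure 6 shows an embodiment of converting the example dynamic content web page in Table 1 into a document object model tree.

[0078] Table 1: An example dynamic content web page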

[0079]

[0080]BBS group

[0081]

[0082]

[0083]Good Morning.Alice!

[0084]

[0085]

[0086]

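[0087] In Figure 6, each HTML tag is represented as a node in the document object model tree, and the hierarchical relationships between HTML tags are represented in the tree as subtree and child-node relationships.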
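The tag-to-node mapping described in [0087] can be sketched with Python's standard `html.parser` module. This is only an illustration under stated assumptions: the markup of Table 1 did not survive in this text, so `SAMPLE_HTML` is a hypothetical page that reuses only the two visible strings ("BBS group" and "Good Morning.Alice!"), and the names `Node`, `DomTreeBuilder`, `build_dom_tree` and `to_tuple` are invented for the example. Text content is ignored because only tags become nodes.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they become leaf nodes.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Hypothetical markup: the original Table 1 tags are not available.
SAMPLE_HTML = (
    "<html><head><title>BBS group</title></head>"
    "<body><p>Good Morning.Alice!</p></body></html>"
)


class Node:
    """One HTML tag in the document object model tree."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []


class DomTreeBuilder(HTMLParser):
    """Builds a tree in which nesting of tags becomes parent/child links."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]  # chain of currently open tags

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open tag with this name, if there is one.
        for index in range(len(self._stack) - 1, 0, -1):
            if self._stack[index].tag == tag:
                del self._stack[index:]
                break


def build_dom_tree(html):
    builder = DomTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def to_tuple(node):
    """Convert a Node into a nested (tag, children) tuple for easy comparison."""
    return (node.tag, [to_tuple(child) for child in node.children])


if __name__ == "__main__":
    print(to_tuple(build_dom_tree(SAMPLE_HTML)))
```

The nested `(tag, children)` form returned by `to_tuple` is convenient for comparing trees from different pages of the same cluster.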

Embodiment 3

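[0089] Figure 7 shows an embodiment of extracting the maximum common document object model tree Tg from the document object model trees Tm and Tk. As shown in Figure 7, the document object model tree Tm contains 8 nodes and the document object model tree Tk contains 9 nodes; the extracted maximum common document object model tree Tg contains 7 nodes.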
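One possible way to compute a common tree of this kind is a top-down match that keeps the roots only if their tags agree and then aligns the two ordered child lists so that the total number of matched nodes is as large as possible. The sketch below follows that idea; the text here does not specify the extraction algorithm, so this is an assumed technique rather than the patented one. The trees `TM` and `TK` are hypothetical and are not the trees of Figure 7; they are merely chosen to have 8 and 9 nodes with a 7-node common tree. Trees are written as `(tag, children)` tuples.

```python
def tree_size(tree):
    """Number of nodes in a (tag, children) tree."""
    return 1 + sum(tree_size(child) for child in tree[1])


def common_tree(a, b):
    """Largest top-down common tree of a and b, or None if the root tags differ."""
    tag_a, kids_a = a
    tag_b, kids_b = b
    if tag_a != tag_b:
        return None
    n, m = len(kids_a), len(kids_b)
    # pair[i][j]: common tree of kids_a[i] and kids_b[j], or None.
    pair = [[common_tree(x, y) for y in kids_b] for x in kids_a]
    # best[i][j]: (matched node count, matched subtrees) for kids_a[i:], kids_b[j:].
    best = [[(0, [])] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            options = [best[i + 1][j], best[i][j + 1]]
            if pair[i][j] is not None:
                count, matched = best[i + 1][j + 1]
                options.append(
                    (count + tree_size(pair[i][j]), [pair[i][j]] + matched)
                )
            best[i][j] = max(options, key=lambda option: option[0])
    return (tag_a, best[0][0][1])


# Hypothetical trees, not the ones drawn in Figure 7.
TM = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", []), ("div", [("a", [])])]),
])
TK = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", []), ("div", [("script", [])]), ("p", [])]),
])

if __name__ == "__main__":
    tg = common_tree(TM, TK)
    print(tree_size(TM), tree_size(TK), tree_size(tg))
    print(tg)
```

In this hypothetical pair, the `a` and `script` children of `div` and the trailing `p` of `TK` are the parts that fall outside the common tree.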


Abstract

The invention relates to a method and a system for detecting web pages embedded with malicious scripts, belonging to the technical field of computer networks. The method comprises the following steps: traversing a website to be scanned with a web page crawler and downloading all of its web pages; performing cluster analysis on the downloaded web pages and extracting a web page cluster template; and using the web page cluster template to detect whether the web pages in the cluster contain embedded malicious scripts. The system comprises a web page crawler module, a dynamic content web page filtering module, a dynamic content web page clustering module, a web page cluster template extraction module and an embedded malicious script detection module.

Description

Technical field

[0001] The invention relates to a method and a system for detecting web pages containing maliciously injected scripts, and belongs to the technical field of computer networks.

Background art

[0002] With the development of Internet and Web technology, the Web no longer provides only static content services to Internet users; it can also provide various dynamic Web content services according to user needs. Because Web services are easy to deploy and use, many traditional client/server applications have begun to be converted into Web-based applications, including applications with very high security requirements such as electronic banking and electronic securities.

[0003] While web applications bring convenience to people's life and work, they also bring many security problems, and script injection attacks are the most important of these. The root cause of script injection attacks is that ...


Application Information

IPC(8): H04L12/26, H04L29/06, G06F17/30
Inventor: 叶润国, 胡振宇, 朱钱杭, 李博, 骆拥政, 牛妍萍
Owner: BEIJING VENUS INFORMATION TECH