Webpage malicious JavaScript code recognition and anti-obfuscation method based on hybrid analysis

A hybrid-analysis method for recognizing and deobfuscating malicious JS code, applied in the software and computer fields. It addresses the poor readability of obfuscated code, unknown attack methods, and traditional detection schemes that no longer work on malicious JS code, and it achieves low runtime overhead.

Pending Publication Date: 2019-11-26
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0004] Many methods have been proposed to detect malicious JS code in web pages. However, as obfuscation has become widespread in JS code, many traditional detection methods are no longer suitable for detecting malicious JS code.
For example, most Internet users rely on anti-virus software to detect malicious JS code. However, most popular anti-virus products use signature-based detection schemes, and some even use exact matching; as a result, the average detection accuracy of anti-virus software on obfuscated malicious JS code is below 50%.
In addition, because malicious JS code is hard to read after obfuscation, even when it is detected correctly, the attack technique it uses remains unknown.
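
The weakness of exact-match signatures described above can be illustrated with a minimal sketch. The signature string, the sample snippets, and the `matchesSignature` helper are all hypothetical and are not taken from the patent or from any real anti-virus product.

```javascript
// Hypothetical exact-match signature for a known malicious snippet.
const signature = 'document.write("<iframe src=//evil.example>")';

// Exact matching: flag the code only if it contains the signature verbatim.
function matchesSignature(code) {
  return code.includes(signature);
}

// Unobfuscated sample: contains the signature text as-is.
const plain = 'document.write("<iframe src=//evil.example>");';

// Same intended behavior, but the method name and the markup are assembled
// from string fragments at run time, so the signature text never appears.
const obfuscated =
  'var d = document; d["wri" + "te"]("<ifr" + "ame src=//evil.example>");';

console.log(matchesSignature(plain));      // true
console.log(matchesSignature(obfuscated)); // false
```

Even this trivial string splitting defeats the exact match, which is why the approach here combines static and runtime information instead of relying on the source text alone.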


Embodiment Construction

[0012] The present invention specifically comprises the following steps:

[0013] 1) Collect a large number of website source files that contain malicious JS and a large number of benign web pages that do not, and extract all JS scripts embedded in HTML documents and all JS code stored in JS files as the data set.

[0014] 2) Perform program analysis and feature recognition on the data set extracted in step 1): construct an abstract syntax tree in the syntax analysis stage and perform semantic-level analysis.

[0015] 3) Dynamically instrument the JS code, monitor its runtime state, and extract the execution path and other runtime features. Combine the semantic features and the execution features into feature vectors.

[0016] 4) Train a high-precision classifier based on the random forest algorithm to form the malicious JS detection system.

[0017] 5) On the basis of the dynamic instrumentation analysis in step 3), by...
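
Steps 2) and 3) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `staticFeatures`, `instrument`, the `env` functions `decode` and `write`, and the chosen features are all hypothetical, and crude lexical counts stand in for the features the patent derives from the abstract syntax tree.

```javascript
// Static side (assumption): lexical counts stand in for AST-derived features.
function staticFeatures(source) {
  return {
    evalCalls: (source.match(/\beval\s*\(/g) || []).length,
    fromCharCodeCalls: (source.match(/fromCharCode/g) || []).length,
  };
}

// Dynamic side: replace each named function on `env` with a wrapper that
// appends its name to `trace` before delegating to the original.
function instrument(env, names, trace) {
  for (const name of names) {
    const original = env[name];
    env[name] = (...args) => {
      trace.push(name);
      return original(...args);
    };
  }
}

// Hypothetical runtime environment exposed to the script under analysis.
const trace = [];
const output = [];
const env = {
  decode: (s) => s.split("").reverse().join(""),
  write: (s) => output.push(s),
};
instrument(env, ["decode", "write"], trace);

// Hypothetical script: decodes a reversed string, then writes it out.
const source = 'write(decode("olleh"));';
new Function("decode", "write", source)(env.decode, env.write);

// Combine static and dynamic features into one numeric vector.
const s = staticFeatures(source);
const featureVector = [
  s.evalCalls,
  s.fromCharCodeCalls,
  trace.length,
  trace.filter((name) => name === "write").length,
];

console.log(trace);         // [ 'decode', 'write' ]
console.log(featureVector); // [ 0, 0, 2, 1 ]
```

A vector of this shape is the kind of input that a classifier such as the random forest in step 4) would be trained on.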


Abstract

The invention relates to a webpage malicious JavaScript code recognition and anti-obfuscation method based on hybrid analysis. The method comprises the following steps: first, collecting the relevant webpage source code and extracting the malicious JS files in the source code and the malicious JS code embedded in HTML documents; then, constructing an abstract syntax tree in the syntax analysis stage and representing its nodes as conventional JS objects for program analysis and feature extraction; then, instrumenting the JS code by overriding the basic operations that need to be monitored at runtime, dynamically monitoring the state and information during JS execution, and extracting the execution trace and dynamic runtime features; then, encoding the dynamic and static features as feature vectors and training a malicious JS code recognition model based on a random forest algorithm; and finally, based on the dynamic instrumentation method, monitoring and recording operations related to memory overwriting in order to effectively deobfuscate obfuscated malicious JS code.
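
The anti-obfuscation idea in the abstract, recording what the code produces at run time rather than reading its obfuscated source, can be sketched as below. This is only an illustration under assumptions: the abstract does not list which operations are monitored, so hooking `eval` is an assumed example, and `sample`, `captureEvalPayloads`, and `recordingEval` are hypothetical names. The sketch is not a security sandbox and should not be run on untrusted code.

```javascript
// Hypothetical obfuscated sample: the payload is hidden as character codes
// and only becomes readable at run time, when it is handed to eval.
const sample = "eval(String.fromCharCode(97,108,101,114,116,40,49,41));";

// Run `code` with `eval` shadowed by a recording stub. Assumption: the sample
// reaches eval through the plain identifier, not through an indirect reference.
function captureEvalPayloads(code) {
  const captured = [];
  const recordingEval = (payload) => {
    // Record the decoded code instead of executing it.
    captured.push(String(payload));
    return undefined;
  };
  // The Function constructor creates a sloppy-mode function, so a parameter
  // named `eval` is allowed and shadows the built-in inside the sample.
  const run = new Function("eval", code);
  run(recordingEval);
  return captured;
}

const payloads = captureEvalPayloads(sample);
console.log(payloads); // [ 'alert(1)' ]
```

The recorded string is the readable form of the hidden payload, which is the information an analyst needs to see what the code actually does.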

Description

Technical field

[0001] The invention belongs to the field of computer technology, in particular software technology. It proposes a method for detecting malicious JavaScript (JS) code in webpages, together with an anti-obfuscation method, based on hybrid analysis (a combination of dynamic and static analysis). The method can effectively identify and intercept malicious JS code in current webpages and, at the same time, effectively deobfuscate obfuscated malicious JS code.

Background art

[0002] As one of the most popular scripting languages in the world, JS plays an important role in web-based applications and services, and is used by millions of web pages to optimize interface design, validate form data, check browser information, respond to browser events, control cookies, and more.

[0003] Many features of JS bring great convenience to the development of browser clients and servers. First of all, as a typical dynamic programming language,...


Application Information

IPC(8): G06F21/56
CPC: G06F21/563, G06F21/566
Inventors: 许蕾, 何欣程, 查春柳, 陈林, 徐宝文
Owner: NANJING UNIV