Source code vulnerability detection method based on deep learning

A deep-learning-based source code vulnerability detection technology. It addresses the problems that existing methods cannot cover all execution paths of a program, incur a high constraint-solving overhead, and suffer from high false negative and false positive rates. Its effects are to reduce dependence on domain expert knowledge and to improve detection effectiveness and efficiency.

Active Publication Date: 2019-07-12
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

The main problem with existing, relatively mature static analysis methods is that they rely heavily on domain expert knowledge: experts must spend a great deal of time and effort analyzing the program's source code, and the false negative and false positive rates are relatively high. Dynamic analysis methods judge whether a program contains vulnerabilities by analyzing information generated during program execution, and are usually applied to executable files.
Mainstream dynamic program analysis includes taint analysis and symbolic execution. Taint analysis cannot cover all execution paths of a program; it also requires domain experts to spend a great deal of time and effort analyzing the program, and it has a high false positive rate. Symbolic execution replaces the program's inputs with symbolic values. It can in theory cover all execution paths, but it is difficult to apply in practice because of the high cost of constraint solving.

Examples

Embodiment 1

[0032] Referring to Figure 1, a source code vulnerability detection method based on deep learning comprises the following steps:

[0033] S1. Calculate the code metrics of the functions in the source code file and integrate them into a code metrics vector V_cm;

[0034] S2. Taking the function in the source code as the basic unit, use deep learning to extract function features automatically; extract the abstract syntax tree (AST), control flow graph (CFG), and program dependency graph (PDG) from the source code;

[0035] S21. Use a bidirectional long short-term memory network (BLSTM) to automatically extract features from the function's AST;

[0036] S211. Perform a depth-first traversal of the AST, and store the symbols in the AST, in traversal order, in a symbol vector;
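
The depth-first traversal in S211 can be illustrated with a minimal sketch. The source does not name the programming language under analysis or the parser used, so this example makes two assumptions: it uses Python's standard `ast` module as a stand-in parser, and it treats each node's type name as the stored symbol. The function name `ast_symbol_vector` is hypothetical.

```python
import ast
from typing import List


def ast_symbol_vector(source: str) -> List[str]:
    """Collect AST node type names in depth-first (pre-order) order.

    Assumption: the node type name stands in for the AST symbol;
    the source does not say which node attributes are stored.
    """
    symbols: List[str] = []

    def visit(node: ast.AST) -> None:
        symbols.append(type(node).__name__)      # record the node first
        for child in ast.iter_child_nodes(node):  # then descend into children
            visit(child)

    visit(ast.parse(source))
    return symbols


example = "def f(a):\n    return a + 1\n"
symbol_vector = ast_symbol_vector(example)
print(symbol_vector)
```

Because the traversal is pre-order, a parent node such as `Return` always appears before the expression nodes nested inside it, so the order of the vector reflects the nesting of the tree.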

[0037] S212. Use the set of symbols in all symbol vectors as a lexicon, perform word embedding on the symbols and convert the AST symbol vector into a numerical vector, and select an appr...
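
The lexicon construction and symbol-to-number conversion described in S212 might look like the following sketch. It covers only the part of the step that is stated above. The helper names `build_lexicon` and `to_numeric` are hypothetical, and reserving id 0 for symbols outside the lexicon is an assumption, not something the source specifies. The word embedding itself (mapping each id to a dense vector) is not shown.

```python
from typing import Dict, List


def build_lexicon(symbol_vectors: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to every distinct symbol across all vectors.

    Assumption: ids start at 1 so that 0 can mark unknown symbols.
    """
    lexicon: Dict[str, int] = {}
    for vector in symbol_vectors:
        for symbol in vector:
            if symbol not in lexicon:
                lexicon[symbol] = len(lexicon) + 1
    return lexicon


def to_numeric(vector: List[str], lexicon: Dict[str, int]) -> List[int]:
    """Convert a symbol vector to integer ids; unknown symbols map to 0."""
    return [lexicon.get(symbol, 0) for symbol in vector]


vectors = [["FunctionDef", "Return", "BinOp"], ["FunctionDef", "If", "Return"]]
lexicon = build_lexicon(vectors)
numeric = [to_numeric(v, lexicon) for v in vectors]
print(numeric)  # [[1, 2, 3], [1, 4, 2]]
```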

Embodiment 2

[0047] This embodiment is consistent with Embodiment 1. Its prerequisite is that a large-scale software vulnerability database is available, from which the type of each vulnerability and its location in the source code can be clearly determined. From this database, source code that contains a given type of vulnerability and is written in the same programming language is collected as the dataset.

[0048] A method for detecting source code vulnerabilities based on deep learning, comprising the following steps:

[0049] S1. Calculate the code metrics of the functions in the source code; these include statistical metrics and complexity metrics. Among the statistical metrics, the line-count statistics include: total number of lines, number of code lines, number of blank lines, number of comment lines, number of preprocessed code lines, number of inactive lines, ratio ...
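
A few of the line-count statistics listed in S1 can be sketched as follows. This is only an illustration under several assumptions that the source does not confirm: the input is C-like source, a comment line is one that starts with `//`, a preprocessed code line is read as one that starts with `#`, and code lines are all lines that are neither blank nor comments. The function name `line_count_metrics` is hypothetical, and inactive lines, ratios, and complexity metrics are not covered.

```python
from typing import Dict, List


def line_count_metrics(source: str) -> Dict[str, int]:
    """Compute simple line-count statistics for one function's source text."""
    stripped = [line.strip() for line in source.splitlines()]
    total = len(stripped)
    blank = sum(1 for s in stripped if not s)
    comment = sum(1 for s in stripped if s.startswith("//"))   # assumed comment style
    preprocessor = sum(1 for s in stripped if s.startswith("#"))  # assumed reading
    code = total - blank - comment  # assumption: preprocessor lines count as code
    return {
        "total": total,
        "code": code,
        "blank": blank,
        "comment": comment,
        "preprocessor": preprocessor,
    }


sample = "#include <stdio.h>\n\n// entry point\nint main(void) {\n    return 0;\n}\n"
metrics = line_count_metrics(sample)
# Integrate the metrics into one vector in a fixed order.
metrics_vector: List[int] = [
    metrics[k] for k in ("total", "code", "blank", "comment", "preprocessor")
]
print(metrics_vector)  # [6, 4, 1, 1, 1]
```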

Abstract

The invention provides a source code vulnerability detection method based on deep learning. The method automatically completes feature extraction from source code using deep learning, and constructs a vulnerability detection model with a random deep forest algorithm that combines code metrics with the automatically extracted source code features. The method is highly automated, reduces dependence on domain expert knowledge, greatly reduces the cost of code auditing, and improves code auditing efficiency. Compared with other methods that use deep learning for vulnerability detection, it preserves the syntactic and semantic information of the code to the greatest extent by combining multiple representations of the code, so that the features automatically extracted by the deep learning algorithm characterize the code better. In addition, using common code metrics as detection features further improves the detection effect.

Description

Technical field

[0001] The invention relates to the technical field of network security, and more specifically to a method for detecting source code vulnerabilities based on deep learning.

Background

[0002] In today's highly informatized environment, all aspects of people's lives are closely tied to software. In daily life, people communicate through instant messaging software, shop online through shopping software, and pay through payment software. Software also plays an important role in organizations of all kinds, such as school financial systems, self-service systems in various institutions, and database management systems in enterprises. Because errors can occur in software design, implementation, and use, most software inevitably contains vulnerabilities. Once software vulnerabilities are exploited by criminals, it will not only directly damage the interests of software users, but also indirectly affect the interests of software development compan...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, G06K9/62
CPC: H04L63/1433, G06F18/214
Inventor: 金舒原, 吴跃隆
Owner: SUN YAT SEN UNIV