A source code vulnerability detection method based on deep learning

A deep-learning-based technique for source code vulnerability detection. It addresses the problems that existing methods cannot cover all execution paths of a program, incur high solving overhead, and have high false-negative and false-positive rates. It reduces dependence on domain expert knowledge and improves both detection effectiveness and efficiency.

Active Publication Date: 2021-04-02
SUN YAT SEN UNIV
3 cites, 0 cited by

AI Technical Summary

Problems solved by technology

The main problem with existing, relatively mature static analysis methods is that they rely heavily on domain expert knowledge: experts must spend a great deal of time and effort analyzing the program's source code, and the false-negative and false-positive rates remain relatively high. Dynamic analysis methods judge whether a program contains vulnerabilities by analyzing the information generated during program execution, and are usually applied to executable files.
Mainstream dynamic analysis includes taint analysis and symbolic execution. Taint analysis cannot cover all execution paths of the program; it also requires domain experts to spend a great deal of time and effort analyzing the program, and it has a high false-positive rate. Symbolic execution runs the program on symbolic inputs instead of concrete values. In theory it can cover all execution paths, but the high cost of solving makes it difficult to apply in practice.

Examples

Embodiment 1

[0032] Referring to Figure 1, a source code vulnerability detection method based on deep learning comprises the following steps:

[0033] S1. Calculate the code metrics of the functions in the source code file and integrate them into a code metrics vector V_cm;

[0034] S2. Taking the function in the source code as the basic unit, use deep learning to extract function features automatically; extract the abstract syntax tree (AST), control flow graph (CFG), and program dependency graph (PDG) from the source code;

[0035] S21. Use a bidirectional long short-term memory network (BLSTM) to automatically extract features from the function AST;

[0036] S211. Perform a depth-first traversal of the AST and store the symbols in the AST, in order, in a symbol vector;

[0037] S212. Use the set of symbols in all symbol vectors as a lexicon, perform word embedding on the symbols and convert the AST symbol vector into a numerical vector, and select an appr...
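
Steps S211 and S212 can be illustrated with a minimal sketch. It makes several assumptions that are not in the original text: Python's standard `ast` module stands in for a parser of the target language, the AST node type name is used as the "symbol", and the helper names `dfs_symbols` and `build_lexicon` are hypothetical. The sketch only maps symbols to integer indices, which is the usual input to a word-embedding layer; the embedding itself and the truncated remainder of S212 are not shown.

```python
import ast

def dfs_symbols(tree):
    """Return AST node type names in depth-first (pre-order) order (cf. S211)."""
    symbols = []

    def visit(node):
        symbols.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return symbols

def build_lexicon(symbol_vectors):
    """Assign each distinct symbol an integer index in first-seen order (cf. S212)."""
    lexicon = {}
    for vector in symbol_vectors:
        for symbol in vector:
            if symbol not in lexicon:
                lexicon[symbol] = len(lexicon)
    return lexicon

# Two toy functions stand in for the functions of a real dataset.
functions = [
    "def add(a, b):\n    return a + b\n",
    "def neg(x):\n    return -x\n",
]

symbol_vectors = [dfs_symbols(ast.parse(src)) for src in functions]
lexicon = build_lexicon(symbol_vectors)

# Numerical vectors: one integer per symbol, ready for an embedding lookup.
numeric_vectors = [[lexicon[s] for s in vec] for vec in symbol_vectors]
```

Each numerical vector has the same length as its symbol vector, and the same symbol always maps to the same index across functions, which is what lets a shared embedding table be learned over the whole dataset.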

Embodiment 2

[0047] This embodiment is consistent with Embodiment 1. It presupposes that a large-scale software vulnerability database is available, from which the type of each vulnerability and its location in the source code can be clearly determined. From this database, source code that contains a given type of vulnerability and is written in the same programming language is collected as the dataset.

[0048] A method for detecting source code vulnerabilities based on deep learning, comprising the following steps:

[0049] S1. Calculate the code metrics of the functions in the source code; these include statistical metrics and complexity metrics. The line-count statistical metrics include: total number of lines, number of code lines, number of blank lines, number of comment lines, number of preprocessor code lines, number of inactive lines, ratio ...
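
As a rough illustration of the line-count metrics in S1, the sketch below counts a few of them for a C-like function and packs them into a vector. The function name `line_metrics`, the classification rules, and the vector ordering are assumptions made for this example, not the patent's definitions. The rules are deliberately simplified: a line that mixes code and a trailing comment counts as code, preprocessor lines are counted separately from code lines, and inactive lines and the complexity metrics are not computed.

```python
def line_metrics(source: str) -> dict:
    """Count simple line-based metrics for C-like source text (simplified rules)."""
    total = blank = comment = preprocessor = code = 0
    in_block_comment = False
    for line in source.splitlines():
        total += 1
        stripped = line.strip()
        if in_block_comment:
            # Inside a /* ... */ comment that started on an earlier line.
            comment += 1
            if "*/" in stripped:
                in_block_comment = False
        elif not stripped:
            blank += 1
        elif stripped.startswith("//"):
            comment += 1
        elif stripped.startswith("/*"):
            comment += 1
            if "*/" not in stripped[2:]:
                in_block_comment = True
        elif stripped.startswith("#"):
            preprocessor += 1
        else:
            code += 1
    return {
        "total": total,
        "code": code,
        "blank": blank,
        "comment": comment,
        "preprocessor": preprocessor,
    }

SAMPLE = """#include <string.h>

/* Copy src into dst.
   No bounds check. */
void copy(char *dst, const char *src) {
    // unsafe
    strcpy(dst, src);
}
"""

metrics = line_metrics(SAMPLE)
# A possible layout for part of the code metrics vector V_cm (ordering is arbitrary here).
v_cm = [metrics[k] for k in ("total", "code", "blank", "comment", "preprocessor")]
```

In a fuller version, the complexity metrics would be appended to the same vector so that each function is described by one fixed-length V_cm.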

Abstract

The present invention proposes a source code vulnerability detection method based on deep learning. Feature extraction from source code is completed automatically using deep learning, and a vulnerability detection model is then constructed with a random deep forest algorithm from the combination of code metrics and the automatically extracted source code features. The method has a higher degree of automation, reduces dependence on domain expert knowledge, greatly reduces code audit costs, and improves code audit efficiency. Compared with other methods that use deep learning for vulnerability detection, this method combines multiple representations of the code to preserve more of its syntactic and semantic information, so that the features automatically extracted by the deep learning algorithm describe the code better. It also uses commonly used code metrics as detection features to further improve detection effectiveness.

Description

Technical field

[0001] The invention relates to the technical field of network security, and more specifically to a method for detecting source code vulnerabilities based on deep learning.

Background

[0002] In today's highly informatized environment, every aspect of people's lives is closely tied to software. In daily life, people communicate through instant messaging software, shop online through shopping software, and pay through payment software. Software also plays an important role in organizations, for example school financial systems, self-service systems in various institutions, and database management systems in enterprises. Because errors can arise in software design, implementation, and use, most software inevitably contains vulnerabilities. Once software vulnerabilities are exploited by criminals, they not only directly damage the interests of software users but also indirectly affect the interests of software development compan...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/06, G06K9/62
CPC: H04L63/1433, G06F18/214
Inventor: 金舒原, 吴跃隆
Owner: SUN YAT SEN UNIV