Dual-granularity lightweight vulnerability code slice quality evaluation method

The method belongs to the field of information security and applies lightweight quality-assessment technology. It addresses the problems of existing approaches, which sacrifice the interpretability of the original code, lose the semantic information of code slices, and carry high technical complexity. Its effects are improved prediction accuracy and generalization ability, improved interpretability, and better-optimized slicing methods.

Active Publication Date: 2022-01-07
YANSHAN UNIV

AI Technical Summary

Problems solved by technology

[0005] 1. Incomplete extraction of code slice information: Traditional machine learning models such as support vector machines and random forests converge quickly and have a small memory footprint. However, code slices in text form must go through a complex word embedding process before they can serve as input to such models. Deep learning solutions such as ELMo and BERT integrate the word embedding process themselves, but training them requires a large amount of labeled data and high-performance computing support, and using them requires a certain amount of model fine-tuning time.
[0006] 2. High technical complexity and poor generalization ability: Solving the word embedding problem is a prerequisite for any vulnerability code slice quality assessment technology. The existing technical means sacrifice the interpretability of the original code and lose the semantic information of the code slices. Researchers can only judge whether a new slicing method is effective from a black-box evaluation model; they cannot learn why the new method is effective or how to improve it, so the direction of manual verification and improvement is difficult to determine.


Examples

Embodiment Construction

[0037] The present invention will be further described in detail below with reference to the accompanying drawings and examples:

[0038] As shown in Figure 1, a dual-granularity lightweight vulnerability code slice quality evaluation method includes the following steps:

[0039] Step 1: classify and preprocess the vulnerability code slice samples.

[0040] Classification is based on the vulnerability type contained in each vulnerability code slice, with slices that contain no vulnerability treated as a type of their own. Preprocessing deletes all operators in the code and strips all identifiers from the code slice. The slice length is a positive integer greater than or equal to 50 and less than or equal to 200.
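The classification and preprocessing in Step 1 might be sketched roughly as follows. This is only an illustration under stated assumptions, not the patent's implementation: the operator set is a guessed C-like set (the text does not enumerate the operators), the 50 to 200 bound is interpreted here as a whitespace-token count used as a filter (the text gives neither the unit nor how the bound is applied), and identifier stripping is left out because the text does not say how identifiers are recognized. The names `delete_operators`, `within_length_bounds`, and `label_for` are hypothetical.

```python
import re

# Assumption: a C-like operator character set; the source does not list the operators.
OPERATOR_PATTERN = re.compile(r"[+\-*/%=<>!&|^~?:]+")

# Bounds come from the text; treating them as a token count is an assumption.
MIN_LEN, MAX_LEN = 50, 200


def delete_operators(code):
    """Replace each run of operator characters with a space, then collapse whitespace."""
    return " ".join(OPERATOR_PATTERN.sub(" ", code).split())


def within_length_bounds(code):
    """Check that the whitespace-token count lies in [MIN_LEN, MAX_LEN]."""
    n = len(code.split())
    return MIN_LEN <= n <= MAX_LEN


def label_for(vulnerability_type):
    """Slices with no vulnerability (None) form a class of their own."""
    return vulnerability_type if vulnerability_type is not None else "no_vulnerability"


cleaned = delete_operators("a = b + c;")  # operators removed, other tokens kept
```

If the bound is instead a padding or truncation target, `within_length_bounds` would be replaced by a resize step; the text does not settle this.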

[0041] This embodiment draws on the NVD and SARD data sets and uses the VulDeePecker slicing method to obtain 10,400 code slices containing buffer overflow vulnerabilities and 39,753 code slices containing no vulnerability, for a total of 50,153.

...

Abstract

The invention discloses a dual-granularity lightweight vulnerability code slice quality evaluation method and belongs to the technical field of information security. The method comprises the following steps: classifying and preprocessing vulnerability code slice samples; segmenting the code slices with windows at two granularities, words and characters; establishing an evaluation feature vector; calculating the statistical characteristics of the code slices and establishing a slice data set; establishing a lightweight assessment model; and inputting the slice data set into the lightweight assessment model, which outputs assessment features and assessment indexes. The method segments code slices with windows of multiple sizes at the word and character levels, constructs a vulnerability detection vector space from statistical features, and extracts the implicit vulnerability features in the code slices, which solves the problem of embedding out-of-vocabulary words in code-slice-based vulnerability detection. It also constructs a heterogeneous ensemble lightweight assessment model whose evaluation features and multi-dimensional evaluation indexes replace the black-box model of the traditional technology, improving the efficiency with which researchers develop and iterate on code slicing methods.
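The windowed segmentation at word and character granularity could look something like the sketch below. It is a minimal illustration, not the patented procedure: the window sizes, the use of raw occurrence counts as the "statistical feature", and the function names `word_windows`, `char_windows`, and `dual_granularity_features` are all assumptions, since the abstract specifies none of them.

```python
from collections import Counter


def word_windows(tokens, size):
    """Slide a window of `size` consecutive words over the token list."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def char_windows(text, size):
    """Slide a window of `size` consecutive characters over the raw text."""
    return [text[i:i + size] for i in range(len(text) - size + 1)]


def dual_granularity_features(code_slice, word_sizes=(1, 2), char_sizes=(2, 3)):
    """Count every window at both granularities.

    Assumption: raw counts stand in for the statistical features;
    the window sizes are illustrative defaults only.
    """
    tokens = code_slice.split()
    features = Counter()
    for n in word_sizes:
        for window in word_windows(tokens, n):
            features[("word", n, window)] += 1
    for n in char_sizes:
        for window in char_windows(code_slice, n):
            features[("char", n, window)] += 1
    return features


features = dual_granularity_features("strcpy ( buf , src ) ;")
```

Because character windows are built from the raw text, a token never seen before still shares features with known tokens, which is one way to read the claim that the approach sidesteps out-of-vocabulary embedding.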

Description

Technical field

[0001] The present invention relates to the field of information security technology, in particular to a dual-granularity lightweight vulnerability code slice quality evaluation method.

Background technique

[0002] Vulnerability code slicing decomposes large-scale project source code into smaller code slices that contain only vulnerability-related code, eliminating the interference of unrelated code with detection results in complex software projects. The validity of a new vulnerability code slicing method needs to be proven by model assessment.

[0003] In existing vulnerability detection scenarios, evaluation techniques for code slicing methods fall into three categories: encoding model assessment, machine learning model assessment, and deep learning model assessment. Encoding model assessment mainly uses models such as Word2Vec, bag-of-words, and TF-IDF; machine learning model assessment mainly uses models such as support...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/57, G06K9/62
CPC: G06F21/577, G06F2221/033, G06F18/254, Y02P90/30
Inventor: 张炳, 文峥, 赵宇轩, 赵旭阳, 任家东
Owner: YANSHAN UNIV