Vulnerability mining technology based on reinforcement learning

A vulnerability mining technology based on reinforcement learning, applied in the field of vulnerability mining. It addresses problems such as the inability to arrange and partition products, and achieves reduced program scale, improved performance, and lower additional resource consumption.

Status: Inactive; Publication date: 2022-06-21
南京泛函智能技术研究院有限公司

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a vulnerability mining technology based on reinforcement learning that solves the problem, raised in the background section above, of being unable to arrange and isolate products at the preliminary stage.

Examples

Embodiment 1

[0044] As shown in Figure 1, this embodiment of the present invention provides a semantic reasoning mechanism flow that uses reinforcement learning to generate inputs containing the required semantics. To make this practical, we start by learning the different effects of the symbols and operations defined in the grammar; understanding these semantics makes it possible to generate, in a more principled way, inputs that satisfy the properties given in the vulnerability description. Combining fuzzing with reinforcement learning has a further benefit: the large number of executions performed during fuzzing provides rich data and supports precise predictions for learning the semantics. The learned model is updated continuously during fuzzing, which makes it possible to generate inputs that may trigger vulnerabilities more efficiently. State-of-the-art fuzzing frameworks such as AFL use these states to guide input generation; however, how the different criteria are mixed can affect input generation, our r...
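The idea of learning how grammar symbols and operations contribute to a target property can be sketched as follows. This is a minimal illustration, not the method of the invention: the source does not specify the learning algorithm, so the toy grammar `GRAMMAR`, the property check `has_property` (a literal division by zero), the class `GrammarPolicy`, and the reward-weighted update rule are all hypothetical choices made for this sketch.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of alternative productions.
GRAMMAR = {
    "EXPR": [["NUM"], ["EXPR", "OP", "EXPR"], ["(", "EXPR", ")"]],
    "OP": [["+"], ["-"], ["/"]],
    "NUM": [["0"], ["1"], ["7"]],
}


def has_property(text):
    """Hypothetical property from a vulnerability description: a literal '/0'."""
    return "/0" in text


class GrammarPolicy:
    """Keeps one positive preference weight per production (an assumption)."""

    def __init__(self, grammar, lr=0.2, seed=0):
        self.grammar = grammar
        self.lr = lr
        self.rng = random.Random(seed)
        self.weights = {nt: [1.0] * len(alts) for nt, alts in grammar.items()}
        self.baseline = 0.0  # running average reward

    def generate(self, symbol="EXPR", depth=0, max_depth=4, trace=None):
        """Expand a symbol; return (text, list of (nonterminal, index) choices)."""
        if trace is None:
            trace = []
        if symbol not in self.grammar:
            return symbol, trace  # terminal symbol
        alts = self.grammar[symbol]
        recursive = any(symbol in alt for alt in alts)  # direct recursion only
        if recursive and depth >= max_depth:
            idx = 0  # forced non-recursive alternative; not a learned choice
        else:
            idx = self.rng.choices(range(len(alts)), weights=self.weights[symbol])[0]
            trace.append((symbol, idx))
        parts = [self.generate(s, depth + 1, max_depth, trace)[0] for s in alts[idx]]
        return "".join(parts), trace

    def update(self, trace, reward):
        """Shift weights of the productions used by (reward - baseline)."""
        advantage = reward - self.baseline
        for nt, idx in trace:
            self.weights[nt][idx] = max(0.05, self.weights[nt][idx] + self.lr * advantage)
        self.baseline += 0.05 * (reward - self.baseline)


def train(policy, episodes=3000):
    """Generate inputs, reward those with the property, return the hit count."""
    hits = 0
    for _ in range(episodes):
        text, trace = policy.generate()
        reward = 1.0 if has_property(text) else 0.0
        policy.update(trace, reward)
        hits += int(reward)
    return hits
```

One way to check whether such a policy is learning is to compare the hit count of an untrained policy with that of a trained one over the same number of episodes. In a real fuzzer the reward would come from executing the target program rather than from a string check.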

Embodiment 2

[0046] As shown in Figures 2 and 3, this embodiment of the present invention provides a process in which a symbolic analysis engine guides fuzzing. Symbolic execution is one of the most powerful methods for detecting vulnerabilities, and it can accurately generate an input that steers execution to a specific program point. For example, for a single condition x*x=100, random mutation may struggle to generate the feasible value x=10; for an integer variable x, the probability is less than 10^-20. Large-scale programs exacerbate this problem as the number of path conditions increases. Constraint solving compensates for the shortcomings of random mutation in handling complex path conditions, so combining the two methods improves vulnerability detection capability. However, the effectiveness of constraint solving is mainly limited by its well-known performance problems, and our proposed method exploits the adv...
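The x*x=100 example can be made concrete with a small sketch that contrasts blind random generation with directly solving the condition. The assumptions here are mine, not the source's: integers are treated as 32-bit signed values, and `solve_square` (built on `math.isqrt`) is only a stand-in for a real constraint solver, since the source does not name the symbolic analysis engine it uses.

```python
import math
import random


def target(x):
    """Toy path condition from the text: the branch is taken iff x*x == 100."""
    return x * x == 100


def random_search(trials, bits=32, seed=0):
    """Draw random signed integers; return a satisfying value or None."""
    rng = random.Random(seed)
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    for _ in range(trials):
        x = rng.randint(lo, hi)
        if target(x):
            return x
    return None


def solve_square(c):
    """Stand-in for a constraint solver: all integers x with x*x == c."""
    if c < 0:
        return []
    r = math.isqrt(c)
    if r * r != c:
        return []
    return sorted({-r, r})


def hit_probability(bits=32):
    """Chance that one uniformly random bits-wide integer satisfies x*x == 100."""
    return len(solve_square(100)) / 2 ** bits
```

Under the 32-bit assumption, `hit_probability()` evaluates to 2/2^32 per random draw, whereas `solve_square(100)` returns the satisfying values in a single step. The gap widens quickly when several such conditions must hold along one path, which is the motivation the text gives for combining the two techniques.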

Embodiment 3

[0049] Building on the first and second embodiments above, this embodiment of the present invention uses an "exploration-exploitation" model to guide the fuzzer's mutation selection strategy. The model framework can be used to evaluate the benefit of each mutation, automatically adjust the mutation operation strategy, and replace the random strategy in the original AFL mutation algorithm, thereby optimizing AFL's mutation performance;

[0050] AFL mutation comprises multiple operations, indexed by n. Each mutation operation can be regarded as a separate slot machine (a bandit arm). For each mutation operation, θn denotes the probability that a child input generated by that operation discovers a new path. Finally, based on the results of previous mutations, and analogous to selecting the "best" arm k, the mutation operation with the highest estimated probability is selected for the next round of mutation. The probabil...
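The bandit formulation above can be sketched as follows. Because the source text is truncated before it states how the probabilities are estimated, this sketch substitutes Thompson sampling with a Beta posterior, which is one common exploration-exploitation algorithm and not necessarily the one used by the invention. The class `MutationBandit`, the helper `simulate`, and the per-operator probabilities in the usage note are hypothetical; the operator labels borrow AFL stage names for readability only.

```python
import random


class MutationBandit:
    """Bernoulli bandit over mutation operators, using Thompson sampling."""

    def __init__(self, operators, seed=None):
        self.operators = list(operators)
        # Beta(1, 1) prior per operator: both counts start at 1.
        self.successes = {op: 1 for op in self.operators}
        self.failures = {op: 1 for op in self.operators}
        self.rng = random.Random(seed)

    def estimate(self, op):
        """Posterior mean estimate of theta_n for one operator."""
        s, f = self.successes[op], self.failures[op]
        return s / (s + f)

    def select(self):
        """Sample theta_n from each posterior and pick the largest draw."""
        draws = {
            op: self.rng.betavariate(self.successes[op], self.failures[op])
            for op in self.operators
        }
        return max(draws, key=draws.get)

    def update(self, op, found_new_path):
        """Record whether the mutated input discovered a new path."""
        if found_new_path:
            self.successes[op] += 1
        else:
            self.failures[op] += 1


def simulate(true_theta, rounds=5000, seed=0):
    """Run the bandit against made-up per-operator success probabilities."""
    bandit = MutationBandit(true_theta, seed=seed)
    env = random.Random(seed + 1)  # stands in for executing the target program
    counts = {op: 0 for op in true_theta}
    for _ in range(rounds):
        op = bandit.select()
        counts[op] += 1
        bandit.update(op, env.random() < true_theta[op])
    return bandit, counts
```

For example, `simulate({"bitflip": 0.02, "arith": 0.05, "havoc": 0.30})` returns the bandit and a count of how often each operator was chosen. Sampling from the posterior, rather than always taking the highest current estimate, keeps some exploration of operators that have been tried only a few times.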

Abstract

The invention relates to a vulnerability mining technology based on reinforcement learning, comprising the following steps: a semantic reasoning mechanism flow; guidance of the fuzzing process by a symbolic analysis engine; and optimization of the fuzzer's mutation process. The method uses reinforcement learning to optimize the mutation strategy: it models the task as a multi-armed bandit problem, records the execution effect in the target program of inputs generated by the different mutation operations, adaptively learns the probability distribution of mutation outcomes with an exploration-exploitation algorithm, and adjusts the mutation operation strategy accordingly. The technology can automatically adjust the mutation operation strategy and effectively generate test inputs with high coverage; the method is feasible and consumes few additional resources. The fuzzing workflow based on component analysis is optimized, the scale of the program is reduced, and test efficiency is improved. Unnecessary program fragments are saved for later verification.

Description

Technical field

[0001] The invention relates to the field of vulnerability mining methods, in particular to a vulnerability mining technology based on reinforcement learning.

Background

[0002] Fuzzing is one of the most powerful vulnerability detection techniques available today and has been widely used to detect various software security issues. For example, OSS-Fuzz has found more than 9,000 vulnerabilities in widely used third-party libraries since 2016, and Microsoft's fuzzing service found a third of Windows 7 security flaws, saving millions. Because fuzzing can generate concrete inputs that trigger vulnerabilities, it readily yields a proof of concept (PoC) that helps developers understand and fix the problem. The simple intuition of using fast random mutation to generate a large number of inputs to evaluate the target program allows fuzzing to be applied to all types of projects;

[0003] However, in large-scale programs, fuzzing is ineffective for detecting vul...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06N5/04
CPC: G06F11/3604, G06N5/048
Inventors: 伍贵宾, 熊永平
Owner: 南京泛函智能技术研究院有限公司