
Intelligent defense decision-making method and device based on reinforcement learning and attack and defense games

An attack-defense game and reinforcement learning technology, applied in the field of network security, that addresses the difficulty of guiding real-time strategy selection by individual participants. It improves the real-time performance and intelligence of defense decisions, compresses the game state space, and increases learning speed.

Active Publication Date: 2019-08-23
PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU

AI Technical Summary

Problems solved by technology

In evolutionary game models, participants exchange too much information, and research focuses mainly on the adjustment process, trend, and stability of attacker and defender group strategies. This is not conducive to guiding the real-time strategy selection of individual members.




Embodiment Construction

[0034] To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and technical solutions. The technical terms involved in the embodiments are as follows:

[0035] Reinforcement learning is a classic online learning method in which participants learn independently from environmental feedback. Compared with biological evolutionary learning methods, it learns quickly, which suits the fast attack-defense transitions and strong time sensitivity of network confrontation. The characteristics of games, such as non-cooperation, opposed goals, and strategy interdependence, all match the basic characteristics of network attack and defense. In an embodiment of the present invention, as shown in Figure 1, an intelligent defense decision-making method based on reinforcement learning and attack-defense game is pr...
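The idea of a defender learning from environmental feedback alone can be illustrated with a minimal sketch. It uses tabular Q-learning as a stand-in, since this excerpt does not name the learning algorithm the patent actually uses. The two states, the two defense actions, the reward values, and the names `toy_step`, `greedy`, and `q_learning_defender` are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy model: two network states and two defense actions.
STATES = ["safe", "compromised"]
DEFENSES = ["noop", "patch"]


def toy_step(state, defense, rng):
    """Return (next_state, defender_reward).

    The attacker's behaviour and the transition probabilities live only in
    here; the learner never reads them and sees only the feedback.
    """
    cost = 2.0 if defense == "patch" else 0.0
    if state == "compromised":
        if defense == "patch" and rng.random() < 0.9:
            return "safe", -cost             # recovery succeeded
        return "compromised", -cost - 10.0   # damage continues
    # state == "safe": patching lowers the attacker's success probability
    p_breach = 0.05 if defense == "patch" else 0.3
    next_state = "compromised" if rng.random() < p_breach else "safe"
    return next_state, -cost


def greedy(q, state, defenses):
    """Pick the defense with the highest learned value in this state."""
    return max(defenses, key=lambda d: q[(state, d)])


def q_learning_defender(step, states, defenses, episodes=2000, horizon=10,
                        alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn defense values online, without knowing transition probabilities."""
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, defense)] -> estimated long-run payoff
    for _ in range(episodes):
        state = rng.choice(states)
        for _ in range(horizon):
            # Epsilon-greedy: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                defense = rng.choice(defenses)
            else:
                defense = greedy(q, state, defenses)
            next_state, reward = step(state, defense, rng)
            best_next = max(q[(next_state, d)] for d in defenses)
            # Standard Q-learning update toward the observed feedback.
            q[(state, defense)] += alpha * (
                reward + gamma * best_next - q[(state, defense)]
            )
            state = next_state
    return q
```

Calling `q_learning_defender(toy_step, STATES, DEFENSES)` and then `greedy(q, "compromised", DEFENSES)` shows which defense the learned table prefers once a host is compromised. The learner only ever touches `step`'s return values, which mirrors the setting where the defender does not know the state transition probabilities.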


Abstract

The invention belongs to the technical field of network security and particularly relates to an intelligent defense decision-making method and device based on reinforcement learning and attack-defense games. The method comprises the following steps: an attack-defense game model is constructed under finite constraints, and an attack-defense graph is generated to extract the network states and the attack and defense actions used in the game model. The graph is host-centered: its nodes are used to extract the network states, and its edges are used to analyze the attack and defense actions. When the network state transition probabilities are unknown to the defender, the defender obtains the defense payoff through online learning, so that it automatically selects the optimal defense strategy when facing different attackers. The game state space is effectively compressed, and storage and computation overhead is reduced. A defender performs reinforcement learning from environmental feedback while confronting an attacker, and can adaptively make the optimal choice when facing different attacks. As a result, the defender's learning speed and defense payoff are improved, the dependence on historical data is reduced, and the timeliness and intelligence of the defender's decision-making are effectively improved.
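The host-centered attack-defense graph described in the abstract can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names (`Edge`, `AttackDefenseGraph`, `attack`, `defenses`), the example hosts, and the choice of one network state per host node are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """One attack step between hosts, with the defenses that counter it."""
    src: str
    dst: str
    attack: str
    defenses: tuple


@dataclass
class AttackDefenseGraph:
    hosts: list
    edges: list = field(default_factory=list)

    def network_states(self):
        # Assumption: one state per host node, so the state count grows
        # with the number of hosts rather than with every vulnerability.
        return list(self.hosts)

    def attack_actions(self, host):
        # Attacks the adversary can launch from a host it already controls.
        return [e.attack for e in self.edges if e.src == host]

    def defense_actions(self, host):
        # Defenses relevant when the attacker sits on `host`, deduplicated
        # while preserving the order in which they appear on the edges.
        seen = []
        for e in self.edges:
            if e.src == host:
                for d in e.defenses:
                    if d not in seen:
                        seen.append(d)
        return seen


# Hypothetical three-host network.
example_graph = AttackDefenseGraph(
    hosts=["attacker", "web", "db"],
    edges=[
        Edge("attacker", "web", "exploit_http", ("patch_http", "block_port_80")),
        Edge("web", "db", "sql_injection", ("input_filter",)),
    ],
)
```

Here `example_graph.network_states()` yields three states, and `example_graph.attack_actions("web")` and `example_graph.defense_actions("web")` give the action sets a game model would use in that state. Keeping nodes at host granularity is one plausible reading of how the state space is compressed.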

Description

Technical field

[0001] The invention belongs to the technical field of network security, and in particular relates to an intelligent defense decision-making method and device based on reinforcement learning and attack-defense game.

Background

[0002] In recent years, information security incidents have become more frequent and have caused huge losses. According to statistics, Alibaba Cloud suffered about 1.6 billion attacks every day in 2017. For an individual attacker, each attack-defense scenario may occur only once, but defenders such as Alibaba Cloud face many identical attack-defense scenarios every day. Considering the limited hardware resources of network equipment, how to comprehensively consider the cost and benefit of defense, with the goal of maximizing the benefit of defense, so that the defender can achieve a balance between risk and investment, and how to make the defender respond to the same ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, H04L12/24
CPC: H04L41/145, H04L63/20
Inventor: 胡浩, 张玉臣, 杨峻楠, 谢鹏程, 刘玉岭, 马博文, 冷强, 张畅, 陈周文, 林野
Owner PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU