Deep reinforcement learning model poisoning detection method and device based on time sequence neural pathway

A reinforcement learning detection technology, applied in neural learning methods, biological neural network models, neural architectures, and related areas. It addresses the problems that models are easily poisoned and that poisoning is difficult to detect, and achieves effective detection of poisoning attacks and good applicability.

Pending Publication Date: 2021-08-27
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] To address the problem that current deep reinforcement learning models are easily poisoned, and that the poisoning is difficult to detect afterwards, the present invention provides a method and device for detecting poisoning of a deep reinforcement learning model based on a time sequence neural pathway. Approximate poisoning test samples are obtained by optimizing over the neurons on the time sequence neural pathway, and these samples are then used to detect whether the deep reinforcement learning model is poisoned.
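The detection step can be illustrated with a minimal sketch. It assumes that a model is flagged when its chosen actions change on more than a fixed fraction of the approximate poisoning test samples; the threshold value, the toy policies, and the trigger in the first state component are hypothetical and are not taken from the patent text.

```python
import numpy as np

def action_change_rate(policy, clean_states, test_states):
    """Fraction of states whose chosen action differs between clean and test inputs."""
    clean_actions = np.array([policy(s) for s in clean_states])
    test_actions = np.array([policy(s) for s in test_states])
    return float(np.mean(clean_actions != test_actions))

def is_poisoned(policy, clean_states, test_states, threshold=0.5):
    """Flag the model when the action change rate exceeds the (assumed) threshold."""
    return action_change_rate(policy, clean_states, test_states) > threshold

# Hypothetical toy policies: component 0 of the state is a trigger channel,
# components 1..3 are scores for three actions.
def clean_policy(s):
    return int(np.argmax(s[1:]))

def backdoored_policy(s):
    # The backdoor forces action 0 whenever the trigger channel is high.
    return 0 if s[0] > 0.9 else int(np.argmax(s[1:]))

clean_states = np.array([
    [0.1, 0.2, 0.9, 0.3],
    [0.0, 0.1, 0.2, 0.8],
    [0.2, 0.7, 0.1, 0.1],
    [0.3, 0.1, 0.6, 0.2],
])
# Stand-in for approximate poisoning test samples: the trigger channel is raised.
test_states = clean_states.copy()
test_states[:, 0] = 1.0

clean_flag = is_poisoned(clean_policy, clean_states, test_states)
backdoor_flag = is_poisoned(backdoored_policy, clean_states, test_states)
```

In this toy setting the clean policy ignores the trigger channel, so its actions do not change, while the backdoored policy switches to action 0 on most test samples and is flagged.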

Examples

Embodiment Construction

[0024] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, and do not limit the protection scope of the present invention.

[0025] The technical idea of the present invention is as follows: in view of the problem that poisoning of a deep reinforcement learning model is difficult to detect after a malicious attack, the embodiments of the present invention provide a deep reinforcement learning poisoning defense method and device based on the time sequence neural pathway. According to the learning characteristics of deep reinforcement learning, a time sequence neural pathway for the deep reinforcement learning model is defined. The time sequence neural pathway can correlate the input of the previou...

Abstract

The invention discloses a deep reinforcement learning model poisoning detection method and device based on a time sequence neural pathway. The method comprises the following steps: defining a time sequence neural pathway of deep reinforcement learning; constructing, according to this definition, the time sequence neural pathway of a deep reinforcement learning model whose first part comprises convolutional and pooling layers and whose second part comprises fully connected layers, namely obtaining the Top-c neurons of the first part through multiple rounds of searching, putting the Top-c neurons and all neurons of the second part into a neuron pool, and constructing the time sequence neural pathway from the neuron pool; inputting sample data into the deep reinforcement learning model, generating a perturbation by back propagation through the constructed time sequence neural pathway, and adding the perturbation to the input sample to obtain a poisoning sample; and inputting the poisoning sample into the deep reinforcement learning model and detecting whether the model is poisoned according to the change in its decision actions.
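The neuron pool and perturbation steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the network is a small dense ReLU model standing in for the convolution/pooling and fully connected parts, Top-c selection is a single ranking by mean activation rather than a repeated search, the pathway objective is the summed activation of the pooled neurons, the perturbation is a sign-of-gradient step, and the time sequence aspect is omitted because its definition is truncated in this text. All names (`W1`, `W2`, `W3`, `top_c_neurons`, `eps`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: W1 stands in for the first part (convolution and
# pooling features), W2 for the fully connected second part, W3 for the action head.
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(6, 16))
W3 = rng.normal(size=(4, 6))

def first_part(x):
    return np.maximum(W1 @ x, 0.0)

def top_c_neurons(states, c):
    """Indices of the c first-part neurons with the highest mean activation."""
    mean_act = np.mean([first_part(s) for s in states], axis=0)
    return np.argsort(mean_act)[::-1][:c]

def pathway_objective(x, mask):
    """Summed activation of the pool: Top-c first-part neurons plus all second-part neurons."""
    h1 = first_part(x) * mask
    h2 = np.maximum(W2 @ h1, 0.0)
    return h1.sum() + h2.sum()

def pathway_gradient(x, mask):
    """Back propagation of the pathway objective to the input, written out by hand."""
    h1 = first_part(x)
    h2 = np.maximum(W2 @ (h1 * mask), 0.0)
    grad_h1 = mask * (1.0 + W2.T @ (h2 > 0).astype(float))
    return W1.T @ (grad_h1 * (h1 > 0).astype(float))

def make_test_sample(x, mask, eps=0.05):
    """Add a bounded perturbation that follows the sign of the pathway gradient."""
    return x + eps * np.sign(pathway_gradient(x, mask))

def act(x):
    """Greedy action of the toy model on the unmasked network."""
    h2 = np.maximum(W2 @ first_part(x), 0.0)
    return int(np.argmax(W3 @ h2))

states = rng.normal(size=(32, 8))
pool = top_c_neurons(states, c=4)
mask = np.zeros(16)
mask[pool] = 1.0  # only pooled first-part neurons carry signal on the pathway

x_test = make_test_sample(states[0], mask)
action_changed = act(x_test) != act(states[0])
```

Masking the first part to the Top-c neurons is one way to restrict back propagation to the pathway; the resulting `x_test` would then be fed to the model and its action compared with the action on the unperturbed input.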

Description

Technical field

[0001] The invention belongs to the field of model poisoning detection, and in particular relates to a deep reinforcement learning model poisoning detection method and device based on a time sequence neural pathway.

Background

[0002] Deep reinforcement learning (DRL) is a new research hotspot in the field of artificial intelligence. Since its inception, deep reinforcement learning methods have achieved substantial breakthroughs in many tasks that require perception of high-dimensional raw input data and decision control. DRL has been widely used in different fields, including gaming, autonomous driving, healthcare, financial transactions, robot control, cybersecurity, computer vision, and more.

[0003] Artificial intelligence technology is replacing humans in autonomous decision-making in many fields, but recent research shows that deep reinforcement learning models are vulnerable to different types of malicious attacks, and the security vulnerabili...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08, G06F21/56
CPC: G06N3/049, G06N3/08, G06N3/084, G06F21/562, G06N3/045
Inventor: 陈晋音, 王雪柯, 章燕, 胡书隆
Owner: ZHEJIANG UNIV OF TECH