Deep reinforcement learning method and device based on visual converter

A technology combining reinforcement learning and transformers, applied in the field of artificial intelligence, that makes learning and training more effective and improves interpretability.

Pending Publication Date: 2021-06-29
INFORMATION SCI RES INST OF CETC

AI Technical Summary

Problems solved by technology

However, the use of transformers in the field of reinforcement learning has not yet been studied.



Examples


Embodiment Construction

[0057] In order to enable those skilled in the art to better understand the technical solutions of the present invention, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0058] First, with reference to Figure 1, an example electronic device for implementing the apparatus and method of the embodiments of the present invention is described.

[0059] As shown in Figure 1, the electronic device 200 includes one or more processors 210, one or more storage devices 220, one or more input devices 230, one or more output devices 240, and so on. These components are interconnected via a bus system 250 and/or another form of connection mechanism. It should be noted that the components and structure of the electronic device shown in Figure 1 are exemplary rather than limiting, and the electronic device may also have other components and structures as required.

[0060] The processor 210 can b...


Abstract

The invention belongs to the technical field of artificial intelligence and provides a deep reinforcement learning method and device based on a visual converter (vision transformer). The method comprises the following steps: constructing a deep reinforcement learning network structure based on the vision transformer, wherein the vision transformer comprises a multi-layer perceptron and a transformer encoder, and the transformer encoder comprises a multi-head attention layer and a feedforward network; initializing the weights of the deep reinforcement learning network, and constructing an experience replay pool according to the memory capacity; generating experience data through interaction between a greedy strategy and the operating environment, and putting the experience data into the experience replay pool; when the number of samples in the experience replay pool reaches a preset value, randomly extracting a batch of training sample images, preprocessing them, and inputting the preprocessed training sample images into the deep reinforcement learning network for training; and when the deep reinforcement learning network satisfies a convergence condition, obtaining a reinforcement learning model. The method and device fill the gap in applying the vision transformer to the reinforcement learning field, improve the interpretability of the reinforcement learning method, and make learning and training more effective.
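
The data-collection part of the steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `ReplayPool`, `select_action`, `collect_and_train`, `env_step`, `q_fn` and `train_fn` are all hypothetical, the network is replaced by an arbitrary callable that returns Q-values, and an epsilon-greedy rule is assumed for the "greedy strategy" (setting `epsilon=0` gives a purely greedy choice).

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity experience pool; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def put(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform random sampling without replacement.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def select_action(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def collect_and_train(env_step, q_fn, train_fn, capacity=1000,
                      min_samples=32, batch_size=8, steps=100,
                      epsilon=0.1, seed=0):
    """Interact with the environment, fill the pool, and train once it is full enough.

    env_step(state, action) -> (next_state, reward, done)   # hypothetical environment
    q_fn(state) -> list of Q-values, one per action         # stands in for the network
    train_fn(batch) -> None                                 # stands in for one update
    """
    rng = random.Random(seed)
    pool = ReplayPool(capacity)
    state = 0  # assumed initial state for this toy example
    updates = 0
    for _ in range(steps):
        action = select_action(q_fn(state), epsilon, rng)
        next_state, reward, done = env_step(state, action)
        pool.put(state, action, reward, next_state, done)
        state = 0 if done else next_state
        # Training starts only after the pool holds the preset number of samples.
        if len(pool) >= min_samples:
            train_fn(pool.sample(batch_size, rng))
            updates += 1
    return pool, updates
```

In a real setup, `q_fn` would be the forward pass of the vision-transformer-based network on preprocessed image observations and `train_fn` would perform one gradient update; both are left abstract here because the source text does not give those details.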

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence, and in particular relates to a deep reinforcement learning method and device based on a visual converter (vision transformer).

Background

[0002] In recent years, reinforcement learning has gradually become a research hotspot in the field of machine learning. An agent learns a policy that maximizes rewards or achieves a certain goal by interacting with the environment. Combined with deep learning methods, deep reinforcement learning has achieved breakthroughs in many artificial intelligence tasks, such as games, robot control, group decision-making, and autonomous driving.

[0003] At present, deep reinforcement learning methods mainly include value-function-based methods, policy-gradient-based methods, and methods based on the Actor-Critic framework. In the existing reinforcement learning network framework, the network structures adopted are mainly convolut...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/047, G06F18/214
Inventor: 金丹王昭龙玉婧
Owner: INFORMATION SCI RES INST OF CETC