Deep reinforcement learning method and device based on visual converter

A reinforcement learning and converter (transformer) technology, applied in the field of artificial intelligence, which achieves effective learning and training and improves interpretability

Pending Publication Date: 2021-06-29

INFORMATION SCI RES INST OF CETC

13 Cites, 4 Cited by


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

However, transformers have not yet been correspondingly studied in the field of reinforcement learning.



Embodiment Construction

[0057] In order to enable those skilled in the art to better understand the technical solutions of the present invention, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0058] First, with reference to Figure 1, an example electronic device for implementing the apparatus and method of the embodiments of the present invention is described.

[0059] As shown in Figure 1, the electronic device 200 includes one or more processors 210, one or more storage devices 220, one or more input devices 230, one or more output devices 240, and so on. These components are interconnected via a bus system 250 and/or another form of connection mechanism. It should be noted that the components and structure of the electronic device shown in Figure 1 are exemplary rather than limiting, and the electronic device may also have other components and structures as required.

[0060] The processor 210 can b...


Abstract

The invention belongs to the technical field of artificial intelligence and provides a deep reinforcement learning method and device based on a visual converter (vision transformer). The method comprises the following steps: constructing a deep reinforcement learning network structure based on the visual converter, wherein the visual converter comprises a multi-layer perceptron and a conversion encoder (transformer encoder), and the conversion encoder comprises a multi-head attention layer and a feedforward network; initializing the weights of the deep reinforcement learning network and constructing an experience replay pool according to the memory capacity; generating experience data through interaction between a greedy strategy and the operating environment, and putting the experience data into the experience replay pool; when the number of samples in the experience replay pool reaches a preset value, randomly extracting a batch of training sample images from the pool, preprocessing them, and inputting the preprocessed images into the deep reinforcement learning network for training; and when the deep reinforcement learning network satisfies a convergence condition, obtaining a reinforcement learning model. The method and device fill the gap in applying the visual converter to the reinforcement learning field, improve the interpretability of the reinforcement learning method, and make learning and training more effective.
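The data-collection and training-trigger steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `ReplayPool`, `select_action`, and `run` are hypothetical, the network and its update are abstracted as the caller-supplied callbacks `q_function` and `train_step`, and the "greedy strategy" is assumed here to be epsilon-greedy (with `epsilon=0` it is purely greedy). Image preprocessing and the convergence check are omitted.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity experience pool; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def put(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng):
        # Uniform random batch without replacement.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def select_action(q_values, epsilon, rng):
    """Assumed epsilon-greedy rule: explore with probability epsilon, else argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run(env_reset, env_step, q_function, train_step,
        capacity, min_samples, batch_size, total_steps,
        epsilon=0.1, seed=0):
    """Interact with the environment, fill the pool, and train once it is large enough.

    env_reset, env_step, q_function and train_step are hypothetical callbacks
    standing in for the operating environment and the network.
    min_samples should be at least batch_size.
    """
    rng = random.Random(seed)
    pool = ReplayPool(capacity)
    updates = 0
    state = env_reset()
    for _ in range(total_steps):
        action = select_action(q_function(state), epsilon, rng)
        next_state, reward, done = env_step(state, action)
        pool.put(state, action, reward, next_state, done)
        state = env_reset() if done else next_state
        # Train only after the pool holds the preset number of samples.
        if len(pool) >= min_samples:
            train_step(pool.sample(batch_size, rng))
            updates += 1
    return pool, updates
```

In a full system, `q_function` would be the forward pass of the network described in the abstract and `train_step` would apply one gradient update to it; both are left as plain callables here so the control flow can be read on its own.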

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence, and in particular relates to a deep reinforcement learning method and device based on a visual converter.

Background technique

[0002] In recent years, reinforcement learning has gradually become a research hotspot in the field of machine learning. An agent learns a strategy to maximize rewards or achieve a certain goal by interacting with the environment. Combined with deep learning methods, deep reinforcement learning has made breakthroughs in many artificial intelligence tasks, such as games, robot control, group decision-making, and autonomous driving.

[0003] At present, deep reinforcement learning methods mainly include methods based on value functions, methods based on policy gradients, and methods based on the Actor-Critic framework. In the existing reinforcement learning network framework, the network structures adopted are mainly convolut...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06N3/047, G06F18/214

Inventor: 金丹王昭龙玉婧

Owner: INFORMATION SCI RES INST OF CETC
