Robot reinforcement learning method based on adaptive model

Adaptive-model and reinforcement learning technology, applied in the field of artificial intelligence, addresses problems such as data distribution shift and achieves high accuracy, strong asymptotic performance, and a small feature distribution distance.

Active Publication Date: 2021-02-02

SHANGHAI JIAO TONG UNIV

6 Cites, 2 Cited by


AI Technical Summary

Problems solved by technology

An effective way to address data distribution shift in domain adaptation is to learn invariant features.



Embodiment Construction

[0036] The following describes the preferred embodiments of the present application with reference to the accompanying drawings, to make the technical content clearer and easier to understand. The present application can be embodied in many different forms, and its protection scope is not limited to the embodiments mentioned herein.

[0037] The idea, specific structure and technical effects of the present invention are further described below so that its purpose, features and effects can be fully understood, but the protection of the present invention is not limited thereto.

[0038] For an environment model built as a neural network, we can regard its first few layers as a feature extractor and its last few layers as a decoder. Given a data input (s, a), the state s is the position and velocity of each part of the robot, and the action a is the force applied to each part. First, the hidden layer feature h is obtained ...
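The split described in [0038] can be illustrated with a minimal pure-Python sketch. It is not the patent's implementation: the names `EnvModel`, `make_layer` and `dense`, the two-layer extractor, the tanh activation, the hidden size, and the choice of having the decoder output a next state plus a scalar reward are all assumptions made for illustration.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(layer, x, activate):
    """Apply one dense layer, optionally followed by tanh."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [math.tanh(o) for o in out] if activate else out

class EnvModel:
    """Toy environment model: first layers = feature extractor, last layer = decoder."""

    def __init__(self, state_dim, action_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # Feature extractor: maps the concatenated (s, a) to the hidden feature h.
        self.extractor = [
            make_layer(state_dim + action_dim, hidden_dim, rng),
            make_layer(hidden_dim, hidden_dim, rng),
        ]
        # Decoder: maps h to a predicted next state plus a scalar reward (assumed outputs).
        self.decoder = make_layer(hidden_dim, state_dim + 1, rng)

    def features(self, state, action):
        """Return the hidden-layer feature h for the input (s, a)."""
        h = list(state) + list(action)
        for layer in self.extractor:
            h = dense(layer, h, activate=True)
        return h

    def predict(self, state, action):
        """Decode h into (predicted next state, predicted reward)."""
        h = self.features(state, action)
        out = dense(self.decoder, h, activate=False)
        return out[:-1], out[-1]

model = EnvModel(state_dim=4, action_dim=2)
s = [0.1, -0.2, 0.0, 0.3]  # positions and velocities (made-up values)
a = [0.5, -0.5]            # forces applied to each part (made-up values)
h = model.features(s, a)
next_state, reward = model.predict(s, a)
```

Exposing `features` separately from `predict` is what lets the hidden feature h be compared across real and virtual inputs without touching the decoder.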


Abstract

The invention provides a robot reinforcement learning method based on an adaptive model. The method adds a model adaptation step alongside the normal learning of the environment model, which improves the accuracy of the virtual data. Specifically, when the model is a neural network, it adaptively improves its accuracy on virtual data by reducing the distance between the feature distributions of real data and virtual data at a hidden layer of the network. Experiments show that, in robot control and similar fields, the method achieves higher sample efficiency and final performance than previous model-based reinforcement learning methods.
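As a rough illustration of what "distance between feature distributions" can mean, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) with a Gaussian kernel between two sets of hidden-layer feature vectors. The abstract does not say which distance the invention uses, so the choice of MMD, the kernel, the bandwidth, and the function names `rbf`, `mean_kernel` and `mmd2` are assumptions for illustration only.

```python
import math
import random

def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd2(real_feats, virtual_feats, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    return (mean_kernel(real_feats, real_feats, bandwidth)
            + mean_kernel(virtual_feats, virtual_feats, bandwidth)
            - 2.0 * mean_kernel(real_feats, virtual_feats, bandwidth))

# Synthetic stand-ins for hidden features of real and virtual (model-generated) data.
rng = random.Random(0)
real = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
near = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]  # same distribution
far = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(50)]   # shifted distribution

print("matched features:", mmd2(real, near))
print("shifted features:", mmd2(real, far))
```

In an adaptation step of this kind, a quantity like `mmd2` would be minimized with respect to the feature extractor's parameters so that real and virtual inputs yield similar hidden features; the gradient step itself is omitted here.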

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a robot reinforcement learning method.

Background technique

[0002] In the field of robot control, we define the state as the position and velocity of each part of the robot, and the action as the force applied to each part. The goal is to make the robot move as far as possible without falling over while keeping the control force as small as possible. Reinforcement learning can be used to accomplish this task.

[0003] Reinforcement learning can be divided into model-free reinforcement learning and model-based reinforcement learning according to whether the environment is modeled. Model-free reinforcement learning uses (state, action, next state, reward) tuples sampled in the real environment to directly train a policy or value function, while model-based reinforcement learning meth...
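The control task in [0002] and the sampled tuple in [0003] can be made concrete with a small sketch. The `Transition` container, the `locomotion_reward` function, and the weights `ctrl_weight` and `fall_penalty` are hypothetical; the source states the goal only qualitatively (move far, do not fall, keep forces small) and gives no reward formula.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Transition:
    """One sample collected from the environment."""
    state: List[float]       # positions and velocities of the robot's parts
    action: List[float]      # forces applied to each part
    next_state: List[float]
    reward: float

def locomotion_reward(x_before, x_after, action, fallen,
                      ctrl_weight=0.1, fall_penalty=1.0):
    """Reward forward progress, penalize large forces and falling (assumed weights)."""
    progress = x_after - x_before
    ctrl_cost = ctrl_weight * sum(f * f for f in action)
    return progress - ctrl_cost - (fall_penalty if fallen else 0.0)

# Made-up example: the robot moves forward by 1.0 with moderate forces and stays upright.
action = [0.5, -0.5]
r = locomotion_reward(x_before=0.0, x_after=1.0, action=action, fallen=False)
sample = Transition(state=[0.0, 0.0], action=action, next_state=[1.0, 0.2], reward=r)
```

A model-free method would train directly on a buffer of such `Transition` records gathered from the real environment.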


Application Information


Patent Type & Authority: Applications (China)

IPC(8): B25J9/16, G06K9/62, G06N3/04

CPC: B25J9/161, B25J9/1664, G06N3/045, G06F18/214, Y02T10/40

Inventor: 张伟楠, 沈键, 赵晗

Owner: SHANGHAI JIAO TONG UNIV
