Reinforcement learning apparatus, control apparatus, and reinforcement learning method

A reinforcement learning and control apparatus technology, applied in the fields of electric program control, program control, and instruments, that addresses impeded learning, learned motions that become extremely fast or extremely slow, and the growing difficulty of the trade-off problem.

Active Publication Date: 2014-11-11

ATR ADVANCED TELECOMM RES INST INT +1


AI Technical Summary

Benefits of technology

Enables robot motor learning with a simple reward function, achieving quick and stable obstacle avoidance and reusability of learning results, even when obstacle positions or shapes change, thus improving the efficiency and stability of the learning process.

Problems solved by technology

Conventional techniques, however, have a problem in that the reward function for a complex motion trajectory is often expressed by the sum of various terms, and a trade-off occurring between the terms impedes learning (this is called the “trade-off problem”).

If the ratio between the terms of the reward function is not set appropriately, the motion obtained through learning becomes extremely fast or extremely slow, resulting in undesirable motion trajectories.

This trade-off problem becomes more challenging if requirements such as obstacle avoidance, in addition to a reaching movement, are further imposed.

Too small a negative reward given upon contact with an obstacle results in collision of the robot arm with the obstacle, and too large a negative reward leads to a learning result in which the robot arm does not move from the starting point.

When the reward function has become too complex, the designer has to empirically adjust the balance between the elements, compromising the advantage of reinforcement learning.
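The trade-off described above can be made concrete with a toy calculation. The following Python sketch is illustrative only: the weights, the three candidate behaviors, and their effort and collision figures are assumptions, not values from the patent. It scores each behavior with a reward written as a sum of terms and shows how the preferred behavior changes with the collision penalty.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    """Hypothetical weights of a summed reward; not taken from the patent."""
    goal: float = 1.0       # positive reward for reaching the target
    energy: float = 0.01    # penalty per unit of squared control effort
    collision: float = 1.0  # penalty for contact with the obstacle


# Assumed candidate behaviors:
# name -> (probability of reaching the target, control effort, probability of collision)
BEHAVIORS = {
    "stay": (0.0, 0.0, 0.0),      # never leaves the starting point
    "straight": (1.0, 2.0, 1.0),  # cheapest path, always hits the obstacle
    "detour": (1.0, 5.0, 0.2),    # costlier path, occasional contact while exploring
}


def expected_reward(p_goal, effort, p_collision, w):
    """Expected value of a reward expressed as a sum of three terms."""
    return w.goal * p_goal - w.energy * effort ** 2 - w.collision * p_collision


def best_behavior(w):
    """Return the behavior with the highest expected summed reward."""
    return max(BEHAVIORS, key=lambda name: expected_reward(*BEHAVIORS[name], w))


if __name__ == "__main__":
    for collision_weight in (0.1, 2.0, 10.0):
        weights = RewardWeights(collision=collision_weight)
        print(collision_weight, best_behavior(weights))
```

With these assumed numbers, a collision weight of 0.1 favors the straight path through the obstacle, 2.0 favors the detour, and 10.0 favors not moving at all, which mirrors the two failure modes described above.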


Examples


Embodiment 1

[0038] This embodiment describes a reinforcement learning apparatus and the like that have the function of generating a virtual external force. It covers a reinforcement learning apparatus in which the reinforcement learner and the virtual external force generator that constitute the apparatus are separated, one capable of switching automatically from the virtual external force generator to the virtual external force approximator, one that can improve the reusability of the virtual external force generator, and one capable of performing model selection for the function approximator.
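The automatic switch from the virtual external force generator to the virtual external force approximator can be pictured with a small Python sketch. Everything concrete here is an assumption made for illustration: the excerpt does not say what the approximator is, what it takes as input, or what condition triggers the switch. The sketch uses a least-squares line as the approximator and switches once its worst error on the recorded samples falls below a tolerance.

```python
def fit_linear(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return 0.0, mean_y  # all inputs identical: fall back to a constant
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


class SwitchingForceSource:
    """Hypothetical wrapper: uses the generator until the approximator matches it."""

    def __init__(self, generator, tolerance=1e-3, min_samples=5):
        self.generator = generator
        self.tolerance = tolerance      # assumed switching threshold
        self.min_samples = min_samples  # assumed minimum data before fitting
        self.xs, self.ys = [], []
        self.model = None
        self.use_approximator = False

    def force(self, x):
        if self.use_approximator:
            slope, intercept = self.model
            return slope * x + intercept
        y = self.generator(x)
        self.xs.append(x)
        self.ys.append(y)
        if len(self.xs) >= self.min_samples:
            slope, intercept = fit_linear(self.xs, self.ys)
            worst = max(abs(slope * xi + intercept - yi)
                        for xi, yi in zip(self.xs, self.ys))
            if worst < self.tolerance:
                self.model = (slope, intercept)
                self.use_approximator = True
        return y
```

In this sketch the generator is any callable, for example `SwitchingForceSource(lambda x: 2.0 * x + 1.0)`. After five distinct inputs the fitted line reproduces it and later calls no longer invoke the generator. Choosing between several approximator families by comparing their errors would be one way to read the model selection mentioned above, but that is likewise an assumption.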

[0039] FIG. 1 is a schematic diagram illust...

Embodiment 2

[0113] In this embodiment, a description will be given of a reinforcement learning apparatus and the like in which the reinforcement learner and the virtual external force generator are not separated.

[0114] FIG. 9 is a block diagram showing a reinforcement learning system B according to this embodiment.

[0115] The reinforcement learning system B includes the control object 1 and a reinforcement learning apparatus 3. The difference between the reinforcement learning apparatus 3 and the reinforcement learning apparatus 2 lies in whether the reinforcement learner and the virtual external force generator are separated or not. In the reinforcement learning apparatus 3, the reinforcement learner and the virtual external force generator are not separated.

[0116] The reinforcement learning apparatus 3 includes the reward function storage unit 211, the first-type environment parameter obtaining unit 212, the control parameter value calculation unit 213, the control parameter value output unit 214,...


Abstract

It is possible to perform robot motor learning in a quick and stable manner using a reinforcement learning apparatus including: a first-type environment parameter obtaining unit that obtains a value of one or more first-type environment parameters; a control parameter value calculation unit that calculates a value of one or more control parameters maximizing a reward by using the value of the one or more first-type environment parameters; a control parameter value output unit that outputs the value of the one or more control parameters to the control object; a second-type environment parameter obtaining unit that obtains a value of one or more second-type environment parameters; a virtual external force calculation unit that calculates the virtual external force by using the value of the one or more second-type environment parameters; and a virtual external force output unit that outputs the virtual external force to the control object.
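The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the control object is a hypothetical 2-D point, the reward is simply the negative distance to the target, the control parameter calculation is a greedy one-step search standing in for a trained reinforcement learner, and the virtual external force is an assumed repulsive term that depends on the distance to an obstacle. All function names and default values are invented for the example.

```python
import math


def control_parameter(pos, target, step=0.1, n_directions=16):
    """Stand-in for the control parameter value calculation unit.

    Picks, among a fixed set of moves, the one that maximizes a simple
    reaching reward (negative distance to the target after the move).
    """
    best_u, best_reward = (0.0, 0.0), -math.inf
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        u = (step * math.cos(angle), step * math.sin(angle))
        reward = -math.dist((pos[0] + u[0], pos[1] + u[1]), target)
        if reward > best_reward:
            best_u, best_reward = u, reward
    return best_u


def virtual_external_force(pos, obstacle, radius=0.5, gain=0.05):
    """Stand-in for the virtual external force calculation unit.

    Assumed form: a repulsive force that is zero beyond `radius` and grows
    as the point approaches the obstacle.
    """
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= radius:
        return (0.0, 0.0)
    magnitude = gain * (1.0 / dist - 1.0 / radius)
    return (magnitude * dx / dist, magnitude * dy / dist)


def control_object_step(pos, u, force):
    """Hypothetical control object: receives both outputs and adds them."""
    return (pos[0] + u[0] + force[0], pos[1] + u[1] + force[1])


if __name__ == "__main__":
    pos, target, obstacle = (0.0, 0.0), (1.0, 0.0), (0.0, 0.25)
    u = control_parameter(pos, target)            # uses first-type parameters
    f = virtual_external_force(pos, obstacle)     # uses second-type parameters
    print(control_object_step(pos, u, f))
```

The point of the split is that the reward seen by the learner contains only the reaching term, while obstacle avoidance enters through the separately computed force, so no collision penalty has to be balanced against the other reward terms.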

Description

[0001] This application claims priority under 35 U.S.C. §119 to Japanese Patent Application No. 2011-074694, filed Mar. 30, 2011, which is incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates to a reinforcement learning apparatus and the like that perform robot motor learning.

[0004] 2. Description of Related Art

[0005] Reinforcement learning has been widely used as a robot motor learning technique because it can be implemented even if the dynamics of the control object or the environment are unknown and it autonomously performs learning by simply setting a reward function according to the task (see, for example, JP 2007-66242A).

[0006] Conventional techniques, however, have a problem in that the reward function for a complex motion trajectory is often expressed by the sum of various terms, and a trade-off occurring between the terms impedes learning (this is called the “trade-off problem”). For example, the reward function...


Application Information


Patent Type & Authority: Patents (United States)

IPC(8): G05B13/02, G06F19/00

CPC: G05B13/0265, Y10S901/03

Inventor: SUGIMOTO, NORIKAZU; UEDA, YUGO; HASEGAWA, TADAAKI; IBA, SOSHI; AKATSUKA, KOJI

Owner: ATR ADVANCED TELECOMM RES INST INT
