Method for controlling intelligent equipment based on reinforcement learning strategy

A reinforcement learning technology for intelligent equipment, applied in the field of artificial intelligence control and reinforcement learning. It addresses training failure, lack of training robustness, and the inability to effectively control noise disturbance, thereby avoiding training failure and improving robustness.

Active Publication Date: 2021-06-15

北京云量数盟科技有限公司


AI Technical Summary
Problems solved by technology

Existing methods cannot effectively control the noise disturbance. Excessive disturbance prevents reinforcement learning from training the optimal strategy, resulting in training failure, while limiting the disturbance leads to less robust training.

Examples

Embodiment 1

[0085] This method is the first to support gradually increasing the difficulty of the noise disturbance. The process is carried out over multiple rounds, step by step, and is automated by means of a generative adversarial network. Experimental results on the Hopper robot control scenario in OpenAI Gym give preliminary evidence, in terms of robustness score, that the method is feasible.

[0086] Assume that the smart device involved in the method of the present invention is an unmanned vehicle and that the disturbance environment parameter is the friction coefficient of the road. The method then includes:

[0087] Obtain the environmental parameter set P = {0.95, 0.90, 0.88, 0.85, 0.76, 0.72} of the unmanned vehicle in the current disturbance environment, and label it to obtain the labeled environmental parameter set LP = {0.95:0, 0.90:0, 0.88:1, 0.85:1, 0.76:1, 0.72:0}.
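The labeled set in [0087] can be pictured as a mapping from friction coefficient to a binary label. The sketch below is illustrative only: the variable names and the `select_labeled` helper are hypothetical, and the labels are copied from the example rather than computed, because this excerpt does not state the rule that assigns them.

```python
# Friction coefficients from the example in [0087].
P = [0.95, 0.90, 0.88, 0.85, 0.76, 0.72]

# Labels copied verbatim from the example; the rule that assigns them
# is not given in this excerpt, so they are not computed here.
LP = {0.95: 0, 0.90: 0, 0.88: 1, 0.85: 1, 0.76: 1, 0.72: 0}


def select_labeled(labeled_params, label):
    """Return the parameters carrying the given label, in insertion order.

    Hypothetical helper for illustration; not part of the patent text.
    """
    return [p for p, tag in labeled_params.items() if tag == label]


if __name__ == "__main__":
    print(select_labeled(LP, 1))
```

Keeping the labels in a dictionary keyed by parameter value makes it easy to hand the whole labeled set to a later step as a single object.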

[0088] Input the labeled environmental parameter set LP into the pre-trained generative adversarial network to obtain a new environmental par...


Abstract

The invention belongs to the technical field of artificial intelligence control and reinforcement learning, and particularly relates to a method for controlling intelligent equipment based on a reinforcement learning strategy. The method comprises the following steps: obtaining an environment parameter set of the intelligent equipment in the current disturbance environment, taking the environment parameter set as the disturbance environment of the intelligent equipment, and labeling the environment parameter set to obtain a labeled environment parameter set; inputting the labeled environment parameter set into a pre-trained generative adversarial network to obtain a new environment parameter set; updating the current reinforcement learning strategy according to the new environment parameter set to obtain an updated reinforcement learning strategy adapted to the new environment parameter set, and inputting the updated reinforcement learning strategy to the intelligent equipment; and having the intelligent equipment execute the action corresponding to its current state according to the updated reinforcement learning strategy, thereby completing control of the intelligent equipment in the disturbance environment.
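The four steps in the abstract (label, generate, update, act) can be sketched as one control round. This is a minimal illustration under stated assumptions, not the patented implementation: `control_round` and every `toy_*` function are hypothetical names, the generator and the policy update are simple placeholders rather than a trained network or a real reinforcement learning algorithm, and the labels are copied from the Embodiment 1 example.

```python
from typing import Callable, Dict, List, Tuple

Policy = Callable[[float], float]      # maps a state to an action
LabeledParams = Dict[float, int]       # environment parameter -> label


def control_round(
    params: List[float],
    label_fn: Callable[[float], int],
    generate_fn: Callable[[LabeledParams], List[float]],
    update_fn: Callable[[Policy, List[float]], Policy],
    policy: Policy,
    state: float,
) -> Tuple[List[float], Policy, float]:
    """One pass through the four steps named in the abstract."""
    labeled = {p: label_fn(p) for p in params}   # 1. label the parameter set
    new_params = generate_fn(labeled)            # 2. generator proposes a new set
    new_policy = update_fn(policy, new_params)   # 3. adapt the policy to it
    action = new_policy(state)                   # 4. act in the current state
    return new_params, new_policy, action


# --- Toy stand-ins below: placeholders, not the patent's components. ---
EXAMPLE_LABELS = {0.95: 0, 0.90: 0, 0.88: 1, 0.85: 1, 0.76: 1, 0.72: 0}


def toy_generate(labeled: LabeledParams) -> List[float]:
    # Placeholder for the pre-trained generator: just keeps label-1 entries.
    return [p for p, tag in labeled.items() if tag == 1]


def toy_update(policy: Policy, new_params: List[float]) -> Policy:
    # Placeholder for the RL update: scales the old action by the
    # lowest friction coefficient in the new set.
    scale = min(new_params)
    return lambda state: scale * policy(state)


if __name__ == "__main__":
    new_params, new_policy, action = control_round(
        params=list(EXAMPLE_LABELS),
        label_fn=EXAMPLE_LABELS.__getitem__,
        generate_fn=toy_generate,
        update_fn=toy_update,
        policy=lambda state: state,   # identity policy as a starting point
        state=1.0,
    )
    print(new_params, action)
```

Passing the labeling, generation, and update steps in as callables keeps the round structure separate from the components, so a real generator or learner could be swapped in without changing `control_round`.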

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence control and reinforcement learning, and in particular relates to a method for controlling an intelligent device based on a reinforcement learning strategy.

Background

[0002] Reinforcement learning is one of the core technologies of artificial intelligence. Through continuous interaction with the application environment, it can learn the optimal strategy and provide intelligent decision support for intelligent devices such as robots and unmanned vehicles. Public materials show that AlphaGo was trained using reinforcement learning before its public matches. However, there are always differences between the training environment and the real environment, and the two cannot be made completely consistent because of noise. Therefore, how to reduce the impact on training due to the difference between the training en...


Application Information

Patent Type & Authority: Applications (China)

IPC(8): G05B13/04

CPC: G05B13/042

Inventor: 辛苗

Owner: 北京云量数盟科技有限公司
