A method and system for AGV path planning based on reinforcement learning

Path planning and reinforcement learning technology, applied in control/regulating systems, two-dimensional position/course control, vehicle position/route/altitude control, etc. It addresses the high time and computing-power cost of existing approaches and achieves low computing-power requirements and high generalizability.

Active Publication Date: 2022-04-12
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] To solve the problem that existing reinforcement-learning-based AGV path planning methods consume a great deal of time and computing power, the present invention proposes an AGV path planning method and system that is easy to implement and low in cost.

Examples

Embodiment 1

[0091] As shown in Figure 1, this embodiment proposes an AGV path planning method based on reinforcement learning. Figure 1 is a schematic flowchart of the method, which includes:

[0092] S1. Build the AGV dynamics model, set the forward-difference update step size, and determine the basic state update expression of the AGV from the step size and the AGV dynamics model;

[0093] Reinforcement learning relies on interaction between the agent and the environment: through repeated trial and error, combined with a reasonable reward mechanism, the agent learns a policy for the current scene until training converges (generally meaning that the reward obtained over the entire trajectory of each AGV interaction with the environment converges). However, collecting interaction data directly in the real environment causes considerable wear on the AGV. Therefore, a simulation model that can reflect the stat...
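The state update in step S1 can be sketched as follows. This is only an illustration: the excerpt does not show the patent's actual dynamics model, so the sketch assumes a simple unicycle (kinematic) model with state (x, y, theta) and control inputs v (linear speed) and omega (turning rate). The names AGVState and step are hypothetical. The forward-difference update itself has the general form s[k+1] = s[k] + dt * f(s[k], u[k]), where dt is the update step size.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AGVState:
    """Assumed AGV state; the patent's own state vector may differ."""
    x: float      # position along the x axis (m)
    y: float      # position along the y axis (m)
    theta: float  # heading angle (rad)


def step(state: AGVState, v: float, omega: float, dt: float) -> AGVState:
    """Forward-difference update: s[k+1] = s[k] + dt * f(s[k], u[k]).

    f is the assumed unicycle model:
        dx/dt = v * cos(theta), dy/dt = v * sin(theta), dtheta/dt = omega
    """
    return AGVState(
        x=state.x + dt * v * math.cos(state.theta),
        y=state.y + dt * v * math.sin(state.theta),
        theta=state.theta + dt * omega,
    )


# One simulated step of 0.1 s, starting at the origin and facing along x.
s1 = step(AGVState(0.0, 0.0, 0.0), v=1.0, omega=0.5, dt=0.1)
```

A smaller dt gives a more accurate simulation at the cost of more steps per trajectory, which is presumably why the step size is set explicitly in S1.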

Embodiment 2

[0197] As shown in Figure 3, the present application also proposes an AGV path planning system based on reinforcement learning, which is used to implement the AGV path planning method. The system includes:

[0198] The AGV dynamics building module, which builds the AGV dynamics model, sets the forward-difference update step size, and determines the basic state update expression of the AGV from the step size and the AGV dynamics model;

[0199] The trajectory planning space design module, which takes the AGV as the agent and the environmental information perceived by the AGV as the state information, designs the state space with the destination and obstacle positions taken into account, and designs the continuous action space and multiple reward mechanisms;

[0200] The Markov process modeling module, which, according to the AGV dynamics model and the AGV's basic state update expression, combined with the state space, continuo...
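The state space and reward design handled by the trajectory planning space design module can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the excerpt does not give the patent's state vector, reward terms, or weights. The sketch uses distance and bearing to the goal and to a single obstacle as the observation, and combines a goal reward, a collision penalty, a progress term, and a small per-step penalty. The function names and all numeric constants are hypothetical.

```python
import math


def observation(agv_xy, heading, goal_xy, obstacle_xy):
    """Relative observation: (distance, bearing) to the goal and to one obstacle.

    Using relative quantities lets the same policy handle arbitrary start,
    goal, and obstacle positions.
    """
    def rel(target):
        dx, dy = target[0] - agv_xy[0], target[1] - agv_xy[1]
        dist = math.hypot(dx, dy)
        bearing = math.atan2(dy, dx) - heading
        # Wrap the bearing into [-pi, pi).
        bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
        return dist, bearing

    return (*rel(goal_xy), *rel(obstacle_xy))


def reward(prev_goal_dist, goal_dist, obstacle_dist,
           goal_radius=0.2, safe_radius=0.5):
    """Return (reward, episode_done). All weights are illustrative."""
    if goal_dist <= goal_radius:        # reached the destination
        return 100.0, True
    if obstacle_dist <= safe_radius:    # too close to the obstacle
        return -100.0, True
    progress = prev_goal_dist - goal_dist   # > 0 when moving toward the goal
    return 10.0 * progress - 0.01, False    # progress term plus step penalty


obs = observation((0.0, 0.0), 0.0, (3.0, 4.0), (0.0, 1.0))
r, done = reward(prev_goal_dist=5.0, goal_dist=4.5, obstacle_dist=2.0)
```

The continuous action space would then be a bounded pair such as (v, omega), again an assumption rather than something stated in the excerpt.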

Abstract

The present invention proposes an AGV path planning method and system based on reinforcement learning, which solves the problem that existing reinforcement-learning-based AGV path planning methods consume a great deal of time and computing power. The AGV is taken as the agent and the environment information perceived while driving as the state information; the state space is designed with the destination and obstacle positions taken into account, together with a continuous action space and multiple reward mechanisms. Combining the state space, continuous action space and multiple reward mechanisms completes the Markov process modeling of path planning, in which the state space accommodates any starting point, any target point, and obstacles at any position, giving high generalizability. The Actor-Critic framework is then introduced for policy learning and training. Online operation avoids heavy computation and requires little computing power, realizing real-time decision-making control of the AGV for arbitrary targets and obstacles.
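For readers unfamiliar with the Actor-Critic framework, the following minimal sketch shows one generic update step. It is not the patent's training procedure, which the abstract does not specify. It assumes a one-step TD actor-critic with a linear critic and a Gaussian policy whose mean is linear in the state and whose standard deviation is fixed. The function name and all hyperparameters are hypothetical.

```python
import numpy as np


def actor_critic_step(theta, w, s, a, r, s_next, done,
                      sigma=0.5, gamma=0.99, lr_actor=1e-3, lr_critic=1e-2):
    """One-step TD actor-critic update with linear function approximation.

    theta: (action_dim, state_dim) weights of the Gaussian policy mean.
    w:     (state_dim,) weights of the critic, V(s) = w @ s.
    Returns the updated (theta, w) and the TD error.
    """
    v = w @ s
    v_next = 0.0 if done else w @ s_next
    td_error = r + gamma * v_next - v        # critic's one-step advantage estimate

    # Critic: semi-gradient TD(0) update of the value weights.
    w_new = w + lr_critic * td_error * s

    # Actor: policy-gradient step, using the gradient of log N(a; theta @ s, sigma^2)
    # with respect to theta, scaled by the TD error.
    mean = theta @ s
    grad_log_pi = np.outer((a - mean) / sigma**2, s)
    theta_new = theta + lr_actor * td_error * grad_log_pi
    return theta_new, w_new, td_error


theta0 = np.zeros((2, 3))
w0 = np.zeros(3)
theta1, w1, delta = actor_critic_step(
    theta0, w0,
    s=np.array([1.0, 0.0, 0.0]), a=np.array([1.0, 0.0]), r=1.0,
    s_next=np.array([0.0, 1.0, 0.0]), done=False,
)
```

After training, only the actor is needed online: computing theta @ s (or, with a neural network actor, one forward pass) yields the action, which is consistent with the low online computing cost the abstract describes.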

Description

Technical field

[0001] The present invention relates to the technical field of AGV path planning, and more particularly, to an AGV path planning method and system based on reinforcement learning.

Background art

[0002] An Automated Guided Vehicle (AGV) is a transport vehicle equipped with an automatic guidance device, such as an electromagnetic or optical one, that can travel along a prescribed guide path and has safety protection and various transfer functions. In industrial applications it requires no driver and uses a rechargeable battery as its power source.

[0003] AGVs can be roughly divided into three types according to their control method and degree of autonomy. In traditional applications, the AGV determines its travel route by identifying a magnetic track laid on the ground, but this method is limited by the inflexibility of the magnetic track, and extending the path is relatively complicated; vision + two-dimensional code...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G05D1/02
CPC: G05D1/0221, G05D2201/0216
Inventor: 吴宗泽, 郭海森, 任志刚, 赖家伦, 王界兵
Owner: GUANGDONG UNIV OF TECH