AGV path planning method and system based on reinforcement learning

A path planning and reinforcement learning technology, applied in control/adjustment systems, two-dimensional position/channel control, vehicle position/route/altitude control, etc. Effects: low computing requirements and high generalizability.

Active Publication Date: 2021-10-08
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] In order to solve the problem that the existing AGV path planning method based on reinforcement learning consumes a lot of tim...

Examples

Embodiment 1

[0091] As shown in Figure 1, this embodiment proposes an AGV path planning method based on reinforcement learning. The method includes:

[0092] S1. Construct the AGV dynamic model, set the forward differential update step size, and determine the basic state update expression of the AGV based on the forward differential update step size and the AGV dynamic model;

[0093] Reinforcement learning depends on the interaction between the agent and the environment: through repeated trial and error, combined with a reasonable reward mechanism, the agent learns a policy for the current scene until training converges (generally meaning that the reward value obtained over the entire trajectory of each AGV interaction with the environment converges). However, directly collecting interaction data in the real environment would cause a large loss to the AGV. Therefore, a simulation model that can reflect the real AGV state change is ne...
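
The state update in S1 can be sketched as a single forward-difference (explicit Euler) step. The excerpt does not give the patent's actual dynamics model, so the unicycle-style model below, the `AGVState` fields, and the names `step`, `a_lin`, `a_ang`, and `dt` are all illustrative assumptions, not the claimed expressions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AGVState:
    # Hypothetical state layout; the patent's own state variables are not shown here.
    x: float      # position along x (m)
    y: float      # position along y (m)
    theta: float  # heading (rad)
    v: float      # linear speed (m/s)
    omega: float  # angular speed (rad/s)


def step(state: AGVState, a_lin: float, a_ang: float, dt: float) -> AGVState:
    """One forward-difference update: s_{t+1} = s_t + dt * f(s_t, a_t).

    Assumes a unicycle-style model driven by linear and angular acceleration.
    """
    return AGVState(
        x=state.x + dt * state.v * math.cos(state.theta),
        y=state.y + dt * state.v * math.sin(state.theta),
        theta=state.theta + dt * state.omega,
        v=state.v + dt * a_lin,
        omega=state.omega + dt * a_ang,
    )


# Usage: advance a forward-moving AGV by one step of 0.1 s.
s1 = step(AGVState(0.0, 0.0, 0.0, 1.0, 0.0), a_lin=0.5, a_ang=0.0, dt=0.1)
```

A smaller `dt` generally makes such a simulation track the continuous dynamics more closely, at the cost of more steps per trajectory.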

Embodiment 2

[0197] As shown in Figure 3, the application also proposes an AGV path planning system based on reinforcement learning, which is used to implement the AGV path planning method. The system includes:

[0198] The AGV dynamics construction module is used to construct the AGV dynamics model, set the forward differential update step size, and determine the basic state update expression of the AGV based on the forward differential update step size and the AGV dynamics model;

[0199] The trajectory planning space design module takes the AGV as the agent and the environmental information perceived by the AGV as the state information, and designs the state space with the destination position and obstacle position taken into account, as well as the continuous action space and the multiple reward mechanism;

[0200] The Markov process modeling module, based on the AGV dynamic model and the basic state update expression of the AGV, combines the state space, continuous action space and mult...
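
To make the state space and multiple reward mechanism described by these modules more concrete, here is a minimal sketch. The excerpt does not disclose the actual state layout or reward terms, so the function names `observation` and `reward`, the single circular obstacle, and every constant (radii, bonus, penalties) are hypothetical choices for illustration only.

```python
import math


def observation(agv_xy, heading, goal_xy, obstacle_xy):
    """State vector: heading plus goal and obstacle positions relative to the AGV.

    Using relative positions is an assumption; it lets the same policy be
    queried with different start points, targets, and obstacle positions.
    """
    ax, ay = agv_xy
    return [
        heading,
        goal_xy[0] - ax, goal_xy[1] - ay,
        obstacle_xy[0] - ax, obstacle_xy[1] - ay,
    ]


def reward(prev_xy, new_xy, goal_xy, obstacle_xy,
           goal_radius=0.2, obstacle_radius=0.5):
    """Combine several reward terms; returns (reward, episode_done)."""
    if math.dist(new_xy, obstacle_xy) < obstacle_radius:
        return -10.0, True   # collision penalty ends the episode
    d_new = math.dist(new_xy, goal_xy)
    if d_new < goal_radius:
        return 10.0, True    # goal bonus ends the episode
    # Dense term: progress toward the goal, minus a small per-step time cost.
    d_prev = math.dist(prev_xy, goal_xy)
    return (d_prev - d_new) - 0.01, False
```

Together with a state update function and a continuous action (for example, accelerations), these two functions supply the transition inputs and reward that a Markov decision process needs.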

Abstract

The invention provides an AGV path planning method and system based on reinforcement learning, aiming to solve the problem that existing reinforcement-learning-based AGV path planning methods consume a large amount of time and computing power. In the method, an AGV dynamics model is constructed. The AGV serves as the agent, and the environment information it senses while driving serves as the state information; a state space is designed with the destination position and obstacle position taken into account, together with a continuous action space and a multiple reward mechanism. Markov process modeling of path planning is completed on the basis of the state space, the continuous action space, and the multiple reward mechanism. Because arbitrary starting points, target points, and obstacles at arbitrary positions can be specified in the state space, generalization is high. An Actor-Critic framework is introduced for policy learning and training; online operation avoids a large computational load, so the computing requirements are low and real-time decision control of the AGV for any target and obstacle can be realized.
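
As a rough illustration of the Actor-Critic idea named in the abstract, the sketch below performs a one-step actor-critic update with linear function approximation and a Gaussian policy over a scalar continuous action. The excerpt does not say which Actor-Critic variant or network structure the patent uses, so this textbook-style update, the names `act` and `actor_critic_update`, and all hyperparameters are assumptions.

```python
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def act(theta, s, sigma=0.5, rng=random):
    """Actor: sample a continuous action from N(theta . s, sigma^2)."""
    return rng.gauss(dot(theta, s), sigma)


def actor_critic_update(theta, w, s, a, r, s_next, done,
                        sigma=0.5, gamma=0.99, lr_actor=1e-3, lr_critic=1e-2):
    """One-step actor-critic update.

    Critic: V(s) = w . s, moved along the one-step TD error.
    Actor: Gaussian policy mean theta . s, moved along
           td_error * grad log pi(a | s), where
           grad log pi = (a - mean) / sigma^2 * s.
    """
    v_next = 0.0 if done else dot(w, s_next)
    td_error = r + gamma * v_next - dot(w, s)
    scale = (a - dot(theta, s)) / sigma ** 2
    new_theta = [t + lr_actor * td_error * scale * si for t, si in zip(theta, s)]
    new_w = [wi + lr_critic * td_error * si for wi, si in zip(w, s)]
    return new_theta, new_w, td_error
```

In a setup like this, training is the expensive part; once the parameters are learned, choosing an action online needs only the cheap evaluation inside `act`, which is consistent with the low online computation the abstract describes.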

Description

Technical field

[0001] The present invention relates to the technical field of AGV path planning, and more specifically, to an AGV path planning method and system based on reinforcement learning.

Background art

[0002] An Automated Guided Vehicle (AGV) is a transport vehicle equipped with an automatic guidance device, such as an electromagnetic or optical one, that can travel along a prescribed guide path and provides safety protection and various transfer functions. In industrial applications, it is a driverless truck that uses a rechargeable battery as its power source.

[0003] AGVs can be roughly divided into three types according to control mode and degree of autonomy: remote-controlled, semi-autonomous, and autonomous. Multi-track navigation is the earliest path planning method adopted for AGVs, and most AGV path planning still uses it at present. AGV in traditional applications determines the tra...

Application Information

IPC(8): G05D1/02
CPC: G05D1/0221, G05D2201/0216
Inventor: 吴宗泽, 郭海森, 任志刚, 赖家伦, 王界兵
Owner: GUANGDONG UNIV OF TECH