Q-learning initialization method for mobile robot path planning

A mobile robot technology and initialization method, applied in the field of Q-learning initialization for mobile robot path planning. It addresses problems such as the inability to objectively reflect the robot's environment state and algorithm instability, and achieves the effects of improved learning ability, faster convergence, and higher learning efficiency.

Status: Inactive | Publication Date: 2012-12-12
Shandong University (Weihai)
Cites: 3 · Cited by: 93

AI Technical Summary

Problems solved by technology

The fuzzy rules established by this method are all artificially set according to the environmental information, which cannot objectively reflect the actual environment state of the robot.




Embodiment Construction

[0054] The present invention will be further described below in conjunction with the accompanying drawings and examples.

[0055] The invention initializes the robot's reinforcement learning on the basis of an artificial potential field: the robot's working environment is virtualized into an artificial potential field, which is constructed using prior knowledge so that the potential value of the obstacle area is zero and the target point has the global maximum potential value. At this point, the potential value of each state in the artificial potential field represents the maximum cumulative return that the corresponding state can obtain by following the optimal strategy. The initial Q value is then defined as the immediate reward of the current state plus the maximum discounted cumulative reward of the subsequent state. Initializing the Q value in this way makes the learning process converge faster and the convergence process more stable.
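To make the idea concrete, the following is a minimal sketch in Python, not the patent's reference implementation. The constants GAMMA and GOAL_REWARD, the distance-based potential formula, and the helper names build_potential_field and init_q_table are assumptions chosen only for illustration: the goal cell receives the global maximum potential, obstacle cells receive zero, and each Q value is initialized to the immediate reward plus the discounted potential of the successor state.

# Hedged sketch of potential-field-based Q initialization on a small grid world.
# All names and constants here are hypothetical, not taken from the patent.
from collections import deque

GAMMA = 0.9          # discount factor (assumed value)
GOAL_REWARD = 1.0    # immediate reward for reaching the goal (assumed value)

# 0 = free cell, 1 = obstacle
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
GOAL = (0, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def in_bounds(s):
    r, c = s
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0])

def is_obstacle(s):
    r, c = s
    return GRID[r][c] == 1

def build_potential_field():
    """Breadth-first search from the goal; potential decays with distance,
    so the goal holds the global maximum and obstacles stay at zero."""
    dist = {GOAL: 0}
    queue = deque([GOAL])
    while queue:
        s = queue.popleft()
        for a in ACTIONS:
            n = (s[0] + a[0], s[1] + a[1])
            if in_bounds(n) and not is_obstacle(n) and n not in dist:
                dist[n] = dist[s] + 1
                queue.append(n)
    phi = {}
    for r in range(len(GRID)):
        for c in range(len(GRID[0])):
            s = (r, c)
            if is_obstacle(s) or s not in dist:
                phi[s] = 0.0
            else:
                phi[s] = (GAMMA ** dist[s]) * GOAL_REWARD
    return phi

def init_q_table(phi):
    """Q0(s, a) = immediate reward + gamma * potential of the successor state."""
    q = {}
    for s in phi:
        if is_obstacle(s):
            continue
        for a in ACTIONS:
            n = (s[0] + a[0], s[1] + a[1])
            if not in_bounds(n) or is_obstacle(n):
                n = s                      # blocked move: stay in place
            r = GOAL_REWARD if n == GOAL else 0.0
            q[(s, a)] = r + GAMMA * phi[n]
    return q

if __name__ == "__main__":
    phi = build_potential_field()
    q0 = init_q_table(phi)
    print("potential at goal:", phi[GOAL])
    print("Q0 for state (2, 3) moving up:", q0[((2, 3), (-1, 0))])

Under these assumptions, states closer to the goal start with larger Q values, so the learner is biased toward the goal from the very first episode, which is the effect the paragraph above describes.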



Abstract

The invention discloses a reinforcement learning initialization method for a mobile robot based on an artificial potential field, and relates to a Q-learning initialization method for mobile robot path planning. The working environment of the robot is virtualized into an artificial potential field, and the potential values of all states are determined using prior knowledge, so that the potential value of an obstacle area is zero and the target point has the global maximum potential value; at that point, the potential value of each state in the artificial potential field represents the maximum cumulative return obtainable by following the optimal strategy from the corresponding state. A Q initial value is then defined as the sum of the immediate return of the current state and the maximum discounted cumulative return of the subsequent state. Known environmental information is mapped to initial Q-function values through the artificial potential field, integrating prior knowledge into the robot's learning system and thereby improving the robot's learning ability in the initial stage of reinforcement learning. Compared with the traditional Q-learning algorithm, the method effectively improves learning efficiency in the initial stage, speeds up convergence of the algorithm, and makes the convergence process more stable.
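Written as a formula, one reading of this initialization rule is given below; the symbols are my own notation rather than the patent's, with \Phi the potential value of a state, \delta(s,a) the successor state reached from s by action a, r(s,a) the immediate return, and \gamma the discount factor:

    Q_0(s,a) = r(s,a) + \gamma \, \Phi(\delta(s,a)), \qquad \Phi(s_{goal}) = \max_{s} \Phi(s), \qquad \Phi(s_{obstacle}) = 0

Under this reading, every state-action pair carries a prior-knowledge estimate of its optimal return before the first learning update is made.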

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular to a Q-learning initialization method for path planning of a mobile robot.

Background technique

[0002] With the continuous expansion of the application field of robots, the tasks faced by robots are becoming more and more complex. Although researchers can in many cases pre-program the repetitive behaviors a robot may perform, the design changes needed to achieve the overall desired behavior make it increasingly difficult for designers to reasonably predict all of the robot's behaviors in advance. Therefore, an autonomous robot capable of perceiving its environment must be able to learn new behaviors online by interacting with the environment, so that it can choose the optimal action to achieve the goal of a specific task.

[0003] Reinforcement learning uses a trial-and-error method similar to human thinking to discover the optimal ...
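For reference, the "traditional Q-learning algorithm" that the abstract compares against is the standard tabular method (textbook form, not specific to this patent), which repeatedly applies the update

    Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]

with learning rate \alpha, usually starting from an all-zero Q table. The method described in this document changes only the starting point of that iteration, not the update rule itself.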


Application Information

IPC(8): G05D1/02
Inventors: 宋勇, 李贻斌, 刘冰, 王小利, 荣学文
Owner: Shandong University (Weihai)