Robot Plume Tracking Method Based on Reinforcement Learning in Continuous State Behavior Domain

A continuous-state reinforcement learning technology, applied in neural learning methods, instruments, manipulators, etc., that can solve problems such as high cost, inability to adapt to the environment, and insufficient consideration
CN107729953B · Active · Publication Date: 2019-09-27 · TSINGHUA UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: TSINGHUA UNIV
Publication Date: 2019-09-27


Abstract

The invention proposes a robot plume tracking method based on reinforcement learning over a continuous state and action (behavior) domain, belonging to the field of underwater robot path planning. The method trains the path planning of an underwater robot that searches for hydrothermal vents by tracking their plumes. In each plume-tracking run, the robot generates a state vector and feeds it to the current decision-making neural network; the network outputs the robot's heading at that moment, and the robot moves along that heading at constant speed. After moving for a period of time, the robot updates the state vector for the new moment and checks whether the current run meets the termination condition. If it does, the run ends and the robot is given a new initial position; if not, the robot continues forward at the next moment. Throughout this process, a reinforcement learning algorithm updates the decision-making neural network at each moment until the algorithm converges. The invention learns quickly and converges well, can improve the flexibility with which the robot tracks plumes to hydrothermal vents, and reduces the search cost.
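The episode loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented procedure: the state vector contents, the termination rule (reaching an assumed vent radius or exhausting a step budget), all constants, and the names `make_state`, `step`, `terminated`, `homing_policy`, and `run_episode` are hypothetical. The `policy` callable stands in for the decision-making neural network, and `update` stands in for whatever reinforcement learning update is applied at each moment.

```python
import math
import random

# All constants below are assumptions made for illustration only.
SPEED = 1.0         # constant robot speed
DT = 1.0            # duration of one movement step
VENT = (0.0, 0.0)   # hypothetical vent location, used only to end a run
FOUND_RADIUS = 2.0  # within this distance the run counts as finished
MAX_STEPS = 200     # step budget for a single tracking run


def make_state(pos):
    """Hypothetical state vector: here just the robot position."""
    return [pos[0], pos[1]]


def step(pos, heading):
    """Move at constant speed along `heading` (radians) for one time step."""
    return (pos[0] + SPEED * DT * math.cos(heading),
            pos[1] + SPEED * DT * math.sin(heading))


def terminated(pos, n_steps):
    """Assumed termination condition: vent reached or step budget used up."""
    return math.dist(pos, VENT) <= FOUND_RADIUS or n_steps >= MAX_STEPS


def homing_policy(state):
    """Stand-in for the neural network: head straight toward VENT."""
    return math.atan2(VENT[1] - state[1], VENT[0] - state[0])


def run_episode(policy, update, rng):
    """Run one tracking episode and return (final position, steps taken)."""
    # Each run starts from a freshly generated initial position.
    pos = (rng.uniform(-20.0, 20.0), rng.uniform(-20.0, 20.0))
    state = make_state(pos)
    n_steps = 0
    while not terminated(pos, n_steps):
        heading = policy(state)          # network outputs the heading
        pos = step(pos, heading)         # constant-speed move
        next_state = make_state(pos)     # state vector at the new moment
        n_steps += 1
        done = terminated(pos, n_steps)
        update(state, heading, next_state, done)  # learning update hook
        state = next_state
    return pos, n_steps
```

In a training setup, `run_episode` would be called repeatedly until the learning algorithm converges, with `update` adjusting the parameters that `policy` reads; here `update` can simply be a function that records the transitions it receives.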

Description

Technical field

[0001] The invention belongs to the field of underwater robot path planning, and in particular relates to a robot plume tracking method based on reinforcement learning over a continuous state and action (behavior) domain.

Background art

[0002] Deep-sea hydrothermal activity and the life associated with it are among the major discoveries of 20th-century marine science. Deep-sea hydrothermal vents are closely related to seafloor spreading and polymetallic sulfide mineralization, and they bear on cutting-edge scientific questions such as the evolution of biological communities in hydrothermal environments and the impact of hydrothermal activity on global climate change. The study of deep-sea hydrothermal fluids has therefore become a hot topic in ocean research.

[0003] In order to further study deep-sea hydrothermal fluids, it is necessary to explore the location of unknown hydrothermal vents in the deep sea. Researchers have found that deep-sea hydrothermal vents will emit hydrothe...
