An unmanned ship path planning method based on a Q-learning neural network
A path planning and neural network technology, applied in the field of intelligent control of unmanned ships (USVs), which can solve problems such as path planning in unknown environments.
Examples
Embodiment 1
[0078] The unmanned ship path planning method based on a Q-learning neural network of this embodiment comprises the following steps:
[0079] a) Initialize the storage area D;
[0080] b) Initialize the Q network and the initial values of the state and action. The Q network contains the following elements: S, A, P_{s,a}, R, where S represents the set of system states the USV is in, A represents the set of actions the USV can take, P_{s,a} represents the system state transition probability, and R represents the reward function;
[0081] c) Randomly set the training target;
[0082] d) Randomly select action a_t, obtain the current reward r_t and the next state s_{t+1}, and store (s_t, a_t, r_t, s_{t+1}) in the storage area D;
[0083] e) Randomly sample a batch of data from the storage area D for training, that is, a ba...
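Steps a) through e) amount to a DQN-style loop with experience replay: act, store transitions in D, and train on random minibatches sampled from D. Below is a minimal Python sketch of that loop; the environment interface (env.reset/env.step), the state features, the three turn actions, and the linear Q function are all illustrative assumptions, since this excerpt does not fix a network architecture.

```python
import random
from collections import deque

import numpy as np

# Hypothetical problem sizes: 4 state features (e.g. USV pose, goal bearing,
# nearest obstacle) and 3 discrete actions (turn left, go straight, turn right).
STATE_DIM, N_ACTIONS = 4, 3
GAMMA, ALPHA, EPS = 0.95, 0.01, 0.1

# a) Initialize the storage area D (experience replay buffer).
D = deque(maxlen=10_000)

# b) Initialize the Q "network". A linear Q(s, a) = W[a] . s keeps the sketch
# dependency-free; the actual architecture is not specified in this excerpt.
W = np.zeros((N_ACTIONS, STATE_DIM))

def q_values(s):
    return W @ s  # one Q value per action for state vector s

def select_action(s):
    # d) epsilon-greedy selection: explore randomly with probability EPS.
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    return int(np.argmax(q_values(s)))

def train_on_batch(batch):
    # e) one SGD step on the TD error for each sampled transition.
    for s, a, r, s_next, done in batch:
        target = r if done else r + GAMMA * np.max(q_values(s_next))
        td_error = target - q_values(s)[a]
        W[a] += ALPHA * td_error * s  # gradient of the linear Q w.r.t. W[a]

def episode(env, batch_size=32):
    s = env.reset()  # c) the (hypothetical) env provides the random target
    done = False
    while not done:
        a = select_action(s)
        s_next, r, done = env.step(a)      # hypothetical USV simulator API
        D.append((s, a, r, s_next, done))  # d) store the transition in D
        if len(D) >= batch_size:           # e) sample a minibatch and train
            train_on_batch(random.sample(D, batch_size))
        s = s_next
```

Sampling uniformly from D rather than training on consecutive transitions breaks the temporal correlation between samples, which is the main purpose of the storage area in steps a), d), and e).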
Embodiment 2
[0088] The unmanned ship path planning method based on a Q-learning neural network of this embodiment builds on Embodiment 1; the traditional Q-learning algorithm is specifically as follows:
[0089] Q-learning describes the problem as a Markov Decision Process. The Markov decision process contains four elements: S, A, P_{s,a}, R. Among them, S represents the set of system states the USV is in, that is, the current state of the USV and of its environment, such as the size and position of obstacles; A represents the set of actions that the USV can take, that is, the turning directions of the USV; P_{s,a} represents the system model, that is, the system state transition probability: P(s'|s,a) describes the probability of the system reaching state s' after executing action a in the current state s; R represents the reward function, which is determined by the current state and the action taken. Think of Q-learning as an inc...
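For reference, the incremental tabular Q-learning update that this description builds on can be sketched in a few lines; the state discretization and the table sizes below are hypothetical.

```python
import numpy as np

N_STATES, N_ACTIONS = 100, 3  # hypothetical discretization of the USV state
GAMMA, ALPHA = 0.95, 0.1      # discount factor and learning rate

Q = np.zeros((N_STATES, N_ACTIONS))

def q_update(s, a, r, s_next):
    """One incremental Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]."""
    td_target = r + GAMMA * np.max(Q[s_next])
    Q[s, a] += ALPHA * (td_target - Q[s, a])
```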
Embodiment 3
[0108] The unmanned ship path planning method based on the Q-learning neural network of this embodiment is based on Embodiment 2. Since the future TD deviation values are unknown, the above update cannot be performed directly; however, they can be calculated incrementally by using traces. η_t(s,a) is defined as a characteristic function: it returns 1 when (s,a) occurs at time t, and 0 otherwise. For simplicity, ignoring the learning rate, a trace e_t(s,a) is defined for each (s,a):
[0109] e_0(s,a) = 0
[0110] e_t(s,a) = γλe_{t-1}(s,a) + η_t(s,a)
[0111] Then at time t the online update is
[0112] Q(s,a) = Q(s,a) + α[δ'_t η_t(s,a) + δ_t e_t(s,a)]   (8)
[0114] Among them, Q(s,a) is the value of performing action a in state s, α is the learning rate, η_t(s,a) is the characteristic function, e_t(s,a) is the trace, δ'_t represents the deviation value from past learning, and δ_t is the deviation value learned now, that is, the deviation δ' between the cumulative return R(s) and the current...
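A minimal sketch of the online update in eq. (8), assuming the accumulating-trace recursion e_t = γλe_{t-1} + η_t reconstructed above (λ is the assumed trace-decay factor); since the computation of the deviation values δ_t and δ'_t is truncated in this excerpt, they are taken here as inputs.

```python
import numpy as np

N_STATES, N_ACTIONS = 100, 3             # hypothetical table sizes, as above
GAMMA, ALPHA, LAMBDA = 0.95, 0.1, 0.9    # LAMBDA: assumed trace-decay factor

Q = np.zeros((N_STATES, N_ACTIONS))
e = np.zeros((N_STATES, N_ACTIONS))      # one trace e_t(s,a) per pair

def trace_update(Q, e, s, a, delta_now, delta_past):
    """One online step of eq. (8):
    Q(s,a) += alpha * [delta'_t * eta_t(s,a) + delta_t * e_t(s,a)]."""
    # Assumed trace recursion: e_t = gamma * lambda * e_{t-1} + eta_t,
    # where eta_t(s,a) is 1 only for the pair visited at time t.
    e *= GAMMA * LAMBDA
    e[s, a] += 1.0
    # delta'_t term: eta_t(s,a) picks out only the visited pair.
    Q[s, a] += ALPHA * delta_past
    # delta_t term: spread the current deviation along all traces.
    Q += ALPHA * delta_now * e
```

Both arrays are mutated in place, so a caller can invoke trace_update(Q, e, s, a, delta_now, delta_past) once per time step; the traces let a single observed deviation update every recently visited (s,a) pair instead of only the current one.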