Parallel learning automaton optimization method based on partial mean value fusion

A learning automaton and fusion algorithm technique, applied in the field of information processing, that addresses problems such as poor robustness and difficulty in scaling.

Inactive Publication Date: 2018-11-02
SHANGHAI JIAO TONG UNIV +1

AI Technical Summary

Problems solved by technology

[0003] The prior art cannot be applied in noisy environments, has poor robustness, considers only the performance of a single automaton, and is difficult to scale. To address these defects, the present invention proposes a parallel learning automaton optimization method based on partial mean fusion. In the initial learning stage, each automaton retains its own learning direction, and that direction is continuously corrected during the learning process.



Examples


Embodiment Construction

[0027] This embodiment uses two environments, each with 10 actions. The reward probabilities of the first environment are E_A: {0.70, 0.50, 0.30, 0.20, 0.40, 0.50, 0.40, 0.30, 0.50, 0.20}, and those of the second environment are E_B: {0.10, 0.45, 0.84, 0.76, 0.20, 0.40, 0.60, 0.70, 0.50, 0.30}. The learning automata use two of the most typical learning rules, the DP_RI algorithm and the DGPA algorithm. In the following examples, the invention is implemented with both learning rules in both environments, giving four systems in total.
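The two environments of paragraph [0027] can be modelled as stationary Bernoulli reward environments. The sketch below is only an illustration: the class name `BernoulliEnvironment` and its methods are hypothetical and do not come from the patent; only the two probability lists are taken from the text.

```python
import random

# Reward probabilities copied from paragraph [0027].
E_A = [0.70, 0.50, 0.30, 0.20, 0.40, 0.50, 0.40, 0.30, 0.50, 0.20]
E_B = [0.10, 0.45, 0.84, 0.76, 0.20, 0.40, 0.60, 0.70, 0.50, 0.30]


class BernoulliEnvironment:
    """Hypothetical helper: action i is rewarded with probability probs[i]."""

    def __init__(self, probs, seed=None):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def respond(self, action):
        """Return 1 (reward) or 0 (penalty) for the chosen action index."""
        return 1 if self.rng.random() < self.probs[action] else 0

    def optimal_action(self):
        """Zero-based index of the action with the highest reward probability."""
        return max(range(len(self.probs)), key=self.probs.__getitem__)
```

Under this model the optimal action is the first one in E_A (reward probability 0.70) and the third one in E_B (reward probability 0.84), which is the action a learning automaton is expected to converge to.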

[0028] A) When the learning algorithm is the DP_RI algorithm, implementing the present invention comprises the following steps:

[0029] Initialization: set the input parameters of the algorithm and initialize the learning automata. Specifically, set the parallel scale N and the convergence threshold of the learning automata, and set in turn: n is the resolution...
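The step list is cut off at this point, so the following is not a reconstruction of the patent's procedure. It is a minimal sketch of a single automaton using the discretized pursuit reward-inaction (DP_RI) rule as it is commonly described in the learning automata literature, with step size 1/(r·n) for r actions and resolution n. The class name `DPRIAutomaton`, its attributes, and the choice to start all reward estimates at zero (instead of a pre-sampling phase) are assumptions made for illustration.

```python
import random


class DPRIAutomaton:
    """Hypothetical single learning automaton with a DP_RI-style update."""

    def __init__(self, num_actions, resolution, seed=None):
        self.r = num_actions
        # Smallest probability step: 1 / (number of actions * resolution).
        self.delta = 1.0 / (num_actions * resolution)
        self.p = [1.0 / num_actions] * num_actions  # action probability vector
        self.rewards = [0] * num_actions            # times each action was rewarded
        self.pulls = [0] * num_actions              # times each action was chosen
        self.rng = random.Random(seed)

    def choose(self):
        """Sample an action index according to the probability vector."""
        return self.rng.choices(range(self.r), weights=self.p)[0]

    def estimate(self, i):
        """Running estimate of action i's reward probability (0 if untried)."""
        return self.rewards[i] / self.pulls[i] if self.pulls[i] else 0.0

    def update(self, action, reward):
        """Update estimates always; update probabilities only on reward."""
        self.pulls[action] += 1
        self.rewards[action] += reward
        if reward != 1:
            return  # inaction on penalty
        best = max(range(self.r), key=self.estimate)
        for j in range(self.r):
            if j != best:
                self.p[j] = max(self.p[j] - self.delta, 0.0)
        # The currently best-estimated action receives the remaining mass.
        self.p[best] = 1.0 - sum(self.p[j] for j in range(self.r) if j != best)
```

Because the vector moves only in steps of `delta` and only toward the action with the highest reward estimate, a larger resolution gives slower but more cautious convergence.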



Abstract

The invention discloses a parallel learning automaton optimization method based on partial mean value fusion. The learning automata first interact with the environment independently and in a distributed manner. The probability obtained by a mean value fusion machine and the original probability are then combined by weighted average, and the weighted average is assigned to all learning automata for the next cycle. The mean value is therefore not assigned directly to the probability vector; instead, it is assigned to the probability vector with a certain probability. With this method, each automaton keeps its own learning direction in the initial learning stage while being continuously corrected during the learning process.
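One possible reading of the fusion step in the abstract is sketched below. It assumes that each automaton's probability vector is blended with the mean over all automata, using a single blending weight. The function names and the `weight` parameter are hypothetical, and the patent may define the weighting differently.

```python
def mean_vector(prob_vectors):
    """Component-wise mean of the automata's action probability vectors."""
    count = len(prob_vectors)
    size = len(prob_vectors[0])
    return [sum(p[j] for p in prob_vectors) / count for j in range(size)]


def partial_mean_fusion(prob_vectors, weight):
    """Blend each automaton's vector with the ensemble mean.

    weight = 0 leaves every automaton unchanged; weight = 1 replaces every
    vector with the mean (full mean fusion). Values in between keep part of
    each automaton's own learning direction. This is an assumed reading.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    mean = mean_vector(prob_vectors)
    return [
        [(1.0 - weight) * pj + weight * mj for pj, mj in zip(p, mean)]
        for p in prob_vectors
    ]
```

Since every output is a convex combination of probability vectors, each fused vector still sums to 1, so it can be handed back to its automaton for the next cycle without renormalization.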

Description

Technical field

[0001] The invention relates to a technology in the field of information processing, in particular to a parallel learning automaton optimization method based on partial mean fusion.

Background

[0002] A learning automaton is an autonomous system that adjusts its own decision-making behavior through interaction with the environment, and belongs to the field of reinforcement learning. As the room for improving the speed of a single learning automaton shrinks, theoretical research needs to approach the problem of speeding up learning automata from other perspectives. Structured learning automata, which combine simple learning automata according to a certain structure, have achieved some success in speeding up learning; the most typical examples are the hierarchical and parallel processing methods. The hierarchical learning machine model divides a large-scale action set into small-scale action sets, which are processed by a hie...


Application Information

IPC(8): G06N99/00
Inventor: 李生红, 马颖华, 黄德双, 谢文丹, 江文, 王伊凡, 郭颖, 葛昊
Owner SHANGHAI JIAO TONG UNIV