Neural network and Q learning combined estimation method under non-complete information

A technique combining a neural network with Q learning, applied in the field of incomplete-information machine games. It addresses the problem that players in complex incomplete-information machine games cannot obtain complete and reliable information about the game situation.

Inactive Publication Date: 2017-08-11
HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
Cites: 0; Cited by: 65

AI Technical Summary

Problems solved by technology

[0002] The characteristic of incomplete information machine games is that the players cannot obtain all a



Examples


Embodiment Construction

[0139] The present invention will be further described below in conjunction with the drawings.

[0140] The present invention applies an improved Q learning algorithm to the valuation function of incomplete-information machine games and implements two computer agent systems, one for Texas Hold'em and one for Doudizhu. These agents consider not only the state information preceding the current state but also predict what may happen after it. Their reasoning is therefore closer to that of human players, and they can choose more reasonable strategies than agents built on a traditional valuation function.
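As a point of reference for the Q learning idea described above, the following is a minimal sketch of the standard one-step tabular Q learning update, not the improved algorithm of the invention, whose details are not given in this excerpt. The function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One standard Q learning step (textbook form, hypothetical helper).

    Moves Q(state, action) toward reward + gamma * max_a' Q(next_state, a').
    """
    # Value of the best action available in the next state (0.0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Illustrative labels only; a real agent would encode cards and betting history.
q = defaultdict(float)
q[("s1", "call")] = 1.0
value = q_update(q, "s0", "raise", reward=0.5,
                 next_state="s1", next_actions=["call", "fold"])
```

The learned table `q` plays the role of a valuation function here: the agent can rank candidate actions in a state by their Q values, which already fold in an estimate of what happens after the current move.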

[0141] State confusion arises in incomplete-information games when two observed game states are identical although the actual game states differ. To address this problem, the sequence of consecutive partial observations is combined with eligibility traces to solve ...
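One way to picture the combination described above is sketched below, under stated assumptions: the state key is the tuple of the last `k` observations, so two situations that look identical now but were reached differently get different keys, and a textbook accumulating eligibility trace spreads each temporal-difference error back over recently visited keys. The helper names, the window size `k`, and the parameters `alpha`, `gamma`, and `lam` are hypothetical, and the exact formulation used by the invention is not visible in this excerpt.

```python
from collections import defaultdict

def history_key(observations, k=3):
    """Use the last k observations, in order, as the state key (assumption)."""
    return tuple(observations[-k:])

def trace_update(q, traces, key, action, td_error,
                 alpha=0.1, gamma=0.9, lam=0.8):
    """Accumulating-trace update: credit every recently visited pair."""
    traces[(key, action)] += 1.0           # mark the pair just visited
    for pair in list(traces):
        q[pair] += alpha * td_error * traces[pair]
        traces[pair] *= gamma * lam        # older visits receive less credit

# Same latest observation ("raise"), different earlier history.
key_a = history_key(["bet", "check", "raise"])
key_b = history_key(["fold", "check", "raise"])

q = defaultdict(float)
traces = defaultdict(float)
trace_update(q, traces, key_a, "call", td_error=1.0)
trace_update(q, traces, key_b, "call", td_error=1.0)
```

In this sketch the TD error is supplied by the caller; the second update also raises the value of the earlier pair, scaled by its decayed trace.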


Abstract

The invention provides an estimation method that combines a neural network with Q learning under incomplete information. The method comprises the following steps: (1) converting the incomplete-information game into a partially observable Markov decision process (POMDP) model; (2) converting the incomplete-information game into a complete-information game through Monte Carlo sampling; (3) calculating the delayed-return value of Q learning using a Q learning algorithm based on the previous n steps, an algorithm combining a neural network with Q learning, and the UCT algorithm, which is based on upper confidence bounds; (4) fusing the Q values obtained in the previous step to obtain the final result. The method can be used in various types of incomplete-information games, such as Chinese poker and Texas Hold'em poker, and improves the playing level of the agent. Compared with prior related research, the method greatly improves accuracy.
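Two of the steps listed above can be illustrated with a short sketch, assuming textbook forms rather than the invention's own procedures: Monte Carlo sampling of the hidden cards to obtain a fully specified deal (step 2), and the UCB1 score that UCT uses to choose which child node to explore (part of step 3). The function names, the integer card encoding, and the exploration constant `c` are assumptions; the fusion rule of step 4 is not specified in this excerpt and is not shown.

```python
import math
import random

def sample_determinization(unseen_cards, opponent_hand_sizes, rng):
    """Deal the unseen cards at random to the opponents.

    Each sample is one complete-information game consistent with
    what the agent can observe.
    """
    cards = list(unseen_cards)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in opponent_hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands

def ucb1(mean_value, parent_visits, child_visits, c=math.sqrt(2)):
    """UCB1 score: average value plus an exploration bonus."""
    if child_visits == 0:
        return math.inf                    # always try unvisited children first
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

rng = random.Random(0)                     # fixed seed for repeatability
hands = sample_determinization(range(10), [3, 2], rng)
rarely_visited = ucb1(0.5, parent_visits=100, child_visits=5)
often_visited = ucb1(0.5, parent_visits=100, child_visits=50)
```

With equal average values, the rarely visited child scores higher, which is how UCT balances exploring uncertain moves against exploiting moves that already look good.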

Description

Technical field

[0001] The present invention relates to the field of computer machine games, and mainly to incomplete-information machine games, valuation functions, and the conversion of incomplete-information machine game models into partially observable Markov decision models.

Background art

[0002] The defining feature of an incomplete-information machine game is that players cannot obtain complete and reliable information about the situation during the game, which makes research on such games more complicated and challenging. The topic has therefore attracted the attention of a large number of scholars at home and abroad. A machine game system consists of four parts: data representation, rule generator, game tree search, and valuation function. The valuation function is the core part. Like the human brain, it plays an important role in judging the pros and cons of the current situation and in guiding agents to choose strategies. The quality of the valuation function directly ...


Application Information

IPC(8): G06N3/08, G06N5/04
CPC: G06N3/084, G06N5/04
Inventor: 王轩蒋琳张加佳李昌代佳宁王鹏程林云川胡开亮朱航宇
Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL