Neural network and Q learning combined estimation method under non-complete information

A neural-network-based technique for incomplete-information machine games. It addresses the problem that players in complex incomplete-information games cannot obtain complete and reliable information about the game situation.

Inactive. Publication Date: 2017-08-11

HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL

0 Cites, 65 Cited by

Problems solved by technology

[0002] Incomplete-information machine games are characterized by the fact that players cannot obtain complete and reliable information about the game situation during play, which makes research in this area more complex and challenging.

Embodiment Construction

[0139] The present invention will be further described below in conjunction with the drawings.

[0140] The present invention applies an improved Q-learning algorithm to the evaluation function of incomplete-information machine games and implements two computer agent systems, one for Texas Hold'em and one for Doudizhu. Both agents consider not only the state information preceding the current state but also predict what may happen after it. Their reasoning is therefore closer to that of a human player, and they can choose more reasonable strategies than agents that use a traditional evaluation function.

[0141] In incomplete-information games, state confusion arises when two observed game states are identical but the actual game states differ. To address this, the sequence of consecutive partially observed states is combined with an eligibility trace ...
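
The idea of pairing an observation sequence with an eligibility trace can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the class name `TraceQLearner`, the parameters `history_len`, `alpha`, `gamma` and `lam`, and the use of a plain lookup table instead of a neural network are all hypothetical. The learner keys its Q table on the last few observations, so two situations that look the same at the current step but were reached through different observation sequences get separate entries, and a naive Q(λ) update spreads each temporal-difference error back over recently visited entries.

```python
from collections import defaultdict, deque


class TraceQLearner:
    """Tabular Q-learning with accumulating eligibility traces (naive Q(lambda)).

    Assumption for illustration: the learner's state is the tuple of the
    last `history_len` observations rather than the latest observation alone.
    """

    def __init__(self, actions, history_len=3, alpha=0.1, gamma=0.9, lam=0.8):
        self.actions = list(actions)
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.q = defaultdict(float)      # (history, action) -> Q value
        self.trace = defaultdict(float)  # (history, action) -> eligibility
        self.history = deque(maxlen=history_len)

    def observe(self, observation):
        """Append a new partial observation and return the history state."""
        self.history.append(observation)
        return tuple(self.history)

    def greedy_value(self, state):
        """Largest Q value over all actions in the given history state."""
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state, action, reward, next_state, terminal=False):
        """Apply one TD update to every entry with a non-zero trace."""
        bootstrap = 0.0 if terminal else self.gamma * self.greedy_value(next_state)
        delta = reward + bootstrap - self.q[(state, action)]
        self.trace[(state, action)] += 1.0
        for key in list(self.trace):
            self.q[key] += self.alpha * delta * self.trace[key]
            self.trace[key] *= self.gamma * self.lam  # decay eligibility
        if terminal:
            # A finished game must not leak credit into the next one.
            self.trace.clear()
            self.history.clear()
```

In use, each new card-game observation would be passed to `observe`, and `update` would be called once per decision with the reward received. Because traces decay by `gamma * lam` per step, a reward at the end of a hand still adjusts the entries for earlier decisions in that hand, only more weakly.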

Abstract

The invention provides an estimation method that combines a neural network with Q-learning under incomplete information. The method comprises the following steps: 1. converting the incomplete-information game into a partially observable Markov decision process (POMDP) model; 2. converting the incomplete-information game into a complete-information game by Monte Carlo sampling; 3. computing the delayed Q-learning return using a Q-learning algorithm based on the previous n steps, an algorithm that combines a neural network with Q-learning, and the UCT algorithm (upper confidence bounds applied to trees); 4. fusing the Q values obtained in the previous step to obtain the final result. The method can be used in many kinds of incomplete-information games, such as Chinese poker and Texas Hold'em, and improves the playing strength of the agent. Compared with existing related research, the method considerably improves precision.
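
Three building blocks named in the abstract can be sketched in a few lines. The abstract does not give formulas, so everything here is an assumption for illustration: the function names are hypothetical, the sampling step simply deals the unseen cards uniformly at random, the selection rule is the standard UCB1 score that UCT uses, and the fusion rule is a plain weighted average chosen as a placeholder for whatever fusion the patent actually specifies.

```python
import math
import random


def sample_determinization(unseen_cards, opponent_hand_size, rng):
    """Monte Carlo sampling step: guess one concrete opponent hand.

    Dealing the unseen cards at random turns the hidden-information
    position into one complete-information position that can be searched.
    """
    cards = list(unseen_cards)
    rng.shuffle(cards)
    return cards[:opponent_hand_size]


def ucb1_score(mean_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score used by UCT to choose which child node to explore next."""
    if visits == 0:
        return math.inf  # try every child at least once
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


def fuse(q_values, weights):
    """Assumed fusion rule: weighted average of Q estimates from several evaluators."""
    total = sum(weights)
    return sum(q * w for q, w in zip(q_values, weights)) / total
```

A full agent would repeat the sampling step many times, evaluate each sampled position with the different Q estimators, and pass their outputs to the fusion step. How the weights are chosen is left open here because the source text does not say.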

Description

Technical field

[0001] The present invention relates to the field of computer game playing, and mainly to incomplete-information machine games, evaluation functions, and the conversion of incomplete-information machine game models into partially observable Markov decision models.

Background technique

[0002] Incomplete-information machine games are characterized by the fact that players cannot obtain complete and reliable information about the game situation during play, which makes research in this area more complicated and challenging. The field has therefore attracted the attention of a large number of scholars at home and abroad. A machine game system consists of four parts: data representation, rule generator, game tree search, and evaluation function. The evaluation function is the core part. It plays a role similar to that of the human brain: it judges how favorable the current situation is and guides the agent in choosing a strategy. The quality of the evaluation function directly ...

Application Information

IPC(8): G06N3/08, G06N5/04

CPC: G06N3/084, G06N5/04

Inventor: 王轩蒋琳张加佳李昌代佳宁王鹏程林云川胡开亮朱航宇

Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
