Game control method and device and storage medium

A game control technology, applied in indoor games, video games, sports accessories, etc. It addresses the instability of deep reinforcement learning, the amplified overestimation of deep reinforcement learning algorithms, and the resulting reduction in the event control effect, with the aim of improving the control effect, reducing overestimation, and reducing coupling.

Pending Publication Date: 2022-01-04

CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD +1



AI Technical Summary

Problems solved by technology

However, Q-Learning is an off-policy algorithm, which in itself increases the instability of the deep reinforcement learning algorithm. In addition, a greedy strategy is used each time to select the action performed in the video game, and the Max operator is used each time to select the optimal action, which amplifies the overestimation problem of the deep reinforcement learning algorithm.
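The overestimation attributed to the Max operator can be illustrated with a small simulation. The following is a minimal sketch, not the patent's method: it assumes every action has a true value of 0 and that each estimate carries zero-mean Gaussian noise. The function name `max_bias` and all parameter values are hypothetical.

```python
import random


def max_bias(num_actions=5, noise_std=1.0, trials=20000, seed=0):
    """Average of max(noisy estimates) when every true action value is 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each estimate is the true value (0) plus zero-mean noise.
        estimates = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        # Taking the max picks up whichever estimate happens to be too high.
        total += max(estimates)
    return total / trials


bias = max_bias()
print(f"true max value: 0.0, average max of noisy estimates: {bias:.3f}")
```

Although no single estimate is biased, the average of the maximum is clearly above the true maximum of 0, and the gap grows with the number of actions. With a single action the gap disappears, since there is nothing to maximize over.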

Therefore, the instability and overestimation problems of deep reinforcement learning persist, which reduces the control effect of events.



Examples


Embodiment 1

[0074] An embodiment of the present invention provides a game control method. As shown in Figure 1, the method includes:

[0075] S101. Acquire the current video frame when it is detected that the target video game starts;

[0076] When the device detects that the target video game has started, it takes the first frame image of the target video game as the current video frame; the current video frame represents the current state.

[0077] In some embodiments, when the device detects that the target video game has started, it also sets up the replay memory unit, the total capacity of the replay memory unit, the preset total number of training rounds, the preset total number of training steps for each round of training, the preset online value network, and the preset target value network. The replay memory unit stores the data generated while controlling the target video game, and the preset total number of training steps is greater than the preset number of samples.
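The memory unit and its preset parameters described in [0077] could be set up roughly as follows. This is a minimal sketch under stated assumptions: the class name `ReplayMemory`, the eviction of the oldest record once the total capacity is reached, uniform random sampling, and every numeric value are illustrative choices that the source does not specify.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store for data generated while controlling the game."""

    def __init__(self, capacity):
        # Assumption: once full, the oldest record is dropped first.
        self.buffer = deque(maxlen=capacity)

    def push(self, record):
        self.buffer.append(record)

    def sample(self, batch_size, rng=random):
        # Assumption: records are drawn uniformly at random without replacement.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Hypothetical preset values; the source gives no concrete numbers.
TOTAL_CAPACITY = 10_000
TOTAL_ROUNDS = 100
STEPS_PER_ROUND = 1_000
SAMPLE_COUNT = 32

# Constraint stated in [0077]: steps per round must exceed the sample count.
assert STEPS_PER_ROUND > SAMPLE_COUNT

memory = ReplayMemory(TOTAL_CAPACITY)
```

A `deque` with `maxlen` keeps the capacity check out of the calling code, which is why it is used here instead of a plain list.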

[...

Embodiment 2

[0135] Based on the same inventive concept as Embodiment 1, a further description follows.

[0136] An embodiment of the present invention provides a game control device. As shown in Figure 5, the game control device 3 includes:

[0137] The control unit 31 is used to obtain the current video frame when detecting that the target video game starts; to obtain the current grayscale image and the current action based on the current video frame and the preset online value network; and to control the target video game to execute the current action, obtaining the current reward value and the next video frame;

[0138] The data construction unit 32 is used to obtain the current quintuple based on the current grayscale image, the current action, the current reward value, the next video frame and the preset online value network, and to save it to the preset database;

[0139] The parameter updating unit 33 is used for when the number of obtained current quintuples is greater th...
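The first step performed by the control unit 31, deriving a grayscale image and an action from a video frame, might look like the sketch below. Several parts are assumptions rather than details from the source: the ITU-R BT.601 luma weights, the purely greedy choice of action, and the function `toy_online_value_network`, which is a hypothetical stand-in for the preset online value network.

```python
def to_grayscale(frame):
    """Convert rows of (r, g, b) pixels (0-255) to rows of luma values."""
    # Assumption: ITU-R BT.601 luma weights; the source names no formula.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def toy_online_value_network(gray):
    """Hypothetical stand-in: one value per action, from mean brightness."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    # Three made-up actions whose values depend on brightness only.
    return [mean / 255.0, 0.5, 1.0 - mean / 255.0]


def greedy_action(action_values):
    """Index of the largest action value (ties go to the lowest index)."""
    return max(range(len(action_values)), key=lambda a: action_values[a])


# A tiny 2x2 RGB frame standing in for a real video frame.
frame = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(frame)
action = greedy_action(toy_online_value_network(gray))
print(gray, action)
```

Reducing the frame to one channel shrinks the network input by a factor of three, which is a common motivation for this preprocessing step in frame-based control.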


Abstract

The invention discloses a game control method, a device, and a storage medium. The method comprises the following steps: obtaining a current video frame when the start of a target video game is detected; based on the current video frame and a preset online value network, obtaining a current grayscale image and a current action, controlling the target video game to execute the current action, and obtaining a current reward value and a next video frame; obtaining a current quintuple and storing it in a preset database; when the number of obtained current quintuples is greater than or equal to the preset number of samples and the preset total number of training rounds is greater than zero, updating the parameters of the preset online value network based on a preset target value network and the preset database; when the number of current quintuples is an integer multiple of the preset total number of training steps, replacing the parameters of the preset target value network with the parameters of the preset online value network and subtracting one from the preset total number of training rounds; and taking the next video frame as the current video frame and continuing the process. According to the invention, the event control effect can be improved.
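The update schedule described in the abstract can be traced with a short counter-only sketch. It leaves out the networks entirely and only counts when each kind of update would fire. The function name `run_schedule`, its parameters, and the decision to stop once the remaining rounds reach zero are assumptions; the abstract does not state a termination condition.

```python
def run_schedule(total_frames, sample_count, steps_per_round, total_rounds):
    """Return (online_updates, target_syncs, rounds_left) for the schedule."""
    # The step count per round is required to exceed the sample count.
    assert steps_per_round > sample_count
    stored = 0
    online_updates = 0
    target_syncs = 0
    rounds_left = total_rounds
    # Assumption: stop when no rounds remain or the frames run out.
    while rounds_left > 0 and stored < total_frames:
        stored += 1  # one quintuple is stored per controlled frame
        if stored >= sample_count and rounds_left > 0:
            online_updates += 1  # placeholder for the online network update
        if stored % steps_per_round == 0:
            target_syncs += 1  # placeholder: copy online parameters to target
            rounds_left -= 1
    return online_updates, target_syncs, rounds_left


print(run_schedule(total_frames=100, sample_count=4,
                   steps_per_round=10, total_rounds=3))
```

With these hypothetical numbers, the online network is updated on every frame from the fourth stored quintuple onward, while the target network is refreshed only once every ten quintuples, so the target values change far less often than the online parameters.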

Description

Technical field

[0001] The invention relates to dynamic optimization technology, in particular to a game control method and device, and a storage medium.

Background

[0002] At present, deep reinforcement learning is a hot field of artificial intelligence research. It combines the perception ability of deep learning with the decision-making ability of reinforcement learning, and provides a new control approach for events related to dynamic optimization (such as video games).

[0003] Due to the problems of instability and overestimation in deep reinforcement learning, related technologies use an improved deep Q network algorithm in the control of video games to try to alleviate the overestimation problem of deep reinforcement learning algorithms. However, Q-Learning is an off-policy algorithm, which in itself increases the instability of the deep reinforcement learning algorithm, and each time a greedy strategy is used to select t...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): A63F13/49, A63F13/55, G06N3/04, G06N3/06

CPC: A63F13/55, A63F13/49, G06N3/061, G06N3/045

Inventor: 夏宗涛

Owner: CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD
