Crowd evacuation simulation method in dynamic environment based on deep reinforcement learning

A technology of reinforcement learning and dynamic environments, applied in the field of crowd simulation and computer simulation, that addresses the exponential increase in computational complexity, the huge storage space and indexing time, and the low accuracy of simulated crowds, with the effect of increasing the randomness of crowd behavior.

Active Publication Date: 2021-02-09

AEROSPACE INFORMATION RES INST CAS +1

8 Cites, 1 Cited by


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

On the one hand, the state of the environment and the movement of the agent are continuous processes; that is, both the state space and the action space are continuous. However, algorithms such as Q-Learning and SARSA only support discrete state and action spaces, and a huge continuous space may lead to the curse of dimensionality.

On the other hand, this type of algorithm usually uses a Q table to store the state-action mapping. With continuous state and action spaces, the table requires huge storage space and indexing time, resulting in an exponential increase in computational complexity.

[0007] In the existing technology, crowds simulated by the social force model show low accuracy and weak randomness. Traditional reinforcement learning methods for crowd simulation support only discrete state and action spaces, so a huge continuous space may lead to the curse of dimensionality, and continuous state and action spaces require huge storage space and indexing time, resulting in an exponential increase in computational complexity. To address these technical problems, the present invention provides a crowd evacuation simulation method in a dynamic environment based on deep reinforcement learning.
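The growth in Q-table size described above can be illustrated with a minimal sketch. It assumes each continuous state and action dimension is discretized into the same number of bins; the function name and the bin counts are hypothetical and do not come from the patent.

```python
def q_table_entries(state_dims: int, action_dims: int, bins: int) -> int:
    """Count Q-table cells when every continuous dimension is cut into `bins` intervals.

    Assumption for illustration only: a uniform grid with the same bin count
    on every state and action dimension.
    """
    n_states = bins ** state_dims    # one table row per discretized state
    n_actions = bins ** action_dims  # one table column per discretized action
    return n_states * n_actions


def growth_table(max_state_dims: int, action_dims: int = 2, bins: int = 10):
    """Return (state_dims, entries) pairs showing how the table grows."""
    return [
        (d, q_table_entries(d, action_dims, bins))
        for d in range(1, max_state_dims + 1)
    ]
```

Under these assumptions, each additional state dimension multiplies the table size by the bin count, which is the exponential growth the text refers to; for example, `growth_table(4)` lists the cell counts for one to four state dimensions.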


Embodiment Construction

[0030] 1) Deep reinforcement learning algorithm

[0031] As shown in Figure 1, the present invention provides a crowd evacuation simulation method based on deep reinforcement learning in a dynamic environment, which includes:

[0032] The crowd is a multi-agent system. For a single pedestrian agent, a deep neural network is used to approximate the mapping function from state to action and serves as the behavior controller of the agent. By observing the dynamic environment state, the pedestrian agent uses the mapping function to make behavioral decisions and takes corresponding actions from the action space. The goal of a reinforcement learning agent is to find an optimal policy. A policy is the mapping from state to action, often represented by the symbol π; it refers to the probability distribution of an agent's actions in a given state: Where S is a finite state set; A is a finite action set; is the state transition proba...
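The idea of a network acting as a state-to-action mapping can be sketched as follows. This is a minimal illustration rather than the patent's network: it assumes a small discrete action set, a single hidden layer with random untrained weights, and a hypothetical state vector (for example, distances to an exit or to a hazard). All function names are invented for the example.

```python
import math
import random


def init_policy(state_dim, n_actions, hidden=8, seed=0):
    """Create random weights for a tiny two-layer policy network (untrained)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"W1": matrix(state_dim, hidden), "W2": matrix(hidden, n_actions)}


def _vec_times_matrix(vec, mat):
    """Multiply a row vector (len = rows) by a rows x cols matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def action_probabilities(params, state):
    """Map an observed state to a probability distribution over actions."""
    hidden = [math.tanh(x) for x in _vec_times_matrix(state, params["W1"])]
    logits = _vec_times_matrix(hidden, params["W2"])
    top = max(logits)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(params, state, rng):
    """Draw one action index at random according to the policy distribution."""
    probs = action_probabilities(params, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Sampling from the distribution instead of always taking the most probable action is one simple way a stochastic policy produces varied behavior across agents; training the weights is outside the scope of this sketch.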


Abstract

The invention provides a crowd evacuation simulation method in a dynamic environment based on deep reinforcement learning. The method treats the crowd as a multi-agent system and, for a single pedestrian agent, uses a deep neural network to approximate the mapping function from state to action, which serves as the behavior controller of the agent. The pedestrian agent makes behavioral decisions by observing the dynamic environment state and applying the mapping function, and takes corresponding actions from the action space. In the evacuation simulation over a discrete time sequence, a policy is the probability distribution of the agent's actions over the time sequence in a given state. The feedback from the environment to the agent is expressed as a return value. The goal of the agent is to maximize the expectation of the cumulative return value; that is, to search for an optimal action-value function and thereby obtain an optimal policy. The method enables crowd evacuation simulation in highly dynamic environments, which is difficult to achieve with classical crowd simulation; the simulation results are closer to the real situation, and the randomness of crowd behavior is enhanced.
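The objective stated in the abstract can be written in standard reinforcement learning notation. The following is a conventional formulation offered as a sketch, not a formula quoted from the patent; the discount factor \gamma and the reward symbol r_t are assumptions, since the visible text does not define them.

```latex
% Cumulative (discounted) return from time step t; \gamma \in [0, 1] is an assumed discount factor
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}

% Optimal action-value function: the best expected return achievable from state s after action a
Q^{*}(s, a) = \max_{\pi} \; \mathbb{E}_{\pi}\left[ G_t \mid s_t = s,\; a_t = a \right]

% An optimal policy acts greedily with respect to Q^{*}
\pi^{*}(s) = \arg\max_{a \in A} Q^{*}(s, a)
```

Read this way, searching for the optimal action-value function and obtaining the optimal policy are two views of the same maximization.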

Description

Technical field

[0001] The invention belongs to the technical field of crowd simulation and computer simulation, and in particular relates to a crowd evacuation simulation method in a dynamic environment based on deep reinforcement learning.

Background

[0002] From the perspective of crowd evacuation simulation, the commonly used crowd simulation models can be divided into macroscopic models and microscopic models. The macroscopic model mainly models group behavior, regards the group as a continuum following the laws of fluid mechanics, and is suitable for large-scale crowd simulation over a large area. The microscopic model uses a single individual as the basic unit of modeling; group behavior emerges from the movement of a large number of individuals and the interactions between them. In comparison, microscopic models are more suitable for modeling and simulating individual behaviors, and have received more research attention.

[0003] Each model has its own applicabl...


Application Information


IPC(8): G06Q10/04, G06Q50/26, G06N3/08

CPC: G06Q10/04, G06Q50/265, G06N3/08

Inventor: 龚建华, 申申, 孙麇, 李毅, 殷兵晓, 武栋

Owner AEROSPACE INFORMATION RES INST CAS
