Deep value function learning method for an agent based on state-distribution-aware sampling
A value function and state distribution technology, applied in the field of reinforcement learning. It addresses problems such as large differences in quantity, and achieves the effects of enhancing expressive ability, solving the sample selection problem, and improving learning quality.
Embodiment
[0059] This embodiment is implemented as described above; the specific steps are not repeated here, and only the results on the case data are shown below.
[0060] First, a hash method is used to reduce the dimensionality of, and classify, the abstract representations of the agent's observed states produced by the convolutional neural network, so that the distribution over the state space can be perceived. On this basis, samples are selected appropriately from the experience data set. Finally, the selected samples are used to train the agent's value function, so that it judges the environment more accurately. The results are shown in Figures 1, 2, and 3.
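The hashing and sample-selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the text does not name the hash, so SimHash (sign of random projections) is assumed here; the selection rule (pick a hash bucket uniformly, then a sample within it) is likewise an assumption; plain Python lists stand in for the convolutional network's feature vectors; and all function names are hypothetical.

```python
import random
from collections import defaultdict


def make_projection(feature_dim, n_bits, seed=0):
    """Random Gaussian projection matrix (n_bits x feature_dim). Assumed hash family: SimHash."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(feature_dim)]
            for _ in range(n_bits)]


def simhash(feature, projection):
    """Reduce a feature vector to a short binary code: one sign bit per projection row."""
    return tuple(
        1 if sum(w * x for w, x in zip(row, feature)) >= 0 else 0
        for row in projection
    )


def bucket_samples(features, projection):
    """Group sample indices by hash code; bucket sizes approximate the state distribution."""
    buckets = defaultdict(list)
    for idx, feature in enumerate(features):
        buckets[simhash(feature, projection)].append(idx)
    return buckets


def sample_balanced(buckets, batch_size, rng):
    """Assumed selection rule: choose a bucket uniformly, then a sample uniformly inside it.

    Rarely visited regions of the state space are therefore drawn as often
    as frequently visited ones, regardless of how many samples each holds.
    """
    keys = list(buckets)
    batch = []
    for _ in range(batch_size):
        key = rng.choice(keys)
        batch.append(rng.choice(buckets[key]))
    return batch
```

The indices returned by `sample_balanced` would then select the transitions used for a value-function update. Weighting buckets by some function of their size, rather than uniformly, is another possible choice that the text does not rule out.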
[0061] Figure 1 visualizes the samples obtained after applying steps S1 and S2 of the present invention to the original experience data, i.e., it is a schematic diagram of the sample distribution in the state space;
[0062] Figure 2 shows the use of three sampling methods, namely a) ...