Inference device, device control system, and learning device
A learning device and related technology, applicable to general control systems, control/regulation systems, adaptive control, and similar fields. It addresses the problem that the efficiency of learning and inference cannot be improved, and achieves the effect of improving that efficiency.
Examples
Embodiment 1
[0036] FIG. 1 is a block diagram showing the main parts of the device control system of Embodiment 1. FIG. 2 is an explanatory diagram showing an example of a robot controlled by the device control system of Embodiment 1. FIG. 3 is an explanatory diagram showing the main parts of a feature quantity extractor and a controller in the device control system of Embodiment 1. FIG. 4A is an explanatory diagram showing the structure of each layer in the feature quantity extractor in the device control system of Embodiment 1. FIG. 4B is an explanatory diagram showing another structure of each layer in the feature quantity extractor in the device control system of Embodiment 1. The device control system of Embodiment 1 is described with reference to FIGS. 1 to 4.
[0037] As shown in FIG. 1, the environment E includes the control device 1 and the robot 2. The control device 1 contr...
Embodiment 2
[0089] FIG. 9 is a block diagram showing the main parts of the reinforcement learning system of Embodiment 2. FIG. 10 is an explanatory diagram showing the main parts of a first feature quantity extractor, a second feature quantity extractor, a first controller, and a learner in the reinforcement learning system of Embodiment 2. The reinforcement learning system of Embodiment 2 is described with reference to FIG. 9 and FIG. 10.
[0090] As shown in FIG. 9, a loop is formed by the environment E, the first feature quantity extractor 41, and the first controller 51. The environment E outputs a state value s_t (hereinafter referred to as the "first state value") indicating the state of the environment E. The first feature quantity extractor 41 receives the output first state value s_t as input. The first feature quantity extractor 41 outputs a feature vector corresponding to the input first state value s_t (herein...
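The loop described in [0090] can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the one-dimensional `ToyEnvironment`, the hand-written `feature_extractor` (the patent's extractor is layered, per FIGS. 4A and 4B), the proportional `controller`, and the premise that the controller turns the feature vector into an action value a_t that is fed back to the environment.

```python
from typing import List, Tuple

State = List[float]
Feature = List[float]


class ToyEnvironment:
    """Hypothetical stand-in for environment E: a point moving toward a target."""

    def __init__(self) -> None:
        self.position = 0.0
        self.target = 1.0

    def state(self) -> State:
        # The first state value s_t output by the environment.
        return [self.position, self.target]

    def step(self, action: float) -> State:
        # Apply the action and return the following state value.
        self.position += action
        return self.state()


def feature_extractor(s_t: State) -> Feature:
    # Hypothetical extractor: maps s_t to a one-element feature vector
    # (the remaining distance to the target).
    position, target = s_t
    return [target - position]


def controller(feature: Feature) -> float:
    # Hypothetical controller: a proportional rule on the feature vector.
    return 0.5 * feature[0]


def run_loop(env: ToyEnvironment, steps: int) -> List[Tuple[State, float, State]]:
    """Run environment -> extractor -> controller -> environment for `steps` cycles."""
    trajectory = []
    s_t = env.state()
    for _ in range(steps):
        feature = feature_extractor(s_t)
        a_t = controller(feature)
        s_next = env.step(a_t)
        trajectory.append((s_t, a_t, s_next))
        s_t = s_next
    return trajectory
```

Calling `run_loop(ToyEnvironment(), 10)` returns ten (s_t, a_t, s_t+1) records; in this toy setting the remaining distance halves on every cycle.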
Embodiment 3
[0146] FIG. 14 is a block diagram showing the main parts of the reinforcement learning system of Embodiment 3. The reinforcement learning system of Embodiment 3 is described with reference to FIG. 14. In FIG. 14, blocks identical to those shown in FIG. 9 are assigned the same reference numerals, and their description is omitted.
[0147] As shown in FIG. 14, the reinforcement learning system 500 of Embodiment 3 includes a storage device 81 in addition to the inference device 100 and the learning device 400. The storage device 81 stores groups, each consisting of a first state value s_t, the corresponding action value a_t, and the corresponding second state value s_{t+1}. More specifically, it stores multiple groups of values (s_t, a_t, s_{t+1}). These values (s_t, a_t, s_{t+1}) are collected using a controller different from the first controller 51 (hereinafter referred to as "second control...
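The stored groups of values described in [0147] can be pictured with a short Python sketch. It is only an illustration under stated assumptions: `TransitionStore` is a hypothetical in-memory stand-in for the storage device 81, `toy_step` is an invented one-dimensional environment, and `other_controller` is a hypothetical constant policy standing in for a controller different from the first controller 51.

```python
from typing import Callable, Iterator, List, NamedTuple, Tuple

State = Tuple[float, ...]


class Transition(NamedTuple):
    """One stored group of values (s_t, a_t, s_{t+1})."""
    s_t: State
    a_t: float
    s_next: State


class TransitionStore:
    """Hypothetical in-memory stand-in for the storage device 81."""

    def __init__(self) -> None:
        self._groups: List[Transition] = []

    def add(self, s_t: State, a_t: float, s_next: State) -> None:
        self._groups.append(Transition(s_t, a_t, s_next))

    def __len__(self) -> int:
        return len(self._groups)

    def __iter__(self) -> Iterator[Transition]:
        return iter(self._groups)


def toy_step(s_t: State, a_t: float) -> State:
    # Hypothetical environment dynamics: the action shifts a 1-D state.
    return (s_t[0] + a_t,)


def other_controller(s_t: State) -> float:
    # Hypothetical data-collecting controller (not the first controller 51).
    return 1.0


def collect(store: TransitionStore,
            policy: Callable[[State], float],
            step: Callable[[State, float], State],
            s0: State,
            n: int) -> None:
    """Roll the policy forward n steps, storing one group per step."""
    s_t = s0
    for _ in range(n):
        a_t = policy(s_t)
        s_next = step(s_t, a_t)
        store.add(s_t, a_t, s_next)
        s_t = s_next
```

For example, `collect(store, other_controller, toy_step, (0.0,), 3)` fills an empty `TransitionStore` with three groups, where each group's s_{t+1} equals the next group's s_t.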