Accelerator beam trajectory control method and system based on deep reinforcement learning

A deep reinforcement learning technology for beam orbit control, applied in general control systems, control/regulation systems, adaptive control and related fields. It addresses problems such as high labor and time costs, PID parameter tuning that depends on engineering experience, and increasingly complex beam orbit control.

Active Publication Date: 2020-08-25
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Traditional beam trajectory feedback control combines the singular value decomposition (SVD) algorithm with the PID (proportional, integral, derivative) control algorithm to solve the large-scale multiple-input, multiple-output problem of beam trajectory control in an accelerator system. As modern accelerator engineering has developed, accelerator systems have grown larger, the number of beam trajectory control parameters has kept increasing, and the associated control problems have become more complicated. Although the traditional control algorithm is relatively simple in principle and implementation, it has significant limitations in practical applications.
In addition, the traditional beam trajectory control method requires manually measuring the response matrix between the beam position monitors (hereinafter BPMs) and the trajectory correctors. On a large accelerator with hundreds or even thousands of BPMs and trajectory correctors, this requires a great deal of work, and the measurement accuracy of the response matrix directly affects the orbit control accuracy. In modern accelerator systems, nonlinear response makes the mapping between the beam orbit state and the correction action nonlinear, so the response matrix is often difficult to measure accurately.
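The response-matrix measurement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used on any real machine: the function `measure_response_matrix`, the toy machines `linear_machine` and `nonlinear_machine`, the matrix `TRUE_R`, and the quadratic term are all hypothetical. Each corrector is kicked in turn and the change in every BPM reading is divided by the kick.

```python
# Hypothetical toy response: 3 BPMs (rows) by 2 correctors (columns).
TRUE_R = [[1.0, 0.3], [0.2, 0.8], [0.5, -0.4]]


def linear_machine(kicks):
    """Toy machine whose BPM readings depend linearly on the corrector kicks."""
    return [sum(r * k for r, k in zip(row, kicks)) for row in TRUE_R]


def nonlinear_machine(kicks, strength=50.0):
    """Same toy machine with an assumed quadratic term added to each reading."""
    return [x + strength * x * x for x in linear_machine(kicks)]


def measure_response_matrix(read_bpms, n_correctors, kick=1e-3):
    """Estimate R[i][j] = change in BPM i per unit change in corrector j."""
    baseline = read_bpms([0.0] * n_correctors)
    columns = []
    for j in range(n_correctors):
        kicks = [0.0] * n_correctors
        kicks[j] = kick  # excite one corrector at a time
        reading = read_bpms(kicks)
        columns.append([(r - b) / kick for r, b in zip(reading, baseline)])
    # columns[j][i] holds BPM i for corrector j; transpose to BPM-major rows.
    return [list(row) for row in zip(*columns)]


linear_R = measure_response_matrix(linear_machine, 2)
small_kick_R = measure_response_matrix(nonlinear_machine, 2, kick=1e-3)
large_kick_R = measure_response_matrix(nonlinear_machine, 2, kick=1e-2)
print(linear_R[0][0], small_kick_R[0][0], large_kick_R[0][0])
```

On the linear toy machine the measurement recovers `TRUE_R`. On the nonlinear one the estimate changes with the kick size, which illustrates why a single measured matrix can be an inaccurate description of a nonlinear machine. The loop also needs one measurement per corrector, so the effort grows with the number of correctors.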
At the same time, because the traditional beam trajectory controller uses the PID control algorithm, every PID control loop requires extensive parameter tuning during actual engineering application. PID parameter tuning depends heavily on engineering experience, so it also becomes a difficult point in engineering practice.
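To make the tuning dependence concrete, here is a minimal sketch of a single discrete PID loop acting on one corrector. The class `PID`, the function `simulate`, the static plant model (orbit = offset + gain × corrector setting), and all numeric gains are assumptions chosen for illustration; they are not taken from the patent.

```python
class PID:
    """Textbook positional PID controller with a fixed time step."""

    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        """Return the controller output for the current error sample."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid, offset=1.0, gain=0.8, target=0.0, steps=60):
    """Run the loop on a toy static plant and return the final orbit position."""
    setting = 0.0
    for _ in range(steps):
        orbit = offset + gain * setting      # BPM reading for this setting
        setting = pid.update(target - orbit)  # new corrector setting
    return offset + gain * setting


well_tuned = simulate(PID(kp=0.3, ki=0.5, kd=0.05))
badly_tuned = simulate(PID(kp=3.0, ki=0.0, kd=0.0), steps=20)
print(well_tuned, badly_tuned)
```

With the first set of gains the orbit settles at the target; with the second the same loop diverges. A real orbit feedback system has one such loop per control channel, so every loop needs gains chosen in this way.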
When the operating conditions of the accelerator or the external environment change, the external parameters of the control loop in the traditional orbit control method change as well. Re-measuring the response matrix and re-tuning the controller parameters then takes considerable manpower and time.
Therefore, there is an urgent need to develop more sophisticated methods to overcome the shortcomings of traditional beam trajectory control methods.
[0003] The prior art includes an inverse reinforcement learning method for autonomous helicopter flight, but its results are difficult to apply to control scenarios with high-dimensional state and action spaces; that is, it does not adequately solve the "curse of dimensionality".
[0004] Beam trajectory control in an accelerator is a typical problem with a high-dimensional state space and a high-dimensional action space. How to design a beam trajectory control method that overcomes the drawbacks of the traditional method, namely the need for manual, accurate measurement of the response matrix and for PID tuning, is a technical problem that urgently needs to be solved.

Detailed Description of the Embodiments

[0091] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0092] The invention discloses a method and system for controlling the beam trajectory of an accelerator based on deep reinforcement learning, used to hold the accelerator beam trajectory at a target state. The method uses training data and a deep reinforcement learning method to pre-train a deep neural network, and stores the weight parameters of the trained deep neural network and the experience data of the orbit control strategy; uses the beam position monitor to obtain the state data of the beam orbit online and feeds it into the deep neural network, with the output of the deep neural network coupled to the beam trajectory corrector; the weight data of the trained deep neural network and the ...

Abstract

An accelerator beam trajectory control method and system based on deep reinforcement learning, used to hold the accelerator beam trajectory at a target state. The method comprises the following steps: using a deep reinforcement learning method and training data to pre-train a deep neural network, and storing the weight parameters of the trained deep neural network and the experience data of the trajectory control strategy; using a beam position monitor to obtain the state data of the beam trajectory online, feeding the state data into the deep neural network, and coupling the output of the deep neural network to a beam trajectory corrector; and loading the weight parameters of the trained deep neural network and the experience data of the trajectory control strategy, so that the deep neural network adjusts the control parameters through predictive control and online reinforcement learning, thereby adaptively and stably holding the beam trajectory at the target state.
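The sequence of steps in the abstract (pre-train, store weights, load weights, control online while continuing to learn) can be sketched as a skeleton. Everything below is an assumption made for illustration: a linear policy stands in for the deep neural network, a simple random-search update stands in for the deep reinforcement learning algorithm (which the text here does not name), a JSON string stands in for the stored weight file, and the 2-BPM, 2-corrector toy machine `RESPONSE` is hypothetical.

```python
import json
import random

# Hypothetical toy machine: 2 BPMs respond linearly to 2 correctors.
RESPONSE = [[1.0, 0.3], [0.2, 0.8]]


def step(orbit, action):
    """Apply corrector changes and return the new BPM readings."""
    return [orbit[i] + sum(RESPONSE[i][j] * action[j] for j in range(2))
            for i in range(2)]


def policy(weights, orbit):
    """Linear stand-in for the neural network: action = weights @ orbit."""
    return [sum(weights[j][i] * orbit[i] for i in range(2)) for j in range(2)]


def episode_return(weights, start, steps=10):
    """Sum of rewards; reward is minus the squared distance to target orbit 0."""
    orbit, total = list(start), 0.0
    for _ in range(steps):
        orbit = step(orbit, policy(weights, orbit))
        total -= sum(x * x for x in orbit)
    return total


def improve(weights, starts, iterations, rng, sigma=0.1):
    """Random search: keep a perturbed policy only if it scores better."""
    best = sum(episode_return(weights, s) for s in starts)
    for _ in range(iterations):
        trial = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
        score = sum(episode_return(trial, s) for s in starts)
        if score > best:
            weights, best = trial, score
    return weights, best


rng = random.Random(0)
training_starts = [[1.0, -0.5], [-0.7, 0.4], [0.3, 0.9]]  # assumed training data
initial = [[0.0, 0.0], [0.0, 0.0]]
initial_score = sum(episode_return(initial, s) for s in training_starts)

# 1. Pre-train offline, then store the weights.
trained, trained_score = improve(initial, training_starts, 300, rng)
stored = json.dumps({"weights": trained})

# 2. Load the stored weights and run the online loop.
online_weights = json.loads(stored)["weights"]
orbit = [0.8, -0.6]  # assumed initial BPM readings
for _ in range(20):
    orbit = step(orbit, policy(online_weights, orbit))  # BPM in, corrector out
    # Online refinement; here the toy model stands in for measured rewards.
    online_weights, _score = improve(online_weights, [orbit], 5, rng, sigma=0.02)
print(initial_score, trained_score, orbit)
```

The point of the skeleton is the data flow, not the learning rule: no response matrix is measured by hand and no PID gains are tuned, because the policy parameters are adjusted from rewards both before and during operation.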

Description

Technical field

[0001] The present invention relates to the technical field of accelerator beam diagnostics and control, and in particular to a method and system for controlling the accelerator beam trajectory based on deep reinforcement learning.

Background

[0002] In the field of accelerator beam diagnostics and control, beam trajectory feedback control is usually used to correct the beam trajectory so that the beam moves along the optimized or target trajectory, ensuring the quality and stability of the beam. Traditional beam trajectory feedback control combines the singular value decomposition (SVD) algorithm with the PID (proportional, integral, derivative) control algorithm to solve the large-scale multiple-input, multiple-output problem of beam trajectory control in an accelerator system, but with the development of modern accelerator engineering, the scale of the accelerator ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G05B13/02, G05B13/04
CPC: G05B13/027, G05B13/042
Inventor: 唐雷雷, 周泽然, 宣科
Owner: UNIV OF SCI & TECH OF CHINA