80 results about "Reinforcement learning control" patented technology

Neural network reinforcement learning control method of autonomous underwater robot

The invention provides a neural network reinforcement learning control method for an autonomous underwater robot. The method comprises the following steps: the current pose information of the autonomous underwater vehicle (AUV) is obtained; the state quantity is calculated and input into a reinforcement learning neural network, which calculates a Q value by forward propagation, and the controller parameters are obtained by selecting an action A; the control parameters and the control deviation are input into the controller, and the control output is calculated; the robot performs thrust allocation according to the actuator arrangement; and a reward value is calculated from the control response, a reinforcement learning iteration is carried out, and the parameters of the reinforcement learning neural network are updated. By combining the reinforcement learning idea with a traditional control method, the AUV evaluates its own motion performance during navigation and adjusts its controller online according to the experience generated in motion, so that it adapts to complex environments faster through self-learning and achieves better control precision and control stability.
Owner:HARBIN ENG UNIV
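The loop described above — state in, Q values out, an action selecting controller gains, a reward driving the update — might be sketched as follows. This is a toy stand-in, not the patented network: the plant model, the candidate gains, and the bandit-style Q update are all illustrative assumptions.

```python
import random

GAIN_SETS = [2.0, 4.0, 6.0]   # candidate proportional gains (the actions)

class GainSelector:
    """Tiny stand-in for the Q network: one Q value per candidate gain."""
    def __init__(self, n_actions, lr=0.1, eps=0.2):
        self.q = [0.0] * n_actions
        self.lr, self.eps = lr, eps

    def select(self):
        if random.random() < self.eps:          # explore
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=self.q.__getitem__)

    def update(self, a, reward):                # move Q toward the reward
        self.q[a] += self.lr * (reward - self.q[a])

def plant_step(kp, error):
    """Toy first-order response: the gain shrinks the tracking error."""
    new_error = (1.0 - 0.1 * kp) * error
    return new_error, -abs(new_error)           # reward penalises deviation

random.seed(0)
agent = GainSelector(len(GAIN_SETS))
error = 1.0
for _ in range(200):
    a = agent.select()
    error, reward = plant_step(GAIN_SETS[a], error)
    agent.update(a, reward)
```

After training, the error has been driven toward zero and the Q values reflect which gains produced the best responses.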

Inverted pendulum control method based on neural network and reinforcement learning

The invention, which belongs to the technical field of artificial intelligence and control, relates to a neural network and reinforcement learning algorithm, and in particular to an inverted pendulum control method based on a neural network and reinforcement learning, in which self-learning completes the control of an inverted pendulum. The method is characterized by the following steps: step one, obtaining the inverted pendulum system model information; step two, obtaining the state information of the inverted pendulum and initializing the neural network; step three, carrying out ELM training with a training sample set SAM; step four, controlling the inverted pendulum with the reinforcement learning controller; step five, updating the training sample set and the BP neural network; and step six, checking whether the control result meets the learning termination condition; if not, returning to step two to continue the loop; if so, finishing the algorithm. The invention effectively solves the curse of dimensionality that easily arises in continuous state spaces, as well as the control problem of nonlinear systems with continuous states, and accelerates the update speed.
Owner:CHINA UNIV OF MINING & TECH
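The ELM training mentioned in step three can be sketched in isolation: an extreme learning machine draws a random hidden layer and solves only the output weights in closed form. The quadratic target below stands in for whatever value function the patent fits; it is an assumption, not the patent's data.

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    """Random hidden layer; output weights solved by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # fixed random features
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form fit
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# toy target: a quadratic cost-to-go over (angle, angular velocity)
X = np.random.default_rng(1).uniform(-1, 1, size=(200, 2))
y = X[:, 0] ** 2 + 0.1 * X[:, 1] ** 2
model = elm_train(X, y)
mse = float(np.mean((elm_predict(model, X) - y) ** 2))
```

Because only `beta` is trained, the fit is a single linear solve — the speed advantage the abstract alludes to when it mentions fast updates.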

Space robot capture control system, reinforcement learning method and dynamics modeling method

The invention discloses a space robot mechanical arm capture control system. The system comprises two loops, an inner loop and an outer loop: in the outer loop, a PD controller stabilizes the attitude of the space robot's base platform during capture, and in the inner loop, a reinforcement-learning-based control system drives the mechanical arm to capture a non-cooperative target. The invention further discloses a reinforcement learning method for controlling the inner-loop arm controller and a space robot dynamics modeling method for the capture control system. Compared with PD control alone, the attitude disturbance of the base platform under reinforcement learning (RL) control is smaller, the motion of the arm's end effector is more stable, the control precision is higher, the motion flexibility of the arm is better, and a greater degree of autonomous intelligence is achieved.
Owner:DALIAN UNIV OF TECH
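The outer-loop PD attitude stabilizer is the conventional half of the architecture and is easy to sketch. The 1-DOF unit-inertia model and the gains below are illustrative assumptions, not the patent's spacecraft dynamics.

```python
def pd_torque(q, q_dot, q_ref=0.0, kp=8.0, kd=4.0):
    """Outer-loop PD law on base-platform attitude error."""
    return kp * (q_ref - q) - kd * q_dot

# 1-DOF base attitude with unit inertia, disturbed to 0.3 rad by an
# arm manoeuvre; semi-implicit Euler integration
q, q_dot, dt = 0.3, 0.0, 0.01
for _ in range(1000):
    q_dot += pd_torque(q, q_dot) * dt
    q += q_dot * dt
```

With these gains the response is near critically damped, so the attitude settles back to the reference within a few seconds of simulated time.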

Deep reinforcement learning control method for vertical path following of intelligent underwater robot

The invention provides a deep reinforcement learning control method for vertical path following of an intelligent underwater robot. The method comprises the following steps: firstly, according to the path following control requirements of the robot, an environment is established for the robot to interact with an agent; secondly, an agent set is established; thirdly, an experience buffer pool is established; fourthly, a learner is established; and fifthly, path following control is conducted using a distributed deterministic policy gradient. The method is designed to address the complex and variable marine environment in which the robot operates, with which a traditional control method cannot interact. The path following control task can be completed in a distributed manner using the deterministic policy gradient, and the method has the advantages of self-learning, high precision, good adaptability and a stable learning process.
Owner:HARBIN ENG UNIV
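Steps two through four describe a standard distributed-actor layout: several agents push transitions into a shared experience buffer that a learner samples from. A minimal sketch (sizes and the transition contents are assumptions):

```python
import random
from collections import deque

class ExperienceBuffer:
    """Shared pool of (s, a, r, s') transitions the learner samples from."""
    def __init__(self, capacity=1000):
        self.buf = deque(maxlen=capacity)   # oldest entries are evicted
    def push(self, transition):
        self.buf.append(transition)
    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)

random.seed(0)
pool = ExperienceBuffer()
for agent_id in range(4):                   # step 2: the agent set
    for t in range(50):                     # each actor logs transitions
        s, a = (agent_id, t), random.uniform(-1.0, 1.0)
        pool.push((s, a, -abs(a), (agent_id, t + 1)))
batch = pool.sample(32)                     # step 4: the learner's batch
```

Decoupling actors from the learner this way is what lets the control task be trained "in a distributed manner" as the abstract claims.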

Brain-computer cooperation digital twin reinforcement learning control method and system

The invention discloses a brain-computer cooperation digital twin reinforcement learning control method and system. A brain-computer cooperation control model is constructed; the operator gives the virtual robot a direction instruction while the electroencephalogram (EEG) signal generated at that moment is collected; a corresponding speed instruction is given to the virtual robot according to the collected EEG signal to complete a specified action; and a reward value is calculated for the brain-computer cooperation control model according to the completion quality, completing the model's training. A double-loop information interaction mechanism between the brain and the computer is realized through the brain-computer cooperation digital twin environment in a reinforcement learning manner, covering both the information layer and the instruction layer. The method detects the operator's brain state from the EEG signals and applies compensation control to the robot's instructions accordingly, achieving accurate control; compared with other brain-computer cooperation methods, robustness and generalization ability are improved, and mutual adaptation and mutual growth between the brain and the computer are achieved.
Owner:XI AN JIAOTONG UNIV

Data-driven unmanned ship reinforcement learning controller structure and design method thereof

The invention discloses a data-driven unmanned ship reinforcement learning controller structure and a design method thereof. The controller structure comprises an unknown information extraction module, a prediction model generation module, a reward function module and a moving horizon optimization module. Being data-driven, it requires no accurate mathematical model of the controlled unmanned ship: the unknown information extraction module collects the ship's control inputs and output state data and extracts the unknown dynamics function, and the prediction model generation module reconstructs the extracted information into a prediction model, so the controller does not depend on accurate manual modeling of the ship. Nor does the design require separate controllers for the kinematic and dynamic levels. Through the prediction model and a set reward function, the control input undergoes moving horizon optimization to achieve an optimal control effect. The structure and method are applicable to both fully actuated and underactuated unmanned ships.
Owner:DALIAN MARITIME UNIVERSITY
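The moving horizon step can be sketched generically: roll candidate inputs through the (learned) prediction model, score each rollout with the reward function, and apply the first input of the best sequence. The one-step model and constant-input candidates below are simplifying assumptions.

```python
def predict(state, u):
    """Stand-in one-step prediction model (would be identified from data)."""
    return 0.9 * state + 0.1 * u

def reward(state, target=0.0):
    return -abs(state - target)            # penalise deviation from target

def moving_horizon_control(state, horizon=5, candidates=(-1.0, 0.0, 1.0)):
    """Score each candidate input over the horizon; keep the best."""
    best_u, best_score = None, float("-inf")
    for u in candidates:                   # constant-input sequences
        s, score = state, 0.0
        for _ in range(horizon):
            s = predict(s, u)
            score += reward(s)
        if score > best_score:
            best_u, best_score = u, score
    return best_u
```

Because only `predict` encodes the plant, swapping in a model reconstructed from logged input/output data — as the patent describes — leaves the optimization untouched.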

Active suspension reinforcement learning control method based on deep Q neural network

The invention relates to an active suspension reinforcement learning control method based on a deep Q neural network, and belongs to the technical field of automobile dynamics control and artificial intelligence. The reinforcement learning controller obtains state observations such as vehicle body acceleration and suspension dynamic deflection from the suspension system; a policy determines a reasonable active force to apply to the suspension; the suspension system then changes its state according to that force, and a reward value is generated to judge the quality of the applied force. By setting a reasonable reward function and combining it with dynamic data obtained from the environment, an optimal policy for sizing the active control force can be determined under a large amount of training, making the overall performance of the control system more excellent. The method allows the active suspension system to adjust dynamically and adaptively, overcoming influences that are difficult to handle in traditional suspension control, such as parameter uncertainty and changeable road surface disturbances, and improving passenger riding comfort as much as possible while guaranteeing the overall safety of the vehicle.
Owner:SOUTHEAST UNIV +1
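The state-force-reward loop above is a Q-learning loop; a tabular stand-in for the deep Q network makes the Bellman update visible. The three-level plant, the force set and the reward weights are illustrative assumptions.

```python
import random

ACTIONS = [-500.0, 0.0, 500.0]        # candidate active forces (N)
Q = {}                                 # (state, action index) -> Q value

def q(s, a):
    return Q.get((s, a), 0.0)

def transition(s, a):
    """Toy plant: states -1/0/+1 are body-acceleration levels; the
    force shifts the level by one in its own direction."""
    shove = (ACTIONS[a] > 0) - (ACTIONS[a] < 0)
    s2 = max(-1, min(1, s + shove))
    return s2, -abs(s2) - 1e-4 * abs(ACTIONS[a])   # comfort + energy cost

def update(s, a, r, s2, alpha=0.2, gamma=0.9):
    """Standard Q-learning (Bellman) update."""
    target = r + gamma * max(q(s2, b) for b in range(len(ACTIONS)))
    Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))

random.seed(0)
for _ in range(3000):                  # uniform exploration suffices here
    s = random.choice([-1, 0, 1])
    a = random.randrange(len(ACTIONS))
    s2, r = transition(s, a)
    update(s, a, r, s2)

def greedy(s):
    return max(range(len(ACTIONS)), key=lambda a: q(s, a))
```

The learned greedy policy pushes against the body acceleration and applies no force when the body is already settled — the qualitative behaviour an active suspension controller should converge to.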

Vector reinforcement learning control method for a grid frequency-regulation flywheel energy storage system

The invention provides a vector reinforcement learning control method for a power-grid frequency-regulation flywheel energy storage system. The method addresses the situation in which traditional frequency regulation resources cannot meet regulation requirements, owing to the impact of the randomness, volatility and uncertainty of new energy and distributed power generation on the grid. Frequency regulation of the flywheel energy storage system is combined with vector reinforcement learning: the optimal action of the flywheel system is selected by performing vector reinforcement learning on the voltage, and the system's motor is controlled to work in generator or motor state so that the system operates in discharging or charging mode, thereby adjusting the frequency of the power system. The response speed is much higher than that of traditional frequency regulation resources, so the grid frequency can be adjusted rapidly and kept within the allowable deviation range, maintaining frequency stability and thereby guaranteeing the reliability and safety of grid operation.
Owner:GUANGXI UNIV
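The generator/motor mode switch implied above reduces to a rule on the frequency deviation; the RL layer would learn when and how hard to act, but the underlying mapping can be sketched directly. The nominal frequency and deadband below are assumed values.

```python
NOMINAL_HZ, DEADBAND_HZ = 50.0, 0.05   # assumed grid nominal and deadband

def flywheel_mode(freq_hz):
    """Map measured grid frequency to the flywheel operating mode."""
    dev = freq_hz - NOMINAL_HZ
    if dev < -DEADBAND_HZ:
        return "discharge"   # generator mode: inject power, raise frequency
    if dev > DEADBAND_HZ:
        return "charge"      # motor mode: absorb power, lower frequency
    return "idle"            # within the allowable deviation range
```

A low frequency means the grid lacks power, so the flywheel spins down through its machine acting as a generator; a high frequency reverses the roles.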

Quasi-proportional resonant controller parameter adjustment method and system

The invention discloses a quasi-proportional resonant controller parameter adjustment method. The method comprises the following steps: obtaining models of the inverter and the load, and building a reinforcement learning training environment parameterized by those models; constructing a deep deterministic policy gradient (DDPG) reinforcement learning framework and defining its parameters, which comprise the state, the action and the reward value; and training the parameter-adjustment agent in the training environment. Because the reinforcement learning control algorithm is insensitive to the mathematical model and operating state of the controlled object, its self-learning capability provides strong adaptability and robustness to parameter changes and external disturbances; the control requirements can be met when multiple quasi-proportional resonant controllers are connected in parallel, and the control effect is ensured when the load changes.
Owner:GUANGDONG POWER GRID CO LTD +1
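The controller being tuned has the standard quasi-proportional resonant form G(s) = Kp + 2·Kr·ωc·s / (s² + 2·ωc·s + ω0²). The sketch below evaluates its gain magnitude, which a tuning loop could use to score candidate (Kp, Kr) actions; the default parameter values are assumptions.

```python
import math

def qpr_gain(freq_hz, kp=1.0, kr=100.0, wc=5.0, w0=2 * math.pi * 50):
    """|G(jw)| of the quasi-PR controller
    G(s) = Kp + 2*Kr*wc*s / (s^2 + 2*wc*s + w0^2)."""
    s = 1j * 2 * math.pi * freq_hz
    return abs(kp + (2 * kr * wc * s) / (s ** 2 + 2 * wc * s + w0 ** 2))
```

At the resonant frequency the s² and ω0² terms cancel and the gain collapses to Kp + Kr, giving the large gain at the tracked frequency (50 Hz here) with little gain elsewhere — the property that makes the quasi-PR structure suit inverter current control.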

Train cooperative operation control method based on reference deep reinforcement learning

Patent: CN114880770A (Active)
The invention discloses a train cooperative operation control method based on reference deep reinforcement learning. The method specifically comprises the steps of: building a train cooperative operation simulation environment, setting the train safety distance and other parameters, and calculating the estimated shortest real-time distance between two trains; setting a reward function and establishing an input-dimensionality-reduced reinforcement learning controller; adding a reference controller so that, when the train meets the reference control strategy condition, the reference control signal replaces the reinforcement learning control signal, with this data also used to optimize the reinforcement learning control strategy; training the network until its global reward is optimal and the control result matches expectations, at which point preliminary training is considered complete; and loading the reference control strategy and the reinforcement learning control strategy onto the real train, which outputs train control signals according to real train information to complete cooperative operation control. The method increases the speed of optimal strategy training and ensures the robustness of the control strategy during actual operation.
Owner:SOUTHWEST JIAOTONG UNIV
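The reference-override rule can be sketched simply: when a safety condition triggers, the reference command replaces the RL policy's output, and the transition is still logged so it can train the RL strategy. The distance-based condition, margin and commands below are assumptions.

```python
SAFETY_MARGIN_M = 200.0   # assumed minimum inter-train distance

def choose_control(rl_action, reference_action, distance_m):
    """Reference command overrides the RL command inside the margin."""
    use_reference = distance_m < SAFETY_MARGIN_M
    action = reference_action if use_reference else rl_action
    return action, use_reference

log = []
for dist, rl_u, ref_u in [(500.0, 0.4, -1.0), (150.0, 0.4, -1.0)]:
    u, overridden = choose_control(rl_u, ref_u, dist)
    log.append((dist, u, overridden))   # data reused to optimise the policy
```

At 500 m the RL traction command passes through; at 150 m the reference braking command takes over, and both transitions enter the training data — the dual role the abstract assigns to the reference controller.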