
1023 results about how to "Reduce variance" patented technology

Autonomous underwater vehicle trajectory tracking control method based on deep reinforcement learning

Active | CN108803321A | Stabilize the learning process | Optimal target strategy | Adaptive control | Simulation | Intelligent control
The invention provides an autonomous underwater vehicle (AUV) trajectory tracking control method based on deep reinforcement learning, in the field of deep reinforcement learning and intelligent control. The method includes the following steps: defining the AUV trajectory tracking control problem; establishing a Markov decision process model of the problem; constructing a hybrid policy-evaluation network consisting of multiple policy networks and multiple evaluation networks; and solving for the target policy of AUV trajectory tracking control with the constructed network. For the evaluation networks, the performance of each network is assessed by a defined expected Bellman absolute error, and only the one evaluation network with the lowest performance is updated at each time step. For the policy networks, one network is selected at random at each time step and updated with a deterministic policy gradient, so that the finally learned policy is the mean of all the policy networks. The method is not easily influenced by poor historical AUV tracking trajectories and achieves high precision.
Owner: TSINGHUA UNIV
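
The selection rules in the abstract above can be sketched in a few lines. This is only an illustration, not the patented implementation: it assumes that "lowest performance" means the largest expected Bellman absolute error, and the function names, the toy actors, and the error values are all hypothetical. The gradient updates themselves are omitted.

```python
import random


def worst_critic_index(bellman_abs_errors):
    """Index of the critic with the largest error (assumed to be the worst performer)."""
    return max(range(len(bellman_abs_errors)), key=lambda i: bellman_abs_errors[i])


def pick_actor_index(num_actors, rng):
    """Choose one policy network uniformly at random for this time step."""
    return rng.randrange(num_actors)


def mean_policy_action(actors, state):
    """Final policy: the mean of the actions proposed by all policy networks."""
    actions = [actor(state) for actor in actors]
    return sum(actions) / len(actions)


# Hypothetical stand-ins: three scalar "policy networks" and three error estimates.
actors = [lambda s: 0.5 * s, lambda s: 1.0 * s, lambda s: 1.5 * s]
errors = [0.12, 0.40, 0.25]
rng = random.Random(0)

critic_to_update = worst_critic_index(errors)        # only this critic is updated
actor_to_update = pick_actor_index(len(actors), rng)  # only this actor is updated
final_action = mean_policy_action(actors, 2.0)
```

Updating a single critic and a single actor per step keeps the networks from all drifting on the same batch, which is one plausible reading of why the abstract describes the learning process as stabilized.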

Water-cooled heat management system of electric automobile battery pack

The invention discloses a water-cooled heat management system for an electric automobile battery pack. A power battery is fixed between an upper fixing baffle and a lower fixing baffle, and a cooling separator is arranged tightly against the power battery. A cooling channel is disposed in the cooling separator, and the two ends of the cooling channel communicate with a liquid inlet pipe and a liquid outlet pipe. The liquid inlet pipe is connected to a water pump through a liquid inlet connector; the liquid outlet pipe conveys the heat transfer working medium that has passed through the cooling channel to a heat exchanger and a heating insulation water tank through a liquid outlet connector. The cooling separator is provided with a temperature sensor; a BMS reads the temperature data from the sensor and controls the heat transfer working medium conveyed into the cooling separator. The system addresses temperature control of the battery pack while the battery is charging or discharging. It provides a water-cooled system that is safe and reliable, performs well in operation, is convenient to maintain and replace, and can effectively regulate the working temperature of the battery.
Owner: HUBEI GREEN DRIVING SCI & TECH
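
As a rough illustration of the control loop described above (the BMS reads the separator temperature and decides how the working medium is routed), the sketch below uses simple hysteresis control. The abstract does not state a control law, so the thresholds, the hysteresis band, the mode names, and `select_mode` itself are all assumptions made for this example.

```python
def select_mode(temp_c, previous_mode, low_c=15.0, high_c=35.0, band_c=2.0):
    """Pick a thermal mode from one temperature reading.

    All thresholds are hypothetical. "cool" stands for routing the medium
    through the heat exchanger, "heat" for drawing from the heated tank.
    """
    if temp_c >= high_c:
        return "cool"
    if temp_c <= low_c:
        return "heat"
    # Hysteresis: keep the current action until the reading is safely
    # back inside the band, to avoid rapid switching near a threshold.
    if previous_mode == "cool" and temp_c > high_c - band_c:
        return "cool"
    if previous_mode == "heat" and temp_c < low_c + band_c:
        return "heat"
    return "idle"


# Simulated sensor readings in degrees Celsius (made-up values).
readings = [25.0, 36.0, 34.0, 32.5, 14.0, 16.0, 18.0]
mode = "idle"
modes = []
for reading in readings:
    mode = select_mode(reading, mode)
    modes.append(mode)

print(modes)
```

The hysteresis band is a common design choice for pump and valve control because it prevents the actuator from toggling on every small fluctuation around a setpoint.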

Dialog strategy online realization method based on multi-task learning

The invention discloses a dialog strategy online realization method based on multi-task learning. The method acquires corpus information from a man-machine dialog in real time, extracts the current user state features and user action features, and constructs the training input from them. A single accumulated reward value in the dialog strategy learning process is then split into a dialog round number reward value and a dialog success reward value, which serve as training annotations, and two different value models are optimized simultaneously through multi-task learning during online training. Finally, the two reward values are merged and the dialog strategy is updated. The method adopts a reinforcement learning framework and optimizes the dialog strategy through online learning, so rules and strategies do not need to be designed manually for each domain, and the method can adapt to domain information structures of differing complexity and to data of different scales. Because the original task of optimizing a single accumulated reward value is split and the parts are optimized simultaneously with multi-task learning, a better network structure is learned and the variance in the training process is lowered.
Owner: AISPEECH CO LTD
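
The reward split described above can be illustrated with plain return calculations. This is a minimal sketch, not the method from the patent: the per-turn penalty, the success bonus, the discount factor, and merging by simple addition are assumptions, and in the described method each stream would be the training target of its own value model.

```python
def discounted_returns(rewards, gamma):
    """Return-to-go for every turn of one dialog."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


gamma = 0.5                            # hypothetical discount factor
turn_rewards = [-1.0, -1.0, -1.0]      # hypothetical penalty per dialog round
success_rewards = [0.0, 0.0, 20.0]     # hypothetical bonus when the dialog succeeds

turn_targets = discounted_returns(turn_rewards, gamma)        # target for value model 1
success_targets = discounted_returns(success_rewards, gamma)  # target for value model 2

# Merging by addition recovers the return of the original single reward,
# because discounted returns are linear in the rewards.
merged = [t + s for t, s in zip(turn_targets, success_targets)]
single = discounted_returns(
    [t + s for t, s in zip(turn_rewards, success_rewards)], gamma
)
```

The two targets have very different scales and sparsity (a dense small penalty versus a sparse large bonus), which is one intuition for why fitting them separately could lower training variance.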

Width transfer learning network and rolling bearing fault diagnosis method based on same

The invention discloses a width transfer learning network and a rolling bearing fault diagnosis method based on it, in the technical field of bearing fault diagnosis. It aims to solve several problems in diagnosing rolling bearings under variable load: scarcity of labeled vibration data, a large distribution difference between source domain data and target domain data in the same state, unbalanced distribution of multi-state data, and low diagnosis accuracy and model training efficiency. A width learning system is used to extract features from the source domain data and the target domain data and to construct a sample set; on this basis, a balanced distribution adaptation method from transfer learning is adopted to reduce the difference between the source domain and the target domain. A chicken swarm algorithm is introduced to optimize the parameters of the width transfer learning network and establish the network model. The proposed model is applied to intelligent fault diagnosis of rolling bearings under variable load, and experimental results verify the efficiency and accuracy of the proposed method.
Owner: HARBIN UNIV OF SCI & TECH
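
The idea behind balanced distribution adaptation, as it is commonly described in the transfer learning literature, is to weigh the marginal and the class-conditional distribution gaps between domains with a balance factor. The sketch below is a deliberately simplified stand-in, not the patent's procedure: it measures each gap as the squared distance between feature means, the balance factor `mu` and all data values are hypothetical, and the target labels are assumed to be pseudo-labels from some earlier classifier.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def sq_mean_gap(rows_a, rows_b):
    """Squared Euclidean distance between the means of two sample sets."""
    mean_a, mean_b = mean_vector(rows_a), mean_vector(rows_b)
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


def balanced_distance(xs, ys, xt, yt_pseudo, mu):
    """(1 - mu) * marginal gap + mu * sum of per-class conditional gaps."""
    marginal = sq_mean_gap(xs, xt)
    conditional = 0.0
    for label in sorted(set(ys) & set(yt_pseudo)):
        source_rows = [x for x, y in zip(xs, ys) if y == label]
        target_rows = [x for x, y in zip(xt, yt_pseudo) if y == label]
        conditional += sq_mean_gap(source_rows, target_rows)
    return (1.0 - mu) * marginal + mu * conditional


# Made-up one-dimensional features for two bearing states (labels 0 and 1).
xs, ys = [[0.0], [2.0]], [0, 1]
xt, yt_pseudo = [[1.0], [3.0]], [0, 1]

distance = balanced_distance(xs, ys, xt, yt_pseudo, mu=0.5)
```

Setting `mu` near 0 emphasizes the overall (marginal) shift, while `mu` near 1 emphasizes per-class shifts, which matters more when the classes are unbalanced.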

Robot motion control method and device based on actor-critic method

Inactive | CN105690392A | Preferred course of action | Realize intelligent motion control | Programme-controlled manipulator | Movement control | Time difference
The invention discloses a robot motion control method and device based on an actor-critic method. The control method comprises the following steps: video data are collected, and the current robot position, the obstacle distribution, and the given destination are obtained; the position of the robot serves as its state, and its direction of motion serves as the action; a state transition is carried out; discrete strategy factors are calculated; the approximate average reward value and the approximate average squared reward value are updated; the temporal-difference errors of the current average reward and of the current average squared reward are calculated; the approximate average reward parameters and approximate average squared reward parameters are updated iteratively; the approximate average reward gradient and approximate average squared reward gradient are calculated and the strategy parameters are updated; and the state and action are replaced with their successors. These steps are repeated until the strategy parameters converge, at which point robot motion control is achieved. The method and device achieve intelligent motion control with a stable control result.
Owner: SUZHOU UNIV
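
Tracking both an average reward and an average squared reward, as the abstract above describes, makes it possible to estimate the reward variance through the identity Var[r] = E[r^2] - (E[r])^2. The sketch below shows only that bookkeeping step. It is an assumption that the method uses the two estimates this way, and the step-size schedule, the function name, and the reward values are hypothetical.

```python
def update_estimates(avg_reward, avg_sq_reward, reward, step_size):
    """Move both running estimates a small step toward the new observation."""
    avg_reward += step_size * (reward - avg_reward)
    avg_sq_reward += step_size * (reward ** 2 - avg_sq_reward)
    return avg_reward, avg_sq_reward


rewards = [1.0, 3.0, 1.0, 3.0]   # made-up rewards from four time steps
avg_reward = 0.0
avg_sq_reward = 0.0
for step, reward in enumerate(rewards, start=1):
    # A 1/n step size makes the estimates equal to the plain sample means.
    avg_reward, avg_sq_reward = update_estimates(
        avg_reward, avg_sq_reward, reward, 1.0 / step
    )

variance_estimate = avg_sq_reward - avg_reward ** 2
```

A policy update that penalizes this variance estimate, alongside maximizing the average reward, would favor steadier behavior, which fits the abstract's claim of a stable control result.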