
90 results about "Vanishing gradient problem" patented technology

In machine learning, the vanishing gradient problem is a difficulty encountered when training artificial neural networks with gradient-based learning methods and backpropagation. In such methods, each of the network's weights receives, in every training iteration, an update proportional to the partial derivative of the error function with respect to that weight. The problem is that in some cases the gradient becomes vanishingly small, effectively preventing the weight from changing its value. In the worst case, this may completely stop the neural network from training further. As one example of the cause, traditional activation functions such as the hyperbolic tangent have derivatives in the range (0, 1], and backpropagation computes gradients by the chain rule. Computing the gradients of the "front" layers of an n-layer network therefore multiplies n of these small numbers together, so the gradient (error signal) decreases exponentially with n and the front layers train very slowly.
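The multiplication of per-layer derivatives described above can be illustrated with a minimal Python sketch. It makes simplifying assumptions that are not part of the definition: every layer is a scalar tanh unit, weights are ignored (treated as 1), and every pre-activation equals 1.0. The function names are hypothetical.

```python
import math


def tanh_derivative(z):
    """Derivative of tanh at pre-activation z; lies in (0, 1]."""
    return 1.0 - math.tanh(z) ** 2


def front_layer_gradient_factor(pre_activations):
    """Chain-rule product of the tanh derivatives of all layers.

    This is the factor by which the error signal is scaled on its way
    back to the first ("front") layer, under the assumption that all
    weights are 1 (an illustrative simplification).
    """
    factor = 1.0
    for z in pre_activations:
        factor *= tanh_derivative(z)
    return factor


# Assumed toy setting: every layer sees a pre-activation of 1.0.
for n in (1, 5, 10, 20, 40):
    print(n, front_layer_gradient_factor([1.0] * n))
```

The printed factor shrinks geometrically as the number of layers n grows, which is the exponential decay the paragraph refers to.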

Image super-resolution method based on dense connection network

The invention discloses an image super-resolution method based on a densely connected network. By increasing the depth of a convolutional neural network and introducing a large number of skip connections into the deep network, the method effectively addresses the problem of gradients vanishing during backpropagation through the deep network, improves the flow of information through the network, and improves the super-resolution reconstruction capability of the convolutional neural network. At the same time, the method effectively combines low-level features with high-level abstract features, and can reduce the number of model parameters and compress the deep network model, improving the efficiency of super-resolution reconstruction. In addition, by introducing deep supervision, the method can reconstruct the super-resolution image at different network depths, which not only improves training of the deep network but also makes it possible, at test time, to select a network depth suited to the computing capability of the test terminal when reconstructing a high-definition image. Finally, the method is trained on an image set covering a plurality of magnification factors, so that the resulting model can perform super-resolution at several scales without a separate model being trained for each magnification factor.
Owner:福建帝视信息科技有限公司

Mechanical equipment residual service life prediction method and system

The invention discloses a method and system for predicting the remaining useful life (RUL) of mechanical equipment. The method comprises the following steps: constructing a deep neural network life prediction model in which a temporal convolutional network serves as the feature extraction algorithm and a long short-term memory (LSTM) network serves as the regression prediction algorithm, and training the model; constructing, according to the model of the tested equipment and the time sequence of data acquisition, the acquired real-time operating data of the tested equipment into a life prediction data set with time-series characteristics; and processing the life prediction data set with the deep neural network life prediction model to obtain the remaining useful life of the tested equipment. The condition monitoring signals output by sensors that monitor mechanical equipment have the characteristics of a time series. By combining a temporal convolutional network with an LSTM network to build a deep neural network life prediction model for RUL prediction of mechanical equipment, the over-fitting and vanishing gradient problems of common deep neural network models are addressed, and prediction accuracy is improved.
Owner:SHANDONG UNIV

Infant brain magnetic resonance image partitioning method based on fully convolutional network

The invention provides an infant brain magnetic resonance image partitioning (segmentation) method based on a fully convolutional network. The main content of the method comprises a multi-stream three-dimensional fully convolutional network (FCN) with skip connections, partial transfer learning, training, and testing and evaluation. The method proceeds as follows: first, a probability map of each brain tissue is learned from a multi-modal magnetic resonance image; second, initial segmentations of the different brain tissues are obtained from the probability maps and used to calculate a distance map for each brain tissue; third, spatial context information is modeled from the distance maps; and finally, the final segmentation is obtained using spatial correlation information and the multi-modal magnetic resonance image. The training process mainly comprises training data augmentation, training patch preparation and iterative training, and various metrics are used for evaluation after testing. With this method, the white matter, grey matter and cerebrospinal fluid regions are successfully segmented, the potential vanishing gradient problem of multi-level deep supervision is alleviated, training efficiency is improved, and segmentation performance is greatly enhanced.
Owner:SHENZHEN WEITESHI TECH

Semantic similarity calculation method based on deep learning

The invention discloses a semantic similarity calculation method based on deep learning, and relates to the field of semantic similarity calculation. The method comprises the following steps: step 1, constructing a training data set and preprocessing the training data to obtain one-hot sparse vectors; step 2, constructing a semantic similarity calculation network model comprising N layers of BI-LSTM networks, a residual network, a similarity matrix, a convolutional neural network (CNN), a pooling layer and a fully connected layer; step 3, inputting the one-hot sparse vectors into the network model and training its parameters on the training data set to complete supervised training; and step 4, inputting a text to be tested into the trained network model, judging whether it is a similar text, and outputting the result. Because the model uses both a BI-LSTM network and a CNN, and a residual network is added to the BI-LSTM network, the vanishing gradient problem caused by the multi-layer network is addressed and the feature extraction capability of the model is enhanced.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA
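Why adding residual connections to a multi-layer network can counter vanishing gradients may be sketched as follows. This is an illustrative toy, not the patented model: each layer is assumed to be a scalar tanh unit with a shared positive weight w, and all function names are hypothetical. A residual layer computes x + tanh(w·x), so its derivative is 1 plus the plain layer's derivative and cannot fall below 1 when w is positive.

```python
import math


def plain_layer_grad(x, w):
    """d/dx of tanh(w * x)."""
    return w * (1.0 - math.tanh(w * x) ** 2)


def residual_layer_grad(x, w):
    """d/dx of x + tanh(w * x): the identity path contributes a constant 1."""
    return 1.0 + plain_layer_grad(x, w)


def input_gradient(x, w, depth, residual):
    """Run `depth` scalar layers forward and accumulate
    d(output)/d(input) with the chain rule."""
    grad = 1.0
    for _ in range(depth):
        if residual:
            grad *= residual_layer_grad(x, w)
            x = x + math.tanh(w * x)
        else:
            grad *= plain_layer_grad(x, w)
            x = math.tanh(w * x)
    return grad


# Assumed toy values: input 1.0, weight 0.5, 30 layers.
print("plain:   ", input_gradient(1.0, 0.5, 30, residual=False))
print("residual:", input_gradient(1.0, 0.5, 30, residual=True))
```

In this toy setting the plain stack's gradient collapses toward zero, while the residual stack's gradient stays at or above 1 because of the identity path.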

GRU based recurrent neural network multi-label learning method

The invention provides a GRU-based recurrent neural network multi-label learning method, comprising the steps of: S1, initializing the system parameters θ = (W, U, B); S2, inputting the samples {(x_i, y_i)}, i = 1, ..., N, and calculating the hidden state h_T output by the RNN (recurrent neural network) at each time step, where the sample x_i belongs to R^(M×1), y_i is the multi-label vector of x_i, and y_i belongs to R^(C×1); S3, calculating the context vector h_T and the output z_i of the output layer; S4, calculating the predicted output ŷ_i, calculating the loss L_i, and determining the objective function J; S5, solving for the gradient of the system parameters θ = (W, U, B) using gradient descent and the BPTT (backpropagation through time) algorithm; S6, determining a learning rate η and updating each weight as W = W - η·δW; S7, judging whether the neural network has stabilized, and if so executing step S8, otherwise returning to step S2 and iteratively updating the model parameters; and S8, outputting the optimized model. The invention makes full use of the RNN to obtain an effective feature representation of each sample and thereby improve the accuracy of multi-label classification. In addition, vanishing gradients are less likely to occur during backpropagation.
Owner:HRG INT INST FOR RES & INNOVATION
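The update loop in steps S5 to S8 of the abstract above (compute the gradient, apply W = W - η·δW, check for stability, repeat) can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented GRU training procedure: the parameter is a single scalar, the objective is a toy quadratic, and "stable" is assumed to mean that the update step falls below a tolerance. The function name, tolerance and iteration cap are hypothetical.

```python
def gradient_descent(grad_fn, w0, eta=0.1, tol=1e-8, max_iters=10_000):
    """Iterate W = W - eta * dW until the update is smaller than `tol`.

    grad_fn   -- returns dJ/dW at the current W (stands in for BPTT in S5)
    eta       -- learning rate (S6)
    tol       -- assumed stability criterion (S7); the abstract does not
                 say how stability is judged
    """
    w = w0
    for step in range(max_iters):
        delta_w = grad_fn(w)            # S5: gradient of the objective
        w_new = w - eta * delta_w       # S6: weight update
        if abs(w_new - w) < tol:        # S7: stable, so stop
            return w_new, step + 1      # S8: output the optimized parameter
        w = w_new                       # otherwise iterate again
    return w, max_iters


# Toy objective J(W) = (W - 3)^2 with gradient 2 * (W - 3), minimum at W = 3.
w_opt, iters = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_opt, iters)
```

In a real recurrent model the scalar W would be replaced by the parameter matrices W, U and B, and grad_fn by a BPTT pass over the training samples.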

Hyperspectral image classification method based on global attention residual network

The invention discloses a hyperspectral image classification method based on a global attention residual network, the method comprising: constructing an overall network comprising a multi-scale feature extraction network, a global attention module and an improved residual network module; performing multi-scale feature extraction to extract hierarchical features of the hyperspectral image; constructing, in the global attention module, the spatial and spectral dependencies among global pixels by combining a spatial attention module with a spectral attention module; fusing the improved residual network module and the global attention module to form a novel global attention residual network; and sending the output through global pooling into a classifier for final classification, and outputting the result. By introducing a multi-scale receptive field and a global attention module, the method obtains rich spatial-spectral features at the same time, and an improved residual network is added to alleviate the vanishing gradient problem and accelerate network convergence, so that classification precision is improved and a good, stable classification result is ensured.
Owner:HOHAI UNIV

Text classification method, device, medium and apparatus based on convolution neural network

The invention provides a text classification method, device, medium and apparatus based on a convolutional neural network. The method comprises the following steps: obtaining a word vector matrix of a text to be classified that relates to online public opinion; constructing an initial feature matrix from the word vector matrix and using it as the input of a trained text classification model; and inputting the initial feature matrix into the first region block in sequence and determining the output of that region block. The input of each hidden layer in a region block comes from the outputs of all other hidden layers in that region block. The output of the current region block is taken as the input of the next region block until the outputs of all region blocks have been determined; the outputs of all region blocks are then passed to the fully connected layer, and the classification result is determined from them. The network structure adopted in the method makes the transmission of network features and gradients more effective, avoids the vanishing gradient problem caused by layer-by-layer transmission of loss function information, and allows the network depth to be increased without gradients vanishing.
Owner:PING AN TECH (SHENZHEN) CO LTD

Three-dimensional atmospheric temperature profile inversion method and system based on DenseNet convolutional neural network

The invention discloses a three-dimensional atmospheric temperature profile inversion method and system based on a DenseNet convolutional neural network, and belongs to the field of atmospheric microwave remote sensing. The method comprises the following steps: constructing a training data set from two-dimensional atmospheric observation brightness temperature images of an oxygen absorption frequency band and three-dimensional atmospheric temperature profiles; training the DenseNet convolutional neural network on the training data set until it converges, to obtain a trained network; and inputting a brightness temperature image to be inverted into the trained network, which outputs the three-dimensional atmospheric temperature profile obtained by inversion. The training data set takes two-dimensional brightness temperature images as its units; each brightness temperature image covers a certain area of the earth and the whole data set spans a long time interval, so generalization ability is greatly improved and inversion error is reduced. The DenseNet convolutional neural network has a large number of layers, and its dense connection structure avoids the vanishing gradient problem during training. The network is suited to complex inversion: data from clear-sky, cloudy and rainy scenes can be inverted together directly, which reduces time consumption.
Owner:HUAZHONG UNIV OF SCI & TECH

Image super-resolution reconstruction method of supervised convolutional neural network based on multi-scale feature extraction fusion

The invention discloses an image super-resolution reconstruction method of a supervised convolutional neural network based on multi-scale feature extraction and fusion. The method comprises the following steps: image preprocessing, image feature extraction and image reconstruction. In the image feature extraction step, a plurality of MSB modules are used; within each MSB module, features are extracted from the image by convolution layers containing convolution kernels of different sizes, repeated feature learning is carried out through dense connections, and a supervision-layer error function is designed into the model to assist in correcting the model's reconstruction errors. Because the extracted feature maps are processed at different scales, the adaptability of the model is enhanced; multi-channel propagation of information is achieved, convergence is accelerated, and the vanishing gradient phenomenon is alleviated; and because an auxiliary supervision error function is added, back propagation of the gradient is strengthened and extra regularization is provided, so the vanishing gradient problem of traditional algorithms is effectively addressed and the precision of the algorithm is improved.
Owner:TIANJIN CHENGJIAN UNIV

Power transmission line icing thickness prediction method based on CEEMDAN-QFOA-LSTM

The invention discloses a power transmission line icing thickness prediction method based on CEEMDAN-QFOA-LSTM, and relates to the field combining power transmission line state evaluation with deep learning. The method comprises the following steps: (1) acquiring and preprocessing data; (2) performing CEEMDAN decomposition on the historical icing thickness data sequence; (3) optimizing the hyper-parameters of the LSTM with a quantum fruit fly optimization algorithm (QFOA); (4) training the LSTM model; and (5) predicting the icing thickness of the power transmission line and analyzing the result. The CEEMDAN decomposition algorithm converts a sequence that is difficult to predict directly into a plurality of predictable component sequences, and the neural network can grasp the behavior of the sequence more accurately from the multi-dimensional feature information obtained through decomposition. The QFOA optimization algorithm is used to obtain the hyper-parameters, which avoids a complex manual tuning process and trains the network model more effectively. The LSTM neural network used does not suffer from the vanishing gradient problem of a general network, so optimal convergence of the model is ensured and both short-term and long-term time series prediction are handled effectively.
Owner:CENT CHINA BRANCH OF STATE GRID CORP OF CHINA +1

Ground surface evapotranspiration data downscaling method based on multi-source data and deep learning

The invention provides a surface evapotranspiration data downscaling method based on multi-source data and deep learning. The method comprises the steps of: obtaining low-spatial-resolution satellite surface evapotranspiration data, low-spatial-resolution atmospheric reanalysis data and high-spatial-resolution satellite remote sensing data; carrying out data preprocessing; establishing a surface evapotranspiration inversion model based on a purpose-built deep learning regression network; and performing downscaling inversion of high-spatial-resolution surface evapotranspiration using the inversion model established on the low-spatial-resolution surface evapotranspiration. The method improves the precision of surface evapotranspiration inversion by comprehensively considering the factors that influence surface evapotranspiration. The complex nonlinear relationship between remote sensing surface parameters, atmospheric data and surface evapotranspiration is analyzed using deep learning, and is learned with batch normalization (BN) and a dynamic learning rate. BN avoids the vanishing gradient problem and greatly increases training speed, and the dynamic learning rate helps the network converge better to the optimal solution.
Owner:CHINESE ACAD OF SURVEYING & MAPPING
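One commonly cited reason batch normalization helps with vanishing gradients is that it re-centers pre-activations so that saturating activation functions operate where their derivative is not close to zero. The sketch below illustrates that effect only. It is not the patented network: the tanh activation and the sample values are assumptions, the learnable scale and shift parameters of BN are omitted, and the function names are hypothetical.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize a batch to zero mean and (approximately) unit variance.

    The learnable scale and shift of full batch normalization are left
    out to keep the sketch short.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def mean_tanh_derivative(values):
    """Average derivative of tanh over a batch of pre-activations."""
    return sum(1.0 - math.tanh(v) ** 2 for v in values) / len(values)


# Assumed batch of large pre-activations that saturate tanh.
raw = [6.0, 7.5, 8.0, 9.0, 10.5]
normed = batch_norm(raw)

print("mean derivative without BN:", mean_tanh_derivative(raw))
print("mean derivative with BN:   ", mean_tanh_derivative(normed))
```

Without normalization the average derivative is close to zero, so little gradient passes through the layer; after normalization it is of order one.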