Neural network optimization method of a lifted proximal operator machine (LPOM)

A neural network optimization technology, applied in the field of deep learning, addressing problems such as slow convergence speed and the difficulty of training deep neural networks.

Active Publication Date: 2018-03-09
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, solving the optimization problem of the neural network is a typical non-convex optimization problem. As the number of neural network layers increases, it becomes more difficult to train the neural network.
However, in practice, ADMM is often only able to solve shallow neural networks (about 4 layers).
For deep neural networks, the convergence speed of the ADMM method becomes very slow.

Method used


Examples


Embodiment 1

[0171] Embodiment 1: shallow network

[0172] For a three-layer (n = 3) neural network with 300 units in the hidden layer, using the LPOM algorithm we set the hyperparameters μ_i = 2^(i−n), K_1 = 600, K_2 = 100, m_1 = 1000, b = 100.

[0173] We directly compare the final recognition rates. When the LPOM algorithm is used to optimize the neural network, the final recognition rate is 95.6%, while the stochastic gradient descent method achieves 95.3% (the latter result is taken directly from the MNIST official website, http://yann.lecun.com/exdb/mnist/). It can be seen that the LPOM method obtains recognition results comparable to the stochastic gradient descent method on a shallow neural network.
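The hyperparameter schedule above can be written out explicitly. The following Python snippet is only an illustrative sketch: the function name lpom_hyperparams and the assumption that μ_i runs over i = 2, ..., n are ours, not from the patent; it merely collects the constants quoted in this embodiment.

# Illustrative helper collecting the hyperparameters quoted in Embodiment 1.
# The name and the index range assumed for mu_i are not from the patent.
def lpom_hyperparams(n_layers):
    return {
        "mu": {i: 2.0 ** (i - n_layers) for i in range(2, n_layers + 1)},  # mu_i = 2^(i-n)
        "K1": 600,
        "K2": 100,
        "m1": 1000,
        "b": 100,
    }

print(lpom_hyperparams(3))
# {'mu': {2: 0.5, 3: 1.0}, 'K1': 600, 'K2': 100, 'm1': 1000, 'b': 100}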

Embodiment 2

[0174] Embodiment 2: Deep Network

[0175] The method of the present invention is applied to a deep neural network. The network structure is set so that n−2 is the number of hidden layers, and n−2 is set to 18, 19, and 20. For the LPOM algorithm, the same hyperparameters are used: μ_i = 2^(i−n), K_1 = 600, K_2 = 100, m_1 = 1000, b = 100. For the stochastic gradient descent method, hyperparameters are searched as follows: 1) the step size is searched over 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1; and 2) the momentum is searched over 0, 0.2, 0.5, 0.9. For both the LPOM algorithm and the SGD (stochastic gradient descent) algorithm, the initialization method documented in reference [17] (Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks [C] // Artificial Intelligence and Statistics, 2010, 9: 249-256.) is used: parameters are drawn from the uniform distribution U(−√(6/(n_i+n_o)), +√(6/(n_i+n_o))), where n_i and n_o are the input and output d...
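The cited initialization from Glorot and Bengio (2010) is the standard "Xavier" uniform scheme. A minimal Python sketch follows, together with the SGD hyperparameter grid described above; the names glorot_uniform and sgd_grid are illustrative and the code is not taken from the patent.

import itertools
import numpy as np

def glorot_uniform(n_in, n_out, rng=None):
    # Draw a weight matrix from U(-sqrt(6/(n_i+n_o)), +sqrt(6/(n_i+n_o))).
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

# SGD hyperparameter grid searched in Embodiment 2: step sizes x momenta.
step_sizes = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1]
momenta = [0, 0.2, 0.5, 0.9]
sgd_grid = list(itertools.product(step_sizes, momenta))  # 28 configurations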



Abstract

The invention discloses a neural network optimization method, called the lifted proximal operator machine (LPOM), and relates to the technical field of deep learning. The method does not use first-order or second-order derivative information to optimize the neural network directly; instead, the neural network is converted into a new LPOM optimization problem, which is then solved by an alternating minimization method. By adopting the disclosed method, the layered structure of the neural network is eliminated in the solving process; the solution can be obtained through an alternating iteration process; the solving method can to some degree be converted into a stochastic algorithm, so that a small amount of computation is maintained in each iteration; and for neural networks of more than 20 layers, the training error can still be reduced stably.
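To illustrate what "alternating minimization" means here, the toy Python example below alternates between updating per-layer auxiliary activation variables with the weights fixed and updating the weights with the activations fixed. It uses a purely linear, quadratic-penalty objective, so it is not the patent's LPOM formulation, only a sketch of the block-coordinate structure.

import numpy as np

# Toy, fully linear illustration of alternating minimization over weights
# W1, W2 and auxiliary activations X2, X3 (NOT the patent's LPOM objective):
#   minimize ||X3 - Y||^2 + mu2*||X2 - W1 X1||^2 + mu3*||X3 - W2 X2||^2
# Each block update below is a closed-form least-squares solve.
def toy_alternating_minimization(X1, Y, mu2=0.5, mu3=1.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    d1, d2, d3 = X1.shape[0], 8, Y.shape[0]
    W1, W2 = rng.normal(size=(d2, d1)), rng.normal(size=(d3, d2))
    X2, X3 = W1 @ X1, Y.copy()
    for _ in range(iters):
        # Step 1: update activations with the weights fixed.
        X2 = np.linalg.solve(mu2 * np.eye(d2) + mu3 * W2.T @ W2,
                             mu2 * W1 @ X1 + mu3 * W2.T @ X3)
        X3 = (Y + mu3 * W2 @ X2) / (1.0 + mu3)
        # Step 2: update weights with the activations fixed.
        W1 = X2 @ np.linalg.pinv(X1)
        W2 = X3 @ np.linalg.pinv(X2)
    return W1, W2

# Example usage on random data:
X1 = np.random.default_rng(1).normal(size=(4, 100))
Y = np.random.default_rng(2).normal(size=(3, 100))
W1, W2 = toy_alternating_minimization(X1, Y)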

Description

technical field
[0001] The present invention relates to the field of deep learning technology, and in particular to a new neural network optimization method named the Lifted Proximal Operator Machine (LPOM). The method transforms the optimization problem of the neural network into a new optimization problem to be solved, and the Karush–Kuhn–Tucker (KKT) conditions of the transformed problem are equivalent to the forward process of the neural network.
Background technique
[0002] In recent years, deep neural networks have achieved great success in artificial intelligence, image recognition, and speech recognition. Compared with shallow neural networks, deep neural networks usually have more model parameters and larger capacity, and can achieve better results when the amount of data is large. However, solving the optimization problem of a neural network is a typical non-convex optimization problem. As the number of neural network layers increases, it becomes more and more difficult to train the neural network...
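One way to see how a KKT (first-order optimality) condition can reproduce the forward process, sketched here with the generic proximal-operator identity rather than the patent's exact objective: if a penalty h is chosen so that its proximal operator equals the activation function φ, then minimizing the lifted per-layer term over the activation x recovers the forward propagation rule:

\min_{x}\; h(x) + \tfrac{1}{2}\lVert x - W z \rVert^{2}
\;\Longrightarrow\;
0 \in \partial h(x) + (x - W z)
\;\Longrightarrow\;
x = \operatorname{prox}_{h}(W z) = \phi(W z),

which is exactly the forward pass x = φ(Wz) of a single layer with input z and weight matrix W.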


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08
CPC: G06N3/08
Inventor: 林宙辰, 方聪
Owner: PEKING UNIV