Neural network optimization method of a lifted proximal operator machine (LPOM)

A neural network optimization technology, applied in the field of deep learning, addressing problems such as slow convergence speed and the difficulty of training deep neural networks.

Active Publication Date: 2018-03-09
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, solving the optimization problem of the neural network is a typical non-convex optimization problem. As the number of neural network layers increases, it becomes more difficult to train the neural network.
However, in practice, ADMM is often only able to solve shallow neural networks (about 4 layers).
For deep neural networks, the convergence speed of the ADMM method becomes very slow.

Method used


Examples


Embodiment 1

[0171] Embodiment 1: shallow network

[0172] For a three-layer (n = 3) neural network with 300 units in the hidden layer, using the LPOM algorithm we set the hyperparameters μ_i = 2^(i−n), K_1 = 600, K_2 = 100, m_1 = 1000, b = 100.

[0173] We directly compare the final recognition rates. When the LPOM algorithm is used to optimize the neural network, the final recognition rate is 95.6%, while the stochastic gradient descent method achieves 95.3% (the latter result is taken directly from the MNIST official website, http://yann.lecun.com/exdb/mnist/). It can be seen that the LPOM method obtains recognition results comparable to the stochastic gradient descent method on a shallow neural network.
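The hyperparameter schedule above can be written out explicitly. The following Python snippet is only an illustrative sketch: the function name lpom_hyperparams and the assumption that μ_i runs over i = 2, ..., n are ours, not from the patent; it merely collects the constants quoted in this embodiment.

# Illustrative helper collecting the hyperparameters quoted in Embodiment 1.
# The name and the index range assumed for mu_i are not from the patent.
def lpom_hyperparams(n_layers):
    return {
        "mu": {i: 2.0 ** (i - n_layers) for i in range(2, n_layers + 1)},  # mu_i = 2^(i-n)
        "K1": 600,
        "K2": 100,
        "m1": 1000,
        "b": 100,
    }

print(lpom_hyperparams(3))
# {'mu': {2: 0.5, 3: 1.0}, 'K1': 600, 'K2': 100, 'm1': 1000, 'b': 100}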

Embodiment 2

[0174] Embodiment 2: Deep Network

[0175] The method of the present invention is applied to a deep neural network. The network structure is set so that n−2 is the number of hidden layers, and n−2 is set to 18, 19, and 20. For the LPOM algorithm, the same hyperparameters are used: μ_i = 2^(i−n), K_1 = 600, K_2 = 100, m_1 = 1000, b = 100. For the stochastic gradient descent method, hyperparameters are searched as follows: 1) the step size is searched over 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1; and 2) the momentum is searched over 0, 0.2, 0.5, 0.9. For both the LPOM algorithm and the SGD (stochastic gradient descent) algorithm, the initialization method documented in reference [17] (Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks [C] // Artificial Intelligence and Statistics, 2010, 9: 249-256.) is used: parameters are drawn from the uniform distribution U(−√(6/(n_i+n_o)), +√(6/(n_i+n_o))), where n_i and n_o are the input and output d...
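The cited initialization from Glorot and Bengio (2010) is the standard "Xavier" uniform scheme. A minimal Python sketch follows, together with the SGD hyperparameter grid described above; the names glorot_uniform and sgd_grid are illustrative and the code is not taken from the patent.

import itertools
import numpy as np

def glorot_uniform(n_in, n_out, rng=None):
    # Draw a weight matrix from U(-sqrt(6/(n_i+n_o)), +sqrt(6/(n_i+n_o))).
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

# SGD hyperparameter grid searched in Embodiment 2: step sizes x momenta.
step_sizes = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1]
momenta = [0, 0.2, 0.5, 0.9]
sgd_grid = list(itertools.product(step_sizes, momenta))  # 28 configurations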



Abstract

The invention discloses a neural network optimization method, called the lifted proximal operator machine (LPOM), and relates to the technical field of deep learning. The method does not use first-order or second-order derivative information to optimize the neural network directly; instead, the neural network is converted into a new LPOM optimization problem, which is then solved by an alternating minimization method. By adopting the disclosed method, the layered structure of the neural network is eliminated in the solving process; the solution can be obtained through an alternating iteration process; the solving method can to some degree be converted into a stochastic algorithm, so that a small amount of computation is maintained in each iteration; and for neural networks of more than 20 layers, the training error can still be reduced stably.
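To illustrate what "alternating minimization" means here, the toy Python example below alternates between updating per-layer auxiliary activation variables with the weights fixed and updating the weights with the activations fixed. It uses a purely linear, quadratic-penalty objective, so it is not the patent's LPOM formulation, only a sketch of the block-coordinate structure.

import numpy as np

# Toy, fully linear illustration of alternating minimization over weights
# W1, W2 and auxiliary activations X2, X3 (NOT the patent's LPOM objective):
#   minimize ||X3 - Y||^2 + mu2*||X2 - W1 X1||^2 + mu3*||X3 - W2 X2||^2
# Each block update below is a closed-form least-squares solve.
def toy_alternating_minimization(X1, Y, mu2=0.5, mu3=1.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    d1, d2, d3 = X1.shape[0], 8, Y.shape[0]
    W1, W2 = rng.normal(size=(d2, d1)), rng.normal(size=(d3, d2))
    X2, X3 = W1 @ X1, Y.copy()
    for _ in range(iters):
        # Step 1: update activations with the weights fixed.
        X2 = np.linalg.solve(mu2 * np.eye(d2) + mu3 * W2.T @ W2,
                             mu2 * W1 @ X1 + mu3 * W2.T @ X3)
        X3 = (Y + mu3 * W2 @ X2) / (1.0 + mu3)
        # Step 2: update weights with the activations fixed.
        W1 = X2 @ np.linalg.pinv(X1)
        W2 = X3 @ np.linalg.pinv(X2)
    return W1, W2

# Example usage on random data:
X1 = np.random.default_rng(1).normal(size=(4, 100))
Y = np.random.default_rng(2).normal(size=(3, 100))
W1, W2 = toy_alternating_minimization(X1, Y)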

Description

technical field
[0001] The present invention relates to the field of deep learning technology, and in particular to a new neural network optimization method named the Lifted Proximal Operator Machine (LPOM). The method transforms the optimization problem of the neural network into a new optimization problem to be solved, and the Karush–Kuhn–Tucker (KKT) conditions of the transformed problem are equivalent to the forward process of the neural network.
Background technique
[0002] In recent years, deep neural networks have achieved great success in artificial intelligence, image recognition, and speech recognition. Compared with shallow neural networks, deep neural networks usually have more model parameters and larger capacity, and can achieve better results when the amount of data is large. However, solving the optimization problem of a neural network is a typical non-convex optimization problem. As the number of neural network layers increases, it becomes more and more difficult to train the neural network...
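One way to see how a KKT (first-order optimality) condition can reproduce the forward process, sketched here with the generic proximal-operator identity rather than the patent's exact objective: if a penalty h is chosen so that its proximal operator equals the activation function φ, then minimizing the lifted per-layer term over the activation x recovers the forward propagation rule:

\min_{x}\; h(x) + \tfrac{1}{2}\lVert x - W z \rVert^{2}
\;\Longrightarrow\;
0 \in \partial h(x) + (x - W z)
\;\Longrightarrow\;
x = \operatorname{prox}_{h}(W z) = \phi(W z),

which is exactly the forward pass x = φ(Wz) of a single layer with input z and weight matrix W.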


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08
CPC: G06N3/08
Inventor: 林宙辰, 方聪
Owner: PEKING UNIV