
Systems and methods for improved optimization of machine-learned models

A machine-learning model and learning-rate technology, applied in the field of machine learning, that addresses the problems of reduced generalization ability and sublinear speedup that arise when training with large batch sizes.

Pending Publication Date: 2020-02-18
GOOGLE LLC
Cites: 3 · Cited by: 0

AI Technical Summary

Problems solved by technology

In practice, however, reducing variance by increasing the batch size usually yields a speedup that is sublinear in the batch size, as well as reduced generalization ability.
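The variance claim above can be checked numerically: averaging a mini-batch of B per-example gradients reduces the variance of the gradient estimate roughly by a factor of B, which is the motivation for large batches even though the resulting wall-time speedup is typically sublinear. A minimal sketch on synthetic data (illustrative only, not from the patent):

```python
import numpy as np

# Simulated per-example "gradients": any distribution with variance sigma^2
# works; here sigma = 2, so sigma^2 = 4. Data are synthetic, for illustration.
rng = np.random.default_rng(0)
per_example_grads = rng.normal(loc=1.0, scale=2.0, size=100_000)

for B in (1, 16, 256):
    n = (len(per_example_grads) // B) * B
    # One gradient estimate per mini-batch of size B.
    est = per_example_grads[:n].reshape(-1, B).mean(axis=1)
    # est.var() scales like sigma^2 / B, so var * B stays near 4 for every B.
    print(B, round(est.var() * B, 2))
```

The 1/B variance reduction is exact for i.i.d. per-example gradients; the sublinearity of the *speedup* comes from systems overheads and from the larger learning rates that big batches demand, which this toy model does not capture.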




Detailed Description of Embodiments

[0046] 1 Overview

[0047] Generally, the present disclosure relates to systems and methods for improved optimization of machine-learned models. Specifically, the present disclosure provides stochastic optimization algorithms that are faster than widely used algorithms for a fixed amount of computation, and that also scale substantially better as more computational resources become available. The stochastic optimization algorithms can be used with large batch sizes. As an example, in some implementations, the systems and methods of the present disclosure can implicitly compute the inverse Hessian of each mini-batch of training data to produce descent directions. This can be done without an explicit approximation to the Hessian or Hessian-vector products. An example experiment is provided in which large ImageNet models (for example, Inception-V3, Resnet-50, Resnet-101, and Inception-Resnet-V2) are successfully trained using mini-batch sizes of up to 32,000, with no loss compared to the current...
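This excerpt does not spell out the disclosed update rule, so the following is only an illustrative sketch of how an inverse-Hessian descent direction can be computed implicitly from gradient calls alone, with no explicit Hessian or Hessian-vector product: a truncated Neumann series on a toy quadratic. The function name `neumann_direction` and all hyperparameters are assumptions for illustration, not the patented algorithm:

```python
import numpy as np

def neumann_direction(grad_fn, w, eta, num_terms):
    """Approximate H^{-1} @ grad(w) using only gradient evaluations.

    Uses the Neumann series H^{-1} = eta * sum_k (I - eta*H)^k (valid when
    0 < eta < 2/L) plus the fact that, on a locally quadratic loss, the
    Hessian action H @ d equals grad(w + d) - grad(w).
    """
    g = grad_fn(w)
    d = g.copy()      # running partial sum of the series (k = 0 term)
    term = g.copy()   # current series term (I - eta*H)^k @ g
    for _ in range(1, num_terms):
        h_term = grad_fn(w + term) - g   # implicit Hessian-times-term
        term = term - eta * h_term       # apply (I - eta*H)
        d = d + term
    return eta * d                       # ~ H^{-1} @ grad(w)

# Toy quadratic f(w) = 0.5 * w.T @ A @ w - b.T @ w, so the Hessian is A.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
grad = lambda w: A @ w - b

w = np.zeros(2)
d = neumann_direction(grad, w, eta=0.2, num_terms=200)
w_new = w - d   # a full Newton step lands at the quadratic's minimizer
print(np.allclose(w_new, np.linalg.solve(A, b), atol=1e-4))  # -> True
```

On a quadratic the gradient difference `grad_fn(w + term) - g` applies the Hessian exactly, so the loop never forms `H` or calls a Hessian-vector-product routine; on a general loss the same recursion holds only approximately, within a trust region around `w`.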



Abstract

Generally, the present disclosure is directed to systems and methods for improved optimization of machine-learned models. In particular, the present disclosure provides stochastic optimization algorithms that are both faster than widely used algorithms for fixed amounts of computation, and are also able to scale up substantially better as more computational resources become available. The stochastic optimization algorithms can be used with large batch sizes. As an example, in some implementations, the systems and methods of the present disclosure can implicitly compute the inverse hessian of each mini-batch of training data to produce descent directions.

Description

Technical field [0001] The present disclosure relates generally to machine learning. More particularly, the present disclosure relates to systems and methods for improved optimization of machine-learned models such as, for example, deep neural networks. Background [0002] Progress in machine learning (e.g., deep learning) is slowed by the days or weeks required to train large models. The natural solution of using more hardware is limited by diminishing returns and leads to inefficient use of the additional resources. [0003] The current state of training deep neural networks is that simple mini-batch optimizers (such as stochastic gradient descent (SGD) and momentum optimizers) and diagonal natural-gradient methods are most used in practice. As the availability of distributed computing increases, the total wall-time to train large models has become a substantial bottleneck, and methods that reduce wall-time without sacrificing model...
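For reference, the baseline mini-batch optimizers named above (SGD with momentum) reduce to a two-line update. A minimal sketch on a toy 1-D quadratic; the hyperparameters are illustrative, not from the patent:

```python
# Minimal sketch of the baseline referenced above: mini-batch SGD with
# (heavy-ball) momentum. Hyperparameter values are illustrative only.

def sgd_momentum_step(w, v, grad, lr=0.01, mu=0.9):
    """One momentum update: v <- mu*v - lr*grad; w <- w + v."""
    v = mu * v - lr * grad
    return w + v, v

# Toy 1-D quadratic loss f(w) = 0.5 * (w - 3)^2, so grad f(w) = w - 3.
w, v = 0.0, 0.0
for _ in range(500):
    w, v = sgd_momentum_step(w, v, grad=w - 3.0)
print(round(w, 3))  # -> 3.0, the minimizer
```

These first-order updates use only the (mini-batch) gradient and a velocity scalar per parameter, which is what makes them cheap but curvature-blind; the disclosure's contribution is aimed at recovering curvature information at comparable cost.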


Application Information

IPC(8): G06N3/08 G06N3/04
CPC: G06N3/084 G06N3/045 G06N20/00 G06F17/16 G06N3/047
Inventor R. Rifkin, Y. Xiao, S. Krishnan
Owner GOOGLE LLC