
Systems and methods for improved optimization of machine-learned models

A machine-learning model and learning-rate technology, applied in the field of machine learning, that addresses the problems of reduced generalization ability and sublinear speedup that arise when training with large batch sizes.

Pending Publication Date: 2020-02-18
GOOGLE LLC
Cites: 3 · Cited by: 0

AI Technical Summary

Problems solved by technology

In practice, however, reducing variance by increasing the batch size usually yields a speedup that is sublinear in the batch size, as well as reduced generalization ability.
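The variance claim above can be checked numerically: averaging a mini-batch of B per-example gradients reduces the variance of the gradient estimate roughly by a factor of B, which is the motivation for large batches even though the resulting wall-time speedup is typically sublinear. A minimal sketch on synthetic data (illustrative only, not from the patent):

```python
import numpy as np

# Simulated per-example "gradients": any distribution with variance sigma^2
# works; here sigma = 2, so sigma^2 = 4. Data are synthetic, for illustration.
rng = np.random.default_rng(0)
per_example_grads = rng.normal(loc=1.0, scale=2.0, size=100_000)

for B in (1, 16, 256):
    n = (len(per_example_grads) // B) * B
    # One gradient estimate per mini-batch of size B.
    est = per_example_grads[:n].reshape(-1, B).mean(axis=1)
    # est.var() scales like sigma^2 / B, so var * B stays near 4 for every B.
    print(B, round(est.var() * B, 2))
```

The 1/B variance reduction is exact for i.i.d. per-example gradients; the sublinearity of the *speedup* comes from systems overheads and from the larger learning rates that big batches demand, which this toy model does not capture.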




Detailed Description of Embodiments

[0046] 1 Overview

[0047] Generally, the present disclosure relates to systems and methods for improved optimization of machine-learned models. Specifically, the present disclosure provides stochastic optimization algorithms that are faster than widely used algorithms for a fixed amount of computation, and that also scale substantially better as more computational resources become available. The stochastic optimization algorithms can be used with large batch sizes. As an example, in some implementations, the systems and methods of the present disclosure can implicitly compute the inverse Hessian of each mini-batch of training data to produce descent directions. This can be done without an explicit approximation to the Hessian or Hessian-vector products. An example experiment is provided in which large ImageNet models (for example, Inception-V3, Resnet-50, Resnet-101, and Inception-Resnet-V2) are successfully trained using mini-batch sizes of up to 32,000, with no loss compared to the current...
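This excerpt does not spell out the disclosed update rule, so the following is only an illustrative sketch of how an inverse-Hessian descent direction can be computed implicitly from gradient calls alone, with no explicit Hessian or Hessian-vector product: a truncated Neumann series on a toy quadratic. The function name `neumann_direction` and all hyperparameters are assumptions for illustration, not the patented algorithm:

```python
import numpy as np

def neumann_direction(grad_fn, w, eta, num_terms):
    """Approximate H^{-1} @ grad(w) using only gradient evaluations.

    Uses the Neumann series H^{-1} = eta * sum_k (I - eta*H)^k (valid when
    0 < eta < 2/L) plus the fact that, on a locally quadratic loss, the
    Hessian action H @ d equals grad(w + d) - grad(w).
    """
    g = grad_fn(w)
    d = g.copy()      # running partial sum of the series (k = 0 term)
    term = g.copy()   # current series term (I - eta*H)^k @ g
    for _ in range(1, num_terms):
        h_term = grad_fn(w + term) - g   # implicit Hessian-times-term
        term = term - eta * h_term       # apply (I - eta*H)
        d = d + term
    return eta * d                       # ~ H^{-1} @ grad(w)

# Toy quadratic f(w) = 0.5 * w.T @ A @ w - b.T @ w, so the Hessian is A.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
grad = lambda w: A @ w - b

w = np.zeros(2)
d = neumann_direction(grad, w, eta=0.2, num_terms=200)
w_new = w - d   # a full Newton step lands at the quadratic's minimizer
print(np.allclose(w_new, np.linalg.solve(A, b), atol=1e-4))  # -> True
```

On a quadratic the gradient difference `grad_fn(w + term) - g` applies the Hessian exactly, so the loop never forms `H` or calls a Hessian-vector-product routine; on a general loss the same recursion holds only approximately, within a trust region around `w`.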



Abstract

Generally, the present disclosure is directed to systems and methods for improved optimization of machine-learned models. In particular, the present disclosure provides stochastic optimization algorithms that are both faster than widely used algorithms for fixed amounts of computation, and are also able to scale up substantially better as more computational resources become available. The stochastic optimization algorithms can be used with large batch sizes. As an example, in some implementations, the systems and methods of the present disclosure can implicitly compute the inverse hessian of each mini-batch of training data to produce descent directions.

Description

Technical field [0001] The present disclosure relates generally to machine learning. More particularly, the present disclosure relates to systems and methods for improved optimization of machine-learned models such as, for example, deep neural networks. Background [0002] Progress in machine learning (e.g., deep learning) is slowed by the days or weeks required to train large models. The natural solution of using more hardware is limited by diminishing returns and leads to inefficient use of the additional resources. [0003] The current state of training deep neural networks is that simple mini-batch optimizers (such as stochastic gradient descent (SGD) and momentum optimizers) and diagonal natural-gradient methods are most used in practice. As the availability of distributed computing increases, the total wall-time to train large models has become a substantial bottleneck, and methods that reduce wall-time without sacrificing model...
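For reference, the baseline mini-batch optimizers named above (SGD with momentum) reduce to a two-line update. A minimal sketch on a toy 1-D quadratic; the hyperparameters are illustrative, not from the patent:

```python
# Minimal sketch of the baseline referenced above: mini-batch SGD with
# (heavy-ball) momentum. Hyperparameter values are illustrative only.

def sgd_momentum_step(w, v, grad, lr=0.01, mu=0.9):
    """One momentum update: v <- mu*v - lr*grad; w <- w + v."""
    v = mu * v - lr * grad
    return w + v, v

# Toy 1-D quadratic loss f(w) = 0.5 * (w - 3)^2, so grad f(w) = w - 3.
w, v = 0.0, 0.0
for _ in range(500):
    w, v = sgd_momentum_step(w, v, grad=w - 3.0)
print(round(w, 3))  # -> 3.0, the minimizer
```

These first-order updates use only the (mini-batch) gradient and a velocity scalar per parameter, which is what makes them cheap but curvature-blind; the disclosure's contribution is aimed at recovering curvature information at comparable cost.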


Application Information

IPC(8): G06N3/08 G06N3/04
CPC: G06N3/084 G06N3/045 G06N20/00 G06F17/16 G06N3/047
Inventor R. Rifkin, Y. Xiao, S. Krishnan
Owner GOOGLE LLC