
Systems and methods for robust large-scale machine learning

A machine learning and computing technology, applied to machine learning, neural learning methods, and models based on specific mathematical models, that addresses the problems of limited I/O bandwidth scaling and the high cost of scaling up relative to low-cost commodity servers.

Active Publication Date: 2018-08-31
GOOGLE LLC

AI Technical Summary

Problems solved by technology

While this is an attractive approach for some problems, it has significant scaling limitations in terms of I/O bandwidth. It is also generally more expensive than scaling out using low-cost commodity servers. GPUs are not always a cost-effective solution, especially for sparse datasets.




Detailed Description of Embodiments

[0022] Overview of the Disclosure

[0023] In general, the present disclosure provides systems and methods for robust large-scale machine learning. In particular, it provides a new scalable coordinate descent (SCD) algorithm for generalized linear models that overcomes the scaling problems outlined in the Background section. The SCD algorithm described herein is highly robust: it has the same convergence behavior no matter how far it is scaled out and regardless of the computing environment. This allows SCD to scale to tens of thousands of cores and makes it well suited to distributed computing environments such as cloud environments built on low-cost commodity servers.
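As background for the overview above, the basic coordinate-descent update for a generalized linear model can be sketched as follows. This is an illustrative single-machine sketch for ridge-regularized least squares, not the patented SCD algorithm; the function name and default parameters are invented for this example.

```python
import numpy as np

def coordinate_descent(X, y, lam=0.1, n_epochs=50):
    """Cyclic coordinate descent for ridge-regularized least squares.

    Minimizes 0.5*||y - Xw||^2 + 0.5*lam*||w||^2 by updating one
    coordinate at a time with the closed-form solution, keeping all
    other coordinates fixed.
    """
    n, d = X.shape
    w = np.zeros(d)
    residual = y - X @ w          # maintained incrementally
    col_sq = (X ** 2).sum(axis=0)  # per-column squared norms
    for _ in range(n_epochs):
        for j in range(d):
            # Closed-form update for coordinate j with the rest fixed.
            rho = X[:, j] @ residual + col_sq[j] * w[j]
            w_new = rho / (col_sq[j] + lam)
            residual += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w
```

Because the residual is updated incrementally, each coordinate step costs only one pass over that feature's column, which is what makes CD attractive for sparse data.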

[0024] In particular, by using natural partitioning of parameters into blocks, updates can be performed in parallel one block at a time without compromising convergence. In fact, for many real-world problems, SCD has the same convergence behavior as the popu...



Abstract

The present disclosure provides a new scalable coordinate descent (SCD) algorithm and associated system for generalized linear models whose convergence behavior is always the same, regardless of how much SCD is scaled out and regardless of the computing environment. This makes SCD highly robust and enables it to scale to massive datasets on low-cost commodity servers. According to one aspect, by using a natural partitioning of parameters into blocks, updates can be performed in parallel a block at a time without compromising convergence. Experimental results on a real advertising dataset are used to demonstrate SCD's cost effectiveness and scalability.
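To illustrate the block-parallel idea in the abstract, the following is a hedged sketch, not the disclosed implementation. It assumes the block's columns have disjoint support, e.g. the one-hot columns of a single categorical feature; under that assumption the closed-form coordinate updates within the block do not interfere and can be applied simultaneously. The function name and signature are invented for this example.

```python
import numpy as np

def scd_block_update(X, residual, w, block, lam=0.1):
    """Update every coordinate in one block in a single vectorized step.

    Valid as an exact coordinate-descent step when the columns in
    `block` have disjoint support (at most one is nonzero per row),
    so the updates could equally be farmed out to parallel workers.
    """
    Xb = X[:, block]
    col_sq = (Xb ** 2).sum(axis=0)
    rho = Xb.T @ residual + col_sq * w[block]
    w_new = rho / (col_sq + lam)        # lam > 0 keeps this well-defined
    residual -= Xb @ (w_new - w[block])  # keep residual = y - X @ w
    w[block] = w_new
    return w, residual
```

Cycling over blocks, one block at a time, then gives the block-wise schedule described above: parallelism inside a block, sequential progress across blocks.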

Description

Technical Field

[0001] The present disclosure relates generally to novel, highly scalable and robust machine learning systems and techniques, and more particularly to systems and methods for robust large-scale machine learning in a distributed computing environment.

Background

[0002] Although distributed machine learning (ML) algorithms have been extensively studied, scaling to large numbers of machines can remain challenging. The fastest-converging single-machine algorithms update model parameters at a very high rate, which makes them difficult to distribute without trade-offs. As an example, the single-machine stochastic gradient descent (SGD) technique updates model parameters after processing each training example. As another example, coordinate descent (CD) techniques update model parameters after processing individual features.

[0003] The general approach to distributed SGD or CD is to let updates happen with some delay or to break the elementary...
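The differing update granularities described in paragraph [0002] can be sketched for squared loss as follows. This is illustrative only; the function names, learning rate, and regularizer are invented for this example.

```python
import numpy as np

def sgd_epoch(X, y, w, lr=0.01):
    """SGD: one parameter-vector update per training *example*.

    n updates per epoch, each touching the full parameter vector.
    """
    for i in range(len(y)):
        w -= lr * (X[i] @ w - y[i]) * X[i]  # squared-loss gradient step
    return w

def cd_epoch(X, y, w, lam=0.0):
    """CD: one parameter update per *feature*.

    d updates per epoch, each an exact closed-form step on one
    coordinate while the residual is maintained incrementally.
    """
    residual = y - X @ w
    for j in range(X.shape[1]):
        xj = X[:, j]
        rho = xj @ residual + (xj @ xj) * w[j]
        w_new = rho / (xj @ xj + lam)
        residual += xj * (w[j] - w_new)
        w[j] = w_new
    return w
```

The very high update rate of both schemes (per example, per feature) is exactly what makes naive distribution costly: every update would have to be communicated.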

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08, G06N20/00
CPC: G06F9/46, G06N20/00, G06F9/52, G06F3/0644, G06F3/0655, G06F3/067, G06N7/00
Inventors: S. Rendle, D. C. Fetterly, E. Shekita, B.-Y. Su
Owner: GOOGLE LLC