Graph-based large-scale embedding model training method and system for click-through rate prediction

A technology for model training and click-through rate prediction, applied in prediction, neural learning methods, biological neural network models, etc. It addresses the problems that existing click-through rate prediction techniques cannot be applied to deep learning models and that embedding model training incurs expensive network communication overhead. It achieves reduced communication overhead, good locality and load balancing, and good scalability.

Active Publication Date: 2022-07-01
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0009] In summary, existing click-through rate prediction techniques cannot be applied to deep learning models and incur expensive network communication overhead in large-scale distributed training; existing graph processing algorithms are not suited to training the embedding models used for click-through rate prediction; and existing consensus protocols and training systems do not take the update dependencies between embeddings into account, which leads to high overhead and low efficiency.

Examples

Embodiment Construction

[0038] The present invention is further described below through embodiments, with reference to the accompanying drawings; the embodiments do not limit the scope of the invention in any way.

[0039] The present invention provides a graph-based large-scale embedding model training method and system for click-through rate prediction. It designs a new graph-based system approach and proposes a new bipartite graph representation to manage the input data and the embedding parameters, which improves scalability when training large embedding models.
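As a minimal sketch of the graph representation described above, read here as a bipartite graph with data samples on one side and embedding vectors on the other: an edge links a sample to every embedding it looks up. The function names and the toy feature IDs are hypothetical and are not taken from the patent; the degree count only illustrates what degree skew means for embedding vertices.

```python
from collections import defaultdict


def build_bipartite_graph(samples):
    """Build sample->embedding and embedding->sample adjacency lists.

    samples: one list of categorical feature IDs per training sample
    (hypothetical input layout, assumed for illustration).
    """
    sample_to_emb = {}
    emb_to_samples = defaultdict(list)
    for sample_id, feature_ids in enumerate(samples):
        unique_ids = sorted(set(feature_ids))  # one edge per distinct embedding
        sample_to_emb[sample_id] = unique_ids
        for emb_id in unique_ids:
            emb_to_samples[emb_id].append(sample_id)
    return sample_to_emb, dict(emb_to_samples)


def embedding_degrees(emb_to_samples):
    """Degree of each embedding vertex = number of samples that access it."""
    return {emb_id: len(sids) for emb_id, sids in emb_to_samples.items()}


# Toy data: embedding 0 is "hot" and is touched by most samples.
samples = [[0, 3], [0, 4], [0, 3, 5], [1, 5]]
sample_to_emb, emb_to_samples = build_bipartite_graph(samples)
degrees = embedding_degrees(emb_to_samples)
```

In this toy input, a few embeddings attract most of the edges, which is the kind of skew a partitioner can exploit when deciding where to place or replicate embeddings.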

[0040] The newly constructed bipartite graph must be partitioned so as to reduce embedding and gradient communication between worker nodes while keeping the workload balanced. To this end, the present invention designs a hybrid graph partitioning mechanism based on the embedding model; and the vertex partitioning method ...
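The two goals named above, less cross-worker communication and a balanced workload, can be made concrete with a small scoring function. This is only a sketch of how a given partition might be evaluated, not the patent's hybrid partitioning mechanism, whose details are not shown in this excerpt. The name `partition_cost`, the assignment dictionaries, and the "max load over mean load" imbalance measure are all assumptions made for illustration.

```python
def partition_cost(sample_to_emb, sample_part, emb_part, num_workers):
    """Score a partition of samples and embeddings across workers.

    Returns (remote_accesses, imbalance):
      remote_accesses: sample-embedding edges whose endpoints sit on
                       different workers (a proxy for communication).
      imbalance:       max worker sample count divided by the mean
                       (1.0 means perfectly balanced).
    """
    remote_accesses = 0
    load = [0] * num_workers
    for sample_id, emb_ids in sample_to_emb.items():
        worker = sample_part[sample_id]
        load[worker] += 1
        remote_accesses += sum(1 for e in emb_ids if emb_part[e] != worker)
    mean_load = sum(load) / num_workers
    imbalance = max(load) / mean_load if mean_load else 0.0
    return remote_accesses, imbalance


# Toy graph: samples 0-1 use embeddings {0, 3, 4}; samples 2-3 use {1, 5}.
sample_to_emb = {0: [0, 3], 1: [0, 4], 2: [1, 5], 3: [1, 5]}
sample_part = {0: 0, 1: 0, 2: 1, 3: 1}

co_located = {0: 0, 3: 0, 4: 0, 1: 1, 5: 1}  # embeddings next to their samples
swapped = {0: 1, 3: 1, 4: 1, 1: 0, 5: 0}     # embeddings on the other worker

good_cost = partition_cost(sample_to_emb, sample_part, co_located, 2)
bad_cost = partition_cost(sample_to_emb, sample_part, swapped, 2)
```

Both placements are equally balanced, but only the co-located one avoids remote embedding accesses, which is why locality and balance have to be scored together.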

Abstract

The invention discloses a graph-based large-scale embedding model training method and system for click-through rate prediction. The system includes a dense parameter module and a client module, adopts a hybrid communication architecture, and distributes the click-through rate prediction input data set across the worker nodes. Each worker node maintains a client, and the local model parameters are stored directly in GPU memory; each worker node holds a copy of the model parameters and synchronizes it during training. The invention uses embedding model parameters to represent the importance of the categorical feature values in the click-through rate prediction input data, represents the click-through rate prediction data and the embedding vectors as a bipartite graph model, and exploits graph locality and degree skew to train the model in parallel. Graph-based partitioning and bounded synchronization are designed to improve scalability and parallel computing efficiency when training large embedding models.
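The abstract does not spell out how bounded synchronization works. One common reading, assumed here purely for illustration, is that a worker may keep using its local copy of an embedding as long as that copy lags the latest version by no more than a fixed bound. The class `LocalCopy`, its methods, and the version counters below are hypothetical and do not come from the patent.

```python
class LocalCopy:
    """Per-worker store of embedding copies with a staleness bound (assumed design)."""

    def __init__(self, bound):
        self.bound = bound  # maximum tolerated version lag
        self.entries = {}   # emb_id -> (value, version)

    def put(self, emb_id, value, version):
        """Record a freshly synchronized embedding and its version."""
        self.entries[emb_id] = (value, version)

    def needs_sync(self, emb_id, global_version):
        """True if the embedding is missing locally or lags by more than the bound."""
        if emb_id not in self.entries:
            return True
        _, local_version = self.entries[emb_id]
        return global_version - local_version > self.bound


copy = LocalCopy(bound=2)
copy.put(7, [0.1, 0.2], version=5)

fresh_enough = not copy.needs_sync(7, global_version=7)  # lag 2, within bound
too_stale = copy.needs_sync(7, global_version=8)         # lag 3, exceeds bound
missing = copy.needs_sync(9, global_version=0)           # never fetched
```

Under this reading, a larger bound trades fresher parameters for fewer synchronization messages.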

Description

Technical field

[0001] The invention belongs to the technical field of distributed machine learning and relates to a large-scale embedding model training method and system, in particular to a graph-based large-scale embedding model training method and system for click-through rate prediction.

Background

[0002] Embeddings are often used for representation learning on high-dimensional data, such as words in text corpora and users and items in recommender systems. Deep embedding techniques use continuous vectors to represent discrete variables and have numerous practical applications, such as click-through rate (CTR) prediction systems, graph processing, and information extraction. However, as deep embedding models continue to grow and the amount of input data increases, building a training system for huge embedding models becomes more challenging in terms of both effectiveness and efficiency. For example, Facebook's production platform proposes a true de...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06Q30/02, G06Q10/04, G06N3/04, G06N3/08
Inventors: 崔斌, 苗旭鹏, 梁宇轩, 石屹宁, 张海林
Owner: PEKING UNIV