Graph-based large-scale embedding model training method and system for click-through rate prediction

A model training technique for click-through rate prediction, applied in fields such as prediction, neural learning methods, and biological neural network models. It addresses the problems that existing click-through rate prediction techniques cannot be applied to training deep learning embedding models and that network communication overhead is expensive. It reduces communication overhead and achieves good locality, load balancing, and scalability.
CN114358859B · Active · Publication Date: 2022-07-01 · PEKING UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: PEKING UNIV
Publication Date: 2022-07-01


Abstract

The invention discloses a graph-based large-scale embedding model training method and system for click-through rate prediction. The system includes a dense parameter module and a client-side module, adopts a hybrid communication architecture, and distributes the click-through rate prediction input data set across different worker nodes. Each worker node maintains a client, and the local model parameters are stored directly in GPU memory; each worker node holds a copy of the model parameters and synchronizes them during training. The invention uses embedding model parameters to represent the importance of each categorical feature value in the click-through rate prediction input data, expresses the click-through rate prediction data and the embedding vectors as a bipartite graph, and exploits the locality and degree skew of that graph to train the model in parallel. It designs graph-based partitioning and bounded synchronization to improve the scalability and parallel computing efficiency of training large embedding models.
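The graph view described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the sample data, the function names (`build_bipartite_graph`, `embedding_degrees`, `greedy_partition`, `needs_sync`), the greedy placement heuristic, and the reading of bounded synchronization as a version-lag threshold are all assumptions made for illustration only.

```python
from collections import defaultdict

# Toy CTR samples (made-up data): each sample lists the categorical feature
# values it contains, i.e. the embedding vectors it reads and updates.
SAMPLES = {
    "s0": ["user_1", "item_a"],
    "s1": ["user_1", "item_b"],
    "s2": ["user_2", "item_a"],
    "s3": ["user_3", "item_c"],
}


def build_bipartite_graph(samples):
    """Map each embedding vertex to the set of sample vertices that use it."""
    emb_to_samples = defaultdict(set)
    for sample_id, emb_ids in samples.items():
        for emb_id in emb_ids:
            emb_to_samples[emb_id].add(sample_id)
    return dict(emb_to_samples)


def embedding_degrees(emb_to_samples):
    """Degree of an embedding vertex = number of samples touching it."""
    return {emb_id: len(ids) for emb_id, ids in emb_to_samples.items()}


def greedy_partition(samples, num_workers, capacity):
    """Hypothetical heuristic: place each sample on the worker that already
    holds most of its embeddings, never exceeding `capacity` samples per worker."""
    if num_workers * capacity < len(samples):
        raise ValueError("not enough total capacity for all samples")
    worker_samples = [[] for _ in range(num_workers)]
    worker_embs = [set() for _ in range(num_workers)]
    for sample_id, emb_ids in samples.items():
        open_workers = [w for w in range(num_workers)
                        if len(worker_samples[w]) < capacity]
        # Prefer the largest embedding overlap (locality), then the lighter load.
        best = max(open_workers,
                   key=lambda w: (len(worker_embs[w] & set(emb_ids)),
                                  -len(worker_samples[w])))
        worker_samples[best].append(sample_id)
        worker_embs[best].update(emb_ids)
    return worker_samples, worker_embs


def needs_sync(replica_version, primary_version, staleness_bound):
    """Assumed reading of bounded synchronization: a replica must refresh
    once it lags the primary copy by more than `staleness_bound` updates."""
    return primary_version - replica_version > staleness_bound


graph = build_bipartite_graph(SAMPLES)
degrees = embedding_degrees(graph)
worker_samples, worker_embs = greedy_partition(SAMPLES, num_workers=2, capacity=2)
print(degrees)
print(worker_samples)
```

In this toy run the two samples that share user_1 land on the same worker, while item_a is referenced on both workers and would therefore be the kind of replicated embedding to which a check like `needs_sync` applies.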

Description

Technical field

[0001] The invention belongs to the technical field of distributed machine learning and relates to a large-scale embedding model training method and system, in particular a graph-based large-scale embedding model training method and system for click-through rate prediction.

Background technique

[0002] Embeddings are often used for representation learning on high-dimensional data, such as words in text corpora or users and items in recommender systems. Deep embedding techniques represent discrete variables as continuous vectors and have numerous practical applications, such as click-through rate (CTR) prediction systems, graph processing, and information extraction. However, as deep embedding models continue to grow and the volume of input data increases, building a training system for huge embedding models becomes more challenging in terms of both effectiveness and efficiency. For example, Facebook's production platform proposes a true de...

Claims
