
Data update method and device

A data update technology, applied in the computer field, that addresses the problem of the first server having to process a large volume of data.

Active Publication Date: 2021-08-06
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present invention provide a data update method and device that at least solve the technical problem in the prior art that the first server must process a relatively large amount of data during the data update process.



Examples


Embodiment 1

[0023] The data update method provided in this embodiment may be applied to, but is not limited to, topic model training scenarios. Most existing topic model training methods use a distributed architecture based on parameter servers. As shown in Figure 1, the existing training method stores the word-topic matrix on the servers and the document-topic matrix on the workers. During training, each worker pulls the word-topic matrix from multiple servers to local storage before each iteration and then runs the Gibbs sampling algorithm. Because Gibbs sampling changes the topic assignment of words, both the word-topic matrix and the document-topic matrix must be updated, so the worker pushes its word-topic updates to the servers. An iteration ends when the worker has sampled all the words on its machine.
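The existing pull, sample, push cycle described in [0023] can be illustrated with a minimal single-worker sketch. It assumes collapsed Gibbs sampling for an LDA-style topic model; the text only names "the Gibbs sampling algorithm", so the sampling formula, the hyperparameters alpha and beta, and the function names init_state, gibbs_iteration and push are all illustrative assumptions rather than the patent's method. The server is simulated by an in-memory list.

```python
import random


def init_state(docs, vocab_size, num_topics, rng):
    """Assign a random topic to every token and build the two count matrices."""
    assignments = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    word_topic = [[0] * num_topics for _ in range(vocab_size)]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = assignments[d][i]
            doc_topic[d][k] += 1
            word_topic[w][k] += 1
    return assignments, doc_topic, word_topic


def gibbs_iteration(docs, assignments, doc_topic, server_word_topic,
                    num_topics, alpha=0.1, beta=0.01, rng=None):
    """One worker iteration: pull, sample every local token, return the delta."""
    rng = rng or random.Random(0)
    vocab_size = len(server_word_topic)
    # Pull: copy the whole word-topic matrix from the (simulated) server.
    word_topic = [row[:] for row in server_word_topic]
    topic_totals = [sum(row[k] for row in word_topic) for k in range(num_topics)]
    delta = [[0] * num_topics for _ in range(vocab_size)]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            old = assignments[d][i]
            # Remove the token's current assignment from all counts.
            doc_topic[d][old] -= 1
            word_topic[w][old] -= 1
            topic_totals[old] -= 1
            # Assumed collapsed-Gibbs conditional for LDA (unnormalized).
            weights = [
                (doc_topic[d][k] + alpha) * (word_topic[w][k] + beta)
                / (topic_totals[k] + vocab_size * beta)
                for k in range(num_topics)
            ]
            new = rng.choices(range(num_topics), weights=weights)[0]
            assignments[d][i] = new
            doc_topic[d][new] += 1
            word_topic[w][new] += 1
            topic_totals[new] += 1
            # Record the change so it can be pushed to the server later.
            delta[w][old] -= 1
            delta[w][new] += 1
    return delta


def push(server_word_topic, delta):
    """Push: apply the worker's word-topic delta on the server side."""
    for w, row in enumerate(delta):
        for k, change in enumerate(row):
            server_word_topic[w][k] += change
```

In this sketch the worker copies every row of the word-topic matrix before sampling, which reflects the full pull that the existing method performs before each iteration.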

[0024] Because in each round of iter...

Embodiment 2

[0107] According to an embodiment of the present invention, a data update device for implementing the above data update method is also provided. The device is applied to the first server and, as shown in Figure 6, includes:

[0108] 1) The first processing module 62, configured to push training instruction information to the second server and to pull, from the second server, the second partial matrix corresponding to the first word set in the second matrix. The training instruction information carries a second word set and the first partial matrix corresponding to the second word set in the first matrix, and instructs the second server to update the first partial matrix and the second matrix. The first server stores the first matrix, and the second server stores the second matrix;

[0109] 2) The first update module 64, configured to update the second partial matrix and the first matrix according to the second...
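The payload handled by the first processing module can be sketched as a small data structure. This is only an illustration under stated assumptions: matrices are modeled as dictionaries mapping a word id to a row of counts, and the names TrainingInstruction, build_training_instruction and pull_partial are hypothetical; the text does not specify how the matrices are indexed or serialized.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

Row = List[int]


@dataclass
class TrainingInstruction:
    """Hypothetical message pushed from the first server to the second server."""
    second_word_set: Set[int]
    first_partial_matrix: Dict[int, Row]


def build_training_instruction(first_matrix: Dict[int, Row],
                               second_word_set: Set[int]) -> TrainingInstruction:
    """Select the rows of the first matrix that correspond to the second word set."""
    # Rows are copied so the receiver cannot mutate the sender's matrix.
    partial = {w: first_matrix[w][:] for w in second_word_set if w in first_matrix}
    return TrainingInstruction(set(second_word_set), partial)


def pull_partial(second_matrix: Dict[int, Row], first_word_set: Set[int]) -> Dict[int, Row]:
    """Select the rows of the second matrix that correspond to the first word set."""
    return {w: second_matrix[w][:] for w in first_word_set if w in second_matrix}
```

Under these assumptions only the rows named by the two word sets travel between the servers, rather than either matrix in full.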

Embodiment 3

[0188] The application environment of this embodiment may be, but is not limited to, the application environment described in Embodiment 1, which is not repeated here. This embodiment provides an optional application example for implementing the above data update method.

[0189] As an optional embodiment, the above data update method may be applied to, but is not limited to, the scenario shown in Figure 8, in which the matrix stored in the server is updated.

[0190] The data update scheme in this embodiment may be applied to, but is not limited to, machine learning tasks such as advertisement recommendation, text clustering, and user behavior analysis. The topic model is a machine learning algorithm widely used in text analysis. This scheme provides a solution for efficiently training topic models in a modern environment. Users do not need to be concerned with the details of algorithm execution when using it, and...


Abstract

The invention discloses a data update method and device. The method includes: the first server pushes training instruction information to the second server and pulls, from the second server, the second partial matrix corresponding to the first word set in the second matrix; the first server updates the second partial matrix and the first matrix based on inputs that include the first matrix and the first word set; the first server sends the updated second partial matrix to the second server and receives the updated first partial matrix sent by the second server; and the first server updates the updated first matrix according to the updated first partial matrix. This solution addresses the technical problem in the prior art that the first server must process a large amount of data during data updating.
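The order of the exchange in the abstract can be traced with a toy simulation. Everything specific here is an assumption made for illustration: matrices are dictionaries from word id to a row of counts, the two word sets are disjoint, and hypothetical_update (which adds 1 to the first entry of each row) is a stand-in for the real update rule, which the abstract does not give. The class and function names are likewise hypothetical.

```python
def hypothetical_update(rows):
    """Stand-in update rule (an assumption): add 1 to the first entry of each row."""
    return {w: [row[0] + 1] + row[1:] for w, row in rows.items()}


class SecondServer:
    """Toy second server that stores the second matrix."""

    def __init__(self, second_matrix):
        self.second_matrix = second_matrix
        self.updated_first_partial = {}

    def receive_instruction(self, second_word_set, first_partial):
        # Update the received first partial matrix and the matching local rows.
        self.updated_first_partial = hypothetical_update(first_partial)
        local_rows = {w: self.second_matrix[w] for w in second_word_set}
        self.second_matrix.update(hypothetical_update(local_rows))

    def pull(self, word_set):
        return {w: self.second_matrix[w][:] for w in word_set}

    def receive_partial(self, updated_second_partial):
        self.second_matrix.update(updated_second_partial)

    def send_first_partial(self):
        return self.updated_first_partial


def run_round(first_matrix, server, first_word_set, second_word_set):
    """One round as seen from the first server, which stores first_matrix."""
    # Step 1: push the training instruction, then pull the second partial matrix.
    first_partial = {w: first_matrix[w][:] for w in second_word_set}
    server.receive_instruction(second_word_set, first_partial)
    second_partial = server.pull(first_word_set)
    # Step 2: update the second partial matrix and the first matrix locally.
    second_partial = hypothetical_update(second_partial)
    first_matrix.update(
        hypothetical_update({w: first_matrix[w] for w in first_word_set}))
    # Step 3: send the updated second partial matrix, receive the first partial.
    server.receive_partial(second_partial)
    updated_first_partial = server.send_first_partial()
    # Step 4: merge the rows that the second server updated.
    first_matrix.update(updated_first_partial)
    return first_matrix
```

In this sketch the first server only touches rows for the first word set, while rows for the second word set are updated remotely and merged back in the last step.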

Description

Technical field

[0001] The present invention relates to the field of computers, and in particular to a data update method and device.

Background

[0002] The topic model is a method for modeling text. In machine learning, the topic model is a generative model, meaning that it can randomly generate observable data; a topic model can therefore randomly generate an article composed of N topics. Modeling text with a topic model makes it possible to classify the topic of a text, judge the similarity between texts, and so on.

[0003] Most existing topic model training methods use a distributed architecture based on parameter servers. In this architecture, some servers store parameters and the other servers train the topic model. The server that performs topic model training needs to obtain all the content stored on the server that stores the parameter...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N20/00, G06K9/62, G06F40/205, G06F40/216
CPC: G06F40/216, G06F40/205, G06F18/214
Inventor: 余乐乐, 肖品, 崔斌
Owner: TENCENT TECH (SHENZHEN) CO LTD