Parallel topic mining method and device

A parallel topic mining method and device, applied in the computer field, that addresses the large communication volume, long time delay, and low efficiency of existing parallel topic mining.

Active Publication Date: 2015-08-26
HONOR DEVICE CO LTD

AI Technical Summary

Problems solved by technology

[0004] In large-scale topic mining, the communication module in LDA must upload and download every element of the word-topic matrix in each cycle. The resulting communication volume is large, which causes long time delays and makes parallel topic mining inefficient.
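To make the scale of the problem concrete, the following is a rough back-of-the-envelope sketch, not a figure from the patent. It assumes each child node uploads and downloads the full word-topic matrix exactly once per cycle; the function name and all sizes are hypothetical.

```python
def full_sync_elements(vocab_size, num_topics, num_nodes, num_cycles):
    """Count matrix elements moved when every node uploads and downloads
    the whole word-topic matrix once per cycle (an assumed cost model)."""
    per_transfer = vocab_size * num_topics       # one full word-topic matrix
    per_cycle = 2 * per_transfer * num_nodes     # upload + download, per node
    return per_cycle * num_cycles


# Hypothetical sizes: 100,000 words, 1,000 topics, 10 nodes, 100 cycles.
total = full_sync_elements(100_000, 1_000, 10, 100)
print(total)
```

Under these assumed sizes the count grows with the product of vocabulary size and topic count, which is why exchanging the full matrix in every cycle dominates the running time.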

Examples

Embodiment Construction

[0057] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0058] Figure 1 is a flow chart of Embodiment 1 of the parallel topic mining method of the present invention. As shown in Figure 1, the parallel topic mining method based on dynamic selective communication provided in this embodiment can be executed by the first node in the parallel topic mining device, and the method provid...

Abstract

An embodiment of the invention provides a parallel topic mining method and device. In the method, a first node of the parallel topic mining device receives a second word-topic submatrix and a second remainder submatrix sent by a second node. The second remainder submatrix comprises the row of the remainder matrix with the largest row accumulated value and the column of the remainder matrix with the largest column accumulated value. The second word-topic submatrix comprises the row of the word-topic matrix with the same row number as that largest-accumulated-value row, and the column of the word-topic matrix with the same column number as that largest-accumulated-value column. The first node updates a first word-topic submatrix according to the second word-topic submatrix, updates a first remainder submatrix according to the second remainder submatrix, and sends the updated first word-topic submatrix and the updated first remainder submatrix to the second node. This reduces the communication volume in the topic mining process and increases the topic mining speed.
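The selection step in the abstract can be illustrated with a minimal sketch. It assumes the "accumulated value" of a row or column is the plain sum of its entries, which the abstract does not specify; the function name, dictionary keys, and sample matrices are all hypothetical, and the update rule applied by the first node is not shown because the abstract does not define it.

```python
def select_submatrices(word_topic, remainder):
    """Pick the row and the column of `remainder` with the largest
    accumulated value (assumed here to be a plain sum) and return the
    matching row and column of both matrices."""
    row_sums = [sum(row) for row in remainder]
    col_sums = [sum(col) for col in zip(*remainder)]
    r = max(range(len(row_sums)), key=row_sums.__getitem__)
    c = max(range(len(col_sums)), key=col_sums.__getitem__)
    return {
        "row": r,
        "col": c,
        "word_topic_row": list(word_topic[r]),
        "word_topic_col": [row[c] for row in word_topic],
        "remainder_row": list(remainder[r]),
        "remainder_col": [row[c] for row in remainder],
    }


# Hypothetical 3-word x 3-topic matrices.
word_topic = [[5, 1, 0], [2, 7, 3], [0, 4, 9]]
remainder = [[1, 2, 0], [9, 3, 4], [0, 6, 1]]

picked = select_submatrices(word_topic, remainder)
print(picked["row"], picked["col"])
```

With W words and K topics, a node in this sketch sends one row and one column of each matrix rather than all W x K elements of the word-topic matrix, which is the source of the reduced communication volume the abstract describes.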

Description

Technical field

[0001] The invention relates to computer technology, in particular to a parallel topic mining method and device.

Background

[0002] Mining semantically related word clusters from massive document or image collections is called topic mining. Usually, a bag-of-words matrix is used to represent the document collection and is input into a Latent Dirichlet Allocation (LDA) system, which estimates and outputs the document-topic matrix and the word-topic matrix through automatic inference methods.

[0003] Existing parallel topic mining methods use large-scale multi-processor clusters, which include a parent computing node and multiple child computing nodes. The child computing nodes and the parent computing node are usually connected and communicate through a network. When performing topic mining, LDA first divides the input bag-of-words matrix evenly according to the document index, and inputs them to each...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 曾嘉, 倪冰, 陈嘉
Owner: HONOR DEVICE CO LTD