A high-performance computing method for the LDA text topic model based on CPU-GPU cooperative parallelism

A high-performance computing and topic-model technique, applicable to computing, unstructured text data retrieval, text database clustering/classification, and similar fields, that addresses the single-platform implementations and low computing efficiency of existing methods.

Active Publication Date: 2021-09-14
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0007] In view of this, the present invention provides a high-performance computing method for the LDA text topic model based on CPU-GPU cooperative parallelism, which solves, or at least partially solves, the technical problem that existing methods in the prior art are implemented on a single platform and have low computing efficiency.

Examples

Embodiment 1

[0064] This embodiment provides a high-performance computing method for the LDA text topic model based on CPU-GPU cooperative parallelism. Referring to Figure 1, the method includes:

[0065] Step S1: Using a dynamic programming algorithm, optimize the allocation of the two heterogeneous computing resources, CPU and GPU, to obtain an optimal resource allocation plan.

[0066] Specifically, in a CPU-GPU heterogeneous system, sound resource allocation is crucial to using the system's computing power efficiently. The present invention allocates resources with a dynamic programming algorithm. On the CPU side, computation threads and task-allocation threads are allocated according to the number of threads the CPU supports. On the GPU side, GPU hardware resource constraints, the algorithm's storage requirements, and general GPU programming optimization rules are used to transform the problem of optimally allocating GPU computing resources into a problem of...
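
As a rough illustration of the kind of dynamic programming that Step S1 refers to, the Python sketch below distributes a fixed budget of resource units among several consumers so that the total estimated benefit is maximal. The source text breaks off before naming the problem that the GPU allocation is reduced to, so this is a generic resource-allocation recurrence and not the patent's formulation. The function name `allocate_units`, the `gains` table, and the example numbers are all hypothetical.

```python
from typing import List, Tuple


def allocate_units(budget: int, gains: List[List[float]]) -> Tuple[float, List[int]]:
    """Split `budget` resource units among consumers to maximize total gain.

    gains[i][u] is the (assumed, e.g. profiled) benefit of giving u units to
    consumer i, for u = 0 .. len(gains[i]) - 1. Not every unit has to be used.
    """
    n = len(gains)
    # best[i][b]: best total gain using the first i consumers and at most b units.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    # choice[i][b]: units given to consumer i-1 in that best solution.
    choice = [[0] * (budget + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        g = gains[i - 1]
        for b in range(budget + 1):
            best_val, best_u = float("-inf"), 0
            # Try every feasible number of units for this consumer.
            for u in range(min(b, len(g) - 1) + 1):
                val = best[i - 1][b - u] + g[u]
                if val > best_val:
                    best_val, best_u = val, u
            best[i][b] = best_val
            choice[i][b] = best_u

    # Walk the choice table backwards to recover the allocation.
    units = [0] * n
    b = budget
    for i in range(n, 0, -1):
        units[i - 1] = choice[i][b]
        b -= units[i - 1]
    return best[n][budget], units


if __name__ == "__main__":
    # Hypothetical gain tables for three consumers and a budget of 3 units.
    example_gains = [[0, 5, 6], [0, 4, 9], [0, 3, 4]]
    print(allocate_units(3, example_gains))
```

In a real system the `gains` table would have to come from profiling or from a hardware model, neither of which the visible part of the source describes.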

Abstract

The invention discloses a high-performance computing method for the LDA text topic model based on CPU-GPU cooperative parallelism. First, the two heterogeneous computing resources, CPU and GPU, are optimally configured using a dynamic programming algorithm. Next, GPU performance is evaluated with a logarithmic function model, and the text data is divided at the optimal granularity. Then, CPU-GPU cooperative parallel computation of the latent Dirichlet allocation (LDA) model is implemented based on the exponential stochastic cellular automata (ESCA) algorithm. Finally, CPU-GPU heterogeneous scheduling is adapted using an improved greedy strategy to achieve load balancing. The invention achieves high-performance modeling of the text topic model and helps to quickly discover the topic information hidden in text, meeting the high-efficiency processing requirements of applications such as the classification of massive document collections and text data stream computing.
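
To make the exponential stochastic cellular automata (ESCA) step more concrete, here is a minimal single-threaded Python sketch of one ESCA-style iteration for LDA. It assumes the common formulation in which every token's topic is resampled from the previous iteration's counts and the counts are then rebuilt from scratch. Because no token reads a count that another token is updating, the per-token loop could be split across CPU threads and GPU blocks, although the sketch itself does no such partitioning. The names `count_stats` and `esca_iteration`, the hyperparameters `alpha` and `beta`, and the toy corpus are illustrative assumptions, not taken from the patent.

```python
import random
from typing import List, Tuple

Stats = Tuple[List[List[int]], List[List[int]], List[int]]


def count_stats(docs: List[List[int]], z: List[List[int]], K: int, V: int) -> Stats:
    """Build document-topic, topic-word, and per-topic totals from assignments z."""
    doc_topic = [[0] * K for _ in docs]
    topic_word = [[0] * V for _ in range(K)]
    topic_total = [0] * K
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    return doc_topic, topic_word, topic_total


def esca_iteration(docs: List[List[int]], stats: Stats, K: int, V: int,
                   alpha: float, beta: float,
                   rng: random.Random) -> Tuple[List[List[int]], Stats]:
    """Resample every token from the OLD counts, then rebuild the counts."""
    doc_topic, topic_word, topic_total = stats
    new_z = []
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            # Unnormalized topic probabilities; they only read the old counts,
            # so tokens are independent of each other within one iteration.
            weights = [
                (doc_topic[d][k] + alpha) * (topic_word[k][w] + beta)
                / (topic_total[k] + V * beta)
                for k in range(K)
            ]
            row.append(rng.choices(range(K), weights=weights)[0])
        new_z.append(row)
    return new_z, count_stats(docs, new_z, K, V)


if __name__ == "__main__":
    # Toy corpus: two documents over a 6-word vocabulary, 2 topics.
    docs = [[0, 1, 2, 1], [3, 4, 3, 5]]
    K, V = 2, 6
    rng = random.Random(0)
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    stats = count_stats(docs, z, K, V)
    for _ in range(20):
        z, stats = esca_iteration(docs, stats, K, V, alpha=0.1, beta=0.01, rng=rng)
    print(z)
```

Keeping the old and new counts in separate structures is what distinguishes this from a collapsed Gibbs sampler, which updates the counts after every token and therefore serializes the loop.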

Description

Technical field

[0001] The invention relates to the technical field of high-performance computing in heterogeneous environments, and in particular to a high-performance computing method for the LDA text topic model based on CPU-GPU cooperative parallelism.

Background

[0002] With the rapid development of the Internet, large volumes of network text rich in implicit information (such as Weibo posts, product reviews, and news reports) are constantly being produced and have become a widely valued kind of basic data. Text topic extraction is an important step in text data mining. The latent Dirichlet allocation (LDA) model is a classic topic model that has given rise to many variants, which are widely used in text topic extraction, document collection classification, and other computing scenarios. However, the standard LDA model requires a large number of iterative calculations, and its computational complexity is proportional to the amount of data. ...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/35, G06F9/48, G06F9/50
CPC: G06F9/4843, G06F9/5088, G06F16/35
Inventors: 李锐, 王鸿琰, 舒时立
Owner: WUHAN UNIV