
A Parallel Similarity Join Method Based on CPU-GPU Heterogeneous Architecture

A parallel similarity join method for CPU-GPU heterogeneous architectures, applied in structured data retrieval and database management systems. It addresses the limited processing efficiency of serial approaches and aims for good versatility and improved execution efficiency.

Active Publication Date: 2022-06-17
NORTHEASTERN UNIV LIAONING
Cites: 3. Cited by: 0.

AI Technical Summary

Problems solved by technology

Existing algorithms improve the efficiency of the filtering stage to some extent, but they are all designed around serial processing, which greatly limits their processing efficiency.


Examples


Embodiment Construction

[0038] The specific embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments. The following examples are intended to illustrate the present invention, but not to limit the scope of the present invention.

[0039] As shown in Figure 1, the method of this embodiment is as follows.

[0040] A parallel similarity join method based on a CPU-GPU heterogeneous architecture, comprising the following steps:

[0041] Step 1: Use the GPU to build the new SoA inverted index in parallel over the initial dataset S. Figure 2 shows a schematic diagram of the constructed SoA-based inverted index;

[0042] Step 1.1: The dataset S used in this embodiment is shown in Table 1. It contains 9 user records, each stored as a string. Each record Si is split on the spaces in its string into several tokens, so that each record is composed of a different set of tokens;

[0043] Table 1 Example d...
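
Steps 1 and 1.1 can be illustrated with a minimal sequential sketch. It is not the patented GPU implementation: it assumes that "SoA" means a structure-of-arrays layout in which all posting lists sit in one flat array addressed through an offsets array, and the function names (`tokenize`, `build_soa_index`, `postings`) are hypothetical. The sort that groups equal tokens is the step a GPU version would presumably run in parallel.

```python
import bisect


def tokenize(record):
    """Split a record on whitespace and return its distinct tokens, sorted."""
    return sorted(set(record.split()))


def build_soa_index(records):
    """Build an inverted index stored as three flat arrays (assumed SoA layout).

    tokens[i]                     -> the i-th distinct token, in sorted order
    offsets[i] : offsets[i + 1]   -> slice of record_ids holding its postings
    record_ids                    -> all posting lists, concatenated
    """
    # Emit (token, record id) pairs, then sort so equal tokens are adjacent.
    pairs = sorted(
        (tok, rid) for rid, rec in enumerate(records) for tok in tokenize(rec)
    )
    tokens, offsets, record_ids = [], [], []
    for tok, rid in pairs:
        if not tokens or tokens[-1] != tok:
            tokens.append(tok)               # first time this token is seen
            offsets.append(len(record_ids))  # its postings start here
        record_ids.append(rid)
    offsets.append(len(record_ids))          # sentinel closing the last list
    return tokens, offsets, record_ids


def postings(index, token):
    """Return the ids of the records that contain the given token."""
    tokens, offsets, record_ids = index
    i = bisect.bisect_left(tokens, token)
    if i == len(tokens) or tokens[i] != token:
        return []
    return record_ids[offsets[i]:offsets[i + 1]]
```

With made-up records such as `["alice smith boston", "bob smith denver", "alice jones boston"]` (not the contents of Table 1), `postings(index, "smith")` returns the ids of the first two records. Flat arrays of this kind are commonly preferred on GPUs because neighbouring threads read neighbouring memory locations.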


Abstract

The invention discloses a parallel similarity join method based on a CPU-GPU heterogeneous architecture, belonging to the technical fields of computer database technology and parallel computing. The method analyzes the data similarity join, designs a new inverted index structure, and builds that index in parallel on the GPU. It then decomposes the similarity join and redesigns each part to suit the different computing characteristics of the two processors. During computation, double prefix filtering is performed on the GPU, which effectively reduces the size of the candidate set. The method can accurately port the traditional data similarity join to a CPU-GPU heterogeneous computing system, thereby effectively improving the processing efficiency of similarity joins over large-scale datasets.
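
To show why prefix filtering shrinks the candidate set, here is a minimal sequential sketch of the classic single prefix filter for a Jaccard self-join. It is not the double prefix filtering of the invention, whose details are not given in this text, and it runs on the CPU only. The function names and the choice of Jaccard similarity are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def prefix_filter_join(records, threshold):
    """Return pairs (i, j), i < j, whose Jaccard similarity is >= threshold."""
    token_sets = [set(r.split()) for r in records]
    # Global token order: rarest first, ties broken alphabetically.
    freq = Counter(tok for toks in token_sets for tok in toks)
    ordered = [sorted(toks, key=lambda t: (freq[t], t)) for toks in token_sets]

    index = defaultdict(list)  # prefix token -> ids of earlier records
    results = []
    for rid, toks in enumerate(ordered):
        n = len(toks)
        # Two records can only reach the threshold if their prefixes share
        # a token. The small epsilon guards against float rounding in ceil.
        prefix_len = n - math.ceil(threshold * n - 1e-9) + 1
        candidates = set()
        for tok in toks[:prefix_len]:
            candidates.update(index[tok])  # probe earlier records
            index[tok].append(rid)         # then register this record
        # Verification: compute the exact similarity for candidates only.
        for other in sorted(candidates):
            if jaccard(toks, ordered[other]) >= threshold:
                results.append((other, rid))
    return results
```

Only records that share a prefix token reach the verification loop, so most dissimilar pairs are never compared. A heterogeneous design along the lines described above would distribute the filtering and verification work between the GPU and the CPU.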

Description

Technical field

[0001] The invention relates to the fields of computer database technology and parallel computing, and in particular to a parallel similarity join method based on a CPU-GPU heterogeneous architecture.

Background

[0002] With the development of the traditional Internet and the emergence of the mobile Internet, the amount of data has grown rapidly, and the concept of "big data" has become familiar. This volume of data also poses new challenges for traditional data storage and processing. To process big data faster, distributed technologies such as MapReduce and HDFS are used to compute over and store it. Traditional approaches to improving CPU performance have hit a bottleneck: it is increasingly difficult to improve CPU performance by raising the clock frequency or the number of cores. The processing speed of the traditional similarity join algorithm, which is only c...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/25, G06F16/22, G06F15/163
CPC: G06F16/252, G06F16/2228, G06F15/163
Inventor: 聂铁铮, 徐坤浩, 申德荣, 于戈, 寇月
Owner: NORTHEASTERN UNIV LIAONING