High-concurrency sequence alignment acceleration method based on a CPU + GPU heterogeneous architecture

A sequence alignment and heterogeneous computing technique, applied to the acceleration of high-concurrency sequence alignment. It addresses low utilization of computing resources and limited acceleration, increases data parallelism, and achieves better GPU resource utilization, high processing efficiency and efficient asynchronous communication.

Active Publication Date: 2022-02-18
GUANGZHOU JIAJIAN MEDICAL TESTING CO LTD

AI Technical Summary

Problems solved by technology

[0006] The technical solutions above each optimize or improve one specific algorithm (such as the SW algorithm or the FM-index algorithm). They focus on the algorithm itself and use the characteristics of the GPU to address data parallelism, speeding up biological sequence comparison by improving that specific algorithm. Their disadvantage is that biological sequence comparison must first find a relatively accurate matching sub-segment to serve as a seed, and the length, position and number of SMEMs (super-maximal exact matches) vary greatly from read to read. When tasks are divided on the GPU so that each thread processes one read, this causes serious desynchronization between threads: reads with shorter SMEMs must wait until the search finishes for reads with longer SMEMs, and because reads with shorter SMEMs contain more SMEMs, reads with longer SMEMs must in turn wait for them. This situation of w...
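
The cost of this desynchronization can be illustrated with a small model. The sketch below is not from the patent: it assumes that threads run in fixed-size groups that finish only when their slowest member finishes, and the workload numbers are hypothetical work units, not measured data. The helper name `lockstep_cost` is likewise an assumption made for illustration.

```python
def lockstep_cost(workloads, group_size):
    """Thread-slots consumed when each group runs until its slowest member ends.

    Assumption: one thread handles one read, and a group of `group_size`
    threads occupies the device for max(group) time units.
    """
    total = 0
    for start in range(0, len(workloads), group_size):
        group = workloads[start:start + group_size]
        total += max(group) * len(group)
    return total


# Hypothetical per-read work units (for example SMEM search steps).
workloads = [2, 40, 3, 38, 1, 42, 4, 39]

useful_work = sum(workloads)
mixed_cost = lockstep_cost(workloads, group_size=4)
grouped_cost = lockstep_cost(sorted(workloads), group_size=4)

# Fraction of occupied thread-slots that did no useful work.
mixed_idle = 1 - useful_work / mixed_cost
grouped_idle = 1 - useful_work / grouped_cost
```

Under this simplified model, placing reads with similar workloads in the same group lowers the idle fraction, which is the motivation for dividing tasks by workload rather than arbitrarily.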


Embodiment Construction

[0047] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0048] A high-concurrency sequence alignment acceleration method based on a CPU+GPU heterogeneous architecture comprises the following:

[0049] Design of an asynchronous data-access model: data is transferred in a pipeline to achieve efficient asynchronous communication, and a global work list then supports asynchronous execution of the concurrent comparison algorithms.
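
A pipeline fed through a shared work list can be sketched on the host side with standard Python threads. This is only an illustration of the producer/consumer pattern, assuming a bounded queue stands in for the global work list; the function name `run_pipeline`, the `process` callback and the queue size are hypothetical and do not come from the patent, and no GPU transfer is performed here.

```python
import queue
import threading


def run_pipeline(batches, process, num_workers=2):
    """Load batches on one thread while workers consume them concurrently."""
    work_list = queue.Queue(maxsize=4)  # stand-in for the global work list
    results = {}
    results_lock = threading.Lock()

    def loader():
        # Producer: keeps feeding data while earlier batches are being processed.
        for index, batch in enumerate(batches):
            work_list.put((index, batch))
        for _ in range(num_workers):
            work_list.put(None)  # one stop marker per worker

    def worker():
        # Consumer: pulls whichever batch is available next.
        while True:
            item = work_list.get()
            if item is None:
                break
            index, batch = item
            output = process(batch)
            with results_lock:
                results[index] = output

    threads = [threading.Thread(target=loader)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Return results in the original batch order.
    return [results[i] for i in range(len(batches))]


read_lengths = run_pipeline([["ACGT", "AC"], ["G"]],
                            lambda batch: [len(read) for read in batch])
```

The bounded queue applies back-pressure: the loader blocks when the work list is full, so loading and processing overlap without unbounded buffering.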

[0050] Alignment algorithms and task-concurrency strategies. These include FM-index, the Smith-Waterman algorithm, and high-concurrency strategies over the Grid and Block dimensions, sequences and threads.
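
For readers unfamiliar with the Smith-Waterman algorithm named above, the following is a minimal CPU-only sketch of its local-alignment score with a linear gap penalty. The scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration, not parameters from the patent, and the code does not reproduce the GPU implementation.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score between strings a and b (linear gap penalty)."""
    previous_row = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current_row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current_row[j] = max(
                0,                                 # start a new local alignment
                previous_row[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous_row[j] + gap,             # gap in b
                current_row[j - 1] + gap,          # gap in a
            )
            best = max(best, current_row[j])
        previous_row = current_row
    return best


score = smith_waterman_score("ACGT", "TTACGTTT")
```

Each cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that property is what makes the recurrence a common target for GPU parallelization.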

[0051] The data-driven concurrent execution mechanism uses a data partition strategy to load data asynchronously into the GPU cache, where it serves multiple computing tasks, and makes full use of the access overhead o...


Abstract

The invention discloses a high-concurrency sequence alignment acceleration method based on a CPU + GPU heterogeneous architecture. The method comprises the following steps: reconstructing the BWA-MEM algorithm code; performing concurrent task processing on the CPU, where the sequence set is divided to form a plurality of concurrent tasks for the first time; running the reconstructed BWA-MEM algorithm to complete concurrent processing of the data on the GPU; and performing concurrent task processing on the GPU, where, for the seed sets and chains generated during sequence comparison, seed sets whose length, position and number are the same or close are assigned to the same data block, and chains are divided in the same way, forming a plurality of concurrent tasks for the second time. By designing a combined task-parallel and data-parallel mode, the method closely matches the characteristics of the BWA-MEM algorithm to those of the GPU accelerator, makes full use of the GPU's strong concurrent computing capability, improves the performance of the sequence alignment algorithm, and processes highly concurrent workloads more efficiently.
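
The two rounds of task division described in the abstract can be sketched as follows. This is a host-side illustration only: the `SeedSet` fields, the function names, and the choice to approximate "the same or close" by sorting on (length, count, position) and cutting fixed-size blocks are all assumptions, since the abstract does not specify the grouping rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedSet:
    read_id: int
    length: int    # seed length
    position: int  # position on the reference
    count: int     # number of seeds in the set


def split_reads(read_ids, batch_size):
    """First division (CPU side): cut the sequence set into concurrent batches."""
    return [read_ids[i:i + batch_size]
            for i in range(0, len(read_ids), batch_size)]


def group_seed_sets(seed_sets, block_size):
    """Second division: put seed sets with similar properties in the same block."""
    ordered = sorted(seed_sets, key=lambda s: (s.length, s.count, s.position))
    return [ordered[i:i + block_size]
            for i in range(0, len(ordered), block_size)]


# Hypothetical seed sets, not data from the patent.
seed_sets = [
    SeedSet(read_id=0, length=19, position=100, count=7),
    SeedSet(read_id=1, length=70, position=5000, count=1),
    SeedSet(read_id=2, length=20, position=140, count=6),
    SeedSet(read_id=3, length=68, position=5100, count=2),
]
batches = split_reads([0, 1, 2, 3, 4], batch_size=2)
blocks = group_seed_sets(seed_sets, block_size=2)
block_read_ids = [[s.read_id for s in block] for block in blocks]
```

With these inputs the short, numerous seed sets land in one block and the long, sparse ones in another, so work items that share a block have comparable cost.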

Description

Technical field

[0001] The invention relates to the field of biological sequence comparison, in particular to a CPU+GPU heterogeneous high-concurrency sequence comparison calculation acceleration method.

Background

[0002] Biological sequence alignment applies the classic text alignment problem from computer science to biology. With the emergence of new molecular biology techniques, downstream research on topics such as gene variation, RNA expression, and protein and gene interaction requires researchers to use high-throughput methods for interpretation. This poses new challenges for high performance computing. The challenge in the era of high-throughput sequencing is no longer data generation, but data storage, processing and analysis. The development of third-generation sequencing technology will further accelerate sequencing and generate longer sequencing fragments, which puts forward higher requirements for the dev...


Application Information

IPC(8): G06F15/16, G06F9/48
CPC: G06F15/16, G06F9/4881, Y02D10/00
Inventor 张巍林超宁张崇
Owner GUANGZHOU JIAJIAN MEDICAL TESTING CO LTD