Three-level flow sequence comparison method based on many-core co-processor

A sequence comparison technology based on a many-core coprocessor, applied to sequence comparison in bioinformatics. It addresses the problem that data transfers occupy program running time and reduce running efficiency, and its effect is to reduce comparison time, improve comparison efficiency, and increase comparison speed.

Active Publication Date: 2015-02-25
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Frequent data transfers between the CPU and the MIC take up a large share of program running time and greatly reduce the running efficiency of the program.
At present, there is no public report of a technical scheme for sequence alignment using the MIC.
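
To see how much running time transfers can cost, a rough timing model helps. The sketch below compares running the read, compare, and write steps strictly one after another with running them overlapped. The per-portion times are hypothetical, the function names are illustrative, and the overlapped formula is the usual idealized pipeline estimate, not a measurement from the patent.

```python
def serial_time(n, t_read, t_compare, t_write):
    """Total time when each portion is read, compared, and written in turn."""
    return n * (t_read + t_compare + t_write)


def overlapped_time(n, t_read, t_compare, t_write):
    """Idealized total time when the three steps overlap across portions.

    The first portion pays for all three steps; every later portion
    only adds the duration of the slowest step.
    """
    if n == 0:
        return 0
    slowest = max(t_read, t_compare, t_write)
    return t_read + t_compare + t_write + (n - 1) * slowest


# Hypothetical numbers: 10 portions, 1 time unit to read, 3 to compare, 1 to write.
print(serial_time(10, 1, 3, 1))      # 50
print(overlapped_time(10, 1, 3, 1))  # 32
```

In this model the transfer time is hidden behind the comparison whenever the comparison is the slowest step.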

Detailed Description of the Embodiment

[0055] To verify the effect of the present invention, the National University of Defense Technology used a server equipped with two eight-core 2.4 GHz CPUs and one 57-core 1.1 GHz MIC card as the test environment. The server has 43 TB of hard disk space and 132 GB of memory, and the MIC card has 6 GB of storage. The input data is the human genome: the reference genome occupies 3 GB, and the short DNA sequences occupy 240 GB and comprise 80 million sequences.

[0056] As shown in Figure 1, the specific implementation steps are as follows:

[0057] Step 1: The CPU works from the available main memory of the computer, M_CPU = 45 GB (the operating system and other services occupy some memory, and part of the memory must be reserved for the program while it runs, so the available 45 GB is less than the installed 132 GB), and from the space occupied by the short DNA sequences, M_DNA = 240 GB, the short DN...
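
The text of Step 1 is cut off here, so the actual partitioning rule is not available. Purely as arithmetic on the two figures given, and assuming the short sequences are handled in portions that must each fit in M_CPU, the smallest possible number of portions can be computed as below. The function name and the fit-in-memory assumption are illustrative and are not taken from the patent.

```python
def min_portions(data_gb, available_gb):
    """Smallest number of portions such that each fits in available memory.

    Uses integer ceiling division; both arguments are whole gigabytes.
    """
    if available_gb <= 0:
        raise ValueError("available memory must be positive")
    return (data_gb + available_gb - 1) // available_gb


M_CPU = 45   # GB of main memory available, from the text
M_DNA = 240  # GB occupied by the short DNA sequences, from the text
print(min_portions(M_DNA, M_CPU))  # 6
```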

Abstract

The invention discloses a three-level flow (three-stage pipeline) sequence comparison method based on a many-core coprocessor, with the aim of increasing the comparison speed of sequence comparison software. In the technical scheme, a MIC (Many Integrated Core) many-core coprocessor performs sequence comparison with multiple threads. The three serial steps of the comparison process on the MIC, namely reading sequences from main memory to the MIC, comparing the sequences, and writing the comparison results back to main memory, are organized as a three-level flow: while one set of sequences is being compared, the sequences needed for the next comparison are read and the results of the previous comparison are written to main memory, so the read and write operations run concurrently with the comparison. Because sequence reading, sequence comparison, and result return proceed simultaneously, comparison efficiency improves and comparison time is shortened. Compared with a dual-socket eight-core CPU (central processing unit), the method increases the speed of the comparison process at least 2.3-fold, avoids copying large amounts of memory, and improves the space-time efficiency of the program.
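
The overlap of reading, comparing, and writing described in the abstract can be sketched with three threads connected by bounded queues. This is a minimal illustration and not the patented implementation: Python threads stand in for the transfers to and from the MIC, `best_match` is a toy mismatch counter rather than a real aligner, and all names here (`run_pipeline`, `best_match`, `SENTINEL`) are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream between stages


def best_match(reference, read):
    """Toy stand-in for alignment: first offset with the fewest mismatches."""
    best_offset, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(window, read))
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset, best_mismatches


def run_pipeline(reference, batches):
    """Run read -> compare -> write as three concurrently running stages."""
    to_compare = queue.Queue(maxsize=1)  # stage 1 -> stage 2
    to_write = queue.Queue(maxsize=1)    # stage 2 -> stage 3
    results = []

    def read_stage():
        for batch in batches:  # stands in for the main memory -> MIC copy
            to_compare.put(list(batch))
        to_compare.put(SENTINEL)

    def compare_stage():
        while (batch := to_compare.get()) is not SENTINEL:
            to_write.put([best_match(reference, read) for read in batch])
        to_write.put(SENTINEL)

    def write_stage():
        while (out := to_write.get()) is not SENTINEL:
            results.extend(out)  # stands in for the MIC -> main memory copy

    stages = (read_stage, compare_stage, write_stage)
    threads = [threading.Thread(target=stage) for stage in stages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline("ACGTACGTTT", [["ACGT", "CGTT"], ["TTT"]]))
# [(0, 0), (5, 0), (7, 0)]
```

The bounded queues keep at most one batch waiting between stages, so while one batch is being compared the next can already be read and the previous results written. Results stay in input order because each stage is a single thread and the queues are first-in, first-out.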

Description

Technical field

[0001] The invention relates to a sequence alignment method in the field of bioinformatics, and in particular to a sequence alignment method based on a many-core coprocessor.

Background

[0002] Molecular biology studies the material basis of life phenomena at the molecular level. By studying the structure, function, and synthesis of biomolecules, the functions and traits of organisms can be analyzed and understood in unprecedented molecular detail, and the nature of life phenomena can then be explained more scientifically and rigorously.

[0003] In molecular biology research, DNA sequence analysis is the basis for further study and modification of target genes. DNA (deoxyribonucleic acid) is a biological macromolecule built from four kinds of bases, denoted A, T, C, and G. The order in which these bases are arranged determines a particular genetic instruction, and these genetic instruc...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/38
Inventor: 廖湘科, 朱小谦, 崔英博, 彭绍亮, 邹丹, 王恒, 朱敏, 刘欣, 王海强, 高明
Owner: NAT UNIV OF DEFENSE TECH