Graph-based multi-sequence alignment method and system

A multi-sequence alignment and genome sequencing technology, applied in the medical field. It addresses three problems: the relationships between individual sequences in different groups being ignored, the high similarity of population genomes, and growing computational complexity. Its effects are reducing time and space complexity, achieving a clear data structure, and reducing the length and number of...

Pending Publication Date: 2022-07-01
INST OF LAB ANIMAL SCI CHINESE ACAD OF MEDICAL SCI

AI Technical Summary

Problems solved by technology

However, because population genomes are highly similar, long, and drawn from large samples, estimating the guide tree and performing progressive alignment inevitably involve many redundant operations. These greatly increase computational complexity and make the entire alignment process slow and costly.
In addition, preprocessing the sequences by group comparison also consumes a great deal of time and computing resources, and directly ignores the relationships between individual sequences in different groups.
[0005] To solve the problem that traditional multiple sequence alignment methods cannot cope with the large number of sample genomes now being sequenced, the present invention provides a graph-based multiple sequence alignment method and system.



Embodiment Construction

[0060] In order for those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

[0061] Some of the processes described in the description, the claims, and the above-mentioned drawings of the present invention include multiple operations that appear in a specific order. It should be clearly understood, however, that these operations may be executed out of the order in which they appear herein, or in parallel. The sequence numbers of the operations, such as 101 and 102, are only used to distinguish different operations and do not themselves represent any execution order. Additionally, these flows may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should b...


Abstract

The invention discloses a graph-based multi-sequence alignment method, system, device, and computer-readable storage medium. The method comprises: obtaining a set of pan-genome sequence data and constructing a graph from it to obtain a pan-genome colored graph; marking and acquiring the access-state feature of each single node of the colored graph, and traversing the colored graph to obtain the cSupB data model that results from decomposing it; extracting the offset-value data features of the nodes in the cSupB data model; and performing a first preprocessing on the cSupB data model based on the offset-value data features to obtain an initial alignment result for the cSupB data model. The invention addresses the problem that traditional multi-sequence alignment methods cannot cope with the large number of sample genomes now being sequenced.
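The abstract does not say how the colored graph is built, and it does not define the cSupB data model. As a rough illustration of the first step only, the sketch below assumes the colored graph is a colored de Bruijn-style graph: each node is a k-mer, and its "colors" are the samples that contain it. The names `build_colored_graph` and `shared_nodes` and the parameter `k` are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def build_colored_graph(sequences, k=4):
    """Build a colored de Bruijn-style graph (an assumed construction).

    nodes: maps each k-mer to the set of sample names ("colors") containing it.
    edges: maps each k-mer to the set of k-mers that directly follow it
           (overlapping by k-1 characters) in at least one sequence.
    """
    nodes = defaultdict(set)
    edges = defaultdict(set)
    for name, seq in sequences.items():
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        for kmer in kmers:
            nodes[kmer].add(name)
        for left, right in zip(kmers, kmers[1:]):
            edges[left].add(right)
    return dict(nodes), dict(edges)


def shared_nodes(nodes, n_samples):
    """Return the k-mers present in every sample, in sorted order."""
    return sorted(kmer for kmer, colors in nodes.items()
                  if len(colors) == n_samples)


# Two toy sequences that differ at a single position.
samples = {"s1": "ACGTACGA", "s2": "ACGTTCGA"}
nodes, edges = build_colored_graph(samples, k=4)
print(shared_nodes(nodes, len(samples)))
print(sorted(edges["ACGT"]))
```

In this toy input the k-mer `ACGT` carries both colors and has two successors, so the graph branches exactly where the two samples diverge. Shared regions are stored once rather than once per sample, which is the kind of redundancy reduction the abstract attributes to the graph-based approach.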

Description

Technical field

[0001] The invention belongs to the field of medical technology, and in particular relates to a graph-based multiple sequence alignment method and system.

Background

[0002] The development of life sciences, medicine, and other fields is closely tied to the application of sequencing technology. However, because of limitations in sequencing technology, sequencing cost, and even computing cost, many genome studies suffer from problems such as over-reliance on reference genomes. At present, the reference genome occupies a very important position in many fields. In almost all studies involving genomes, the first step is to construct a reference genome for the species under study, and then to carry out follow-up studies based on that reference genome. One example is comparing newly sequenced data from other individuals of the species against the reference genome to reveal differences, an approach that underlies the search for the genetic origin...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/10, G16B45/00, G16B50/30
CPC: G16B30/10, G16B45/00, G16B50/30
Inventor: 郭金旦, 陈禹保, 刘江宁, 秦川
Owner: INST OF LAB ANIMAL SCI CHINESE ACAD OF MEDICAL SCI