Super-long genome-based variation detection algorithm and detection system

A variation detection and genome technology, classified under microbial determination/inspection, computation, and biochemical equipment and methods. It addresses the inability to accurately detect structural variations in large and ultra-long genomes, and improves detection sensitivity.

Active Publication Date: 2016-04-13
WUHAN FRASERGEN CO LTD

AI Technical Summary

Problems solved by technology

[0003] The invention provides a variation detection algorithm based on an ultra-long genome, referred to as the VariationBlast algorit



Examples


Embodiment Construction

[0067] The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the invention and are not intended to limit its scope.

[0068] As shown in Figure 1, a variation detection algorithm based on ultra-long genomes includes the following steps:

[0069] S1. Use a local sequence alignment algorithm to detect all matches between the sequencing fragments and the reference sequence, yielding partial matching events. Each partial matching event comprises a sequencing fragment and the corresponding reference fragment on the reference sequence.

[0070] S2. Sort the sequencing fragments of all partial matching events by their aligned positions on the reference sequence, and group together the partial matching events whose positions on the reference sequence overlap or are sequentially connected by the sequencing fragments.

[0071] S3. Score...
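The sorting and grouping in step S2 can be illustrated with a minimal sketch. It is not the patented implementation: the `PartialMatch` record, the `group_partial_matches` function, the `max_gap` parameter, and the half-open coordinate convention are all hypothetical, and "overlap or are sequentially connected" is interpreted here simply as reference intervals that overlap or adjoin on a single reference sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialMatch:
    """Hypothetical record for one partial matching event (half-open coordinates)."""
    read_id: str
    read_start: int
    read_end: int
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def group_partial_matches(matches, max_gap=0):
    """Sort events by reference position, then merge events whose reference
    intervals overlap or adjoin (within max_gap bases) into one group."""
    ordered = sorted(matches, key=lambda m: (m.ref_start, m.ref_end))
    groups = []
    current_end = None  # right-most reference coordinate of the open group
    for m in ordered:
        if groups and m.ref_start <= current_end + max_gap:
            groups[-1].append(m)
            current_end = max(current_end, m.ref_end)
        else:
            groups.append([m])
            current_end = m.ref_end
    return groups
```

Under these assumptions, events covering reference intervals 1000-1100, 1050-1150, and 1100-1150 fall into one group, while an event at 5000-5080 starts a new one. A real pipeline would also key the grouping on the chromosome or contig name.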


Abstract

The invention relates to a variation detection algorithm based on ultra-long genomes, called the VariationBlast algorithm for short. When long sequences are available, large-scale structural variations can generally be detected by aligning each sequence to a reference genome. A sequence that spans a structural variation matches the reference sequence only in part; by aligning those partial segments of the sequence to the corresponding segments of the reference sequence, the precise position of the structural variation can be located. VariationBlast aligns every sequence to the reference genome using a successive alignment method, then classifies and screens all sequences that indicate structural variations, and finally derives the possible structural variations and their types from the alignment positions and directions.
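How a variation type might be inferred from alignment positions and directions can be sketched as follows. This uses generic split-alignment heuristics, not the rules of the patent, which are not given in this excerpt. The `Segment` tuple, the `classify_junction` function, the `min_size` threshold, and the returned labels are all hypothetical, and coordinates are assumed to be half-open.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical partial alignment of one read segment (half-open coordinates)."""
    read_start: int
    read_end: int
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def classify_junction(a, b, min_size=50):
    """Guess the variation type between two consecutive segments of one read.

    `a` must precede `b` on the read. The rules are simplified assumptions.
    """
    if a.strand != b.strand:
        return "inversion"  # alignment direction flips across the junction
    read_gap = b.read_start - a.read_end  # unaligned read bases between segments
    if a.strand == "+":
        ref_gap = b.ref_start - a.ref_end
    else:
        # On the minus strand, the later read segment lies upstream on the reference.
        ref_gap = a.ref_start - b.ref_end
    if ref_gap <= -min_size:
        return "duplication"  # second segment maps back onto covered reference
    if ref_gap - read_gap >= min_size:
        return "deletion"  # reference bases missing from the read
    if read_gap - ref_gap >= min_size:
        return "insertion"  # read bases absent from the reference
    return "none"
```

For instance, two same-strand segments that are contiguous on the read but 500 bases apart on the reference would be labeled a deletion under these rules. Translocations between chromosomes are outside the scope of this sketch.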

Description

Technical field

[0001] The invention relates to the detection of structural variation in gene sequences, and in particular to a variation detection algorithm and detection system based on ultra-long genomes.

Background

[0002] The Human Genome Project (HGP), launched in the 1990s, together with the subsequent 1000 Genomes Project and the Encyclopedia of DNA Elements (ENCODE) project, accelerated the arrival of the genome era. Second-generation and third-generation DNA sequencing technologies have enabled the successful completion of genome sequencing projects for many species, accumulating a large amount of biological data. This biological big data must be properly analyzed to extract information of potential theoretical and practical value. Genome sequence polymorphism refers to differences in DNA sequence and structure within and between populations of a species. These genome differences in humans determine the genome differences or polymorphisms ...


Application Information

IPC(8): C12Q1/68, G06F19/18
Inventor: 朱世杰
Owner: WUHAN FRASERGEN CO LTD