
Sequencing data assembly method

An assembly method for sequencing data, applied in the field of bioinformatics, that addresses problems such as information loss and difficult assembly and improves genome assembly results.

Active Publication Date: 2019-05-28
GENERGY BIO TECH SHANGHAI CO LTD

AI Technical Summary

Problems solved by technology

At present, the most commonly used method is Next Generation Sequencing (NGS), but NGS loses a large amount of information on repetitive elements and structural variation, so assembling a complete genome map is a difficult problem.



Embodiment Construction

[0061] To make the objectives, technical solutions, and advantages of the present invention clearer, various embodiments of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are provided in each embodiment to help readers better understand the present application. However, the technical solutions claimed in the claims of the present application can be realized even without these technical details, and with various changes and modifications based on the following embodiments.

[0062] Several concepts used in the present invention are introduced first:

[0063] 1. read: During sequencing, a DNA molecule is first cloned into several copies, and these copies are then broken into short fragments that can be sequenced directly. Each fragment is called a "read", and the sequencer genera...


Abstract

The invention relates to a sequencing data assembly method. First, a genome assembly file is obtained using the optical mapping platform Irys, and a scaffold file (the fai file) of the NGS assembly is obtained. Second, the data are preprocessed: a threshold is set and comparison results with low reliability are filtered out; the cmp files are merged and sorted, and the N50 is calculated. Third, assembly statistics are computed: the comparison results between BioNano and NGS are counted, including the length, number, and total amount of NGS scaffolds that align to BioNano contigs. Fourth, according to the network topological relationship between the BioNano contigs and the NGS scaffolds, the lengths of the newly assembled contigs and of the scaffolds are analyzed by category. The method can therefore assist genome assembly and markedly improve the genome assembly results for a species.
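The preprocessing and statistics steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the alignment records are assumed to be plain dictionaries with hypothetical `contig`, `scaffold`, and `confidence` keys, the threshold value is arbitrary, and the "one-to-one" / "one-to-many" labels are only one plausible way to classify contig-to-scaffold relationships. The only file-format assumption is that the first two tab-separated columns of a `.fai` index are the sequence name and its length.

```python
from collections import defaultdict


def parse_fai(text):
    """Return {scaffold_name: length} from the text of a FASTA index (.fai)."""
    lengths = {}
    for line in text.splitlines():
        if line.strip():
            # Columns 1 and 2 of a .fai line are the name and the length.
            name, length = line.split("\t")[:2]
            lengths[name] = int(length)
    return lengths


def n50(lengths):
    """Length of the sequence at which the running total, taken from
    longest to shortest, first reaches half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def filter_alignments(alignments, min_confidence):
    """Drop alignments whose (hypothetical) confidence is below the threshold."""
    return [a for a in alignments if a["confidence"] >= min_confidence]


def classify_links(alignments):
    """Label each contig by how many distinct scaffolds align to it
    (an assumed classification, for illustration only)."""
    scaffolds_per_contig = defaultdict(set)
    for a in alignments:
        scaffolds_per_contig[a["contig"]].add(a["scaffold"])
    return {
        contig: "one-to-one" if len(scaffolds) == 1 else "one-to-many"
        for contig, scaffolds in scaffolds_per_contig.items()
    }


# Made-up example data, not taken from the patent.
FAI_TEXT = "scaf1\t80\t7\t60\t61\nscaf2\t70\t96\t60\t61\nscaf3\t50\t175\t60\t61\n"
ALIGNMENTS = [
    {"contig": "c1", "scaffold": "scaf1", "confidence": 15.0},
    {"contig": "c1", "scaffold": "scaf2", "confidence": 12.0},
    {"contig": "c2", "scaffold": "scaf3", "confidence": 4.0},
    {"contig": "c3", "scaffold": "scaf3", "confidence": 20.0},
]

scaffold_lengths = parse_fai(FAI_TEXT)
kept = filter_alignments(ALIGNMENTS, min_confidence=10.0)
print(n50(scaffold_lengths.values()), classify_links(kept))
```

Sorting from longest to shortest before accumulating is what makes the returned value the N50 rather than an arbitrary length; filtering before classification keeps low-reliability alignments from creating spurious contig-to-scaffold links.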

Description

Technical field

[0001] The present invention relates to bioinformatics, and in particular to assisting the assembly of de novo sequencing data and the detection of structural variation.

Background technique

[0002] De novo genome sequencing refers to sequencing the whole genome of a species whose genome sequence is unknown or for which no genome of a related species is available. The sequencing reads are then assembled and annotated using bioinformatics methods to obtain a complete genome sequence map of the species. At present, the most commonly used method is Next Generation Sequencing (NGS), but NGS loses a large amount of information on repetitive elements and structural variation, so assembling a complete genome map is a difficult problem.

[0003] BioNano Genomics has expanded nanochannel technology and developed it into a flexible optical mapping platform Irys with high resolution and extrem...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G16B30/20
CPC: G16B20/00
Inventor: 马丰收张艺何飞刘洋
Owner: GENERGY BIO TECH SHANGHAI CO LTD