Targeted sequencing data simulation method and device based on NGS

A targeted sequencing data simulation technique in the field of data processing. It addresses the long running time and large storage requirements of CNV detection, and reduces the time consumed.

Active Publication Date: 2018-06-29
BEIJING KEXUN BIOTECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] The embodiment of the present invention provides a method and device for simulating targeted sequencing data based on NGS, so as to at least solve the techni


Examples


Embodiment Construction

[0021] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0022] It should be noted that the terms "first" and "second" in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate c...


Abstract

The invention discloses an NGS-based targeted sequencing data simulation method and device. The method comprises: determining the multiple target-region bins corresponding to a simulated sequencing depth data set to be generated, wherein the data set comprises a simulated sequencing depth for each of the bins; determining an expected value of the simulated sequencing depth data set; generating a normally distributed first random number whose mean is the expected value and whose variance is a preset variance, wherein the preset variance is determined in advance from actual samples; generating multiple Poisson-distributed second random numbers whose mean and variance are both the first random number; and adjusting the second random numbers separately according to multiple adjustment parameters to generate the simulated sequencing depth data set. The method and device solve the technical problem in the prior art that, because simulated sequencing sequence data must be generated, CNV detection takes a long time and occupies a large amount of storage space.
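The sampling steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the multiplicative form of the per-bin adjustment, and the clamping of the normal draw at zero are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method.

    exp(-lam) underflows for large lam, so lam is split into chunks of at
    most 500; a sum of independent Poisson draws is itself Poisson.
    """
    total = 0
    while lam > 0:
        chunk = min(lam, 500.0)
        lam -= chunk
        threshold = math.exp(-chunk)
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        total += k
    return total


def simulate_depths(expected_depth, preset_variance, adjustments, seed=None):
    """Simulate one sequencing depth per bin (hypothetical helper).

    adjustments holds one adjustment parameter per bin; applying it as a
    multiplier is an assumption, not something the abstract states.
    """
    rng = random.Random(seed)
    # First random number: normal, mean = expected value, variance = preset.
    first = rng.gauss(expected_depth, math.sqrt(preset_variance))
    # Assumption: clamp at zero, because a Poisson mean cannot be negative.
    first = max(first, 0.0)
    # Second random numbers: one Poisson draw per bin, mean = first number,
    # each adjusted separately by that bin's parameter.
    return [poisson_draw(rng, first) * a for a in adjustments]


depths = simulate_depths(100.0, 25.0, [1.0, 0.5, 1.5, 1.0], seed=0)
```

Because only per-bin depths are produced, no simulated read sequences need to be written to disk, which is the time and storage saving the abstract describes.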

Description

Technical field

[0001] The present invention relates to the field of data processing, in particular to an NGS-based targeted sequencing data simulation method and device.

Background technique

[0002] Copy number variation (CNV) is an important part of genome structural variation and one of the important pathogenic factors of human diseases. Currently, the methods used for CNV research include array-based comparative genomic hybridization (aCGH), SNP genotyping chip technology, and next-generation sequencing (NGS). Among NGS methods, CNV detection based on read depth is the most widely used; it rests on the assumption that the copy number is directly proportional to the number of sequencing fragments (reads).

[0003] When sequencing through NGS technology, the data that needs to be used is massive. The existing public data and actual data are not enough to adjust and optimize the software. In order to obtain these massive data, different types of dat...
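The read-depth assumption in paragraph [0002], that copy number is directly proportional to the number of reads, can be written as a one-line estimate. The sketch below is illustrative only: the function name, the use of a reference depth from a copy-neutral region, and the diploid baseline of 2 are assumptions that do not come from the patent text.

```python
def estimate_copy_number(observed_depth, reference_depth, baseline_copies=2):
    """Estimate copy number from read depth under strict proportionality.

    reference_depth is the depth expected at the baseline copy number
    (assumed here to be 2, as for a diploid autosomal region).
    """
    if reference_depth <= 0:
        raise ValueError("reference_depth must be positive")
    return baseline_copies * observed_depth / reference_depth


gain = estimate_copy_number(150, 100)  # depth 1.5x the reference
loss = estimate_copy_number(50, 100)   # depth 0.5x the reference
```

Real read-depth callers also correct for effects such as GC content and capture efficiency; this sketch shows only the proportionality itself.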


Application Information

IPC(8): G06F19/20, G06F19/28
CPC: G16B25/00, G16B50/00
Inventor: 党明浩, 刘珂弟, 张静波, 关永涛, 王伟伟, 刘倩, 唐宇
Owner: BEIJING KEXUN BIOTECH CO LTD