Whole-exome sequencing data analysis system

A whole-exome sequencing data analysis technology, applied in digital data processing, special data processing applications, genomics, etc., that addresses the inability to identify low-frequency pathogenic variants and novel pathogenic mutations

Inactive Publication Date: 2016-10-12
WANKANGYUAN TIANJIN GENE TECH CO LTD
0 Cites, 23 Cited by
AI Technical Summary

Problems solved by technology

However, GWAS has two limitations. First, most of the identified association sites are located in the intergenic regions, introns, and regulatory regions of the genome. Second, the probes on the chip are designed from currently known variants (most of them common SNPs), so the chip fails to identify low-frequency pathogenic variants and novel pathogenic variants.


Examples


Example

[0052] 1. Data introduction

[0053] Data Type: Whole Exome Sequencing

[0054] Tissue source: DNA from cancer tissue and peripheral blood of the same patient

[0055] Experimental Design: Exon Capture Sequencing

[0056] Sequencing platform: Illumina HiSeq 2000, paired-end sequencing

[0057] Average read length: 100 bp

[0058] The quality statistics of the original sequencing data are shown in Table 4.1.

[0059] Table 4.1 Whole-exome sequencing data quality statistics

[0060]

[0061] 2. System use

[0062] The whole-exome sequencing data analysis process includes: quality assessment and control of the sequencing data, screening of high-quality reads, alignment of reads to the reference genome, detection of genomic variation, somatic mutation detection in paired samples, calculation of copy number variation, functional annotation, etc. Each analysis step is carried out below, one at a time, using the function modules integrated in the software.
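The first two steps listed above (quality assessment and screening of high-quality reads) could be sketched roughly as follows. This is only an illustrative sketch, not the system's own code: the function names, the Phred+33 quality encoding, and the mean-quality threshold of 20 are all assumptions that do not come from the patent text.

```python
import io
from statistics import mean

# Assumption: qualities use the Phred+33 (Sanger / Illumina 1.8+) encoding.
# Older Illumina pipelines used Phred+64, so check the data before reusing this.
PHRED_OFFSET = 33


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_quality(qual, offset=PHRED_OFFSET):
    """Mean Phred score of one read's quality string."""
    return mean(ord(c) - offset for c in qual)


def filter_reads(records, min_mean_q=20.0):
    """Keep reads whose mean base quality reaches the (hypothetical) threshold."""
    for header, seq, qual in records:
        if qual and mean_quality(qual) >= min_mean_q:
            yield header, seq, qual


# Tiny in-memory example: read1 has quality 40 at every base, read2 has quality 0.
demo = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n!!!!\n"
)
kept = [header for header, _, _ in filter_reads(read_fastq(demo))]
print(kept)
```

For paired-end data such as the HiSeq 2000 reads described here, a real filter would also need to drop or keep both mates together so the two files stay synchronized; that is omitted from the sketch.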

[0063] (1) Quality control of raw se...


Abstract

The invention discloses a whole-exome sequencing data analysis system. The system comprises: a quality control module, which assesses single-base quality and read quality in the original sequencing data file; a genome mapping module, which maps reads to the genome using the aln algorithm of BWA; a genome variation module, which finds variation sites in the genome using the UnifiedGenotyper method of the GATK package; and a variation site annotation module, which annotates candidate variation sites or genome intervals. With this system, large-scale data analysis is completed through simple parameter submission. The analysis covers the original sequencing data from upstream to downstream, including quality checking of the original data, data denoising, and mapping of reads to the genome. The sequencing data is analyzed through an automatic parameter submission and analysis module, the candidate pathogenic mutation sites and related genes are output, and a basis is provided for later experimental verification.
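As a rough illustration of how the mapping and variation modules named in the abstract might chain the underlying tools, the sketch below only assembles command lines for `bwa aln`, `bwa sampe`, and GATK `UnifiedGenotyper` (GATK 3.x and earlier). It is a sketch under stated assumptions, not the patented system's implementation: the helper names, all file names, and the `GenomeAnalysisTK.jar` path are hypothetical, nothing is executed, and the SAM-to-sorted-BAM conversion and indexing that GATK requires are left out.

```python
def bwa_aln_commands(ref, fq1, fq2, prefix):
    """Build (argv, stdout_path) pairs for paired-end mapping with bwa aln + sampe.

    bwa writes to standard output, so each command is paired with the file
    its output would be redirected to. All paths are hypothetical.
    """
    sai1, sai2 = f"{prefix}_1.sai", f"{prefix}_2.sai"
    sam = f"{prefix}.sam"
    return [
        (["bwa", "aln", ref, fq1], sai1),                      # align mate 1
        (["bwa", "aln", ref, fq2], sai2),                      # align mate 2
        (["bwa", "sampe", ref, sai1, sai2, fq1, fq2], sam),    # pair the mates
    ]


def unified_genotyper_command(ref, bam, vcf, gatk_jar="GenomeAnalysisTK.jar"):
    """Build the argv for GATK UnifiedGenotyper on a sorted, indexed BAM."""
    return [
        "java", "-jar", gatk_jar,
        "-T", "UnifiedGenotyper",
        "-R", ref,   # reference FASTA
        "-I", bam,   # input alignments
        "-o", vcf,   # output variant calls
    ]


# Hypothetical file names, for illustration only.
mapping = bwa_aln_commands("ref.fa", "tumor_1.fq", "tumor_2.fq", "tumor")
calling = unified_genotyper_command("ref.fa", "tumor.sorted.bam", "tumor.vcf")

for argv, out_path in mapping:
    print(" ".join(argv), ">", out_path)
print(" ".join(calling))
```

Returning argument lists rather than shell strings keeps the sketch easy to hand to `subprocess.run` later, with the redirect target opened explicitly as the process's standard output.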

Description

Technical field

[0001] The invention belongs to the field of gene information data processing, and in particular relates to a whole-exome sequencing data analysis system.

Background

[0002] With the completion of the Human Genome Project and the construction of the international human haplotype map (HapMap), research that predicts disease susceptibility loci and studies their function by analyzing genome information has advanced rapidly. This type of research is mainly based on biochip genotyping technology and uses genome-wide association study (GWAS) methods to find genetic factors associated with complex diseases. As the density of probes on biochips increases, especially with the design of tiling probes, the mining of disease risk sites is becoming more and more comprehensive. However, GWAS has limitations: first, most of the identified association sites are located in the intergenic regions, introns, and regulatory regions of the genome; second, the probes of t...

Application Information

IPC(8): G06F19/18
CPC: G16B20/00
Inventor: 薛成海, 吕艳玲, 郑文辉
Owner: WANKANGYUAN TIANJIN GENE TECH CO LTD