Sequencing data processing system and SMN (survival motor neuron) gene detecting system

A sequencing data processing system, applied in electrical digital data processing, special-purpose data processing applications, and microbial determination/inspection. It aims to reduce detection cost and improve ease of use.

Active Publication Date: 2017-10-20
AEGICARE (SHENZHEN) TECH CO LTD
7 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the above-mentioned deficiencies of the prior art by providing a sequencing data processing system and an SMN gene detection system. It aims to solve the technical problems of existing SMN gene detection methods: a cumbersome experimental process, low precision, and poor accuracy.

Examples

Embodiment 1

[0068] A method for processing sequencing data, comprising the steps of:

[0069] S111: Obtain high-throughput sequencing data containing the SMN gene.

[0070] S112: Annotate all exons of the SMN2 gene in the reference genome (chromosome 5: 69344512-69373860 base pairs, exons 1 to 7) as X, then use the BWA-MEM software to align the sequencing data against the annotated reference genome to obtain the matching sequences in the sequencing data.

[0071] S113: From the matching sequences, find all mutations on the SMN1 gene in the annotated reference genome. Combine these with the difference base site (i.e., the SMN1/SMN2 difference site, located at position 70247773 on chromosome 5, where SMN1 is C and SMN2 is T) to determine all mutation sites of the SMN gene in the sequencing data. Then use the hidden Markov method to obtain the total copy number of exon 7 of the SMN gene. The hidden Markov method formula is as follows:

[0072]

[0073...
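Step S113 relies on the single SMN1/SMN2 difference site (chromosome 5, position 70247773; C in SMN1, T in SMN2). The following is a minimal sketch of how reads covering that site might be tallied by base. It is an illustration only, not the patented implementation: the function names `count_bases_at_site` and `smn1_fraction`, the representation of a read as a (start position, sequence) pair, and the assumption of gap-free alignments are all hypothetical, and the toy reads are invented for the example. The hidden Markov copy number step is not sketched, because its formula is not available in this text.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Position of the SMN1/SMN2 difference site on chromosome 5, as stated in S113.
DIFF_SITE = 70247773

# Hypothetical read representation: (1-based start position, aligned bases).
# Assumes gap-free alignments; real alignments with indels need CIGAR handling.
Read = Tuple[int, str]


def count_bases_at_site(reads: Iterable[Read], site: int) -> Counter:
    """Count the base each read shows at `site`, skipping reads that miss it."""
    counts: Counter = Counter()
    for start, seq in reads:
        offset = site - start
        if 0 <= offset < len(seq):
            counts[seq[offset]] += 1
    return counts


def smn1_fraction(counts: Counter) -> Optional[float]:
    """Fraction of C (SMN1-like) among C and T reads; None if no coverage."""
    c_reads = counts.get("C", 0)
    t_reads = counts.get("T", 0)
    total = c_reads + t_reads
    return c_reads / total if total else None


# Toy reads invented for illustration only.
toy_reads = [
    (70247770, "AAACGG"),  # covers the site with C
    (70247771, "AATGG"),   # covers the site with T
    (70247773, "CGGA"),    # starts at the site with C
    (70247760, "ACGT"),    # ends before the site, ignored
    (70247772, "ACG"),     # covers the site with C
]

site_counts = count_bases_at_site(toy_reads, DIFF_SITE)
print(dict(site_counts), smn1_fraction(site_counts))
```

Because all reads are mapped to SMN1 after annotation, the base observed at this site is what separates SMN1-derived reads from SMN2-derived reads in such a tally.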

Embodiment 2

[0078] Computer simulation test of the effect of annotating the reference genome in Embodiment 1:

[0079] By annotating the SMN2 exon sequences in the reference genome as X, the sequencing reads of both the SMN1 and SMN2 genes were accurately mapped to SMN1. The mapping results are shown in Figure 1. The first row of Figure 1 shows exons 1-7 of SMN1, and the second row shows exons 1-7 of SMN2. The hollow box plots indicate mapping against the standard (unannotated) reference genome, recorded as the original reference genome (P); the dark solid box plots indicate mapping after SMN2 in the reference genome is annotated with X, recorded as the annotated reference genome (M). The abscissa shows four test data sets (SR1, SR2, SR3, and SR4, with 48 samples each), and the ordinate shows the number of uniquely mapped sequencing reads.

[0...
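The annotation step tested in this embodiment can be pictured with a short sketch. It assumes that "annotating as X" means replacing the bases inside the SMN2 exon intervals of the reference sequence with the placeholder character X, which is one reading of the text rather than a confirmed detail. The function name `mask_intervals`, the toy sequence, and the toy intervals are hypothetical; the source gives only the overall SMN2 span (chromosome 5: 69344512-69373860), not individual exon boundaries.

```python
from typing import Iterable, Tuple


def mask_intervals(
    sequence: str,
    intervals: Iterable[Tuple[int, int]],
    placeholder: str = "X",
) -> str:
    """Replace bases in 1-based, inclusive intervals with a placeholder."""
    bases = list(sequence)
    for start, end in intervals:
        if start < 1 or end > len(bases) or start > end:
            raise ValueError(f"invalid interval: {start}-{end}")
        # Slice assignment keeps the sequence length unchanged,
        # so downstream coordinates stay valid.
        bases[start - 1:end] = placeholder * (end - start + 1)
    return "".join(bases)


# Toy reference and toy "exon" intervals, invented for illustration only.
toy_reference = "ACGTACGTACGTACGTACGT"
toy_exons = [(3, 6), (11, 14)]

masked = mask_intervals(toy_reference, toy_exons)
print(masked)
```

Keeping the masked sequence the same length as the original means positions such as the difference site at 70247773 keep their coordinates after annotation.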

Embodiment 3

[0082] Sequence matching after reference genome annotation was compared between the control group (sequencing data without the SMN region) and the experimental group (sequencing data containing the SMN region). The detailed analysis results are shown in Table 1 and Table 2.

[0083] Table 1 shows the control group: the DNA capture does not contain the SMN regions (i.e., neither the SMN1 nor the SMN2 region). Table 2 shows the experimental group: the DNA capture contains the SMN regions (i.e., both the SMN1 and SMN2 regions). The data in Table 1 and Table 2 below show that, after reference genome annotation, sequencing reads that previously could not be uniquely matched were successfully matched to SMN1, and sequencing reads previously matched to SMN2 were also matched to SMN1, while other regions of the genome were little affected.

[0084] Table 1

[0085]

[0086] Table 2

[0087]

Abstract

The invention belongs to the technical field of gene sequencing and specifically relates to a sequencing data processing system and an SMN (survival motor neuron) gene detection system. The system comprises a data acquisition unit, a sequence alignment unit, and an information determination unit. The data acquisition unit obtains high-throughput sequencing data containing the SMN genes. The sequence alignment unit annotates all exons of the SMN2 gene in a reference genome and aligns the sequencing data against the annotated reference genome to obtain the matching sequences in the sequencing data. The information determination unit determines mutation information of the SMN genes in the sequencing data according to the difference base sites between the matching sequences and exon 7 of the SMN genes. The disclosed sequencing data processing system can comprehensively and accurately detect SMN1 and SMN2 sequences, obtain other mutation sites and copy number information, and provide more information on the pathogenic genes. It can also be integrated directly with existing common detection procedures, which improves ease of use and reduces detection cost.

Description

Technical field

[0001] The invention belongs to the technical field of gene sequencing, and in particular relates to a sequencing data processing system and an SMN gene detection system.

Background technique

[0002] Spinal muscular atrophy (SMA) refers to a group of inherited neuromuscular diseases that cause proximal muscle weakness and atrophy due to degeneration of the anterior horn cells of the spinal cord. The survival motor neuron (SMN) genes, SMN1 and SMN2, are its causative genes. Detecting SMN1 and SMN2 has always been difficult in genetic disease testing, for two main reasons. First, the two disease-causing genes lie in a locally repetitive region, are close to each other in the genome, and have almost identical sequences; there is only one variant site that can be used to distinguish the two genes. Second, the copy number variation of these two genes in the population is very important for their pathogenicity,...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): C12Q1/68, G06F19/22
CPC: C12Q1/6869, C12Q1/6883, C12Q2600/156, G16B30/00, C12Q2535/122
Inventor: 李阳, 刘阳, 张洋, 顾卓雅, 吕佩涛
Owner: AEGICARE (SHENZHEN) TECH CO LTD