True and false gene mutation analysis method based on high-throughput sequencing and application

An analysis method based on high-throughput sequencing, applied in the field of bioinformatics. It addresses problems such as mismatched read alignment, undetectable C>T mutations, and the high cost and time of experiments, thereby saving experimental cost and time.

Pending Publication Date: 2021-01-08
GUANGZHOU KINGMED DIAGNOSTICS GRP CO LTD +2

AI Technical Summary

Problems solved by technology

If there were no pseudogene, reads near this position would be aligned to the true gene (despite the mismatch, it would still be the best match). Because C and T do not match, the C>T mutation at this position would be found after alignment. However, because the pseudogene SMN2 exists in the genome, reads near this position match the pseudogene best and are aligned to it, so the reads ...
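The best-match effect described above can be illustrated with a toy sketch. It assumes gap-free alignment of a read against two equal-length reference windows; the 11-base sequences and the helper names `mismatches` and `best_match` are invented for illustration and are not taken from the patent or from any real aligner.

```python
def mismatches(read: str, ref: str) -> int:
    """Count mismatching positions between a read and an equal-length window."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    return sum(a != b for a, b in zip(read, ref))


def best_match(read: str, windows: dict) -> str:
    """Return the name of the window with the fewest mismatches."""
    # sorted() only makes tie-breaking deterministic; it has no biological meaning.
    return min(sorted(windows), key=lambda name: mismatches(read, windows[name]))


# Hypothetical windows: identical except for the centre base
# (C in the true gene, T in the pseudogene).
windows = {
    "true_gene": "ACGTACCTGAA",
    "pseudogene": "ACGTATCTGAA",
}

wild_type_read = "ACGTACCTGAA"  # carries C, matches the true gene exactly
mutant_read = "ACGTATCTGAA"     # true-gene read carrying a C>T mutation

wild_type_target = best_match(wild_type_read, windows)
mutant_target = best_match(mutant_read, windows)
```

In this toy case the mutant read has one mismatch against the true gene and none against the pseudogene, so a best-match rule assigns it to the pseudogene and the C>T change never appears as a mismatch.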



Examples


Embodiment 1

[0059] A method for analyzing true and false gene mutations based on high-throughput sequencing, applied to the mutation analysis of the SMN1 / SMN2 genes. As shown in Figure 1, it includes the following steps:

[0060] 1. Obtain the difference sites.

[0061] Compare the reference sequences of the homologous true gene and pseudogene to obtain the sites at which they differ.

[0062] Taking SMN1:840C as an example, the site in the true gene SMN1 is located at chr5:70247773 in hg19 (Human Genome Reference Sequence, UCSC), and the corresponding base is C, while the site in the pseudogene SMN2 is located at chr5:69372353, and the corresponding base is T.

[0063] Obtain all sites at which the above SMN1 / SMN2 genes differ in hg19.
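A minimal sketch of step 1, assuming the two reference segments have already been aligned without gaps (real true gene and pseudogene alignments may contain insertions and deletions, which this sketch does not handle). The function name `difference_sites` and the 11-base sequences are hypothetical; only the two hg19 coordinates come from paragraph [0062].

```python
def difference_sites(true_seq, pseudo_seq, true_start, pseudo_start):
    """List positions where two gap-free, pre-aligned sequences differ.

    true_start and pseudo_start are the 1-based genomic coordinates of the
    first base of each sequence. Returns a list of tuples
    (true_pos, true_base, pseudo_pos, pseudo_base).
    """
    if len(true_seq) != len(pseudo_seq):
        raise ValueError("sequences must be aligned to equal length")
    sites = []
    for offset, (t, p) in enumerate(zip(true_seq.upper(), pseudo_seq.upper())):
        if t != p:
            sites.append((true_start + offset, t, pseudo_start + offset, p))
    return sites


# Made-up flanking bases; the single difference sits at offset 5 so that it
# lands on the coordinates quoted in [0062].
sites = difference_sites(
    "ACGTACCTGAA", "ACGTATCTGAA",
    true_start=70247773 - 5,
    pseudo_start=69372353 - 5,
)
```

Each tuple pairs a true gene position with its pseudogene counterpart, which is the information the later read-counting step needs.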

[0064] 2. Alignment of NGS data.

[0065] Obtain NGS sequencing data, align it to the reference genome sequence (hg19), and use the optimal alignment principle to obtain the true gene reads group covering the di...
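The grouping in this step can be sketched as follows, under strong simplifying assumptions: reads are already aligned without gaps and are represented as plain `(chrom, start, sequence)` tuples with 1-based starts. The read sequences and the helper name `count_bases_at` are hypothetical; a real pipeline would read alignments from a BAM file rather than from tuples.

```python
from collections import Counter


def count_bases_at(reads, chrom, pos):
    """Count the bases observed at a 1-based position among gap-free reads."""
    counts = Counter()
    for read_chrom, start, seq in reads:
        # A read covers pos when start <= pos <= start + len(seq) - 1.
        if read_chrom == chrom and start <= pos < start + len(seq):
            counts[seq[pos - start]] += 1
    return counts


# Made-up aligned reads around the two sites quoted in [0062].
reads = [
    ("chr5", 70247770, "AACCTG"),  # covers the true gene site, shows C
    ("chr5", 70247771, "ACCTGA"),  # covers the true gene site, shows C
    ("chr5", 69372350, "AACTTG"),  # covers the pseudogene site, shows T
    ("chr5", 69372340, "ACGT"),    # covers neither site
]

true_counts = count_bases_at(reads, "chr5", 70247773)
pseudo_counts = count_bases_at(reads, "chr5", 69372353)

# Group sizes: reads covering the true gene site and the pseudogene site.
true_group_size = sum(true_counts.values())
pseudo_group_size = sum(pseudo_counts.values())
```

Keeping the per-base counts, not only the group sizes, makes it possible to inspect which base the reads in each group actually carry.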

Embodiment 2

[0111] The method for analyzing true and false gene mutations based on high-throughput sequencing described in Embodiment 1 was used to retrospectively analyze 32,853 whole-exome sequencing samples from our laboratory. The analysis identified 125 homozygous patients (96 of whom came from the neuromuscular disease project, which is highly related to the SMN1 gene) and 1,129 heterozygous carriers.

[0112] The above results show that the true and false gene mutation analysis method of Embodiment 1 provides useful auxiliary reference information for handling pseudogenes in mutation analysis, which facilitates subsequent analysis and judgment.

Embodiment 3

[0114] A method for analyzing true and false gene mutations based on high-throughput sequencing, applied to the mutation analysis of the CYP21A2 / CYP21A1P genes. As shown in Figure 2, it includes the following steps:

[0115] 1. Obtain the difference sites.

[0116] Compare the reference sequences of the homologous true gene and pseudogene to obtain the sites at which they differ in hg19.

[0117] This pair of true and false genes has many difference sites. In this embodiment, 10 difference sites that have been identified as pathogenic are analyzed in the subsequent steps.

[0118] 2. Alignment of NGS data.

[0119] Obtain NGS sequencing data, align it to the reference genome sequence (hg19), and use the optimal alignment principle to obtain the true gene reads group covering the difference sites of the true gene and the pseudogene reads group covering the difference sites of the pseudogene, respectively. The gene reads group and the pseudogene reads ...


Abstract

The invention relates to a method for analyzing true and false gene mutations based on high-throughput sequencing and its application, and belongs to the technical field of bioinformatics. The method comprises the following steps: acquiring the difference sites between the reference sequences of homologous true genes and pseudogenes; aligning the NGS sequencing data against the difference sites to obtain the number of true gene reads and the number of pseudogene reads corresponding to the same difference site; taking the ratio of the true gene read count to the pseudogene read count at the same difference site as a judgment index; and carrying out mutation analysis and judgment on the true gene according to a predetermined strategy. With this method, mutations of true and false genes can be preliminarily screened, genes that may have problems can be identified, and a judgment can be made in combination with the actual clinical situation. This avoids performing MLPA or Sanger sequencing experiments on genes one by one and greatly saves experimental cost and time.
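The judgment index in the abstract can be sketched as a simple ratio followed by a threshold rule. The abstract does not state the predetermined strategy, so the cut-offs `hom_max` and `het_max`, their default values, and the returned labels are placeholders chosen purely for illustration; they are not the patent's criteria.

```python
def read_ratio(true_reads, pseudo_reads):
    """Ratio of true gene reads to pseudogene reads at one difference site.

    Returns None when no pseudogene reads cover the site, because the ratio
    is undefined in that case.
    """
    if true_reads < 0 or pseudo_reads < 0:
        raise ValueError("read counts must be non-negative")
    if pseudo_reads == 0:
        return None
    return true_reads / pseudo_reads


def judge(true_reads, pseudo_reads, hom_max=0.1, het_max=0.6):
    """Classify one site using hypothetical ratio cut-offs (placeholders)."""
    ratio = read_ratio(true_reads, pseudo_reads)
    if ratio is None:
        return "undetermined"
    if ratio <= hom_max:
        return "suspect homozygous"
    if ratio <= het_max:
        return "suspect heterozygous"
    return "no mutation suspected"


calls = [judge(0, 40), judge(20, 40), judge(40, 40), judge(10, 0)]
```

A result from such a rule is only a screening flag; as the abstract notes, the final judgment is made in combination with the actual clinical situation.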

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to a high-throughput sequencing-based method for analyzing true and false gene mutations and its application.

Background

[0002] Pseudogenes are nonfunctional remnants of gene families formed during evolution. A pseudogene has a DNA sequence similar to that of a normal gene but has lost normal function; it can be regarded as a nonfunctional genomic DNA copy that closely resembles the sequence of a coding gene. Generally, it is not transcribed and has no clear physiological significance.

[0003] However, some genes in the human genome have highly homologous pseudogenes, for example the homologous gene pairs SMN1 / SMN2 and CYP21A2 / CYP21A1P. In NGS sequencing, reads are aligned on the principle of optimal matching, so when bases in the true gene are mutated to the bases of the pseudogene, these ...


Application Information

IPC(8): G16B20/50, G16B20/30
CPC: G16B20/30, G16B20/50
Inventor: 刘晶星, 莫桂玲, 林晓红, 喻长顺, 于世辉, 严婷
Owner: GUANGZHOU KINGMED DIAGNOSTICS GRP CO LTD