Susceptible genotype detection method

A susceptible genotype detection technology, applied in the fields of computing, medicine, genetics, and bioinformatics, addresses problems such as the lack of an efficient analysis method and a cumbersome operation process, and achieves improved efficiency, strong portability, and reduced cost.

Inactive Publication Date: 2017-10-24
SHANGHAI INST OF MATERIA MEDICA CHINESE ACAD OF SCI
8 Cites 6 Cited by

AI-Extracted Technical Summary

Problems solved by technology

Big data analysis requires multi-disciplinary personnel as well as the connection and operation of various software packages, so the operation process is cumbersome
At present, there is no method that can efficiently analyze wh...

Method used

[0053] Since the susceptible genotype can serve as strong evidence for disease diagnosis at the genetic level, it can be used as a prenatal screening method for disease prevention. Therefore, after the exome sequence of the sample to be tested is prepared, susceptible genotype detection for type I neurofibromatosis can be performed according to the susceptible genotype detection method of the present invention, completing the steps of quality detection, comparative analysis, mutation detection, function annotation, advanced information analysis, and analysis report output. Specifically, the original sequencing data undergoes data quality inspection to generate a quality data report; at the same time, the sequence adapters of the gene sequences are removed, sequences with more than 5% N are filtered out, and sequences in which more than 20% of the bases have a quality value lower than 30 are filtered out. Perform comparative anal...

Abstract

The invention discloses a susceptible genotype detection method, which can be used for diagnosing type I neurofibromatosis diseases. The method comprises the steps of collecting a to-be-detected sample, capturing an exon region sequence of the to-be-detected sample, and forming original sequencing data; obtaining a sequence meeting a quality requirement according to a quality detection result of each sequence in the original sequencing data, and forming preliminary adjustment data; performing comparison according to each sequence in the preliminary adjustment data and a reference genomic sequence to obtain a comparison result, and forming mutation detection data; determining a gene sequence with a mutation gene in the mutation detection data and a mutation site corresponding to the mutation gene; and performing functional annotation on the mutation site, and determining whether the to-be-detected sample contains a to-be-detected susceptible genotype or not. According to the susceptible genotype detection method, whether the mutation site of the to-be-detected mutation gene exists in the to-be-detected sample or not can be efficiently detected, thereby providing guidance for clinical diagnosis of the diseases and prenatal screening.

Application Domain

Sequence analysis; Special data processing applications

Technology Topic

Neurofibromatosis; Mutation detection (+13 more)

Example Embodiment

[0023] In the following, the structure and working principle of the present invention will be further described in conjunction with the accompanying drawings.
[0024] As shown in Figure 1, the susceptible genotype detection method of the present invention includes the following steps:
[0025] S1. Collect the sample to be tested, capture the sequence of the exon region of the sample to be tested, and form the original sequencing data.
[0026] S2. Perform quality testing on each sequence in the original sequencing data one by one, and obtain sequences that meet the quality requirements based on the results of the quality testing to form preliminary adjustment data.
[0027] In the embodiment of the present invention, any one of the following bioinformatics software can be used for quality detection of the gene sequence: FastQC, Cutadapt, FASTX-Toolkit, bbmap.
[0028] Specifically, the quality of each sequence in the original sequencing data can be checked by calling bioinformatics software, and appropriate screening and trimming can be performed, including: removing sequences in which more than 5% of the bases are N; and removing sequences in which more than 20% of the bases have a quality value lower than 30. In the embodiment of the present invention, sequences that meet the quality requirements contain less than 5% N and have more than 80% of their bases at a quality value of 30 or higher (Q30), and the preliminary adjustment data consists only of sequences that meet the quality requirements.
[0029] In this embodiment of the present invention, before quality testing is performed on each gene sequence in the original sequencing data, the sequence adapter of each sequence should be removed, where the sequence adapter is the sample tag of each sequence.
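The two filtering thresholds in step S2 can be illustrated with a minimal Python sketch. It is not the tooling named in the patent (FastQC, Cutadapt, FASTX-Toolkit, bbmap); the function names, the Phred+33 quality encoding, and the choice to keep reads that sit exactly on a threshold are assumptions made for illustration only.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is an assumption)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_quality(sequence, quality_string,
                   max_n_fraction=0.05, min_quality=30, max_low_fraction=0.20):
    """Return True if a read survives both filters described in the text."""
    if not sequence or len(sequence) != len(quality_string):
        return False
    n_fraction = sequence.upper().count("N") / len(sequence)
    scores = phred_scores(quality_string)
    low_fraction = sum(1 for q in scores if q < min_quality) / len(scores)
    # Drop reads with more than 5% N, or with more than 20% of bases below Q30.
    return n_fraction <= max_n_fraction and low_fraction <= max_low_fraction


def filter_reads(reads):
    """Keep only (name, sequence, quality) records that pass the filters."""
    return [read for read in reads if passes_quality(read[1], read[2])]
```

The surviving records would correspond to the preliminary adjustment data; adapter removal is assumed to have been done beforehand by a dedicated tool.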
[0030] S3. Compare each sequence in the preliminary adjustment data with the reference genome sequence to obtain the comparison result to form the mutation detection data.
[0031] In the embodiment of the present invention, to compare each gene sequence in the preliminary adjustment data with the reference genome sequence, any one of the following bioinformatics software can be called: BWA, Samtools, Picard, GATK, QualiMAp, IGV, R.
[0032] Specifically, in order to obtain more accurate comparison results, each sequence in the preliminary adjustment data is compared with the reference genome sequence. Based on the comparison result of each sequence with the reference genome sequence, duplicate sequences that match the same part of the reference genome sequence are deleted from the preliminary adjustment data to prevent redundant data. The base quality scores of the remaining, duplicate-free sequences are then recalibrated to obtain the final mutation detection data, which serves as the input for the mutation detection analysis step.
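The duplicate-removal idea can be sketched as follows. This is a simplified illustration, not the behavior of Picard or any other tool listed above: the `AlignedRead` record, its fields, and the rule of keeping the read with the highest summed base quality are all assumptions. Base quality score recalibration is not covered by the sketch.

```python
from collections import namedtuple

# Hypothetical minimal record for a read after comparison with the reference.
AlignedRead = namedtuple("AlignedRead", "name chrom pos strand quals")


def remove_duplicates(reads):
    """Keep one read per (chrom, pos, strand), preferring the highest summed quality."""
    best = {}
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        current = best.get(key)
        if current is None or sum(read.quals) > sum(current.quals):
            best[key] = read
    # Return the survivors in coordinate order.
    return sorted(best.values(), key=lambda r: (r.chrom, r.pos, r.strand))
```

Reads that start at the same reference position on the same strand are treated as redundant copies, so only one representative reaches the mutation detection step.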
[0033] S4. Perform mutation detection analysis on each sequence in the mutation detection data, and determine the gene sequence with the mutation gene and the mutation site corresponding to the mutation gene in the mutation detection data.
[0034] In the embodiment of the present invention, any one of the following bioinformatics software can be used to perform mutation detection analysis on each gene sequence in the mutation detection data: GATK, BEDTools, VCFtools, bcftools. The mutation detection analysis can include single nucleotide polymorphism detection, insertion and deletion detection, and copy number variation detection.
[0035] Specifically, the corresponding bioinformatics software can be called to perform single nucleotide polymorphism detection, insertion and deletion detection, and copy number variation detection on each sequence in the mutation detection data, in order to find the gene mutation sites and, at the same time, determine the type of each mutation.
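As a rough illustration of how a mutation type can be assigned once a site has been found, the sketch below classifies a variant from its reference and alternate alleles. The function name and the VCF-style allele representation are assumptions; copy number variation cannot be inferred this way and is left out.

```python
def classify_variant(ref, alt):
    """Classify a variant from VCF-style REF and ALT allele strings."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "no_change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    # Same length, more than one base, and different content.
    return "substitution"
```

For example, `classify_variant("A", "G")` labels a single-base change, while `classify_variant("ATG", "A")` labels a two-base deletion.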
[0036] S5. Perform functional annotation of the mutation site and determine whether the sample to be tested contains the susceptible genotype to be tested.
[0037] In the embodiment of the present invention, any one of the following bioinformatics software can be used for functional annotation of the mutation site: Annovar, SnpEff, SnpSift. The annotation databases mainly include: refGene, cytoBAnd, gwasCatalog, clinvar, dbsnp138, etc.
[0038] Specifically, by calling the bioinformatics software, functional annotation can be performed on the gene region, intergenic region, and untranslated region of the mutation site. If the result of the functional annotation matches the susceptible genotype to be detected, it is determined that the sample to be tested contains the mutation site of the susceptible genotype to be tested.
[0039] In the embodiment of the present invention, the susceptible genotype to be detected is the susceptible genotype of type I neurofibromatosis, which includes the AA genotype at rs1801052 and the AA genotype at rs1129506 in the type I neurofibromatosis gene. That is, when the sample to be tested is found to contain either of these two genotypes, it can be considered that a high-risk mutation site for type I neurofibromatosis exists in the sample, and the patient to be tested may belong to the high-risk group for type I neurofibromatosis.
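The matching rule in this embodiment (either of the two genotypes is sufficient) can be written as a short lookup. The two rs identifiers and the AA genotypes come from the text; the function names and the input format, a dictionary mapping rs identifiers to genotype strings such as "AG" or "A/A", are assumptions for illustration.

```python
# Susceptible genotypes named in the embodiment.
SUSCEPTIBLE_GENOTYPES = {"rs1801052": "AA", "rs1129506": "AA"}


def find_susceptible_genotypes(sample_genotypes, targets=SUSCEPTIBLE_GENOTYPES):
    """Return the rs identifiers at which the sample carries the susceptible genotype."""
    hits = []
    for rsid, risk_genotype in targets.items():
        observed = sample_genotypes.get(rsid)
        if observed is None:
            continue
        # Accept "AA", "A/A" and "A|A" style genotype strings.
        alleles = observed.upper().replace("/", "").replace("|", "")
        if sorted(alleles) == sorted(risk_genotype):
            hits.append(rsid)
    return hits


def is_high_risk(sample_genotypes):
    """True if the sample carries at least one of the susceptible genotypes."""
    return len(find_susceptible_genotypes(sample_genotypes)) > 0
```

A sample with `{"rs1801052": "AA", "rs1129506": "AG"}` would be flagged through the first site only.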
[0040] In another embodiment of the present invention, when it is determined that the sample to be tested contains the susceptible genotype to be tested, the test result can also be verified against the whole-exome sequences of the patient's immediate family members. Testing the gene sequences of immediate family members can improve the accuracy of the test result from a genetic perspective.
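The patent does not say how the family data is used for verification. One plausible check, shown here purely as an assumption, is Mendelian consistency: each available parent must carry at least one allele that the child could have inherited. The function name and the two-letter genotype strings are hypothetical.

```python
def consistent_with_parents(child, father=None, mother=None):
    """Check whether a child's two-allele genotype can be inherited from the parents.

    Returns None when no parental genotype is available.
    """
    first, second = child[0], child[1]
    if father is not None and mother is not None:
        # One allele must come from each parent, in either order.
        return (first in father and second in mother) or \
               (second in father and first in mother)
    parent = father if father is not None else mother
    if parent is None:
        return None
    # With a single parent, at least one child allele must be present in that parent.
    return first in parent or second in parent
```

An inconsistent result does not by itself prove that the call is wrong; in this sketch it only flags the site for review.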
[0041] In another embodiment of the present invention, additional detection can also be performed on the functionally annotated gene sequences carrying mutant genes, to obtain one or more related results on mutation harmfulness, candidate genes, and protein mutations. Specifically, according to the user's requirements, the corresponding bioinformatics software can be called to complete the corresponding analysis and output the corresponding results.
[0042] The additional detection includes at least:
[0043] Mutation harmfulness analysis, which ranks mutations by harmfulness according to the influence of each mutation on gene function;
[0044] Candidate gene and disease relevance ranking, which relies on the database annotation results to evaluate the impact of each mutation on the corresponding disease, especially on NF1 disease;
[0045] Candidate gene function annotation;
[0046] Candidate gene function enrichment analysis, which completes the detection and screening of candidate genes by calling the function annotations and the built-in script library, and at the same time reconstructs the biological function pathways of the selected high-confidence genes;
[0047] Protein mutation prediction, which predicts the impact on the three-dimensional structure from the predicted changes in the primary structure of the protein.
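The harmfulness ranking in [0043] can be sketched as a sort over annotated variants. The patent does not define a scoring scheme, so this example assumes each variant already carries an impact label in the style of the SnpEff categories (HIGH, MODERATE, LOW, MODIFIER); the dictionary layout and function name are hypothetical.

```python
# Lower rank means more harmful; labels follow the SnpEff-style impact categories.
IMPACT_RANK = {"HIGH": 0, "MODERATE": 1, "LOW": 2, "MODIFIER": 3}


def rank_by_impact(variants):
    """Sort annotated variants from most to least harmful.

    Variants with an unrecognized impact label are placed last.
    """
    return sorted(variants,
                  key=lambda v: IMPACT_RANK.get(v["impact"], len(IMPACT_RANK)))
```

Because Python's sort is stable, variants that share an impact label keep their original relative order.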
[0048] In another embodiment of the present invention, a quality data report, a comparison data report, a mutation data report, and a mutation function evaluation report can also be compiled separately from the results of steps S2-S5, and a susceptible genotype test result report can be output based on these four reports.
[0049] While the quality of each gene sequence in the original sequencing data is tested, it is also evaluated whether the quality of the trimmed data meets the requirements of the subsequent analysis process, yielding the quality data report. The quality data report mainly includes the base quality distribution of the sequenced fragments, the distribution of the four bases in the sequenced fragments, the GC content of the sequenced fragments, etc. While each gene sequence in the preliminary adjustment data is compared with the reference genome sequence, the corresponding analysis results are obtained to produce the comparison data report. The comparison data report includes statistics on the comparison rate, exon coverage depth and distribution, an evaluation of exon region capture specificity, statistics on the insert size distribution, etc., which are used to evaluate the results of the sequencing experiment and bear on the reliability of the subsequent mutation site detection results. After mutation detection analysis of each gene sequence in the mutation detection data, the gene mutation sites found can be counted and the corresponding Venn diagram drawn to obtain the mutation data report. After the mutation sites are functionally annotated, the mutation function evaluation report is obtained from the results of the functional annotation.
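Two of the quality report metrics, the base content distribution and the GC content, are simple enough to sketch directly. The function names are hypothetical, and excluding N from the GC denominator is an assumption rather than something the patent specifies.

```python
from collections import Counter


def base_composition(sequences):
    """Fraction of A, C, G, T and N across a collection of reads."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: counts.get(base, 0) / total for base in "ACGTN"}


def gc_content(sequence):
    """GC fraction of one read, computed over called bases (N is excluded)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called
```

A per-read list of `gc_content` values could then be binned into a histogram for the report.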
[0050] Finally, the quality data report, comparison data report, mutation data report, and mutation function evaluation report are integrated into written reports and data files, and a susceptible genotype test result report with professional annotations is output, and the final analysis result is displayed to the user. Data storage and backup can be performed.
[0051] According to an embodiment of the present invention, the susceptible genotype detection method of the present invention can be used to detect whether a patient has the susceptible genotype of type I neurofibromatosis. The detection method mainly includes the steps of genomic DNA sample preparation, library construction, quality inspection, comparison analysis, mutation detection, function annotation, advanced information analysis, and analysis report output.
[0052] First, the genomic DNA of the sample to be tested is randomly broken into fragments of 150 to 200 bp to prepare multiple sequences of the sample, and a library is then constructed from these sequences. The library sequences are hybridized with biotin-labeled DNA probes specific to the exon regions and captured by magnetic beads that bind the capture probes; finally, the captured sequences are eluted from the magnetic beads to obtain the sequence fragments of the target region. For the specific capture process, refer to: SureSelectXT Target Enrichment System for Illumina Paired-End Sequencing Library, Illumina HiSeq and MiSeq Multiplexed Sequencing Platforms, Protocol Version 1.3.1, February 2012.
[0053] Since the susceptible genotype can serve as strong evidence for disease diagnosis at the genetic level, it can be used as a prenatal screening method for disease prevention. Therefore, after the exome sequence of the sample to be tested is prepared, susceptible genotype detection for type I neurofibromatosis can be performed according to the susceptible genotype detection method of the present invention, completing the steps of quality detection, comparison analysis, mutation detection, function annotation, advanced information analysis, and analysis report output. Specifically, the original sequencing data undergoes data quality testing to generate a quality data report; at the same time, the sequence adapters of the gene sequences are removed, sequences with more than 5% N are filtered out, and sequences in which more than 20% of the bases have a quality value lower than 30 are filtered out. The preliminary adjustment data obtained after the quality inspection process is compared with the reference genome sequence, and a comparison data report is generated. Mutation detection analysis, covering single nucleotide polymorphisms, insertions and deletions, and copy number variations, is performed on the mutation detection data obtained after the comparison analysis. The detected mutation sites are functionally annotated to evaluate their function. In the patient's NF1 gene, the genotype of rs1801052 was identified as AA and the genotype of rs1129506 as AA. It is therefore considered that a high-risk mutation site for type I neurofibromatosis exists in the sample to be tested, and that the patient is likely to have type I neurofibromatosis. Finally, the result is verified against the whole exome sequencing data of the father and/or mother to increase the accuracy and reliability of the test result.
[0054] The above is only a schematic description of the present invention. Those skilled in the art should know that various improvements can be made to the present invention without departing from the working principle of the present invention, which all fall within the protection scope of the present invention.
