Pathogenic gene locus database and establishment method thereof

A disease-causing gene locus database and a method for establishing it, applied in genomics, instrumentation, proteomics, etc. It addresses problems such as mutations of unknown significance, neglected sites, and the large total number of rare loci.

Pending Publication Date: 2020-10-20
GUANGZHOU KINGMED DIAGNOSTICS CENT +1

AI Technical Summary

Problems solved by technology

[0004] In practice, it is difficult to collect enough samples of uncommon loci for pathogenicity research, so they are not included in the database. However, because of the diversity of gene mutations and disease symptoms (different mutations of the same gene may cause different symptoms) and their heterogeneity (one symptom may be caused by mutations in several different genes), the proportion of disease-causing sites discovered so far is very low; that is, the significance of many mutations is unknown. Although each of these loci is individually rare, their total number is large.
[0005] These unverified data nevertheless play a very important role in flagging possible pathogenic gene mutations. If genetic testing relies only on the common sites included in the database, many meaningful sites will be ignored. The impact is especially large for compound heterozygous disease-causing genes: detection becomes much more difficult and diagnostic efficiency falls.

Examples

Embodiment 1

[0042] A disease-causing gene locus database is established by the following method:

[0043] 1. Obtain reference data.

[0044] Obtain clinically verified disease-causing gene locus data information as reference data.

[0045] For example, clinically verified disease-causing gene locus data can be obtained from databases of pathogenic loci such as HGMD and ClinVar. In this embodiment, the HGMD database is used as the basis for expansion.

[0046] 2. Expand to obtain mutation site data.

[0047] Amino acid changes caused by base mutations are the most common type of mutation recorded in databases of disease-causing sites. Several different base mutations can often produce the same amino acid change. Some sites cause disease only when mutated to a specific amino acid, some only when mutated to a stop codon, and some whenever the amino acid changes at all, but only sites with published research results are includ...

Example 1

[0056] For example, the 268th amino acid of the DMD gene is Leu, encoded by the codon TTA, and the database contains only one record for it, namely [c.804A>C; p.Leu268Phe]. According to the codon table, nine single-base mutations are possible at this codon:

[0057] 1) Two termination mutations: [c.803T>A; p.Leu268Term], [c.803T>G; p.Leu268Term];

[0058] 2) Five missense mutations: [c.804A>C; p.Leu268Phe], [c.804A>T; p.Leu268Phe], [c.803T>C; p.Leu268Ser], [c.802T>A; p.Leu268Ile], [c.802T>G; p.Leu268Val];

[0059] 3) Two synonymous mutations: [c.804A>G; p.Leu268Leu], [c.802T>C; p.Leu268Leu];

[0060] Comparison with the reference pathogenic record [c.804A>C; p.Leu268Phe] in the HGMD database shows that [c.804A>T; p.Leu268Phe] can be extended to class I; [c.803T>A; p.Leu268Term] and [c.803T>G; p.Leu268Term] can be extended to class II; and [c.803T>C; p.Leu268Ser], [c.802T>A; p.Leu268Ile], and [c.802T>G; p.Leu268Val] can be extended to class III...
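The enumeration in Example 1 can be reproduced with a short sketch. It is an illustration only, not the applicant's implementation: the function name `expand_codon` and the output layout are hypothetical, and it assumes the standard genetic code and that residue N occupies coding positions 3(N-1)+1 to 3N, which agrees with c.802 to c.804 for residue 268.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter names; "Term" follows the notation used in the text.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Term",
}


def expand_codon(codon, aa_pos):
    """Group the nine single-base substitutions of `codon` by their effect.

    Hypothetical helper: `aa_pos` is the 1-based residue number, and the
    first base of the codon is assumed to sit at coding position 3*(aa_pos-1)+1.
    """
    ref_aa = CODON_TABLE[codon]
    first = 3 * (aa_pos - 1) + 1
    groups = {"termination": [], "missense": [], "synonymous": []}
    for i, ref_base in enumerate(codon):
        for alt in BASES:
            if alt == ref_base:
                continue
            alt_aa = CODON_TABLE[codon[:i] + alt + codon[i + 1:]]
            if alt_aa == ref_aa:
                effect = "synonymous"
            elif alt_aa == "*":
                effect = "termination"
            else:
                effect = "missense"
            label = (f"c.{first + i}{ref_base}>{alt}; "
                     f"p.{THREE[ref_aa]}{aa_pos}{THREE[alt_aa]}")
            groups[effect].append(label)
    return groups


leu268 = expand_codon("TTA", 268)
for effect, variants in leu268.items():
    print(effect, len(variants), variants)
```

For the TTA codon at residue 268 this yields two termination, five missense, and two synonymous substitutions, the same split as listed in the example.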

Example 2

[0062] The 333rd amino acid of the DMD gene is Ser, encoded by the codon TCA, and the database contains only one record for it, namely [c.998C>A; p.Ser333Term]. According to the codon table, nine single-base mutations are possible at this codon:

[0063] 1) Two termination mutations: [c.998C>A; p.Ser333Term], [c.998C>G; p.Ser333Term];

[0064] 2) Four missense mutations: [c.998C>T; p.Ser333Leu], [c.997T>C; p.Ser333Pro], [c.997T>A; p.Ser333Thr], [c.997T>G; p.Ser333Ala];

[0065] 3) Three synonymous mutations: [c.999A>T; p.Ser333Ser], [c.999A>C; p.Ser333Ser], [c.999A>G; p.Ser333Ser];

[0066] Comparison with the reference pathogenic record [c.998C>A; p.Ser333Term] in the HGMD database shows that [c.998C>G; p.Ser333Term] can be extended to class I and also to class II, because class I overlaps with class II here; that is to say, the mutation in the original database is a termination muta...
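The class assignments in the two examples can be sketched as a small rule set. The class definitions are cut off in this excerpt, so the rules below are inferred from the examples and are assumptions: class I for the same amino acid change as the reference record reached by a different base change, class II for a change to a stop codon, class III for any other amino acid change, and no class for synonymous changes. How missense changes are treated when the reference record is itself a termination is not visible here; labelling them class III is a further assumption. The function name `classify` and the tuple layout are hypothetical.

```python
def classify(candidate, reference):
    """Return the set of expansion classes for one candidate substitution.

    Both arguments are (c_change, ref_aa, alt_aa) tuples, for example
    ("c.804A>T", "Leu", "Phe"). The rules are inferred from the two worked
    examples and are assumptions, not the patent's stated definitions.
    """
    _, ref_aa, alt_aa = candidate
    if candidate == reference:
        return {"reference"}      # already recorded in the source database
    if alt_aa == ref_aa:
        return set()              # synonymous change: not expanded
    classes = set()
    if alt_aa == reference[2]:
        classes.add("I")          # same amino acid outcome as the reference
    if alt_aa == "Term":
        classes.add("II")         # change to a stop codon
    if not classes:
        classes.add("III")        # any other amino acid change
    return classes


# Example 1: the reference record is a missense change.
ref1 = ("c.804A>C", "Leu", "Phe")
print(classify(("c.804A>T", "Leu", "Phe"), ref1))   # {'I'}
print(classify(("c.803T>A", "Leu", "Term"), ref1))  # {'II'}
print(classify(("c.803T>C", "Leu", "Ser"), ref1))   # {'III'}

# Example 2: the reference record is a termination, so class I and II overlap.
ref2 = ("c.998C>A", "Ser", "Term")
print(sorted(classify(("c.998C>G", "Ser", "Term"), ref2)))  # ['I', 'II']
```

Returning a set rather than a single label keeps the overlap described in Example 2 explicit.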

Abstract

The invention relates to a pathogenic gene locus database and a method for establishing it, and belongs to the technical field of disease gene detection. The method comprises the following steps: acquiring clinically verified pathogenic gene locus data as reference data; identifying, in the reference data, gene loci that are pathogenic because of an amino acid change, and expanding the codons of the amino acids at those loci; identifying, in the reference data, gene loci that are pathogenic because of a splice-site change, and expanding the other mutation forms of those loci; and screening the expanded data by removing loci whose mutation frequency in the population is higher than a predetermined threshold, keeping the remaining high-risk pathogenic mutation loci and high-risk pathogenic splicing loci, and combining them with the reference data to form the pathogenic gene locus database. The database contains a large number of records for sites with high pathogenic risk, which reduces the chance of omission and greatly improves the accuracy and efficiency of clinical interpretation.
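The final screening and merging step of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the field names `variant` and `pop_freq`, and the function name `build_database` are hypothetical, the threshold is passed in because the excerpt does not state its value, and the frequencies in the usage lines are invented toy numbers, not measured data.

```python
def build_database(reference, expanded, max_pop_freq):
    """Combine reference records with expanded records that pass screening.

    Expanded records whose population frequency is higher than
    `max_pop_freq` are removed; the rest are merged with the reference
    data. A reference record wins if the same variant appears in both.
    """
    merged = {rec["variant"]: rec for rec in reference}
    for rec in expanded:
        if rec["pop_freq"] > max_pop_freq:
            continue  # too frequent in the population: dropped
        merged.setdefault(rec["variant"], rec)
    return list(merged.values())


# Toy input: variant names come from Example 1, frequencies are made up.
reference = [{"variant": "DMD c.804A>C", "source": "reference"}]
expanded = [
    {"variant": "DMD c.804A>T", "source": "expanded", "pop_freq": 0.0001},
    {"variant": "DMD c.803T>C", "source": "expanded", "pop_freq": 0.2},
]

database = build_database(reference, expanded, max_pop_freq=0.01)
print([rec["variant"] for rec in database])
```

With the toy threshold of 0.01, the record with frequency 0.2 is dropped and the other expanded record is kept alongside the reference record.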

Description

Technical field

[0001] The invention relates to the technical field of disease gene detection, and in particular to a disease-causing gene locus database and a method for establishing it.

Background

[0002] Gene mutations are divided into polymorphic and pathogenic mutations. Each person's genome carries about 4 million mutations, most of which are normal non-pathogenic sites, namely polymorphic sites. A pathogenic site, by contrast, has to be verified through a complex process, and such sites accumulate only over a long period.

[0003] At present, many databases collect pathogenic loci, such as HGMD and ClinVar, but they record mutations that have actually occurred, that is, mutations supported by real sample cases and confirmed by comparison with clinical symptoms and verification. As a result, most of the sites in these databases are relatively common sites.

[0004] In practice, uncommon loci are difficult to collect enough s...

Application Information

IPC(8): G16B50/00, G16B20/50, G16B20/10
CPC: G16B50/00, G16B20/50, G16B20/10
Inventor: 刘晶星, 于世辉, 喻长顺
Owner GUANGZHOU KINGMED DIAGNOSTICS CENT