Gene and phenotype association knowledge base and establishment method and application thereof

A gene-phenotype association knowledge base, applied in the field of bioinformatics, addresses the money, manpower and time consumed in linking genes to phenotypes; no method of constructing such association information had previously been reported.

Pending Publication Date: 2021-04-20
BEIJING GRANDOMICS BIOTECH

AI Technical Summary

Problems solved by technology

In practical applications, determining the candidate gene that causes a disease from the phenotype usually requires sequencing the individual, picking out the variant sites closely related to the phenotype from a large number of variant sites, and comparing them against database information or the scientific literature. This comparative analysis consumes a great deal of money, manpower and time.
[0004] With the development of computer technology, bioinformatics has advanced rapidly, and the emergence of various biological databases has made it possible to associate genes and phenotypes computationally. However, no method of constructing gene-phenotype association information has been reported yet.

Examples

Embodiment 1

[0053] This embodiment discloses a method for constructing a gene-phenotype association knowledge base, including the following steps:

[0054] S1: Acquire document entities;

[0055] S2: Determine and identify the document type;

[0056] S3: Extract the gene entries and phenotype entries from the document entities to obtain the document corpus;

[0057] S4: Store the associations between genes and phenotypes to obtain the gene-phenotype association knowledge base.

[0058] Specific steps are as follows:

[0059] S1: Acquire document entities

[0060] The PubMed database (https://www.ncbi.nlm.nih.gov/pubmed/) was used to collect literature title and abstract information. Compared with full-text literature, titles and abstracts carry a smaller volume of information and can be analyzed more efficiently. As of July 2018, a total of 27,853,513 articles had been obtained.
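The text does not say how the titles and abstracts were retrieved. As a minimal sketch, assuming retrieval through the public NCBI E-utilities endpoints (`esearch.fcgi` to list PubMed IDs, `efetch.fcgi` to fetch abstracts), the request URLs could be built as follows. The helper names and the paging values are hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service (an assumption: the source
# only names the PubMed website, not the retrieval interface).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retstart=0, retmax=100):
    """Build a URL that lists PubMed IDs matching `term`, one page at a time."""
    params = {
        "db": "pubmed",
        "term": term,
        "retstart": retstart,  # offset of the first record on this page
        "retmax": retmax,      # number of IDs per page
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pmids):
    """Build a URL that fetches title and abstract records for the given IDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "rettype": "abstract",
        "retmode": "xml",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("Marfan syndrome"))
    print(build_efetch_url([123, 456]))
```

For a collection on the scale described, a bulk download of PubMed data would likely be more practical than paging through search results; the sketch only illustrates the title-and-abstract level of retrieval.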

[0061] S2: Determine the type of document

[0062] Before judging the document type, a filtering step is...

Embodiment 2

[0101] This embodiment discloses a gene-phenotype association knowledge base. The gene-phenotype association knowledge base includes a document acquisition unit, a document type judgment unit, an entry extraction unit and a storage unit.

[0102] The document acquisition unit is used to acquire document entities. The document type judgment unit is used to judge and identify the document type. The entry extraction unit is used to extract gene entries and phenotype entries from the document entities to obtain the document corpus. The storage unit is used to store the associations between genes and phenotypes, yielding the gene-phenotype association knowledge base.
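The four units can be pictured as a small pipeline. The following is an illustrative sketch only: the toy vocabularies, the keyword rule for document type, and all class and function names are hypothetical stand-ins, since the actual classification and extraction methods are not given in this text.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical toy vocabularies; a real system would use curated gene and
# phenotype dictionaries.
GENE_TERMS = {"FBN1", "CFTR"}
PHENOTYPE_TERMS = {"scoliosis", "arachnodactyly"}


@dataclass
class Document:
    pmid: str
    title: str
    abstract: str
    doc_type: str = "unknown"


def acquire_documents(records):
    """Document acquisition unit: wrap raw records as Document entities."""
    return [Document(r["pmid"], r["title"], r["abstract"]) for r in records]


def judge_type(doc):
    """Document type judgment unit (placeholder rule, not the source's method)."""
    return "review" if "review" in doc.title.lower() else "research"


def extract_entries(doc):
    """Entry extraction unit: find gene and phenotype entries by dictionary lookup."""
    text = f"{doc.title} {doc.abstract}"
    tokens = set(re.findall(r"[A-Za-z0-9]+", text))
    genes = sorted(GENE_TERMS & tokens)
    lowered = text.lower()
    phenotypes = sorted(p for p in PHENOTYPE_TERMS if p in lowered)
    return genes, phenotypes


class AssociationStore:
    """Storage unit: maps (gene, phenotype) pairs to supporting PubMed IDs."""

    def __init__(self):
        self._pmids = defaultdict(set)

    def add(self, gene, phenotype, pmid):
        self._pmids[(gene, phenotype)].add(pmid)

    def support(self, gene, phenotype):
        """Number of documents in which the pair co-occurs."""
        return len(self._pmids.get((gene, phenotype), ()))


def build_knowledge_base(records):
    """Run the four units in order and return the filled store."""
    store = AssociationStore()
    for doc in acquire_documents(records):
        doc.doc_type = judge_type(doc)
        genes, phenotypes = extract_entries(doc)
        for gene in genes:
            for phenotype in phenotypes:
                store.add(gene, phenotype, doc.pmid)
    return store
```

Recording co-occurrence within one title or abstract is the simplest possible association rule and is used here only to make the data flow between the units concrete.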

Embodiment 3

[0104] This embodiment discloses a method for quantifying the relationship between genes and phenotypes, using either the gene-phenotype association knowledge base constructed by the method described in Embodiment 1 or the gene-phenotype association knowledge base described in Embodiment 2. The method includes the following steps:

[0105] (1) Extract the association information of target phenotype and target gene

[0106] (2) Calculate the amount of information of each phenotype and each gene separately

[0107] The amount of information P_y of phenotype y is calculated by a formula (omitted in this text) in which G_y is the number of genes associated with phenotype y and G_total is the total number of genes in the full gene set. The parent phenotype of phenotype y is phenotype z, and G_z is the number of genes associated with phenotype z. Here, the parent phenotype is the higher-level phenotype that contains phenotype y, and the parent phenotype data come from the HPO database. For example, under the HP:0012647 (abnormal inflamm...
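Because the formula itself is missing from this text, it cannot be reproduced. As a stand-in, the sketch below uses a common ontology-style definition of information content, the negative logarithm of the fraction of genes annotated to a phenotype. This is an assumption and not the formula of this embodiment, which also involves the parent-phenotype count G_z that the sketch does not model. The function name is hypothetical.

```python
import math


def information_content(g_y, g_total):
    """Stand-in information content: -ln(G_y / G_total).

    g_y     -- number of genes associated with phenotype y
    g_total -- total number of genes in the gene set
    A phenotype linked to few genes is more specific and scores higher.
    """
    if g_total <= 0 or not 0 < g_y <= g_total:
        raise ValueError("require 0 < g_y <= g_total")
    return -math.log(g_y / g_total)


if __name__ == "__main__":
    # A phenotype tied to 10 genes is more informative than one tied to 1,000.
    print(information_content(10, 20000))
    print(information_content(1000, 20000))
```

Under this stand-in, a phenotype annotated to every gene carries no information, and the score grows as the phenotype becomes rarer.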

Abstract

The invention belongs to the field of bioinformatics, and particularly relates to a gene and phenotype association knowledge base and an establishment method and application thereof. The gene and phenotype association knowledge base integrates, extracts and stores a large number of association relationships between genes and corresponding phenotypes, realizes automatic quantification of the association relationships between the genes and the phenotypes in combination with an algorithm, and can be applied to the field of scientific research or medical science. Compared with a traditional mode, the knowledge base is reliable, universal, flexible and efficient, manpower and material resources are saved, and practicability is high.

Description

Technical field

[0001] The invention belongs to the field of bioinformatics, and in particular relates to a gene-phenotype association knowledge base and its construction method and application.

Background technique

[0002] Mendel's laws of inheritance, comprising the law of segregation and the law of independent assortment, were published in 1865 by Gregor Mendel, a geneticist of the Austrian Empire. Phenotypes or traits that conform to Mendel's laws can also be called monogenic traits. They occur widely across living organisms, for example animal coat color, awned or awnless rice grains, and single or double eyelids in humans. Typically, such phenotypic differences are due to one or a few mutations in a single gene.

[0003] Specifically, taking human Mendelian diseases as an example, about 7,000 Mendelian diseases have been discovered, of which only about 5,000 have relevant information on genetic molecular mechanisms. In practical applications, in order to determine the candida...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00, G16B50/30, G16B40/00
Inventor: 朱赢, 朱沁, 汪德鹏
Owner: BEIJING GRANDOMICS BIOTECH