A method for gene association analysis based on a deep learning algorithm

A deep learning and association analysis technology, applied in the field of bioinformatics, that addresses problems such as the inability to handle variable-length sequences, while achieving high accuracy, saving time, and reducing burden.

Active Publication Date: 2020-07-17
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

However, the disadvantage of the traditional method is that it cannot handle variable-length sequences and can only take fixed-length sequence fragments as input. Although good prediction performance has been obtained this way, subsequent studies found that residues separated by large distances in the sequence can interact with each other, and researchers have to account for the resulting errors.

Examples

Embodiment Construction

[0036] With reference to Figure 1 of the accompanying drawings, the technical solution of the present invention is described in detail below through an embodiment; the present invention is not, however, limited to the following embodiments.

[0037] Step 1: Based on existing biological knowledge, each chromosome is segmented according to its gene distribution, and the valid position interval for the SNPs of each gene is obtained from the location of that gene. These intervals are used later to segment the SNPs. Sample genes from the CEU population (Northern Europeans from Utah) are used for the simulation.

[0038] Step 2: Assume a population-based case-control design with n independent individuals. The gene sequences of these n individuals are translated into SNPs at the chromosome level to obtain the required input data.
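
The source does not say how the SNPs in Step 2 are turned into numeric input. As a minimal sketch, assuming the common additive coding (each genotype becomes the number of copies of the minor allele: 0, 1 or 2), the translation could look like the following. The function names `encode_additive` and `encode_matrix` and the genotype-string input format are hypothetical, chosen for illustration only.

```python
from collections import Counter


def encode_additive(genotypes):
    """Encode one SNP across individuals as minor-allele counts (0, 1 or 2).

    genotypes: list of two-letter strings such as "AG", one per individual.
    Assumption: additive coding; the patent text does not specify the coding.
    """
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) == 1:
        # Monomorphic site: no minor allele is present in this sample.
        return [0] * len(genotypes)
    # Minor allele = least frequent; ties are broken alphabetically.
    minor = min(allele_counts, key=lambda a: (allele_counts[a], a))
    return [g.count(minor) for g in genotypes]


def encode_matrix(samples):
    """samples: n individuals, each a list of genotype strings (one per SNP).

    Returns an n x m matrix (list of rows) of minor-allele counts.
    """
    n_snps = len(samples[0])
    columns = [encode_additive([ind[j] for ind in samples]) for j in range(n_snps)]
    return [[columns[j][i] for j in range(n_snps)] for i in range(len(samples))]
```

Each row of the resulting matrix is one individual and each column one SNP, so case and control labels can be kept in a separate list of length n.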

[0039] Step 3: As shown in Figure 4, according to the position information obtained in step 1, the SNP sequence obtained in step 2 is grouped according to the ...
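
The grouping in Steps 1 and 3 amounts to assigning each SNP to the gene interval that contains its position. The sketch below is one way to do that, assuming non-overlapping gene intervals with inclusive start and end coordinates on a single chromosome; the function name `group_snps_by_gene`, the tuple layout, and the example gene names are hypothetical and not taken from the patent.

```python
from bisect import bisect_right


def group_snps_by_gene(gene_intervals, snp_positions):
    """Group SNP indices by the gene interval containing each SNP position.

    gene_intervals: list of (gene_name, start, end), assumed non-overlapping,
                    with inclusive coordinates on one chromosome.
    snp_positions:  list of SNP base-pair positions; the list index identifies
                    the SNP column in the input data.
    SNPs that fall outside every interval are left ungrouped.
    """
    intervals = sorted(gene_intervals, key=lambda g: g[1])
    starts = [start for _, start, _ in intervals]
    groups = {name: [] for name, _, _ in intervals}
    for index, position in enumerate(snp_positions):
        # Rightmost interval whose start is <= position.
        k = bisect_right(starts, position) - 1
        if k >= 0 and position <= intervals[k][2]:
            groups[intervals[k][0]].append(index)
    return groups


example_genes = [("GENE_A", 100, 200), ("GENE_B", 300, 400)]
example_positions = [50, 100, 150, 250, 300, 400, 450]
example_groups = group_snps_by_gene(example_genes, example_positions)
```

Because each gene holds a different number of SNPs, the resulting SNP sets have different lengths, which is why a model that accepts variable-length input is needed downstream.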

Abstract

The invention discloses a method for gene association analysis based on a deep learning algorithm. SNP (single nucleotide polymorphism) set analysis needs to take as reference SNP information that comes from different positions in the same individual but is related, so the SNPs of each individual are divided into multiple units according to existing biological knowledge. First, all SNPs are divided into multiple SNP sets at the whole-chromosome level according to biological knowledge, such as the principle of proximity to genomic features. After the division, each SNP set is input into a bidirectional LSTM (long short-term memory) network. This network is a recurrent neural network: its state contains the information from the previous time step and also serves as the basis for the weight change at the next time step. After the LSTM network has been trained, the degree of attention that each part of the input data requires can be output through network calculation. The method has better sensitivity and specificity and opens up a new field for development and research in clinical medicine, genetic epidemiology and preventive medicine.
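
To make the data flow in the abstract concrete, the following is a minimal, untrained sketch of a bidirectional LSTM followed by a softmax attention layer, written in plain Python. It is an illustration under stated assumptions and not the patented network: the weights are random, each SNP is fed as a single numeric feature (for example a 0/1/2 code), and the attention score is a simple dot product with a hypothetical vector `attn`. All names (`init_params`, `lstm_step`, `run_lstm`, `bilstm_attention`) are invented for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_size, hidden_size, rng):
    """Random, untrained weights for the four LSTM gates (illustration only)."""
    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
    return {gate: (matrix(), [0.0] * hidden_size) for gate in ("i", "f", "o", "g")}


def lstm_step(x, h, c, params):
    """One LSTM step: the new state mixes the previous state with the new input."""
    z = x + h  # concatenate the input and the previous hidden state

    def gate(name, activation):
        rows, bias = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(rows, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(sequence, params, hidden_size):
    """Run one direction over a sequence of any length; return all hidden states."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    outputs = []
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


def bilstm_attention(snp_set, fwd, bwd, attn_vector, hidden_size):
    """Return (attention weight per SNP, attention-pooled summary vector)."""
    sequence = [[float(v)] for v in snp_set]  # one feature per SNP (assumption)
    forward = run_lstm(sequence, fwd, hidden_size)
    backward = run_lstm(sequence[::-1], bwd, hidden_size)[::-1]
    states = [hf + hb for hf, hb in zip(forward, backward)]
    scores = [sum(a * s for a, s in zip(attn_vector, state)) for state in states]
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    summary = [sum(w * state[k] for w, state in zip(weights, states))
               for k in range(2 * hidden_size)]
    return weights, summary


rng = random.Random(0)
HIDDEN = 4
fwd_params = init_params(1, HIDDEN, rng)
bwd_params = init_params(1, HIDDEN, rng)
attn = [rng.uniform(-0.5, 0.5) for _ in range(2 * HIDDEN)]
```

Calling `bilstm_attention([0, 1, 2, 1, 0], fwd_params, bwd_params, attn, HIDDEN)` returns one weight per SNP, and the same parameters accept an SNP set of any other non-zero length. In practice the weights would be learned from case and control labels rather than drawn at random.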

Description

Technical field

[0001] The invention specifically relates to a gene association analysis method based on an LSTM (Long Short-Term Memory) network. The method is based on a deep convolutional neural network and a recurrent neural network model and belongs to the technical field of bioinformatics.

Background technique

[0002] Research on the association between the chromosomal base pairs of genes and pathogenicity has long been one of the core topics of bioinformatics. It involves mining data from huge databases, gaining a deep understanding of the complexity of organisms, and analyzing as much as possible with existing knowledge and data. However, because of gene polymorphism, two or more discontinuous variants, genotypes, or alleles often coexist in a biological population. Using machine learning methods, which are effective and intelligent, to study gene polymorphism can therefore open up new research for the development of clinical medicine, genetics and preven...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B20/20, G16B20/30, G16B50/00
CPC: G16B20/00, G16B40/00
Inventor: 颜成钢, 盛再超, 彭冬亮, 薛安克
Owner: HANGZHOU DIANZI UNIV