Method for gene association analysis based on a deep learning algorithm

A deep learning and association analysis technology, applied in the field of bioinformatics, that addresses the inability of traditional methods to process variable-length sequences and achieves high accuracy while saving time and reducing burden.

Active Publication Date: 2017-08-08
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

However, the traditional method cannot handle variable-length sequences and can only take fixed-length sequence fragments as input. Although a high prediction effect ...

Embodiment Construction

[0036] With reference to Figure 1 of the accompanying drawings, the technical solution of the present invention is described in detail below through an embodiment; the present invention is not limited to the following examples.

[0037] Step 1: Based on existing biological knowledge, the chromosome is segmented according to the distribution of genes, and the valid position interval of each gene is recorded for the subsequent segmentation of the SNPs. The CEU sample (Utah residents with Northern and Western European ancestry) is used here as a simulation.

[0038] Step 2: Assuming a population-based case-control design, the gene sequences of n independent individuals are used to encode the SNPs at the chromosome level, which yields the required input data.
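The source does not say how the SNPs are turned into numeric input. As a minimal sketch only, the following assumes the common additive coding, in which each genotype is replaced by the number of minor-allele copies it carries (0, 1 or 2). The function name `encode_additive` and the toy genotypes are hypothetical and are not taken from the patent.

```python
from collections import Counter


def encode_additive(genotypes):
    """Encode two-letter genotype strings as minor-allele counts.

    genotypes: one row per individual, one two-letter string per SNP,
    for example "AG". Returns a matrix of the same shape holding 0, 1 or 2.
    Assumption: additive coding; the patent does not name a coding scheme.
    """
    n_snps = len(genotypes[0])
    encoded = [[0] * n_snps for _ in genotypes]
    for j in range(n_snps):
        # Count every allele observed at SNP j across all individuals.
        counts = Counter(allele for row in genotypes for allele in row[j])
        if len(counts) < 2:
            continue  # monomorphic SNP: leave the column at zero
        # Minor allele = least frequent; ties are broken alphabetically.
        minor = min(counts, key=lambda a: (counts[a], a))
        for i, row in enumerate(genotypes):
            encoded[i][j] = row[j].count(minor)
    return encoded


# Toy data: 4 individuals, 2 SNPs (hypothetical values).
toy = [["AA", "CT"], ["AG", "TT"], ["GG", "CT"], ["AA", "TT"]]
matrix = encode_additive(toy)
```

Each row of `matrix` is then one individual's SNP sequence, ready to be split into SNP sets in the next step.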

[0039] Step 3: As shown in Figure 4 of the accompanying drawings, according to the position information obtained in Step 1, the SNP sequences obtained in Step 2 are grouped into SNPs ac...
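The grouping of SNPs by gene position interval can be sketched as follows. This is an illustration under stated assumptions, not the patent's procedure: the intervals are taken as inclusive, a SNP that falls inside two overlapping genes is placed in both sets, and the gene names, SNP identifiers and positions are all hypothetical.

```python
def group_snps_by_gene(snp_positions, gene_intervals):
    """Group SNPs into one set per gene according to position.

    snp_positions: list of (snp_id, position) pairs.
    gene_intervals: dict mapping gene name -> (start, end), inclusive.
    Returns a dict mapping gene name -> list of snp_ids ordered by position.
    Assumption: SNPs outside every interval are dropped.
    """
    snp_sets = {gene: [] for gene in gene_intervals}
    for snp_id, pos in sorted(snp_positions, key=lambda s: s[1]):
        for gene, (start, end) in gene_intervals.items():
            if start <= pos <= end:
                snp_sets[gene].append(snp_id)
    return snp_sets


# Hypothetical intervals and SNP positions on one chromosome.
intervals = {"GENE_A": (100, 200), "GENE_B": (150, 400)}
snps = [("rs1", 120), ("rs3", 300), ("rs2", 180), ("rs4", 500)]
snp_sets = group_snps_by_gene(snps, intervals)
```

The resulting sets generally differ in length, which is why a model that accepts variable-length input is needed downstream.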

Abstract

The invention discloses a method for gene association analysis based on a deep learning algorithm. A method based on SNP (single nucleotide polymorphism) set analysis needs to take as reference SNP information that comes from different positions of the same individual but is related, so the SNPs of each individual are divided into multiple units according to existing biological knowledge. First, all SNPs are divided into multiple SNP sets at the whole-chromosome level according to relevant biological knowledge, such as the principle of proximity to genomic features. After the division, each SNP set is input into an established bidirectional LSTM (long short-term memory) network. This network is a recurrent neural network: its state contains the information from the previous time step and is also the basis for the weight change at the next time step. After LSTM network learning is completed, the degree of attention required for the input data can be output through network calculation. The method has better sensitivity and specificity and opens up a new field for research and development in clinical medicine, genetic epidemiology and preventive medicine.
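To make the data flow concrete, the sketch below runs an untrained bidirectional LSTM over one SNP set of arbitrary length and turns the per-position states into attention weights with a softmax. It is a forward pass only, written in pure Python with random weights. The scoring vector `v`, the hidden size and the softmax form of the attention output are assumptions, since the abstract does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(hidden, rng):
    """Random weights per gate: (input weights, recurrent matrix, bias)."""
    def vec():
        return [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return {g: (vec(), [vec() for _ in range(hidden)], vec()) for g in "ifog"}


def lstm_step(x, h, c, weights):
    """One LSTM step for a scalar input x and hidden/cell vectors h, c."""
    size = len(h)

    def gate(name, act):
        w_x, w_h, b = weights[name]
        return [act(w_x[j] * x + sum(w_h[j][k] * h[k] for k in range(size)) + b[j])
                for j in range(size)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell state
    c_new = [f[j] * c[j] + i[j] * g[j] for j in range(size)]
    h_new = [o[j] * math.tanh(c_new[j]) for j in range(size)]
    return h_new, c_new


def run_lstm(xs, weights, hidden):
    """Return the hidden state at every position of the sequence xs."""
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, weights)
        states.append(h)
    return states


def bilstm_attention(xs, w_fwd, w_bwd, v, hidden):
    """Attention weights over positions from a bidirectional LSTM.

    v is a hypothetical scoring vector of length 2 * hidden.
    """
    fwd = run_lstm(xs, w_fwd, hidden)
    bwd = run_lstm(xs[::-1], w_bwd, hidden)[::-1]  # right-to-left pass
    states = [f + b for f, b in zip(fwd, bwd)]     # concatenate directions
    scores = [sum(vi * si for vi, si in zip(v, s)) for s in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]     # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
HIDDEN = 3
w_fwd, w_bwd = init_weights(HIDDEN, rng), init_weights(HIDDEN, rng)
v = [rng.uniform(-1.0, 1.0) for _ in range(2 * HIDDEN)]
weights_short = bilstm_attention([0, 1, 2], w_fwd, w_bwd, v, HIDDEN)
weights_long = bilstm_attention([2, 0, 1, 1, 0, 2, 1], w_fwd, w_bwd, v, HIDDEN)
```

The same weights handle a set of 3 SNPs and a set of 7 SNPs without padding or truncation, which is the property that fixed-length methods lack. In practice the weights would be learned from case-control labels rather than drawn at random.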

Description

Technical field

[0001] The invention relates to a gene association analysis method based on an LSTM (Long Short-Term Memory) network. The method is based on a deep convolutional neural network and a recurrent neural network model, and belongs to the technical field of bioinformatics.

Background

[0002] Research on the association between chromosomal base pairs and pathogenicity has always been one of the core topics of bioinformatics. Such research mines huge databases, seeks an in-depth understanding of the complexity of organisms, and makes as much use as possible of existing knowledge and data. However, because genes are polymorphic, two or more discontinuous variants, genotypes, or alleles often coexist in a biological population. Using machine learning methods, which are effective and intelligent, to study genetic polymorphisms can therefore open up new opportunities for the development and research o...

Application Information

IPC(8): G06F19/24
CPC: G16B20/00, G16B40/00
Inventor: 盛再超, 颜成钢, 彭冬亮, 薛安克
Owner: HANGZHOU DIANZI UNIV