Gene-gene interaction recognition method based on sparsity factor analysis

A gene-gene interaction recognition technique, applied in the field of genetics, that addresses the problem of insufficient ability to recognize gene-gene interactions.

Status: Inactive; Publication Date: 2019-03-19
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the insufficient ability of existing algorithms to recognize gene-gene interactions in high-dimensional settings.



Examples


Embodiment Construction

[0014] Let K = {1, 2, ..., k} be the set of SNP sites, with genotype coding x_k ∈ {-1, 0, 1}, k ∈ K, and let y ∈ {0, 1} be a binary qualitative trait. Define M = {1, 2, ..., m} as the index set of the hidden variables, m ∈ M. The n×k matrix X is the standardized genotype coding matrix, and the n×m matrix Z is the hidden variable matrix. Define a linear transformation W of dimension k×m that satisfies Z = XW and X′ = ZW^T, and define the residual matrix as Ψ = X − X′.
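The definitions in [0014] can be illustrated with a minimal NumPy sketch. The data are synthetic, the sizes n, k, m are arbitrary, and the choice of W as the top-m right singular vectors of X is an assumption made only so the example runs; the invention learns W with a sparsity penalty instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 200, 12, 3  # samples, SNP sites, hidden variables (illustrative sizes)

# Genotypes coded as -1, 0, 1, then standardized column-wise to give X (n x k).
G = rng.integers(-1, 2, size=(n, k)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)

# Assumption: use the top-m right singular vectors of X as W (k x m).
# This is a stand-in for the sparse W that the method would learn.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:m].T

Z = X @ W        # hidden variable matrix, n x m
X_rec = Z @ W.T  # reconstruction X', n x k
Psi = X - X_rec  # residual matrix, n x k

print(Z.shape, X_rec.shape, Psi.shape)
print("squared reconstruction error:", float(np.sum(Psi ** 2)))
```

Because m is smaller than k, X′ is only a low-rank approximation of X, so Ψ is generally nonzero.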

[0015] The model structure is shown in Figure 2. The genotype coding matrix X is projected onto the latent variable matrix Z through the linear mapping W, and is then restored to X′ through the linear transformation ZW^T, with the error term Ψ minimized. According to the sparsity assumption, the dimension m<

[0016] Assuming that the error function of data X and X′ is l, the model can be expressed as:

[0017]

[0018] Here, ρ and γ are both tuning parameters. When γ approaches +∞, the model approaches LASSO, and when γ approaches 1, the...
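The model equation in [0017] is not reproduced in this text, so the exact penalty is unknown. As a sketch only, one well-known penalty with two parameters that tends to the LASSO penalty as γ approaches +∞ is the minimax concave penalty (MCP). Treating the penalty as MCP is an assumption, and the function name `mcp_penalty` is hypothetical.

```python
import numpy as np

def mcp_penalty(t, rho, gamma):
    """Minimax concave penalty (MCP), evaluated elementwise, for gamma > 1."""
    a = np.abs(np.asarray(t, dtype=float))
    inner = rho * a - a ** 2 / (2.0 * gamma)  # region |t| <= gamma * rho
    outer = 0.5 * gamma * rho ** 2            # constant beyond gamma * rho
    return np.where(a <= gamma * rho, inner, outer)

t = np.linspace(-3.0, 3.0, 7)
rho = 0.5
lasso = rho * np.abs(t)  # LASSO (L1) penalty for comparison

print(mcp_penalty(t, rho, gamma=1e8))  # close to the LASSO penalty rho * |t|
print(mcp_penalty(t, rho, gamma=1.5))  # flattens once |t| exceeds gamma * rho
```

With a very large γ the quadratic correction vanishes and the values match ρ|t|; with a small γ the penalty stops growing for large weights, which penalizes large coefficients less than LASSO does.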


Abstract

The invention discloses a gene-gene interaction recognition method based on sparse factor analysis (Sparse Factor Analysis for Epistasis, EPISFA). The method comprises the following steps: 1) inputting raw genotype data and screening it according to the linkage disequilibrium coefficient between genes; 2) randomly dividing the data into blocks; 3) dividing the data into diseased and non-diseased groups according to disease status, calculating the correlation coefficient matrix of each group, and using the Fisher transform to subtract out the gene-site correlation from the correlation coefficient matrices of the two groups; 4) learning the model weights using sparse factor analysis; 5) performing cross-validation, selecting the model parameters, and identifying the corresponding gene-gene interactions. Experiments show that the method is both statistically and computationally efficient and has good application prospects.
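Step 3 of the abstract can be sketched as follows. This is one possible reading, not the patented procedure: it assumes the Fisher transform is applied to each group's correlation matrix and the control-group values are subtracted from the case-group values, scaled by the usual standard error of a difference of two Fisher z values. The function name `fisher_z_difference` and the synthetic data are hypothetical.

```python
import numpy as np

def fisher_z_difference(X, y):
    """Scaled difference of Fisher z-transformed SNP correlations (cases minus controls)."""
    cases, controls = X[y == 1], X[y == 0]
    r1 = np.corrcoef(cases, rowvar=False)
    r0 = np.corrcoef(controls, rowvar=False)
    # Zero the diagonal before arctanh, since arctanh(1) is infinite.
    np.fill_diagonal(r1, 0.0)
    np.fill_diagonal(r0, 0.0)
    z_diff = np.arctanh(r1) - np.arctanh(r0)
    # Approximate standard error of a difference of two Fisher z values.
    se = np.sqrt(1.0 / (len(cases) - 3) + 1.0 / (len(controls) - 3))
    return z_diff / se

rng = np.random.default_rng(1)
n, k = 400, 6
X = rng.integers(-1, 2, size=(n, k)).astype(float)  # genotypes coded -1, 0, 1
y = rng.integers(0, 2, size=n)                      # 1 = diseased, 0 = non-diseased

# Synthetic signal: sites 0 and 1 are correlated in most cases but not in controls.
copy = (y == 1) & (rng.random(n) < 0.7)
X[copy, 1] = X[copy, 0]

D = fisher_z_difference(X, y)
print(np.round(D, 2))
```

In this toy data the entry for sites 0 and 1 stands out from the rest of the matrix, which is the kind of group-specific correlation that the later factor analysis step would work on.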

Description

1. Technical field

[0001] The invention relates to the field of genetics, in particular to a gene-gene interaction recognition method based on sparse factor analysis.

2. Background technology

[0002] Studying genetic susceptibility to complex diseases has long been an important topic in genetics. Although genome-wide association studies in recent years have found a large number of disease-associated polymorphic sites, the one-dimensional information carried by individual polymorphic sites is far from sufficient to explain the heritability of complex diseases in the population. Gene-gene interactions are one of the main reasons for this missing heritability.

[0003] Genetic research in the era of genome-wide association studies often tests a large number of polymorphic sites at once, so traditional hypothesis testing methods struggle to overcome the problem of "dimension expansion". To this end, many machine learning-based algorithms have been pr...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00, G16B40/00
Inventor: 项骁 (Xiang Xiao), 胡永华 (Hu Yonghua), 王斯悦 (Wang Siyue)
Owner: PEKING UNIV