Method for establishing an association model between SNP pathogenic factors and disease

A technology for modeling the association between SNP causative factors and disease, applied in the field of data processing. It addresses the low model accuracy and high modeling difficulty of existing methods, and achieves a more accurate association model, a reduced degree of mutual influence among factors, and simple operation.

Active Publication Date: 2022-04-08
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0016] 1. Models established by existing methods have low accuracy.
[0017] 2. SNP data are characterized by small sample sizes, which makes modeling difficult and reduces accuracy.

Examples

Embodiment

[0143] A. Dat100 dataset

[0144] This set of experimental data comes from the New York City Cancer Control Project. In this dataset, Dat100, which comprises 100 SNPs and 2000 samples, biologists simultaneously embedded a total of 7 SNP pathogenic factors, namely (98), (78), (6093), (4475), (8583100), (972047), (2581879299) (numbered 1, 2, ..., 7 below). The probability with which each causative factor is associated with the disease is given.
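
To make the data layout concrete, the following is a minimal sketch of how a dataset of this shape (2000 samples by 100 SNPs) and a per-factor disease probability could be represented. It is not the AD-DTEM method of the invention: it only computes the plain empirical case frequency for each genotype combination at a factor's SNPs. The 0/1/2 genotype coding, the 0-based column indices, the example factor (3, 17), the toy risk model, and the names `empirical_penetrance` and `make_synthetic` are all assumptions made for illustration, and the data are synthetic.

```python
import random
from collections import defaultdict


def empirical_penetrance(genotypes, labels, factor):
    """Estimate P(disease | genotype combination) for one candidate factor.

    genotypes: list of samples, each a list of SNP codes (assumed 0, 1 or 2)
    labels:    list of 0/1 disease labels, one per sample
    factor:    tuple of 0-based SNP column indices (hypothetical encoding)
    """
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")
    cases = defaultdict(int)   # diseased samples per genotype combination
    totals = defaultdict(int)  # all samples per genotype combination
    for row, label in zip(genotypes, labels):
        combo = tuple(row[i] for i in factor)
        totals[combo] += 1
        cases[combo] += label
    return {combo: cases[combo] / totals[combo] for combo in totals}


def make_synthetic(n_samples=2000, n_snps=100, factor=(3, 17), seed=0):
    """Build a synthetic dataset with one embedded factor (toy model only)."""
    rng = random.Random(seed)
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
                 for _ in range(n_samples)]
    labels = []
    for row in genotypes:
        # Assumed toy risk: grows linearly with the allele count at the factor.
        dose = sum(row[i] for i in factor) / (2 * len(factor))
        risk = 0.1 + 0.8 * dose
        labels.append(1 if rng.random() < risk else 0)
    return genotypes, labels


if __name__ == "__main__":
    factor = (3, 17)
    genotypes, labels = make_synthetic(factor=factor)
    table = empirical_penetrance(genotypes, labels, factor)
    for combo in sorted(table):
        print(combo, round(table[combo], 3))
```

With few samples per genotype combination, such raw frequencies become noisy, which is one way to see why the small-sample character of SNP data makes modeling difficult.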

[0145] The technology of the present invention is used to establish the association model between each SNP pathogenic factor and the disease in this set of data. Figures 5 to 8 show histograms of the modeling results obtained with the AD-DTEM method for the first six causative factors in the Dat100 dataset.

[0146] The present invention also uses the known true association models between SNP pathogenic factors and disease for this set of experimental data, and the accuracy of the evaluation in...

Abstract

The invention belongs to the technical field of data processing and discloses a method for establishing an association model between SNP pathogenic factors and diseases. The method comprises: collecting the sample data set corresponding to the current SNP pathogenic factors; absolutely dividing the sample data set according to initial values; modeling the association between SNP pathogenic factors and disease using a machine learning method; evaluating the accuracy of the modeling results; and determining the association model between SNP pathogenic factors and disease. By means of absolute division, the invention reduces the degree of interaction among the SNP pathogenic factors, so that the association model established between each SNP pathogenic factor and the disease is more accurate. The invention is simple to operate: given only the original SNP data and all SNP pathogenic factors as input, it yields a more accurate association model between each SNP pathogenic factor and the disease.

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method for establishing an association model between SNP pathogenic factors and diseases.

Background technique

[0002] Currently, the closest existing technology is as follows.

[0003] SNP: a single nucleotide polymorphism (SNP) is a polymorphism caused by variation of a single nucleotide (A, T, C, G) in the genome. A growing body of research evidence shows that SNPs are closely related to diseases, and this relationship is the basis for understanding the causes of disease and for medical prevention and diagnosis. An in-depth understanding of the association between SNPs and diseases can help reveal the pathogenesis of diseases and is a step forward in treating and overcoming complex diseases.

[0004] Studies on the association between SNPs and diseases can be divided into two categories: SNP-level ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B20/20, G16B40/20
CPC: G16B20/20, G16B40/20
Inventors: 张军英, 朱皓晨
Owner: XIDIAN UNIV