Method for establishing prediction model of complex data
A technology for complex data and predictive models, applied in the fields of electrical digital data processing, special data processing applications, hybridization, etc., which can address problems such as distinctions
Status: Inactive. Publication Date: 2017-08-08
赵乐平
Cites: 0. Cited by: 22.
AI Technical Summary
Problems solved by technology
[0011] Although OOR is closely related to kernel machine methods, there are differences
Examples
Example 1
[0053] First Embodiment: Next, the method of the present invention will be described in detail by taking the process of constructing a prediction model of high-dimensional omics data from clinical translational research as an example.
[0054] 1. Method:
[0055] 1.1. Motivation
[0056] 1.1.1. Problem Statement: Take n objects (i = 1, 2, ..., n) in the database as samples. For the i-th object, the observed set of high-dimensional (here p-dimensional) sparse covariates is denoted X_i = (x_i1, x_i2, ..., x_ip); as is typical of HDOD, the number of covariates p is usually much larger than the sample size n. The corresponding outcome variable Y_i is also observed for each i-th object and can be binary, categorical, continuous, or truncated (that is, partially observed). The likelihood of all observed data can be written as
[0057]
[0058] Among them, in the above-mentioned summation function, n objects are summed ...
Example 2
[0166] Second Embodiment: Next, taking the construction of a disease prediction model of polymorphic and multi-allelic HLA genes as an example, the method of the present invention will be further described in detail.
[0167] 1. Method
[0168] 1.1. Motivation
[0169] This embodiment analyzes covariate data generated from a high-dimensional polymorphic genetic study: specifically, a case-control study of T1D and eight class II HLA genes (HLA*DRB1, *DRB3, *DRB4, *DRB5, *DQA1, *DQB1, *DPA1, *DPB1) (manuscript: Zhao et al 2015, to be submitted). Because of structural polymorphism, only one of the HLA*DRB3, *DRB4, and *DRB5 alleles appears on any single chromosome, so HLA*DRB345 is used below to represent the genotypes of all three genes. Each gene carries two alleles, and each allele represents a fully phased nucleotide sequence. When the jth gene has mj possible sequence variations, if a pair of alleles is in Hardy-Weinberg equilibrium (HWE,...
Abstract
The invention provides a method for establishing a prediction model of complex data, comprising the following steps: A. obtaining high-dimensional omics data (HDOD) and determining a group of data objects representative of the HDOD as examples; B. determining a similarity measure between each data object in the HDOD and each example, and accordingly establishing a similarity measurement matrix of the data objects and examples; C. using the similarity measurement matrix, selecting the informative examples from among the examples by a penalized likelihood method; D. establishing a prediction model based on the selected examples. The method provides a natural quantification tool for discovering and verifying interactions among complex variables. Because it relies on similarity-based search, the prediction model is suitable for large-scale databases.
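Steps A to D of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the Gaussian similarity, the L1 (lasso) penalty on a logistic likelihood, the proximal-gradient solver, the choice of the first six objects as examples, the simulated data, and every function and parameter name are assumptions made only for illustration. The code calls the examples "exemplars".

```python
import math
import random

def similarity(a, b):
    """Gaussian similarity of two covariate vectors (an assumed measure)."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-d2 / len(a))

def similarity_matrix(objects, exemplars):
    """Step B: n x q matrix of object-to-exemplar similarities."""
    return [[similarity(x, e) for e in exemplars] for x in objects]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_l1_logistic(S, y, lam=0.02, iters=3000):
    """Step C: L1-penalized logistic likelihood, fitted by proximal gradient."""
    n, q = len(S), len(S[0])
    step = 4.0 / (q + 1)  # safe step size because every similarity lies in (0, 1]
    b0, beta = 0.0, [0.0] * q
    for _ in range(iters):
        resid = [sigmoid(b0 + sum(b * s for b, s in zip(beta, row))) - yi
                 for row, yi in zip(S, y)]
        grad = [sum(r * row[k] for r, row in zip(resid, S)) / n for k in range(q)]
        b0 -= step * sum(resid) / n  # the intercept is not penalized
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return b0, beta

def predict(x, exemplars, b0, beta):
    """Step D: predicted probability from similarities to selected exemplars."""
    return sigmoid(b0 + sum(b * similarity(x, e)
                            for b, e in zip(beta, exemplars) if b != 0.0))

# Simulated stand-in for HDOD: n = 40 objects, p = 50 covariates (p > n).
random.seed(0)
n, p = 40, 50
y = [i % 2 for i in range(n)]
objects = [[random.gauss(2.0 * yi if j < 10 else 0.0, 0.5) for j in range(p)]
           for yi in y]

exemplars = objects[:6]                       # Step A (assumed selection rule)
S = similarity_matrix(objects, exemplars)     # Step B
b0, beta = fit_l1_logistic(S, y)              # Step C
selected = [k for k, b in enumerate(beta) if b != 0.0]
accuracy = sum((predict(x, exemplars, b0, beta) > 0.5) == (yi == 1)
               for x, yi in zip(objects, y)) / n
print("selected exemplars:", selected)
print("training accuracy:", accuracy)
```

In this sketch the penalty shrinks the coefficients of uninformative exemplars to exactly zero, so the final model depends only on similarities to the exemplars that remain. The model dimension is the number of exemplars rather than the number of covariates p.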
Description
Technical field
[0001] The invention relates to a method for constructing a prediction model of complex data.
Background
[0002] The advent of next-generation sequencing technologies has enabled researchers to process large amounts of collected data (for example, enabling clinical researchers to process hundreds of biological samples collected from patients) and to analyze data such as genome-wide expression levels, methylation levels, or somatic mutations; such data are called high-dimensional omics data (HDOD). Although the number of clinical samples available is usually limited, the bottleneck of clinical research has shifted from sample collection to data management and data analysis, because the number of observed variables per sample can reach thousands or millions. Using HDOD along with other clinical variables to build predictive models of specific clinical outcomes has been one of the many analytical goals of researchers in b...