Methods for determining disease risk combining downsampling of class-imbalanced sets with survival analysis

A technique that combines survival analysis with downsampling, applied to processing electronic data to determine disease risk

Pending Publication Date: 2021-08-17
SOMALOGIC OPERATING CO INC

AI Technical Summary

Problems solved by technology

With a class-imbalanced data set, a model tends to maximize specificity at the expense of sensitivity, which is a problem when the goal is to identify as many individuals as possible who are at risk of developing the condition or event.
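A minimal sketch of the trade-off, using a hypothetical cohort (950 subjects without the event, 50 with it) and a degenerate model that always predicts "no event". The cohort sizes and the function name `sensitivity_specificity` are illustrative assumptions, not taken from the application.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical class-imbalanced cohort: 950 non-events (majority), 50 events (minority).
y_true = [0] * 950 + [1] * 50

# A degenerate model that always predicts the majority class.
y_pred_majority = [0] * len(y_true)

sens, spec = sensitivity_specificity(y_true, y_pred_majority)
accuracy = sum(t == p for t, p in zip(y_true, y_pred_majority)) / len(y_true)

print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The model looks accurate overall and has perfect specificity, yet it identifies none of the at-risk subjects.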



Examples


Embodiment 1

[0074] This example describes combining downsampling with a Cox proportional hazards elastic net regression model to evaluate the prediction of a myocardial infarction (MI) event up to 4 years after the initial blood draw. The analysis can be completed within the exemplary data risk analysis platform shown in Figure 2.
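For orientation, one common formulation of an elastic net penalized Cox model is sketched below. The notation is an assumption introduced here and does not come from the application: x_i is the feature vector of subject i, \delta_i = 1 marks an observed event, R(t_i) is the set of subjects still at risk at time t_i, \lambda sets the penalty strength, and \alpha mixes the L1 and L2 penalties. Tied event times are ignored, and the scaling used in the application's own implementation may differ.

```latex
\hat{\beta} = \arg\max_{\beta}\;
  \left[ \frac{2}{n} \sum_{i:\,\delta_i = 1}
    \left( x_i^{\top}\beta - \log \sum_{j \in R(t_i)} e^{x_j^{\top}\beta} \right)
  - \lambda \left( \alpha \lVert \beta \rVert_1
    + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right) \right]
```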

[0075] The purpose of this example is at least twofold: 1) to select and identify features predictive of both minority and majority classes, and 2) to derive estimated effect sizes such that minority class risk is well predicted. As a comparison, the predictive power of the logistic regression elastic net model (with and without downsampling) and that of the Cox elastic net model without downsampling were examined.

[0076] Materials and methods - Datasets



Abstract

A method for downsampling class-imbalanced sets with survival analysis comprising: acquiring a class-imbalanced data set, wherein the class-imbalanced data set comprises biological data from a plurality of subjects, wherein the biological data of each subject includes an observation, a time value, and a plurality of clinical measurements, and wherein the biological data is categorized as being part of a majority data class or a minority data class, wherein the majority data class has a greater number of observations than the minority data class; downsampling the class-imbalanced data set, wherein the downsampling results in the majority data class having an equivalent or substantially equivalent number of observations as the minority data class; and performing cross-validation on the downsampled data set with a survival analysis to generate a survival model, wherein the observation comprises an event or no event at a specific time value.
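The downsampling and cross-validation steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the applicant's implementation: the record layout (`event`, `time`, `measurements`), the helper names `downsample_majority` and `assign_folds`, the toy cohort, and the choice of five stratified folds are all hypothetical. Fitting the survival model within each fold is omitted.

```python
import random
from collections import Counter


def downsample_majority(subjects, seed=0):
    """Randomly drop majority-class subjects until both classes have equal size."""
    rng = random.Random(seed)
    events = [s for s in subjects if s["event"] == 1]
    non_events = [s for s in subjects if s["event"] == 0]
    # The shorter list is the minority class, the longer one the majority class.
    minority, majority = sorted((events, non_events), key=len)
    kept = rng.sample(majority, k=len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


def assign_folds(subjects, n_folds=5, seed=0):
    """Return one fold index per subject, stratified by event status."""
    rng = random.Random(seed)
    folds = [None] * len(subjects)
    for label in (0, 1):
        idx = [i for i, s in enumerate(subjects) if s["event"] == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[i] = pos % n_folds
    return folds


# Hypothetical toy cohort: 40 events among 500 subjects, follow-up up to 4 years.
rng = random.Random(42)
cohort = [
    {
        "id": i,
        "event": 1 if i < 40 else 0,               # observation: event or no event
        "time": round(rng.uniform(0.1, 4.0), 2),   # time value in years
        "measurements": [rng.gauss(0, 1) for _ in range(3)],
    }
    for i in range(500)
]

balanced = downsample_majority(cohort, seed=1)
folds = assign_folds(balanced, n_folds=5, seed=1)
class_counts = Counter(s["event"] for s in balanced)
print(class_counts, Counter(folds))
```

Each fold would then serve once as the held-out set while a survival model is fit on the remaining folds.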

Description

[0001] Cross References to Related Applications

[0002] This application claims priority to U.S. Provisional Patent Application No. 62/773,028, filed November 29, 2018, and U.S. Provisional Patent Application No. 62/783,733, filed December 21, 2018, which are incorporated herein by reference in their entirety.

Technical Field

[0003] The present disclosure relates generally to the field of disease risk determination, and more particularly, to systems and methods for processing electronic data to determine disease risk.

Background

[0004] Methods for identifying biomarkers associated with the risk of various disease-related conditions or events (e.g., cardiovascular events, diabetes diagnoses, various cancer types, etc.) have improved, mainly due to the advent of high-throughput techniques such as gene sequencing, transcriptomics, proteomics, and metabolomics. However, these techniques also complicate matters by providing high-dimensional data r...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): A61B5/117, A61B5/00
CPC: A61B5/103, A61B5/117, A61B5/05, G16H50/30, G16H50/20, A61B5/7275, A61B5/7264, A61B5/021, A61B5/14546, A61B5/4866, A61B5/4872, A61B5/7221, G09B23/00, G16H50/50
Inventor: Y·夏甲, G·达塔, L·亚历山大, M·欣特伯格
Owner: SOMALOGIC OPERATING CO INC