
Data prediction analysis method based on COX model and random survival forest

A data prediction and analysis technology, applied in computing models, machine learning, computer components, and related fields. It addresses problems such as the R software crashing when large data sets are imported directly and irrelevant variables degrading the prediction effect, with the aim of faster execution and better model prediction.

Status: Inactive | Publication Date: 2017-01-25
GUANGDONG KINGPOINT DATA SCI & TECH CO LTD
Cites: 0 | Cited by: 5

AI Technical Summary

Problems solved by technology

[0003] However, many analysis tasks involve very large data sets. Importing such data directly causes the R software to crash, and too many irrelevant variables degrade the prediction effect.



Examples


Embodiment Construction

[0024] The above and other technical features and advantages of the present invention will be described in more detail below in conjunction with the accompanying drawings.

[0025] Figure 1 shows a flow chart of the data prediction analysis method of the present invention, which is based on the COX model and random survival forest. The analysis method comprises the following steps:

[0026] In step S1, a Kaplan-Meier (KM) curve is drawn for the observed values in the data set, and the trend of the survival function over time is obtained from the KM curve.
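Step S1 can be illustrated with a minimal sketch of the Kaplan-Meier product-limit estimator. The function name `kaplan_meier` and the input layout (a list of follow-up times plus a list of 0/1 event flags) are assumptions made for illustration; they do not come from the patent, which refers to R rather than Python.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    durations: observed follow-up time of each subject
    events:    1 if the event was observed at that time, 0 if censored
    """
    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Every subject (event or censored) leaves the risk set at its own time.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Hypothetical toy data: five subjects, two of them censored.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Plotting the returned pairs as a step function gives the KM curve; a curve that drops quickly indicates that events accumulate early in follow-up.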

[0027] The data set includes observations and variables. The data set is characterized by a large sample size, a large number of variables, and a high degree of correlation between individual variables.

[0028] In step S2, a COX model is fitted to the variables in the data set, and the variables are screened.

[0029] The COX proportional hazards regression model (Cox's proportional hazards regression model) is referred to as the COX regression model for short. T...
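For reference, the standard textbook form of the Cox model is given here, since the excerpt is truncated before it can state one. The notation is an assumption, not the patent's own: \(h_0(t)\) is the baseline hazard, \(x\) the covariate vector, \(\beta\) the coefficient vector, \(\delta_i\) the event indicator, and \(R(t_i)\) the set of subjects still at risk at time \(t_i\). The partial likelihood is shown for the case without tied event times.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
L(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp\!\left(\beta^{\top} x_i\right)}
       {\sum_{j \in R(t_i)} \exp\!\left(\beta^{\top} x_j\right)}
```

Variable screening with such a model typically keeps covariates whose estimated coefficients are statistically significant; the exact criterion used by the method is not visible in this excerpt.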


Abstract

The invention provides a data prediction analysis method based on a COX model and a random survival forest. The method comprises the following steps: S1, a KM curve is drawn for the observed values in a data set, and the trend of the survival function over time is obtained from the KM curve; S2, a COX model is fitted to the variables in the data set, and the variables are screened; S3, the observed values in the data set are divided into a test set and a training set in the proportion of 75%; S4, random survival forest regression is performed on the training set to obtain a random survival forest model; S5, the random survival forest model is evaluated on the test set. Compared with methods in the prior art, the method has the following benefits: the COX model is first used for variable screening, which greatly reduces the number of variables entering the model; the screened variables are then fitted and used for prediction with the random survival forest model, and the fitting and prediction effect of the model is judged from an ROC graph. As a result, the software runs much faster and the model prediction effect is substantially improved.
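Steps S3 and S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the abstract does not say which side of the split receives 75%, so the sketch assumes 75% goes to the training set; the ROC-based judgment is reduced to the area under the ROC curve for a binary outcome (for example, event observed before a fixed time horizon); and the names `split_train_test` and `auc` are hypothetical.

```python
import random


def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle rows reproducibly and split them into (train, test).

    Assumption: train_fraction=0.75 means 75% of observations are used
    for training and the remaining 25% for testing.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for subjects with the event, 0 otherwise
    scores: predicted risk, higher meaning the event is more likely
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    train, test = split_train_test(range(100))
    print(len(train), len(test))
    # Hypothetical labels and risk scores for four test subjects.
    print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

An AUC near 0.5 would indicate predictions no better than chance, while values closer to 1 indicate better discrimination on the test set.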

Description

Technical field

[0001] The invention relates to the field of data prediction analysis, in particular to a data prediction analysis method based on the COX model and random survival forest.

Background

[0002] Modern society is marked by rapid development, free flow of information, and advanced technology; people communicate ever more closely and life keeps getting more convenient. Large volumes of data are a product of this high-tech era. How to make good use of these data and mine valuable information from them is the key for enterprises to win the competition. The random forest method is one of the commonly used machine learning methods for dealing with high-dimensional data. This method does not need the distribution characteristics of the parameters to be specified in advance, and it can evaluate the predictive ability of each predictor variable on the outcome; at the same time, it uses internal cross-validation to evaluate its prediction error rate and can...


Application Information

IPC(8): G06K9/62, G06N99/00
CPC: G06N20/00, G06F18/24323
Inventors: 邹立斌, 李青海, 简宋全, 侯大勇, 许飞月
Owner: GUANGDONG KINGPOINT DATA SCI & TECH CO LTD