A Hierarchical Important Feature Selection Method Based on Clinical High-Dimensional Data of Breast Cancer

A feature selection method for high-dimensional data, applied in the computer field, that addresses the high dimensionality of clinical data and aims to preserve accuracy, allow sufficient learning, and reduce time and computing resource consumption.

Active Publication Date: 2022-05-03
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0013] The purpose of the present invention is to solve the problem of excessively high clinical data dimensionality when establishing a breast cancer survival prediction model.
A hierarchical feature selection method that combines statistical feature selection with integrated feature selection is used to address important feature extraction and model practicability.


Examples

Detailed Description of the Embodiments

[0040] To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments and the accompanying drawings.

[0041] Referring to Figure 1, the hierarchical important feature selection method for breast cancer clinical high-dimensional data of the present invention comprises statistical feature calculation, integrated feature calculation, and the threshold-setting method used in the integrated feature calculation. The invention uses a hierarchical feature selection method that combines statistical feature selection with integrated feature selection to address important feature extraction and model practicability. The specific implementation process is as follows:

[0042] S1: Statistical feature selection.

[0043] Feature extraction and cleaning are performed on the original clinical data to obtain the original feature set F_n; ...
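The text here is truncated, so the exact tests used in S1 are not visible. As a rough illustration of single-factor (univariate) screening, the sketch below applies a Pearson chi-square test of independence to binary features against a binary outcome and keeps the features whose p-value falls below a significance level. The function names, the restriction to 0/1 data, the absence of a continuity correction, and the default `alpha=0.05` are all assumptions made for this example, not details taken from the patent.

```python
import math


def chi2_p_value_2x2(feature, outcome):
    """Pearson chi-square test of independence for two binary (0/1) sequences.

    Hypothetical helper: no continuity correction is applied.
    """
    n = len(feature)
    # Observed counts of the 2x2 contingency table: table[feature][outcome].
    table = [[0, 0], [0, 0]]
    for f, y in zip(feature, outcome):
        table[f][y] += 1
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    if 0 in row or 0 in col:
        # A constant feature or outcome carries no evidence of association.
        return 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def select_significant(features, outcome, alpha=0.05):
    """Return names of features whose test p-value is below alpha (assumed level)."""
    return [
        name
        for name, values in features.items()
        if chi2_p_value_2x2(values, outcome) < alpha
    ]
```

Continuous or multi-category clinical variables would need other tests (for example a t-test or a rank-based test), which is presumably what "different statistical tests" refers to; those are left out of this sketch.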

Abstract

The invention discloses a hierarchical important feature selection method based on clinical high-dimensional data of breast cancer. The method comprises statistical feature selection and integrated feature selection. The statistical feature selection uses single-factor (univariate) analysis: features that have a significant impact on the outcome variable are initially selected through different statistical tests. The integrated feature selection builds a gradient boosting tree model; after the model is trained, feature importance scores are obtained, and a designed and verified importance score threshold is then used to select the features that have an important impact on the outcome variable. The invention addresses problems such as excessively high feature dimensionality, too many redundant features, and messy data in clinical breast cancer prediction and modeling. Redundant or meaningless features in the high-dimensional clinical data can be excluded, so that as few features as possible are selected while retaining those with an important impact on breast cancer modeling, helping to ensure the accuracy and practicability of the breast cancer model.
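The threshold step of the integrated feature selection can be sketched as follows. The example assumes the importance scores have already been produced by a trained gradient boosting tree model (for instance, the `feature_importances_` attribute of scikit-learn's `GradientBoostingClassifier`). The function name, the normalization to a sum of 1, and the threshold value passed in are hypothetical; the abstract does not state the threshold the inventors designed and verified.

```python
def select_by_importance(importances, threshold):
    """Keep features whose normalized importance score is at least `threshold`.

    `importances` maps feature name -> non-negative importance score,
    assumed to come from an already trained gradient boosting tree model.
    `threshold` is a hypothetical cut-off on the normalized scores.
    """
    total = sum(importances.values())
    if total <= 0:
        # No usable importance information: select nothing.
        return []
    # Normalize so the scores sum to 1 and are comparable across models.
    normalized = {name: score / total for name, score in importances.items()}
    # Rank from most to least important, then apply the cut-off.
    ranked = sorted(normalized.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]
```

Applied after the statistical screening, this second layer would shrink the candidate set further; in practice the threshold would be tuned against model performance on held-out data rather than fixed in advance.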

Description

Technical field

[0001] The invention relates to the fields of computer technology, statistical machine learning, feature engineering and the like.

Background

[0002] Breast cancer is the malignant tumor with the highest incidence among women worldwide and seriously threatens women's health. Breast cancer patients are usually treated through surgery, chemotherapy and other measures, and may face the risk of recurrence at any time after treatment. Scientifically assessing and predicting the survival status of breast cancer patients can assist doctors in formulating appropriate treatment plans, providing new support for reducing the risk of recurrence and improving prognosis.

[0003] To assess and predict the survival status of breast cancer patients, such as the recurrence-free survival rate, a machine learning prediction model can be established based on breast cancer clinical data. However, cli...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16H50/20, G16H50/30, G16H50/70
CPC: G16H50/20, G16H50/30, G16H50/70
Inventors: 付波, 刘沛, 林劼, 郑鸿, 邓玲
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA