High-dimensional data classification method based on two-stage mixed feature selection

The method combines hybrid feature selection with high-dimensional data processing and is applied in fields such as instruments, character and pattern recognition, and computer components. It addresses premature convergence, a tendency to fall into local optima, and overfitting on high-dimensional data, with the stated effects of improved classification performance, faster operation, and more accurate predictions.

Pending Publication Date: 2021-12-10

ZHEJIANG SCI-TECH UNIV



Problems solved by technology

Although existing approaches can achieve satisfactory results, problems remain: premature convergence, a tendency to fall into local optima, and overfitting when dealing with high-dimensional data.


Examples


Embodiment 1

[0087] Embodiment 1 is a high-dimensional data classification method based on two-stage mixed feature selection, as shown in Figures 1-5. First, the MIC method is used to obtain the correlation between features and labels, and a suitable deletion threshold is learned with the Q-Learning algorithm to obtain the selected feature subset. An improved Particle Swarm Optimization (PSO) algorithm then searches for the optimal feature subset, which is used to predict the labels of the samples in the data set.
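The two-stage structure described in [0087] can be illustrated with a minimal Python sketch. It is not the patented method: the relevance scores are assumed to be precomputed (MIC itself is not computed here), the deletion threshold is passed in as a fixed number rather than learned by Q-Learning, and the search is a plain textbook binary PSO rather than the improved variant. The names `filter_by_threshold` and `binary_pso`, and all parameter defaults, are hypothetical.

```python
import math
import random


def filter_by_threshold(scores, threshold):
    """Stage 1 (filter): keep indices of features whose score reaches the threshold.

    `scores` stands in for per-feature MIC values; `threshold` stands in for
    the deletion threshold that the patent learns with Q-Learning.
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


def binary_pso(fitness, n_bits, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Stage 2 (wrapper): plain binary PSO that maximizes `fitness(mask)`.

    Each particle is a 0/1 mask over the features kept by stage 1.
    Returns the best mask found and its fitness.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp to keep exp() well-behaved
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

In use, the caller would pass the stage-1 result as the search space, for example `kept = filter_by_threshold(scores, 0.5)` followed by `binary_pso(fitness, len(kept))`, where `fitness` is a caller-supplied function such as the cross-validated accuracy of a classifier on the masked features.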

[0088] Step 1. Obtain the data set and process it;

[0089] Download the microarray data set from the Internet, organize the feature information of the data on the host computer, and mark the classification labels of all samples. Finally, remove the serial number of each sample and delete the samples with missing values to obtain the processed data set;
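The cleaning described in [0089] might look like the following minimal Python sketch. The row layout is an assumption, not something the source states: the serial number is taken to be the first column, the class label the last column, and the tokens in `missing` are hypothetical markers for missing values. The function name `clean_dataset` is also hypothetical.

```python
def clean_dataset(rows, id_col=0, label_col=-1,
                  missing=("", "NA", "NaN", "?")):
    """Drop the serial-number column and discard samples with missing values.

    `rows` is a list of equally long lists of strings, one per sample.
    Returns (features, labels) with features converted to floats.
    """
    missing_tokens = set(missing)
    features, labels = [], []
    for row in rows:
        # Skip any sample that contains a missing-value token.
        if any(str(v).strip() in missing_tokens for v in row):
            continue
        # Normalize possibly negative column indices.
        id_idx = id_col % len(row)
        label_idx = label_col % len(row)
        labels.append(row[label_idx])
        features.append([float(v) for j, v in enumerate(row)
                         if j not in (id_idx, label_idx)])
    return features, labels
```

For example, a three-sample table in which the second sample has an empty expression value would yield two cleaned samples, each with its serial number removed and its label separated from the feature vector.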

[0090] In this embodiment, 15 medical-related microarray data sets are obtaine...


Abstract

The invention discloses a high-dimensional data classification method based on two-stage mixed feature selection. The method comprises the following steps: obtaining a processed data set; preprocessing the processed data set based on a maximum information coefficient (MIC) method to obtain an MIC matrix; obtaining a selected feature subset; performing fine search on the selected feature subset by using an improved PSO algorithm to obtain an optimal feature subset; updating features in the processed data set obtained in the step S1 according to the optimal feature subsets, establishing a training set and a test set for ten-fold cross validation according to the updated data set, and sequentially inputting the training set and the test set into a KNN classifier of which K is equal to 1 to obtain the classification accuracy of the corresponding ten optimal feature subsets; and taking the average value of the classification accuracy rates of the ten optimal feature subsets as the accuracy rate of the optimal feature subsets.
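The evaluation step in the abstract (ten-fold cross validation with a KNN classifier where K equals 1, then averaging the ten accuracies) can be sketched in plain Python as follows. This is an illustrative sketch under assumptions: Euclidean distance and a seeded random, non-stratified fold split are choices made here, since the source does not specify them, and the function names are hypothetical.

```python
import math
import random


def predict_1nn(train_X, train_y, x):
    """Return the label of the training sample closest to x (Euclidean distance)."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[nearest]


def cross_val_accuracy_1nn(X, y, n_folds=10, seed=0):
    """Average 1-NN accuracy over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into n_folds disjoint test folds.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    accuracies = []
    for fold in folds:
        if not fold:  # can happen when there are fewer samples than folds
            continue
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(predict_1nn(train_X, train_y, X[i]) == y[i] for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)
```

To score a candidate feature subset, the caller would first restrict every sample in `X` to the selected feature columns and then call `cross_val_accuracy_1nn(X_selected, y)`.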

Description

Technical field

[0001] The present invention relates to technical fields such as reinforcement learning, feature selection, pattern recognition, machine learning, etc., and specifically relates to a high-dimensional data classification method based on two-stage mixed feature selection.

Background technique

[0002] With the rapid development of science and technology, more and more data are collected in machine learning tasks. There are a large number of irrelevant and redundant features in these data, which will reduce the prediction accuracy of the model and increase the computational complexity. Therefore, how to filter out the features most relevant to the task to be solved has become an urgent problem in machine learning and pattern recognition. As an effective tool for reducing the feature dimension, feature selection can eliminate useless features in the original data according to a given evaluation standard, save computing costs and improve prediction accuracy. In ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/62

CPC: G06F18/24147, G06F18/214

Inventors: 李欣倩, 沈琪浩, 任佳

Owner: ZHEJIANG SCI-TECH UNIV
