Local scale parameter, entropy and cosine similarity-based spectral feature selection method

A cosine-similarity and scale-parameter technique, applied in fields such as electrical digital data processing, special-purpose data processing applications, and unstructured text data retrieval. It addresses the influence of outliers and the problem that a uniform scale parameter σ cannot completely and accurately reflect data distribution information, among other issues.

Active Publication Date: 2018-02-09
SHAANXI NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0036] To address the deficiencies of the prior art, the present invention proposes a spectral feature selection method based on local scale parameters, entropy and cosine similarity, which overcomes the defect that the uniform scale parameter σ cannot completely



Examples


Embodiment

[0197] The present invention verifies the validity of the four newly proposed feature selection methods based on spectral clustering, and compares their performance with the multi-cluster feature selection method MCFS and the Laplacian score feature selection method. To this end, the four proposed methods (FSSC_OE, FSSC_OC, FSSC_SE, FSSC_SC) and the two comparison methods (MCFS and Laplacian score) are each used to perform feature selection and obtain the corresponding feature subsets. For each feature subset, an SVM classifier is constructed from the training samples, and the classifiers are compared on accuracy, sensitivity, specificity and other indicators.
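The indicators named in [0197] can be illustrated with a small sketch. It assumes a binary task and the standard confusion-matrix definitions of accuracy, sensitivity and specificity; the function name `binary_metrics` and the `positive` argument are hypothetical and do not come from the patent.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for a binary task.

    Assumption: `positive` marks the positive class; every other
    label value is treated as negative.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else math.nan,
        # Sensitivity: fraction of positive samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Specificity: fraction of negative samples that were rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, predicted))
```

In a comparison like the one described, the same function would be applied to the predictions of the classifier trained on each method's feature subset.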

[0198] A 10-fold cross-validation experiment is used. Starting from 10 empty sample sets, the samples of each class are added to the 10 sets one at a time, in turn, until every sample of that class has been added, so that the samples are evenly divided into 10 shares....
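The fold construction in [0198] might be sketched as follows. This is an illustration under assumptions: the function name `round_robin_folds` is hypothetical, and the sketch assumes the dealing position carries over from one class to the next (the source text does not say whether it restarts at the first set for each class).

```python
from collections import defaultdict


def round_robin_folds(labels, n_folds=10):
    """Deal sample indices into n_folds sets, one class at a time.

    Assumption: the next-fold pointer continues across classes,
    which keeps the fold sizes as even as possible.
    """
    folds = [[] for _ in range(n_folds)]  # the sets start empty
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    next_fold = 0
    for members in by_class.values():
        for index in members:
            folds[next_fold].append(index)
            next_fold = (next_fold + 1) % n_folds
    return folds


if __name__ == "__main__":
    labels = [0] * 25 + [1] * 15
    for fold_id, fold in enumerate(round_robin_folds(labels)):
        print(fold_id, fold)
```

Each fold then serves once as the test set while the remaining nine form the training set.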


Abstract

The invention discloses a spectral feature selection method based on local scale parameters, entropy and cosine similarity. A Gaussian kernel function is adopted as the similarity measure, and local scale parameters, defined for each feature from the local standard deviation of the features, are used as the kernel function parameters. This overcomes two problems: a uniform scale parameter cannot reflect data distribution information when the feature affinity matrix is calculated, and local scale parameters are affected by outliers. Entropy ranking and cosine similarity ranking are each used to measure feature importance, so that a suitable feature subset can be selected quickly. The method provides technical support for the analysis of data on diseases such as tumours and has important biomedical significance.
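A Gaussian-kernel feature affinity matrix with per-feature scale parameters might look like the sketch below. The exact definition of the local scale parameter is not given in this excerpt, so the sketch makes loud assumptions: the scale of a feature is taken as the standard deviation of the distances to its `k` nearest other features, and the kernel uses the self-tuning form exp(-d²/(σ_i σ_j)). The names `local_scales`, `feature_affinity`, `k` and `eps` are hypothetical.

```python
import math
import statistics


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def local_scales(features, k=3, eps=1e-12):
    """Assumed local scale: std of distances to the k nearest features."""
    scales = []
    for i, f in enumerate(features):
        dists = sorted(
            euclidean(f, g) for j, g in enumerate(features) if j != i
        )
        # eps guards against a zero scale (e.g. k == 1 or tied distances).
        scales.append(statistics.pstdev(dists[:k]) + eps)
    return scales


def feature_affinity(features, k=3):
    """Gaussian affinity A[i][j] = exp(-d(i, j)^2 / (sigma_i * sigma_j)).

    `features` holds one vector per feature (a column of the data matrix).
    """
    sigma = local_scales(features, k)
    n = len(features)
    return [
        [
            math.exp(
                -euclidean(features[i], features[j]) ** 2
                / (sigma[i] * sigma[j])
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    feats = [
        [1.0, 2.0, 3.0],
        [1.1, 2.1, 2.9],
        [5.0, 1.0, 0.0],
        [4.8, 1.2, 0.1],
        [0.0, 9.0, 4.0],
    ]
    for row in feature_affinity(feats):
        print([round(a, 4) for a in row])
```

With a single global σ, every pair would be compared on the same scale; letting each feature carry its own σ is what allows dense and sparse regions of the feature space to be treated differently.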

Description

Technical field

[0001] The invention belongs to the field of gene microarray data and text data analysis technology, and relates to a spectral feature selection method based on local scale parameters, entropy and cosine similarity.

Background

[0002] Feature selection is the primary task in the analysis of high-dimensional big data such as gene microarray data and text data [1,2]. Its goal is to remove irrelevant or redundant features and select a feature subset with good discriminative ability, preserving as much of the classification information of the original feature set as possible. Depending on whether the selection process uses sample class label information, feature selection algorithms are divided into supervised methods and unsupervised methods [3]. Supervised feature selection methods perform feature selection by calculating the correlation between each feature and the class label column, while unsupervised feature selection methods...
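The supervised approach mentioned in the background, ranking features by their correlation with the class label column, can be illustrated with a minimal sketch. It assumes numeric class labels and uses the absolute Pearson correlation as the score; this is a generic illustration of supervised ranking, not the method of the invention, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features_by_label_correlation(samples, labels):
    """Return (feature indices sorted by score, scores per feature).

    `samples` is a list of rows; the score of feature j is
    |corr(column j, labels)|.
    """
    n_features = len(samples[0])
    scores = [
        abs(pearson([row[j] for row in samples], labels))
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    data = [[0.1, 5, 1], [0.2, 3, 1], [0.9, 4, 1], [1.0, 6, 1]]
    classes = [0, 0, 1, 1]
    print(rank_features_by_label_correlation(data, classes))
```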


Application Information

IPC(8): G06F17/30
CPC: G06F16/35
Inventors: 谢娟英, 周颖, 丁丽娟
Owner: SHAANXI NORMAL UNIV