Distributed column subset selection method and system and leukemia gene information mining method
A distributed column subset selection technique, applied in the field of big data processing, that addresses the problems of time-consuming computation, the failure of existing algorithms to achieve linear speedup, and poor reliability.
Embodiment Construction
[0045] The two-stage algorithm assumes that all subsets have the same quality, and this assumption is the main reason existing techniques fall short. In practice, the quality of subsets often differs; ignoring this difference wastes time and resources and can even affect the final result of feature selection. In this application, the quality of a subset is measured by the number of optimal features it contains. Specifically, a data set must contain k most representative features, called the k optimal features; the more optimal features a subset contains, the higher the quality of that subset. For the CSS (column subset selection) problem, the optimal solution S_OPT is defined as the set containing the k optimal features: among all combinations of k features, it is the combination with the strongest ability to fit the original data set.
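The definitions in [0045] can be illustrated with a minimal sketch. It is not the method of the application. It assumes that "ability to fit the original data set" is measured by the Frobenius norm of the residual left after projecting the data matrix onto the span of the selected columns, which is a common CSS objective but is not stated in the text above. The function names (residual, brute_force_s_opt, subset_quality) are hypothetical, and the exhaustive search is only practical for tiny examples.

```python
from itertools import combinations
from math import sqrt


def column(A, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def remove_projection(v, basis):
    """Subtract from v its projection onto each orthonormal basis vector."""
    for q in basis:
        c = dot(v, q)
        v = [vi - c * qi for vi, qi in zip(v, q)]
    return v


def residual(A, cols):
    """Frobenius norm of A minus its projection onto span(A[:, cols]).

    Assumption: this residual is the 'fit' criterion; the source does
    not name a specific measure.
    """
    # Build an orthonormal basis of the selected columns (Gram-Schmidt).
    basis = []
    for j in cols:
        v = remove_projection(column(A, j), basis)
        norm = sqrt(dot(v, v))
        if norm > 1e-12:  # skip columns that add no new direction
            basis.append([vi / norm for vi in v])
    # Sum the squared norm of what is left of every column of A.
    total = 0.0
    for j in range(len(A[0])):
        r = remove_projection(column(A, j), basis)
        total += dot(r, r)
    return sqrt(total)


def brute_force_s_opt(A, k):
    """Try every k-column combination and keep the best-fitting one."""
    n = len(A[0])
    best = min(combinations(range(n), k), key=lambda s: residual(A, s))
    return set(best)


def subset_quality(subset, s_opt):
    """Quality of a subset = number of optimal features it contains."""
    return len(set(subset) & s_opt)


# Toy data: columns 0 and 1 carry almost all of the energy.
A = [
    [10.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.2],
]
s_opt = brute_force_s_opt(A, 2)
print(s_opt)                          # {0, 1}
print(subset_quality([0, 2], s_opt))  # 1
print(subset_quality([2, 3], s_opt))  # 0
```

In this toy example the partition {0, 2} holds one optimal feature and the partition {2, 3} holds none, so the two subsets differ in quality even though they are the same size, which is the difference the two-stage algorithm ignores.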
[0046] Figure 1 is a schematic flow chart of the method of the present invention: the distributed column su...