Attribute selection method based on information gain ratio in fuzzy rough sets

An attribute selection technique based on information gain ratio, applicable to fuzzy logic-based systems, character and pattern recognition, instruments, and similar fields. It addresses two problems, namely that weakly relevant attributes are not removed and that the selected attributes may be redundant, and thereby improves data quality.

Status: Inactive
Publication date: 2017-09-22
浙江象立医疗科技有限公司

AI Technical Summary

Problems solved by technology

[0005] However, the existing attribute selection method based on information gain ratio in fuzzy rough sets (Dai J, Xu Q. Attribute selection based on information gain ratio in fuzzy rough set theory with application to tumor classification[J]. Applied Soft Computing, 2013, 13(1): 211-221.) has two shortcomings:
(1) Weakly relevant attributes are not removed, so they may be selected into the result.
(2) The attribute selection result may contain redundant attributes.



Examples

Experimental example

[0033] In the medical field, using machine learning algorithms to diagnose diseases has become a new trend. Compared with traditional manual diagnosis, diagnosis by machine learning algorithms is more efficient and more accurate. However, data collected in real life often contain a lot of noise and many redundant attributes. Training a model on such data is inefficient and yields low accuracy, so preprocessing to remove redundant attributes and noise is an essential step. In this example, the proposed method is applied to attribute reduction on the Breast Cancer Wisconsin (Diagnostic) dataset from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml), and the validity of the result is verified. The features of the Breast Cancer Wisconsin (Diagnostic) dataset are extracted from fine needle aspiration (FNA) images of breast masses. These features describe the properties of the cell nuclei in the image. The dataset has only two classes: benign an...


Abstract

The invention discloses an attribute selection method based on information gain ratio in fuzzy rough sets. Within the fuzzy rough set framework, the method first calculates the information gain ratio of each attribute and removes the attributes whose information gain ratio is small. It then calculates the information gain ratio of each attribute not yet selected, selects the attribute with the maximum information gain ratio, and adds it to the attribute selection result. This selection step is repeated until the maximum information gain ratio is 0 or the set of unselected attributes is empty, after which redundant attributes are removed from the selection result. Compared with the existing attribute selection method based on information gain ratio in fuzzy rough sets, the method further eliminates irrelevant and redundant attributes, which improves data quality, speeds up data processing, and improves the generalization ability of the classifier.
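The three stages described in the abstract (pre-filtering, greedy selection, redundancy removal) can be sketched as follows. This is only an illustrative sketch, not the patented procedure: the abstract does not give the formulas, so the similarity relation (1 - |x_i - x_j| on attributes scaled to [0, 1], combined with min), the entropy definitions, the redundancy test (dropping an attribute leaves the conditional entropy unchanged), and the names `select_attributes`, `min_ratio`, and `eps` are all assumptions introduced here.

```python
import math

def relation(X, attrs):
    """Assumed fuzzy similarity relation of an attribute subset (min over attributes)."""
    n = len(X)
    return [[min((1.0 - abs(X[i][a] - X[j][a]) for a in attrs), default=1.0)
             for j in range(n)] for i in range(n)]

def entropy(R):
    """Assumed fuzzy entropy: -(1/n) * sum_i log2(|[x_i]_R| / n)."""
    n = len(R)
    return -sum(math.log2(sum(row) / n) for row in R) / n

def cond_entropy(R_D, R_B):
    """Assumed conditional entropy of the decision relation R_D given R_B."""
    total = 0.0
    for row_b, row_d in zip(R_B, R_D):
        inter = sum(min(b, d) for b, d in zip(row_b, row_d))
        total += math.log2(inter / sum(row_b))
    return -total / len(R_B)

def gain_ratio(X, R_D, a, selected):
    """Information gain of adding attribute a, divided by the entropy of a alone."""
    split = entropy(relation(X, [a]))
    if split == 0.0:  # constant attribute: carries no information
        return 0.0
    gain = (cond_entropy(R_D, relation(X, selected))
            - cond_entropy(R_D, relation(X, selected + [a])))
    return gain / split

def select_attributes(X, y, min_ratio=0.0, eps=1e-12):
    """X: rows of attribute values scaled to [0, 1]; y: class labels."""
    R_D = [[1.0 if yi == yj else 0.0 for yj in y] for yi in y]
    # Stage 1: drop attributes whose own gain ratio is small (threshold is hypothetical).
    candidates = [a for a in range(len(X[0]))
                  if gain_ratio(X, R_D, a, []) > min_ratio]
    selected = []
    # Stage 2: repeatedly add the unselected attribute with the largest gain ratio.
    while candidates:
        best = max(candidates, key=lambda a: gain_ratio(X, R_D, a, selected))
        if gain_ratio(X, R_D, best, selected) <= eps:
            break  # maximum gain ratio is 0: nothing left to gain
        selected.append(best)
        candidates.remove(best)
    # Stage 3: remove attributes whose removal does not raise the conditional entropy.
    for a in list(selected):
        rest = [b for b in selected if b != a]
        if (cond_entropy(R_D, relation(X, rest))
                <= cond_entropy(R_D, relation(X, selected)) + eps):
            selected = rest
    return selected
```

On a toy table where one attribute separates the classes, a second attribute duplicates the first, and a third is constant, the sketch drops the constant attribute in the first stage and stops before adding the duplicate, because the duplicate's gain ratio given the first attribute is 0.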

Description

Technical field

[0001] The invention relates to an attribute selection method, in particular to an attribute selection method based on information gain ratio in fuzzy rough sets.

Background

[0002] Real-world data collection is often accompanied by noise, which makes mathematical tools for handling uncertainty particularly important. Compared with other theories for dealing with uncertain and imprecise problems, rough set theory requires no prior knowledge beyond the data set to be processed. Because of this advantage in handling uncertain data, rough sets have been widely used in fields such as classification and clustering, and attribute selection is one of their most important applications. Attribute selection can eliminate redundant, irrelevant attributes from a large number of attributes, thereby improving data quality, speeding up data processing, and improving the generalization ability of classifi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N7/02
CPC: G06N7/02, G06F18/285
Inventors: 代建华, 郑国杰, 胡虎
Owner: 浙江象立医疗科技有限公司