Data classification method based on ID3 algorithm
A data classification technique based on the ID3 algorithm, applied in computing and related fields, that addresses problems such as multi-valued bias and improves prediction accuracy.
Examples
Embodiment 1
[0074] The commercial car-purchase customer database (shown in Table 1) is used as the training set D; the sample set is obtained after the data are selected, preprocessed and converted. The set contains 4 conditional attributes: favorite season (4 attribute values: spring, summer, autumn, winter), whether the customer is a business person (2 attribute values: yes, no), income (3 attribute values: high, medium, low), and driving level (2 attribute values: good, average). The sample set is divided according to the category attribute "whether to buy a car" (2 attribute values: buy, not buy).
[0075]
[0076] The data classification method of the present invention is used to classify the attributes in the training set D, specifically as follows:
[0077] Step 1. Obtain the information entropy I, the conditional entropy E(A_i) and the information gain Gain(A_i):
[0078] Step 1.1, calculate the information entropy I of the classification...
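The three quantities named in Step 1 can be sketched as follows. This is a minimal illustration using the standard ID3 definitions (entropy in bits, information gain as entropy minus conditional entropy); it does not reproduce the patent's own formulas, which are not shown here and may differ in how they correct multi-valued bias. The function names, the attribute keys and the toy rows are assumptions made for illustration and are not the data from Table 1.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Information entropy I of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(rows, attr, target):
    """Conditional entropy E(A_i): weighted entropy of the target after splitting on attr."""
    total = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(rows, attr, target):
    """Standard ID3 gain: Gain(A_i) = I - E(A_i)."""
    return entropy([r[target] for r in rows]) - conditional_entropy(rows, attr, target)


# Hypothetical toy rows for illustration only, NOT the contents of Table 1.
rows = [
    {"income": "high", "driving_level": "good", "buy": "buy"},
    {"income": "high", "driving_level": "average", "buy": "buy"},
    {"income": "low", "driving_level": "good", "buy": "not buy"},
    {"income": "low", "driving_level": "average", "buy": "not buy"},
]

gains = {a: information_gain(rows, a, "buy") for a in ("income", "driving_level")}
best = max(gains, key=gains.get)  # attribute with the largest gain is split on first
```

In plain ID3 the attribute with the largest gain becomes the next split node; attributes with many distinct values tend to score higher, which is the multi-valued bias the method aims to reduce.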
Embodiment 2
[0125] The Benxi Formation database of the Sulige Gas Field (shown in Table 2) is used as the training set Y. Based on the single-layer gas test data of Block X in the Sulige Gas Field over the years, 4 conditional attributes are selected: "effective thickness", "shale content", "matrix permeability" and "gas saturation".
[0126]
[0127]
[0128] The k-means cluster analysis method was used to select, preprocess and transform the data. Take the effective thickness as an example:
[0129] Step 1. Randomly pick three values among the effective thicknesses of these 15 wells: μ_1 = 3.3, μ_2 = 5.5, μ_3 = 6.6;
[0130] Step 2. Use formula (11) to calculate the cluster to which the effective thickness of each well belongs, giving 3 clusters;
[0131] Step 3. For each cluster, use formula (12) to recalculate the centroid: μ_1 = 4, μ_2 = 6.
[0132] The effective thickness can be divided into three intervals {y<4, 4≤y≤6, 6
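The three k-means steps above can be sketched in one dimension as follows. Formulas (11) and (12) are not reproduced in this excerpt, so the sketch assumes the usual k-means rules: assign each value to its nearest centroid, then recompute each centroid as the mean of its cluster. The function name `kmeans_1d` and the 15 thickness values are hypothetical and are not the well data from Table 2; only the initial centroids 3.3, 5.5 and 6.6 come from the text.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """One-dimensional k-means: returns (centroids, clusters) after convergence."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step (assumed form of formula (11)): nearest centroid wins.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            clusters[nearest].append(v)
        # Update step (assumed form of formula (12)): mean of each cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[k] for k, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical effective-thickness values for 15 wells, NOT the data from Table 2.
thickness = [2.1, 2.8, 3.0, 3.3, 3.6, 4.5, 4.9, 5.2, 5.5, 5.8, 6.3, 6.6, 7.0, 7.4, 8.1]

# Initial centroids taken from Step 1 of the text.
centroids, clusters = kmeans_1d(thickness, [3.3, 5.5, 6.6])
```

Once the clustering has settled, each continuous value can be replaced by the label of its interval, which gives ID3 the discrete attribute values it requires.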