A data flow detection method based on fuzzy c-means clustering algorithm and entropy theory
A detection method based on fuzzy c-means clustering, applied in computing, computer parts, character and pattern recognition, and related fields. It addresses problems such as data not being processed as accurately as possible and degraded performance.
Examples
Embodiment 1
[0032] We selected one artificial dataset and two real datasets for the experiments. The real data was downloaded from the open UCI database. The first is real data without concept drift, the Seeds dataset, which includes three categories (Kama, Rosa and Canadian), each with 70 samples and seven attributes. Figure 1 shows that FCM clusters the data with good accuracy; Figure 2 is the entropy value curve of the data. The ordinate shows that when the data is classified well and no attribute-change concept drift occurs, the entropy value is relatively low.
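This section does not state how the entropy value is computed. As a minimal sketch, assuming it is the Shannon entropy of each sample's FCM membership vector (an assumption, not confirmed by the text), the relation between clear classification and low entropy could look like this. The function name `membership_entropy` is hypothetical.

```python
import numpy as np

def membership_entropy(u, eps=1e-12):
    """Shannon entropy of each row of a fuzzy membership matrix.

    u has shape (n_samples, n_clusters); each row is assumed to sum to 1.
    Values are clipped to eps so that log(0) is never evaluated.
    """
    u = np.clip(np.asarray(u, dtype=float), eps, 1.0)
    return -np.sum(u * np.log(u), axis=1)

# Two illustrative membership vectors for three clusters (made-up values):
# the first sample belongs clearly to one cluster, the second is fully ambiguous.
u_example = [[1.0, 0.0, 0.0],
             [1 / 3, 1 / 3, 1 / 3]]
print(membership_entropy(u_example))
```

Under this assumption, a sample assigned almost entirely to one cluster has entropy near zero, while a sample spread evenly over three clusters reaches the maximum, log(3).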
Embodiment 2
[0034] Gaussian datasets are used to detect concept drift. The two sets of Gaussian data follow the distributions N([2;2], 1) and N([4;4], 8). The data flow length is 1000, and the concept drift length is 400. Figure 3 shows the classification of the two sets of Gaussian data. Because their means and variances differ, the data attributes have changed, and attribute-change concept drift occurs at the junction. Figure 4 is the entropy curve of the data flow. The peak of the entropy curve appears at the junction, indicating that attribute-change concept drift has occurred; after that, the entropy value tends to be stable, indicating that the current system can adapt to the new data flow and does not need to update its parameters.
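The experiment above can be approximated with a short sketch. Several details are assumptions that the text does not settle: the second parameter of N(., .) is read as an isotropic variance, "concept drift length 400" is read as the last 400 of the 1000 samples coming from the second distribution, the entropy is taken as the Shannon entropy of each FCM membership vector, and the 50-sample smoothing window is arbitrary. The function `fcm` is a plain textbook fuzzy c-means, not necessarily the patent's variant, so the resulting curve is not a reproduction of Figure 4.

```python
import numpy as np

def fcm(x, c=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)          # rows sum to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Distance from every sample to every center, floored to avoid 0-division.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u

# Assumed stream layout: 600 samples before the drift, 400 after it.
rng = np.random.default_rng(42)
before = rng.normal(loc=2.0, scale=1.0, size=(600, 2))          # N([2;2], 1)
after = rng.normal(loc=4.0, scale=np.sqrt(8.0), size=(400, 2))  # N([4;4], 8)
stream = np.vstack([before, after])

centers, u = fcm(stream, c=2)

# Per-sample membership entropy (assumed definition), then a moving average.
entropy = -np.sum(u * np.log(np.clip(u, 1e-12, 1.0)), axis=1)
window = 50
smoothed = np.convolve(entropy, np.ones(window) / window, mode="valid")
```

Plotting `smoothed` against the sample index gives an entropy curve that can be compared qualitatively with the behaviour described for Figure 4.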
Embodiment 3
[0036] Power supply dataset. This dataset collects 24-hour mainnet and subnet power supply data, with 1247 samples per hour. The experiment selected data from three time periods: 0:00, 1:00 and 21:00. First, the data at 0:00 and 21:00 are used. Because 21:00 is the peak point of electricity consumption, attribute-change concept drift can be considered to have occurred relative to 0:00. Figure 5 is the entropy value curve of these two sets of data at the junction. The entropy value increases significantly there and decreases once the data is stable. Figure 6 is the entropy curve of the data flow at 0:00 and 1:00. The power consumption at 0:00 and 1:00 is similar, so this can be regarded as a data flow without concept drift, and the entropy value curve is stable.
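The text describes the drift signal only qualitatively (a significant rise in entropy at the junction, followed by a fall). One possible way to turn that into a check, offered purely as an illustration, is to compare a rolling mean of the entropy curve with a baseline taken from the start of the stream. The function name `flag_entropy_rise`, the baseline length, the window size and the factor of 2 are all hypothetical choices, and the curve below is synthetic rather than the power supply data.

```python
import numpy as np

def flag_entropy_rise(entropy, baseline_len=100, window=20, factor=2.0):
    """Return start indices of windows whose mean entropy exceeds
    factor times the mean of the first baseline_len values."""
    entropy = np.asarray(entropy, dtype=float)
    baseline = entropy[:baseline_len].mean()
    kernel = np.ones(window) / window
    # rolling[i] is the mean of entropy[i : i + window]
    rolling = np.convolve(entropy, kernel, mode="valid")
    return np.flatnonzero(rolling > factor * baseline)

# Synthetic entropy curve: low, a raised stretch at samples 200..249, low again.
curve = np.concatenate([np.full(200, 0.1), np.full(50, 0.65), np.full(200, 0.1)])
flagged = flag_entropy_rise(curve)
print(flagged.min(), flagged.max())
```

On a stream without drift, such as the 0:00 and 1:00 pair described above, a stable entropy curve would be expected to produce no flagged windows under this rule.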