MapReduce-based distributed particle swarm clustering algorithm

A distributed particle swarm technique, applied in computing and computing models, that addresses the high time and space complexity that drives up the cost of clustering algorithms. It solves the clustering problem with good scalability and speedup.

Status: Inactive. Publication Date: 2020-09-22
JIANGSU COLLEGE OF INFORMATION TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

At present, most sequential clustering algorithms scale poorly as the data set grows, and their high time and space complexity drives up the cost of clustering.

Embodiment Construction

[0038] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0039] Referring to Figures 1 to 6, the present invention is a distributed particle swarm clustering algorithm based on MapReduce, with the following steps:

[0040] Step 1: Use a MapReduce job to update the particle swarm centroids;

[0041] Step 2: Use a MapReduce job to evaluate the fitness of the population with the new particle centroids generated in step 1, and calculate the new fitness value of the updated population. The fitness evaluation is based on a fitness function that measures the distances between all data points and the particle centroids as an average distance;

[0042] Step 3: Merge the fitness value calculated in step 2 with the update group generated in step 1, and upda...
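The fitness evaluation in step 2 can be sketched as a map phase and a reduce phase in plain Python. This is a minimal single-process illustration, not the patent's implementation. It assumes that the fitness of a particle is the average distance from each data point to the nearest of that particle's centroids, which is one reading of the description above. The names `fitness_map`, `fitness_reduce` and `evaluate_fitness` are hypothetical, and the shuffle step is simulated with a dictionary.

```python
import math
from collections import defaultdict

def fitness_map(point, particles):
    """Map: for one data point, emit (particle_id, distance to nearest centroid)."""
    for pid, centroids in particles.items():
        yield pid, min(math.dist(point, c) for c in centroids)

def fitness_reduce(pid, distances):
    """Reduce: average all distances emitted for one particle."""
    distances = list(distances)
    return pid, sum(distances) / len(distances)

def evaluate_fitness(points, particles):
    """Run map, group by key (simulated shuffle), then reduce."""
    grouped = defaultdict(list)
    for point in points:
        for pid, dist in fitness_map(point, particles):
            grouped[pid].append(dist)
    return dict(fitness_reduce(pid, ds) for pid, ds in grouped.items())

# Two particles, each holding two candidate cluster centroids (assumed layout).
points = [(0.0, 0.0), (10.0, 10.0)]
particles = {
    0: [(0.0, 0.0), (10.0, 10.0)],  # centroids sit exactly on the data
    1: [(0.0, 0.0), (0.0, 0.0)],    # both centroids at the origin
}
result = evaluate_fitness(points, particles)
```

Under this assumed fitness, a lower value means the particle's centroids lie closer to the data, so particle 0 scores better than particle 1 here.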

Abstract

The invention relates to the field of artificial intelligence big data analysis, in particular to a MapReduce-based distributed particle swarm clustering algorithm comprising the following steps: 1, updating the particle swarm centroids with a MapReduce job; 2, using a MapReduce job to evaluate the fitness of the population with the new particle centroids generated in step 1 and to calculate a new fitness value for the updated population, where the fitness evaluation is based on a fitness function that measures the distances between all data points and the particle centroids as an average distance; 3, merging the fitness values calculated in step 2 with the updated population generated in step 1, and updating the individual best centroids and the global best centroids at the same time; then returning to step 1 for the next iteration. The method effectively solves the clustering problem for very large commercial data sets and achieves high-quality clustering.
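One iteration of the three steps can be sketched in a single process as follows. The source does not state the update equations, so this sketch assumes the standard inertia-weight particle swarm update. The coefficients `w`, `c1` and `c2`, the particle dictionary layout, and the function names are all illustrative assumptions. In a MapReduce setting, steps 1 and 2 would each run as a separate job.

```python
import math
import random

def fitness(centroids, points):
    """Assumed fitness: mean distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)

def init_swarm(points, n_particles, k, rng):
    """Each particle holds k centroids, seeded from randomly chosen data points."""
    swarm = []
    for _ in range(n_particles):
        pos = [list(rng.choice(points)) for _ in range(k)]
        vel = [[0.0] * len(pos[0]) for _ in range(k)]
        swarm.append({"pos": pos, "vel": vel,
                      "pbest": pos, "pbest_fit": fitness(pos, points)})
    return swarm

def update_particle(particle, gbest, rng, w=0.72, c1=1.49, c2=1.49):
    """Step 1: standard PSO velocity and position update (an assumption)."""
    new_pos, new_vel = [], []
    for k, centroid in enumerate(particle["pos"]):
        vel_k, pos_k = [], []
        for d, x in enumerate(centroid):
            v = (w * particle["vel"][k][d]
                 + c1 * rng.random() * (particle["pbest"][k][d] - x)
                 + c2 * rng.random() * (gbest[k][d] - x))
            vel_k.append(v)
            pos_k.append(x + v)
        new_vel.append(vel_k)
        new_pos.append(pos_k)
    return {**particle, "pos": new_pos, "vel": new_vel}

def iterate(swarm, gbest, gbest_fit, points, rng):
    # Step 1: update every particle's centroids.
    swarm = [update_particle(p, gbest, rng) for p in swarm]
    # Step 2: evaluate the fitness of the updated population.
    fits = [fitness(p["pos"], points) for p in swarm]
    # Step 3: merge fitness values with the swarm, update personal and global bests.
    for p, f in zip(swarm, fits):
        if f < p["pbest_fit"]:
            p["pbest"], p["pbest_fit"] = p["pos"], f
        if f < gbest_fit:
            gbest, gbest_fit = p["pos"], f
    return swarm, gbest, gbest_fit

rng = random.Random(0)
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
swarm = init_swarm(points, n_particles=5, k=2, rng=rng)
best = min(swarm, key=lambda p: p["pbest_fit"])
gbest, gbest_fit = best["pos"], best["pbest_fit"]
start_fit = gbest_fit
for _ in range(10):
    swarm, gbest, gbest_fit = iterate(swarm, gbest, gbest_fit, points, rng)
```

Because the best positions are replaced only when a strictly lower fitness is found, the global best fitness cannot get worse from one iteration to the next.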

Description

Technical field

[0001] The invention relates to the field of artificial intelligence big data analysis, in particular to a distributed particle swarm clustering algorithm based on MapReduce.

Background technique

[0002] With the development of Internet technology, the data that needs to be stored, analyzed, and processed has shown explosive growth. In addition to the huge amount of data, the data created or collected is also becoming more and more complex. Addressing how to efficiently generate, manage and analyze data and derive resulting information requires a comprehensive, end-to-end approach covering all stages from initial data acquisition to final analysis. Clustering is a data mining technique used when analyzing data. The main goal of the clustering algorithm is to divide a group of unlabeled data objects into different clusters, so that the cluster members have common specifications and relatively similar membership. In order to obtain high-quality clustering, t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/00
CPC: G06N3/006
Inventor: 赵彦
Owner: JIANGSU COLLEGE OF INFORMATION TECH