Optimization theory-based parallel clustering method with scale constraint

A technology based on optimization theory and clustering methods, applied in character and pattern recognition, instruments, computer components, etc. It addresses problems such as poor clustering quality, inability to process certain data sets, and dependence on prior knowledge, and it achieves high applicability and easy deployment.

Inactive Publication Date: 2020-07-10
TONGJI UNIV
0 Cites, 3 Cited by

AI Technical Summary

Benefits of technology

This technology improves the efficiency of cluster analysis for big data applications in which each cluster must satisfy a scale (size) constraint. It uses an iterative method that introduces a Lagrange multiplier vector, decomposes the clustering problem into sub-problems, and solves those sub-problems in parallel. According to the abstract, a large-scale data set is processed in a finite number of iterations and a high-quality clustering result is obtained.

Problems solved by technology

This patent addresses clustering with scale constraints for applications such as big data analytics. The data set is decomposed into sub-problems that can be solved in parallel, subject to a constraint on the scale of each cluster. Existing approaches suffer from poor clustering quality, cannot be executed on certain data sets, or depend on prior knowledge.



Examples


Embodiment 1

[0113] Based on the Spark distributed computing process shown in Figure 1, the clustering results of the present invention are compared with those of the interleaving group convolution algorithm (IGC), the fuzzy clustering algorithm (FCM), and the K-Means clustering algorithm.

[0114] As shown in Figure 2, in terms of the number of iterations, the convergence speed of the proposed method is relatively stable. Even on large-scale test data sets, the solution is still completed within a limited number of iterations and high-quality clustering results are obtained. By contrast, the IGC algorithm is more sensitive to the size of the data set and has difficulty converging in large-scale tests.

[0115] In the comparison of clustering quality, the clustering result of the K-Means algorithm is used as the standard (since K-Means has no size constraints, its within-cluster sum of squares (WCSS) is the smallest), the large...
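The WCSS quality measure mentioned in [0115] can be computed as in the following minimal sketch. The function name `wcss` and the toy data are illustrative assumptions and do not come from the patent text.

```python
import numpy as np

def wcss(points, labels):
    """Within-cluster sum of squares: the sum, over all points, of the
    squared Euclidean distance to the mean of the point's own cluster."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        members = points[labels == k]        # points assigned to cluster k
        centroid = members.mean(axis=0)      # cluster mean
        total += float(((members - centroid) ** 2).sum())
    return total

# Hypothetical toy data: two well-separated groups on a line.
toy_points = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
natural = wcss(toy_points, [0, 0, 1, 1])   # grouping that follows the data
forced = wcss(toy_points, [0, 1, 0, 1])    # grouping that splits both groups
```

A lower value means tighter clusters, which is why an unconstrained K-Means result is a natural baseline: any size constraint can only keep the WCSS the same or raise it relative to the best unconstrained assignment.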


Abstract

The invention relates to an optimization theory-based parallel clustering method with scale constraint, which comprises the following steps: S1, acquiring a data set to be clustered and a scale constraint vector, and initializing parameters according to the data set and the scale constraint vector; S2, decomposing the data set subjected to parameter initialization into a plurality of sub-problems through an allocation matrix; S3, introducing a Lagrange multiplier vector, solving and clustering the sub-problems in parallel through a projection matrix, and updating the allocation matrix according to the solving results of the sub-problems; and S4, calculating a convergence judgment parameter and judging, according to that parameter, whether clustering has reached the convergence stopping criterion; if the criterion is met, stopping iteration and outputting the current allocation matrix and the corresponding clustering result; otherwise, continuing to execute steps S1-S3 for iteration. Compared with the prior art, the method has the advantages that a large-scale data set is processed in finite iterations and a high-quality clustering result is provided.
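The overall shape of steps S1 to S4 can be illustrated with a small sketch. This is not the patented procedure: the projection-matrix step is not reproduced, and a plain subgradient update of the multipliers is swapped in as a stand-in. The function name, parameters (`init_centroids`, `step`, `n_outer`, `n_inner`), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def size_constrained_clustering(X, sizes, init_centroids=None,
                                step=1.0, n_outer=20, n_inner=200, seed=0):
    """Illustrative sketch only: clustering with target cluster sizes,
    using one Lagrange multiplier per cluster (subgradient updates)."""
    X = np.asarray(X, dtype=float)
    sizes = np.asarray(sizes, dtype=int)
    n, k = X.shape[0], sizes.shape[0]
    assert sizes.sum() == n, "target sizes must add up to the number of points"

    # S1: initialize centroids and the multiplier vector.
    if init_centroids is None:
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centroids = np.array(init_centroids, dtype=float)
    lam = np.zeros(k)
    labels = np.full(n, -1)

    for _ in range(n_outer):
        # S2: one independent sub-problem per point (one row of the cost
        # matrix); rows could be distributed across workers.
        cost = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

        # S3: penalized assignment; multipliers rise for over-full clusters
        # and fall for under-full ones.
        for _ in range(n_inner):
            new_labels = np.argmin(cost + lam, axis=1)
            violation = np.bincount(new_labels, minlength=k) - sizes
            if not violation.any():
                break
            lam += step * violation

        # Update centroids from the current assignment.
        for j in range(k):
            members = X[new_labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)

        # S4: stop when the assignment no longer changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return new_labels, centroids

# Hypothetical toy data: K-Means alone would group three points together,
# but the size vector [2, 2] forces a balanced split.
toy_X = [[0.0], [1.0], [2.0], [10.0]]
toy_labels, toy_centroids = size_constrained_clustering(
    toy_X, sizes=[2, 2], init_centroids=[[1.0], [10.0]])
```

The inner loop is not guaranteed to reach an exactly feasible assignment for every data set or step size, so a practical version would need a fallback when `n_inner` is exhausted. The sketch is meant only to show how a multiplier vector can steer per-point, independently solvable assignments toward the required cluster sizes.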


Application Information

Owner TONGJI UNIV