Data clustering method and device and computer readable storage medium

A data clustering technology, applied to database clustering/classification, computer components, computing, and similar fields, that addresses unsatisfactory clustering results and low clustering performance. Its stated effects are improved clustering performance, a simple and convenient procedure, and ease of popularization.

Pending Publication Date: 2021-02-26

CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD +1

Cites: 3 | Cited by: 1

Problems solved by technology

[0003] At present, commonly used subspace clustering algorithms based on linear representation constrain the representation coefficient with the $\ell_1$-norm, the nuclear norm, or the F-norm in order to obtain a representation coefficient $Z$ with a block-diagonal structure. However, a representation coefficient $Z$ constrained by a single norm usually has deficiencies, so the final clustering result is not ideal and clustering performance is low.
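For reference, the three single-norm constraints named above are usually written as follows in the subspace clustering literature. This is a sketch of the standard SSC, LRR, and LSR formulations, not the objective function of this patent; the noise term $E$ and the trade-off parameter $\lambda$ are notation assumed here.

```latex
\begin{aligned}
\text{SSC } (\ell_1\text{-norm}): \quad & \min_{Z} \|Z\|_1
  \quad \text{s.t. } X = XZ,\ \operatorname{diag}(Z) = 0 \\
\text{LRR (nuclear norm)}: \quad & \min_{Z,E} \|Z\|_* + \lambda \|E\|_{2,1}
  \quad \text{s.t. } X = XZ + E \\
\text{LSR (F-norm)}: \quad & \min_{Z} \|X - XZ\|_F^2 + \lambda \|Z\|_F^2
\end{aligned}
```

In each case the columns of $X$ are the data points, and column $j$ of $Z$ holds the coefficients that express point $x_j$ as a linear combination of the other points.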

Examples

Embodiment 1

[0028] The embodiment of the present invention proposes a data clustering method. Figure 2 is the first schematic diagram of the implementation flow of the data clustering method proposed in the embodiment of this application. As shown in Figure 2, in an embodiment of the present invention, the method by which the data clustering device performs data clustering may include the following steps:

[0029] Step 101: Receive and convert the original data set.

[0030] In the embodiment of the present application, the data clustering device may first receive the original data set and then perform dimension conversion on it.
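The text here does not say what the dimension conversion consists of. As one plausible reading, the sketch below flattens each image into a column vector so that the data set becomes a matrix with one sample per column. The function name `images_to_data_matrix`, the unit-norm scaling, and the random stand-in images are all assumptions made for illustration.

```python
import numpy as np

def images_to_data_matrix(images):
    """Stack equally sized images as unit-norm columns of an (m x n) matrix."""
    images = np.asarray(images, dtype=float)      # shape (n, h, w)
    n = images.shape[0]
    X = images.reshape(n, -1).T                   # shape (m, n) with m = h * w
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    norms[norms == 0] = 1.0                       # leave all-zero columns unchanged
    return X / norms

# Hypothetical stand-in for a face data set: 5 random "images" of size 4 x 3.
rng = np.random.default_rng(0)
faces = rng.random((5, 4, 3))
X = images_to_data_matrix(faces)
print(X.shape)
```

Column scaling is optional; it is included only because linear-representation methods are often applied to normalized samples.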

[0031] Further, in the embodiments of the present application, the original data set can be high-dimensional data. For example, the original data set can be the Extended Yale B face data set, the Augmented Reality (AR) face data set, or high-dimensional data such as handwritten dig...

Embodiment 2

[0085] Based on the first embodiment above, in another embodiment of the present application, when the data clustering device solves the converted third objective function, that is, formula (11) above, it can use preset auxiliary variables to solve the converted third objective function iteratively and obtain the representation coefficient.

[0086] Further, in the embodiment of the present application, the data clustering device can introduce preset auxiliary variables $J, T \in \mathbb{R}^{n \times n}$ and then reformulate the problem with the augmented Lagrangian multiplier method, which yields formula (13) above. The preset auxiliary variables $J$ and $T$, together with $Z$, $E$, the Lagrange multipliers, and $\mu$, are then updated to obtain the optimal representation coefficient $Z^*$.
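Formulas (11) and (13) are not reproduced in this excerpt, so the exact update rules cannot be shown. As a hedged illustration, the sketch below gives two closed-form operators that commonly appear in augmented Lagrangian updates when an auxiliary variable carries a nuclear-norm or $\ell_1$-norm term, plus a typical penalty update for $\mu$. Which operator, if any, applies to $J$ or $T$ in the patent is an assumption, as are the names `rho` and `mu_max` and their values.

```python
import numpy as np

def soft_threshold(A, tau):
    """Elementwise shrinkage: proximal operator of tau * ||A||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def singular_value_threshold(A, tau):
    """Shrink singular values by tau: proximal operator of tau * ||A||_*."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Typical penalty update at the end of each iteration (values are assumptions).
mu, rho, mu_max = 1e-2, 1.1, 1e6
mu = min(rho * mu, mu_max)

# Small demonstration on a diagonal matrix with singular values 3, 1, 0.2.
A = np.diag([3.0, 1.0, 0.2])
J = singular_value_threshold(A, 0.5)
shrunk = soft_threshold(np.array([-2.0, 0.3, 1.5]), 0.5)
```

Thresholding the singular values lowers the rank of the result, which is why this operator is the usual workhorse for low-rank terms.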

[0087] In the embodiment of this application, as an example, for the original data set $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, when determining t...

Embodiment 3

[0108] Based on the first and second embodiments above, in another embodiment of the present application, Figure 4 is the second schematic diagram of the implementation flow of the data clustering method proposed in the embodiment of this application. As shown in Figure 4, the method by which the data clustering device uses spectral clustering to obtain the clustering result corresponding to the original data set, based on the similarity matrix, may include the following steps:

[0109] Step 301: Calculate the normalized symmetric Laplacian matrix corresponding to the original data set from the similarity matrix.

[0110] In the embodiment of the present application, after the data clustering device determines the similarity matrix, it can cluster the original data set according to the normalized symmetric spectral clustering algorithm.
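A minimal sketch of Step 301, assuming the standard definition $L_{sym} = I - D^{-1/2} W D^{-1/2}$, where $W$ is the similarity matrix and $D$ its diagonal degree matrix. The `spectral_embedding` helper shows the usual next stage of normalized spectral clustering (taking the eigenvectors of the $k$ smallest eigenvalues and row-normalizing them). The later steps of this embodiment are not visible in the excerpt, so that helper, the function names, and the toy matrix are assumptions.

```python
import numpy as np

def normalized_symmetric_laplacian(W):
    """Return L_sym = I - D^{-1/2} W D^{-1/2} for a symmetric similarity matrix W."""
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d, dtype=float)
    d_inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])   # isolated nodes keep a zero weight
    return np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def spectral_embedding(W, k):
    """Rows are points embedded by the k eigenvectors with smallest eigenvalues."""
    L = normalized_symmetric_laplacian(W)
    _, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
    U = vecs[:, :k]
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return U / norms

# Toy similarity matrix: two disconnected pairs of points, {0, 1} and {2, 3}.
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L_sym = normalized_symmetric_laplacian(W)
emb = spectral_embedding(W, 2)
```

In this toy case the two points of each pair receive the same embedded row, so any clustering step applied to the rows (k-means is the usual choice) separates the pairs.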

[0111] Further, in the embodiment of the present application, the data clustering device may first obtain the normali...

Abstract

Embodiments of the invention disclose a data clustering method and device, and a computer readable storage medium. The data clustering method comprises the steps of receiving and converting an original data set; determining a low-rank dictionary and a weight matrix corresponding to the original data set according to the original data set; determining a representation coefficient corresponding to the original data set according to the low-rank dictionary and the weight matrix; establishing a similarity matrix corresponding to the original data set according to the representation coefficient; based on the similarity matrix, obtaining a clustering result corresponding to the original data set by utilizing spectral clustering, so that an ideal clustering effect can be obtained, and the clustering performance is effectively improved.
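The abstract does not state how the similarity matrix is built from the representation coefficient. A common choice in the linear-representation clustering literature is to symmetrize the absolute coefficients, $W = (|Z| + |Z|^{T}) / 2$; the sketch below assumes that choice, and the function name and example values are hypothetical.

```python
import numpy as np

def similarity_from_coefficients(Z):
    """Symmetric, non-negative affinity W = (|Z| + |Z|^T) / 2."""
    A = np.abs(Z)
    return 0.5 * (A + A.T)

# Hypothetical 3 x 3 representation coefficient matrix.
Z = np.array([[0.0, 0.8, -0.1],
              [0.6, 0.0,  0.0],
              [0.1, 0.2,  0.0]])
W = similarity_from_coefficients(Z)
```

Symmetrizing matters because spectral clustering expects an undirected graph, while $Z$ itself is generally not symmetric.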

Description

Technical field

[0001] The invention relates to data detection technology, in particular to a data clustering method and device, and a computer-readable storage medium.

Background technique

[0002] When clustering a data set of high-dimensional data, the high-dimensional data from different subspaces can be divided into their respective low-dimensional subspaces according to the latent subspace structure of the data set. Different subspaces correspond to different categories. Subspace clustering algorithms have been widely used in many fields. Among them, linear-representation-based subspace clustering algorithms, represented by sparse subspace clustering (SSC), low-rank representation for subspace clustering (LRR), and the least squares regression subspace clustering algorithm (robust and efficient subspace segmentation via least squares regression, LSR), have attracted r...

Application Information

Patent Type & Authority: Applications (China)

IPC (8): G06F16/906, G06K9/62

CPC: G06F16/906, G06F18/23, Y02D10/00

Inventors: 赵剑, 邱思远

Owner: CHINA MOBILE SUZHOU SOFTWARE TECH CO LTD
