Bidirectional clustering detection method for local similarity submatrices in data matrix

A bidirectional clustering technique for detecting local similarity sub-matrices in a data matrix, applicable to structured data retrieval, database models, and relational databases. It addresses problems such as noise making sub-matrix elements harder to distinguish and the limited types of local similarity sub-matrix that existing methods can detect, and it aims at reduced labor intensity and more readable results.

Status: Inactive
Publication Date: 2014-03-05
YANTAI UNIV
AI Technical Summary

Problems solved by technology

Noise often interferes with the acquisition and generation of a data matrix. Noise weakens the correlation among the elements of a local similarity sub-matrix, and it also makes those elements harder to distinguish from the other elements of the data matrix.
Applying a traditional clustering or classification algorithm directly to local similarity sub-matrix detection in a large-scale data matrix causes serious problems: vectors assigned to different classes may in fact be locally similar. In other words, related and unrelated elements are entangled with each other, which makes the problem very complicated.
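The point about whole-vector comparison can be illustrated with a minimal sketch. The two rows and the helper `euclidean` below are hypothetical and are not taken from the patent; they only show that two rows can be far apart over all columns while being identical on a subset of columns.

```python
import math


def euclidean(u, v, cols=None):
    """Euclidean distance between u and v, optionally restricted to some columns."""
    idx = range(len(u)) if cols is None else cols
    return math.sqrt(sum((u[j] - v[j]) ** 2 for j in idx))


# Hypothetical rows: identical on columns 0-2, unrelated on columns 3-5.
row_a = [1.0, 2.0, 3.0, 9.0, 0.0, 7.0]
row_b = [1.0, 2.0, 3.0, 0.0, 8.0, 1.0]

# Distance over the whole row vector, as a traditional algorithm would see it.
full_distance = euclidean(row_a, row_b)

# Distance restricted to the columns where the rows are locally similar.
local_distance = euclidean(row_a, row_b, cols=[0, 1, 2])

print(full_distance, local_distance)
```

A method that compares only whole rows would likely place these two rows in different classes, even though they form a perfectly similar block on the first three columns.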
[0005] The disadvantages of the prior art are: (1) the types of local similarity sub-matrix that can be detected are limited; (2) not all local similarity sub-matrices present in the original data matrix can be detected and output. For any given data matrix, the number, size, and positional relationships of the local similarity sub-matrices it may contain are unknown, so detecting and outputting all of them is a very challenging task; (3) overlapping local similarity sub-matrices are difficult to detect. Many published works transform local similarity sub-matrix detection into the optimization of an objective function. One problem these methods cannot solve is how, after a local similarity sub-matrix has been detected, to treat the values of the elements at the positions it covers in the original data matrix.
If those elements are reassigned, the detection of overlapping local similarity sub-matrices is inevitably and seriously affected, and it may even become impossible to continue the detection.
Existing local similarity sub-matrix detection techniques also cannot handle complex data types, yet the data in many real databases are of complex types, which limits the range of application of these techniques.

Embodiment Construction

[0012] The present invention proposes a new two-way clustering detection framework, and designs and implements a complete local similarity sub-matrix detection process based on clustering results. Traditional clustering algorithms (such as K-means, FCM, etc.) and local similarity sub-matrix detection are separated into two independent sequential processes, as shown in attached Figure 1: the detection of local similarity sub-matrices is based on the clustering results of traditional clustering algorithms. The specific detection steps of the present invention are described in detail below with reference to attached Figure 2:
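The separation into two sequential processes can be sketched as a pipeline whose clustering stage is pluggable. This is only an illustration under stated assumptions: the function names (`kmeans_rows`, `group_rows_by_label`, `two_stage_detect`) are hypothetical, the K-means is a plain Lloyd's iteration seeded with the first k rows, and the second stage is a trivial stand-in, because the actual detection steps are not given in this excerpt.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
Labels = List[int]


def kmeans_rows(data: Matrix, k: int, iterations: int = 20) -> Labels:
    """Stage 1 example: plain Lloyd's K-means over rows, seeded with the first k rows."""
    centroids = [list(row) for row in data[:k]]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in data
        ]
        # Update step: each centroid becomes the mean of its member rows.
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_rows_by_label(data: Matrix, labels: Labels) -> Dict[int, List[int]]:
    """Stage 2 stand-in (hypothetical): collect row indices per cluster label.
    The patent's real detection procedure would consume the labels here."""
    groups: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return groups


def two_stage_detect(
    data: Matrix,
    cluster_fn: Callable[[Matrix], Labels],
    detect_fn: Callable[[Matrix, Labels], Dict[int, List[int]]],
) -> Dict[int, List[int]]:
    """Run clustering first, then detection on the clustering result."""
    labels = cluster_fn(data)
    return detect_fn(data, labels)


data = [[1.0, 1.0], [9.0, 9.0], [1.2, 0.9], [8.8, 9.1]]
result = two_stage_detect(data, lambda d: kmeans_rows(d, k=2), group_rows_by_label)
print(result)
```

Because `two_stage_detect` only depends on the labels returned by `cluster_fn`, a different traditional algorithm such as FCM could be swapped in without touching the detection stage.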

[0013] (1) Preprocessing of the original data matrix: Suppose the original data matrix D consists of m rows and n columns. As shown in attached Figure 2, first compare the number of rows m with the number of columns n. If n≥m, transpose the data matrix D to obtain a new data matrix D consisting of n row...
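The orientation check in step (1) can be sketched as follows. The function name `preprocess_matrix` and the returned flag are assumptions made for illustration; the source text is truncated after the transposition, so nothing beyond that comparison and transpose is modeled here.

```python
def preprocess_matrix(matrix):
    """Transpose the matrix when it has at least as many columns as rows (n >= m).

    Returns the (possibly transposed) matrix and a flag recording whether a
    transpose happened. The flag is an illustrative addition, not from the source.
    """
    m = len(matrix)                   # number of rows
    n = len(matrix[0]) if m else 0    # number of columns
    if n >= m:
        return [list(col) for col in zip(*matrix)], True
    return [list(row) for row in matrix], False


wide = [[1, 2, 3],
        [4, 5, 6]]          # m = 2, n = 3, so n >= m and the matrix is transposed
tall = [[1, 2],
        [3, 4],
        [5, 6]]             # m = 3, n = 2, so the matrix is left as it is

wide_result, wide_transposed = preprocess_matrix(wide)
tall_result, tall_transposed = preprocess_matrix(tall)
print(wide_result, wide_transposed)
print(tall_result, tall_transposed)
```

Note that a square matrix also satisfies n≥m and is therefore transposed under the condition as stated.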

Abstract

The invention relates to intelligent data information processing in the field of computer technology, in particular to a bidirectional clustering detection method for local similarity submatrices in a data matrix and the implementation of the method. The method helps to discover important knowledge and rules present in the data matrix. A new bidirectional clustering detection framework is provided, and a complete local similarity submatrix detection process based on clustering results is designed and implemented. Traditional clustering algorithms (such as K-means and FCM) and local similarity submatrix detection are separated into two independent sequential processes, and the local similarity submatrix detection is based on the clustering results of the traditional clustering algorithms. The advantage of the method is that, as the definition of local similarity submatrices develops, the preceding clustering work can be completed by flexibly adopting a corresponding traditional clustering algorithm, so that existing algorithms connect well with the new technology and the continuity and consistency of the knowledge hierarchy are preserved.

Description

Technical field

[0001] The present invention relates to intelligent data information processing in the field of computer technology, and particularly relates to a two-way clustering detection method for local similarity sub-matrices in a data matrix and its implementation. The present invention helps to discover important knowledge and laws in the data matrix.

Background technique

[0002] Traditional clustering or classification algorithms take a whole row vector or column vector of the data matrix as the object of analysis; such a vector is called a feature vector. The elements of the vector are called features: they remain unchanged or nearly unchanged under general deformation and distortion, and they contain as little redundant information as possible. In decision theory, feature extraction occupies an important position. It decides which features to select by analyzing specific recognition objects. The feature extraction process not only compresses the amount of information, but al...

Application Information

IPC(8): G06F17/30
CPC: G06F16/2237, G06F16/285
Inventor: 张艳洁, 胡占义, 孙立民
Owner: YANTAI UNIV