Mahalanobis distance measuring method based on the weighted Moore-Penrose inverse in the data mining process

A Mahalanobis distance technique for data mining, applied in electrical digital data processing, special data processing applications, instruments, etc., that addresses problems such as poor stability on correlated data, failure to fully preserve the mean and variance of the data source, and poor reliability.
CN101984428A · Inactive · Publication Date: 2011-03-09 · ZHEJIANG UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
ZHEJIANG UNIV OF TECH
Publication Date
2011-03-09
Estimated Expiration
Not applicable · inactive patent

Abstract

The invention relates to a Mahalanobis distance measuring method based on the weighted Moore-Penrose inverse for use in data mining, comprising the following steps: 1) calculating the covariance matrix S of a data population X; 2) expanding the covariance matrix S according to the spectral decomposition theorem for real symmetric matrices; 3) constructing the weight matrices M and N, specifically: (1) constructing an n × n matrix M; and (2) constructing an n × n matrix N; 4) calculating the weighted Moore-Penrose inverse of the covariance matrix S; and 5) calculating the Mahalanobis distance between a data individual Xi and a data individual Xj. The method is unaffected by units of measurement (it is invariant under linear transformation), preserves the mean and variance information of the data, and operates normally with good performance regardless of how strongly the processed data are correlated.
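
The five steps in the abstract can be sketched in Python with NumPy. This is an illustrative sketch only, not the patented procedure: the abstract does not say how the weight matrices M and N are built, so the code assumes they are arbitrary symmetric positive definite n × n matrices supplied by the caller (identity by default, which reduces to the ordinary Moore-Penrose inverse). The function names `spd_sqrt`, `weighted_pinv`, and `wmp_mahalanobis` are hypothetical. The weighted inverse uses the standard identity A⁺(M,N) = N^(-1/2) (M^(1/2) A N^(-1/2))⁺ M^(1/2).

```python
import numpy as np


def spd_sqrt(W):
    """Symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(W)  # spectral decomposition of W
    return (vecs * np.sqrt(vals)) @ vecs.T


def weighted_pinv(A, M, N):
    """Weighted Moore-Penrose inverse of A for SPD weight matrices M and N.

    Standard identity: A+_{MN} = N^{-1/2} (M^{1/2} A N^{-1/2})^+ M^{1/2}.
    """
    M_half = spd_sqrt(M)
    N_half_inv = np.linalg.inv(spd_sqrt(N))
    # np.linalg.pinv works through the SVD, which for a symmetric positive
    # semidefinite matrix coincides with its spectral decomposition.
    return N_half_inv @ np.linalg.pinv(M_half @ A @ N_half_inv) @ M_half


def wmp_mahalanobis(X, i, j, M=None, N=None):
    """Distance between rows i and j of X using a weighted pseudo-inverse.

    ASSUMPTION: M and N default to the identity because the source does
    not specify how they are constructed.
    """
    n = X.shape[1]
    M = np.eye(n) if M is None else M
    N = np.eye(n) if N is None else N
    S = np.cov(X, rowvar=False)      # step 1: covariance matrix of the data
    S_plus = weighted_pinv(S, M, N)  # steps 2-4: weighted Moore-Penrose inverse
    d = X[i] - X[j]                  # step 5: distance between two individuals
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(d @ S_plus @ d, 0.0)))


# Two perfectly correlated attributes: the covariance matrix is singular,
# so the ordinary inverse does not exist but the pseudo-inverse does.
X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
dist = wmp_mahalanobis(X, 0, 2)
```

In this example the second attribute is exactly twice the first, so `np.linalg.inv` would fail on the covariance matrix while the pseudo-inverse still yields a finite distance. Replacing the default identity weights with other positive definite matrices changes which generalized inverse is selected; the choice made by the patent is not recoverable from the abstract.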

Description

technical field

[0001] The invention relates to the technical field of data mining, in particular to a weighted Moore-Penrose (WMP) Mahalanobis distance measurement method for processing limited correlation data sets.

Background technique

[0002] As enterprises and industries continue to accumulate business data, massive data sets have formed. Relying solely on manual sorting or interpretation of such a large data source runs into problems of efficiency and accuracy. More and more enterprises therefore use data mining technology to organize massive data and discover knowledge in it, providing support for enterprise decision-making. Data preprocessing accounts for about 60%-70% of the workload of the entire data mining process and plays a vital role in the results of data mining. An important step in data preprocessing is filling in the missing values in the original data. In the process of complementing the missing value, the d...

Claims
