Dimension reduction clustering data analysis method

A data analysis and clustering technology, applied in the computer field, that addresses problems such as the lack of a theoretical basis for parameter selection in existing dimensionality reduction methods.

Inactive Publication Date: 2017-02-08
GUOXIN YOUE DATA CO LTD
1 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0003] When the variable dimension in the classification standard table is relatively high, dimensionality reduction cluster analysis is usually required. Among existing dimensionality reduction methods, the model represented by the projection pursuit method involves a single parameter, the density window width. At present its value must be determined by experience or trial calculation, and a corresponding theoretical basis is lacking. In addition, the results of this type of model must be classified by other methods to obtain the final clustering result. In application problems, the preference of the decision maker should also be taken into account, that is, a subjective tendency to give a certain variable a greater weight.


Examples


Embodiment 1

[0036] This embodiment provides a data analysis method for dimensionality reduction clustering, which includes:

[0037] S101. Generate sample data and perform dimensionless processing on it to construct the projection data required by the model.

[0038] In this embodiment, the projection data required by the model is constructed from generated sample data. For example, according to the classification standard table, a certain number of samples are randomly generated within the range of each level. Denote the influence index of the sample data as x_ij (i = 1, 2, …, n; j = 1, 2, …, m), where n is the number of samples and m is the number of variables, both natural numbers. If the classification standard table is divided into 5 levels and 100 samples are generated within each level's range, then n = 500.
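The sampling step in [0038] can be sketched as follows. This is only an illustration: the function name `generate_samples`, the uniform distribution, and the level limits in `bounds` are assumptions made here, since the text does not specify how samples are drawn or what the classification standard table contains.

```python
import numpy as np

def generate_samples(level_bounds, per_level, seed=0):
    """Draw `per_level` uniform samples inside each level's range.

    level_bounds has shape (levels, m, 2) and holds hypothetical
    [low, high] limits of each of the m variables at each level.
    Returns X with shape (levels * per_level, m) and the level labels.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(level_bounds, dtype=float)
    samples, labels = [], []
    for level, limits in enumerate(bounds):
        low, high = limits[:, 0], limits[:, 1]
        samples.append(rng.uniform(low, high, size=(per_level, len(low))))
        labels.append(np.full(per_level, level))
    return np.vstack(samples), np.concatenate(labels)

# Hypothetical 5-level table with m = 3 variables: level k spans [k, k + 1).
bounds = np.array([[[k, k + 1.0]] * 3 for k in range(5)])
X, labels = generate_samples(bounds, per_level=100)
print(X.shape)  # (500, 3): n = 5 levels * 100 samples, m = 3 variables
```

With 5 levels and 100 samples per level the result has n = 500 rows, matching the count given in the text.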

[0039] Since the dimensions of the variables in the sample data are not the same or the value ranges are quite different, it is necessary to...
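The source text is cut off before it states which dimensionless transform is used. A common choice for variables with different units or ranges is min-max scaling, and the sketch below assumes it; the function name `min_max_scale` is hypothetical and the method in the patent may differ.

```python
import numpy as np

def min_max_scale(X):
    """Rescale each column of X to [0, 1] (assumed transform, not from the source)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

raw = np.array([[1.0, 100.0],
                [2.0, 300.0],
                [3.0, 200.0]])
scaled = min_max_scale(raw)
print(scaled)  # rows: [0, 0], [0.5, 1], [1, 0.5]
```

After scaling, every variable lies in the same range, so no single variable dominates the projection merely because of its units.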

Embodiment 2

[0068] This embodiment demonstrates the dimensionality reduction clustering data analysis method through a concrete application, namely water quality analysis. Table 1-1 below gives the water quality monitoring results of a certain reservoir in a certain month. The reservoir is evaluated against the "Surface Water Environmental Quality Standards" (Table 1-2) using the steps in Embodiment 1.

[0069] Table 1-1 Water quality monitoring results of a certain reservoir in a certain month

[0070]

[0071] Table 1-2 Reservoir water quality evaluation standard table

[0072]

[0073] The reservoir water quality evaluation standard is divided into 5 levels, and 30 samples were randomly generated within the range of each level, giving 150 water quality samples in total. Based on the generated sample data at all levels, the water quality evaluation model of the reservoir based on the projection pursuit dynamic ...


Abstract

The invention discloses a dimension reduction clustering data analysis method. The method comprises the following steps: S101, generating sample data and applying dimensionless processing to it to construct the projection data needed by the model; S102, constructing a projection index and obtaining the optimal projection direction vector as shown in the specification; and S103, linearly projecting the projection data to obtain a one-dimensional projection characteristic value. Through these steps, the projection pursuit technique and the dynamic clustering method are combined and applied to a dimension reduction clustering model for high-dimensional data; the operation is simple and the objectivity of the model is improved. For cases where the decision maker has a preference, constraint conditions are added so that the model considers both the objective weights and the decision maker's preference, which widens the application range of the model.
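Steps S102 and S103 can be illustrated with a minimal sketch. The projection index is not given in this excerpt, so the code assumes one form used for projection pursuit combined with dynamic clustering: the sum of all pairwise distances between projected values minus the sum of within-cluster pairwise distances. The plain 1-D k-means, the random search, the non-negative direction vector, and all function names are likewise assumptions for illustration, not the patented procedure.

```python
import numpy as np

def project(X, a):
    """S103: linear projection z_i = sum_j a_j * x_ij onto the unit vector a."""
    a = np.asarray(a, dtype=float)
    return np.asarray(X, dtype=float) @ (a / np.linalg.norm(a))

def cluster_1d(z, k, iters=100):
    """Plain k-means on 1-D values, centres started at evenly spaced quantiles."""
    centres = np.quantile(z, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(z[:, None] - centres[None, :]), axis=1)
        new = np.array([z[labels == c].mean() if np.any(labels == c) else centres[c]
                        for c in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels

def pairwise_sum(z):
    """Sum of |z_i - z_j| over all unordered pairs."""
    return np.abs(z[:, None] - z[None, :]).sum() / 2.0

def projection_index(X, a, k):
    """Assumed index: total spread minus within-cluster spread (larger is better)."""
    z = project(X, a)
    labels = cluster_1d(z, k)
    within = sum(pairwise_sum(z[labels == c]) for c in range(k))
    return pairwise_sum(z) - within

def best_direction(X, k, trials=500, seed=0):
    """S102 by random search over non-negative unit vectors (an assumed optimizer)."""
    rng = np.random.default_rng(seed)
    best_a, best_q = None, -np.inf
    for _ in range(trials):
        a = np.abs(rng.normal(size=X.shape[1]))
        a /= np.linalg.norm(a)
        q = projection_index(X, a, k)
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q

# Synthetic data: two groups that differ only in the first of three variables.
rng = np.random.default_rng(1)
group_a = rng.normal([0.0, 0.5, 0.5], 0.05, size=(20, 3))
group_b = rng.normal([1.0, 0.5, 0.5], 0.05, size=(20, 3))
data = np.vstack([group_a, group_b])
a_best, q_best = best_direction(data, k=2)
z = project(data, a_best)  # one-dimensional projection characteristic values
```

In this synthetic case the search puts most of the weight on the first variable, the only one that separates the groups. A decision maker's preference could be expressed as an extra constraint on the components of `a` during the search, but the form of that constraint is not described in this excerpt.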

Description

Technical field

[0001] The invention relates to a data analysis method in the computer field, in particular to a data analysis method for dimensionality reduction clustering.

Background technique

[0002] In the prior art, it is necessary to classify a group of vectors according to a data classification standard table containing multiple variables.

[0003] When the variable dimension in the classification standard table is relatively high, dimensionality reduction cluster analysis is usually required. Among existing dimensionality reduction methods, the model represented by the projection pursuit method involves a single parameter, the density window width. At present its value must be determined by experience or trial calculation, and a corresponding theoretical basis is lacking. In addition, the results of this type of model must be classified by other methods to obtain the final clustering result. In the application problem, the p...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/285
Inventor: 夏虎康明陈进宝
Owner: GUOXIN YOUE DATA CO LTD