Geodesic line preservation-based nonlinear data dimension reduction method

A geodesic-based data dimensionality reduction technique, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as the inability to process large sample sets, and it reduces the number and order of the matrices involved.

Status: Inactive
Publication Date: 2017-07-18
SUN YAT SEN UNIV
0 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

As a well-regarded manifold learning algorithm, LTSA (Local Tangent Space Alignment) can effectively learn global embedding coordinates that reflect the low-dimensional manifold structure of a data set. It has a shortcoming, however: the order of the matrix used for eigenvalue decomposition in the algorithm equals the number of samples, so when the sample set is large the algorithm cannot handle it.
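To give a sense of scale for this limitation, the following is a minimal sketch that estimates the memory needed just to hold an N×N matrix. It assumes the matrix is stored densely in 8-byte floating point; the function name `dense_matrix_bytes` and the sample sizes are illustrative and do not come from the patent.

```python
def dense_matrix_bytes(n_samples, bytes_per_entry=8):
    """Bytes needed to store a dense n_samples x n_samples matrix.

    Assumption: dense storage with 8-byte (float64) entries.
    """
    return n_samples * n_samples * bytes_per_entry


# Illustrative sample counts (hypothetical, not from the patent).
for n in (1_000, 10_000, 100_000):
    print(n, "samples:", dense_matrix_bytes(n) / 1e9, "GB")
```

Storage grows quadratically with the number of samples, and the cost of a full dense eigenvalue decomposition grows faster still, which is why a matrix of order N becomes impractical for large sample sets.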

Embodiment Construction

[0016] As shown in the attached figure, the nonlinear data dimensionality reduction method based on geodesic preservation comprises the following steps:

[0017] 1. Assume that the high-dimensional data sample point set is $X=[x_1 \dots x_N]\in\mathbb{R}^{D\times N}$, and that the sample point set mapped to the low-dimensional space is $Y=[y_1 \dots y_N]\in\mathbb{R}^{d\times N}$. Here $D$ is the dimension of the high-dimensional space and $d$ ($d<D$) is the dimension of the low-dimensional space. $X$ is the input sample set, consisting of the $N$ $D$-dimensional real column vectors in $\mathbb{R}^{D\times N}$. $Y$ is the output sample set obtained by mapping the high-dimensional data to the low-dimensional space, consisting of the $N$ $d$-dimensional real column vectors in $\mathbb{R}^{d\times N}$.

[0018] 2. Calculate the Euclidean distance $d_x(i, j)$ between each pair of adjacent points i and j in the sample point set, construct a weighted graph reflecting the neighborhood relationships of the sample point set, and compute the geodesic distance matrix D of the sample point set from this weighted graph.
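The graph construction and geodesic distance computation in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the text does not say how neighbors are chosen or which shortest-path algorithm is used, so the sketch assumes a k-nearest-neighbor rule and Dijkstra's algorithm. The function names, the parameter `k`, and the sample points are all hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Weighted graph linking each point to its k nearest neighbors.

    Edge weights are Euclidean distances. The k-NN rule is an assumption;
    the source only says the graph reflects neighborhood relationships.
    """
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d_ij, j in dists[:k]:
            graph[i][j] = d_ij
            graph[j][i] = d_ij  # keep the graph symmetric
    return graph


def dijkstra(graph, source):
    """Shortest-path distances from source to every node of the graph."""
    dist = [math.inf] * len(graph)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d_u + w < dist[v]:
                dist[v] = d_u + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def geodesic_distance_matrix(points, k):
    """Approximate geodesic distances as graph shortest-path lengths."""
    graph = knn_graph(points, k)
    return [dijkstra(graph, s) for s in range(len(points))]


# Hypothetical sample: five points along an L-shaped curve in the plane.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
D = geodesic_distance_matrix(points, k=2)
print(D[0][4])  # 4.0: length along the curve; the straight-line distance is about 2.83
```

The example shows why geodesic rather than Euclidean distance is used: the two end points of the curve are about 2.83 apart in a straight line, but 4.0 apart when distance is measured along the sampled curve. Points in disconnected parts of the graph would get an infinite distance in this sketch.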

[0019] 3. Random...

Abstract

The invention discloses a nonlinear data dimensionality reduction method based on geodesic preservation. First, random shortest paths are computed for the input sample point set to find a set of geodesic lines of that sample point set. The low-dimensional embedded global coordinates of each geodesic line are obtained by applying a rotation transformation to the local coordinates of the geodesic line in the high-dimensional manifold, so that the centered low-dimensional embedded global coordinates can be represented by the centered local coordinates in the high-dimensional manifold. The global coordinates of each geodesic line can also be expressed through a selection matrix applied to the low-dimensional embedded coordinates of all sample points. Following the least-squares principle, that is, minimizing the squared error between actual and estimated values, the sum of squared errors between the low-dimensional embedded global coordinates and the rotated local coordinates is minimized, which yields the low-dimensional embedded global coordinates of the sample points.
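The minimization described in the abstract can be written compactly. The formula below is a sketch patterned on the alignment objective familiar from LTSA, not a formula taken from the patent. All symbols other than $Y$, $N$ and $d$ are assumptions introduced for illustration: $M$ is the number of geodesic lines, $n_i$ the number of sample points on the $i$-th geodesic line, $S_i \in \mathbb{R}^{N\times n_i}$ the 0-1 selection matrix that picks those points out of $Y$, $H_i$ a centering matrix, $\Theta_i$ the centered local coordinates of the geodesic line, and $L_i$ its rotation transformation.

```latex
\min_{Y,\;\{L_i\}} \; \sum_{i=1}^{M}
  \left\| \, Y S_i H_i - L_i \Theta_i \, \right\|_F^2,
\qquad
H_i = I_{n_i} - \tfrac{1}{n_i}\,\mathbf{1}\mathbf{1}^{\mathsf{T}}
```

Here $Y S_i H_i$ is the centered global embedding of the points on geodesic line $i$, and $L_i \Theta_i$ is its estimate from the rotated local coordinates. A normalization on $Y$ (for example $Y Y^{\mathsf{T}} = I_d$, as in LTSA) would typically be needed to exclude the trivial solution $Y = 0$; the available text does not state which constraint the patent uses.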

Description

Technical field

[0001] The invention belongs to the field of machine learning, and in particular relates to a nonlinear data dimensionality reduction method based on geodesic preservation in manifold learning.

Background art

[0002] Data dimensionality reduction is the process of mapping samples from a high-dimensional space to a low-dimensional space by a linear or nonlinear method, thereby obtaining a lower-dimensional representation of the high-dimensional data. This reduces the redundancy of the original data and makes data processing more efficient and better targeted. Dimensionality reduction methods fall into two main categories: linear mapping and nonlinear mapping. Representative linear mapping methods include principal component analysis (Principal Component Analysis, PCA for short) and linear discriminant analysis (Linear Discriminant Analysis, LDA for sh...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2237, G06F16/2264
Inventor: 刘洁, 林少斌, 刘希, 欧阳效源, 马争鸣
Owner: SUN YAT SEN UNIV