
Dimensionality reduction method based on deep Pearson embedding

A dimensionality reduction and deep learning technology, applied to neural learning methods, image data processing, instruments, etc., which can solve problems such as difficulty in preserving the local structure information of data, approximation errors, and increased computational cost.

Status: Inactive; Publication Date: 2017-05-24
BEIJING UNIV OF TECH


Problems solved by technology

Manifold learning methods such as Locally Linear Embedding (LLE), Laplacian Eigenmaps (LE), and t-distributed Stochastic Neighbor Embedding (t-SNE) preserve the manifold geometric structure of the data well and are effective on nonlinearly distributed data; however, they face the "out-of-sample" problem.
In general, a transformation matrix can be learned explicitly through a linear approximation strategy, but this introduces approximation errors and increases the computational cost.
Parametric t-SNE solves the "out-of-sample" problem by parameterizing the mapping with a neural network, but because t-SNE uses the heavy-tailed Student t-distribution to construct neighbor probabilities in the embedding space, it cannot preserve the local structure of the data well when the embedding dimension is greater than 3.
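To make that limitation concrete, the minimal NumPy sketch below (an illustration, not the patent's code) computes the low-dimensional neighbor probabilities q_ij that t-SNE builds from a Student t kernel with one degree of freedom; this heavy tail helps 2-D and 3-D visualization but weakens local-structure preservation at higher embedding dimensions.

```python
import numpy as np

def student_t_affinities(Y):
    """Low-dimensional neighbor probabilities q_ij used by t-SNE:
    a heavy-tailed Student t kernel with one degree of freedom."""
    sq = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    num = 1.0 / (1.0 + sq)         # Student t kernel, 1 degree of freedom
    np.fill_diagonal(num, 0.0)     # q_ii is defined as 0
    return num / num.sum()         # normalize over all pairs
```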
In recent years, the development of deep neural networks has made it possible to learn more complex nonlinear mappings, such as the Auto-Encoder (AE); however, AE mainly focuses on maximizing the variance of the data in the low-dimensional space, which makes it difficult to preserve the local structure information of the data the way manifold learning methods do.
Therefore, AE has been improved by adding regularization terms so that it can learn the manifold geometry of the data, as in the Laplacian Auto-Encoder (LAE) and the Hessian regularized Sparse Auto-Encoder (HSAE), but these methods take a long time to train the entire network.



Examples


Embodiment Construction

[0018] Given a high-dimensional sample data set, the specific embodiment of the present invention, following the algorithm flowchart shown in Figure 1, is as follows:

[0019] Step 1: Input the high-dimensional sample data set D = {(x_1, c_1), (x_2, c_2), ..., (x_n, c_n)}, where x_i ∈ R^p and p is the sample dimension; c_i ∈ {1, ..., m}, where m is the total number of categories; n is the size of the dataset. If the values of the data set already lie between 0 and 1, jump directly to Step 2; otherwise, normalize the data set (see the sketch below);
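The patent only requires the data to lie in [0, 1] before Step 2 and does not name a scheme in the visible text; a minimal sketch of one common choice, min-max normalization, is:

```python
import numpy as np

def minmax_normalize(X):
    """Rescale each feature of X (n x p) to the range [0, 1]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    rng = np.where(x_max > x_min, x_max - x_min, 1.0)  # guard constant features
    return (X - x_min) / rng
```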

[0020] Step 2: Divide the dataset into a training dataset D_train = {(x_1, c_1), (x_2, c_2), ..., (x_train_n, c_train_n)} and a test dataset D_test = {(x_1, c_1), (x_2, c_2), ..., (x_test_n, c_test_n)}, where train_n and test_n are the numbers of training and test samples respectively and n = train_n + test_n;
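A minimal sketch of this split, assuming a random permutation of the samples (the patent does not fix the sampling scheme; the function name is illustrative):

```python
import numpy as np

def split_dataset(X, c, train_n, seed=0):
    """Split samples X (n x p) and labels c (n,) into
    D_train (train_n samples) and D_test (n - train_n samples)."""
    idx = np.random.default_rng(seed).permutation(len(X))
    tr, te = idx[:train_n], idx[train_n:]
    return (X[tr], c[tr]), (X[te], c[te])
```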

[0021] Step 3: Construct a deep Pearson neural network, including an input layer, an output layer and L-2 hidden layers, where L is a ...
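Step 3 is truncated in the source, so the sketch below only shows the stated topology: a fully connected network with an input layer of width p, L-2 hidden layers, and an output layer whose width d is the embedding dimension. The hidden-layer sizes and activation here are assumptions, not values from the patent.

```python
import torch.nn as nn

def build_deep_pearson_network(p, d, hidden=(500, 200), act=nn.Sigmoid):
    """Input layer (dim p), L-2 hidden layers, output layer (dim d)."""
    dims = [p, *hidden, d]
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:   # no nonlinearity on the output (embedding) layer
            layers.append(act())
    return nn.Sequential(*layers)
```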



Abstract

The invention discloses a dimensionality reduction method based on deep Pearson embedding, comprising the following steps: first, construct a deep Pearson embedding network whose cost function is the sum of the Pearson correlation coefficients between corresponding row vectors of the similarity matrix of the high-dimensional samples and the Euclidean distance matrix of the low-dimensional embedded samples; second, initialize the network parameters randomly or by unsupervised pretraining; finally, fine-tune the entire network in a supervised manner that preserves the local correlation structure of the data. Because the method uses Pearson correlation coefficients to evaluate the correlation between the row vectors of the similarity matrix of high-dimensional sample pairs and the Euclidean distance matrix of low-dimensional embedded sample pairs, and requires this correlation to be as negative as possible, samples with greater similarity in the high-dimensional space are guaranteed a smaller Euclidean distance in the low-dimensional space. The disclosed dimensionality reduction method, based on deep learning and Pearson correlation coefficients, can be widely applied to visualization and classification tasks in the field of machine learning.
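As a rough illustration of the cost described above (a sketch only; the Gaussian kernel for the high-dimensional similarity matrix is an assumption, since the abstract does not specify how the similarity matrix is built):

```python
import numpy as np

def pearson_embedding_cost(X, Y, sigma=1.0):
    """Sum over rows of the Pearson correlation between the high-dimensional
    similarity matrix S and the low-dimensional Euclidean distance matrix D.
    Training minimizes this sum, driving the correlations negative, so that
    highly similar high-dimensional pairs get small embedding distances."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.exp(-sq / (2.0 * sigma ** 2))                         # assumed Gaussian similarity
    D = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1))  # embedding distances
    cost = 0.0
    for s, d in zip(S, D):
        s_c, d_c = s - s.mean(), d - d.mean()                    # center each row pair
        cost += (s_c * d_c).sum() / (np.linalg.norm(s_c) * np.linalg.norm(d_c))
    return cost
```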

Description

Technical Field

[0001] The invention relates to a dimensionality reduction method based on deep Pearson embedding, which can be applied to tasks such as visualization and classification in machine learning, pattern recognition, and computer vision.

Background Technique

[0002] Dimensionality reduction transforms high-dimensional observation data into a low-dimensional feature space through a linear or nonlinear mapping, under the assumption that the low-dimensional data can maintain the essential structure of the high-dimensional data. Linear dimensionality reduction methods, such as principal component analysis and linear discriminant analysis, can discover the geometric structure of linear subspaces embedded in the high-dimensional data space. Linear dimensionality reduction methods are simple to implement and computationally efficient, but their assumption that the embedded subspace of the high-dimensional data space is linear or approximately linear is ...
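For contrast with the deep method, the background's canonical linear example, principal component analysis, fits in a few lines (a minimal NumPy sketch):

```python
import numpy as np

def pca(X, d):
    """Project centered data onto the top-d principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = directions
    return Xc @ Vt[:d].T                               # n x d embedding
```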


Application Information

IPC(8): G06T3/00; G06N3/08
CPC: G06N3/08; G06T3/06
Inventors: 李玉鑑, 张亚红
Owner: BEIJING UNIV OF TECH