Sparse tensor canonical decomposition method based on data division and calculation distribution

A data-division technique for sparse tensors, applicable to digital computer components, computation, and related fields. It addresses two problems: existing parallel implementations cover only a limited variety of tensor canonical decomposition algorithms, and no CPD algorithm exists for the Shenwei many-core architecture. Its effects are improved data parallelism, avoidance of load imbalance, and reduced memory access latency.

Active Publication Date: 2021-05-07
BEIHANG UNIV
Cites: 3 | Cited by: 6

AI Technical Summary

Problems solved by technology

Because sparse tensors and CPD algorithms are widely used, much recent work has implemented CPD algorithms on sparse tensors, including MATLAB Tensor Toolbox, GigaTensor, SPLATT (The Surprisingly ParalleL spArse Tensor Toolkit), DFacTo, and HyperTensor. However, these implementations all target homogeneous single-core or multi-core CPUs, or GPUs, and no CPD algorithm for sparse tensors is suited to the special many-core architecture of domestic Shenwei processors. In addition, current parallelized tensor canonical decomposition algorithms are relatively limited in variety.


Examples

Embodiment Construction

[0074] To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific examples described here are only used to explain the present invention, not to limit it. In addition, the technical features involved in the various embodiments described below can be combined with each other as long as they do not conflict.

[0075] The process flow of the present invention is shown in Figure 1, and the hardware architecture is shown in Figure 2.

[0076] As shown in Figure 1, the specific implementation steps of the present invention are as follows:

[0077] Step 1: Read the sparse tensor data specified by the user into the main memory of the core group, and according to the specified sparse ten...

Abstract

The invention relates to a sparse tensor canonical decomposition method based on data division and task allocation. The method comprises the following steps: performing multi-stage division and task allocation across the processing cores of a core group, according to the many-core characteristics of the SW processor; performing multi-stage segmentation of the sparse tensor data; designing a communication strategy for sparse tensor canonical decomposition that uses the register communication feature of the SW26010 processor; and designing different computation schemes for the matricized tensor times Khatri-Rao product (MTTKRP), which is the common performance bottleneck of the different sparse tensor canonical decomposition methods and whose requirements differ between them (namely, whether tensor elements need to be randomly sampled for the calculation), by using the characteristics of the SW processor. The method fully exploits the characteristics of the SW architecture and fully considers the computational requirements of sparse tensor decomposition, so that multiple sparse tensor canonical decomposition methods can be executed efficiently and in parallel on the SW architecture, with dynamic load balance guaranteed to the greatest extent possible.
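To make the MTTKRP bottleneck concrete, the following is a minimal sketch in plain Python, not the patented SW26010 scheme. It assumes a third-order tensor stored as a list of coordinate-format nonzeros and computes the mode-0 result M[i][r] as the sum over nonzeros of x_ijk * B[j][r] * C[k][r]. The function names (`split_nonzeros`, `mttkrp_mode0`, `mttkrp_mode0_partitioned`), the contiguous near-equal split of nonzeros, and the serial loop standing in for parallel workers are all illustrative assumptions; the abstract does not describe the actual partitioning or communication strategy at this level of detail.

```python
from typing import List, Tuple

Nonzero = Tuple[int, int, int, float]  # (i, j, k, value), coordinate format


def split_nonzeros(nnz: List[Nonzero], parts: int) -> List[List[Nonzero]]:
    """Split the nonzeros into `parts` contiguous chunks of near-equal size.

    Hypothetical stand-in for data division: equal nonzero counts give each
    worker roughly the same amount of arithmetic.
    """
    base, extra = divmod(len(nnz), parts)
    chunks, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        chunks.append(nnz[start:start + size])
        start += size
    return chunks


def mttkrp_mode0(nnz, B, C, rows, rank):
    """Mode-0 MTTKRP: M[i][r] = sum of val * B[j][r] * C[k][r] over nonzeros."""
    M = [[0.0] * rank for _ in range(rows)]
    for i, j, k, val in nnz:
        for r in range(rank):
            M[i][r] += val * B[j][r] * C[k][r]
    return M


def mttkrp_mode0_partitioned(nnz, B, C, rows, rank, parts):
    """Compute one partial result per chunk, then reduce by summation.

    The chunks are processed serially here; on real hardware each chunk
    would go to a separate core (an assumption of this sketch).
    """
    total = [[0.0] * rank for _ in range(rows)]
    for chunk in split_nonzeros(nnz, parts):
        partial = mttkrp_mode0(chunk, B, C, rows, rank)
        for i in range(rows):
            for r in range(rank):
                total[i][r] += partial[i][r]
    return total


# Tiny 2 x 2 x 2 tensor with three nonzeros and rank-2 factor matrices.
nonzeros = [(0, 0, 0, 1.0), (0, 1, 1, 2.0), (1, 0, 1, 3.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[1.0, 0.5], [2.0, 1.0]]
result = mttkrp_mode0_partitioned(nonzeros, B, C, rows=2, rank=2, parts=2)
print(result)
```

Splitting by nonzero count rather than by index range is one simple way to keep the work per chunk even when nonzeros are clustered, at the cost of a reduction step because several chunks may update the same output row.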

Description

Technical field

[0001] The invention relates to the fields of tensor numerical algorithms, parallel computing, and the Shenwei architecture, and in particular to a sparse tensor canonical decomposition method based on data division and calculation distribution.

Background technique

[0002] The "Sunway TaihuLight" supercomputer system, developed by the National Parallel Computer Engineering Technology Research Center with support from the National 863 Program, was ranked the world's most powerful supercomputer by the international TOP500 organization four consecutive times from 2016 to 2017, and applications running on it have won the Gordon Bell Prize. The ultra-large-scale parallel applications that users have implemented on this system scale to millions of cores and span multiple fields, including deep learning, earthquake simulation, quantum circuit simulation, and climate simulation.

[0003] The "Sunway TaihuLight" supercomputer...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F15/80, G06F15/163, G06F13/30
CPC: G06F15/163, G06F15/8053, G06F13/30
Inventors: 杨海龙, 敦明, 孙庆骁, 李云春
Owner: BEIHANG UNIV