
A matrix transposition method and system based on a Shenwei 26010 processor

A matrix transposition and processor technology, applied in the fields of memory systems, electrical digital data processing, and instruments. It addresses limited computing performance and a cumbersome computing process, and achieves improved computing performance, low access latency, and higher transposition efficiency.

Pending Publication Date: 2019-03-08
成都申威科技有限责任公司

AI Technical Summary

Problems solved by technology

After a calculation is completed, if matrix transposition is required, the data must be sent back to the master core through direct memory access (DMA) for transposition, and DMA then sends the data to the slave core LDM again. This process is very cumbersome and greatly limits computing performance.




Embodiment Construction

[0045] The principles and features of the present invention are described below in conjunction with the accompanying drawings, and the examples given are only used to explain the present invention, and are not intended to limit the scope of the present invention.

[0046] As shown in Figure 1, an embodiment of the present invention provides a matrix transposition method based on the Shenwei 26010 processor. The processor includes 4 core groups, and each core group includes 1 master core and 64 slave cores. The method includes the following steps:

[0047] S1. Divide the matrix A stored in the main core into 64 sub-matrices, and number the 64 sub-matrices.

[0048] S2. Number the 64 slave cores so that they correspond to the numbers of the 64 sub-matrices, and read each sub-matrix into the slave core whose number matches it.

[0049] S3. Transpose the sub-matrix in each of the slave cores to obtain 64 transposed slave c...
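Steps S1 to S3 can be sketched in plain Python as follows. This is only an illustration running on one machine, not code for the Shenwei 26010. The function names `split_into_blocks` and `transpose_local`, the row-major numbering of sub-matrices, and the restriction to a square matrix whose side is a multiple of 8 are all assumptions made for the sketch; the source does not state how the sub-matrices are numbered.

```python
def split_into_blocks(a, grid=8):
    """S1/S2 sketch: split square matrix `a` (a list of rows) into grid*grid
    sub-matrices, keyed by a hypothetical row-major slave core number."""
    n = len(a)
    if n % grid != 0 or any(len(row) != n for row in a):
        raise ValueError("expected a square matrix whose side is a multiple of grid")
    bs = n // grid  # side length of each sub-matrix
    blocks = {}
    for bi in range(grid):
        for bj in range(grid):
            core_id = bi * grid + bj  # assumed numbering, not given by the source
            blocks[core_id] = [row[bj * bs:(bj + 1) * bs]
                               for row in a[bi * bs:(bi + 1) * bs]]
    return blocks


def transpose_local(block):
    """S3 sketch: transpose one sub-matrix, as each slave core would do locally."""
    return [list(col) for col in zip(*block)]


# Usage: a 16x16 matrix yields 64 sub-matrices of size 2x2.
a = [[i * 16 + j for j in range(16)] for i in range(16)]
blocks = split_into_blocks(a)
transposed = {core_id: transpose_local(b) for core_id, b in blocks.items()}
```

In the method described, each slave core would handle only the sub-matrix carrying its own number, so the 64 local transposes are independent of one another.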



Abstract

The invention relates to a matrix transposition method based on a Shenwei 26010 processor, comprising the following steps: S1, dividing a matrix A stored in the main core into 64 sub-matrices and numbering the 64 sub-matrices; S2, numbering the 64 slave cores to correspond to the numbers of the 64 sub-matrices, and reading each sub-matrix into the slave core whose number corresponds to it; S3, transposing the sub-matrix in each slave core to obtain 64 transposed slave cores; S4, arranging the 64 transposed slave cores into a matrix B of 8*8 form according to the numbering order of the slave cores, and transposing the matrix B through inter-core register communication to obtain a matrix C; S5, storing the matrix C in the main core to complete the transposition. The larger matrix is decomposed into smaller blocks, and transposition efficiency is improved by transposing the blocks in parallel with the matrix transfer.
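The five steps in the abstract can be checked end to end with a small single-machine sketch. It assumes a square matrix whose side is a multiple of 8, and it models the inter-core register communication of S4 as a plain swap of blocks between grid positions (i, j) and (j, i); the function name `blocked_transpose` and the constant `GRID` are hypothetical and do not come from the patent.

```python
GRID = 8  # the 8*8 arrangement of slave cores described in S4


def blocked_transpose(a):
    """Sketch of S1-S5 for a square matrix `a` given as a list of rows."""
    n = len(a)
    if n % GRID != 0 or any(len(row) != n for row in a):
        raise ValueError("expected a square matrix whose side is a multiple of 8")
    bs = n // GRID

    # S1/S2: cut A into 64 sub-matrices; cores[bi][bj] stands in for one slave core.
    cores = [[[row[bj * bs:(bj + 1) * bs] for row in a[bi * bs:(bi + 1) * bs]]
              for bj in range(GRID)]
             for bi in range(GRID)]

    # S3: every slave core transposes its own sub-matrix.
    cores = [[[list(col) for col in zip(*blk)] for blk in row] for row in cores]

    # S4: transpose the 8*8 grid of blocks (matrix B). On the real hardware this
    # exchange would use inter-core register communication; here it is a swap.
    for i in range(GRID):
        for j in range(i + 1, GRID):
            cores[i][j], cores[j][i] = cores[j][i], cores[i][j]

    # S5: write the blocks back as one matrix C, as stored in the main core.
    c = []
    for bi in range(GRID):
        for r in range(bs):
            out_row = []
            for bj in range(GRID):
                out_row.extend(cores[bi][bj][r])
            c.append(out_row)
    return c


# Usage: the result should match an ordinary element-wise transpose.
a = [[i * 16 + j for j in range(16)] for i in range(16)]
assert blocked_transpose(a) == [list(col) for col in zip(*a)]
```

The sketch works because transposing each block and then transposing the grid of blocks moves element (r, s) of block (i, j) to element (s, r) of block (j, i), which is exactly where a full transpose places it.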

Description

Technical field

[0001] The invention relates to the field of matrix transposition methods for processors, in particular to a matrix transposition method and system based on the Shenwei 26010 processor.

Background technique

[0002] The Shenwei 26010 processor is a high-performance computing processor developed independently in China. The processor uses an extended ALPHA architecture instruction set and contains 4 core groups. Each core group consists of an operation control core (the main core, a 64-bit RISC general-purpose processor unit) and an operation core array, namely 64 computing cores (slave cores) arranged in a mesh of 8 rows and 8 columns. Both the main core and the slave cores support 256-bit vector floating-point instruction extensions; each slave core contains 32 registers, 64KB of user-controllable LDM and 16KB of program space, and the delay of direct access to the local LDM is extremely small, and the slave core hardware pipeline supports Simultane...
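The 64KB LDM figure bounds how large a sub-matrix a slave core can hold at once. The following back-of-the-envelope sketch estimates the largest square tile that fits. The helper `max_square_tile`, the 8-byte element size, and the idea that the whole LDM is available for matrix data are assumptions made for illustration; the source does not state them.

```python
import math

LDM_BYTES = 64 * 1024  # user-controllable LDM per slave core, from the text


def max_square_tile(elem_bytes=8, buffers=1, ldm_bytes=LDM_BYTES):
    """Largest side length s such that `buffers` tiles of s*s elements fit in LDM.

    Assumes the whole LDM can be used for matrix data, which is optimistic.
    """
    return math.isqrt(ldm_bytes // (elem_bytes * buffers))


one_buffer = max_square_tile()            # a single tile of 8-byte elements
two_buffers = max_square_tile(buffers=2)  # separate input and output tiles
```

Under these assumptions, a slave core that keeps separate input and output buffers can hold a smaller tile than one that transposes in place, so the choice of local transpose strategy affects how large the matrix A can be for a single pass.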


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/345, G06F17/16
CPC: G06F9/345, G06F17/16
Inventor: 胡波李一明秦旭彭星洪李晋
Owner: 成都申威科技有限责任公司