Neural network model block compression method, training method, computing device and system

A network model block compression technology for neural networks, applied in the field of neural networks. It addresses the problems that existing compression approaches cannot be adapted to neural network models that run on chips and that they reduce the running speed of the memristor array, and it achieves the effect of saving resource overhead.

Active Publication Date: 2019-05-21
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0019] Although these existing techniques can greatly reduce the size of the model, they are not suitable for neural network models deployed on chips, such as those based on memristors or TrueNorth, that perform matrix-vector multiplication operations.
For example...


Examples

Embodiment Construction

[0043] In order to enable those skilled in the art to better understand the present invention, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0044] Figure 3 is a schematic diagram showing an application scenario 1000 of the neural network model block compression technology according to the present invention.

[0045] As shown in Figure 3, the general inventive concept of the present disclosure is as follows: preliminary neural network training is performed for the neural network application 1100 to obtain the network model 1200; the network model 1200 is block-compressed at a predetermined compression rate by the network model block compression method 1300 and then retrained; compression and retraining are then repeated (compress, retrain, recompress, retrain, ...) so that fine-tuning improves the accuracy, until the predetermined iteration termination requi...
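The compress-and-retrain iteration described in [0045] can be sketched as a small driver loop. This is only an illustrative sketch, not the patent's implementation: the function `compress_and_retrain`, its callbacks `compress`, `retrain` and `should_stop`, and the `max_rounds` guard are all hypothetical names. Because the termination requirement is cut off in the text above, it is left here as a caller-supplied predicate rather than guessed at. The toy callbacks operate on a flat list of weights purely to make the loop runnable.

```python
def compress_and_retrain(model, compress, retrain, should_stop, max_rounds=100):
    """Alternate compression and retraining until should_stop says so.

    All four callables are hypothetical placeholders supplied by the caller;
    max_rounds is a safety guard that is not taken from the source text.
    """
    rounds = 0
    while rounds < max_rounds:
        model = retrain(compress(model))  # one compress-then-retrain round
        rounds += 1
        if should_stop(model, rounds):
            break
    return model, rounds


def toy_compress(ws):
    """Toy stand-in: zero the smallest-magnitude weight that is still nonzero."""
    nonzero = [i for i, x in enumerate(ws) if x != 0.0]
    if not nonzero:
        return list(ws)
    k = min(nonzero, key=lambda i: abs(ws[i]))
    return [0.0 if i == k else x for i, x in enumerate(ws)]


def toy_retrain(ws):
    """Toy stand-in: a real retraining step would update the surviving weights."""
    return list(ws)


def toy_stop(ws, rounds):
    """Toy stand-in: stop once two weights have been cut."""
    return ws.count(0.0) >= 2


final_model, n_rounds = compress_and_retrain(
    [0.5, -0.1, 0.3, 0.05], toy_compress, toy_retrain, toy_stop
)
print(final_model, n_rounds)
```

Passing the stopping rule in as a predicate keeps the loop independent of whichever criterion is actually used.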

Abstract

A network model block compression method for a neural network, comprising: a weight matrix obtaining step of obtaining the weight matrix of a trained network model of the neural network; a weight matrix blocking step of dividing the weight matrix, according to a predetermined array size, into an array consisting of a plurality of initial sub-blocks; a step of concentrating the weight elements to be cut, in which, according to the sums of the absolute values of the weights of the matrix elements in the sub-blocks, matrix elements with smaller weights are concentrated into the sub-blocks to be cut through row and column exchange, so that the sum of the absolute values of the weights in the sub-blocks to be cut is smaller than that in the other sub-blocks; and a sub-block cutting step of cutting the weights of the matrix elements in the sub-blocks to be cut, thereby obtaining the final weight matrix and compressing the network model of the neural network. Resource overhead can be saved, and a large-scale neural network can be deployed under limited resources.
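The four steps in the abstract can be illustrated with a minimal pure-Python sketch. It rests on several assumptions that are not stated in the source: each sub-block is scored by the sum of the absolute values of its weights, the row and column exchange is approximated by a simple greedy heuristic (sorting rows and columns by their own sums of absolute values), and "cutting" a sub-block means setting its weights to zero. The names `block_abs_sums`, `block_prune`, `bs` and `rate` are hypothetical.

```python
def block_abs_sums(w, bs):
    """Sum of absolute values of the weights in each bs x bs sub-block of w."""
    n_br, n_bc = len(w) // bs, len(w[0]) // bs
    return {
        (bi, bj): sum(
            abs(w[r][c])
            for r in range(bi * bs, (bi + 1) * bs)
            for c in range(bj * bs, (bj + 1) * bs)
        )
        for bi in range(n_br)
        for bj in range(n_bc)
    }


def block_prune(w, bs, rate):
    """Permute rows/columns, then zero the fraction `rate` of weakest blocks."""
    rows, cols = len(w), len(w[0])
    assert rows % bs == 0 and cols % bs == 0, "matrix must tile into blocks"
    # Assumed heuristic for the row/column exchange: order rows and columns
    # by their sum of absolute values so small weights gather in one corner.
    row_perm = sorted(range(rows), key=lambda r: sum(abs(x) for x in w[r]))
    col_perm = sorted(
        range(cols), key=lambda c: sum(abs(w[r][c]) for r in range(rows))
    )
    p = [[w[r][c] for c in col_perm] for r in row_perm]
    # Score every sub-block and pick the weakest ones as the blocks to cut.
    sums = block_abs_sums(p, bs)
    n_cut = int(rate * len(sums))
    cut = sorted(sums, key=sums.get)[:n_cut]
    for bi, bj in cut:
        for r in range(bi * bs, (bi + 1) * bs):
            for c in range(bj * bs, (bj + 1) * bs):
                p[r][c] = 0.0
    return p, row_perm, col_perm, sorted(cut)


weights = [
    [0.9, 0.1, 0.8, 0.0],
    [0.1, 0.0, 0.0, 0.1],
    [0.7, 0.0, 0.9, 0.1],
    [0.0, 0.1, 0.1, 0.0],
]
pruned, row_perm, col_perm, cut = block_prune(weights, bs=2, rate=0.75)
for row in pruned:
    print(row)
```

In this toy matrix the large weights are scattered across all four 2 x 2 blocks, so no block could be cut cheaply as-is; after the exchange they sit in a single block and the other three can be zeroed. The permutations are returned because the inputs and outputs of the layer have to be reordered consistently with the exchanged rows and columns.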

Description

Technical field

[0001] The present invention generally relates to the technical field of neural networks, and more specifically relates to a network model block compression method, a training method, a computing device and a hardware system for neural networks.

Background technique

[0002] With the gradual failure of Moore's Law, the progress of traditional chip technology has slowed down, and people have to face new applications and new devices. In recent years, neural network (NN) computing has made breakthroughs and has achieved high accuracy in many fields such as image recognition, language recognition, and natural language processing. However, neural networks require massive computing resources. It is difficult for traditional general-purpose processors to meet the computing needs of deep learning, and designing dedicated chips has become an important development direction.

[0003] Specifically, the modeling of neural networks is usually constructed...

Application Information

IPC(8): G06N3/08, G06N3/063
CPC: G06N3/063, G06N3/08
Inventors: 张悠慧, 季宇, 张优扬
Owner: TSINGHUA UNIV