Neural network model block compression method, training method, computing device and system

A network model compression technique, applied in the field of neural networks, addresses the problems that existing compression methods cannot compress the fixed weight encoding of memristor arrays, slow down the memristor array, or do not suit neural network models that target such chips. It achieves the effect of saving resource overhead.

Active Publication Date: 2019-05-21

TSINGHUA UNIV

Cites: 6; cited by: 9


AI Technical Summary

Problems solved by technology

[0019] Although these existing technologies can greatly reduce the size of the model, they are not suitable for neural network models that target chips, such as memristor-based chips and TrueNorth, which perform matrix-vector multiplication operations.

For example, because the weights removed by weight pruning are not concentrated, the number of required arrays cannot be reduced; weight sharing slows down memristor arrays; and the weight encoding of memristor arrays is fixed and cannot be compressed.
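
The first point can be illustrated with a small sketch. It is not taken from the patent: the matrix size, the 4x4 array size, the 50% pruning rate and the helper name `count_nonzero_blocks` are all assumptions chosen for illustration. It compares how many fixed-size arrays still hold at least one nonzero weight after scattered (unstructured) pruning and after pruning whole blocks.

```python
import numpy as np

def count_nonzero_blocks(w, block):
    """Count block x block tiles that still hold at least one nonzero weight."""
    rows, cols = w.shape
    tiles = w.reshape(rows // block, block, cols // block, block)
    return int(np.any(tiles != 0, axis=(1, 3)).sum())

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))   # hypothetical weight matrix
block = 4                     # hypothetical array size (4 x 4)

# Unstructured pruning: zero the half of the weights with the smallest magnitude.
threshold = np.median(np.abs(w))
unstructured = np.where(np.abs(w) < threshold, 0.0, w)

# Block pruning: zero the two tiles with the smallest sum of absolute values.
tile_sums = np.abs(w).reshape(2, block, 2, block).sum(axis=(1, 3))
keep = np.ones(tile_sums.size)
keep[np.argsort(tile_sums, axis=None)[:2]] = 0.0
block_pruned = w * np.kron(keep.reshape(2, 2), np.ones((block, block)))

print("arrays needed, unstructured:", count_nonzero_blocks(unstructured, block))
print("arrays needed, block-pruned:", count_nonzero_blocks(block_pruned, block))
```

Both variants zero the same number of weights, but only the block-pruned matrix is guaranteed to free whole arrays; scattered zeros usually leave every tile partly occupied.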

Embodiment Construction

[0043] In order to enable those skilled in the art to better understand the present invention, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0044] Figure 3 is a schematic diagram showing an application scenario 1000 of the neural network model block compression technology according to the present invention.

[0045] As shown in Figure 3, the general inventive concept of the present disclosure is as follows: perform preliminary neural network training for the neural network application 1100 to obtain the network model 1200; apply block compression to the network model 1200 at a predetermined compression rate using the network model block compression method 1300; retrain; and then repeat the cycle (compress, retrain, recompress, retrain, ...) to fine-tune the model and improve its accuracy, until the predetermined iteration termination requi...
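
The train, compress, retrain cycle described in [0045] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the model is a single linear layer trained by gradient descent, the block size and compression rate are arbitrary, and because the termination requirement is not spelled out in the text available here, a fixed round count (`max_rounds`) stands in for it. The names `block_mask`, `train` and `compress_and_retrain` are hypothetical.

```python
import numpy as np

def block_mask(w, block, rate):
    """0/1 mask that zeroes the `rate` fraction of tiles with the smallest sum of |w|."""
    r, c = w.shape[0] // block, w.shape[1] // block
    sums = np.abs(w).reshape(r, block, c, block).sum(axis=(1, 3))
    keep = np.ones(sums.size)
    keep[np.argsort(sums, axis=None)[: int(round(rate * sums.size))]] = 0.0
    return np.kron(keep.reshape(r, c), np.ones((block, block)))

def train(w, x, y, mask, steps=200, lr=0.05):
    """Gradient descent on mean squared error; masked (pruned) weights stay at zero."""
    for _ in range(steps):
        grad = x.T @ (x @ w - y) / len(x)
        w = (w - lr * grad) * mask
    return w

def compress_and_retrain(x, y, shape, block=2, rate=0.5, max_rounds=3):
    rng = np.random.default_rng(0)
    mask = np.ones(shape)
    w = train(rng.normal(size=shape), x, y, mask)   # preliminary training
    for _ in range(max_rounds):                     # assumed stop rule (hypothetical)
        mask = block_mask(w, block, rate)           # block compression
        w = train(w * mask, x, y, mask)             # retraining (fine-tuning)
    return w, mask

# Toy data: a hypothetical 4-input, 4-output linear task.
rng = np.random.default_rng(1)
x = rng.normal(size=(64, 4))
y = x @ rng.normal(size=(4, 4))
w, mask = compress_and_retrain(x, y, shape=(4, 4))
```

Applying the mask inside `train` is one simple way to keep pruned blocks at zero while the remaining weights are fine-tuned.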

Abstract

A network model block compression method for a neural network, comprising: a weight matrix obtaining step of obtaining the weight matrix of a trained network model of the neural network; a weight matrix blocking step of dividing the weight matrix into an array of initial sub-blocks according to a predetermined array size; a step of concentrating the weight elements to be cut, in which, according to the sum of the absolute values of the weights of the matrix elements in each sub-block, matrix elements with smaller weights are concentrated into the sub-blocks to be cut by exchanging rows and columns, so that the sum of the absolute values of the weights in the sub-blocks to be cut is smaller than that in the other sub-blocks; and a sub-block cutting step of cutting the weights of the matrix elements in the sub-blocks to be cut, yielding the final weight matrix and thereby compressing the network model of the neural network. Resource overhead can be saved, and a large-scale neural network can be deployed under limited resources.
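
The four steps of the abstract can be sketched in a few lines. This is an illustration under assumptions, not the claimed procedure: the abstract does not say how rows and columns are chosen for exchange, so the sketch substitutes a simple heuristic (sort rows and columns by their sum of absolute values, which gathers small weights toward one corner) and applies it before partitioning. The function name `block_compress` and its parameters are hypothetical.

```python
import numpy as np

def block_compress(w, block, rate):
    """Permute, partition and prune `w`; returns (compressed, row_perm, col_perm)."""
    # Concentration step (assumed heuristic): order rows and columns by their
    # sum of absolute values so that small weights cluster in the same tiles.
    row_perm = np.argsort(np.abs(w).sum(axis=1))
    col_perm = np.argsort(np.abs(w).sum(axis=0))
    p = w[row_perm][:, col_perm]

    # Blocking step: view the matrix as an r x c grid of block x block tiles
    # and score each tile by the sum of the absolute values of its weights.
    r, c = p.shape[0] // block, p.shape[1] // block
    sums = np.abs(p).reshape(r, block, c, block).sum(axis=(1, 3))

    # Cutting step: zero the `rate` fraction of tiles with the smallest score.
    keep = np.ones(sums.size)
    keep[np.argsort(sums, axis=None)[: int(rate * sums.size)]] = 0.0
    mask = np.kron(keep.reshape(r, c), np.ones((block, block)))
    return p * mask, row_perm, col_perm

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))          # hypothetical trained weight matrix
compressed, row_perm, col_perm = block_compress(w, block=4, rate=0.5)
```

Because rows and columns were exchanged, the compressed matrix has to be used with the input vector permuted by `col_perm`; its output is then the original output permuted by `row_perm`.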

Description

Technical field

[0001] The present invention generally relates to the technical field of neural networks, and more specifically to a network model block compression method, a training method, a computing device and a hardware system for neural networks.

Background

[0002] With the gradual failure of Moore's Law, the progress of traditional chip technology has slowed down, and people have to face new applications and new devices. In recent years, neural network (NN) computing has made breakthroughs and has achieved high accuracy in many fields such as image recognition, language recognition, and natural language processing. However, neural networks require massive computing resources. It is difficult for traditional general-purpose processors to meet the computing needs of deep learning, and designing dedicated chips has become an important development direction.

[0003] Specifically, the modeling of neural networks is usually constructed...

Application Information

IPC(8): G06N3/08, G06N3/063

CPC: G06N3/063, G06N3/08

Inventor: 张悠慧, 季宇, 张优扬

Owner: TSINGHUA UNIV
