Memory management method and device for neural network reasoning
A technology of neural network and memory management, applied in reasoning methods, biological neural network models, multiprogramming devices, etc.
Active Publication Date: 2021-03-09
SENSLAB INC
AI Technical Summary
Problems solved by technology
[0010] The technical problem solved by the present invention is how to reduce the memory footprint of neural network inference (rendered as "reasoning" in the translated title), and thereby reduce the hardware resources and hardware cost required to run a neural network.
Embodiment 1
[0087] As described below, an embodiment of the present invention provides a memory management method for neural network inference.
[0088] Referring to the flow chart in Figure 1, the memory management method for neural network inference is described in detail below through specific steps:
[0089] S101. Divide a memory space, where the divided memory space includes at least a first-type area and a second-type area.
[0090] The first type of area is used only to store FM (feature map) data with a life cycle of 1, while the second type of area can store FM data with any life cycle.
[0091] In other embodiments, the divided memory space includes at least one of the first type of area and the second type of area.
[0092] The writing method of the first type of area is:
[0093] This memory area stores FM data with a lifetime of exactly 1; that is, the OFM (output feature map) data of the nth layer of the neural network is used only by the next layer, so the network will start to cal...
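The notion of lifetime in [0093] can be illustrated with a minimal Python sketch. It assumes, beyond what the excerpt states, that the lifetime of a layer's OFM is the distance from the producing layer to its last consumer. The names fm_lifetimes, split_by_lifetime and the example network are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple


def fm_lifetimes(consumers: Dict[int, List[int]]) -> Dict[int, int]:
    """Map each producing layer n to the lifetime of its OFM.

    consumers[n] lists the layers that read layer n's output.
    ASSUMPTION: lifetime = (index of last consumer) - n, so an output
    read only by layer n + 1 has lifetime 1.
    """
    return {n: max(users) - n for n, users in consumers.items() if users}


def split_by_lifetime(consumers: Dict[int, List[int]]) -> Tuple[List[int], List[int]]:
    """Separate OFMs eligible for the first-type area from the rest."""
    lifetimes = fm_lifetimes(consumers)
    eligible_first = sorted(n for n, life in lifetimes.items() if life == 1)
    needs_second = sorted(n for n, life in lifetimes.items() if life != 1)
    return eligible_first, needs_second


# Hypothetical network: layer 1's output is reused by layer 4 (a skip connection).
consumers = {0: [1], 1: [2, 4], 2: [3], 3: [4]}
eligible_first, needs_second = split_by_lifetime(consumers)
print(eligible_first)  # [0, 2, 3]
print(needs_second)    # [1]
```

Only the output of layer 1 outlives the next layer here, so it is the only feature map that could not be placed in a first-type area.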
Embodiment 2
[0170] As described below, an embodiment of the present invention provides a memory management device for neural network inference.
[0171] The memory management device for neural network inference comprises:
[0172] a processor adapted to load and execute instructions of a software program;
[0173] a memory adapted to store a software program comprising instructions for performing the following steps:
[0174] Divide the memory space, where the divided memory space includes at least the first type of area and the second type of area; the first type of area is used only to store FM data with a lifetime of 1, and the second type of area can store FM data of any life cycle;
[0175] Analyze the neural network to which memory space is to be allocated, to obtain the number of layers with multiple inputs in the neural network;
[0176] Determine whether to enable the first type of area and whether to enable the second type of area according to the number of layer...
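Steps [0175] and [0176] can be sketched as follows. The excerpt truncates the actual decision rule, so the ratio threshold and the three-way outcome below are placeholders chosen for illustration, not the patent's criteria. The names count_multi_input_layers, choose_regions and the parameter high are hypothetical.

```python
from typing import Dict, List, Tuple


def count_multi_input_layers(inputs: Dict[int, List[int]]) -> int:
    """Count layers that read more than one input feature map."""
    return sum(1 for sources in inputs.values() if len(sources) > 1)


def choose_regions(inputs: Dict[int, List[int]], high: float = 0.5) -> Tuple[bool, bool]:
    """Return (enable_first_type, enable_second_type).

    HYPOTHETICAL rule: compare the share of multi-input layers with a
    placeholder threshold. The patent's real rule is not in the excerpt.
    """
    total = len(inputs)
    ratio = count_multi_input_layers(inputs) / total if total else 0.0
    if ratio == 0.0:
        return True, False   # no multi-input layers: first-type area only
    if ratio >= high:
        return False, True   # many multi-input layers: second-type area only
    return True, True        # otherwise enable both areas


# Hypothetical 5-layer network in which layer 4 reads from layers 1 and 3.
inputs = {0: [], 1: [0], 2: [1], 3: [2], 4: [1, 3]}
multi = count_multi_input_layers(inputs)
decision = choose_regions(inputs)
print(multi, decision)  # 1 (True, True)
```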
Abstract
A memory management method and device for neural network inference. The method comprises: dividing the memory space into a first type of area and a second type of area, where the first type of area is used only to store FM data with a lifetime of 1 and the second type of area can store FM data of any life cycle; analyzing the neural network to which memory space is to be allocated and, according to the number of multi-input layers in the neural network and the total number of layers of the neural network, determining whether to enable the first type of area and whether to enable the second type of area; and allocating memory space for the FM data of each layer in the neural network from the first type of area and/or the second type of area. The invention adaptively selects a suitable memory management strategy according to the structure of the neural network. It uses a greedy algorithm to search layer by layer for an optimized memory allocation scheme, which reduces the memory usage of neural network inference as far as possible.
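A layer-by-layer greedy allocation of the kind the abstract mentions might look like the sketch below. The abstract does not state the greedy criterion, so lowest-offset first-fit is an assumption made here for illustration. The function greedy_allocate, its parameters and the example sizes are hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_allocate(sizes: List[int], last_use: List[int]) -> Tuple[Dict[int, int], int]:
    """Assign a memory offset to each layer's OFM, one layer at a time.

    sizes[n]    -- bytes needed by layer n's OFM
    last_use[n] -- index of the last layer that reads that OFM
    ASSUMPTION: each OFM takes the lowest free offset that fits (first-fit).
    Returns (offsets, peak_bytes).
    """
    live: Dict[int, Tuple[int, int]] = {}  # layer -> (offset, size)
    offsets: Dict[int, int] = {}
    peak = 0
    for n, size in enumerate(sizes):
        # Release buffers whose last reader ran before layer n.
        live = {k: v for k, v in live.items() if last_use[k] >= n}
        # Scan live blocks in address order and stop at the first gap that fits.
        offset = 0
        for start, length in sorted(live.values()):
            if offset + size <= start:
                break
            offset = max(offset, start + length)
        live[n] = (offset, size)
        offsets[n] = offset
        peak = max(peak, offset + size)
    return offsets, peak


# Hypothetical network: layer 1's OFM stays live until layer 4.
sizes = [4, 8, 4, 4, 2]
last_use = [1, 4, 3, 4, 4]
offsets, peak = greedy_allocate(sizes, last_use)
print(offsets)  # {0: 0, 1: 4, 2: 0, 3: 12, 4: 0}
print(peak)     # 16
```

In this example the peak is 16 bytes, against 22 bytes if every OFM received its own buffer, because freed space at offset 0 is reused by layers 2 and 4.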
Description
Technical field
[0001] The invention relates to the technical field of artificial intelligence, in particular to a memory management method and device for neural network inference.
Background
[0002] Thanks to the efficiency and accuracy of deep neural networks, especially in tasks such as detection, recognition and classification, their application in daily life has continued to expand and diversify in recent years. As a result, various embedded neural network processors (NPUs) have emerged.
[0003] However, deep neural networks usually occupy a large amount of memory, which raises hardware requirements and directly increases hardware production cost. Reducing the memory usage of deep neural networks is therefore an urgent problem: solving it would greatly lower the hardware requirements of deep neural networks and save cost.
[0004] The existing neural network ...