The invention discloses a memory management method and device for neural network inference. The method comprises: dividing a memory space into a first type of region and a second type of region, wherein the first type of region is used only for storing feature map (FM) data with a life cycle of 1, and the second type of region is used for storing FM data with any life cycle; analyzing the neural network to which the memory space is to be allocated, and determining whether to enable the first type of region and the second type of region according to the number of multi-input layers in the neural network and the total number of layers of the neural network; and allocating memory space to the FM data of each layer in the neural network from the first type of region and/or the second type of region. The method adaptively selects a suitable memory management strategy according to the structure of the neural network. A greedy algorithm searches layer by layer for a memory allocation scheme, reducing the memory footprint of neural network inference as far as possible.
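The steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not define how a life cycle is counted, how the region decision is made, or how each region is laid out, so the following are all assumptions made for illustration. The life cycle of an FM is taken as the index distance from its producing layer to its last consumer; the region decision compares the share of multi-input layers against a hypothetical `threshold`; the first type of region is modeled as two alternating (ping-pong) slots; and the second type of region uses a greedy first-fit search, one layer at a time. The names `Layer`, `last_uses`, `choose_regions`, and `allocate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    inputs: list      # indices of the layers whose FM this layer reads
    out_size: int     # size of this layer's output FM, in bytes


def last_uses(layers):
    """Index of the last layer that reads each layer's output FM."""
    last = list(range(len(layers)))       # an unread FM "ends" at its own layer
    for j, layer in enumerate(layers):
        for i in layer.inputs:
            last[i] = max(last[i], j)
    return last


def choose_regions(layers, threshold=0.5):
    """Assumed rule: compare the share of multi-input layers to a threshold."""
    multi = sum(1 for layer in layers if len(layer.inputs) > 1)
    use_first = multi / len(layers) < threshold   # mostly chain-like network
    use_second = multi > 0                        # branches create longer-lived FMs
    return use_first, use_second


def allocate(layers, threshold=0.5):
    use_first, _ = choose_regions(layers, threshold)
    last = last_uses(layers)
    # Assumed definition: life cycle = distance to the last consumer, at least 1.
    life = [max(1, last[i] - i) for i in range(len(layers))]
    # FMs that do not qualify for the first region fall back to the second one.
    in_first = [use_first and life[i] == 1 for i in range(len(layers))]

    # First-type region: two slots, chosen by layer-index parity, so the FMs
    # of layers i and i + 1 never share a slot.
    slot = [0, 0]
    for i, layer in enumerate(layers):
        if in_first[i]:
            slot[i % 2] = max(slot[i % 2], layer.out_size)

    offsets, placed, second_size = {}, [], 0
    for i, layer in enumerate(layers):
        if in_first[i]:
            offsets[layer.name] = ("first", 0 if i % 2 == 0 else slot[0])
            continue
        # Second-type region: greedy first-fit among blocks still live at layer i.
        busy = sorted((off, size) for off, size, end in placed if end >= i)
        offset = 0
        for off, size in busy:
            if offset + layer.out_size <= off:
                break                      # the gap before this block is big enough
            offset = max(offset, off + size)
        placed.append((offset, layer.out_size, last[i]))
        offsets[layer.name] = ("second", offset)
        second_size = max(second_size, offset + layer.out_size)
    return offsets, slot[0] + slot[1], second_size


# A small residual-style network: "add" reads both conv1 and conv2.
net = [
    Layer("input", [], 100),
    Layer("conv1", [0], 100),
    Layer("conv2", [1], 100),
    Layer("add", [1, 2], 100),
    Layer("fc", [3], 10),
]
offsets, first_size, second_size = allocate(net)
print(offsets, first_size, second_size)
```

In this example only the output of `conv1` has a life cycle greater than 1, because `add` reads it two layers later, so it is the only FM placed in the second type of region. The remaining FMs share the two slots of the first type of region, giving 300 bytes in total against 410 bytes if every FM had its own buffer.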