A weight adjustment-based resistive random access memory training device and training method
By using a weight-adjustment-based resistive random access memory (RRAM) training device and method, the difficulties in miniaturization of CMOS digital computers and the efficiency problems of the von Neumann architecture are solved, achieving efficient and accurate neural network training, overcoming the nonlinear effects of RRAM, and improving training speed and parallelism.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI INSTITUTE OF TECHNICAL PHYSICS CHINESE ACADEMY OF SCIENCES
- Filing Date
- 2022-07-11
- Publication Date
- 2026-06-16
AI Technical Summary
Existing CMOS-based digital computers face difficulties in miniaturization, and the memory wall and power wall caused by the traditional von Neumann architecture affect the computational performance of neural networks. Existing resistive random access memory training methods suffer from nonlinear effects and immature training methods.
A weight-adjustment-based resistive random access memory (RRAM) training device is adopted, comprising an RRAM array, a conductance read/write device, and peripheral circuits. A neural network is deployed in the array, forward propagation is computed, and a weight planning graph is drawn using the backpropagation algorithm. Training is performed using pulse signals, and the training process is optimized by a differential circuit and a pulse update method.
It accelerates the training process without sacrificing accuracy, improves training efficiency and accuracy, reduces the nonlinear effects of resistive switching memory, and enhances training speed and parallelism.
Figure CN117436498B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the hardware field of artificial intelligence brain-like computing, and more specifically, relates to a fast neural network learning scheme based on resistive random access memory.
Background Technology
[0002] Digital computers based on complementary metal-oxide-semiconductor (CMOS) technology have made significant progress, particularly in transistor integration. For more than half a century, the number of transistors that can be placed on an integrated circuit doubled roughly every 18 months, in line with Moore's Law. However, as new process nodes are introduced and devices shrink to ever fewer atoms, existing materials and technologies find it increasingly difficult to lay out components at smaller scales. Furthermore, at these scales some devices can no longer be analyzed using classical semiconductor device physics alone; quantum mechanical effects must be considered, which makes integrated circuit design more complex. In addition, because of the finite size of atoms, some devices or coatings cannot be made any smaller, further limiting the number of transistors that can be placed on an integrated circuit.
[0003] The development of artificial intelligence has also increased the demand for dedicated chips for neural network computing. The current hardware platforms for implementing neural network algorithms (such as CPUs, GPUs, and FPGAs) are based on the von Neumann architecture, which separates memory from computation. Because main memory and the processor are separate, the von Neumann architecture consumes a great deal of time and energy moving data between the two. The resulting "memory wall" and "power wall" also limit the efficiency of this architecture in computing large-scale neural networks.
[0004] Based on these two needs, a new type of AI chip that overcomes the "memory wall" and "power wall" and is not based on CMOS has been put on the researchers' agenda. Researchers hope to use memristors, a non-volatile device, to build neural network hardware accelerators for AI tasks. Memristors offer two main advantages for neural network hardware accelerators: first, the conductance characteristics of memristors can better establish Hodgkin-Huxley models to simulate the basic functions of neurons; second, memristor-based cross arrays can efficiently and quickly perform vector-matrix multiplication and convolution operations.
[0005] The resistance-switching characteristic of a memristor refers to the fact that, under the influence of a specific external electrical signal, the resistance value of the device will switch between (at least) two stable resistance values, and the resistance state can be maintained when the external electrical signal is removed. This type of memristor used in the field of information storage is also known as resistive random access memory.
[0006] Currently, neural network training methods based on resistive random access memory (RRAM) fall into two main types: off-site (ex situ) training and native (in situ) training. In off-site training, the neural network parameters are trained before the network is deployed to the memristor array. Its advantage is that any training algorithm, including state-of-the-art ones, can be used in software to reach optimal performance without incurring much hardware overhead. In native training, the network is deployed in the RRAM array before training. Its advantages are that it can adapt to hardware variations and has stronger real-time performance. However, native training methods are not yet mature, the theory is incomplete, and the training process must account for the edge effects and nonlinearity of RRAM devices.
Summary of the Invention
[0007] This invention proposes a weight-adjustment-based training device and method for resistive random access memory (RRAM). The training device includes an RRAM array, an embedded conductance read/write device, and peripheral circuitry. The training method includes deploying a neural network, forward propagation calculation, backpropagation calculation, drawing a weight planning graph using the backpropagation algorithm, and updating the weights using pulses. The key features are: the memristor array is built from differential circuits of specially designed RRAMs; the RRAM array is initialized with pulses that bring the RRAM conductance to its minimum value; the training method maps the backpropagation algorithm to a corresponding weight planning graph; and, during training, pulse signals are applied simultaneously to a pair of RRAMs, with a low-cost read after each pulse to ensure high training efficiency and accuracy.
[0008] Specifically, the method for deploying neural networks in a resistive random access memory array is as follows:
[0009] The node weight values between corresponding layers of the neural network form a weight matrix, which is mapped to the conductance value of the resistive random access memory (RRAM) at the corresponding crossover point in the RRAM array. A pair of RRAM conductance values, denoted as w1 and w2 respectively, are used, and their difference w = w1 - w2 or w = w2 - w1 represents the weight of a certain crossover point.
[0010] Specifically, the conductance update method of the resistive switching memory is as follows: the conductance is adjusted by applying positive and negative pulses. When a positive pulse is applied, the conductance value of the resistive switching memory gradually increases with the number of pulses, and when a negative pulse is applied, the conductance value of the resistive switching memory gradually decreases with the number of pulses. Generally speaking, this adjustment relationship is neither linear nor uniform.
[0011] Specifically, the initialization of the resistive random access memory should involve repeatedly using negative pulses to bring the device conductance close to or to its minimum value.
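The pulse-based conductance update and the initialization with repeated negative pulses can be illustrated with a minimal Python sketch. The device model is an assumption made only for illustration: `apply_pulse`, the conductance window `G_MIN` and `G_MAX`, the `rate` parameter, and the stopping tolerance `tol` are hypothetical and are not specified by the invention. The model only reproduces the stated qualitative behavior: conductance rises with positive pulses, falls with negative pulses, and does so in unequal steps.

```python
G_MIN, G_MAX = 1.0, 100.0   # hypothetical conductance window, in microsiemens


def apply_pulse(g, polarity, rate=0.1):
    """Toy nonlinear device model (an assumption, not the patented device).

    A positive pulse moves g a fixed fraction of the way toward G_MAX and a
    negative pulse moves it the same fraction toward G_MIN, so the step size
    shrinks as the device approaches either bound.
    """
    target = G_MAX if polarity > 0 else G_MIN
    return g + rate * (target - g)


def initialize(g, tol=1e-3, max_pulses=10_000):
    """Apply negative pulses until a read shows no significant change."""
    pulses = 0
    while pulses < max_pulses:
        g_next = apply_pulse(g, -1)      # one negative pulse
        pulses += 1
        if abs(g - g_next) < tol:        # read back and compare
            return g_next, pulses
        g = g_next
    return g, pulses


g0, n = initialize(60.0)
print(f"G0 = {g0:.4f} uS after {n} negative pulses")
```

The stopping rule compares successive reads rather than testing against a known minimum, since in practice the minimum conductance of a given device may not be known in advance.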
[0012] Specifically, the resistive random access memory array implements vector-matrix multiplication for neural network forward propagation computation in the following manner:
[0013] Apply a corresponding input voltage pulse signal to each row of the resistive random access memory array;
[0014] The input electrical signal is multiplied by the conductance of each resistive random access memory device at the intersection of the resistive random access memory array, and the results are superimposed and output from the corresponding column in the form of current. The calculation meaning in the corresponding neural network is the weighted sum of the activation values of multiple neurons.
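[0015] The specific weighted summation operation can be represented by Ohm's law as $I_j = \sum_i V_i G_{i,j}$, where $G_{i,j}$ is the conductance value at the corresponding array intersection and is used to represent the weight, $I_j$ is the output current of column j, and $V_i$ is the input voltage applied to row i. Because a conductance value cannot actually be negative, the conductances of two resistive random access memories are used together to represent one weight.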
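The column-current summation $I_j = \sum_i V_i G_{i,j}$ can be checked numerically with a short Python sketch. The function names and the sample voltages and conductances are hypothetical. Subtracting the currents of two arrays is shown as one possible way to obtain signed weights from non-negative conductances; it is an illustration of the differential idea, not a statement of how the circuit performs the subtraction.

```python
def crossbar_currents(voltages, conductances):
    """Column currents I_j = sum_i V_i * G[i][j] for a rows-by-columns array."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def differential_currents(voltages, g_first, g_second):
    """Signed weighted sums obtained from two non-negative conductance arrays."""
    i_first = crossbar_currents(voltages, g_first)
    i_second = crossbar_currents(voltages, g_second)
    return [p - m for p, m in zip(i_first, i_second)]


V = [0.25, 0.5]                    # input voltages, one per row (volts)
G1 = [[4.0, 2.0], [2.0, 8.0]]      # conductances of the w1 devices (uS)
G2 = [[2.0, 4.0], [1.0, 4.0]]      # conductances of the w2 devices (uS)

print(differential_currents(V, G1, G2))   # [1.0, 1.5], in microamperes
```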
[0016] Specifically, in the training method that exploits the differential circuit architecture, the forward propagation computation is expressed as $a_j^l = f\left(\sum_k w_{jk}^l a_k^{l-1} + b_j^l\right)$, where $a_j^l$ is the activation value of the j-th neuron in the l-th layer, $b_j^l$ is the bias of the j-th neuron in the l-th layer, $w_{jk}^l$ is the weight connecting the k-th neuron in layer l-1 to the j-th neuron in layer l, and $f$ is the activation function.
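A single layer of this forward propagation, with each weight taken as the difference of a pair of conductances, might be sketched in Python as follows. The function `layer_forward`, the stand-in activation `relu`, and all numeric values are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def layer_forward(a_prev, w1, w2, bias, f):
    """a_j = f(sum_k (w1[j][k] - w2[j][k]) * a_prev[k] + bias[j])."""
    out = []
    for j in range(len(bias)):
        z = sum((w1[j][k] - w2[j][k]) * a_prev[k] for k in range(len(a_prev)))
        out.append(f(z + bias[j]))
    return out


def relu(z):
    """Stand-in activation function used only for this example."""
    return max(0.0, z)


a_prev = [1.0, 2.0]                  # activations of layer l-1
w1 = [[3.0, 1.0], [1.0, 1.0]]        # first device of each differential pair
w2 = [[1.0, 2.0], [2.0, 3.0]]        # second device of each differential pair
bias = [0.5, 0.0]

print(layer_forward(a_prev, w1, w2, bias, relu))   # [0.5, 0.0]
```

Passing the activation as a parameter keeps the sketch independent of any particular choice of $f$.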
[0017] Specifically, the rules for drawing the weight planning graph are as follows:
[0018] (1) Using the forward propagation result and the backpropagation calculation, compute the backpropagation error of each neuron, and compute the required weight adjustment according to $\Delta w_{ij} = \eta \cdot a_i \cdot \delta_j$, where $a_i$ and $\delta_j$ are the corresponding neuron activation value and error, and $\eta$ is the learning rate. With $\delta_j$ as the ordinate and $a_i$ as the abscissa, plot in a rectangular coordinate system the curves on which $\Delta w_{ij}$ takes a set of suitable values;
[0019] (2) Make the first marking on the curve plot: the region between two curves $\Delta w_{ij} = a$ and $\Delta w_{ij} = b$ is marked $t$ (where $t$ is the smaller of the absolute values of $a$ and $b$), and the region enclosed by the curve with the smallest positive $\Delta w_{ij}$, the curve with the largest negative $\Delta w_{ij}$, and $a_i = 0$ is marked 0 (the coordinate system only considers the first and third quadrants);
[0020] (3) Starting from the origin, divide the parameter space into several rectangular regions and re-mark them: if, within a region A, the part first marked $t_0$ has the largest area, region A is re-marked as $t_0$;
[0021] (4) The resulting re-marked graph is the weight planning graph, and each marking is the planned weight conductance value calculated using the BP algorithm.
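The four drawing rules above can be approximated numerically. The following Python sketch is one possible reading of them, under stated assumptions: the area comparison in rule (3) is estimated by sampling each rectangular region on a regular sub-grid, points beyond the outermost curve are given the outermost level, and only the first quadrant is tabulated. The names `first_mark` and `planning_graph`, the cell size, and the sample count are hypothetical.

```python
from collections import Counter


def first_mark(dw, levels):
    """Rule (2): mark a point with the bounding contour level of smaller magnitude."""
    sign = 1 if dw >= 0 else -1
    mark = 0                                  # inside the innermost curves
    for level in sorted(levels):
        if abs(dw) >= level:
            mark = level
    return sign * mark


def planning_graph(eta, levels, cell, n_cells, samples=10):
    """Rules (3)-(4): re-mark each square cell with its largest-area first mark.

    Returns {(ix, iy): mark}, where ix indexes the activation axis a_i and
    iy indexes the error axis delta_j.
    """
    graph = {}
    for ix in range(n_cells):
        for iy in range(n_cells):
            votes = Counter()
            for sx in range(samples):
                for sy in range(samples):
                    a = (ix + (sx + 0.5) / samples) * cell   # activation a_i
                    d = (iy + (sy + 0.5) / samples) * cell   # error delta_j
                    votes[first_mark(eta * a * d, levels)] += 1
            graph[(ix, iy)] = votes.most_common(1)[0][0]
    return graph


graph = planning_graph(eta=0.1, levels=[1, 2, 4, 6], cell=1.0, n_cells=10)
print(graph[(0, 0)], graph[(3, 3)], graph[(9, 9)])   # 0 1 6
```

Once tabulated, the graph acts as a lookup table: an (activation, error) pair selects a cell, and the cell's mark is the planned weight adjustment.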
[0022] Specifically, the training method using a differential circuit architecture is as follows:
[0023] Two resistive random access memory devices, w1 and w2, are used, with the larger one having a conductance twice that of the smaller one. Together, they form the weights of the neural network. This ensures that the weights of the corresponding neural network will only change monotonically when a positive or negative pulse is applied.
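The monotonic behavior of such a pair can be illustrated with a toy Python model. The model rests on an assumption that is not stated in this form in the text: that the conductance of the larger device stays at exactly twice that of the smaller device when both receive the same pulse. The function names, the conductance bounds, and the `rate` parameter are hypothetical.

```python
def pulse_pair(g_small, polarity, rate=0.1, g_min=1.0, g_max=50.0):
    """Apply one shared pulse to both devices; return (w1, w2) with w1 = 2 * w2."""
    target = g_max if polarity > 0 else g_min
    g_small = g_small + rate * (target - g_small)   # nonlinear step of the small device
    return 2.0 * g_small, g_small


def weight_trace(g_small, polarity, n_pulses):
    """Weight w = w1 - w2 after each of n_pulses identical pulses."""
    trace = []
    for _ in range(n_pulses):
        w1, g_small = pulse_pair(g_small, polarity)
        trace.append(w1 - g_small)
    return trace


up = weight_trace(10.0, +1, 5)     # positive pulses
down = weight_trace(10.0, -1, 5)   # negative pulses
print([round(w, 3) for w in up])
print([round(w, 3) for w in down])
```

Under this assumption the individual steps are unequal, yet the weight never reverses direction within a run of same-polarity pulses.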
[0024] Specifically, the steps of the single-pulse update method are as follows:
[0025] (1) Read the conductance of a pair of resistive random access memory w1 and w2 through the conductance reading device (where the larger conductance is defined as w1), and determine whether the conductance difference w1-w2 is greater than the critical value. The critical value should be a small positive value.
[0026] (2) When the difference in conductance is greater than the critical value, the weight of the network is represented as w = w1 - w2. Compare the current weight with the corresponding value of the weight planning graph. If the difference between the two is within the error range, complete this round of training. If it exceeds the error range and the current weight is large, apply a single negative pulse to w1 and w2 at the same time. If it exceeds the error range and the current weight is small, apply a single positive pulse to w1 and w2 at the same time. After applying the pulse, return to step (1).
[0027] (3) When the difference in conductance is less than the critical value, the weight of the network is represented as w = w2 - w1. Compare the current weight with the corresponding value of the weight planning graph. If the difference between the two is within the error range, complete this round of training. If it exceeds the error range and the current weight is large, apply a single positive pulse to w1 and w2 at the same time. If it exceeds the error range and the current weight is small, apply a single negative pulse to w1 and w2 at the same time. After applying the pulse, return to step (1).
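Steps (1) to (3) form a closed loop of read, compare, and pulse. A minimal Python sketch of that loop is given below. The `ToyPair` class is a hypothetical stand-in for a differential pair and its read/write circuitry, using the simplifying assumption that the larger device always holds twice the conductance of the smaller one; `threshold`, `tol`, and `max_pulses` are illustrative parameters. In this toy model the weight is always positive, so only the branch of step (2) is exercised.

```python
class ToyPair:
    """Hypothetical differential pair in which w1 tracks 2 * w2."""

    def __init__(self, g_small, g_min=1.0, g_max=50.0, rate=0.05):
        self.g_small = g_small
        self.g_min, self.g_max, self.rate = g_min, g_max, rate

    def read(self):
        return 2.0 * self.g_small, self.g_small          # (w1, w2)

    def pulse(self, polarity):
        """Apply one pulse of the given polarity to both devices at once."""
        target = self.g_max if polarity > 0 else self.g_min
        self.g_small += self.rate * (target - self.g_small)


def single_pulse_update(pair, target_w, tol, threshold=1e-3, max_pulses=1000):
    """Read, compare with the planned weight, pulse once, repeat."""
    for n in range(max_pulses):
        w1, w2 = pair.read()                              # step (1): read
        positive = (w1 - w2) > threshold
        w = (w1 - w2) if positive else (w2 - w1)
        if abs(w - target_w) <= tol:
            return n                                      # within the error range
        too_large = w > target_w
        if positive:                                      # step (2)
            pair.pulse(-1 if too_large else +1)
        else:                                             # step (3)
            pair.pulse(+1 if too_large else -1)
    raise RuntimeError("target not reached within max_pulses")


pair = ToyPair(g_small=5.0)
n_pulses = single_pulse_update(pair, target_w=20.0, tol=1.0)
w1, w2 = pair.read()
print(n_pulses, round(w1 - w2, 3))
```

Because the loop reads after every pulse, it does not need to know the step size in advance, which is how unequal step sizes are tolerated. The `max_pulses` guard is added here only so the sketch cannot loop forever if the tolerance is narrower than one pulse step.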
[0028] Specifically, a conductance read/write device is embedded in each column of the resistive random access memory array and, together with the array, completes the training of the neural network. The peripheral circuit is connected to the output of the array and generates the neural network inference results used to assist training.
[0029] Specifically, in the array of resistive random access memories (RRAMs), the two RRAMs w1 and w2 deployed to form a weighted array meet the following requirements: w1 and w2 must maintain the same composition ratio (including the materials used, doping ratio, and processing technology), but their area dimensions are limited to be different. The larger of the two RRAMs should be twice the size of the smaller one, so as to ensure that w only increases monotonically when a positive pulse is applied and only decreases monotonically when a negative pulse is applied.
[0030] Specifically, the conductance read/write device includes a sense amplifier and a write driver. When reading, the conductance value of the resistive random access memory is read through the sense amplifier; during training, a voltage is applied through the write driver to change the conductance value of the resistive random access memory.
[0031] Specifically, the peripheral circuitry includes an analog-to-digital converter, a multiplexer, and a shift adder.
[0032] In summary, the technical solutions conceived by this invention have the following beneficial effects compared with the prior art:
[0033] 1. This invention makes backpropagation practical to deploy in hardware by presenting the planned weight adjustment for each neuron activation value in an intuitive form, the weight planning graph, thereby accelerating training without sacrificing accuracy.
[0034] 2. The pulse update method used in this invention is novel. By selecting a pair of devices with the same properties but different sizes, the weight is guaranteed to change only monotonically under a pulse. Because conductance read/write devices are deployed for the resistive random access memories in the array, the training results are not affected by the nonlinearity of the devices. Applying the pulse to both devices of a differential pair at the same time ensures the speed and parallelism of training.
Attached Figure Description
[0035] Figure 1 is a flowchart illustrating the implementation of this invention;
[0036] Figure 2 shows the conductance read/write device described in this invention;
[0037] Figure 3 is a schematic diagram of the in-memory computing module provided in an embodiment of the present invention;
[0038] Figure 4 is a schematic diagram of backpropagation as described in an example of the present invention;
[0039] Figure 5 is the weight planning graph in an example of the present invention.
Detailed Implementation
[0040] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the examples described herein are merely illustrative and are not intended to limit the scope of the invention.
[0041] As shown in Figure 1, a weight-adjustment-based training device and method for resistive random access memory (RRAM) are disclosed. The system includes an RRAM array and peripheral circuitry. The steps are as follows. The RRAM conductance is adjusted using negative pulses until it approaches the minimum value $G_{min}$, and the initial conductance $G_0$ of the RRAM is read using the conductance read/write device. Training data is then input into the RRAM array as voltage pulses and propagated forward through the deployed neural network to the last layer. The backpropagation algorithm is then used to calculate the error of each neuron. Based on the backpropagation results, a weight planning graph is drawn, and positive or negative electrical pulses are applied according to the weight planning graph. The conductance read/write device is then used to determine whether to terminate the training.
[0042] As shown in Figure 2, the conductance read/write device includes a sense amplifier and a write driver. When reading, the conductance value of the resistive random access memory is read through the sense amplifier; during training, a voltage is applied through the write driver to change the conductance value of the resistive random access memory.
[0043] The peripheral circuitry includes an analog-to-digital converter, a multiplexer, and a shift adder.
[0044] As shown in Figure 3, the in-memory computing module is a combination of a resistive random access memory (RRAM) array and peripheral circuitry. On the one hand, these components provide the vector-matrix multiplication between the electrical signal vector and the RRAM conductance matrix for the forward propagation process; on the other hand, they provide the weights for the backward propagation. Each node unit of the in-memory computing module should contain one RRAM. In-memory computing modules typically save the time otherwise spent transferring data to the processor, and because their architecture is well suited to complex vector-matrix multiplication, they are often used as the core of the entire computing module.
[0045] As a non-volatile device that incorporates resistance switching and state retention characteristics, resistive random access memory (RRAM) can be used as a memory to store node-to-node weights in a neural network, and it can also participate in computation itself.
[0046] The conductance of the resistive random access memory is updated by applying positive and negative pulses: the conductance increases when a positive pulse is applied and decreases when a negative pulse is applied.
[0047] In the array of resistive random access memories (RRAMs), each pair of RRAMs represents a weight value of a neural network. Specifically, the difference in conductance between the two RRAMs is used to represent w = w1 - w2 or w = w2 - w1, because the weight of the neural network can be negative, while the conductance of the RRAM can only be non-negative. The two RRAMs w1 and w2 that form a weight must meet the following requirements: In terms of process, w1 and w2 must be consistent in their composition ratio (including the materials used, doping ratio, and processing technology), but their area dimensions are limited to be different. The larger of the two RRAMs should be twice the size of the smaller one to ensure that w only increases monotonically when a positive pulse is applied and only decreases monotonically when a negative pulse is applied.
[0048] In the training method that exploits the differential circuit architecture, the forward propagation computation is expressed as $a_j^l = f\left(\sum_k w_{jk}^l a_k^{l-1} + b_j^l\right)$, where $a_j^l$ is the activation value of the j-th neuron in the l-th layer, $b_j^l$ is the bias of the j-th neuron in the l-th layer, $w_{jk}^l$ is the weight connecting the k-th neuron in layer l-1 to the j-th neuron in layer l, and $f$ is the activation function.
[0049] The resistive random access memory array implements vector-matrix multiplication in the forward propagation computation in the following manner:
[0050] Apply a corresponding input voltage pulse signal to each row of the resistive random access memory array;
[0051] The input electrical signal is multiplied by the conductance of each resistive random access memory device at the intersection of the resistive random access memory array, and the results are superimposed and output from the corresponding column in the form of current. The calculation meaning in the corresponding neural network is the weighted sum of the activation values of multiple neurons.
[0052] The weighted summation, obtained by collecting in a given column the currents generated by the memory cells of each row, can be expressed using Ohm's law and Kirchhoff's current law as $I_j = \sum_i V_i G_{i,j}$, where $G_{i,j}$ is the conductance value at the corresponding array intersection, $I_j$ is the weighted output current of each column, and $V_i$ is the voltage pulse that encodes the processed input of each row. Considering that the conductance value cannot actually be negative, the conductances of two resistive random access memories are used to represent the weight.
[0053] The input signal used in this invention example is decoded from the MNIST dataset, which has a training set of 60,000 examples and a test set of 10,000 examples. It is an excellent database for those who want to experiment with learning techniques and pattern recognition methods on real-world data while minimizing the cost of preprocessing and formatting. This dataset is widely used in artificial intelligence recognition tasks, making it suitable for both training and testing the method described in this invention.
[0054] As shown in Figure 4, this example of the invention uses a neural network model with an input layer, two hidden layers, and an output layer. The activation function is the Leaky ReLU function. The input layer contains 784 neurons, corresponding to the 28×28 pixel values of a handwritten digit image, and the output layer contains 10 neurons.
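To give a sense of scale, the following Python sketch counts the RRAM devices such a network would need when every weight is stored as a differential pair. The hidden-layer widths (128 and 64) and the Leaky ReLU slope (0.01) are assumptions chosen for illustration; the text specifies only the 784-neuron input layer, two hidden layers, and the 10-neuron output layer. Biases are left out of the count.

```python
def leaky_relu(z, slope=0.01):
    """Leaky ReLU; the slope for negative inputs is an assumed value."""
    return z if z > 0 else slope * z


# 784 inputs and 10 outputs come from the text; 128 and 64 are hypothetical.
LAYER_SIZES = [784, 128, 64, 10]


def rram_device_count(layer_sizes):
    """Two RRAM devices per weight in the differential scheme (biases excluded)."""
    weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    return 2 * weights


print(rram_device_count(LAYER_SIZES))   # 218368
```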
[0055] As shown in Figure 5, the weight planning graph in this example is drawn as follows:
[0056] Using the forward propagation result and the backpropagation calculation, compute the backpropagation error of each neuron, and compute the required weight adjustment according to $\Delta w_{ij} = \eta \cdot a_i \cdot \delta_j$, where $a_i$ and $\delta_j$ are the corresponding neuron activation value and error, and $\eta$ is the learning rate, here 0.1 (unit: $\mu\mathrm{S}^{-2}$). With $\delta_j$ as the ordinate and $a_i$ as the abscissa, plot the curves on which $\Delta w_{ij}$ takes suitable values such as -1, -2, -4, -6, 1, 2, 4, and 6.
[0057] Make the first marking on the curve plot: the region between two curves $\Delta w_{ij} = a$ and $\Delta w_{ij} = b$ is marked $t$ (where $t$ is the smaller of the absolute values of $a$ and $b$), and the region enclosed by $\Delta w_{ij} = 1$, $\Delta w_{ij} = -1$, and $a_i = 0$ is marked 0 (the coordinate system only considers the first and third quadrants);
[0058] Starting from the origin, the coordinate system is divided into several rectangular regions, with μS as the unit of the coordinate axes, and re-marked: if, within a region A, the part first marked $t_0$ has the largest area, region A is re-marked as $t_0$.
[0059] The resulting re-marked graph is the weight planning graph, and each marking is the planned weight conductance value calculated using the BP algorithm.
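The errors $\delta_j$ that enter $\Delta w_{ij} = \eta \cdot a_i \cdot \delta_j$ come from the backpropagation algorithm. A minimal Python sketch of the standard recursion is given below. It assumes a squared-error cost with the output error defined as (target minus output) times the activation derivative, so that adding $\Delta w_{ij}$ reduces the error; the text does not state the cost function, so this choice, the function names, the Leaky ReLU slope, and the sample values are all assumptions.

```python
ETA = 0.1   # learning rate used in the example of this invention


def leaky_relu_grad(z, slope=0.01):
    """Derivative of Leaky ReLU; the slope is an assumed value."""
    return 1.0 if z > 0 else slope


def output_errors(targets, outputs, z_out):
    """delta_j = (y_j - a_j) * f'(z_j), assuming a squared-error cost."""
    return [(y - a) * leaky_relu_grad(z) for y, a, z in zip(targets, outputs, z_out)]


def hidden_errors(weights, next_errors, z_hidden):
    """delta_i = f'(z_i) * sum_j w[i][j] * delta_j for the layer below."""
    return [
        leaky_relu_grad(z) * sum(w * d for w, d in zip(row, next_errors))
        for row, z in zip(weights, z_hidden)
    ]


def weight_changes(activations, errors, eta=ETA):
    """dw[i][j] = eta * a_i * delta_j."""
    return [[eta * a * d for d in errors] for a in activations]


d_out = output_errors([1.0, 0.0], [0.5, 0.5], [0.5, 0.5])
d_hid = hidden_errors([[2.0, 1.0], [1.0, 3.0]], d_out, [1.0, -1.0])
dw = weight_changes([2.0, 4.0], d_out)
print(d_out, d_hid, dw)
```

Each entry of `dw` is the quantity that would be looked up in, or compared against, the weight planning graph for the corresponding (activation, error) pair.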
[0060] The invention achieves a recognition accuracy of 97%, which is close to the best result obtained with the deep learning framework PyTorch.
[0061] This invention fully considers how to deploy the backpropagation algorithm when implementing online learning of neural networks on resistive random access memory (RRAM) hardware, as well as the deployment design of the RRAMs and the pulse training method, making native learning on RRAM hardware simple, efficient, and highly practical.
Claims
1. A training method implemented on a weight-adjustment-based resistive random access memory (RRAM) training device, characterized in that: the training device includes a resistive random access memory array, an embedded conductance read/write device, and peripheral circuitry. The conductance read/write device is embedded in each column of the resistive random access memory array and works together with the array to complete the training of the neural network. The peripheral circuitry, including an analog-to-digital converter, a shift adder, and a multiplexer, is connected to the array's output and generates neural network inference results to assist in training. The resistive random access memory array is used to store and compute network weight values. The array employs a differential circuit architecture and uses $w_1$ and $w_2$, the conductance values of two RRAM devices, where the conductance of the larger RRAM device is twice that of the smaller RRAM device. Together, they constitute a weight of the neural network; specifically, the difference between the conductances of the two RRAM devices represents one weight of the neural network;
The training method includes the following steps:
1) Deploy a neural network in the RRAM array and initialize the RRAM conductance to the minimum value $G_{min}$, then use the conductance read/write device to read the conductance value $G$ after initialization;
2) Input a set of training data into the neural network deployed in the RRAM array, perform vector-matrix multiplication on the input signal and the weights in the RRAM array, complete the forward propagation calculation, and iterate to the last layer;
3) Use the backpropagation algorithm to draw a weight planning graph, which is used to estimate, from the neuron activation values and error values, how the weights should be adjusted;
4) Train using the training method that exploits the differential circuit architecture: after each pulse update, read the adjusted conductance value using the conductance read/write device, compare it with the weight planning graph, and then update using a single pulse to continue learning;
5) If the error between the weight represented by the read RRAM conductance values and the weight planning graph is within the allowable range, no update operation is required;
The method for drawing the weight planning graph described in step 3) is as follows:
(1) Calculate the backpropagation error for each neuron, and compute the required weight adjustment according to $\Delta w_{ij} = \eta \cdot a_i \cdot \delta_j$, where $a_i$ and $\delta_j$ are the corresponding neuron activation value and error, and $\eta$ is the learning rate. With $\delta_j$ as the ordinate and $a_i$ as the abscissa, plot in a rectangular coordinate system the curves on which $\Delta w_{ij}$ takes a set of suitable values;
(2) Make the first marking on the curve plot: the region between two curves $\Delta w_{ij} = a$ and $\Delta w_{ij} = b$ is marked $t$, where $t$ is the smaller of the absolute values of $a$ and $b$; the region enclosed by the curve with the smallest positive $\Delta w_{ij}$, the curve with the largest negative $\Delta w_{ij}$, and $a_i = 0$ is marked 0, and the coordinate system only considers the first and third quadrants;
(3) Starting from the origin, divide the parameter space into several rectangular regions and re-mark them: if, within a region A, the part first marked $t_0$ has the largest area, region A is re-marked as $t_0$;
(4) The resulting re-marked graph is the weight planning graph, and each marking is the planned weight conductance value calculated using the BP algorithm.
2. The training method implemented on a weight-adjustment-based resistive random access memory training device according to claim 1, characterized in that: in step 1), the conductance of the resistive random access memory is initialized to the minimum value $G_{min}$ as follows: the conductance of the resistive random access memory (RRAM) is regulated by applying positive and negative pulses. When a positive pulse is applied, the RRAM's conductance gradually increases with the number of pulses, while when a negative pulse is applied, the RRAM's conductance gradually decreases with the number of pulses. This regulation relationship is neither linear nor uniform. The conductance is initialized to the minimum value by continuously applying negative pulses to the RRAM device until the conductance read/write device measures that the conductance no longer changes significantly.
3. The training method implemented on a weight-adjustment-based resistive random access memory training device according to claim 1, characterized in that: in step 1), the specific method for deploying the neural network in the resistive random access memory array is as follows: the node weights between corresponding layers in a neural network form a weight matrix, which is mapped to the conductance values of the resistive random access memory (RRAM) at the corresponding intersection points in the RRAM array. A pair of RRAM conductance values is denoted $w_1$ and $w_2$, and their difference $w = w_1 - w_2$ or $w = w_2 - w_1$ represents the weight of a certain intersection point.
4. The training method implemented on a weight-adjustment-based resistive random access memory training device according to claim 1, characterized in that: in step 2), the specific method for performing vector-matrix multiplication on the input signal and the weights in the resistive random access memory array is as follows: apply the corresponding input signal to each row of the resistive random access memory array; the input signal is multiplied by the conductance of each resistive random access memory device at the intersections of the array, and the results are superimposed and output from the corresponding column in the form of current. In the corresponding neural network, this computes the weighted sum of the activation values of multiple neurons. The specific weighted summation operation is expressed using Ohm's law as $I_j = \sum_i V_i G_{i,j}$, where $G_{i,j}$ represents the conductance at the corresponding array intersection. Considering that the conductance value cannot actually be negative, but the weights of the neural network may be negative, the difference between the conductances of two resistive random access memories is used to represent the weight value.
5. The training method implemented on a weight-adjustment-based resistive random access memory training device according to claim 1, characterized in that: in step 2), the forward propagation calculation is expressed as $a_j^l = f\left(\sum_k w_{jk}^l a_k^{l-1} + b_j^l\right)$, where $a_j^l$ is the activation value of the j-th neuron in the l-th layer, $b_j^l$ is the bias of the j-th neuron in the l-th layer, $w_{jk}^l$ is the weight connecting the k-th neuron in layer l-1 to the j-th neuron in layer l, and $f$ is the activation function; each weight corresponds to the conductances of a pair of resistive random access memories in the differential circuit, denoted $w_1$ and $w_2$, and this pair of conductance values is bound into a new weight $w = w_1 - w_2$ or $w = w_2 - w_1$ used for computation in the network.
6. The training method implemented on a weight-adjustment-based resistive random access memory training device according to claim 1, characterized in that: the single-pulse update method described in step 4) has the following steps:
1) Read the conductances of a pair of resistive random access memories using a conductance reading device; the larger conductance is defined as $w_1$ and the smaller as $w_2$. Determine whether the difference $w_1 - w_2$ is less than the critical value: if not, proceed to step 2); if so, proceed to step 3);
2) The current weight is $w = w_1 - w_2$, and in this case the current weight can be considered positive. Compare the current weight with the corresponding value in the weight planning graph. If the difference is within the error range, complete this round of training. If it exceeds the error range and the current weight is too large, apply a single negative pulse to $w_1$ and $w_2$ simultaneously; if it exceeds the error range and the current weight is too small, apply a single positive pulse to $w_1$ and $w_2$ simultaneously; after the pulse application is complete, return to step 1);
3) The current weight is $w = w_2 - w_1$, and in this case the current weight can be considered negative. Compare the current weight with the corresponding value in the weight planning graph. If the difference is within the error range, complete this round of training. If it exceeds the error range and the current weight is too large, apply a single positive pulse to $w_1$ and $w_2$ simultaneously; if it exceeds the error range and the current weight is too small, apply a single negative pulse to $w_1$ and $w_2$ simultaneously; after the pulse application is complete, return to step 1).