Method for Neural Network Online Training and Inference Based on a CMOS Transistor Hybrid Memory Circuit
A hybrid memory circuit built from CMOS transistors and CMOS capacitors solves the compatibility and endurance problems of hybrid memory architectures based on memristors and ferroelectric capacitors, enabling low-power, low-cost online training and inference of neural networks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN HOTCHIP TECH
- Filing Date
- 2026-04-10
- Publication Date
- 2026-06-26
AI Technical Summary
In the existing technology, the hybrid memory architecture built with memristors and ferroelectric capacitors has problems such as low write endurance, high programming energy and poor compatibility with standard CMOS processes during training and inference, resulting in complex manufacturing and high cost.
Synaptic units are composed of CMOS transistors and CMOS capacitors, and multiple synaptic units form a hybrid memory array for online training and inference of neural networks. Analog weights are stored in the CMOS transistor synaptic devices, while digital hidden weights are stored and updated in the CMOS capacitors. The two are synchronized without an external DAC, using circuit parasitic parameters for digital-to-analog conversion.
It improves the write endurance of CMOS transistor synaptic devices, reduces energy consumption, simplifies the manufacturing process, lowers costs, adapts to battery-powered edge devices, and enables long-term stable training and inference of neural networks.
Figure CN121998009B_ABST (abstract drawing)
Abstract
Description
Technical Field
[0001] This invention relates to the fields of neuromorphic computing, artificial intelligence hardware, and integrated circuit technology, and in particular to a method for online training and inference of neural networks using hybrid memory circuits based on CMOS transistors.
Background Technology
[0002] With the increasing demand for artificial intelligence applications on edge devices, how to achieve efficient online training and inference in resource-constrained environments has become a key challenge. In existing technologies, memristors and ferroelectric capacitors are often used to build a hybrid memory architecture to optimize the inference and training processes respectively.
[0003] This scheme uses a memristor array to store analog weights for inference and a ferroelectric capacitor array to store high-precision digital hidden weights for training. During training, the weights are updated frequently in the ferroelectric capacitors, and the most significant bits of each weight are periodically transferred to the memristor through a digital-to-analog converter circuit to achieve weight synchronization. Although this scheme performs well in terms of energy efficiency and accuracy, it relies on materials outside the standard CMOS process and has a complex manufacturing process. Furthermore, memristors suffer from low write endurance and high programming energy, while ferroelectric capacitors require special (for example, doped) capacitor materials. Because of this poor compatibility with standard CMOS processes, high integration complexity, and high cost, there is an urgent need for a solution based entirely on standard CMOS processes that improves reliability, reduces cost, and simplifies manufacturing.
Summary of the Invention
[0004] The purpose of this invention is to provide a method for online training and inference of neural networks using hybrid memory circuits based on CMOS transistors, so as to solve the problems mentioned in the background art.
[0005] To achieve the above objectives, the present invention provides the following technical solution: a method for online training and inference of neural networks using hybrid memory circuits based on CMOS transistors, comprising the following steps:
[0006] S1. CMOS hybrid memory circuit deployment: a set of CMOS transistor synaptic devices and a set of CMOS capacitors form a synaptic unit, and multiple synaptic units form a hybrid memory array, which provides the core weight storage and multiply-accumulate carrier for online training and inference of neural networks;
[0007] S2. Online training based on hybrid memory neural networks, including:
[0008] S21. For each input sample, perform forward propagation using the analog weights of the synaptic devices;
[0009] S22. Error backpropagation and gradient calculation;
[0010] S23. Iterative update of digital hidden weights: the hidden weights stored in the CMOS capacitors are updated digitally according to the gradient update rule, with the update probability controlled by a binary mask;
[0011] S24. Periodic weight transfer and analog weight programming: a sample update threshold k is set. After operations S21 to S23 have been completed for every k training samples, the training control circuit triggers the weight transfer mechanism to synchronize the digital hidden weights to the analog weights.
[0012] S3. Based on the training results of the neural network, online inference is performed. Input inference samples and use only the analog weights stored in the CMOS transistor synaptic device to perform multiplication and accumulation operations to achieve low-power inference.
[0013] Preferably, the CMOS transistor synaptic device is an n-type MOSFET and is connected in a differential pair configuration to store positive and negative weights W+ and W-, respectively.
[0014] The CMOS capacitor array consists of minimum-size capacitors Cn and stores 10-bit signed integer digital hidden weights in binary form, with the most significant bit positions using differentiated capacitors of 4x, 2x, and 1x;
[0015] A transmission line is connected between the gate of the CMOS transistor synaptic device and the CMOS capacitor array. The voltage applied through the parasitic capacitance of the transmission line directly controls the programming current of the synaptic device.
[0016] Preferably, the CMOS hybrid memory circuit deployment further includes:
[0017] Training control circuit: it integrates an address decoder, driver circuit, sense amplifier, and timing control module and is connected to the synaptic units for control. It is responsible for accurate addressing of weight addresses, driving of weight updates, sense amplification of weight signals, and timing coordination and logic control of the training, inference, and weight transfer stages, ensuring that the hardware modules work together.
[0018] Device pre-configuration: the body-terminal bias voltage $V_B$ of the CMOS transistor synaptic device is adjusted via an additional MOSFET to configure the device in synaptic mode, so that it operates in the breakdown region and exhibits controllable resistance switching behavior. In this mode the device conductance $G$ is linearly related to $V_B$ and satisfies the formula:
[0019] $G = \alpha V_B + G_0$
[0020] where $\alpha$ is the conductance adjustment coefficient, determined by the MOSFET device process parameters, and $G_0$ is the base conductance of the device. This behavior emulates the long-term or short-term plasticity of a synapse, in preparation for subsequent weight storage and computation.
[0021] Preferably, in step S21, the neural network training samples collected by the edge device are fed as an input vector $X$ into the hybrid memory array. The training control circuit addresses the corresponding synaptic units through the address decoder and uses the analog weight matrix $W$ already stored in the CMOS transistor synaptic devices to complete the inter-layer multiply-accumulate operations of the neural network, realizing forward propagation. The output $\hat{Y}$ satisfies the core formula:
[0022] $\hat{Y} = W X + b$
[0023] where $b$ is the bias term of the neural network, provided by the bias register of the training control circuit.
[0024] Preferably, in step S22, the predicted output $\hat{Y}$ of the forward propagation is compared with the true label $Y$ of the sample, and the loss value $L$ is calculated using the mean square error:
[0025] $L = \frac{1}{m} \sum_{i=1}^{m} (\hat{Y}_i - Y_i)^2$
[0026] where $m$ is the number of training samples in the batch. Following the stochastic gradient descent update rule, the gradient is obtained by taking the partial derivative of the loss value $L$ with respect to the analog weights $W$, which completes the backpropagation of the loss value:
[0027] $\nabla W = \frac{\partial L}{\partial W}$
[0028] The training control circuit transmits the gradient signal to the control terminal of the corresponding CMOS capacitor array, providing a basis for updating the digital weights.
[0029] Preferably, in step S23, the training control circuit uses the gradient signal $\nabla W$, combined with the learning rate $\eta$, to iteratively update in binary form the digital hidden weights $W_h$ stored in the CMOS capacitor array. The core update formula is:
[0030] $W_h^{(t+1)} = W_h^{(t)} - \eta \, M \odot \nabla W$
[0031] where $W_h^{(t+1)}$ is the hidden weight at iteration $t+1$ and $M$ is a binary mask with values 0 or 1, randomly generated by the training control circuit. The mask controls the probability of a weight update and avoids overfitting caused by excessive weight updates. Furthermore, the non-destructive read operation of the CMOS capacitor ensures that no data is lost during the update and that the data does not need to be rewritten.
[0032] Preferably, in step S24, a sense amplifier reads in parallel the most significant bits and the sign bit $S$ ($S = 1$ for positive, $S = -1$ for negative) of the digital hidden weights in the CMOS capacitor array. The voltage gradient characteristic of the differentiated-area capacitors is used to generate on the transmission line an analog voltage $V$ proportional to the stored digital weight value:
[0033] $V = S \cdot \frac{Q}{C_{MSB}}, \quad Q = q \sum_{i} 2^{i} D_i$
[0034] where $Q$ is the charge stored in the capacitor array, $q$ is the unit charge value, $D_i$ is the MSB bit coefficient, which takes the value 0 or 1, and $C_{MSB}$ is the total capacitance of the MSB capacitors.
[0035] Next, the voltage $V$ is coupled through the parasitic capacitance $C_p$ of the transmission line and applied to the gate of the CMOS transistor synaptic device, directly controlling the programming current $I_{prog}$, which satisfies Ohm's law $I_{prog} = V / R$, where $R$ is the equivalent resistance of the transmission line;
[0036] At the same time, the programming current $I_{prog}$ acts on the transistor synaptic device and programs its conductance $G$ so that the analog weight matches the hidden weight $W_h$, satisfying $G = k W_h$, where $k$ is the conversion coefficient between weight and conductance. This completes the synchronous update from digital hidden weights to analog weights.
[0037] The entire process requires no external DAC, directly utilizing circuit parasitic parameters to achieve native digital-to-analog conversion, significantly reducing energy consumption.
[0038] Preferably, in the later stages of online training, training convergence is determined by repeating steps S21 to S24, continuously inputting training samples and completing weight updates and synchronization, until the loss function value $L$ of the neural network drops to a preset threshold $L_{th}$ or the maximum number of training iterations $T_{max}$ is reached. Online training stops at this point, and the analog weights $W^{*}$ stored in the CMOS transistor synaptic devices are the optimal weights after training.
[0039] Preferably, step S3 includes:
[0040] S31. Inference sample input: the inference data collected in real time by the edge device is used as the neural network input vector $X$ and transferred to the hybrid memory array;
[0041] S32. Low-power multiply-accumulate operation: the training control circuit addresses the corresponding synaptic units according to the neural network architecture of the inference task and uses only the optimal analog weights $W^{*}$ programmed in the CMOS transistor synaptic devices to complete the multiply-accumulate operations of each layer of the neural network. The core formula is $\hat{Y} = W^{*} X + b$. This process involves no weight updates and no data transfer, minimizing hardware resource consumption.
[0042] Preferably, step S3 further includes inference result output and feedback: the computation signal is amplified by the sense amplifier of the training control circuit to obtain the final inference output of the neural network, which is then transmitted to the display or execution module of the edge device for real-time output. If the edge device needs to perform incremental online training, the inference error $E$ serves as a new gradient trigger signal: when $E > E_{th}$, where $E_{th}$ is a preset error threshold, the online training process of step S2 is automatically triggered to iteratively optimize the weights.
[0043] The technical effects and advantages of this invention are as follows:
[0044] 1. The method forms synaptic units from CMOS transistor synaptic devices and CMOS capacitors, and combines multiple synaptic units into a hybrid memory array that provides the core weight storage and multiply-accumulate carrier for online training and inference of neural networks. The write endurance of the CMOS transistor synaptic devices can exceed 10^5 cycles, far better than that of traditional memristors. Moreover, the read operation of the CMOS capacitors is non-destructive, which avoids the lifetime loss caused by data rewriting and ensures hardware stability during long-term online training and inference.
[0045] 2. During the inference stage, the method uses only the analog weights for computation, with no additional data transfer. During the weight transfer stage, no external DAC is required, and digital-to-analog conversion is achieved using circuit parasitic parameters. This greatly reduces the energy consumption of edge devices and suits battery-powered edge scenarios.
[0046] 3. The method is implemented entirely in standard CMOS technology. It does not require non-standard materials such as ferroelectrics or memristors and is fully compatible with existing integrated circuit manufacturing processes. This reduces the cost of mass production and facilitates large-scale application.
Attached Figure Description
[0047] Figure 1 is a flowchart of the overall method of the present invention;
[0048] Figure 2 is a flowchart of the deployment of the CMOS hybrid memory circuit of the present invention;
[0049] Figure 3 is a flowchart of the online training process of the neural network of the present invention;
[0050] Figure 4 is a flowchart of the online inference process of the neural network of the present invention.
Detailed Implementation
[0051] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0052] As shown in Figures 1 to 4, this invention provides a method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors, comprising the following steps:
[0053] S1. CMOS hybrid memory circuit deployment: a set of CMOS transistor synaptic devices and a set of CMOS capacitors form a synaptic unit, and multiple synaptic units form a hybrid memory array, which provides the core weight storage and multiply-accumulate carrier for online training and inference of neural networks. In addition, a training control circuit integrates an address decoder, driver circuit, sense amplifier, and timing control module and is connected to the synaptic units. It handles accurate addressing of weight addresses, driving of weight updates, sense amplification of weight signals, and timing coordination and logic control of the training, inference, and weight transfer stages, ensuring that the hardware modules work together.
[0054] The CMOS transistor synaptic device uses n-type MOSFETs connected in a differential pair configuration to store the positive and negative weights W+ and W-, respectively. The body-terminal bias voltage $V_B$ of the CMOS transistor synaptic device is adjusted via an additional MOSFET to configure the device in synaptic mode, so that it operates in the breakdown region and exhibits controllable resistance switching behavior. In this mode the device conductance $G$ is linearly related to $V_B$ and satisfies the formula:
[0055] $G = \alpha V_B + G_0$
[0056] where $\alpha$ is the conductance adjustment coefficient, determined by the MOSFET device process parameters, and $G_0$ is the base conductance of the device. This behavior emulates the long-term or short-term plasticity of a synapse, in preparation for subsequent weight storage and computation.
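The linear conductance relation and the differential-pair weight can be illustrated with a minimal Python sketch. It assumes that the conductance depends linearly on the body bias voltage and that the signed weight is the difference between the W+ and W- conductances; the function names and the numeric values are hypothetical and are not taken from the patent.

```python
def conductance(v_body, alpha, g0):
    """Assumed linear model: G = alpha * V_body + G0 (siemens)."""
    return alpha * v_body + g0


def differential_weight(g_plus, g_minus):
    """Signed weight represented by a differential pair (W+ minus W-)."""
    return g_plus - g_minus


# Illustrative values only: alpha in S/V, g0 in S, bias voltages in V.
g_pos = conductance(0.5, alpha=2e-6, g0=1e-6)
g_neg = conductance(0.2, alpha=2e-6, g0=1e-6)
w_signed = differential_weight(g_pos, g_neg)
```

Under this assumption a negative weight is obtained simply by programming the W- device to a higher conductance than the W+ device.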
[0057] The CMOS capacitor array consists of minimum-size capacitors Cn and stores 10-bit signed integer digital hidden weights in binary form, with the most significant bit positions using differentiated capacitors of 4x, 2x, and 1x. A transmission line connects the gate of the CMOS transistor synaptic device to the CMOS capacitor array, and the voltage applied through the parasitic capacitance of the transmission line directly controls the programming current of the synaptic device.
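As an illustration of the storage format, the following Python sketch encodes a signed integer as a sign plus nine magnitude bits and computes how many unit capacitors the top three bits select. Sign-magnitude encoding, the mapping of the 4x, 2x, and 1x capacitors to the three highest magnitude bits, and all function names are assumptions made for this example.

```python
def encode_hidden_weight(w, bits=10):
    """Encode a signed integer as (sign, magnitude bits, MSB first).

    Sign-magnitude encoding is an assumption of this sketch.
    """
    mag_bits = bits - 1
    limit = (1 << mag_bits) - 1
    if not -limit <= w <= limit:
        raise ValueError("weight out of range for the given bit width")
    sign = 1 if w >= 0 else -1
    mag = abs(w)
    bit_list = [(mag >> i) & 1 for i in range(mag_bits - 1, -1, -1)]
    return sign, bit_list


def msb_capacitor_units(bit_list, ratios=(4, 2, 1)):
    """Unit capacitors (multiples of Cn) selected by the top bits."""
    return sum(r * b for r, b in zip(ratios, bit_list))


sign, bit_list = encode_hidden_weight(-300)
units = msb_capacitor_units(bit_list)
```

Here -300 has magnitude 300 = 256 + 32 + 8 + 4, so only the 4x capacitor of the MSB group is selected.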
[0058] S2. Online training based on hybrid memory neural networks, including:
[0059] S21. For each input sample, forward propagation is performed using the analog weights of the synaptic devices. In this process, the neural network training samples collected by the edge device are fed as an input vector $X$ into the hybrid memory array. The training control circuit addresses the corresponding synaptic units through the address decoder and uses the analog weight matrix $W$ already stored in the CMOS transistor synaptic devices to complete the inter-layer multiply-accumulate operations of the neural network, realizing forward propagation. The output $\hat{Y}$ satisfies the core formula:
[0060] $\hat{Y} = W X + b$
[0061] where $b$ is the bias term of the neural network, provided by the bias register of the training control circuit;
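The forward pass of step S21 can be sketched in Python as a multiply-accumulate over a differential conductance array. The sketch assumes that each signed weight is the difference of a W+ and a W- entry and that one layer computes the product of the weight matrix and the input vector plus a bias; the name `forward` and the numeric values are hypothetical.

```python
def forward(g_plus, g_minus, x, bias):
    """Multiply-accumulate per output row: y_j = sum_i (G+ - G-)_ji * x_i + b_j."""
    outputs = []
    for gp_row, gm_row, b in zip(g_plus, g_minus, bias):
        acc = sum((gp - gm) * xi for gp, gm, xi in zip(gp_row, gm_row, x))
        outputs.append(acc + b)
    return outputs


# Illustrative 2x2 layer with normalized conductances.
g_plus = [[1.0, 0.5], [0.0, 2.0]]
g_minus = [[0.5, 0.5], [1.0, 0.0]]
y_hat = forward(g_plus, g_minus, x=[2.0, 1.0], bias=[0.1, -0.1])
```

In hardware the accumulation would happen in the analog domain; the loop here only mirrors the arithmetic.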
[0062] S22. Error backpropagation and gradient calculation: in this process, the predicted output $\hat{Y}$ of the forward propagation is compared with the true label $Y$ of the sample, and the loss value $L$ is calculated using the mean square error:
[0063] $L = \frac{1}{m} \sum_{i=1}^{m} (\hat{Y}_i - Y_i)^2$
[0064] where $m$ is the number of training samples in the batch. Following the stochastic gradient descent update rule, the gradient is obtained by taking the partial derivative of the loss value $L$ with respect to the analog weights $W$, which completes the backpropagation of the loss value:
[0065] $\nabla W = \frac{\partial L}{\partial W}$
[0066] The training control circuit transmits the gradient signal to the control terminal of the corresponding CMOS capacitor array, providing a basis for updating the digital weights.
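For a single linear output, the loss and gradient of step S22 can be written out as a short Python sketch. It assumes a mean-square-error loss over a batch of m samples and a linear model, so that the weight gradient is (2/m) times the sum of error times input; the function names and data are illustrative only.

```python
def mse_loss(preds, labels):
    """Mean square error over a batch."""
    m = len(preds)
    return sum((p - t) ** 2 for p, t in zip(preds, labels)) / m


def mse_grad_linear(inputs, preds, labels):
    """Gradient of the MSE with respect to the weights of y_hat = w . x + b."""
    m = len(inputs)
    grad = [0.0] * len(inputs[0])
    for x, p, t in zip(inputs, preds, labels):
        for j, xj in enumerate(x):
            grad[j] += 2.0 * (p - t) * xj / m
    return grad


inputs = [[1.0, 2.0], [3.0, 4.0]]
preds = [1.0, 2.0]
labels = [0.0, 4.0]
loss = mse_loss(preds, labels)
grad = mse_grad_linear(inputs, preds, labels)
```

A multi-layer network would propagate the error layer by layer; this example covers only the last layer.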
[0067] S23. Iterative update of digital hidden weights: the hidden weights stored in the CMOS capacitors are updated digitally according to the gradient update rule, with the update probability controlled by a binary mask. In this process, the training control circuit uses the gradient signal $\nabla W$, combined with the learning rate $\eta$, to iteratively update in binary form the digital hidden weights $W_h$ stored in the CMOS capacitor array. The core update formula is:
[0068] $W_h^{(t+1)} = W_h^{(t)} - \eta \, M \odot \nabla W$
[0069] where $W_h^{(t+1)}$ is the hidden weight at iteration $t+1$ and $M$ is a binary mask with values 0 or 1, randomly generated by the training control circuit. The mask controls the probability of a weight update and avoids overfitting caused by excessive weight updates. Furthermore, the non-destructive read operation of the CMOS capacitor ensures that no data is lost during the update and that the data does not need to be rewritten.
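The masked update of step S23 might look as follows in Python. The sketch assumes integer hidden weights clipped to a 10-bit sign-magnitude range, a step equal to the rounded product of learning rate and gradient, and a Bernoulli mask; the rounding and clipping rules, the function names, and the update probability are assumptions, not details from the patent.

```python
import random


def random_mask(n, p_update, rng):
    """Binary mask: each entry is 1 with probability p_update, else 0."""
    return [1 if rng.random() < p_update else 0 for _ in range(n)]


def masked_update(w_hidden, grads, lr, mask, bits=10):
    """Apply w <- w - round(lr * g) where mask is 1, clipped to the bit range."""
    limit = (1 << (bits - 1)) - 1
    updated = []
    for w, g, m in zip(w_hidden, grads, mask):
        step = int(round(lr * g)) if m else 0
        updated.append(max(-limit, min(limit, w - step)))
    return updated


rng = random.Random(0)  # seeded only to make the example repeatable
mask = random_mask(3, p_update=0.5, rng=rng)
new_weights = masked_update([100, -50, 510], [4.0, -6.0, -8.0], lr=2.0, mask=[1, 0, 1])
```

With the explicit mask [1, 0, 1], the second weight is left untouched and the third saturates at the upper limit of the range.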
[0070] S24. Periodic weight transfer and analog weight programming: a sample update threshold k is set. After operations S21 to S23 have been completed for every k training samples, the training control circuit triggers the weight transfer mechanism to synchronize the digital hidden weights to the analog weights. In this process, a sense amplifier reads in parallel the most significant bits and the sign bit $S$ ($S = 1$ for positive, $S = -1$ for negative) of the digital hidden weights in the CMOS capacitor array. The voltage gradient characteristic of the differentiated-area capacitors is used to generate on the transmission line an analog voltage $V$ proportional to the stored digital weight value:
[0071] $V = S \cdot \frac{Q}{C_{MSB}}, \quad Q = q \sum_{i} 2^{i} D_i$
[0072] where $Q$ is the charge stored in the capacitor array, $q$ is the unit charge value, $D_i$ is the MSB bit coefficient, which takes the value 0 or 1, and $C_{MSB}$ is the total capacitance of the MSB capacitors.
[0073] Next, the voltage $V$ is coupled through the parasitic capacitance $C_p$ of the transmission line and applied to the gate of the CMOS transistor synaptic device, directly controlling the programming current $I_{prog}$, which satisfies Ohm's law $I_{prog} = V / R$, where $R$ is the equivalent resistance of the transmission line;
[0074] At the same time, the programming current $I_{prog}$ acts on the transistor synaptic device and programs its conductance $G$ so that the analog weight matches the hidden weight $W_h$, satisfying $G = k W_h$, where $k$ is the conversion coefficient between weight and conductance. This completes the synchronous update from digital hidden weights to analog weights.
[0075] The entire process requires no external DAC, directly utilizing circuit parasitic parameters to achieve native digital-to-analog conversion, significantly reducing energy consumption.
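The DAC-free transfer chain of step S24 can be traced numerically with a small Python sketch. It assumes that the line voltage equals the sign times the selected MSB charge divided by the total MSB capacitance, that the programming current follows Ohm's law, and that the target conductance is proportional to the hidden weight. All function names, parameter names, and numeric values are hypothetical.

```python
def transfer_voltage(sign, msb_bits, q_unit, c_msb_total, ratios=(4, 2, 1)):
    """Assumed model: V = S * Q / C_MSB, with Q = q_unit * sum(ratio * bit)."""
    charge = q_unit * sum(r * b for r, b in zip(ratios, msb_bits))
    return sign * charge / c_msb_total


def programming_current(v_line, r_line):
    """Ohm's law on the transmission line: I = V / R."""
    return v_line / r_line


def target_conductance(w_hidden, k):
    """Assumed proportional mapping from hidden weight to conductance: G = k * W_h."""
    return k * w_hidden


# Illustrative values: 1 fC unit charge, 7 fF total MSB capacitance, 10 kOhm line.
v_line = transfer_voltage(-1, [1, 0, 1], q_unit=1e-15, c_msb_total=7e-15)
i_prog = programming_current(v_line, r_line=1e4)
g_target = target_conductance(300, k=1e-8)
```

The sign of the voltage selects which device of the differential pair is strengthened in this simplified picture; the patent text does not spell out that detail.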
[0076] Furthermore, in the later stages of online training, training convergence is determined by repeating steps S21 to S24, continuously inputting training samples and completing weight updates and synchronization, until the loss function value $L$ of the neural network drops to a preset threshold $L_{th}$ or the maximum number of training iterations $T_{max}$ is reached. Online training stops at this point, and the analog weights $W^{*}$ stored in the CMOS transistor synaptic devices are the optimal weights after training.
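The control flow of steps S21 to S24, including the transfer after every k samples and the two stopping conditions, can be summarized in a hedged Python sketch. The callback-based interface (`step_fn`, `transfer_fn`, `loss_fn`) and the toy loss that halves on each step are inventions of this example and stand in for the hardware operations.

```python
def train_online(samples, step_fn, transfer_fn, loss_fn, k, loss_threshold, max_iters):
    """Run S21-S23 per sample, trigger S24 every k samples, stop on convergence."""
    if not samples:
        raise ValueError("at least one training sample is required")
    iters = 0
    transfers = 0
    while iters < max_iters:
        for sample in samples:
            step_fn(sample)            # forward pass, gradient, hidden-weight update
            iters += 1
            if iters % k == 0:
                transfer_fn()          # hidden weights -> analog weights
                transfers += 1
            if loss_fn() <= loss_threshold or iters >= max_iters:
                return iters, transfers
    return iters, transfers


# Toy stand-ins: the loss halves with every training step.
state = {"loss": 1.0, "synced": 0}


def toy_step(_sample):
    state["loss"] *= 0.5


def toy_transfer():
    state["synced"] += 1


iters, transfers = train_online(
    [0, 1, 2], toy_step, toy_transfer, lambda: state["loss"],
    k=2, loss_threshold=0.1, max_iters=100,
)
```

In this toy run the loss reaches 0.0625 after four steps, so training stops after two weight transfers.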
[0077] S3. Online inference based on neural network training results: Input inference samples and use only the analog weights stored in the CMOS transistor synaptic device to perform multiplication and accumulation operations to achieve low-power inference, specifically including:
[0078] S31. Inference sample input: the inference data collected in real time by the edge device is used as the neural network input vector $X$ and transferred to the hybrid memory array;
[0079] S32. Low-power multiply-accumulate operation: the training control circuit addresses the corresponding synaptic units according to the neural network architecture of the inference task and uses only the optimal analog weights $W^{*}$ programmed in the CMOS transistor synaptic devices to complete the multiply-accumulate operations of each layer of the neural network. The core formula is $\hat{Y} = W^{*} X + b$. This process involves no weight updates and no data transfer, minimizing hardware resource consumption.
[0080] S33. Inference result output and feedback: the computation signal is amplified by the sense amplifier of the training control circuit to obtain the final inference output of the neural network, which is then transmitted to the display or execution module of the edge device for real-time output. If the edge device needs to perform incremental online training, the inference error $E$ serves as a new gradient trigger signal: when $E > E_{th}$, where $E_{th}$ is a preset error threshold, the online training process of step S2 is automatically triggered to iteratively optimize the weights.
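The feedback rule of step S33 reduces to a threshold comparison, sketched below in Python. How the inference error is measured is not specified in the text; this example assumes a mean square error against a reference output, and both function names are hypothetical.

```python
def inference_error(prediction, reference):
    """Assumed error measure: mean square error against a reference output."""
    n = len(prediction)
    return sum((p - r) ** 2 for p, r in zip(prediction, reference)) / n


def should_trigger_training(prediction, reference, error_threshold):
    """Return True when the error exceeds the preset threshold (E > E_th)."""
    return inference_error(prediction, reference) > error_threshold


err = inference_error([1.0, 0.0], [0.0, 0.0])
retrain = should_trigger_training([1.0, 0.0], [0.0, 0.0], error_threshold=0.25)
keep = should_trigger_training([1.0, 0.0], [0.0, 0.0], error_threshold=0.75)
```

When the function returns True, the device would re-enter the training loop of step S2; otherwise it continues in inference-only mode.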
[0081] Working principle: This method is built on a fully standard CMOS process and can be flexibly adapted and expanded according to the actual needs of edge computing devices and neural network architectures. Specifically:
[0082] 1. Weight precision adaptation: Supports arbitrary low-precision quantization weight configuration of 4 bits and above. The number of capacitors and the number of bits of the CMOS capacitor array can be adjusted according to the inference accuracy requirements to realize digital hidden weight storage with different precisions such as 8 bits and 10 bits, adapting to quantization neural networks of different scales.
[0083] 2. Network architecture adaptation: by adjusting the body bias voltage of the CMOS transistor synaptic device, the same device can be dynamically switched between neuron mode and synaptic mode to meet the architectural requirements of different neural networks, such as convolutional neural networks and recurrent neural networks, without redesigning the hardware circuit.
[0084] 3. Device scale adaptation: the hybrid memory array adopts a unit-based array design, so the number of synaptic units can be adjusted flexibly according to the hardware resources of the edge device, such as chip area and power budget, scaling from the low unit counts of miniature edge sensors to the high unit counts of edge computing chips.
[0085] 4. Incremental Training and Continuous Learning: This method supports incremental online training for edge devices. When new training samples are collected, there is no need to retrain the entire neural network. The neural network can be continuously learned by updating the digital hidden weights through the CMOS capacitor array and completing the local analog weight programming, thus adapting to the needs of dynamic data changes in edge scenarios.
[0086] Furthermore, this method uses CMOS transistor synaptic devices with a write endurance of over 10^5 cycles, far better than that of traditional memristors. The CMOS capacitor read operation is non-destructive, which avoids lifetime loss due to data rewriting and ensures hardware stability during long-term online training and inference. In addition, during the inference phase only the analog weights are used for computation, with no additional data transfer, and the weight transfer phase requires no external DAC because circuit parasitic parameters provide the digital-to-analog conversion. This significantly reduces the energy consumption of edge devices and suits battery-powered edge scenarios. Most importantly, the entire process is based on standard CMOS technology, eliminating the need for non-standard materials such as ferroelectrics and memristors, ensuring full compatibility with existing integrated circuit manufacturing processes, reducing mass production costs, and facilitating large-scale application.
[0087] Finally, it should be noted that the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for online training and inference of neural networks using hybrid memory circuits based on CMOS transistors, characterized in that it includes the following steps:
S1. CMOS hybrid memory circuit deployment: a set of CMOS transistor synaptic devices and a set of CMOS capacitors form a synaptic unit, and multiple synaptic units form a hybrid memory array, which provides the core weight storage and multiply-accumulate carrier for online training and inference of neural networks;
S2. Online training based on hybrid memory neural networks, including:
S21. For each input sample, perform forward propagation using the analog weights of the synaptic devices;
S22. Error backpropagation and gradient calculation;
S23. Iterative update of digital hidden weights: the hidden weights stored in the CMOS capacitors are updated digitally according to the gradient update rule, with the update probability controlled by a binary mask;
S24. Periodic weight transfer and analog weight programming: a sample update threshold k is set. After operations S21 to S23 have been completed for every k training samples, the training control circuit triggers the weight transfer mechanism to synchronize the digital hidden weights to the analog weights;
S3. Online inference based on neural network training results: input inference samples and perform multiply-accumulate operations using only the analog weights stored in the CMOS transistor synaptic devices to achieve low-power inference.
The CMOS transistor synaptic devices are n-type MOSFETs connected in differential pairs to store positive and negative weights respectively. The CMOS capacitor array consists of minimum-size capacitors Cn, which store 10-bit signed integer hidden weights in binary form. Furthermore, the most significant bit positions employ differentiated capacitors of 4x, 2x, and 1x; a transmission line connects the gate of the CMOS transistor synaptic device to the CMOS capacitor array, and the programming current of the synaptic device is directly controlled by the voltage applied through the parasitic capacitance. The device is pre-configured by adjusting the body terminal bias voltage of the CMOS transistor synaptic device through an additional MOSFET, configuring it in synaptic mode so that it operates in the breakdown region and exhibits controllable resistance switching behavior, thereby emulating the long-term or short-term plasticity of a synapse.
2. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 1, characterized in that the CMOS hybrid memory circuit deployment further includes a training control circuit, which integrates an address decoder, driver circuit, sense amplifier, and timing control module and is connected to the synaptic units for control. It is responsible for accurate addressing of weight addresses, driving of weight updates, sense amplification of weight signals, and timing coordination and logic control of the training, inference, and weight transfer stages during the neural network training process.
3. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 2, characterized in that, In step S21, the neural network training samples collected by the edge device are input into the hybrid memory array. The training control circuit addresses the corresponding synaptic unit through the address decoder and uses the analog weights stored in the CMOS transistor synaptic device to complete the inter-layer multiplication and accumulation operation of the neural network, realize forward propagation and obtain the predicted output result.
4. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 3, characterized in that step S22 calculates the loss value by comparing the predicted output of the forward propagation with the true label of the sample. Based on a gradient update rule such as stochastic gradient descent, the backpropagation of the loss value is completed through the training control circuit. The update gradient of the analog weight of each synaptic unit is calculated layer by layer, and the gradient signal is transmitted to the control terminal of the corresponding CMOS capacitor array.
5. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 4, characterized in that, in step S23, the training control circuit iteratively applies binary digital updates to the digital hidden weights stored in the CMOS capacitor array based on the gradient signal, with the update probability controlled by the binary mask.
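One possible reading of the masked update in step S23 is sketched below. It assumes that each hidden weight is a small signed integer, that an update moves it by one step against the sign of the gradient, and that the binary mask is drawn independently per weight with a fixed probability. None of these details are specified in the claim; the function name, the weight range, and the probability are hypothetical.

```python
import random


def masked_sign_update(hidden, grads, update_prob, rng, w_min=-7, w_max=7):
    """Apply one masked, sign-based step to integer hidden weights.

    Assumptions: mask bit = 1 with probability update_prob; the step is
    -1 for a positive gradient, +1 for a negative one, 0 for zero gradient;
    results are clamped to [w_min, w_max].
    """
    updated = []
    for w, g in zip(hidden, grads):
        mask = 1 if rng.random() < update_prob else 0
        step = 0 if g == 0 else (-1 if g > 0 else 1)
        updated.append(max(w_min, min(w_max, w + mask * step)))
    return updated


rng = random.Random(0)  # seeded only to make the sketch repeatable
always = masked_sign_update([3, -7, 0], [0.2, 0.5, -0.1], 1.0, rng)
never = masked_sign_update([3, -7, 0], [0.2, 0.5, -0.1], 0.0, rng)
```

With `update_prob=1.0` every weight is updated, and with `0.0` none is; intermediate values update only a random subset of weights per iteration.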
6. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 5, characterized in that, in step S24, the most significant bits and the sign bit of the digital hidden weights in the CMOS capacitor array are read in parallel using a sense amplifier. Using the voltage gradient produced by the differentiated-area capacitors, an analog voltage proportional to the stored digital weight value is generated on the transmission line. The transmission line delivers this voltage through parasitic capacitance to the gate of the CMOS transistor synaptic device, directly controlling its programming current, thereby programming the conductance value of the CMOS transistor synaptic device and completing the synchronous update from digital hidden weights to analog weights.
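The digital-to-analog step in S24 can be illustrated with an idealized charge-sharing model. The sketch assumes that each capacitor (areas in the ratio 4:2:1) is charged to a supply voltage when its bit is 1 and to zero otherwise, and that the capacitors then share charge with an initially discharged parasitic capacitance on the transmission line. Treating the sign bit as a polarity selector is a further assumption; all parameter names and values are hypothetical and leakage is ignored.

```python
def line_voltage(msb_bits, sign_bit, vdd=1.0, unit_cap=1.0, parasitic_cap=1.0):
    """Idealized charge-sharing voltage on the transmission line.

    msb_bits: three bits ordered from the 4x capacitor to the 1x capacitor.
    V = vdd * (4*b2 + 2*b1 + 1*b0) * C / (7*C + C_parasitic), so the result
    is proportional to the magnitude encoded by the three bits.
    """
    ratios = (4, 2, 1)
    total_cap = sum(ratios) * unit_cap + parasitic_cap
    charge = sum(r * unit_cap * vdd * b for r, b in zip(ratios, msb_bits))
    magnitude = charge / total_cap
    return -magnitude if sign_bit else magnitude  # assumed polarity handling


v_five = line_voltage((1, 0, 1), sign_bit=0)    # magnitude code 5
v_seven = line_voltage((1, 1, 1), sign_bit=0)   # magnitude code 7
v_neg_five = line_voltage((1, 0, 1), sign_bit=1)
```

Because the denominator is fixed, the line voltage scales linearly with the stored code, which is why no separate DAC is needed in this model.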
7. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 6, characterized in that, in the later stages of online training of the neural network, convergence is determined by repeating steps S21 to S24, continuously inputting training samples and completing weight updates and synchronization, until the loss function value of the neural network drops to a preset threshold or the maximum number of training iterations is reached, at which point online training stops. The analog weights then stored in the CMOS transistor synaptic devices are the optimal weights after training.
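The stopping rule in this claim amounts to a loop with two exit conditions. The sketch below assumes a hypothetical callback `run_iteration` that performs one pass of steps S21 to S24 and returns the current loss; the threshold and iteration limit are illustrative values.

```python
def train_until_converged(run_iteration, loss_threshold, max_iterations):
    """Repeat training iterations until the loss reaches the threshold
    or the iteration limit is hit; report which condition ended training."""
    loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        loss = run_iteration()  # one pass of S21-S24 (hypothetical callback)
        if loss <= loss_threshold:
            return iteration, loss, "threshold"
    return max_iterations, loss, "max_iterations"


# Stand-in for real training: a pre-recorded, decreasing loss sequence.
recorded_losses = iter([0.9, 0.5, 0.2, 0.05, 0.01])
result = train_until_converged(lambda: next(recorded_losses), 0.1, 100)
```

Here training stops at the fourth iteration, the first one whose loss is at or below the threshold.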
8. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 7, characterized in that step S3 includes: S31, inference sample input: the inference data collected in real time by the edge device is used as the neural network input sample and transmitted to the hybrid memory array; S32, low-power multiply-accumulate operation: the training control circuit addresses the corresponding synaptic units according to the neural network architecture of the inference task and uses only the optimal analog weights programmed in the CMOS transistor synaptic devices to complete the multiply-accumulate operations of each layer of the neural network. This process involves no weight update and no data transmission, so hardware resource consumption is minimal.
9. The method for online training and inference of neural networks using a hybrid memory circuit based on CMOS transistors according to claim 5, characterized in that step S3 also includes inference result output and feedback. The operation signal is amplified by the sense amplifier of the training control circuit to obtain the final inference output of the neural network, and the result is transmitted to the display or execution module of the edge device for real-time output. If the edge device needs to perform incremental online training, the inference error is used as a new gradient trigger signal: a preset error threshold is set, and when the inference error reaches that threshold, the online training process of step S2 is automatically triggered to iteratively optimize the weights.
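The error-triggered retraining described in this claim can be sketched as a simple threshold check over a stream of inference errors. The claim does not define how the inference error is measured; the callback `retrain`, the comparison with `>=`, and the numeric values below are assumptions for illustration.

```python
def run_inference_stream(errors, error_threshold, retrain):
    """Call retrain(err) whenever an inference error reaches the threshold.

    retrain is a hypothetical callback standing in for the online training
    process of step S2. Returns the number of times it was triggered.
    """
    triggers = 0
    for err in errors:
        if err >= error_threshold:  # "reaches the threshold" read as >=
            retrain(err)
            triggers += 1
    return triggers


retrain_log = []
trigger_count = run_inference_stream(
    errors=[0.02, 0.3, 0.05, 0.25],
    error_threshold=0.25,
    retrain=retrain_log.append,
)
```

In this run, retraining is triggered twice, for the errors 0.3 and 0.25, while the smaller errors leave the programmed weights untouched.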