A method for training a neural network based on multi-synapse connection and local plasticity modulation

The method builds an incremental learning neural network from multi-synaptic connections and a local plasticity modulation mechanism. It targets the catastrophic forgetting problem, aims for stable learning and high accuracy in complex environments, and is intended to suit edge computing devices.

CN121189428B · Active · Publication Date: 2026-06-26 · TIANJIN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
TIANJIN UNIV
Filing Date
2025-09-23
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing incremental learning methods are prone to catastrophic forgetting when the model continuously learns new tasks, and are sensitive to the order of tasks, resulting in performance fluctuations.

Method used

By employing multi-synaptic connections and a local plasticity modulation mechanism, a multi-synaptic neural network is constructed that simulates the neural structure of the biological brain. The intensity of local synaptic activity is recorded using eligibility traces, and synaptic weights are dynamically adjusted to achieve stable incremental learning.

Benefits of technology

It significantly mitigates catastrophic forgetting under fixed network capacity, improves the robustness and accuracy of the model in complex environments, and is suitable for resource-constrained edge computing devices.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a method for training an incremental learning neural network based on multi-synaptic connections and local plasticity modulation. The main steps are: constructing a neural network based on multi-synaptic connections, comprising fully connected layers and convolutional layers, where multiple synaptic connections are introduced between neurons of adjacent layers in the fully connected layers and multiple parallel weight channels are designed for each convolution kernel element in the convolutional layers; associating an eligibility trace with each synaptic weight to record the local synaptic activity intensity, and initializing the network parameters; allocating a sub-network to each task by weight mask, updating the eligibility traces according to the neuron output values, training the model with the backpropagation algorithm, and freezing the allocated weights; and calculating a synaptic modulation factor from the eligibility trace intensity and updating the weights in combination with the modulation factor. The application can effectively improve the accuracy of incremental learning, achieve zero forgetting, and maintain robustness to changes in task order.

Description

Technical Field

[0001] This invention relates to the fields of neuromorphic computing and incremental learning, and in particular to a method for training neural networks by incremental learning based on multi-synaptic connections and local plasticity modulation. It provides a neuromorphic intelligent system that addresses the catastrophic forgetting problem in neural networks and is suitable for continual learning scenarios.

Background Technology

[0002] Deep neural networks have made groundbreaking progress in many fields, but they typically rely on the assumption of independent and identically distributed data and are trained once on static datasets. When the model needs to continuously learn a series of tasks, the learning of new tasks will overwrite the network weights optimized for old tasks, causing the model to quickly forget old knowledge. This problem is known as "catastrophic forgetting".

[0003] Incremental learning aims to develop models that can continuously acquire and retain knowledge from a series of tasks or data streams, thus simulating the human ability to learn progressively over time. This approach, also known as continual learning or lifelong learning, shows great promise for building efficient systems that can adapt to dynamic environments. However, a core challenge of incremental learning is catastrophic forgetting, where the model's performance on previously learned tasks declines significantly after it learns a new task.

[0004] To mitigate catastrophic forgetting, existing research has proposed various methods, which can be broadly categorized into three types: replay-based methods, regularization-based methods, and architecture-based methods. Among these, architecture-based methods have gradually gained widespread attention due to their ability to dynamically adjust network structure to adapt to new tasks. These methods primarily rely on two strategies: network expansion and parameter pruning. Network expansion increases the network capacity on top of the existing model to accommodate new tasks and reuses existing parameters to mitigate forgetting; parameter pruning involves pruning from a pre-allocated dense model, assigning a sub-network to each old task, and subsequent training is performed only on the unpruned parameters.

[0005] These strategies offer several advantages, such as more efficient memory usage and the potential to achieve forgetting-free learning under certain conditions. However, they also face some significant challenges. First, network expansion requires continuously increasing model capacity to handle new tasks, leading to a gradual increase in memory and computational overhead. Furthermore, both expansion and pruning strategies are typically sensitive to the task presentation order, causing significant fluctuations in model performance depending on the task sequence.

Summary of the Invention

[0006] The purpose of this invention is to address the problems in the prior art by providing an incremental learning method for training neural networks based on multi-synaptic connections and local plasticity modulation. By simulating the neural structure and working principle of the biological brain, the neural network can learn multiple tasks continually, efficiently, and stably, while significantly mitigating catastrophic forgetting.

[0007] The objective of this invention is achieved by proposing a mechanism for multisynaptic neuron connectivity and local plasticity modulation, comprising the following steps:

[0008] 1. A method for training a neural network by incremental learning based on multi-synaptic connections and local plasticity modulation, comprising the following steps:

[0009] S1. Obtain the image dataset for incremental learning and input it into the neural network in a random task order;

[0010] S2. Introduce multiple synaptic connections between adjacent neurons in the fully connected layer of the neural network; at the same time, add multiple parallel synaptic weight channels to each convolutional kernel element in the convolutional layer of the neural network to construct a neural network with multiple synaptic connections.

[0011] S3. Associate an eligibility trace with each synaptic weight of the multi-synaptic neural network using the following formula to record the intensity of local synaptic activity; and initialize the parameters of the multi-synaptic neural network.

[0012]

[0013] where the terms of the formula denote: the decay time constant; the Dirac delta function; the index of spike events; and the time at which the indexed spike event occurs;

[0014] S4. Assign a sub-network to each random input task using a weight mask, update the eligibility trace using the following discrete-time formula, train the multi-synaptic neural network using the backpropagation algorithm, and freeze the assigned network weights.

[0015]

[0016] where the input term of the formula denotes the spike output at the given time step;

[0017] S5. Construct a modulation factor model based on the eligibility trace strength of each synapse.

[0018] S6. Update the synaptic weights by incorporating the modulation factor model, train the multi-synaptic neural network, and then test it.

[0019] Furthermore, in step S2, the multi-synaptic spiking neural network is obtained by replacing the single synaptic weight of each spiking neuron with multiple parallel synaptic weights. The neuron's membrane potential at the given time step is represented as:

[0020]

[0021] where the terms of the formula denote: the number of neurons in the previous layer; the number of parallel synapses between the neuron and each afferent neuron; the output of the neuron at a given time; the firing threshold; the membrane potential time constant; the time step; and the input from each afferent neuron at a given time, which has a different decay coefficient at each parallel synapse.

[0022] Furthermore, the process of constructing a modulation factor model based on the eligibility trace strength of each synaptic association in step S5 specifically includes:

[0023] The effect of the eligibility trace on synaptic plasticity is controlled by a nonlinear modulation function. The function is piecewise: it adjusts the intensity and direction of synaptic weight changes based on the eligibility trace intensity, where:

[0024]

[0025] where the parameters of the function are: a critical point at which synaptic weight updates transition from enhancement to inhibition; a threshold that delineates the boundary between low and medium modulation levels; and a coefficient that controls the maximum intensity of modulation. Low or medium levels of synaptic activity result in synaptic inhibition, while high levels of synaptic activity result in synaptic enhancement. When the intensity of synaptic activity is too high or too low, plasticity is completely suppressed. When the intensity of activity is 0, no modulation occurs. The argument of the function is the normalized eligibility trace value, which reflects the relative intensity of local synaptic activity.

[0026] Furthermore, in step S6, the synaptic weights are updated by fusing the modulation factor model using the following formula:

[0027]

[0028] where the terms of the formula denote the learning rate and the gradient, and the modulation function controls the magnitude and direction of synaptic weight updates.

[0029] Beneficial effects

[0030] This invention proposes an incremental learning method based on multi-synaptic neurons that is biologically plausible and efficient. The method dynamically allocates synaptic resources within a fixed-capacity neural network, effectively improving the network's representational and knowledge-retention capabilities and the system's scalability without requiring expansion of the network structure.

[0031] This invention proposes a synaptic plasticity modulation mechanism based on local synaptic activity intensity. It employs eligibility traces to record the activity-dependent intensity of each synapse, thereby achieving dynamic modulation of synaptic plasticity. This mechanism significantly enhances the model's robustness to changes in the order of input tasks and effectively improves its incremental learning ability in complex dynamic environments.

[0032] This invention's local update rules reduce reliance on global data, resulting in more efficient computation. This makes the method well suited to incremental learning on resource-constrained edge devices such as mobile phones, robots, and IoT devices.

[0033] Experimental verification on multiple mainstream datasets shows that the method of this invention performs well in both spiking and non-spiking neural networks. It can effectively improve the accuracy of incremental learning, achieve zero forgetting, and maintain robustness to changes in task order, giving it strong practicality and broad application prospects.

Attached Figure Description

[0034] Figure 1 is a flowchart of the incremental learning method based on multi-synaptic connections and local plasticity modulation proposed in this invention.

[0035] Figure 2 is a schematic diagram of the multi-synaptic neuron proposed in this invention.

[0036] Figure 3 is a schematic diagram of the update process of the local synaptic plasticity modulation factor proposed in this invention.

[0037] Figure 4 is a diagram showing the experimental results of an embodiment of the present invention.

Detailed Implementation

[0038] The present invention is described below with reference to Figure 1:

[0039] This invention provides an incremental learning method based on multi-synaptic connections and local plasticity modulation. As shown in Figure 1, it includes the following steps:

[0040] S1. Data Preparation and Task Sequence Generation: Obtain the image dataset for incremental learning and input it in a random task order to simulate a random task arrival sequence in the real world.

[0041] S2. Construct a neural network based on multi-synaptic connections, including fully connected layers and convolutional layers: The biomimetic core of this structure lies in:

[0042] In the fully connected layer, multiple synaptic connections are introduced between neurons in adjacent layers; in the convolutional layer, multiple parallel weight channels are designed for each convolutional kernel element; this structure simulates the situation where biological neurons are connected by a large number of synapses, providing a structural basis for complex plasticity changes.

[0043] Step S2 specifically includes the following. As shown in Figure 2, in spiking neural networks, multiple parallel synaptic weights replace the single synaptic weight of each spiking neuron. For a spiking neural network built from multi-synaptic neurons, the neuron's membrane potential at the given time step is represented as:

[0044]

[0045] where the terms of the formula denote: the number of neurons in the previous layer; the number of parallel synapses between the neuron and each afferent neuron; the output of the neuron at a given time; the firing threshold; the membrane potential time constant; the time step; and the input from each afferent neuron at a given time, which has a different decay coefficient at each parallel synapse.

[0046] In this embodiment, multi-synaptic neurons are extended to artificial neural networks: for each connection between neurons of two adjacent layers in a fully connected layer, multiple parallel synaptic connections are constructed, represented as the following vector:

[0047]

[0048] The parallel synapses work together by summation. When the layer receives an input vector, the input to a neuron is represented as:

[0049]

[0050] where the function term denotes the activation function corresponding to each parallel synapse.
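
The summation described in [0048] to [0050] can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's formula, which is not reproduced in this text: the names `multi_synapse_input`, `weights`, and `synapse_fns` are hypothetical, the parallel weights are stored as an array of shape (M, n_out, n_in), and the per-synapse activation functions (identity, tanh, ReLU) are placeholders.

```python
import numpy as np

def multi_synapse_input(x, weights, synapse_fns):
    """Sum the contributions of M parallel synapses per connection.

    x:           input vector, shape (n_in,)
    weights:     parallel synaptic weights, shape (M, n_out, n_in)
    synapse_fns: list of M element-wise functions, one per parallel synapse
                 (placeholder choices, not taken from the patent)
    """
    total = np.zeros(weights.shape[1])
    for w_m, f_m in zip(weights, synapse_fns):
        total += w_m @ f_m(x)  # contribution of the m-th parallel synapse
    return total

rng = np.random.default_rng(0)
M, n_in, n_out = 3, 4, 2
weights = rng.normal(size=(M, n_out, n_in))
synapse_fns = [lambda v: v, np.tanh, lambda v: np.maximum(v, 0.0)]
x = rng.normal(size=n_in)
z = multi_synapse_input(x, weights, synapse_fns)  # shape (n_out,)
```

If every per-synapse function were the identity, the M parallel weights would collapse into a single summed weight matrix, so in this sketch it is the distinct per-synapse functions that give the parallel synapses extra representational capacity.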

[0051] S3. Associate and initialize biomimetic eligibility traces: Associate an eligibility trace with each synaptic weight to record the intensity of local synaptic activity, and initialize the network parameters. The trace simulates the chemical signals that record the history of local activity in biological synapses and is key to achieving local plasticity.

[0052] S4. Sub-network Assignment and Local Activity-Based Training:

[0053] Task-specific subnetwork allocation: Subnetworks are allocated to each task using weight masks to simulate the phenomenon in the brain where different functions are handled by different neural circuits;

[0054] Eligibility trace update: Update the eligibility trace based on the neuron's output value.

[0055] Training and Freezing: The backpropagation algorithm is used to calculate the gradient of the loss function with respect to the weights, and the weights assigned to the current task are frozen after training to protect the learned knowledge.
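
The mask-and-freeze idea in [0053] and [0055] can be sketched as below. This is an illustrative sketch only: the helper names `masked_sgd_step` and `select_task_mask` are hypothetical, and choosing the largest-magnitude free weights as a task's sub-network is an assumption, since the text does not state how the weight mask is derived.

```python
import numpy as np

def masked_sgd_step(w, grad, frozen, lr=0.1):
    """Gradient step that leaves frozen weights untouched."""
    return w - lr * grad * (~frozen)

def select_task_mask(w, frozen, keep_fraction=0.5):
    """Pick the largest-magnitude free weights as the task's sub-network.

    Magnitude-based selection is an illustrative assumption.
    """
    free_scores = np.where(frozen, -np.inf, np.abs(w))
    n_keep = int(keep_fraction * np.count_nonzero(~frozen))
    mask = np.zeros_like(frozen)
    if n_keep > 0:
        top = np.argsort(free_scores, axis=None)[-n_keep:]
        mask.flat[top] = True
    return mask

w = np.array([0.9, -0.1, 0.4, -0.7])
frozen = np.zeros(4, dtype=bool)

# Task 1: train (one illustrative step), then freeze its sub-network.
w = masked_sgd_step(w, grad=np.array([0.5, 0.5, 0.5, 0.5]), frozen=frozen)
task1_mask = select_task_mask(w, frozen)
frozen |= task1_mask

# Task 2: only the remaining free weights may change.
w_before = w.copy()
w = masked_sgd_step(w, grad=np.ones(4), frozen=frozen)
```

Because frozen weights never receive an update, the outputs of the sub-network assigned to an earlier task cannot drift while later tasks are learned, which is the mechanism behind the zero-forgetting claim.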

[0056] Step S4 specifically includes:

[0057] As shown in Figure 3, the modulation signal within a continuous time interval is represented as the cumulative sum of synaptic spike events. For a postsynaptic neuron that receives multiple synaptic inputs, the eligibility trace strength of each input synapse is defined as:

[0058]

[0059] where the terms of the formula denote: the decay time constant; the Dirac delta function; the index of spike events; and the time at which the indexed spike event occurs. In practical implementation, a discrete-time formula is used, and the eligibility trace is updated as follows:

[0060]

[0061] where the input term of the formula denotes the spike output at the given time step.
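
The discrete-time formula itself is not reproduced in this text. One common form that matches the description (a trace that decays with a time constant and accumulates incoming spikes) is e[t] = exp(-dt/tau) * e[t-1] + s[t]. The sketch below assumes that form; the function name `update_trace` and the values of `tau` and `dt` are hypothetical.

```python
import numpy as np

def update_trace(trace, spikes, tau=20.0, dt=1.0):
    """One discrete-time step: decay the trace, then add the new spikes."""
    decay = np.exp(-dt / tau)
    return decay * trace + spikes

trace = np.zeros(3)  # one trace per input synapse
spike_train = np.array([[1, 0, 0],
                        [1, 0, 1],
                        [0, 0, 1]], dtype=float)  # rows are time steps
for s_t in spike_train:
    trace = update_trace(trace, s_t)
```

Under this assumed form, a synapse that never spikes keeps a trace of zero, while recent spikes weigh more than older ones, so the trace acts as a local record of recent activity intensity.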

[0062] In this embodiment, local plasticity modulation is extended to artificial neural networks. The input value of each neuron is passed through a nonlinear activation function; to simplify the model structure, no bias term is used, and the result is the neuron's output value. By correlating the synaptic activity intensity of each neuron with its output value, the eligibility trace intensity is defined as follows:

[0063]

[0064] S5. Local Plasticity Adjustment and Weight Update: The core of this step is to simulate the regulatory effect of neuromodulation on synaptic plasticity.

[0065] The modulation factor is calculated based on the eligibility trace strength of each synapse, and the synaptic weights are updated in combination with the modulation factor.

[0066] Step S5 specifically includes:

[0067] The effect of eligibility traces on synaptic plasticity is controlled by a nonlinear modulation function. The function is piecewise: it adjusts the intensity and direction of synaptic weight changes based on the eligibility trace intensity. Its argument is the normalized eligibility trace value, which reflects the relative intensity of local synaptic activity. The function is defined as:

[0068]

[0069] where the parameters of the function are: a critical point at which synaptic weight updates transition from enhancement to inhibition; a threshold that delineates the boundary between low and medium modulation intensities; and a coefficient that controls the maximum intensity of modulation. Low or moderate levels of synaptic activity lead to synaptic inhibition, while higher levels lead to synaptic enhancement. When synaptic activity is too high or too low, plasticity is completely suppressed; when the activity intensity is zero, no modulation occurs. The synaptic weight update formula that incorporates local synaptic plasticity modulation is as follows:

[0070]

[0071] where the terms of the formula denote the learning rate and the gradient, and the modulation function controls the magnitude and direction of synaptic weight updates.
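
A modulated update consistent with this description scales the gradient step by the modulation factor, for example w <- w - lr * g(e_hat) * grad. The sketch below assumes that form. The piecewise function `modulation` is a placeholder that only mimics the qualitative behaviour described above: its constant levels and all of its thresholds, including the upper cutoff `theta_high`, are hypothetical, and treating "no modulation" at zero activity as a factor of 0 is an assumption. Max-normalizing the trace is likewise an assumption.

```python
import numpy as np

def modulation(e_hat, theta_low=0.1, theta_crit=0.5, theta_high=0.9, g_max=1.0):
    """Placeholder piecewise modulation factor (all thresholds hypothetical)."""
    e_hat = np.asarray(e_hat, dtype=float)
    g = np.zeros_like(e_hat)  # too low / too high activity: plasticity suppressed
    g[(e_hat >= theta_low) & (e_hat < theta_crit)] = -g_max   # inhibition
    g[(e_hat >= theta_crit) & (e_hat <= theta_high)] = g_max  # enhancement
    return g

def modulated_update(w, grad, trace, lr=0.01):
    """w <- w - lr * g(e_hat) * grad, with e_hat the max-normalized trace."""
    peak = trace.max()
    e_hat = trace / peak if peak > 0 else np.zeros_like(trace)
    return w - lr * modulation(e_hat) * grad

w = np.array([1.0, 1.0, 1.0, 1.0])
grad = np.array([2.0, 2.0, 2.0, 2.0])
trace = np.array([0.0, 3.0, 7.0, 10.0])  # e_hat = [0.0, 0.3, 0.7, 1.0]
w_new = modulated_update(w, grad, trace)
```

With these placeholder thresholds, the first and last synapses are left unchanged, the second moves against the gradient step, and the third follows it, which shows how a single factor can control both the magnitude and the direction of each update.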

[0072] S6. Obtain the trained model and test it.

[0073] Table 1 compares the present invention with recent incremental learning methods. The accuracies in the table are obtained with the same architecture on the mainstream PMNIST, 10-split CIFAR-100, and 5-Datasets benchmarks. The smaller the backward transfer value, the greater the degree of forgetting; a backward transfer value of 0 indicates that catastrophic forgetting has been overcome. EWC is from: Kirkpatrick J, Pascanu R, Rabinowitz N, et al. Overcoming catastrophic forgetting in neural networks[J]. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521-3526. HAT is from: Serra J, Suris D, Miron M, et al. Overcoming catastrophic forgetting with hard attention to the task[C]//International Conference on Machine Learning. PMLR, 2018: 4548-4557. GPM is from: Saha G, Garg I, Roy K. Gradient projection memory for continual learning[J]. arXiv preprint arXiv:2103.09762, 2021. HLOP is from: Xiao M, Meng Q, Zhang Z, et al. Hebbian learning based orthogonal projection for continual learning of spiking neural networks[J]. arXiv preprint arXiv:2402.11984, 2024.
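
Backward transfer is commonly computed from an accuracy matrix as the mean difference between the final accuracy on each earlier task and the accuracy measured right after that task was learned. The sketch below assumes the patent uses this common definition; the function name `backward_transfer` is hypothetical, and the accuracy values are made up for illustration and are not experimental results.

```python
import numpy as np

def backward_transfer(acc):
    """acc[i, j]: accuracy on task j after training on tasks 0..i."""
    acc = np.asarray(acc, dtype=float)
    T = acc.shape[0]
    final = acc[T - 1, : T - 1]           # accuracy on old tasks at the end
    just_learned = np.diag(acc)[: T - 1]  # accuracy right after learning each
    return float(np.mean(final - just_learned))

# Made-up accuracies for three tasks (not experimental results).
no_forgetting = [[0.90, 0.00, 0.00],
                 [0.90, 0.85, 0.00],
                 [0.90, 0.85, 0.80]]
with_forgetting = [[0.90, 0.00, 0.00],
                   [0.70, 0.85, 0.00],
                   [0.60, 0.75, 0.80]]
bwt_zero = backward_transfer(no_forgetting)    # 0: old tasks fully retained
bwt_neg = backward_transfer(with_forgetting)   # negative: old tasks degraded
```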

[0074] Table 1

[0075]

[0076] Figure 4 illustrates the accuracy of different methods under different task input orders. The bar chart compares the per-task accuracy of the present invention with that of current state-of-the-art methods. The line graph shows the variance of each task's accuracy across input orders; a smaller variance indicates stronger robustness to changes in task order.

[0077] WSN is from: Kang H, Mina RJL, Madjid SRH, et al. Forget-free continual learning with winning subnetworks[C]//International Conference on Machine Learning. PMLR, 2022: 10734-10750.

[0078] Example: An ANN-based incremental learning system for image classification

[0079] 1. Data and Tasks (S1): Using the CIFAR-100 dataset, it is divided into 10 tasks, each containing images of 10 categories. The tasks are presented to the model in a random order.

[0080] 2. Network Construction (S2): Construct an Artificial Neural Network (ANN). In the fully connected layers, replace the connection between each pair of neurons with 3 parallel synapses (M=3). In the convolutional layers, each element of each 3x3 convolutional kernel corresponds to 3 parallel weight channels.

[0081] 3. Initialization (S3): Initialize the network weights and assign an eligibility trace to each synaptic weight (a total of 3 times the parameters of the standard network), with the initial value set to 0.

[0082] 4. Training the first task (S4, S5):

[0083] Model learning task 1 (e.g., containing 10 classes such as "airplane" and "car").

[0084] Forward propagation calculates the output, and backpropagation calculates the gradient.

[0085] Update the eligibility traces.

[0086] Based on the eligibility trace strength, the modulation factor for each weight update is calculated using a piecewise function.

[0087] Update the weights.

[0088] After training is complete, a weight mask is generated to identify the weights most important for Task 1, which are then frozen.

[0089] 5. Follow-up tasks for incremental learning:

[0090] When learning Task 2, only weights that were not frozen (i.e., weights that were not important in Task 1 and newly added synapses) are allowed to participate in the update.

[0091] Repeat steps S4 and S5 to assign and train a dedicated subnetwork for Task 2.

[0092] This process continues until all 10 tasks have been learned. The final model will maintain high accuracy on all tasks, demonstrating its effectiveness in overcoming catastrophic forgetting.
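
The task setup in this example can be sketched as follows. This is a hedged illustration: the function name `make_task_sequence` is hypothetical, grouping consecutive class indices into tasks is an assumption (the text only says 10 tasks of 10 categories each, presented in random order), and the convolution channel sizes used for the parameter count are illustrative values not taken from the patent.

```python
import numpy as np

def make_task_sequence(n_classes=100, n_tasks=10, seed=0):
    """Split class labels into equal tasks and shuffle the task order."""
    rng = np.random.default_rng(seed)
    tasks = np.arange(n_classes).reshape(n_tasks, -1)  # 10 classes per task
    order = rng.permutation(n_tasks)                   # random arrival order
    return [tasks[i].tolist() for i in order]

sequence = make_task_sequence()

# Parameter count of one 3x3 convolution with M = 3 parallel weight channels.
# The channel sizes (64 in, 128 out) are illustrative, not from the patent.
M, c_in, c_out, k = 3, 64, 128, 3
standard_params = c_out * c_in * k * k
multi_synapse_params = M * standard_params
```

The count reflects the statement in step 3 that the multi-synaptic network holds 3 times the parameters of the standard network when M=3; changing `seed` yields a different task order for robustness experiments.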

Claims

1. A method for training a neural network by incremental learning based on multi-synaptic connections and local plasticity modulation, characterized in that it includes the following steps: S1. Obtain the image dataset for incremental learning and input it into the neural network in a random task order; S2. Introduce multiple synaptic connections between adjacent neurons in the fully connected layer of the neural network, and add multiple parallel synaptic weight channels to each convolutional kernel element in the convolutional layer of the neural network, to construct a multi-synaptic neural network; S3. Associate an eligibility trace with each synaptic weight of the multi-synaptic neural network using the following formula to record the intensity of local synaptic activity, and initialize the parameters of the multi-synaptic neural network; where the terms of the formula denote: the decay time constant; the Dirac delta function; the index of spike events; and the time at which the indexed spike event occurs; S4. Assign a sub-network to each randomly input task using a weight mask, update the eligibility trace using the following discrete-time formula, train the multi-synaptic neural network using the backpropagation algorithm, and freeze the assigned network weights; where the input term of the formula denotes the spike output at the given time step; S5. Construct a modulation factor model based on the eligibility trace strength of each synapse; S6. Update the synaptic weights by incorporating the modulation factor model, train the multi-synaptic neural network, and then test it.

2. The method for training a neural network by incremental learning based on multi-synaptic connections and local plasticity modulation according to claim 1, characterized in that, in step S2, the multi-synaptic spiking neural network is obtained by replacing the single synaptic weight of each spiking neuron with multiple parallel synaptic weights, and the neuron's membrane potential at the given time step is represented as: ; where the terms of the formula denote: the number of neurons in the previous layer; the number of parallel synapses between the neuron and each afferent neuron; the output of the neuron at a given time; the firing threshold; the membrane potential time constant; the time step; and the input from each afferent neuron at a given time, which has a different decay coefficient at each parallel synapse.

3. The method for training a neural network by incremental learning based on multi-synaptic connections and local plasticity modulation according to claim 1, characterized in that the process of constructing a modulation factor model based on the eligibility trace strength of each synapse in step S5 specifically includes: the effect of the eligibility trace on synaptic plasticity is controlled by a nonlinear modulation function, which is piecewise and adjusts the intensity and direction of synaptic weight changes based on the eligibility trace intensity, where: ; where the parameters of the function are: a critical point at which synaptic weight updates transition from enhancement to inhibition; a threshold that delineates the boundary between low and medium modulation levels; and a coefficient that controls the maximum intensity of modulation. Low or medium levels of synaptic activity result in synaptic inhibition, while high levels of synaptic activity result in synaptic enhancement. When the intensity of synaptic activity is too high or too low, plasticity is completely suppressed. When the intensity of activity is 0, no modulation occurs. The argument of the function is the normalized eligibility trace value, which reflects the relative intensity of local synaptic activity.

4. The method for training a neural network by incremental learning based on multi-synaptic connections and local plasticity modulation according to claim 1, characterized in that, in step S6, the synaptic weights are updated by incorporating the modulation factor model using the following formula: ; where the terms of the formula denote the learning rate and the gradient, and the modulation function controls the magnitude and direction of synaptic weight updates.