Spiking Neural Networks and Image Recognition Methods
By introducing multiple processing channels and decoding layers into the spiking neural network and utilizing a combination of excitatory and inhibitory spiking neurons, the problems of insufficient anti-interference ability and recognition accuracy of the spiking neural network are solved, achieving higher recognition accuracy and robustness.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2020-07-17
- Publication Date
- 2026-06-30
Abstract
Description
Technical Field
[0001] This application relates to the field of neural network technology, and in particular to a spiking neural network and an image recognition method.
Background Technology
[0002] The rise of deep learning in recent years has sparked a surge of research in artificial intelligence. However, due to the inherent limitations of deep learning, it has been found that it cannot achieve true artificial intelligence. Since the brain is considered the most superior intelligent entity in nature, researchers have begun to focus on its mechanisms and systems, developing brain-inspired AI technologies. Against this backdrop, brain-inspired intelligence has emerged. Spiking Neural Networks (SNNs), as the third generation of neural networks, are at the core of the brain-inspired intelligence field. As a brain-inspired machine learning algorithm, SNNs use discrete pulses for signal transmission and processing. Therefore, compared to traditional Artificial Neural Networks (ANNs) that process continuous analog signals, SNNs are closer to the working principle of the human brain's neural networks and are more suitable for revealing the essence of brain-inspired intelligence.
[0003] In related technologies, image recognition with spiking neural networks generally relies on deep convolutional spiking neural networks, and the networks are usually made very deep in pursuit of higher recognition accuracy. However, a very deep spiking neural network is prone to failure, which makes it less resistant to interference. Therefore, there is a need for a spiking neural network that provides accurate recognition results and has strong anti-interference capabilities.
Summary of the Invention
[0004] To address the issue of poor anti-interference capability, this application provides a spiking neural network and an image recognition method.
[0005] In a first aspect, this application provides a spiking neural network, which includes multiple processing channels; a first processing channel of the multiple processing channels includes N sets of first spiking neurons, which are used to obtain a first recognition result of a target image based on a first set of image data, wherein each set of first spiking neurons includes M spiking neurons, and the first set of image data includes image data of M image regions of a first size obtained based on the target image; a second processing channel of the multiple processing channels includes P sets of second spiking neurons, which are used to obtain a second recognition result of the target image based on a second set of image data, wherein each set of second spiking neurons includes Q spiking neurons, and the second set of image data includes image data of Q image regions of a second size obtained based on the target image; wherein M, N, P and Q are all greater than or equal to 2, and the first recognition result and the second recognition result are used to obtain a final recognition result of the target image.
[0006] The scheme shown in this application includes a spiking neural network that can include multiple processing channels. A first processing channel comprises N sets of first spiking neurons, each set containing M spiking neurons. The spiking neurons in the N sets are used to obtain a first recognition result of the target image based on a first set of image data. The first set of image data consists of image data of M image regions of a first size obtained from the target image. A second processing channel comprises P sets of second spiking neurons, each set containing Q spiking neurons. The spiking neurons in the P sets are used to obtain a second recognition result of the target image based on a second set of image data. The second set of image data consists of image data of Q image regions of a second size obtained from the target image. The first size is not equal to the second size. These first and second recognition results can be used to obtain the final recognition result of the target image.
[0007] In this way, since multiple processing channels can be used in parallel to determine the final recognition result of the target image, not only can the recognition accuracy be increased, but also the remaining channels can be used to determine the final recognition result of the target image when one channel fails, thus improving the anti-interference ability of the spiking neural network.
[0008] In one possible implementation, the spiking neural network further includes a decoding layer; the decoding layer is fully connected to multiple processing channels; the decoding layer is used to determine the final recognition result of the target image based on the first recognition result and the second recognition result.
[0009] The scheme shown in this application further includes a decoding layer, which is fully connected to multiple processing channels. The number of spiking neurons in the decoding layer is equal to the number of image categories that the spiking neural network can recognize. The spiking neurons in the decoding layer are used to determine the final recognition result of the target image using the first recognition result and the second recognition result. In this way, the decoding layer integrates the recognition results of multiple processing channels through full connectivity to obtain the final recognition result, making image recognition more accurate.
[0010] In one possible implementation, the first processing channel includes: a first input layer for receiving a first set of image data and encoding the first set of image data into a first set of pulse signals; and a first output layer including N sets of first spiking neurons for: receiving the first set of pulse signals, wherein different spiking neurons located in the same set of first spiking neurons are respectively used to receive pulse signals of different image regions in the first set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in an image region; and obtaining a first recognition result of the target image based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the output signals of the N spiking neurons used to receive pulse signals of the same image region correspond to the recognition result of one image region in the first set of image data, and the N spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0011] The scheme shown in this application includes a first processing channel comprising a first input layer and a first output layer. The first input layer receives a first set of image data of the target image and encodes the first set of image data into a first set of pulse signals according to the encoding information corresponding to the first processing channel. The first output layer includes N sets of first spiking neurons. These N sets of first spiking neurons can receive the first set of pulse signals output by the first input layer and output a signal. Based on this output signal, a first recognition result of the target image can be obtained. Different spiking neurons located in the same set of first spiking neurons are used to receive pulse signals from different image regions of the first set of image data, and one spiking neuron receives the pulse signal corresponding to one pixel of one image region. Thus, it is equivalent to one set of first spiking neurons corresponding to one image region of the target image, and different spiking neurons in one set of first spiking neurons corresponding to different image regions in the first set of image data. The output signals of the N spiking neurons used to receive pulse signals from the same image region correspond to the recognition result of one image region in the first set of image data, and the N spiking neurons used to receive pulse signals from the same image region are located in different sets of spiking neurons. In this way, N spiking neurons that receive pulse signals from the same image region compete to output pulse signals, which can extract the most sensitive features, thus making the final recognition result more accurate.
[0012] In one possible implementation, the second processing channel includes: a second input layer for receiving a second set of image data and encoding the second set of image data into a second set of pulse signals; and a second output layer including P sets of second spiking neurons for: receiving the second set of pulse signals, wherein different spiking neurons located in the same set of second spiking neurons are respectively used to receive pulse signals of different image regions in the second set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in an image region; and obtaining a second recognition result of the target image based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the output signals of the P spiking neurons used to receive pulse signals of the same image region correspond to the recognition result of one image region in the second set of image data, and the P spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0013] The scheme shown in this application includes a second processing channel comprising a second input layer and a second output layer. The second input layer receives a second set of image data of the target image and encodes the second set of image data into a second set of pulse signals according to the encoding information corresponding to the second processing channel. The second output layer comprises P sets of second spiking neurons, which can receive the second set of pulse signals output by the second input layer and output a signal. Based on the output signal, a second recognition result for the target image can be obtained. Different spiking neurons located in the same set of second spiking neurons receive pulse signals from different image regions of the second set of image data, and one spiking neuron receives the pulse signal corresponding to a pixel in one image region. Thus, it is equivalent to one set of second spiking neurons corresponding to one image region of the target image, and different spiking neurons in one set of second spiking neurons corresponding to different image regions in the second set of image data. The output signals of the P spiking neurons that receive pulse signals from the same image region correspond to the recognition result of one image region in the second set of image data, and the P spiking neurons that receive pulse signals from the same image region are located in different sets of spiking neurons. In this way, the P spiking neurons that receive pulse signals from the same image region compete to output pulse signals, which can extract the most sensitive features, thus making the final recognition result more accurate.
[0014] In one possible implementation, the output signals of N spiking neurons receiving pulse signals from the same image region are the pulse output signals of one or more of the N spiking neurons receiving pulse signals from the same image region. In this way, only the spiking neurons corresponding to the sensitive features among the spiking neurons receiving pulse signals from the same image region can output pulse signals for final classification, thus making the final recognition result more accurate.
[0015] In one possible implementation, the N spiking neurons receiving pulse signals from the same image region include multiple excitatory spiking neurons and one or more inhibitory spiking neurons. The inhibitory spiking neurons suppress the output pulse signals of the multiple excitatory spiking neurons. The output signals of the N spiking neurons receiving pulse signals from the same image region include: pulse output signals from one or more excitatory spiking neurons based on the received pulse signals from the same image region, under the inhibition of the inhibitory spiking neurons. Thus, because the inhibitory spiking neurons suppress the excitatory spiking neurons, the winner-takes-all situation of the excitatory spiking neurons is reduced, enabling the extraction of more sensitive features, thereby making the final recognition result more accurate.
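The competition described in [0015] can be illustrated with a minimal sketch. The dynamics below are an assumption made for illustration, not the mechanism claimed in this application: each excitatory neuron integrates a constant drive, and whenever any excitatory neuron fires, a shared inhibitory neuron lowers the membrane voltage of the excitatory neurons that did not fire in that step. The function name `compete` and all parameter values are hypothetical.

```python
from typing import List


def compete(drives: List[float], threshold: float = 1.0,
            inhibition: float = 0.25, steps: int = 10) -> List[int]:
    """Return spike counts of excitatory neurons under shared inhibition.

    Assumed dynamics (not taken from the source): every excitatory neuron
    adds its drive to its membrane voltage once per step and fires when the
    voltage reaches the threshold. If any neuron fired in a step, the
    inhibitory neuron subtracts `inhibition` from every non-firing neuron.
    """
    voltages = [0.0] * len(drives)
    counts = [0] * len(drives)
    for _ in range(steps):
        voltages = [v + d for v, d in zip(voltages, drives)]
        fired = [v >= threshold for v in voltages]
        if any(fired):
            for i, did_fire in enumerate(fired):
                if did_fire:
                    counts[i] += 1
                    voltages[i] = 0.0  # reset after emitting a pulse
                else:
                    # suppression by the inhibitory neuron, floored at zero
                    voltages[i] = max(0.0, voltages[i] - inhibition)
    return counts


# Hypothetical drives for three excitatory neurons that see the same region.
with_inhibition = compete([0.5, 0.25, 0.125])
without_inhibition = compete([0.5, 0.25, 0.125], inhibition=0.0)
```

With these values the weakly driven neurons emit fewer pulses under inhibition than without it, so the pulse outputs passed on for classification come mainly from the neurons that respond most strongly to the region.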
[0016] In one possible implementation, the first output layer is further configured to: obtain recognition results for different parts of an image region from the first set of image data based on the output signals of every K spiking neurons out of N spiking neurons that receive pulse signals from the same image region, where N is an integer multiple of K. In this way, each K spiking neuron corresponds to one recognition result, which is the recognition result for a portion of an image region, thus allowing for the extraction of more detailed features and making the final recognition result more accurate.
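As a rough illustration of the grouping in [0016], the sketch below splits the outputs of the N neurons that serve one image region into N/K groups of K. Two points are assumptions, since the text does not specify them: the groups are taken as consecutive neurons, and each group's result is summarized by summing its pulse counts. The name `group_outputs` is hypothetical.

```python
from typing import List


def group_outputs(spike_counts: List[int], k: int) -> List[int]:
    """Summarize the outputs of every K neurons among the N for one region.

    Assumptions: groups are consecutive, and a group's result is the sum of
    its pulse counts. N must be an integer multiple of K, as in the text.
    """
    n = len(spike_counts)
    if k <= 0 or n % k != 0:
        raise ValueError("N must be an integer multiple of K")
    return [sum(spike_counts[start:start + k]) for start in range(0, n, k)]


# Hypothetical outputs of N = 6 neurons receiving the same image region.
part_results = group_outputs([3, 0, 1, 0, 0, 2], k=2)  # three parts of the region
```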
[0017] In one possible implementation, the first spiking neuron set includes a first spiking neuron for outputting a pulse signal when the membrane voltage of the first spiking neuron is greater than or equal to a membrane voltage threshold, wherein the membrane voltage of the first spiking neuron is determined based on the received neurotransmitter and the received pulse signal.
[0018] The scheme shown in this application refers to any spiking neuron included in the first spiking neuron set as a first spiking neuron. When the membrane voltage of a first spiking neuron is greater than or equal to a preset membrane voltage threshold, it can output a pulse signal. The membrane voltage of the first spiking neuron is determined based on the received neurotransmitter and the received pulse signal. Thus, by considering the neurotransmitter, when a pulse signal of the same type as the neurotransmitter is received in the next pulse response stage, a faster response or stronger suppression can be achieved, thereby improving the convergence speed and recognition accuracy of the spiking neural network.
[0019] In a second aspect, this application provides an image recognition method, comprising: acquiring a first set of image data and a second set of image data, wherein the first set of image data includes image data of M image regions of a first size obtained from a target image, and the second set of image data includes image data of Q image regions of a second size obtained from the target image, wherein M and Q are both greater than or equal to 2; obtaining a first recognition result of the target image based on the first set of image data; obtaining a second recognition result of the target image based on the second set of image data; and obtaining a final recognition result of the target image based on the first recognition result and the second recognition result.
[0020] The scheme shown in this application uses an image recognition method executed by a spiking neural network system. This system acquires a first set of image data and a second set of image data for the target image. The first set of image data includes image data from M image regions of a first size obtained from the target image, and the second set of image data includes image data from Q image regions of a second size obtained from the target image. The first size is not equal to the second size, and Q is not equal to M.
[0021] The spiking neural network system obtains a first recognition result of the target image based on a first set of image data, and a second recognition result based on a second set of image data. The spiking neural network system uses the first and second recognition results to determine the final recognition result of the target image. Thus, since multiple recognition results can be used to determine the final recognition result of the target image, recognition accuracy can be increased. Furthermore, using multiple recognition results for the target image also improves its anti-interference capability.
[0022] In one possible implementation, the spiking neural network system includes a first input layer and a first output layer, wherein the first output layer includes N sets of first spiking neurons. The method further includes: multiple spiking neurons in the first input layer encode a first set of image data into a first set of pulse signals; obtaining a first recognition result of a target image based on the first set of image data, including: the first output layer receiving the first set of pulse signals, wherein different spiking neurons located in the same set of first spiking neurons are respectively used to receive pulse signals of different image regions in the first set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in an image region; the first output layer obtains the first recognition result of the target image based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the recognition result of one image region in the first set of image data is obtained based on the output signals of the N spiking neurons used to receive pulse signals of the same image region, and the N spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0023] The scheme shown in this application includes a spiking neural network system comprising a first input layer and a first output layer. The first output layer comprises N sets of first spiking neurons. Multiple spiking neurons in the first input layer encode the first set of image data into a first set of pulse signals according to the encoding information corresponding to the first set of image data. The first output layer then receives the first set of pulse signals input from the first input layer. Different spiking neurons located in the same set of first spiking neurons are used to receive pulse signals from different image regions in the first set of image data. Furthermore, one spiking neuron is used to receive the pulse signal corresponding to a pixel in one image region. The first output layer obtains the recognition result of one image region in the first set of image data based on the output signals of the N spiking neurons used to receive pulse signals from the same image region. The N spiking neurons used to receive pulse signals from the same image region are located in different sets of spiking neurons. The recognition results of each image region in the first set of image data constitute the first recognition result of the target image. Thus, by obtaining recognition results by image region, more detailed features of the target image can be extracted, resulting in a more accurate final recognition result of the target image.
[0024] In one possible implementation, obtaining the first recognition result of the target image based on the first set of image data further includes: the first output layer obtaining the recognition results of different parts of an image region in the first set of image data based on the output signals of every K spiking neurons among the N spiking neurons that receive the pulse signal of the same image region, where N is an integer multiple of K.
[0025] The scheme shown in this application allows the first output layer to obtain recognition results for different parts of an image region in the first set of image data based on the output signals of every K spiking neurons out of N spiking neurons that receive pulse signals from the same image region. Thus, the first recognition result can also include recognition results for different parts of M image regions of a first size, thereby extracting more detailed features from the target image and making the final recognition result of the obtained target image more accurate.
[0026] In one possible implementation, the spiking neural network system further includes a second input layer and a second output layer, wherein the second output layer includes P sets of second spiking neurons. The method further includes: multiple spiking neurons in the second input layer encode a second set of image data into a second set of pulse signals; obtaining a second recognition result of the target image based on the second set of image data includes: the second output layer receiving the second set of pulse signals, wherein different spiking neurons located in the same set of second spiking neurons are respectively used to receive pulse signals of different image regions in the second set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in one image region; the second output layer obtains the second recognition result of the target image based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the recognition result of one image region in the second set of image data is obtained based on the output signals of the P spiking neurons used to receive pulse signals of the same image region, and the P spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0027] The scheme presented in this application includes a spiking neural network system comprising a second input layer and a second output layer. The second output layer comprises P sets of second spiking neurons. Multiple spiking neurons in the second input layer encode the second set of image data into a second set of pulse signals according to the encoding information corresponding to the second set of image data. The second output layer then receives the second set of pulse signals input from the second input layer. Different spiking neurons within the same set of second spiking neurons are used to receive pulse signals from different image regions within the second set of image data. Furthermore, each spiking neuron receives the pulse signal corresponding to a single pixel within an image region. The second output layer obtains the recognition result of one image region from the second set of image data based on the output signals of the P spiking neurons used to receive pulse signals from the same image region. These P spiking neurons are located in different sets of spiking neurons. The recognition results for each image region in the second set of image data constitute the second recognition result of the target image. Thus, by obtaining recognition results by image region, more detailed features of the target image can be extracted, resulting in a more accurate final recognition result for the target image.
[0028] In one possible implementation, obtaining a second recognition result of the target image based on the second set of image data further includes: the second output layer obtaining recognition results of different parts of an image region in the second set of image data based on the output signals of every K spiking neurons among the P spiking neurons that receive pulse signals of the same image region, where P is an integer multiple of K.
[0029] The scheme shown in this application allows the second output layer to obtain recognition results for different parts of an image region from the second set of image data based on the output signals of every K spiking neurons out of P spiking neurons that receive pulse signals from the same image region. Thus, the second recognition result can also include recognition results for different parts of Q image regions of a second size, thereby extracting more detailed features from the target image and making the final recognition result of the obtained target image more accurate.
[0030] In one possible implementation, the first and second recognition results are obtained in parallel. This allows for the parallel acquisition of multiple recognition results used to determine the final recognition outcome of the target image, thus improving both recognition accuracy and speed.
[0031] In a third aspect, this application provides a computing device including a processor and a communication interface, wherein the processor is connected to the communication interface.
[0032] The communication interface is used to acquire a target image, and the processor is used to implement the functions of the spiking neural network described in the first aspect and the image recognition method described in the second aspect.
[0033] In a fourth aspect, this application provides a computer-readable storage medium storing instructions that, when executed on a computing device, cause the computing device to implement the functions of the spiking neural network described in the first aspect and the image recognition method described in the second aspect.
[0034] In a fifth aspect, this application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to perform the functions of the spiking neural network described in the first aspect.
Attached Figure Description
[0035] Figure 1 is a schematic diagram of the architecture of a spiking neural network provided in an exemplary embodiment of this application;
[0036] Figure 2 is a schematic diagram of the architecture of a spiking neural network provided in an exemplary embodiment of this application;
[0037] Figure 3 is a schematic diagram illustrating the connection between the output layer and the decoding layer in a spiking neural network provided in an exemplary embodiment of this application;
[0038] Figure 4 is a schematic diagram of the architecture of a spiking neural network provided in an exemplary embodiment of this application;
[0039] Figure 5 is a schematic diagram of the architecture of the first channel provided in an exemplary embodiment of this application;
[0040] Figure 6 is a schematic diagram of the architecture of the second channel provided in an exemplary embodiment of this application;
[0041] Figure 7 is a schematic diagram of image segmentation provided in an exemplary embodiment of this application;
[0042] Figure 8 is a schematic diagram of an inhibitory spiking neuron inhibiting an excitatory spiking neuron, provided in an exemplary embodiment of this application;
[0043] Figure 9 is a schematic diagram of the division of spiking neurons receiving the same image region provided in an exemplary embodiment of this application;
[0044] Figure 10 is a spiking neuron response flow provided in an exemplary embodiment of this application;
[0045] Figure 11 is a schematic flowchart of an image recognition method provided in an exemplary embodiment of this application;
[0046] Figure 12 is a schematic diagram of the structure of an image recognition device provided in an exemplary embodiment of this application;
[0047] Figure 13 is a schematic diagram of the structure of an image recognition device provided in an exemplary embodiment of this application.
Detailed Implementation
[0048] To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application will be described in further detail below with reference to the accompanying drawings.
[0049] To facilitate understanding of the embodiments of this application, the concepts of the terms involved are first introduced below:
[0050] Spike-timing-dependent plasticity (STDP) is an algorithm that updates the connection weights between neurons in the brain. Its purpose is to set the strength of the connection between two neurons according to how close in time their outputs are. STDP includes unsupervised and supervised versions; this application employs an unsupervised STDP algorithm.
[0051] Unsupervised learning algorithms dominate human and animal learning. Humans discover the inherent structure of the world through observation, rather than being told the name of every objective thing. Unsupervised learning algorithms are primarily designed for training on unlabeled datasets, requiring the application of unsupervised learning rules to adaptively adjust the connection weights or structure of neural networks. In other words, without the supervision of a "teacher" signal, the neural network must discover patterns, such as statistical features, correlations, or categories, from the input data and achieve classification or decision-making through its output.
[0052] Receptive field: The receptive field is the size of the visual perception area. In convolutional neural networks, the receptive field is defined as the size of the region on the original image mapped by the pixels in the feature map output by each layer of the convolutional neural network.
[0053] Neurotransmitter: the neurotransmitter in a spiking neuron is the carrier of pulse signals in an SNN.
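To make the STDP definition in [0050] more concrete, the sketch below shows the widely used textbook pair-based form of the rule. It is not necessarily the variant used in this application; the function name `stdp_delta_w`, the amplitudes, and the time constants are hypothetical values chosen for illustration.

```python
import math


def stdp_delta_w(t_pre: float, t_post: float, a_plus: float = 0.01,
                 a_minus: float = 0.012, tau_plus: float = 20.0,
                 tau_minus: float = 20.0) -> float:
    """Weight change for one pre/post spike pair (textbook pair-based STDP).

    If the presynaptic spike precedes the postsynaptic spike, the weight is
    increased; if it follows, the weight is decreased. In both cases the
    magnitude decays exponentially with the time difference.
    All constants are hypothetical and times are in milliseconds.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


potentiation = stdp_delta_w(t_pre=10.0, t_post=15.0)  # pre before post
depression = stdp_delta_w(t_pre=15.0, t_post=10.0)    # post before pre
```

The rule needs only spike times, not labels, which is why it suits the unsupervised setting described in [0051].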
[0054] This application provides a spiking neural network that includes multiple processing channels, such as two processing channels or three processing channels.
[0055] As shown in Figure 1, the first processing channel 100 in the multiple processing channels may include N sets of first spiking neurons 200, where N is greater than or equal to 2. Each set of first spiking neurons 200 includes M spiking neurons 201. The target image is the image to be recognized, and the target image can be any type of image, such as an image containing numbers, which could be a license plate image. If the target image is a red-green-blue (RGB) image, it can be converted from an RGB image to a grayscale image using any method. If the target image itself is a grayscale image, no processing is required.
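Paragraph [0055] leaves the RGB-to-grayscale conversion open. One common choice, shown here only as an example, is a weighted sum of the three channels using the ITU-R BT.601 luma coefficients. The function name `rgb_to_gray` and the nested-list image layout are assumptions for this sketch.

```python
from typing import List, Tuple

RgbImage = List[List[Tuple[float, float, float]]]  # rows of (R, G, B) pixels
GrayImage = List[List[float]]                      # rows of intensities


def rgb_to_gray(image: RgbImage) -> GrayImage:
    """Convert an RGB image to grayscale with BT.601 luma weights.

    This is one possible method; the source allows any conversion.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


# Hypothetical 1x2 image: one white pixel and one black pixel.
gray = rgb_to_gray([[(255.0, 255.0, 255.0), (0.0, 0.0, 0.0)]])
```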
[0056] The target image can be divided into M image regions of a first size, and the image data of the M image regions of the first size constitute the first set of image data. The N sets of first spiking neurons 200 can be used to obtain the first recognition result of the target image based on the first set of image data.
[0057] The second processing channel 300 in the multiple processing channels may include P sets of second spiking neurons 400, where P is greater than or equal to 2. Each set of second spiking neurons 400 includes Q spiking neurons 401. The target image can be divided into Q image regions of a second size, and the image data of the Q image regions of the second size constitute a second set of image data. The P sets of second spiking neurons 400 can be used to obtain a second recognition result of the target image based on the second set of image data. Here, Q is not equal to M, and both are greater than or equal to 2; P is not equal to N, and both are greater than or equal to 2.
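The two sets of image data described in [0056] and [0057] can be sketched as follows. The sketch assumes non-overlapping rectangular regions and an image whose dimensions are exact multiples of the region size; the source does not state either condition, and the name `split_into_regions` is hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def split_into_regions(image: Image, region_h: int, region_w: int) -> List[Image]:
    """Split a grayscale image into non-overlapping regions of one size.

    Assumption: the image height and width are exact multiples of the
    region size; the source does not say how remainders are handled.
    """
    height, width = len(image), len(image[0])
    if height % region_h or width % region_w:
        raise ValueError("image size must be a multiple of the region size")
    regions = []
    for top in range(0, height, region_h):
        for left in range(0, width, region_w):
            regions.append([row[left:left + region_w]
                            for row in image[top:top + region_h]])
    return regions


# Hypothetical 4x4 target image with pixel values 0..15.
target = [[float(4 * r + c) for c in range(4)] for r in range(4)]
first_set = split_into_regions(target, 2, 2)   # M = 4 regions of a first size
second_set = split_into_regions(target, 2, 4)  # Q = 2 regions of a second size
```

Each channel then works on its own set, so the two channels see the same image at two different region sizes.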
[0058] The first and second recognition results can be used to comprehensively determine the final recognition result of the target image. In other words, the recognition of the target image by each channel can be used to determine the final recognition result of the target image.
[0059] In this way, the spiking neural network includes multiple processing channels. When determining the final recognition result of the target image, it combines the recognition results of multiple processing channels, rather than using the recognition result of only one processing channel. Therefore, the final recognition result of the target image can be more accurate. Moreover, since the spiking neural network includes multiple processing channels, as long as at least one processing channel is working properly, the final recognition result of the target image can be determined, thus improving the anti-interference ability of the spiking neural network.
[0060] In one possible implementation, as shown in Figure 2, the spiking neural network also includes a decoding layer 500, which is fully connected to the multiple processing channels. In the decoding layer 500, the number of spiking neurons equals the number of possible categories of the target task, which is also the number of categories that the spiking neural network can recognize. The spiking neurons in the decoding layer 500 are voltage-based non-leaky neuron models. The membrane voltage threshold for outputting a pulse signal from the spiking neurons in the decoding layer 500 is infinite, meaning that these neurons cannot output pulse signals. The decoding layer 500 can be used to determine the final recognition result of the target image based on the first recognition result and the second recognition result. Specifically, after receiving the recognition results from each processing channel, the spiking neurons in the decoding layer 500 produce a membrane voltage response. The category corresponding to the spiking neuron with the largest cumulative membrane voltage in the decoding layer 500 is determined as the category of the target image.
[0061] For example, in a given channel, such as the first processing channel 100, each excitatory spiking neuron 1, 2, 3, ..., i, ..., l-1, l is connected to each spiking neuron in the decoding layer 500. That it is the excitatory spiking neurons that are connected to the decoding layer 500 means that only excitatory spiking neurons can output recognition results to the decoding layer. An excitatory spiking neuron is defined as follows: if a spiking neuron that receives the pulse signal output by a given spiking neuron becomes more likely to trigger an output pulse signal, the given spiking neuron is an excitatory spiking neuron. As shown in Figure 3, excitatory spiking neuron i is connected to each spiking neuron of the decoding layer 500, which includes c neurons. The connection weights between excitatory spiking neuron i and the spiking neurons of the decoding layer 500 are $w_{i1}, w_{i2}, \ldots, w_{ic}$.
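As an illustration only, the decoding described above can be sketched as follows. The sketch assumes that the number of pulses emitted by each excitatory spiking neuron is already known; the function name `decode`, the data layout, and the example weights are hypothetical and are not taken from this application.

```python
def decode(spike_counts, weights):
    """Accumulate voltage in non-leaky decoding neurons and pick a category.

    spike_counts: list of length l, pulses emitted by each excitatory neuron.
    weights:      l rows of c values; weights[i][j] plays the role of w_ij.
    """
    num_classes = len(weights[0])
    voltages = [0.0] * num_classes
    for count, row in zip(spike_counts, weights):
        for j, w in enumerate(row):
            # No leak and an infinite threshold: voltage is a plain weighted sum
            voltages[j] += count * w
    # The category is the decoding neuron with the largest accumulated voltage
    winner = max(range(num_classes), key=lambda j: voltages[j])
    return winner, voltages

# Hypothetical example: 3 excitatory neurons, 2 categories
category, voltages = decode([2, 0, 1], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.2]])
```

Because the decoding neurons never fire and never leak, their state reduces to a running sum, which is why a simple maximum over the accumulated values is enough to select the category.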
[0062] In one possible implementation, as shown in Figure 4, the first processing channel 100 includes a first input layer 101 and a first output layer 102. The first input layer 101 is used to receive the first set of image data, encode it, and obtain a first set of pulse signals. Specifically, the first input layer 101 includes spiking neurons 1011 for encoding, and the number of spiking neurons 1011 is equal to the number of pixels in the target image. Each spiking neuron 1011 encodes one pixel of the target image according to the encoding coefficient corresponding to the first processing channel 100, and different spiking neurons 1011 encode different pixels. A frequency encoding method can be used to encode the target image: the pixel value of each pixel is multiplied by the encoding coefficient corresponding to the first processing channel 100 to obtain the pulse frequency of that pixel, and a pulse sequence with that frequency is generated for the pixel. The pulse sequences of all pixels in the target image constitute the first set of pulse signals. Frequency encoding is described here only as an example; other encoding methods, such as temporal encoding, can also be used, and this embodiment does not limit the specific encoding method.
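A minimal sketch of the frequency encoding described above is given below. It assumes evenly spaced pulses within a fixed time window, which is one possible way to generate a pulse sequence of a given frequency and is not stated in this application; the function name `rate_encode`, the units, and the example values are hypothetical.

```python
def rate_encode(pixels, coeff, duration):
    """Return one list of pulse times per pixel.

    pixels:   flat list of grayscale pixel values.
    coeff:    encoding coefficient of the processing channel
              (assumed unit: pulses per second per gray level).
    duration: length of the encoding window in seconds.
    """
    trains = []
    for value in pixels:
        rate = value * coeff          # pulse frequency = pixel value x coefficient
        if rate <= 0:
            trains.append([])         # a zero-rate pixel emits no pulses
            continue
        period = 1.0 / rate
        count = int(duration * rate)  # whole periods that fit in the window
        trains.append([(k + 1) * period for k in range(count)])
    return trains

# Hypothetical example: three pixels encoded by a channel with coefficient 0.25
trains = rate_encode([0, 128, 255], coeff=0.25, duration=0.5)
```

A channel with a different encoding coefficient would call the same function with a different `coeff`, so the same pixel yields a different pulse frequency in each channel.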
[0063] The first output layer 102 includes N sets of first spiking neurons 200. The first output layer 102 receives a first set of pulse signals output by the first input layer 101. Different spiking neurons 201 located in the same set of first spiking neurons 200 are respectively used to receive pulse signals from different image regions in the first set of image data, and one spiking neuron 201 is used to receive the pulse signal corresponding to a pixel in one image region. N is greater than or equal to the number of pixels in the target image. Thus, it is equivalent to the pulse signal corresponding to a pixel potentially being input to different spiking neurons 201.
[0064] The first output layer 102 is used to obtain a first recognition result of the target image based on the output signals of the spiking neurons 201 in the N first spiking neuron sets 200. The output signals of the N spiking neurons 201 that receive the pulse signals of the same image region correspond to the recognition result of one image region in the first set of image data. The N spiking neurons 201 that receive the pulse signals of the same image region are located in different first spiking neuron sets 200.
[0065] The second processing channel 300 includes a second input layer 301 and a second output layer 302. The second input layer 301 receives a second set of image data, encodes the second set of image data, and obtains a second set of pulse signals. Specifically, the second input layer 301 includes spiking neurons 3011 for encoding, and the number of spiking neurons 3011 for encoding is equal to the number of pixels in the target image. Each spiking neuron 3011 for encoding encodes a pixel in the target image according to the encoding coefficient corresponding to the second processing channel 300, and different spiking neurons encode different pixels. When encoding the target image, a frequency encoding method can be used. For each pixel, the pixel value of the pixel is multiplied by the encoding coefficient corresponding to the second processing channel 300 to obtain the pulse frequency corresponding to each pixel in the target image. Based on the pulse frequency corresponding to each pixel, a pulse sequence corresponding to the pulse frequency of each pixel is generated. The pulse sequences corresponding to all pixels in the target image constitute the second set of pulse signals.
[0066] The second output layer 302 includes P sets of second spiking neurons 400. The second output layer 302 receives the second set of pulse signals output by the second input layer 301. Different spiking neurons 401 located in the same set of second spiking neurons 400 receive pulse signals from different image regions in the second set of image data, and one spiking neuron 401 receives the pulse signal corresponding to a pixel in one image region. P is greater than or equal to the number of pixels in the target image. Thus, it is equivalent to the pulse signal corresponding to a pixel potentially being input to different spiking neurons 401.
[0067] The second output layer 302 is used to obtain a second recognition result of the target image based on the output signals of the spiking neurons 401 in the P sets of second spiking neurons 400. The output signals of the P spiking neurons 401 that receive the pulse signals of the same image region correspond to the recognition result of one image region in the second set of image data. The P spiking neurons 401 that receive the pulse signals of the same image region are located in different sets of second spiking neurons 400.
[0068] For the first processing channel, for example, as shown in Figure 5, N equals 100, meaning that 100 spiking neurons receive pulse signals from the same image region. M equals 9, meaning that the target image is divided into 9 image regions of the first size. Each first spiking neuron set 200 contains 9 spiking neurons 201, which correspond one-to-one with the 9 first-size image regions of the target image. For this correspondence, assume the target image is divided into 3*3 image regions of the first size: from left to right, the first row of image regions is a, b, c; the second row is d, e, f; and the third row is m, n, o. Each first spiking neuron set 200 contains 3*3 spiking neurons 201: from left to right, the first row of spiking neurons 201 is 1, 2, 3; the second row is 4, 5, 6; and the third row is 7, 8, 9. Regions a, b, c correspond to neurons 1, 2, 3 respectively; d, e, f correspond to 4, 5, 6; and m, n, o correspond to 7, 8, 9. In each set of first spiking neurons 200, one spiking neuron 201 corresponds to one pixel; for example, neuron 1 corresponds to one pixel in region a.
[0069] For the second processing channel, for example, as shown in Figure 6, P equals 64, meaning that 64 spiking neurons receive pulse signals from the same image region. Q equals 4, meaning that the target image is divided into 4 image regions of the second size. Each second spiking neuron set 400 contains 4 spiking neurons 401, which correspond one-to-one with the 4 second-size image regions of the target image. For this correspondence, assume the target image is divided into 2*2 image regions of the second size: from left to right, the first row of image regions is a1, b1, and the second row is d1, e1. Each second spiking neuron set 400 contains 2*2 spiking neurons 401: from left to right, the first row of spiking neurons 401 is m1, n1, and the second row is m2, n2. Regions a1 and b1 correspond to neurons m1 and n1 respectively, and d1 and e1 correspond to m2 and n2 respectively. Each spiking neuron 401 in each second spiking neuron set 400 corresponds to one pixel.
[0070] It's important to note that in spiking neural networks, the encoding coefficients are different for each processing channel. The connection weights of spiking neurons in the output layer that connect to the same image region are not shared. This can also be understood as the connection weights between spiking neurons in the input layer encoding an image region and those in the output layer connecting to the same image region not being shared. This is because in competitive learning, spiking neurons connected to the same image region compete with each other. Only the spiking neuron most sensitive to a certain feature of the input image can output a spiking signal, suppressing other spiking neurons. In other words, only the winner of the competition can be used as the final classification prediction.
[0071] It should also be noted that, in the spiking neural network, adjacent image regions among the regions into which a processing channel divides the target image may or may not overlap. For example, as shown in Figure 7, in the first processing channel 100, adjacent image regions among the nine first-size image regions of the target image overlap; in the second processing channel 300, adjacent image regions among the four second-size image regions of the target image do not overlap.
[0072] In addition, the spiking neural network includes multiple processing channels. In each processing channel, the division of the target image can be different. That is, in each processing channel, the size of each image region after the target image is divided is different, i.e., the first size and the second size mentioned above are different.
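The per-channel division of the target image, with and without overlap, can be sketched as follows. This is only an illustration under assumed conditions: a square image, square regions, and evenly spaced region corners. The function name `split_regions`, the 10 x 10 image, and the region sizes are hypothetical.

```python
def split_regions(image, grid, size):
    """Split a square image into grid x grid regions of size x size pixels.

    Adjacent regions overlap when grid * size exceeds the image side and
    tile the image exactly when grid * size equals it.
    Returns the regions in row-major order.
    """
    side = len(image)
    # Evenly spaced top-left corners; a stride smaller than size gives overlap
    stride = 0 if grid == 1 else (side - size) // (grid - 1)
    regions = []
    for r in range(grid):
        for c in range(grid):
            top, left = r * stride, c * stride
            regions.append([row[left:left + size] for row in image[top:top + size]])
    return regions

# Hypothetical 10 x 10 image whose pixel value encodes its position
image = [[r * 10 + c for c in range(10)] for r in range(10)]
first = split_regions(image, grid=3, size=4)    # 9 regions, stride 3, overlapping
second = split_regions(image, grid=2, size=5)   # 4 regions, stride 5, no overlap
```

Here the first channel uses 9 regions of a first size and the second channel uses 4 regions of a different, second size, mirroring the M = 9 and Q = 4 example.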
[0073] In one possible implementation, in the first processing channel 100, the output signals of the N spiking neurons 201 used to receive pulse signals from the same image region are the pulse output signals of one or more of the N spiking neurons 201 that receive pulse signals from the same image region. That is, the N spiking neurons 201 used to receive pulse signals from the same image region can only output pulse signals if they receive a sensitive portion of the target image.
[0074] In one possible implementation, in the first processing channel 100, the N spiking neurons 201 for receiving pulse signals from the same image region include multiple excitatory spiking neurons and one or more inhibitory spiking neurons. The one or more inhibitory spiking neurons are used to suppress the output pulse signals of the multiple excitatory spiking neurons. The output signals of the N spiking neurons 201 receiving pulse signals from the same image region include: the pulse output signals of one or more excitatory spiking neurons among the multiple excitatory spiking neurons under the inhibition of the one or more inhibitory spiking neurons.
[0075] In this embodiment, the excitatory spiking neurons and inhibitory spiking neurons in the first output layer 102 can form two layers of spiking neurons, in which case the number of excitatory spiking neurons used to receive pulse signals from the same image region is the same as the number of inhibitory spiking neurons. The first set of pulse signals output from the first input layer 101 is input to the excitatory spiking neurons of the first output layer 102. An excitatory spiking neuron receives the pulse signals input from the first input layer 101 and may output pulse signals. When it outputs a pulse signal, that signal can be input to an inhibitory spiking neuron. The output pulse signal of the inhibitory spiking neuron can then be input to the other excitatory spiking neurons that receive pulse signals from the same image region, excluding the excitatory spiking neuron that triggered it. Under the inhibition of the multiple inhibitory spiking neurons, the output signal of the N spiking neurons 201 that receive pulse signals from the same image region is the pulse output signal of one or more of the multiple excitatory spiking neurons. This pulse signal is output based on the pulse signals received from the same image region.
[0076] For example, as shown in Figure 8, the number of excitatory spiking neurons used to receive pulse signals from the same image region is the same as the number of inhibitory spiking neurons, both being N / 2. The output of excitatory spiking neuron A can be input to inhibitory spiking neuron B, and the output of inhibitory spiking neuron B can be input to the excitatory spiking neurons, other than excitatory spiking neuron A, that receive pulse signals from the same image region. In Figure 8, a dashed line indicates an inhibitory spiking neuron sending an inhibitory pulse signal to an excitatory spiking neuron, and a solid line indicates an excitatory spiking neuron sending an excitatory pulse signal to an inhibitory spiking neuron.
[0077] It should be noted here that if a spiking neuron that receives the pulse signal output by a given spiking neuron becomes more likely to trigger an output pulse signal, the given spiking neuron is an excitatory spiking neuron. Conversely, if receiving the pulse signal output by a given spiking neuron inhibits the receiving neuron from triggering an output pulse signal, the given spiking neuron is an inhibitory spiking neuron. An excitatory spiking neuron that receives pulse signals from an image region transmits its pulse signal to a connected inhibitory spiking neuron, which is thereby triggered to transmit an inhibitory pulse signal to the other excitatory spiking neurons and so inhibits them. Here, both the excitatory spiking neuron and the other excitatory spiking neurons receive pulse signals from the same image region.
[0078] It should also be noted that even though excitatory spiking neurons can output multiple pulse signals, generally only the first pulse signal is output to inhibitory spiking neurons, while the remaining pulse signals are directly output to the decoding layer 500 for final recognition of the target image.
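The competition between excitatory spiking neurons that serve the same image region can be illustrated with the following toy simulation. It is a sketch under simplifying assumptions, not the neuron model of this application: the input is a constant drive per time step, the voltage does not leak, and every output pulse triggers inhibition (whereas generally only the first pulse is sent to the inhibitory neuron). The function name `lateral_inhibition` and all parameter values are hypothetical.

```python
def lateral_inhibition(drive, threshold, inhibition, steps):
    """Simulate paired excitatory/inhibitory neurons for one image region.

    drive[k] is a constant input added to excitatory neuron k at every step.
    When excitatory neuron k fires, its paired inhibitory neuron fires and
    lowers the voltage of every excitatory neuron except k.
    Returns the number of pulses each excitatory neuron emitted.
    """
    n = len(drive)
    voltage = [0.0] * n
    counts = [0] * n
    for _ in range(steps):
        for k in range(n):
            voltage[k] += drive[k]
        fired = [k for k in range(n) if voltage[k] >= threshold]
        for k in fired:
            counts[k] += 1
            voltage[k] = 0.0                      # reset after the output pulse
            for other in range(n):
                if other != k:
                    voltage[other] -= inhibition  # inhibitory pulse from k's partner
    return counts

# Hypothetical example: neuron 0 receives the strongest drive
counts = lateral_inhibition(drive=[1.0, 0.6, 0.5], threshold=3.0, inhibition=2.0, steps=12)
```

In this example the most strongly driven neuron reaches the threshold first and repeatedly pushes the others back below it, so only that neuron emits pulses.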
[0079] In one possible implementation, the first output layer 102 is further configured to obtain recognition results for different parts of an image region in the first set of image data based on the output signals of every K spiking neurons 201 among the N spiking neurons 201 that receive pulse signals from the same image region, where N is an integer multiple of K. This is equivalent to dividing an image region into N / K sub-image regions: the output signals of each group of K spiking neurons 201 yield the recognition result for one sub-image region, and different groups yield recognition results for different sub-image regions. For example, as shown in Figure 9, the 100 spiking neurons 201 that receive pulse signals from the same image region are divided into 4 groups of K spiking neurons 201, where K equals 25. The two dashed lines in Figure 9 divide the 100 spiking neurons 201 into the 4 groups. Because each group of K spiking neurons 201 corresponds to a smaller region of the target image, more detailed features can be extracted, making it easier to obtain accurate recognition results.
[0080] In one possible implementation, the first spiking neuron set 200 includes a first spiking neuron 201, which outputs a pulse signal when its membrane voltage is greater than or equal to a membrane voltage threshold. The membrane voltage of the first spiking neuron 201 is determined based on the received neurotransmitter and the received pulse signal, and the membrane voltage threshold is the minimum voltage required for the output pulse signal. If the first spiking neuron 201 is an excitatory spiking neuron, the neurotransmitter and pulse signal received by the first spiking neuron 201 can be from spiking neurons 1011 in the first input layer 101 or from inhibitory spiking neurons; if the first spiking neuron 201 is an inhibitory spiking neuron, the neurotransmitter and pulse signal received by the first spiking neuron 201 are from spiking neurons 1011 in the first input layer 101.
[0081] It should be noted that the above description refers to the principle of the first processing channel 100. The principle of the second processing channel 300 is the same as that of the first processing channel 100, and will not be repeated here.
[0082] In one possible implementation, this application embodiment also provides a response model of the spiking neuron 201 in the first output layer 102 of the first processing channel 100 in the spiking neural network:
[0083] Based on the Spike Response Model (SRM) rules, the spiking neuron 201 of the first output layer 102 of the first processing channel in this embodiment is a spiking neuron model with time-segment memory. For each spiking neuron 201, the spiking neuron 201 acquires the synaptic current or conductance value at the end of the pulse response phase within the interval between two output pulse signals. This current or conductance value has inhibitory or excitatory effects, with excitation corresponding to triggering excitation. The interval between two output pulse signals is one pulse response phase. By accumulating or canceling all neurotransmitters from different synapses within one pulse response phase, the neurotransmitter category of the ultimately dominant neurotransmitter is determined. The spiking neuron 201 retains category memory for this neurotransmitter category, and in the next pulse response phase, i.e., the time interval between the current output pulse signal and the next output pulse signal, when it receives a pulse signal of the same type, it can respond more quickly or strengthen inhibition.
[0084] First, the response formula of the spiking neuron 201 is defined as follows:
[0085]
[0086] In equation (1), $s = t - t_{i/f}$, where $t_{i/f}$ indicates that time $t_i$ precedes time $t_f$; $v(t)$ is the membrane voltage of spiking neuron 201 at time $t$.
[0087] $\Delta v_{fit}$ is the adaptation voltage. For a single spiking neuron 201, it represents the difference between the membrane voltage response to the pulse signals received before the output pulse signal and the membrane voltage response after the spiking neuron 201 outputs a pulse.
[0088] $t_i$ is the time at which spiking neuron 201 receives a pulse signal from another spiking neuron 201. $t_f$ is the time at which spiking neuron 201 outputs a pulse signal after multiple excitations. $\varepsilon(t)$ is the function describing the change in neurotransmitter transmitted at a single synapse, i.e., the synaptic model. $\tau$ is the decay time constant of the spiking neuron 201 itself. $V_P$ is defined as the membrane voltage equilibrium value at time $t_i$, with $V_P = V_e + V_i$, where $V_e$ is the positive equilibrium potential, which can be predefined, for example as a value between 0 and 100 millivolts, and $V_i$ is the membrane voltage of spiking neuron 201 at the time $t_i$ when the pulse signal is received.
[0089] Definition of the synaptic model:
[0090] In equation (2), $\varepsilon^{(i)}$ represents the neurotransmitter, of which there are two types, inhibitory and excitatory, and $i$ denotes synapse $i$. $C_{ni}$ is a random number within a certain range, from 0 to 1, representing the proportion of the neurotransmitter that spiking neuron 201 chooses to receive when it arrives. $\tau$ is the decay time constant of the neurotransmitter.
[0091] In equation (2), the exponential decay term indicates that the amount of neurotransmitter transmitted is related to the time at which the spiking neuron 201 receives the pulse. The increase in neurotransmitter is largest at the moment of receipt and then decays over time, simulating the diffusion behavior of neurotransmitters within the spiking neuron 201. In this embodiment, the spiking neuron 201 can be a conductance-type spiking neuron 201. In a conductance-type spiking neuron 201, the neurotransmitter category is specifically represented by the transmitted conductance, and the conductance value is non-negative.
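The behavior attributed to the synaptic model (largest at the moment of receipt, then decaying over time, scaled by a random received proportion) can be sketched as below. The exact form of equation (2) is not reproduced in the text, so the expression used here, a plain exponential decay scaled by the received proportion, is an assumption, as are the function name `neurotransmitter` and the example values.

```python
import math
import random

def neurotransmitter(t, t_arrival, c_ni, tau):
    """Amount of neurotransmitter at time t from a pulse that arrived at t_arrival.

    Assumed form: c_ni * exp(-(t - t_arrival) / tau), and zero before arrival.
    c_ni is the received proportion, tau the neurotransmitter decay constant.
    """
    if t < t_arrival:
        return 0.0
    return c_ni * math.exp(-(t - t_arrival) / tau)

rng = random.Random(0)       # fixed seed so the sketch is repeatable
c_ni = rng.random()          # random received proportion in [0, 1)
peak = neurotransmitter(5.0, 5.0, c_ni, tau=10.0)    # value at the moment of receipt
later = neurotransmitter(15.0, 5.0, c_ni, tau=10.0)  # value one decay constant later
```

Under this assumed form the value stays non-negative, which is consistent with the statement that the conductance value is non-negative.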
[0092] In this embodiment, since a single synapse may transmit multiple pulses during the response of a single spiking neuron 201, which has a temporal potential accumulation effect on the spiking neuron 201, for synapses with more than one output pulse signal, the arrival time of other pulse signals of a single synapse besides the first pulse signal can be processed into pulse signals transmitted by a virtual synapse, while retaining the timestamp of the pulse signal transmitted by the original synapse. This makes the synapse and the pulse signal have a one-to-one correspondence, and the spiking neuron 201 and the pulse signal have a one-to-many relationship.
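The virtual-synapse treatment described above can be illustrated with a short sketch. The data layout, the function name `expand_virtual_synapses`, and the labeling scheme for virtual synapses are hypothetical and serve only to show the one-to-one correspondence between synapses and pulse signals.

```python
def expand_virtual_synapses(spikes_by_synapse):
    """Give every pulse its own (possibly virtual) synapse.

    spikes_by_synapse: dict mapping a synapse id to its pulse arrival times.
    Returns (synapse_label, timestamp) pairs sorted by time, one per pulse.
    The first pulse keeps the original id; later pulses get a virtual label.
    """
    events = []
    for synapse_id, times in spikes_by_synapse.items():
        for k, timestamp in enumerate(sorted(times)):
            label = synapse_id if k == 0 else f"{synapse_id}#v{k}"
            events.append((label, timestamp))   # the original timestamp is kept
    return sorted(events, key=lambda event: event[1])

# Hypothetical example: synapse s1 delivers two pulses, s2 delivers one
events = expand_virtual_synapses({"s1": [1.0, 4.0], "s2": [2.5]})
```

After the expansion, each label appears exactly once, while the receiving neuron still sees all three pulses at their original times.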
[0093] The following describes the overall response of the spiking neuron 201:
[0094] The overall response process of spiking neuron 201 can include the following aspects:
[0095] (1) When the spiking neuron 201 receives the pulse signal transmitted by the synapse, it updates the amount of neurotransmitter that has arrived. As pulses transmitted by different synapses continue to arrive, the neurotransmitter received by the spiking neuron 201 undergoes numerical iteration. After the transmitted pulse signal arrives, the neurotransmitter that arrives in the spiking neuron 201 begins to decay from this point in time.
[0096] (2) In the spiking neural network, the connection weights between excitatory and inhibitory spiking neurons in the first output layer 102, i.e., the synaptic weights, represent the probability of receiving a pulse signal, i.e., a random proportion. When a pulse signal arrives, the spiking neurons 201 in the first output layer 102 receive a certain amount of neurotransmitter according to this random proportion. The arrival of the pulse signal causes a change in the membrane voltage of the spiking neuron 201. This change is continuous, and its magnitude is the product of the membrane voltage equilibrium value $V_P$ and the neurotransmitter amount $\varepsilon(t)$. As with neurotransmitter decay, the membrane voltage of the spiking neuron 201 begins to decay from this time point, and each pulse signal causes an incremental change in the membrane voltage.
[0097] (3) When the cumulative change in membrane voltage of spiking neuron 201 reaches the membrane voltage threshold, spiking neuron 201 outputs a pulse signal and transmits neurotransmitters to one or more spiking neurons 201 via synapses. Specifically, for an excitatory spiking neuron receiving a pulse signal from the input layer, it outputs a pulse signal by transmitting neurotransmitters to an inhibitory spiking neuron via synapses. For an inhibitory spiking neuron receiving a pulse signal from an excitatory spiking neuron, it outputs a pulse signal by transmitting neurotransmitters to multiple excitatory spiking neurons (excluding the excitatory spiking neuron that outputs the pulse signal to the inhibitory spiking neuron) via synapses. For an excitatory spiking neuron 201 outputting a pulse signal to the decoding layer, it transmits the signal to multiple spiking neurons 501 in the decoding layer via synapses. After outputting a pulse signal, spiking neuron 201 discharges itself, and its voltage continuously decreases to the initial resting voltage of the next pulse response phase. This initial resting voltage is determined by the neurotransmitter class that dominates after the continuous accumulation and updating of neurotransmitters from the previous pulse response phase.
[0098] Based on (1), (2) and (3) above, the random reception of inputs, the neurotransmitter update, and the potential response of the spiking neuron 201 are matched to and consistent with the frequency coding and the adjustment by the STDP rule, which maintains the overall balance of the spiking neural network.
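Aspects (1) to (3) can be summarized in a toy model such as the one below. It is a sketch under stated assumptions and does not implement the response formula of this application: each pulse adds an increment equal to a random proportion times a fixed equilibrium value, every increment decays exponentially from its arrival time, and firing simply clears the accumulated increments. The class name `SketchNeuron` and all parameter values are hypothetical.

```python
import math
import random

class SketchNeuron:
    """Toy neuron that accumulates decaying increments and fires at a threshold."""

    def __init__(self, threshold, v_rest, v_p, tau, seed=0):
        self.threshold, self.v_rest, self.v_p, self.tau = threshold, v_rest, v_p, tau
        self.rng = random.Random(seed)
        self.increments = []     # (arrival time, voltage increment) per received pulse
        self.fire_times = []

    def voltage(self, t):
        # Every increment decays exponentially from its own arrival time
        decayed = sum(dv * math.exp(-(t - t0) / self.tau) for t0, dv in self.increments)
        return self.v_rest + decayed

    def receive(self, t, excitatory=True):
        proportion = self.rng.random()            # random received proportion
        sign = 1.0 if excitatory else -1.0        # inhibitory input is subtracted
        self.increments.append((t, sign * proportion * self.v_p))
        if self.voltage(t) >= self.threshold:     # threshold reached: output a pulse
            self.fire_times.append(t)
            self.increments.clear()               # discharge back to the resting voltage
            return True
        return False

# Hypothetical usage: four excitatory pulses arriving one time unit apart
neuron = SketchNeuron(threshold=1.0, v_rest=0.0, v_p=1.0, tau=10.0)
fired = [neuron.receive(t) for t in (0.0, 1.0, 2.0, 3.0)]
```

Pulses that arrive close together add up before they decay, so the threshold is reached sooner than when the same pulses are spread out in time.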
[0099] The following describes the dynamic equations governing the specific changes in membrane voltage of spiking neuron 201:
[0100] For each pulse response phase, the membrane voltage accumulation phase is as follows:
[0101] During the phase from 0 to t1 (0 < t < t1), the discharge flag f equals 0.
[0102] The membrane voltage of spiking neuron 201 is:
[0103] In equation (3), before the spiking neuron 201 receives a pulse signal, its voltage is the resting voltage $v_r$, with $\Delta v_{fit} = v_r$, neurotransmitter reception $\varepsilon(t) = 0$, and $s = 0$.
[0104] During the phase from t1 to t2, i.e., t1 ≤ t < t2, a pulse signal is received at time t1, the discharge flag f equals 0, and the membrane voltage of the spiking neuron 201 is:
[0105]
[0106] In equation (4), $\Delta v_{fit1} = V_{p1} - v_r$, where $V_{p1}$ is the membrane voltage equilibrium value at time t1.
[0107] During the phase from t2 to t3, i.e., t2 ≤ t < t3, a pulse signal is received again at time t2, the discharge flag f equals 0, and the membrane voltage of the spiking neuron 201 is:
[0108]
[0109] In equation (5), $\Delta v_{fit2} = V_{p2} - v_r$.
[0110] The above description of the process from time 0 to t3 covers the accumulation of membrane voltage in spiking neuron 201 after it receives pulse signals and before it outputs a pulse signal. The $\varepsilon^{(0)}$ above was originally defined as the change in neurotransmitter transmitted at a single synapse; here $\varepsilon^{(0)}$ and $\varepsilon^{(1)}$ denote accumulated neurotransmitter, meaning that the spiking neuron 201 may receive neurotransmitters from different synapses at the same time (that is, more than one of the synapses connected to the same spiking neuron 201 is allowed to release a pulse signal at the same moment). At each of the multiple moments before an output pulse signal is triggered, neurotransmitters from more than one synapse may be received.
[0111] Based on the descriptions of equations (3) to (5) above, when the spiking neuron 201 receives a pulse signal for the i-th time ($t_i \le t < t_f$, where $t_i$ is the time of receiving the pulse signal and $t_f$ is the time of the output pulse signal), the membrane voltage of the spiking neuron 201 can be expressed as:
[0112]
[0113] In equation (6),
[0114] The membrane voltage accumulation process of spiking neuron 201 is in effect the accumulation of multiple decaying potential increments. Owing to the adjustment effect of the adaptation voltage $\Delta v_{fit}$, the potential increment of spiking neuron 201 is large when it receives a pulse signal. When pulse signals arrive close together in time, $\Delta v_{fit}$ increases the potential increment, so a larger increment base is obtained, which helps the spiking neuron 201 to be excited quickly and output a pulse signal. When pulse signals arrive at dispersed time points, the membrane voltage of the spiking neuron 201 has already decayed for some time and its value is smaller; after adjustment by the adaptation voltage, the potential increment base is small, making it difficult to trigger an output pulse signal.
[0115] When the accumulated membrane voltage v of spiking neuron 201 is greater than or equal to the membrane voltage threshold v0, spiking neuron 201 outputs a pulse signal, and the discharge flag f = 1. For $t \ge t_f$, f = 1, and the membrane voltage of spiking neuron 201 is:
[0116]
[0117] In equation (7), $\Delta v_{fit} = v_r + \Delta v$ and $\varepsilon(t) = 1$. $\varepsilon(t) = 1$ because, after spiking neuron 201 outputs a pulse signal, the neurotransmitter in spiking neuron 201 neither increases nor decreases while the neuron recovers, hence the value is 1. $\eta(\cdot)$ represents the refractory period of spiking neuron 201 after it outputs a pulse signal at time $t_f$.
[0118] After outputting a pulse signal, spiking neuron 201 enters a refractory period. The refractory period is the time during which an organism, after responding to a stimulus, will not respond again even if further stimulation is given; for example, the refractory period may be 2 milliseconds. Based on the cumulative results of the different neurotransmitter categories previously received by spiking neuron 201, the dominant neurotransmitter category is determined, which confirms the memory category of spiking neuron 201 for this pulse response phase. Based on this memory category, the starting voltage for the next pulse response phase is adjusted. The incremental potential of the membrane voltage of spiking neuron 201 is consistent with the memory category: when the memory category is excitatory, the incremental potential is positive; conversely, when the memory category is inhibitory, the incremental potential is negative. The incremental potential is the product of an adjustment coefficient α and a potential difference, where α is, for example, 10%, and the potential difference is the difference between the resting voltage and the membrane voltage threshold. This incremental potential can be expressed by the formula:
[0119]
[0120] In equation (8), α is an adjustment coefficient with a value in [0, 1], $v_0$ is the membrane voltage threshold, $v_r$ is the resting voltage, $W$ is the number of pulse signals received by the spiking neuron 201, $I$ is the number of excitatory pulse signals among the $W$ received pulse signals, and $W - I$ is the number of inhibitory pulse signals among them. Equation (8) distinguishes excitatory neurotransmitter accumulation from inhibitory neurotransmitter accumulation. For a single spiking neuron 201, neurotransmitter accumulation is either excitatory or inhibitory, so the accumulated total generally does not equal 0.
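The rule described above for the incremental potential can be sketched as follows. Since equation (8) itself is not reproduced in the text, this is only an interpretation: the magnitude is taken as α times the gap between the threshold and the resting voltage, and the sign follows whichever neurotransmitter category dominates. The function name `incremental_potential`, the argument layout, and the example voltages are hypothetical.

```python
def incremental_potential(excitatory_amounts, inhibitory_amounts, alpha, v_threshold, v_rest):
    """Signed adjustment to the starting voltage of the next pulse response phase."""
    # Assumed magnitude: adjustment coefficient times (threshold - resting voltage)
    magnitude = alpha * (v_threshold - v_rest)
    excitatory_total = sum(excitatory_amounts)   # from the I excitatory pulses
    inhibitory_total = sum(inhibitory_amounts)   # from the W - I inhibitory pulses
    if excitatory_total > inhibitory_total:
        return magnitude      # excitatory memory: positive increment
    if excitatory_total < inhibitory_total:
        return -magnitude     # inhibitory memory: negative increment
    return 0.0                # tie, which the text says generally does not occur

# Hypothetical example with alpha = 10%, threshold -50 mV, resting voltage -65 mV
dv = incremental_potential([0.4, 0.3], [0.2], alpha=0.1, v_threshold=-50.0, v_rest=-65.0)
```

With excitatory input dominating, the next pulse response phase starts closer to the threshold, so the neuron responds more quickly to input of the same type.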
[0121] Subsequently, whenever the membrane voltage v of the spiking neuron 201 is greater than or equal to v0, the spiking neuron 201 again outputs a pulse signal, and f is incremented (f = f + 1).
[0122] It should be noted here that in the pulse response process of spiking neuron 201, the process is cumulative regardless of the type of neurotransmitter. For excitatory neurotransmitters, the amount is added directly; for inhibitory neurotransmitters, its negative is added. That is, the processing of inhibitory neurotransmitters is in effect a subtraction.
[0123] In addition, to better explain the embodiments of this application, the response flow of the spiking neuron 201 is also provided. For example, as shown in Figure 10, at the initial moment, in step 1001, the spiking neuron 201 neither outputs nor receives a pulse signal. f is the identifier of the output pulse signal, and i is the number of pulse signals received by the spiking neuron 201. At this time, f = 0, i = 0, and the membrane voltage of the spiking neuron 201 is $v = v_r$.
[0124] Step 1002: The spiking neuron 201 detects whether it has received a spiking signal.
[0125] Step 1003: If the spiking neuron 201 receives a pulse signal, then determine whether the current membrane voltage of the spiking neuron 201 is less than the membrane voltage threshold v0.
[0126] Step 1004: If the current membrane voltage is less than the membrane voltage threshold v0, the time at which the pulse signal is received is defined as t_i, and i = i + 1. The membrane voltage balance when the spiking neuron 201 receives the pulse signal is V_pi = v_e + v_i, and at this time Δv_fit(i) = V_pi − v_r. Here ε(i) denotes the neurotransmitter that transmits the pulse signal, which is divided into two types: inhibitory and excitatory. C_ni represents the proportion of the neurotransmitter at a single presynaptic terminal that is received by spiking neuron 201, i.e., the excitation effect of a single pulse. The magnitude of C_ni is a random number between 0 and 1: when the neurotransmitter is inhibitory, C_ni is a negative random number, and when the neurotransmitter is excitatory, C_ni is a positive random number. Therefore, an inhibitory pulse signal is in effect a subtraction term in the calculation.
[0127] Step 1005: The spiking neuron 201 continues to receive pulse signals, causing the membrane voltage to increase nonlinearly. The membrane voltage of the spiking neuron 201 is updated according to equation (6), and the process returns to determine whether the current membrane voltage of the spiking neuron 201 is less than the membrane voltage threshold v0.
[0128] Step 1006: If the current membrane voltage is greater than or equal to the membrane voltage threshold v0, the spiking neuron 201 outputs a pulse signal at time t_f, the pulse signal output identifier is updated as f = f + 1, and ε(t) = 1.
[0129] Step 1007: The spiking neuron 201 then evaluates the condition t_f + t_ref < t to determine whether the current time t is still within the refractory period, where t_ref is the refractory period and t is the current time.
[0130] Step 1008: If t_f + t_ref < t holds, the spiking neuron 201 determines the incremental potential; if t_f + t_ref < t does not hold, the spiking neuron 201 waits until t_f + t_ref < t holds.
[0131] Step 1009: Based on the incremental potential, the spiking neuron 201 determines the starting voltage of the next pulse response phase.
[0132] Step 1010: The spiking neuron 201 continues to determine whether a pulse signal has been received. If no pulse signal is received, the process returns to the step of determining the starting voltage of the next pulse response phase of the spiking neuron 201 (step 1009). If a pulse signal is received, the process returns to determining whether the membrane voltage of the spiking neuron 201 is less than the membrane voltage threshold, i.e., it returns to step 1003.
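Steps 1001 to 1010 can be sketched as a small event-driven Python class. This is a simplified illustration, not the update rule of this application: the membrane update of equation (6) is replaced by a plain additive placeholder, inputs arriving during the refractory period are assumed to be ignored, and the class name, field names, and numeric values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpikingNeuronSketch:
    v0: float = 1.0      # membrane voltage threshold (illustrative value)
    v_r: float = 0.0     # resting voltage (illustrative value)
    t_ref: float = 2.0   # refractory period, e.g. 2 ms
    alpha: float = 0.1   # adjustment coefficient

    def __post_init__(self):
        self.v = self.v_r    # step 1001: v = v_r
        self.f = 0           # number of output pulses so far
        self.i = 0           # number of accepted input pulses so far
        self.t_f = None      # time of the most recent output pulse
        self.excitatory = 0  # excitatory inputs in the current phase
        self.inhibitory = 0  # inhibitory inputs in the current phase
        self.waiting = False  # True between firing and step 1009

    def _restart_if_due(self, t):
        # Steps 1007-1009: once t_f + t_ref < t, set the starting voltage.
        if self.waiting and self.t_f + self.t_ref < t:
            sign = (self.excitatory > self.inhibitory) - \
                   (self.excitatory < self.inhibitory)
            self.v = self.v_r + sign * self.alpha * abs(self.v0 - self.v_r)
            self.excitatory = self.inhibitory = 0
            self.waiting = False

    def receive(self, t, c):
        """Handle one input pulse with signed contribution c at time t.

        Returns True if the neuron outputs a pulse in response.
        """
        self._restart_if_due(t)
        if self.waiting:
            return False            # assumed: refractory inputs are ignored
        self.i += 1                 # step 1004: i = i + 1
        if c >= 0:
            self.excitatory += 1
        else:
            self.inhibitory += 1
        self.v += c                 # placeholder for equation (6)
        if self.v >= self.v0:       # step 1006: fire
            self.f += 1
            self.t_f = t
            self.waiting = True
            return True
        return False


neuron = SpikingNeuronSketch()
inputs = [(0.0, 0.6), (1.0, 0.5), (2.0, 0.9), (4.0, 0.3)]
fired = [neuron.receive(t, c) for t, c in inputs]
```

In this run the neuron fires on the second input, ignores the third (still refractory), and starts the next phase slightly above the resting voltage because the preceding phase was dominated by excitatory inputs.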
[0133] Based on the above description, it can be seen that the membrane voltage of the spiking neuron 201 undergoes a three-layer multiply-accumulate iterative cycle. The first layer is the update of the neurotransmitters ∑ε(i) that arrive simultaneously when a pulse signal is received; the second layer is the update of the membrane voltage response after the pulse signal arrives; and the third layer is the update of the membrane voltage response after the spiking neuron 201 outputs a pulse. The discharge flag f is incremented after the spiking neuron 201 triggers an output pulse: for a single spiking neuron 201, the value of f increases by 1 for each pulse signal output. Thus, the sum of f over all spiking neurons 201 represents the total number of pulses output by all spiking neurons 201 in the spiking neural network.
[0134] In this embodiment, during the response process of the spiking neuron 201 in the first processing channel 100, after the spiking neuron 201 outputs a pulse signal, the starting voltage of the next pulse response stage is determined based on the neurotransmitter category received by the spiking neuron 201 before the output pulse signal. Therefore, in the next pulse response stage, that is, within the time interval between the current output pulse signal and the next output pulse signal, when a pulse signal of the same type is received, the response can be faster or the suppression can be strengthened, thereby improving the overall classification accuracy and convergence speed.
[0135] It should be noted that the above description is of the response process of the spiking neuron 201 in the first processing channel 100. For each processing channel, the response process of the spiking neuron is the same as described above, and will not be repeated here.
[0136] In addition, the process of training a spiking neural network is also provided in the embodiments of this application:
[0137] The device used to train a spiking neural network can be called an image recognition device. The image recognition device acquires a set of sample images, which are unlabeled. The initial spiking neural network only establishes the framework, without connection weights; the connection weights between the output layer and the decoding layer of each processing channel are 0. Frequency encoding is performed on the images in the sample image set at the input layer of the spiking neural network to obtain the pulse signals of the sample images in each processing channel. Then, the image recognition device uses an unsupervised learning algorithm and the encoded pulse signals to determine the connection weights between the input and output layers of each processing channel in the spiking neural network. The unsupervised learning algorithm can be any type, such as STDP. Finally, the connection weights between the input and output layers of each processing channel are updated in the initial spiking neural network.
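Since STDP is named as one admissible unsupervised learning algorithm, a generic pair-based STDP update is sketched below in Python. This is the textbook exponential-window form, not necessarily the variant used in this application; the function names, the amplitudes, the time constants, and the weight bounds are illustrative assumptions.

```python
import math


def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012,
               tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair (generic pair-based STDP).

    ASSUMPTION: amplitudes and time constants are illustrative values.
    """
    dt = t_post - t_pre
    if dt > 0:   # pre fires before post: potentiation
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:   # post fires before pre: depression
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


def update_weight(w, t_pre, t_post, w_max=1.0):
    """Apply the STDP change and keep the weight within [0, w_max]."""
    return min(w_max, max(0.0, w + stdp_delta(t_pre, t_post)))


# An input-layer spike at 10 ms followed by an output-layer spike at 15 ms
# strengthens the input-to-output connection.
w_new = update_weight(0.5, t_pre=10.0, t_post=15.0)
```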
[0138] The image recognition device then determines the response intensity of each spiking neuron in the decoding layer to various image types. Based on this response intensity, it determines the connection weights between the output layer of each processing channel and the decoding layer of the spiking neural network. Thus, the trained spiking neural network is a spiking neural network with added connection weights between the input and output layers of each processing channel, as well as between the output and decoding layers. This spiking neural network can be used to identify image categories.
[0139] In this way, since the connection weights between the decoding layer and the output layers of each processing channel are fixed at 0 during training, the decoding layer is not included in the training process of the spiking neural network, which improves the training speed. This also reduces the number of parameters generated during training and the storage needed for them, and minimizes information loss during decoding. Additionally, since the spiking neural network includes multiple processing channels (denoted H, where H is greater than 1), the T sample images required without channel division are reduced to T / H sample images after channel division, further improving training speed.
[0140] Furthermore, during the training of a spiking neural network, the various subnetworks within the network are trained in parallel. Each processing channel can be considered a subnetwork. Multiple spiking neurons in a single processing channel that process a single image region can be called a subnetwork, and multiple spiking neurons in a single processing channel that process a portion of an image region can also be called a subnetwork. This division of the spiking neural network into multiple subnetworks for parallel and independent learning improves the learning speed compared to learning from a single network.
[0141] In one possible implementation, the coding coefficients can be adjusted during the training of the spiking neural network. Specifically, for the coding coefficients of any processing channel, the image recognition device determines the number of pulses received by the decoding layer. Based on this number, the coding coefficients are adjusted to make the number of pulses received by the decoding layer more appropriate. For example, when the number of pulses received by the decoding layer is less than a preset value, the coding coefficients are increased to increase the pulse frequency of the input pulse signal. When the number of pulses received by the decoding layer is greater than the preset value, the coding coefficients are decreased to decrease the pulse frequency of the input pulse. In this way, the number of pulses received by the decoding layer can be made more appropriate.
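The adjustment rule above can be sketched in Python as a simple feedback step. The text only states the direction of the adjustment, so the multiplicative step size and the function name are assumptions.

```python
def adjust_coding_coefficient(coeff, pulses_received, preset, step=0.05):
    """Nudge the coding coefficient toward the preset pulse count.

    ASSUMPTION: a fixed multiplicative step is used for illustration; the
    text specifies only whether the coefficient is increased or decreased.
    """
    if pulses_received < preset:
        return coeff * (1.0 + step)   # too few pulses: raise the input rate
    if pulses_received > preset:
        return coeff * (1.0 - step)   # too many pulses: lower the input rate
    return coeff                      # already at the preset value


coeff = 1.0
coeff = adjust_coding_coefficient(coeff, pulses_received=5, preset=10)
```

Repeating this step once per training pass would move the number of pulses received by the decoding layer toward the preset value, provided the pulse count grows with the coding coefficient.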
[0142] It should be noted that in a spiking neural network, the time step for encoding an image is the time interval between two consecutive encoded input pulse signals. This time step is the training time step of the spiking neural network for one input, excluding the delay. In other words, the spiking neural network has continuous input during computation, except for the delay time. The delay time is optional and not strictly defined; it is typically on the order of single-digit milliseconds.
[0143] In the embodiments of this application, as described above, Q is not equal to M. In some cases, Q can also be equal to M. When Q is equal to M, the first size is equal to the second size. Since the encoding information of different processing channels is different, the pulse signals corresponding to the target images input to the output layer of different processing channels are different.
[0144] In this application embodiment, an image recognition method is also provided, wherein the execution subject of the method may be a spiking neural network system. As shown in Figure 11, the process of this method is as follows:
[0145] Step 1101: The spiking neural network system receives the first set of image data and the second set of image data.
[0146] The first set of image data includes image data of M image regions of a first size obtained from the target image, and the second set of image data includes image data of Q image regions of a second size obtained from the target image. M and Q are both greater than or equal to 2, M is not equal to Q, and the first size is not equal to the second size.
[0147] In this embodiment, the spiking neural network system can acquire a first set of image data and a second set of image data of the target image. Here, the spiking neural network system directly receives the first set of image data and the second set of image data as input, or the spiking neural network system receives the target image, divides the target image into a first set of image data, and divides the target image into a second set of image data.
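The second option, in which the system divides the target image itself, can be sketched in Python as follows. Non-overlapping tiling, the 6 x 6 image, and the two region sizes are assumptions chosen so that M is not equal to Q; the text does not specify how the regions are laid out.

```python
def split_into_regions(image, region_h, region_w):
    """Split a 2D image (list of rows) into non-overlapping regions.

    ASSUMPTION: regions tile the image exactly; overlap or padding is not
    handled in this sketch.
    """
    rows, cols = len(image), len(image[0])
    if rows % region_h or cols % region_w:
        raise ValueError("region size must divide the image size")
    regions = []
    for top in range(0, rows, region_h):
        for left in range(0, cols, region_w):
            regions.append([row[left:left + region_w]
                            for row in image[top:top + region_h]])
    return regions


# A 6 x 6 test image whose pixel value encodes its position.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
first_set = split_into_regions(image, 3, 3)    # M = 4 regions of 3 x 3
second_set = split_into_regions(image, 2, 2)   # Q = 9 regions of 2 x 2
```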
[0148] Step 1102: The spiking neural network system obtains the first recognition result of the target image based on the first set of image data.
[0149] In this embodiment, the spiking neural network system can extract features from the first set of image data to obtain the first recognition result of the target image.
[0150] Step 1103: The spiking neural network system obtains the second recognition result of the target image based on the second set of image data.
[0151] In this embodiment, the spiking neural network system can extract features from the second set of image data to obtain a second recognition result of the target image.
[0152] Step 1104: The spiking neural network system obtains the final recognition result of the target image based on the first recognition result and the second recognition result.
[0153] In this embodiment, the spiking neural network system uses both the first and second recognition results to obtain the final recognition result of the target image.
[0154] In this way, the spiking neural network system can output multiple recognition results for the target image. When determining the final recognition result of the target image, it combines multiple recognition results rather than using only one, thus making the final recognition result more accurate. Moreover, since the spiking neural network system can output multiple recognition results for the target image, as long as at least one recognition result can be output, the final recognition result of the target image can be determined, thereby improving the anti-interference capability of the spiking neural network system.
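Steps 1101 to 1104 leave the combination rule open, so the following Python sketch uses one simple possibility: each recognition result is a mapping from category to score, the scores are summed, and a missing result is skipped. The representation of a recognition result, the use of `None` for a failed channel, and the summation rule are all assumptions made for illustration.

```python
def combine_results(results):
    """Combine per-channel recognition results into a final category.

    ASSUMPTION: each result is a dict {category: score}, or None if that
    channel produced no result; scores are summed across channels.
    """
    totals = {}
    for result in results:
        if result is None:
            continue  # a failed channel does not block the final result
        for label, score in result.items():
            totals[label] = totals.get(label, 0.0) + score
    if not totals:
        raise ValueError("no recognition result available")
    return max(totals, key=totals.get)


first_result = {"cat": 3, "dog": 5}
second_result = {"cat": 4, "dog": 1}
final = combine_results([first_result, second_result])
final_with_failure = combine_results([None, second_result])
```

The second call shows the anti-interference property described above: the final recognition result can still be determined when only one recognition result is available.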
[0155] In one possible implementation, the spiking neural network system includes a first input layer and a first output layer, wherein the first output layer includes N sets of first spiking neurons. The method further includes: multiple spiking neurons in the first input layer encode the first set of image data into a first set of pulse signals; the first output layer receives the first set of pulse signals, wherein different spiking neurons located in the same set of first spiking neurons are respectively used to receive pulse signals of different image regions in the first set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in an image region; the first output layer obtains a first recognition result of the target image based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the recognition result of one image region in the first set of image data is obtained based on the output signals of the N spiking neurons used to receive pulse signals of the same image region, and the N spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0156] The scheme shown in this application involves a spiking neural network system where the first input layer comprises multiple spiking neurons. Each spiking neuron uses the encoding information corresponding to the first set of image data to encode each pixel in the first set of image data into a pulse sequence, which is the first set of pulse signals for the target image. The first input layer outputs the first set of pulse signals to the first output layer. The first output layer comprises N sets of first spiking neurons. Different spiking neurons in the same set are used to receive pulse signals from different image regions in the first set of image data. Furthermore, one spiking neuron is used to receive the pulse signal corresponding to one pixel in one image region. The first output layer obtains the recognition result of one image region in the first set of image data based on the output signals of the N spiking neurons used to receive pulse signals from the same image region. The recognition results of the M image regions in the first set of image data constitute the first recognition result. The N spiking neurons used to receive pulse signals from the same image region are located in different sets of spiking neurons. Thus, by obtaining recognition results from different image regions, more detailed features of the target image can be extracted, resulting in a more accurate final recognition result for the target image.
[0157] In one possible implementation, the first output layer obtains the recognition result of different parts of an image region in the first set of image data based on the output signals of each of the K spiking neurons out of the N spiking neurons that receive the pulse signal of the same image region, where N is an integer multiple of K.
[0158] In the scheme shown in this application, the first output layer also obtains the recognition results of different parts of an image region in the first set of image data based on the output signals of every K spiking neurons among the N spiking neurons that receive pulse signals from the same image region. That is, it obtains the recognition results of different parts of the M image regions of the first size. In this way, since more detailed features in an image region can be obtained, the final recognition result of the target image can be more accurate.
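The requirement that N be an integer multiple of K can be illustrated with a short Python sketch that splits the N output signals for one image region into N / K groups, one per part of the region. Grouping consecutive neurons and summing spike counts per group are assumptions; the text does not state how the K neurons of a group are selected or scored.

```python
def group_outputs(outputs, k):
    """Split the N output signals of one image region into groups of K.

    ASSUMPTION: consecutive neurons form a group; N must be a multiple of K.
    """
    n = len(outputs)
    if k <= 0 or n % k:
        raise ValueError("N must be an integer multiple of K")
    return [outputs[start:start + k] for start in range(0, n, k)]


# Spike counts of N = 6 neurons that receive the same image region, K = 2.
spike_counts = [1, 0, 2, 3, 0, 1]
groups = group_outputs(spike_counts, 2)
part_scores = [sum(group) for group in groups]  # one score per region part
```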
[0159] In one possible implementation, the spiking neural network system further includes a second input layer and a second output layer, wherein the second output layer includes P sets of second spiking neurons. The method further includes: multiple spiking neurons in the second input layer encode the second set of image data into a second set of pulse signals; the second output layer receives the second set of pulse signals, wherein different spiking neurons located in the same set of second spiking neurons are respectively used to receive pulse signals of different image regions in the second set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in an image region; the second output layer obtains a second recognition result of the target image based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the recognition result of one image region in the second set of image data is obtained based on the output signals of the P spiking neurons used to receive pulse signals of the same image region, and the P spiking neurons used to receive pulse signals of the same image region are located in different sets of spiking neurons.
[0160] The scheme shown in this application involves a second input layer in a spiking neural network system comprising multiple spiking neurons. Each spiking neuron uses the encoding information corresponding to the second set of image data to encode each pixel in the second set of image data into a pulse sequence, which is the second set of pulse signals for the target image. The second input layer outputs the second set of pulse signals to the second output layer. The second output layer comprises P sets of second spiking neurons. Different spiking neurons in the same set are used to receive pulse signals from different image regions in the second set of image data. Furthermore, one spiking neuron is used to receive the pulse signal corresponding to one pixel in one image region. The second output layer obtains the recognition result of one image region in the second set of image data based on the output signals of the P spiking neurons used to receive pulse signals from the same image region. The recognition results of the Q image regions in the second set of image data constitute the second recognition result. The P spiking neurons used to receive pulse signals from the same image region are located in different sets of spiking neurons. Thus, by obtaining recognition results from different image regions, more detailed features of the target image can be extracted, resulting in a more accurate final recognition result for the target image.
[0161] In one possible implementation, the second output layer obtains the recognition results of different parts of an image region in the second set of image data based on the output signals of each of the K spiking neurons out of the P spiking neurons that receive the pulse signals of the same image region, where P is an integer multiple of K.
[0162] In the scheme shown in this application, the second output layer also obtains the recognition results of different parts of an image region in the second set of image data based on the output signals of every K spiking neurons among the P spiking neurons that receive pulse signals from the same image region. That is, it obtains the recognition results of different parts of the Q image regions of the second size. In this way, since more detailed features in an image region can be obtained, the final recognition result of the target image can be more accurate.
[0163] Optionally, the first recognition result and the second recognition result are obtained in parallel, that is, the process of the first output layer obtaining the first recognition result and the second output layer obtaining the second recognition result are performed in parallel.
[0164] In one possible implementation, in the first output layer, the N spiking neurons receiving pulse signals from the same image region include multiple excitatory spiking neurons and one or more inhibitory spiking neurons. The one or more inhibitory spiking neurons are used to suppress the output pulse signals of the multiple excitatory spiking neurons. Under the suppression of the one or more inhibitory spiking neurons, the first output layer obtains a first recognition result of the target image based on the pulse signals output by one or more of the multiple excitatory spiking neurons based on the received pulse signals from the same image region.
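The suppression of excitatory spiking neurons by inhibitory spiking neurons can be sketched in Python as a simple lateral-inhibition rule. The text does not give the wiring or the strength of the inhibition, so the model below is an assumption: the inhibitory neurons relay the summed drive of the other excitatory neurons back to each excitatory neuron as a subtractive term. The function name and all numeric values are hypothetical.

```python
def excitatory_outputs_under_inhibition(drives, inhibition_strength,
                                        threshold):
    """Return 1 for each excitatory neuron that still fires, else 0.

    ASSUMPTION: each excitatory neuron is inhibited in proportion to the
    total drive of the other excitatory neurons receiving the same region.
    """
    total = sum(drives)
    outputs = []
    for drive in drives:
        inhibition = inhibition_strength * (total - drive)
        outputs.append(1 if drive - inhibition >= threshold else 0)
    return outputs


# Three excitatory neurons receive the same image region; the most
# strongly driven one suppresses the others.
outputs = excitatory_outputs_under_inhibition(
    drives=[0.9, 0.4, 0.3], inhibition_strength=0.5, threshold=0.3)
```

With these values only the first neuron outputs a pulse, which matches the idea that the recognition result is based on the pulse signals of one or more excitatory neurons that remain active under suppression.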
[0165] Figure 12 is a structural diagram of the image recognition device provided in an embodiment of this application. The image recognition device includes multiple recognition modules:
[0166] The first recognition module 1210 among the plurality of recognition modules includes N sets of first spiking neurons. The N sets of first spiking neurons are used to obtain a first recognition result of the target image based on a first set of image data. Each set of first spiking neurons includes M spiking neurons. The first set of image data includes image data of M image regions of a first size obtained based on the target image.
[0167] The second recognition module 1220 among the plurality of recognition modules includes P sets of second spiking neurons. The P sets of second spiking neurons are used to obtain a second recognition result of the target image based on a second set of image data. Each set of second spiking neurons includes Q spiking neurons, and the second set of image data includes image data of Q image regions of a second size obtained based on the target image.
[0168] Where M, N, P and Q are all greater than or equal to 2, the first recognition result and the second recognition result are used to obtain the final recognition result of the target image.
[0169] In one possible implementation, as shown in Figure 13, the image recognition device further includes a decoding module 1230;
[0170] The decoding module 1230 is fully connected to the plurality of recognition modules;
[0171] The decoding module 1230 is used to determine the final recognition result of the target image based on the first recognition result and the second recognition result.
[0172] In one possible implementation, the first recognition module 1210 includes:
[0173] A first input layer is used to receive the first set of image data and encode the first set of image data into a first set of pulse signals;
[0174] The first output layer, comprising the N sets of first spiking neurons, is used for:
[0175] The first set of pulse signals is received, wherein different spiking neurons located in the same set of first spiking neurons are respectively used to receive pulse signals of different image regions in the first set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in one image region;
[0176] The first recognition result of the target image is obtained based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the output signals of the N spiking neurons used to receive the pulse signals of the same image region correspond to the recognition result of one image region in the first set of image data, and the N spiking neurons used to receive the pulse signals of the same image region are located in different sets of spiking neurons.
[0177] In one possible implementation, the second recognition module 1220 includes:
[0178] The second input layer is used to receive the second set of image data and encode the second set of image data into a second set of pulse signals;
[0179] The second output layer, comprising the P sets of second spiking neurons, is used for:
[0180] The second set of pulse signals is received, wherein different spiking neurons located in the same set of second spiking neurons are respectively used to receive pulse signals of different image regions in the second set of image data, and one spiking neuron is used to receive the pulse signal corresponding to one pixel of one image region;
[0181] The second recognition result of the target image is obtained based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the output signals of the P spiking neurons used to receive the pulse signals of the same image region correspond to the recognition result of one image region in the second set of image data, and the P spiking neurons used to receive the pulse signals of the same image region are located in different sets of spiking neurons.
[0182] In one possible implementation, the output signal of the N spiking neurons receiving pulse signals from the same image region is the pulse output signal of one or more of the N spiking neurons receiving pulse signals from the same image region.
[0183] In one possible implementation, the N spiking neurons receiving pulse signals from the same image region include multiple excitatory spiking neurons and one or more inhibitory spiking neurons. The one or more inhibitory spiking neurons are used to suppress the output pulse signals of the multiple excitatory spiking neurons. The output signals of the N spiking neurons receiving pulse signals from the same image region include: pulse output signals output by one or more excitatory spiking neurons from the multiple excitatory spiking neurons based on the received pulse signals from the same image region, under the suppression of the one or more inhibitory spiking neurons.
[0184] In one possible implementation, the first output layer is further configured to: obtain the recognition result of different parts of an image region in the first set of image data based on the output signals of each of the K spiking neurons out of the N spiking neurons that receive pulse signals of the same image region, where N is an integer multiple of K.
[0185] In one possible implementation, the first spiking neuron set includes a first spiking neuron, which is configured to output a pulse signal when the membrane voltage of the first spiking neuron is greater than or equal to a membrane voltage threshold, wherein the membrane voltage of the first spiking neuron is determined based on the received neurotransmitter and the received pulse signal.
[0186] In this embodiment of the application, a computing device is also provided, which includes a communication interface and a processor. The communication interface is used to acquire a target image, and the processor is connected to the communication interface to implement the functions of the aforementioned spiking neural network and the aforementioned image recognition method. The processor may be a central processing unit (CPU) or the like.
[0187] It should be noted that the image recognition device provided in the above embodiments is only illustrated by the division of the above functional modules. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
[0188] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When these computer program instructions are loaded and executed on a server or terminal, they generate all or part of the processes or functions described in the embodiments of this application. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic cable, digital subscriber line) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to the server or terminal, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, and magnetic tape), an optical medium (e.g., digital video disk (DVD), etc.), or a semiconductor medium (e.g., solid-state drive, etc.).
Claims
1. A method for constructing a spiking neural network, characterized in that, The method includes: constructing the spiking neural network, wherein the spiking neural network includes multiple processing channels and a decoding layer; The first processing channel among the plurality of processing channels includes N sets of first spiking neurons. The N sets of first spiking neurons are used to obtain a first recognition result of the target image based on a first set of image data. Each set of first spiking neurons includes M spiking neurons. The first set of image data includes image data of M image regions of a first size obtained based on the target image. The second processing channel among the plurality of processing channels includes P sets of second spiking neurons. The P sets of second spiking neurons are used to obtain a second recognition result of the target image based on a second set of image data. Each set of second spiking neurons includes Q spiking neurons. The second set of image data includes image data of Q image regions of a second size obtained based on the target image. M, N, P, and Q are all greater than or equal to 2. The first recognition result and the second recognition result are used to obtain the final recognition result of the target image. The decoding layer is fully connected to the plurality of processing channels; The decoding layer is used to determine the final recognition result of the target image based on the first recognition result and the second recognition result.
2. The method according to claim 1, characterized in that, The first processing channel includes: A first input layer is used to receive the first set of image data and encode the first set of image data into a first set of pulse signals; The first output layer, comprising the N sets of first spiking neurons, is used for: The first set of pulse signals is received, wherein different spiking neurons located in the same set of first spiking neurons are respectively used to receive pulse signals of different image regions in the first set of image data, and one spiking neuron is used to receive the pulse signal corresponding to a pixel in one image region; The first recognition result of the target image is obtained based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the output signals of the N spiking neurons used to receive the pulse signals of the same image region correspond to the recognition result of one image region in the first set of image data, and the N spiking neurons used to receive the pulse signals of the same image region are located in different sets of spiking neurons.
3. The method according to claim 1 or 2, characterized in that, The second processing channel includes: The second input layer is used to receive the second set of image data and encode the second set of image data into a second set of pulse signals; The second output layer, comprising the P sets of second spiking neurons, is used for: The second set of pulse signals is received, wherein different spiking neurons located in the same set of second spiking neurons are respectively used to receive pulse signals of different image regions in the second set of image data, and one spiking neuron is used to receive the pulse signal corresponding to one pixel of one image region; The second recognition result of the target image is obtained based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the output signals of the P spiking neurons used to receive the pulse signals of the same image region correspond to the recognition result of one image region in the second set of image data, and the P spiking neurons used to receive the pulse signals of the same image region are located in different sets of spiking neurons.
4. The spiking neural network according to claim 2, characterized in that the output signals of the N spiking neurons that receive the pulse signals of the same image region are the pulse signals output by one or more of those N spiking neurons.
5. The spiking neural network according to claim 4, characterized in that the N spiking neurons that receive the pulse signals of the same image region include multiple excitatory spiking neurons and one or more inhibitory spiking neurons; the one or more inhibitory spiking neurons are used to suppress the output pulse signals of the multiple excitatory spiking neurons; and the output signals of the N spiking neurons that receive the pulse signals of the same image region include the pulse signals output by one or more of the excitatory spiking neurons, based on the received pulse signals of the same image region, under the suppression of the one or more inhibitory spiking neurons.
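The suppression described in claim 5 can be illustrated with a minimal sketch; it is not the claimed implementation. It assumes a simple integrate-and-fire model in which the inhibitory spiking neuron is modelled implicitly: whenever any excitatory neuron fires, a fixed amount is subtracted from the membrane voltage of the excitatory neurons that did not fire. The function name `run_region_group` and the parameters `threshold`, `inhibition`, and `steps` are hypothetical and do not appear in the claims.

```python
def run_region_group(input_currents, threshold=1.0, inhibition=0.5, steps=6):
    """Simulate excitatory neurons that all watch the same image region.

    input_currents: per-step drive of each excitatory neuron (hypothetical values).
    Returns the number of pulses each excitatory neuron emitted.
    """
    n = len(input_currents)
    voltages = [0.0] * n
    spike_counts = [0] * n
    for _ in range(steps):
        # Each excitatory neuron integrates its input for this time step.
        for i in range(n):
            voltages[i] += input_currents[i]
        # Neurons at or above the threshold emit a pulse and reset.
        fired = [i for i in range(n) if voltages[i] >= threshold]
        for i in fired:
            spike_counts[i] += 1
            voltages[i] = 0.0
        # Assumption: the inhibitory neuron fires whenever an excitatory
        # neuron fired and pushes down the neurons that stayed silent.
        if fired:
            for i in range(n):
                if i not in fired:
                    voltages[i] = max(0.0, voltages[i] - inhibition)
    return spike_counts


with_inhibition = run_region_group([0.6, 0.3, 0.1])
without_inhibition = run_region_group([0.6, 0.3, 0.1], inhibition=0.0)
```

With these illustrative inputs, only the most strongly driven neuron emits pulses when suppression is active, whereas a second neuron also fires when `inhibition` is set to zero.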
6. The spiking neural network according to any one of claims 2, 4, and 5, characterized in that the first output layer is further configured to obtain the recognition results of different parts of an image region in the first set of image data based on the output signals of each group of K spiking neurons among the N spiking neurons that receive the pulse signals of the same image region, where N is an integer multiple of K.
7. The spiking neural network according to any one of claims 2, 4, and 5, characterized in that a set of first spiking neurons includes a first spiking neuron, which is used to output a pulse signal when the membrane voltage of the first spiking neuron is greater than or equal to a membrane voltage threshold, wherein the membrane voltage of the first spiking neuron is determined based on the received neurotransmitter and the received pulse signal.
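The firing condition in claim 7 can be illustrated with a minimal sketch under stated assumptions. The claim says only that the membrane voltage is determined from the received neurotransmitter and the received pulse signal; the additive update, the reset to zero after firing, and the names `ThresholdNeuron`, `weight`, and `transmitter_gain` are hypothetical choices made for illustration.

```python
class ThresholdNeuron:
    """Toy spiking neuron that fires when its membrane voltage reaches a threshold."""

    def __init__(self, threshold=1.0, weight=0.5, transmitter_gain=0.25):
        self.threshold = threshold                # membrane voltage threshold
        self.weight = weight                      # contribution of one input pulse (assumed)
        self.transmitter_gain = transmitter_gain  # contribution per unit of neurotransmitter (assumed)
        self.voltage = 0.0

    def step(self, pulse, neurotransmitter):
        """Update the membrane voltage and return 1 if a pulse is output, else 0."""
        # Assumption: both inputs add linearly to the membrane voltage.
        self.voltage += self.weight * pulse + self.transmitter_gain * neurotransmitter
        if self.voltage >= self.threshold:
            self.voltage = 0.0  # reset after firing (assumed)
            return 1
        return 0


neuron = ThresholdNeuron()
pulses_only = [neuron.step(1, 0), neuron.step(1, 0)]

modulated = ThresholdNeuron()
pulse_with_transmitter = modulated.step(1, 2)
```

In this sketch two input pulses alone are needed to reach the threshold, while a single pulse accompanied by enough neurotransmitter reaches it in one step.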
8. An image recognition method, characterized in that the method is performed by a spiking neural network system, the spiking neural network system comprising the spiking neural network as described in any one of claims 1 to 7, the method comprising: acquiring a first set of image data and a second set of image data, wherein the first set of image data includes image data of M image regions of a first size obtained from the target image, and the second set of image data includes image data of Q image regions of a second size obtained from the target image, where M and Q are both greater than or equal to 2; obtaining a first recognition result of the target image based on the first set of image data; obtaining a second recognition result of the target image based on the second set of image data; and obtaining the final recognition result of the target image based on the first recognition result and the second recognition result.
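The flow of claim 8 can be sketched as follows; this is an illustration under stated assumptions, not the claimed implementation. The claim does not say how the image regions are cut or how the two recognition results are merged, so non-overlapping tiling and summation of per-label votes are assumptions. The functions `split_regions`, `channel_votes`, `recognize`, and the placeholder classifier `bright_or_dark` (standing in for a spiking processing channel) are hypothetical.

```python
from collections import Counter


def split_regions(image, size):
    """Cut a 2-D list into non-overlapping size x size regions (assumed tiling)."""
    rows, cols = len(image), len(image[0])
    regions = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            regions.append([row[c:c + size] for row in image[r:r + size]])
    return regions


def channel_votes(regions, classify_region):
    """Recognition result of one channel: vote count per label over its regions."""
    return Counter(classify_region(region) for region in regions)


def recognize(image, first_size, second_size, classify_region):
    first_set = split_regions(image, first_size)    # M regions of the first size
    second_set = split_regions(image, second_size)  # Q regions of the second size
    if len(first_set) < 2 or len(second_set) < 2:
        raise ValueError("M and Q must both be at least 2")
    first_result = channel_votes(first_set, classify_region)
    second_result = channel_votes(second_set, classify_region)
    # Assumption: the final result is the label with the most combined votes.
    combined = first_result + second_result
    return combined.most_common(1)[0][0]


def bright_or_dark(region):
    """Placeholder classifier standing in for a spiking processing channel."""
    pixels = [p for row in region for p in row]
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


image = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
label = recognize(image, 2, 1, bright_or_dark)
```

Here the first channel sees four 2x2 regions and the second sees sixteen 1x1 regions of the same target image, and both sets of votes contribute to the final label.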
9. The method according to claim 8, characterized in that the spiking neural network system includes a first input layer and a first output layer, wherein the first output layer includes N sets of first spiking neurons, and the method further includes: encoding, by multiple spiking neurons in the first input layer, the first set of image data into a first set of pulse signals; wherein the step of obtaining the first recognition result of the target image based on the first set of image data includes: receiving, by the first output layer, the first set of pulse signals, wherein different spiking neurons located in the same set of first spiking neurons respectively receive the pulse signals of different image regions in the first set of image data, and one spiking neuron receives the pulse signal corresponding to a pixel in an image region; and obtaining, by the first output layer, the first recognition result of the target image based on the output signals of the spiking neurons in the N sets of first spiking neurons, wherein the recognition result of one image region in the first set of image data is obtained based on the output signals of the N spiking neurons that receive the pulse signals of the same image region, and the N spiking neurons that receive the pulse signals of the same image region are located in different sets of spiking neurons.
10. The method according to claim 9, characterized in that the step of obtaining the first recognition result of the target image based on the first set of image data further includes: obtaining, by the first output layer, the recognition results of different parts of an image region in the first set of image data based on the output signals of each group of K spiking neurons among the N spiking neurons that receive the pulse signals of the same image region, where N is an integer multiple of K.
11. The method according to any one of claims 8 to 10, characterized in that the spiking neural network system further includes a second input layer and a second output layer, wherein the second output layer includes P sets of second spiking neurons, and the method further includes: encoding, by multiple spiking neurons in the second input layer, the second set of image data into a second set of pulse signals; wherein the step of obtaining the second recognition result of the target image based on the second set of image data includes: receiving, by the second output layer, the second set of pulse signals, wherein different spiking neurons located in the same set of second spiking neurons respectively receive the pulse signals of different image regions in the second set of image data, and one spiking neuron receives the pulse signal corresponding to a pixel in an image region; and obtaining, by the second output layer, the second recognition result of the target image based on the output signals of the spiking neurons in the P sets of second spiking neurons, wherein the recognition result of one image region in the second set of image data is obtained based on the output signals of the P spiking neurons that receive the pulse signals of the same image region, and the P spiking neurons that receive the pulse signals of the same image region are located in different sets of spiking neurons.
12. The method according to any one of claims 8 to 10, characterized in that the first recognition result and the second recognition result are obtained in parallel.
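One way to picture the parallel operation of claim 12 is the sketch below, which runs two channel functions concurrently with Python's standard `concurrent.futures` module. It is illustrative only: the claims do not specify a scheduling mechanism, and dedicated neuromorphic hardware would achieve parallelism differently. The function `recognize_in_parallel` and its arguments are hypothetical, and `sum` and `len` merely stand in for the two processing channels.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_in_parallel(first_set, second_set, first_channel, second_channel):
    """Run both processing channels concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each channel works on its own set of image data.
        first_future = pool.submit(first_channel, first_set)
        second_future = pool.submit(second_channel, second_set)
        return first_future.result(), second_future.result()


# Placeholder channels: sum and len stand in for real recognition functions.
results = recognize_in_parallel([1, 2], [3, 4, 5], sum, len)
```

Neither channel waits for the other before starting, and the caller receives both results once each channel has finished.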
13. A computing device, characterized in that the computing device includes: a communication interface, used to acquire the target image; and a processor, connected to the communication interface and used to implement the image recognition method according to any one of claims 8 to 12.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed by a computing device, cause the computing device to implement the image recognition method according to any one of claims 8 to 12.