Method, device and computer readable storage medium for object recognition

By using spiking neurons in a spiking neural network to determine membrane voltage and weights, the recognition difficulties caused by incomplete AER event streams are solved, and accurate object recognition is achieved under incomplete event streams.

CN113221605B · Active · Publication Date: 2026-06-23 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2020-02-06
Publication Date: 2026-06-23

Abstract

A method, device and computer readable storage medium for object recognition, the method comprising: receiving a first feature pulse sequence of an object to be recognized, the first feature pulse sequence comprising a plurality of pulses obtained according to a first portion of the AER events of the object to be recognized in a first preset time period; obtaining a first membrane voltage at the end time of the first preset time period according to the plurality of pulses and a weight; and determining the object to be recognized as a first target object according to a membrane voltage of a spiking neuron, the membrane voltage of the spiking neuron comprising the first membrane voltage. In the technical solution provided in the application, the spiking neuron in the spiking neural network (SNN) can perform object recognition according to a portion of the AER events of the object to be recognized in the preset time period, so that the object to be recognized can be recognized even when the input AER events are incomplete.

Description

Technical Field

[0001] This application relates to the field of image processing, and more specifically, to a method, apparatus, and computer-readable storage medium for object recognition.

Background Technology

[0002] In today's society, computer vision applications are ubiquitous, with wide applications in object tracking, image recognition, and video surveillance. Specifically, visual sensors can be used to identify objects in acquired image data. Traditional visual sensors use "frame scanning" as their image acquisition method. However, as the performance requirements for speed and other aspects of vision systems increase in practical applications, traditional visual sensors have encountered developmental bottlenecks such as excessive data volume and limited frame rates. AER sensors based on biomimetic visual perception models, with their advantages of high speed, low latency, and low redundancy, have become a research hotspot in the field of machine vision systems.

[0003] In techniques for identifying objects acquired by address event representation (AER) sensors using spiking neural networks (SNNs), a complete pulse sequence generated from the AER event stream of the object to be identified needs to be input into the SNN. The SNN can only identify the object after generating a complete pulse sequence based on this input. In this technique, if the AER event stream of the object to be identified input to the SNN is incomplete, accurate identification of the object is impossible.

[0004] Therefore, how to identify the object based on some AER events of the object to be identified has become a technical problem that urgently needs to be solved.

Summary of the Invention

[0005] This application provides a method, apparatus, and computer-readable storage medium for object recognition. The spiking neurons in the spiking neural network (SNN) can perform object recognition based on a portion of multiple AER events of the object to be recognized, thereby enabling the identification of the target behavior to which multiple AER events belong even when the input AER events are incomplete.

[0006] In a first aspect, a method for object recognition is provided. This method is executed by spiking neurons in a spiking neural network (SNN). During execution, the spiking neuron receives a first feature pulse sequence of the object to be recognized. The first feature pulse sequence includes multiple pulses, which are obtained based on a first portion of the AER events of the object to be recognized within a first preset time period. The spiking neuron obtains a first membrane voltage at the end of the first preset time period based on the multiple pulses and weights, and determines the object to be recognized as a first target object based on the membrane voltage of the spiking neuron. The membrane voltage of the spiking neuron includes the first membrane voltage.

[0007] In the above technical solution, the spiking neurons in the spiking neural network (SNN) can identify objects based on a portion of the AER events of the object to be identified within a preset time period, thus enabling the identification of the object to be identified even when the input AER events are incomplete.
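The way a spiking neuron might turn the pulses of one preset time period into a membrane voltage can be sketched as follows. This is only a minimal illustration: it assumes a non-leaky neuron in which every pulse arriving on input synapse i adds the weight of that synapse to the voltage, which the text at this point does not specify. The function name `membrane_voltage` and the `(input_index, time)` pulse layout are hypothetical.

```python
from typing import Sequence, Tuple

# Hypothetical pulse layout: (index of the input synapse, arrival time).
Pulse = Tuple[int, float]


def membrane_voltage(pulses: Sequence[Pulse], weights: Sequence[float],
                     t_start: float, t_end: float) -> float:
    """Membrane voltage at t_end under an assumed non-leaky model.

    Every pulse that arrives within [t_start, t_end] contributes the
    weight of the synapse it arrived on; later pulses are ignored.
    """
    voltage = 0.0
    for index, time in pulses:
        if t_start <= time <= t_end:
            voltage += weights[index]
    return voltage


# Only the first three pulses fall inside the preset time period [0, 5].
pulses = [(0, 1.0), (1, 2.5), (0, 4.0), (2, 9.0)]
weights = [0.5, -0.25, 1.0]
first_voltage = membrane_voltage(pulses, weights, t_start=0.0, t_end=5.0)
```

Because the voltage is read at the end of the preset time period, pulses that have not yet arrived simply do not contribute, which is what lets a decision be made from a partial event stream.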

[0008] In one possible implementation, if the first membrane voltage is greater than a first preset threshold, the object to be identified is determined to be the first target object.

[0009] In the above technical solution, if the first membrane voltage is greater than the first preset threshold, the object to be identified can be determined as the first target object, which is simpler to implement.

[0010] In another possible implementation, if the first ratio between the first membrane voltage and the second membrane voltage is greater than a second preset threshold, the object to be identified is determined to be the first target object, wherein the second membrane voltage is the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the first preset time period.
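The two single-window decision rules described above (an absolute threshold on the first membrane voltage, and a threshold on its ratio to the summed voltage of all spiking neurons) can be sketched as follows. The function names are hypothetical, and treating a zero sum as "no decision" is an assumption made here so that the ratio is always defined.

```python
from typing import Sequence


def exceeds_voltage_threshold(first_voltage: float,
                              first_threshold: float) -> bool:
    """Rule 1: the first membrane voltage is greater than the first preset threshold."""
    return first_voltage > first_threshold


def exceeds_ratio_threshold(first_voltage: float,
                            all_voltages: Sequence[float],
                            second_threshold: float) -> bool:
    """Rule 2: first voltage divided by the summed voltage of all neurons."""
    second_voltage = sum(all_voltages)  # the "second membrane voltage"
    if second_voltage == 0:
        return False  # assumption: an undefined ratio yields no decision
    return first_voltage / second_voltage > second_threshold


# Voltages of three output neurons at the end of the first preset time period.
voltages = [3.0, 1.0, 1.0]
by_voltage = exceeds_voltage_threshold(voltages[0], first_threshold=2.0)
by_ratio = exceeds_ratio_threshold(voltages[0], voltages, second_threshold=0.5)
```

The ratio rule normalizes by the overall activity of the output layer, so it may be less sensitive to how many events happened to arrive within the window.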

[0011] In another possible implementation, the method further includes: receiving a second feature pulse sequence of the object to be identified, the second feature pulse sequence containing multiple pulses, the second feature pulse sequence being obtained based on a second portion of AER events of the object to be identified within a second preset time period; obtaining a third membrane voltage of the spiking neuron at the end of the second preset time period based on the multiple pulses in the second feature pulse sequence and the set weights; and determining the object to be identified as a first target object based on the first membrane voltage and the third membrane voltage.

[0012] In another possible implementation, if the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold, the object to be identified is determined to be the first target object.

[0013] In another possible implementation, if the average or weighted average of the first ratio and the second ratio is greater than a second preset threshold, the object to be identified is determined to be the first target object, wherein the second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.
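When two preset time periods are available, the decision can be based on an average or weighted average of the per-window quantities. The sketch below is a hedged illustration: the helper names and the particular window weights are assumptions, and the same helper applies to either membrane voltages or ratios.

```python
def weighted_average(first: float, second: float,
                     weight_first: float = 1.0,
                     weight_second: float = 1.0) -> float:
    """Weighted average of two per-window values (equal weights give the plain mean)."""
    return (weight_first * first + weight_second * second) / (weight_first + weight_second)


def decide_from_two_windows(first: float, second: float, threshold: float,
                            weight_first: float = 1.0,
                            weight_second: float = 1.0) -> bool:
    """True when the (weighted) average of the two values exceeds the threshold."""
    return weighted_average(first, second, weight_first, weight_second) > threshold


# First and third membrane voltages, measured at the ends of the two windows.
first_voltage, third_voltage = 1.0, 2.0
plain = decide_from_two_windows(first_voltage, third_voltage, threshold=1.4)
# Hypothetical weighting that trusts the first window three times as much.
weighted = decide_from_two_windows(first_voltage, third_voltage, threshold=1.4,
                                   weight_first=3.0, weight_second=1.0)
```

Passing the first ratio and the second ratio instead of the two voltages, together with the second preset threshold, gives the ratio-based variant.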

[0014] In another possible implementation, the AER event includes the timestamp at which the AER event was generated and the address information of the pixel that generated it.

[0015] In another possible implementation, the weight is determined based on the deviation between the predicted and actual results of the object to be identified over multiple different time periods.

[0016] In the above technical solution, weight updates can be performed in segments. This allows for full utilization of the spatial and temporal information of the target object carried by the pulses in the pulse sequence within that time period, improving the efficiency and accuracy of synaptic weight training. This enables the SNN to identify the target object based on the trained weights, according to the pulse firing frequency of the spiking neuron or the membrane voltage of the spiking neuron.

[0017] In a second aspect, an object recognition apparatus is provided, which is applied to spiking neurons in a spiking neural network (SNN), the apparatus comprising:

[0018] The receiving module is used to receive a first feature pulse sequence of the object to be identified, wherein the first feature pulse sequence contains multiple pulses and is obtained based on a first part of the AER events of the object to be identified within a first preset time period.

[0019] The acquisition module is used to obtain the first membrane voltage of the spiking neuron at the end of the first preset time period based on the plurality of pulses in the first characteristic pulse sequence and the set weights.

[0020] The determination module is used to determine the object to be identified as a first target object based on the membrane voltage of the spiking neuron, wherein the membrane voltage of the spiking neuron includes the first membrane voltage.

[0021] In one possible implementation, the determining module is specifically used to: determine the object to be identified as the first target object if the first membrane voltage on the first spiking neuron is greater than a first preset threshold.

[0022] In another possible implementation, the determining module is specifically used to: determine the object to be identified as the first target object if the first ratio between the first membrane voltage and the second membrane voltage is greater than a second preset threshold, wherein the second membrane voltage is the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the first preset time period.

[0023] In another possible implementation, the receiving module is further configured to: receive a second feature pulse sequence of the object to be identified, the second feature pulse sequence comprising multiple pulses, the second feature pulse sequence being obtained based on a second portion of AER events of the object to be identified within a second preset time period;

[0024] The acquisition module is also used to: acquire the third membrane voltage of the spiking neuron at the end of the second preset time period based on the plurality of pulses in the second characteristic pulse sequence and the set weights;

[0025] Specifically, the determining module is used to determine the object to be identified as the first target object based on the first membrane voltage and the third membrane voltage.

[0026] In another possible implementation, the determining module is specifically used to: determine the object to be identified as the first target object if the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold.

[0027] In another possible implementation, the determining module is specifically used to: determine the object to be identified as the first target object if the average or weighted average of the first ratio and the second ratio is greater than a second preset threshold, wherein the second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.

[0028] In another possible implementation, the AER event includes the timestamp at which the AER event was generated and the address information of the pixel that generated it.

[0029] In another possible implementation, the weight is determined based on the deviation between the predicted and actual results of the object to be identified over multiple different time periods.

[0030] In a third aspect, a computing device for object recognition is provided, including a communication interface and a processor. The processor is connected to the communication interface, controls the communication interface to send and receive information, and executes the object recognition method of the first aspect or any possible implementation thereof.

[0031] Optionally, the processor can be a general-purpose processor, which can be implemented in hardware or software. When implemented in hardware, the processor can be a logic circuit, integrated circuit, etc.; when implemented in software, the processor can be a general-purpose processor that reads software code stored in memory. This memory can be integrated into the processor or located outside the processor and exist independently.

[0032] In a fourth aspect, a computer-readable medium is provided that stores program code, which, when run on a computing device, causes the computing device to perform the methods described in the first aspect or possible implementations thereof.

[0033] In a fifth aspect, a computer program product is provided, comprising computer program code, which, when run on a computing device, causes the computing device to perform the methods described in the first aspect or possible implementations of the first aspect.

Attached Figure Description

[0034] Figure 1 is a structural architecture diagram of a computing device 100 provided in an exemplary embodiment of this application.

[0035] Figure 2 is a flowchart illustrating how an SNN can be used to identify multiple AER events of an object to be identified, as provided in an exemplary embodiment of this application.

[0036] Figure 3 is an SNN implementation structure diagram of an object recognition method provided in an exemplary embodiment of this application.

[0037] Figure 4 is a schematic diagram of extracting a feature map, provided in an exemplary embodiment of this application.

[0038] Figure 5 is a schematic flowchart illustrating an object recognition method provided in an exemplary embodiment of this application.

[0039] Figure 6 is a schematic flowchart illustrating a method for training synaptic weights in an SNN, provided in an exemplary embodiment of this application.

[0040] Figure 7 is a schematic flowchart illustrating a method for adjusting synaptic weights provided in an exemplary embodiment of this application.

[0041] Figure 8 is a schematic block diagram of an object recognition device 800 provided in an exemplary embodiment of this application.

Detailed Implementation

[0042] The technical solutions in this application will now be described with reference to the accompanying drawings.

[0043] To facilitate understanding of the embodiments of this application, the concepts of the terms involved are first introduced below:

[0044] Address event representation (AER) sensors are neuromorphic devices that mimic the mechanisms of the human retina. An AER sensor consists of multiple pixels, each monitoring changes in light intensity within a specific area. When the change exceeds a threshold, the AER event corresponding to that pixel is recorded; otherwise, it is not. Each AER event includes the location information (address information) of the pixel that triggered the event, the time of occurrence (timestamp), and polarity. Polarity characterizes whether the pixel perceived a change in light from dark to bright (represented by a value of 1) or from bright to dark (represented by a value of -1). Therefore, the AER sensor ultimately outputs the AER event from each pixel. Compared to traditional cameras, AER sensors offer advantages such as asynchronous scene processing, high temporal resolution, and sparse representation, providing significant advantages in data transmission speed and data redundancy. It should be noted that the asynchronous scene processing mentioned above refers to each pixel acquiring its own AER event.

[0045] The AER event stream includes multiple AER events. Each AER event includes the address information of the pixel where the AER event occurred, the timestamp of the occurrence, and the polarity, etc.
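As an illustration of the data described above, an AER event and an AER event stream could be represented as follows. The class and field names are hypothetical, as is the choice of separate `x` and `y` fields for the address information; the polarity values 1 and -1 follow the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AEREvent:
    x: int            # pixel column (address information)
    y: int            # pixel row (address information)
    timestamp: float  # time of occurrence; the unit is an assumption
    polarity: int     # 1: dark to bright, -1: bright to dark

    def __post_init__(self) -> None:
        if self.polarity not in (1, -1):
            raise ValueError("polarity must be 1 or -1")


# An AER event stream is simply an ordered collection of events.
AEREventStream = List[AEREvent]

stream: AEREventStream = [
    AEREvent(x=2, y=3, timestamp=10.0, polarity=1),
    AEREvent(x=2, y=4, timestamp=12.5, polarity=-1),
]
```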

[0046] Gabor filters are linear filters used for texture analysis. They are widely used in computer vision applications to extract features from images and videos. Specifically, they allow only textures corresponding to their frequencies to pass through, while suppressing the energy of other textures. A Gabor filter can be represented by a scale *s* and a direction *θ*. Different combinations of scale *s* and direction *θ* correspond to different convolution kernels, and therefore to different filters. Research has shown that simple cells in the visual cortex of the mammalian brain can be modeled using Gabor filters, with each Gabor filter simulating a neuron with a receptive field of a certain scale. It should be understood that a receptive field is the region of stimulation that a neuron responds to or innervates.

[0047] Spiking neural networks (SNNs), often hailed as the third generation of artificial neural networks, use neurons that mimic the voltage changes and transmission processes of biological nerve cells to perform object recognition from input. Information is transmitted between neurons in a spiking neural network in the form of pulses, based on discrete values occurring at certain points in time, rather than continuous values. The generation of a pulse is determined by differential equations representing various biological processes, the most important of which is the neuron's membrane voltage. The neuron's membrane voltage changes with the input pulse; once the neuron's membrane voltage reaches a certain value, it generates a pulse and sends pulse signals to the neurons connected to it.
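The fire-on-threshold behavior described above can be illustrated with a discrete-time leaky integrate-and-fire neuron. This is a common textbook simplification chosen here for illustration, not the neuron model of this application; the function name, the leak factor, and the reset-to-zero behavior are all assumptions.

```python
from typing import List, Sequence


def simulate_integrate_and_fire(inputs: Sequence[float],
                                threshold: float = 1.0,
                                leak: float = 0.9,
                                v_reset: float = 0.0) -> List[int]:
    """Return the time steps at which the neuron emits a pulse.

    At each step the membrane voltage decays by the factor `leak`, the
    weighted input is added, and a pulse is emitted (followed by a reset)
    once the voltage reaches `threshold`.
    """
    voltage = 0.0
    spike_steps = []
    for step, weighted_input in enumerate(inputs):
        voltage = leak * voltage + weighted_input
        if voltage >= threshold:
            spike_steps.append(step)
            voltage = v_reset
    return spike_steps


inputs = [0.5, 0.5, 0.0, 0.6, 0.6]
leaky_spikes = simulate_integrate_and_fire(inputs)               # leak = 0.9
non_leaky_spikes = simulate_integrate_and_fire(inputs, leak=1.0)
```

With the leak enabled, the two early inputs decay before they can reach the threshold, so the neuron fires later than it does in the non-leaky case.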

[0048] In an SNN, two neurons can be connected by a single synapse or by multiple synapses; this application does not impose a specific limitation on this. Each synapse has a modifiable synaptic weight, and multiple pulses transmitted by the presynaptic neuron can generate different postsynaptic membrane voltages depending on the magnitude of the synaptic weights.

[0049] Before introducing the object recognition method provided in the embodiments of this application, the application scenarios and system architecture to which the embodiments of this application are applicable will be introduced first.

[0050] In today's society, computer vision applications are ubiquitous, with wide applications in areas such as object tracking, image recognition, and video surveillance. Specifically, it can be used to identify objects in acquired image data through visual sensors.

[0051] Traditional vision sensors use frame scanning for image acquisition. However, with increasing demands for speed and other performance in practical applications of vision systems, traditional sensors have encountered developmental bottlenecks such as excessive data volume and limited frame rates. AER sensors, based on biomimetic visual perception models, have become a research hotspot in the field of machine vision systems due to their advantages of high speed, low latency, and low redundancy. Unlike traditional vision sensors that record light intensity values, AER sensors only record events where the change in light intensity exceeds a threshold; events where the change in light intensity is less than the threshold are not recorded. This significantly reduces the redundancy of visual information.

[0052] Compared to traditional vision sensors, AER sensors offer advantages such as asynchronous scene capture, high temporal resolution, and sparse representation, providing significant advantages in data transmission speed and data redundancy. Therefore, embodiments of this application can utilize an AER sensor to acquire image data. This AER sensor can be applied to any image capture scenario where the primary content to be recorded is changing, such as dashcams and surveillance equipment.

[0053] AER sensors output asynchronous discrete events: an AER sensor outputs multiple AER event streams, each of which can include multiple AER events, and each AER event is an asynchronous discrete event. Neurons in an SNN transmit information in the form of spikes and identify the object to be identified corresponding to those spikes. An SNN can therefore be used to identify the AER event streams (including multiple AER events) of the object to be identified acquired by the AER sensor.

[0054] In techniques for identifying objects acquired by AER sensors using SNNs, a complete pulse sequence generated from the AER event stream of the object needs to be input into the SNN. The SNN can only identify the object after generating the complete pulse sequence based on this input. However, this technique cannot accurately identify the object when the AER event stream of the object input to the SNN is incomplete. Therefore, a method for object identification is needed that can identify the object even when the AER event stream of the object input to the SNN is incomplete.

[0055] This application provides a method for object recognition, which can identify the object to be identified and obtain the recognition result even when the AER event stream of the object to be identified input to the SNN is not yet complete.

[0056] The object recognition method provided in this application can be executed by an object recognition device, which can be a hardware device, such as a server or terminal computing device. The object recognition device can also be a software device, specifically a software system running on a hardware computing device. This application does not limit the location where the object recognition device is deployed. For example, the object recognition device can be deployed on a server.

[0057] An object recognition device can logically be composed of multiple parts, such as a receiving module, an acquisition module, and a determining module. The various components of an object recognition device can be deployed in different systems or servers. These components can operate in one of three environments: a cloud computing system, an edge computing system, or a terminal computing device, or any two of these environments. The cloud computing system, edge computing system, and terminal computing device are connected by communication paths and can communicate with each other.

[0058] The following description, with reference to Figure 1, takes the case where the object recognition device is a computing device as an example.

[0059] Figure 1 provides an exemplary diagram of a possible architecture of the computing device 100 of this application. As shown in Figure 1, the computing device 100 may include a processor 101, a memory 102, a communication interface 103, and a bus 104.

[0060] In computing device 100, the number of processors 101 can be one or more. Figure 1 shows only one processor 101.

[0061] Optionally, processor 101 may be a central processing unit (CPU). If computing device 100 has multiple processors 101, the multiple processors 101 may be of different types or may be the same. Optionally, the multiple processors of computing device 100 may also be integrated into a multi-core processor. Processor 101 can be used to execute the steps of the object recognition method. In practical applications, processor 101 may be a very large-scale integrated circuit. An operating system and other software programs are installed in processor 101, thereby enabling processor 101 to access devices such as memory 102.

[0062] It is understood that in this embodiment of the application, the processor 101 is described using a CPU as an example. In actual applications, the processor 101 can also be another type of integrated circuit, such as an application-specific integrated circuit (ASIC).

[0063] Memory 102 stores computer instructions and data. Memory 102 may store computer instructions and data required to implement the object recognition method provided in this application. For example, memory 102 stores instructions for the receiving module to perform the execution step in the object recognition method provided in this application. As another example, memory 102 stores instructions for the obtaining module to perform the execution step in the object recognition method provided in this application. As yet another example, memory 102 stores instructions for the determining module to perform the execution step in the object recognition method provided in this application. Memory 102 may be any one or any combination of the following storage media: non-volatile memory (such as read-only memory (ROM), solid-state disk (SSD), hard disk drive (HDD), optical disk, etc.), and volatile memory.

[0064] The communication interface 103 can be any one or any combination of the following devices: a network interface (such as an Ethernet interface), a wireless network card, or other devices with network access capabilities. The communication interface 103 is used for data communication between the computing device 100 and other computing devices 100 or terminals. In this application, the communication interface 103 can be used to receive a first characteristic pulse sequence of the object to be identified.

[0065] In Figure 1, bus 104 is represented by a thick line. Bus 104 connects processor 101 to memory 102 and communication interface 103. In this way, through bus 104, processor 101 can access memory 102 and can also use communication interface 103 to interact with other computing devices 100 or terminals.

[0066] In this application, the computing device 100 executes computer instructions stored in the memory 102 to implement the object recognition method provided in this application. For example, the instructions cause the computing device 100 to perform the steps of the object recognition method that are performed by the receiving module. As another example, the instructions cause the computing device 100 to perform the steps of the object recognition method that are performed by the acquiring module. As yet another example, the instructions cause the computing device 100 to perform the steps of the object recognition method that are performed by the determining module.

[0067] Before the embodiments are described, the overall system framework applicable to the embodiments of this application is first described in detail with reference to Figures 2-3.

[0068] Figure 2 is a flowchart illustrating how an SNN can be used to identify multiple AER events of an object to be identified, as provided in an embodiment of this application. Referring to Figure 2, the flowchart may include steps 210-230, which will be described in detail below.

[0069] Step 210: The SNN acquires the AER event stream from the AER sensor.

[0070] In this embodiment, the AER event stream obtained by the SNN from the AER sensor may include multiple AER events of the object to be identified. Each AER event includes the address information, timestamp, and polarity of the pixel where the AER event occurred. It should be understood that the object to be identified refers to an object in the AER event stream whose category or action has not been determined.

[0071] Specifically, the AER sensor can detect changes in light intensity for each pixel. When the change exceeds a threshold, it records the AER event for that pixel; otherwise, it does not record an AER event. Each AER event includes the address information of the pixel that triggered the event, a timestamp, and polarity. The polarity characterizes whether the pixel perceived a change in light from dark to bright (represented by a value of 1) or from bright to dark (represented by a value of -1). Thus, the AER event stream obtained from the AER sensor by the SNN can include multiple AER events.
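The thresholding and polarity rules above can be sketched with a frame-difference approximation. A real AER sensor works asynchronously per pixel rather than on whole frames, so comparing two intensity grids is an assumption made purely for illustration; the function name and the `(x, y, timestamp, polarity)` tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical event layout: (x, y, timestamp, polarity).
Event = Tuple[int, int, float, int]


def events_from_intensity_change(previous: Sequence[Sequence[float]],
                                 current: Sequence[Sequence[float]],
                                 threshold: float,
                                 timestamp: float) -> List[Event]:
    """Emit one event per pixel whose intensity change exceeds the threshold."""
    events = []
    for y, (previous_row, current_row) in enumerate(zip(previous, current)):
        for x, (old, new) in enumerate(zip(previous_row, current_row)):
            change = new - old
            if abs(change) > threshold:
                polarity = 1 if change > 0 else -1  # 1: brighter, -1: darker
                events.append((x, y, timestamp, polarity))
    return events


previous = [[10, 10],
            [10, 10]]
current = [[10, 30],
           [2, 11]]
events = events_from_intensity_change(previous, current,
                                      threshold=5, timestamp=100.0)
```

Pixels whose change stays at or below the threshold produce no event, which is the source of the sparsity described in the text.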

[0072] There are several ways for SNNs to acquire AER event streams from AER sensors, and this application does not impose any specific limitations. The following describes several possible implementation methods in detail.

[0073] In one possible implementation, after receiving a processing request for the AER event of the object to be identified, the SNN sends an AER event acquisition request to the AER sensor to which the AER event stream belongs. Upon receiving the AER event acquisition request, the AER sensor can send its AER event to the SNN. In this way, the SNN can acquire the AER event of the object to be identified from the AER sensor.

[0074] In another possible implementation, the AER sensor is configured with an AER event upload cycle. At each upload cycle, the AER sensor can send the AER events collected between the last upload and the current upload to the SNN. The SNN can then obtain the AER events of the object to be identified from the AER sensor.
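The upload-cycle variant can be sketched as grouping events by the cycle in which they were collected. The function name, the tuple layout `(x, y, timestamp, polarity)`, and the choice to skip cycles that contain no events are assumptions of this sketch.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical event layout: (x, y, timestamp, polarity).
Event = Tuple[int, int, float, int]


def batch_by_upload_cycle(events: Sequence[Event],
                          cycle: float) -> List[List[Event]]:
    """Group events into the batches that would be sent at each upload.

    An event with timestamp t belongs to cycle number floor(t / cycle).
    Cycles without events produce no batch.
    """
    batches: Dict[int, List[Event]] = {}
    for event in events:
        cycle_index = int(event[2] // cycle)
        batches.setdefault(cycle_index, []).append(event)
    return [batches[index] for index in sorted(batches)]


events = [(0, 0, 1.0, 1), (1, 0, 4.0, -1), (2, 2, 12.0, 1), (3, 1, 25.0, 1)]
uploads = batch_by_upload_cycle(events, cycle=10.0)
```

Each batch is a partial event stream, which is the kind of input the recognition method is meant to handle.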

[0075] In another possible implementation, the AER sensor sends the acquired AER events to the SNN whenever an AER event is acquired. In this way, the SNN can also obtain the AER event stream of the object to be identified from the AER sensor.

[0076] It should be noted that this application obtains an AER event stream over a period of time, and identifies the objects to be identified in the AER event stream over this period of time, such as a period of 1 minute.

[0077] Step 220: The SNN extracts features from the acquired AER event stream.

[0078] The SNN can process the acquired AER event stream of the object to be identified, so that the SNN can identify the object. In this application, the processing can include extracting multiple feature maps of the AER event stream and encoding the multiple feature maps of the AER event stream.

[0079] Taking the extraction of multiple feature maps from an AER event stream using an SNN as an example, an SNN can extract the spatial features of an AER event to obtain its feature map. Alternatively, an SNN can extract both the spatial and temporal features of an AER event to obtain its feature map; this application does not impose any specific limitations. When both the time and address information of the AER event are extracted, the original temporal and spatial information of the AER event are both contained within the extracted feature map, so the feature map represents the original data more comprehensively. This, in turn, leads to more accurate recognition results when identifying the object to be identified.

[0080] As an example, the following is a detailed description of the specific implementation process of extracting the time and address information of AER events using SNN to obtain multiple feature maps of the AER event stream.

[0081] The AER event stream corresponds to multiple feature maps, each containing partial spatial and temporal information of the object to be identified. The spatial and temporal information is obtained based on the timestamp and address information of each AER event. The spatial information indicates the spatial characteristics of the object to be identified, while the temporal information indicates its temporal characteristics.

[0082] When SNNs extract partial spatial information from the AER events of the object to be identified, they can use convolution operations. Specifically, filters can be used when extracting feature maps. The filter can be any type of filter that can extract features, such as a Gabor filter or a difference of Gaussian (DOG) filter, etc. This application does not make any specific limitations on this.

[0083] Taking the Gabor filter as an example, a Gabor filter can be represented by the filter's scale *s* and orientation *θ*. With a fixed scale *s* and orientation *θ*, the convolution kernel at this scale and orientation can be calculated using the Gabor filter's functional expression. After the convolution kernel is determined, it is used to extract a feature map. The number of feature values in the feature map is the same as the number of pixels in the AER sensor, and the number of feature values in each row of the feature map equals the number of pixels in the corresponding row of the AER sensor.
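Computing a convolution kernel for a fixed scale and orientation can be sketched with the widely used real-valued Gabor function. The functional expression used in this application is not reproduced in the text above, so the formula below, the treatment of the scale as the kernel side length, and the parameters `sigma`, `wavelength`, and `gamma` are assumptions for illustration only.

```python
import math
from typing import List


def gabor_kernel(size: int, theta: float, sigma: float,
                 wavelength: float, gamma: float = 0.3) -> List[List[float]]:
    """Real part of a standard Gabor function sampled on a size x size grid.

    size       : kernel side length, standing in for the scale s (must be odd)
    theta      : orientation in radians
    sigma      : width of the Gaussian envelope
    wavelength : wavelength of the cosine carrier
    gamma      : spatial aspect ratio of the envelope
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so that the kernel has a center")
    half = size // 2
    kernel = []
    for dy in range(-half, half + 1):
        row = []
        for dx in range(-half, half + 1):
            # Rotate the coordinates into the filter's orientation.
            x_rot = dx * math.cos(theta) + dy * math.sin(theta)
            y_rot = -dx * math.sin(theta) + dy * math.cos(theta)
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2)
                                / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * x_rot / wavelength))
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=3, theta=0.0, sigma=1.2, wavelength=1.5)
```

Calling the function with several scales and orientations would yield one kernel, and hence one feature map, per combination.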

[0084] In the initial feature map, all feature values can be zero. For example, if the AER sensor has 5*5 pixels, then the feature map contains 5*5 feature values. With a fixed scale *s* and orientation *θ*, whenever an AER event is processed by convolution, the convolution kernel corresponding to that scale *s* and orientation *θ* is applied to the receptive field of the feature map at the location of that AER event. Specifically, as an example, the convolution kernel is... Suppose the AER event is located at position (m, n) in the feature map. The value 'e' at the center of the convolution kernel is superimposed on the feature value at position (m, n) in the feature map. Then, 'a' is superimposed on the feature value at position (m-1, n-1), 'b' on the feature value at position (m, n-1), 'c' on the feature value at position (m+1, n-1), and so on. In this way the convolution kernel is overlaid on the feature map, yielding the feature map after the AER event has been added.
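The superimposition step can be sketched as follows. The example assumes a 3*3 kernel whose first row is (a, b, c) and whose center is e, which is consistent with the positions named above but not confirmed by the text. It also assumes that m is the column index, n is the row index, and kernel entries falling outside the feature map are discarded. The function name is hypothetical.

```python
from typing import List


def superimpose_kernel(feature_map: List[List[float]],
                       kernel: List[List[float]],
                       m: int, n: int) -> None:
    """Add the kernel onto the feature map, centered on the event at (m, n).

    m is treated as the column index and n as the row index, so the
    top-left kernel value lands on (m-1, n-1) for a 3*3 kernel.
    """
    half_rows = len(kernel) // 2
    half_cols = len(kernel[0]) // 2
    for kernel_row, values in enumerate(kernel):
        for kernel_col, value in enumerate(values):
            row = n + kernel_row - half_rows
            col = m + kernel_col - half_cols
            # Assumption: parts of the kernel outside the map are dropped.
            if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
                feature_map[row][col] += value


# 5*5 feature map, initially all zeros; the numbers 1..9 stand in for a..i.
feature_map = [[0.0] * 5 for _ in range(5)]
kernel = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]
superimpose_kernel(feature_map, kernel, m=2, n=2)
```

Repeating the call for every AER event in the stream accumulates the contributions of all events into the same feature map.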

[0085] When extracting partial temporal information from the AER events of an object to be identified, the SNN can let the spatial information decay over time, so that the contribution of each AER event to the spatial information depends on its timestamp. Specifically, for any location in the feature map, the AER events whose receptive fields cover that location are identified, and the feature value at that location is then decayed according to the timestamps of these AER events. As a result, AER events further in the past relative to the current time have a smaller impact on the feature values in the current feature map, and AER events closer to the current time have a greater impact.
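As an illustration of this time decay, the sketch below scales every feature value by an exponential factor of the elapsed time. The exponential form, the time constant `tau_ms` and the function name are assumptions; the text only states that feature values decay according to the event timestamps.

```python
import math


def decay_feature_map(feature_map, last_update_ms, now_ms, tau_ms=100.0):
    """Return a copy of feature_map decayed from last_update_ms to now_ms.

    Uses exp(-(elapsed) / tau_ms) as the decay factor (assumed form).
    """
    factor = math.exp(-(now_ms - last_update_ms) / tau_ms)
    return [[value * factor for value in row] for row in feature_map]


fmap_100 = [[0.0, 1.0],
            [2.0, -1.0]]                      # feature values at 100 ms
fmap_200 = decay_feature_map(fmap_100, last_update_ms=100.0, now_ms=200.0)
```

Both positive and negative feature values shrink in magnitude toward zero, so older events contribute less to the current feature map.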

[0086] Taking the encoding of multiple extracted feature maps by the spiking neural network (SNN) as an example, the SNN can encode multiple feature maps extracted from an AER event stream into a pulse sequence. It should be noted that during the encoding process, features with larger feature values in the feature maps are considered more likely to generate pulses; they correspond to the minimum delay time and trigger pulses first. Features with smaller feature values trigger pulses later or not at all. Thus, the triggering time of each pulse in the pulse sequence is determined by the feature values in the feature maps. Since each feature value reflects some spatial and temporal information of the object to be identified, the pulses in the pulse sequence also carry some spatial and temporal information of the object to be identified.
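The encoding rule above (larger feature value, earlier pulse; small values may not fire) can be sketched with a simple latency code. The linear mapping from feature value to delay, the `min_value` cut-off and the tuple layout `(time, row, column)` are assumptions made for illustration only.

```python
def encode_latency(feature_map, t_max_ms=100.0, min_value=0.1):
    """Encode feature values as pulse times: larger value, earlier pulse.

    Values below min_value produce no pulse (assumed cut-off).
    Returns a list of (time_ms, row, col) sorted by time.
    """
    flat = [(v, r, c) for r, row in enumerate(feature_map) for c, v in enumerate(row)]
    v_max = max(v for v, _, _ in flat)
    if v_max <= 0:
        return []
    spikes = []
    for v, r, c in flat:
        if v >= min_value:
            # The largest value fires at time 0; smaller values fire later.
            spikes.append((t_max_ms * (1.0 - v / v_max), r, c))
    spikes.sort()
    return spikes


spikes = encode_latency([[0.0, 0.5],
                         [1.0, 0.05]])
```

In this example the value 1.0 fires first, the value 0.5 fires halfway through the window, and the two smallest values do not fire.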

[0087] Step 230: The SNN identifies the object to be identified based on the input pulse sequence.

[0088] SNN can extract the pulse sequence corresponding to the AER event stream of the object to be identified, and then identify the object based on the pulse sequence to obtain the identification result.

[0089] The structure of the SNN described with reference to Figure 2 is illustrated below in conjunction with Figure 3. Referring to Figure 3, the structure may include an S1 layer, a C1 layer, an encoding layer, and a recognition layer.

[0090] Layer S1 is used to implement the SNN processing in step 210. Specifically, it extracts features from the acquired AER event stream to obtain multiple feature maps. Referring to Figure 4, assume that the convolution kernel is 3*3, the AER sensor comprises 6*6 pixels, the scale *s* is 3, and the orientation *θ* is 45 degrees; the feature map output by layer S1 is then 6*6. When there is no AER event input, the value at each location in the feature map is 0. When an AER event is input at pixel position (4,4) at time 100 ms, layer S1 superimposes the convolution kernel at position (4,4) on the feature map to obtain an updated feature map. As time progresses to 200 ms, the feature values in the feature map decay to some extent relative to their values at 100 ms.

[0091] Layer C1 performs dimensionality reduction on the feature maps output by layer S1, a process also known as pooling. Specifically, layer C1 divides each feature map output by layer S1 into adjacent n*n regions. For example, it divides each feature map output by layer S1 into adjacent 2*2 regions. For each feature map, layer C1 selects the maximum value from each 2*2 region to obtain a new feature map. It's clear that layer C1 only changes the dimensionality of the feature maps, not the number of feature maps. For instance, if layer S1 outputs 16 feature maps, each 128*128 in size, 16 new feature maps will be obtained, each 64*64 in size. Thus, by reducing the dimensionality of the feature maps through layer C1, the processing load of subsequent encoding and recognition layers can be reduced.
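The pooling performed by layer C1 can be sketched as non-overlapping max pooling. The function name `max_pool` and the choice to drop any incomplete trailing region are assumptions; the 4*4 input is an arbitrary example.

```python
def max_pool(feature_map, n=2):
    """Keep the maximum of each adjacent, non-overlapping n*n region."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j] for i in range(n) for j in range(n))
            for c in range(0, cols - n + 1, n)
        ]
        for r in range(0, rows - n + 1, n)
    ]


s1_map = [[1, 2, 0, 0],
          [3, 4, 0, 5],
          [0, 0, 9, 8],
          [0, 1, 7, 6]]
c1_map = max_pool(s1_map)   # 4*4 map reduced to 2*2
```

Applied to each of the feature maps output by layer S1, this halves each dimension while leaving the number of feature maps unchanged.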

[0092] It should be noted that if the feature map output by the S1 layer has a small dimension, or if the processing capabilities of the encoding and recognition layers are strong, the processing of the C1 layer may be omitted.

[0093] The encoding layer is used to implement the SNN processing in step 220. Specifically, it encodes multiple feature maps of the acquired AER event stream into a pulse sequence. For details, please refer to the description of the encoding process above; it will not be repeated here.

[0094] The recognition layer receives the pulse sequence output from the encoding layer and identifies the AER event stream of the object to be identified from the AER sensor. Specifically, the pulse sequence can be input into each recognition neuron in the recognition layer, and the object to be identified is determined based on the membrane voltage on each recognition neuron.

[0095] It should be understood that this recognition layer consists of a single layer of fully connected neurons (i.e., the recognition neurons or spiking neurons mentioned later). The number of neurons included in the recognition layer is equal to N*P*M (N and P may be equal or unequal), where N*P is the size of the feature map (the feature map output by layer C1), and M is the number of directions θ.

[0096] The object recognition method performed by the recognition layer, as provided in an embodiment of this application, is described in detail below in conjunction with Figure 5. The method is executed by spiking neurons in an SNN. The method shown in Figure 5 may include steps 510-530, which are described in detail below.

[0097] Step 510: Receive the first feature pulse sequence of the object to be identified.

[0098] A spiking neuron in the SNN can receive a first feature pulse sequence of the object to be identified, which includes multiple pulses. The first feature pulse sequence is obtained through feature extraction and encoding based on a first portion of the AER events of the object to be identified within a first preset time period, and it carries the temporal and spatial information of that first portion of AER events. For details, refer to the description of Figure 2 above, which is not repeated here.

[0099] It should be noted that when dividing multiple AER events of an object to be identified into different time periods, the multiple AER events of the object to be identified can be segmented according to a uniform time interval, or they can be segmented according to a non-uniform time interval. This application does not make a specific limitation on this.
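The segmentation into time periods can be sketched as follows. The event layout `(timestamp_ms, x, y)`, the half-open interval convention and the function name are assumptions; passing evenly or unevenly spaced boundaries gives the uniform and non-uniform cases respectively.

```python
def segment_events(events, boundaries_ms):
    """Split AER events into time periods delimited by boundaries_ms.

    Each event is a (timestamp_ms, x, y) tuple (assumed layout). Period k
    covers boundaries_ms[k] <= timestamp < boundaries_ms[k + 1].
    """
    segments = [[] for _ in range(len(boundaries_ms) - 1)]
    for event in events:
        timestamp = event[0]
        for k in range(len(segments)):
            if boundaries_ms[k] <= timestamp < boundaries_ms[k + 1]:
                segments[k].append(event)
                break
    return segments


events = [(5, 1, 1), (40, 2, 3), (120, 0, 4), (199, 3, 3)]
uniform = segment_events(events, [0, 100, 200])      # equal 100 ms periods
non_uniform = segment_events(events, [0, 30, 200])   # unequal periods
```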

[0100] In this application, the object to be identified can be an object such as a pedestrian or a vehicle, or it can be a process of change such as an action or behavior.

[0101] Step 520: Obtain the first membrane voltage of the spiking neuron at the end of the first preset time period based on multiple pulses in the first characteristic pulse sequence and the set weights.

[0102] The membrane voltage of a spiking neuron changes in response to stimulation by the multiple pulses of the first feature pulse sequence. For example, the membrane voltage accumulates based on the spatial information contained in the pulses and decays based on the temporal information contained in them. The magnitude of the change in membrane voltage is related to the weights set on the neuron. These weights are determined based on the deviation between the predicted and actual results of the object to be identified in multiple different time periods. The training process for these weights is described in detail below in conjunction with Figures 6-7.

[0103] In this application, the spiking neuron can obtain the accumulated membrane voltage at the end of a first preset time period based on multiple pulses of the input first characteristic pulse sequence and the set weights.

[0104] Specifically, in one possible implementation, the membrane voltage of the spiking neuron can also be calculated using the following formula (1).

[0105] A possible formula for calculating the membrane voltage of a spiking neuron is shown below:

[0106]

[0107] Where t represents the current time, which can also be called the end time of each time interval;

[0108] V(t) represents the cumulative membrane voltage on the spiking neuron at time t;

[0109] w_i represents the weight value on synapse i;

[0110] t_i represents the pulse time on synapse i, and can also be called the timestamp at which the feature pulse sequence is received;

[0111] V_rest represents the resting potential of the spiking neuron.

[0112] It should be noted that the resting potential is generally 0. Over time, the feature value corresponding to each pixel in the feature map either decreases or increases towards the resting potential. A decrease towards the resting potential occurs when feature values greater than 0 decrease towards 0, such as changing from 1 to 0.5. An increase towards the resting potential occurs when feature values less than 0 increase towards 0, such as changing from -1 to -0.5.
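A form of formula (1) that is consistent with the symbol definitions above can be written as follows. It assumes the widely used tempotron-style neuron model, in which every pulse received before time t contributes a kernel response K scaled by the weight of its synapse; this form is an assumption and is not confirmed by the text itself.

```latex
V(t) = \sum_{i} w_i \sum_{t_i \le t} K(t - t_i) + V_{\mathrm{rest}} \tag{1}
```

Here the inner sum runs over all pulse times t_i on synapse i, and K is the function described in the following paragraphs.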

[0113] The formula for calculating function K is shown below:

[0114]

[0115] Where V_0 is a preset constant;

[0116] τ_m and τ_s represent the decay time constants of membrane integration and synaptic current, respectively;

[0117] exp() is a decay function that represents the degree of decay.
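The membrane voltage calculation can be sketched as below. Two forms are assumed here rather than taken from the text: K(t - t_i) = V_0 * (exp(-(t - t_i)/τ_m) - exp(-(t - t_i)/τ_s)) for the kernel, and V(t) = Σ_i w_i Σ K(t - t_i) + V_rest for the voltage. The parameter values, the dictionary layout and the function names are likewise illustrative.

```python
import math


def kernel(dt, v0=1.0, tau_m=20.0, tau_s=5.0):
    """Assumed kernel K: difference of two exponentials, zero before the pulse."""
    if dt < 0:
        return 0.0
    return v0 * (math.exp(-dt / tau_m) - math.exp(-dt / tau_s))


def membrane_voltage(t, spike_times, weights, v_rest=0.0):
    """Membrane voltage at time t under the assumed form of formula (1).

    spike_times maps a synapse index to the list of pulse times on it;
    weights maps the same index to the synaptic weight w_i.
    """
    v = v_rest
    for synapse, times in spike_times.items():
        v += weights[synapse] * sum(kernel(t - t_i) for t_i in times)
    return v


spike_times = {0: [10.0, 30.0], 1: [20.0]}   # pulse times in ms
weights = {0: 0.5, 1: -0.2}
v_end = membrane_voltage(100.0, spike_times, weights)   # end of a 100 ms period
```

Evaluating `membrane_voltage` at the end time of a preset time period gives the quantity referred to above as the first membrane voltage.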

[0118] Step 530: Determine the object to be identified as the first target object based on the membrane voltage of the spiking neuron.

[0119] In this application, there are multiple ways to determine the object to be identified as the first target object based on the membrane voltage of the spiking neuron. Several possible implementation methods are described in detail below.

[0120] In one possible implementation, the membrane voltage of the spiking neuron includes a first membrane voltage. If the spiking neuron represents a first target, and if the first membrane voltage is greater than a first preset threshold, the object to be identified can be determined as the first target object represented by the spiking neuron.

[0121] In another possible implementation, the membrane voltage of the spiking neuron includes a first membrane voltage. If the first ratio between the first membrane voltage and the second membrane voltage is greater than a second preset threshold, the object to be identified is determined to be the first target object represented by the spiking neuron. The second membrane voltage is the sum of the membrane voltages of multiple spiking neurons in the SNN at the end of the first preset time period.

[0122] Optionally, the object to be identified can be determined as the first target object represented by the spiking neuron if the ratio between the exponential of the first membrane voltage and the sum of the exponentials of the membrane voltages of the multiple spiking neurons in the SNN at the end of the first preset time period is greater than a second preset threshold.

[0123] One possible formula for calculating the probability that a feature pulse sequence represents a specific object (e.g., object j) by a certain spiking neuron is as follows:

[0124]

[0125] Where P(c_k = j) represents the probability that the input feature pulse sequence c_k is object j;

[0126] V_j represents the membrane voltage on the spiking neuron representing object j;

[0127] exp(V_j) represents the exponential of the membrane voltage on the spiking neuron representing object j;

[0128] n represents the number of spiking neurons in the SNN;

[0129] The denominator represents the sum of the exponentials of the membrane voltages on the n spiking neurons of the SNN.
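The ratio of exponentials described above is a softmax over the membrane voltages, and the decision rule compares it with the second preset threshold. The sketch below assumes one spiking neuron per object; the function names, the example voltages and the threshold value are illustrative. Subtracting the maximum voltage before exponentiating is a numerical-stability step that leaves the ratio unchanged.

```python
import math


def class_probabilities(voltages):
    """exp(V_j) divided by the sum of exp(V_i) over all n neurons."""
    shift = max(voltages)                      # stability only; ratio is unchanged
    exps = [math.exp(v - shift) for v in voltages]
    total = sum(exps)
    return [e / total for e in exps]


def recognize(voltages, threshold):
    """Return the index of the recognized object, or None if below threshold."""
    probs = class_probabilities(voltages)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] > threshold else None


probs = class_probabilities([2.0, 0.5, 0.1])
label = recognize([2.0, 0.5, 0.1], threshold=0.6)
```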

[0130] In another possible implementation, the membrane voltage of the spiking neuron includes a first membrane voltage and a third membrane voltage. The third membrane voltage is the membrane voltage of the spiking neuron at the end of a second preset time period, determined according to multiple pulses and weights in the second feature pulse sequence. If the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold, the object to be identified is determined to be the first target object.

[0131] In another possible implementation, the membrane voltage of the spiking neuron includes a first membrane voltage and a third membrane voltage. If the average or weighted average of the first ratio and the second ratio is greater than a second preset threshold, the object to be identified is determined to be the first target object. The second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.
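Combining results from two time periods by an average or weighted average can be sketched as follows. The same helper applies whether the inputs are membrane voltages (compared with the first preset threshold) or ratios (compared with the second preset threshold); the names, values and weights shown are assumptions.

```python
def combine_periods(values, weights=None):
    """Average (weights=None) or weighted average of per-period values."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def is_target(values, threshold, weights=None):
    """True if the combined value exceeds the preset threshold."""
    return combine_periods(values, weights) > threshold


first_ratio, second_ratio = 0.4, 0.8          # hypothetical per-period ratios
plain = combine_periods([first_ratio, second_ratio])
weighted = combine_periods([first_ratio, second_ratio], weights=[1.0, 3.0])
```

Giving the later period a larger weight, as in the second call, lets a more complete portion of the event stream dominate the decision.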

[0132] In the above technical solution, the spiking neurons in the spiking neural network (SNN) can identify objects based on partial AER events of the object to be identified within a preset time period. Thus, even when the input AER events are incomplete, the object to be identified can be identified based on partial AER events.

[0133] In addition, an embodiment of this application also provides a method for training the weights in an SNN, so that the SNN can identify the object to be identified based on the trained weights, according to the firing frequency of the spiking neuron or the membrane voltage of the spiking neuron. The method for training synaptic weights in an SNN provided in an embodiment of this application is described in detail below in conjunction with Figures 6-7.

[0134] Figure 6 is a schematic flowchart of a method for training synaptic weights in an SNN provided in an embodiment of this application. Referring to Figure 6, the method may include steps 610-670, which are described in detail below.

[0135] Step 610: Calculate the membrane voltage of the spiking neuron.

[0136] The membrane voltage of a spiking neuron is initialized to 0. After receiving the characteristic pulse sequence output by the coding layer, the membrane voltage of the spiking neuron will change according to the received characteristic pulse sequence. For the specific calculation method of the membrane voltage of each spiking neuron, please refer to the description of formula (1), which will not be repeated here.

[0137] Step 620: Initialize t_s = 0.

[0138] t_s represents the start time of each time period. Because of the decay mechanism of neurons, the weights can be updated segment by segment in this application. This makes full use of the spatial and temporal information of the object to be identified carried by the pulses of the pulse sequence within each time period, thereby improving the efficiency and accuracy of synaptic weight training.

[0139] Step 630: Determine whether t_s is less than the total length L of the feature pulse sequence.

[0140] If t_s is less than the total length L of the feature pulse sequence, steps 640-660 can be executed. If t_s is greater than or equal to the total length L of the feature pulse sequence, step 670 can be executed.

[0141] Step 640: Find the peak time t_peak of the membrane voltage of the spiking neuron within the range (t_s, t_s + t_R).

[0142] Specifically, in the embodiments of this application, starting from t_s and using t_R as a fixed search duration, the peak time t_peak of the membrane voltage of each spiking neuron within the range (t_s, t_s + t_R) can be determined.

[0143] In one possible implementation, the peak time t_peak of the spiking neuron can be determined based on formula (4).

[0144]

[0145] Where the left-hand side represents the peak time of the neuron representing object j.

[0146] It should be noted that if multiple time points within a segment satisfy the condition in formula (4) above, the earliest of them shall be selected as the peak time.
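The peak search with the earliest-time tie-break can be sketched as a scan over sampled time points. Sampling on a fixed grid with spacing `step`, including both ends of the search window, and the function name are assumptions; formula (4) itself is not reproduced here.

```python
def find_peak_time(voltage_at, t_s, t_r, step=1.0):
    """Return (t_peak, v_peak) for the largest voltage within the search window.

    voltage_at is a callable giving the membrane voltage at a time point.
    The strict comparison keeps the earliest time when several points tie.
    """
    best_t, best_v = None, float("-inf")
    for k in range(int(t_r / step) + 1):
        t = t_s + k * step
        v = voltage_at(t)
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v


# Hypothetical sampled voltages with a tie between t = 2 and t = 3.
samples = {0.0: 0.1, 1.0: 0.7, 2.0: 0.9, 3.0: 0.9, 4.0: 0.3}
t_peak, v_peak = find_peak_time(samples.__getitem__, t_s=0.0, t_r=4.0)
```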

[0147] Step 650: Adjust the synaptic weights based on the membrane voltage of the spiking neuron within the range (t_s, t_s + t_peak).

[0148] During training, the synaptic weights are first randomly initialized and are then adjusted based on the membrane voltage of the spiking neuron within the range (t_s, t_s + t_peak). For the specific method of adjusting the synaptic weights, refer to the description of Figure 7.

[0149] Figure 7 is a schematic flowchart of a method for adjusting synaptic weights provided in an embodiment of this application. As shown in Figure 7, the method includes steps 710-740, which are described in detail below.

[0150] Step 710: Approximate the relationship between the frequency of pulse firing by the spiking neuron and the voltage.

[0151] The relationship between the frequency of neuronal pulse firing and voltage in the embodiment of this application is shown in formula (5).

[0152] f_out = log(exp(V_peak) + 1)    (5)

[0153] Where f_out represents the frequency at which the spiking neuron fires pulses;

[0154] V_peak represents the membrane voltage of the neuron at its peak time t_peak.
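Formula (5) is the softplus function of the peak membrane voltage. A minimal sketch is shown below; the function name is an assumption, and the rewritten form max(v, 0) + log(1 + exp(-|v|)) is mathematically equal to log(exp(v) + 1) but avoids overflow for large voltages.

```python
import math


def firing_frequency(v_peak):
    """f_out = log(exp(V_peak) + 1), computed in an overflow-safe form."""
    return max(v_peak, 0.0) + math.log1p(math.exp(-abs(v_peak)))


f_zero = firing_frequency(0.0)     # equals log(2)
f_high = firing_frequency(2.0)
f_low = firing_frequency(-5.0)     # small but still positive
```

Because the result is always positive and smooth in V_peak, it can be differentiated during weight training even when the neuron is far from firing.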

[0155] The relationship between the frequency of neuron pulse firing of object j and voltage is shown in formula (6).

[0156]

[0157] Where the left-hand side represents the pulse firing frequency of the neuron representing object j;

[0158] The voltage term represents the membrane voltage of the neuron representing object j at its peak time t_peak.

[0159] Step 720: Determine the probability that the k-th sample belongs to object j.

[0160] One possible formula for calculating the probability that the k-th sample (e.g., the characteristic pulse sequence of the k-th time period) is a specific object (e.g., object j) is as follows:

[0161]

[0162] Where the left-hand side represents the predicted probability that the k-th sample belongs to object j;

[0163] The frequency term for object j represents the pulse firing frequency of the neuron representing object j;

[0164] The summation term represents the sum of the pulse firing frequencies of the n neurons.

[0165] Step 730: Determine the loss function for the k-th sample.

[0166] In this embodiment, the loss function of the k-th sample can be determined based on the probability, predicted in step 720, that the k-th sample belongs to object j and on the actual object to which the k-th sample belongs. For details, refer to the following formula (8).

[0167]

[0168] Where L_k represents the loss function of the k-th sample;

[0169] The probability term represents the predicted probability that the k-th sample belongs to the true object of the k-th sample.
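Steps 720 and 730 can be sketched as below. Two forms are assumed rather than taken from the text: the predicted probability is the firing frequency of the neuron for object j divided by the sum of the firing frequencies of all n neurons, and the loss is the negative logarithm of the probability assigned to the true object. Function names and example values are illustrative.

```python
import math


def predicted_probabilities(frequencies):
    """Assumed form of formula (7): each frequency divided by their sum."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


def sample_loss(frequencies, true_index):
    """Assumed form of formula (8): negative log of the true-object probability."""
    return -math.log(predicted_probabilities(frequencies)[true_index])


freqs = [3.0, 1.0]                       # hypothetical firing frequencies
probs = predicted_probabilities(freqs)
loss = sample_loss(freqs, true_index=0)
```

The loss shrinks toward zero as the neuron for the true object accounts for a larger share of the total firing frequency.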

[0170] Step 740: Minimize the loss function of the k-th sample and update the synaptic weights w_i.

[0171] The embodiments of this application can update the synaptic weights w_i based on the following formulas (9)-(11).

[0172]

[0173] Where λ is the learning rate, which is usually set to 0.1.

[0174]

[0175]
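Formulas (9)-(11) are not reproduced here, so the following is only a sketch of one gradient-descent update that is consistent with the surrounding steps. It assumes a membrane voltage that is linear in the weights, the softplus relation of formula (5), a frequency-normalized probability and a negative log-likelihood loss. The function names, the `inputs` layout and the choice to treat the peak time as fixed while differentiating are all assumptions.

```python
import math


def softplus(v):
    # f_out = log(exp(V_peak) + 1) in an overflow-safe form.
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def sigmoid(v):
    # Derivative of softplus with respect to the voltage.
    return 1.0 / (1.0 + math.exp(-v))


def peak_voltages(weights, inputs):
    # inputs[j][i]: summed kernel response of synapse i at neuron j's peak time.
    return [sum(w * x for w, x in zip(w_j, x_j)) for w_j, x_j in zip(weights, inputs)]


def sample_loss(weights, inputs, true_index):
    f = [softplus(v) for v in peak_voltages(weights, inputs)]
    return -math.log(f[true_index] / sum(f))


def update_weights(weights, inputs, true_index, lr=0.1):
    """One step w_i <- w_i - lr * dL/dw_i, using the chain rule L -> f -> V -> w."""
    v = peak_voltages(weights, inputs)
    f = [softplus(v_j) for v_j in v]
    total = sum(f)
    updated = []
    for j, (w_j, x_j) in enumerate(zip(weights, inputs)):
        dl_df = 1.0 / total - (1.0 / f[j] if j == true_index else 0.0)
        dl_dv = dl_df * sigmoid(v[j])
        updated.append([w - lr * dl_dv * x for w, x in zip(w_j, x_j)])
    return updated


weights = [[0.2, -0.1], [0.1, 0.3]]      # weights[j][i]: synapse i on neuron j
inputs = [[1.0, 0.5], [1.0, 0.5]]
new_weights = update_weights(weights, inputs, true_index=0)
```

In this example the update raises the weights of the neuron for the true object and lowers those of the other neuron, so the loss of the sample decreases.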

[0176] Step 660: Update t_s.

[0177] The embodiments of this application can update t_s according to formula (12).

[0178]

[0179] Step 670: End.

[0180] Optionally, in some embodiments, since the activity of individual spiking neurons is easily affected, population coding is employed to improve the reliability of information encoding. In this application, each object category is associated with a population of multiple spiking neurons. That is, multiple spiking neurons in the SNN can represent an object category.

[0181] The object recognition method provided in the embodiments of this application has been described in detail above in conjunction with Figures 1 to 7. The apparatus embodiments of this application are described in detail below in conjunction with Figure 8. It should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; therefore, for any part not described in detail, reference may be made to the foregoing method embodiments.

[0182] Figure 8 is a schematic block diagram of an object recognition device 800 provided in an embodiment of this application. The object recognition device can be implemented as part or all of a device through software, hardware, or a combination of both. The device provided in this embodiment can implement the process shown in Figure 5. The object recognition device 800 includes a receiving module 810, an acquisition module 820, and a determination module 830, wherein:

[0183] The receiving module 810 is used to receive a first feature pulse sequence of an object to be identified, wherein the first feature pulse sequence contains multiple pulses and is obtained based on a first part of the AER events of the object to be identified within a first preset time period.

[0184] The acquisition module 820 is used to obtain the first membrane voltage of the spiking neuron at the end of the first preset time period based on the plurality of pulses in the first characteristic pulse sequence and the set weights.

[0185] The determination module 830 is used to determine the object to be identified as a first target object based on the membrane voltage of the spiking neuron, wherein the membrane voltage of the spiking neuron includes the first membrane voltage.

[0186] In one possible implementation, the determining module 830 is specifically used to: determine the object to be identified as the first target object if the first membrane voltage on the first spiking neuron is greater than a first preset threshold.

[0187] In another possible implementation, the determining module 830 is specifically used to: determine the object to be identified as the first target object if the first ratio between the first membrane voltage and the second membrane voltage is greater than a second preset threshold, wherein the second membrane voltage is the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the first preset time period.

[0188] In another possible implementation, the receiving module 810 is further configured to: receive a second feature pulse sequence of the object to be identified, the second feature pulse sequence comprising multiple pulses, the second feature pulse sequence being obtained based on a second portion of AER events of the object to be identified within a second preset time period;

[0189] The acquisition module 820 is further configured to: acquire the third membrane voltage of the spiking neuron at the end of the second preset time period based on the plurality of pulses in the second characteristic pulse sequence and the set weights;

[0190] The determination module 830 is specifically used for:

[0191] The object to be identified is determined as the first target object based on the first membrane voltage and the third membrane voltage.

[0192] In another possible implementation, the determining module 830 is specifically used to: determine the object to be identified as the first target object if the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold.

[0193] In another possible implementation, the determining module 830 is specifically used to: determine the object to be identified as the first target object if the average or weighted average of the first ratio and the second ratio is greater than a second preset threshold, wherein the second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.

[0194] In another possible implementation, the AER event includes the timestamp and address information of the AER event.

[0195] In another possible implementation, the weight is determined based on the deviation between the predicted and actual results of the object to be identified over multiple different time periods.

[0196] In the technical solution provided in this application, the spiking neurons in the spiking neural network (SNN) can identify objects based on a portion of the AER events of the object to be identified within a preset time period, thereby enabling the identification of the object to be identified even when the input AER events are incomplete.

[0197] It should be noted that the object recognition device provided in the above embodiments is only illustrated by the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the object recognition device and the object recognition method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments above, which will not be repeated here.

[0198] In this embodiment, a computing device for object recognition is also provided. The computing device includes a processor and a memory. The memory is used to store one or more instructions. The processor implements the object recognition method provided above by executing the one or more instructions.

[0199] In this embodiment, a computer-readable storage medium is also provided, which stores instructions that, when executed on a computing device, cause the computing device to perform the object recognition method described above.

[0200] In this embodiment, a computer program product containing instructions is also provided, which, when run on a computing device, causes the computing device to execute the object recognition method provided above, or causes the computing device to implement the function of the object recognition device provided above.

[0201] It should be understood that in the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0202] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0203] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0204] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0205] The unit described as a separate component may or may not be physically separate. The component shown as a unit may or may not be a physical unit; that is, it may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0206] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0207] If this function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0208] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for object recognition, characterized in that the method is performed by spiking neurons in a spiking neural network (SNN), and the method includes:
receiving a first feature pulse sequence of an object to be identified, wherein the first feature pulse sequence contains multiple pulses, and the first feature pulse sequence is obtained based on a first portion of AER events of the object to be identified within a first preset time period;
obtaining, based on the multiple pulses in the first feature pulse sequence and set weights, a first membrane voltage of the spiking neuron at the end of the first preset time period;
receiving a second feature pulse sequence of the object to be identified, wherein the second feature pulse sequence contains multiple pulses, and the second feature pulse sequence is obtained based on a second portion of the AER events of the object to be identified within a second preset time period;
obtaining, based on the multiple pulses in the second feature pulse sequence and the set weights, a third membrane voltage of the spiking neuron at the end of the second preset time period; and
determining the object to be identified as a first target object based on the membrane voltage of the spiking neuron, wherein the membrane voltage of the spiking neuron includes the first membrane voltage and the third membrane voltage.

2. The method according to claim 1, characterized in that, The step of determining the object to be identified as the first target object based on the membrane voltage of the spiking neuron includes: If the first membrane voltage is greater than the first preset threshold, the object to be identified is determined to be the first target object.

3. The method according to claim 1, characterized in that, The step of determining the object to be identified as the first target object based on the membrane voltage of the spiking neuron includes: If the first ratio between the first membrane voltage and the second membrane voltage is greater than a second preset threshold, the object to be identified is determined to be the first target object, wherein the second membrane voltage is the sum of the membrane voltages of multiple spiking neurons in the SNN at the end of the first preset time period.

4. The method according to claim 1, characterized in that the step of determining the object to be identified as the first target object based on the first membrane voltage and the third membrane voltage comprises: if the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold, determining the object to be identified as the first target object.

5. The method according to claim 1, characterized in that the step of determining the object to be identified as the first target object based on the first membrane voltage and the third membrane voltage comprises: if the average or weighted average of a first ratio and a second ratio is greater than a second preset threshold, determining the object to be identified as the first target object, wherein the first ratio is the ratio between the first membrane voltage and the sum of the membrane voltages of multiple spiking neurons in the SNN at the end of the first preset time period, and the second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.

6. The method according to any one of claims 1 to 5, characterized in that each AER event includes a timestamp and address information of the AER event.

7. The method according to any one of claims 1 to 5, characterized in that the weights are determined based on the deviation between the predicted and actual results of the object to be identified in multiple different time periods.

8. An object recognition device, characterized in that the object recognition device is applied to a spiking neuron in a spiking neural network (SNN), and the device comprises: a receiving module, configured to receive a first feature pulse sequence of an object to be identified, wherein the first feature pulse sequence contains multiple pulses and is obtained based on a first portion of the AER events of the object to be identified within a first preset time period; an acquisition module, configured to obtain, based on the multiple pulses in the first feature pulse sequence and set weights, a first membrane voltage of the spiking neuron at the end of the first preset time period; the receiving module being further configured to receive a second feature pulse sequence of the object to be identified, wherein the second feature pulse sequence contains multiple pulses and is obtained based on a second portion of the AER events of the object to be identified within a second preset time period; the acquisition module being further configured to obtain, based on the multiple pulses in the second feature pulse sequence and the set weights, a third membrane voltage of the spiking neuron at the end of the second preset time period; and a determination module, configured to determine the object to be identified as a first target object based on the membrane voltage of the spiking neuron, wherein the membrane voltage of the spiking neuron includes the first membrane voltage and the third membrane voltage.

9. The device according to claim 8, characterized in that the determination module is specifically configured to: if the first membrane voltage of the spiking neuron is greater than a first preset threshold, determine the object to be identified as the first target object.

10. The device according to claim 8, characterized in that the determination module is specifically configured to: if a first ratio between the first membrane voltage and a second membrane voltage is greater than a second preset threshold, determine the object to be identified as the first target object, wherein the second membrane voltage is the sum of the membrane voltages of multiple spiking neurons in the SNN at the end of the first preset time period.

11. The device according to claim 8, characterized in that the determination module is specifically configured to: if the average or weighted average of the first membrane voltage and the third membrane voltage is greater than a first preset threshold, determine the object to be identified as the first target object.

12. The device according to claim 8, characterized in that the determination module is specifically configured to: if the average or weighted average of a first ratio and a second ratio is greater than a second preset threshold, determine the object to be identified as the first target object, wherein the first ratio is the ratio between the first membrane voltage and the sum of the membrane voltages of multiple spiking neurons in the SNN at the end of the first preset time period, and the second ratio is the ratio between the third membrane voltage and the sum of the membrane voltages of the multiple spiking neurons in the SNN at the end of the second preset time period.

13. The device according to any one of claims 8 to 12, characterized in that each AER event includes a timestamp and address information of the AER event.

14. The device according to any one of claims 8 to 12, characterized in that the weights are determined based on the deviation between the predicted and actual results of the object to be identified in multiple different time periods.

15. A computing device for object recognition, characterized by comprising: a communication interface, configured to receive the first feature pulse sequence of the object to be identified; and a processor, connected to the communication interface and configured to perform the method according to any one of claims 1 to 7.

16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed by a computing device, cause the computing device to perform the method according to any one of claims 1 to 7.
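
For illustration only, the membrane voltage computation and the decision rules recited in claims 1 to 5 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims only require that the membrane voltage be obtained from the pulses and the set weights, so the exponential decay kernel, the time constant `tau`, the `(channel, time)` pulse representation, and every function name below are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Hypothetical representation: a pulse is (input channel index, pulse time).
Pulse = Tuple[int, float]


def membrane_voltage(pulses: Sequence[Pulse], weights: Sequence[float],
                     t_end: float, tau: float = 20.0) -> float:
    """Weighted sum of pulses, each decayed exponentially up to t_end.

    The decay kernel and tau are assumptions made for this sketch; pulses
    arriving after t_end are ignored.
    """
    return sum(weights[ch] * math.exp(-(t_end - t) / tau)
               for ch, t in pulses if t <= t_end)


def voltage_ratio(own: float, all_voltages: Sequence[float]) -> float:
    """Ratio of one neuron's voltage to the sum over several neurons."""
    total = sum(all_voltages)
    return own / total if total != 0 else 0.0


def weighted_average(a: float, b: float,
                     wa: float = 1.0, wb: float = 1.0) -> float:
    """Weighted average of two values; equal weights give the plain average."""
    return (wa * a + wb * b) / (wa + wb)


def is_first_target(value: float, threshold: float) -> bool:
    """Threshold comparison shared by the rules of claims 2 to 5."""
    return value > threshold


# Usage: two time periods, each contributing one membrane voltage.
weights = [0.5, 0.25]
v1 = membrane_voltage([(0, 10.0), (1, 10.0)], weights, t_end=10.0)  # first
v3 = membrane_voltage([(0, 20.0)], weights, t_end=20.0)             # third
decision = is_first_target(weighted_average(v1, v3), threshold=0.6)
```

Under the rule of claim 4, the value compared against the threshold is the average of `v1` and `v3`; under the rule of claim 5, `voltage_ratio` would be applied per time period first and the two ratios averaged instead.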