Identification methods and systems, chips, terminal devices and storage media
By configuring the mailbox register and SRAM in the terminal device and utilizing the prefetching and mailbox mechanisms, efficient data processing of the terminal device's recognition algorithm at the hardware level is achieved, solving the problems of long processing time, low efficiency, and high power consumption, and improving the performance of recognition processing.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 伟光有限公司(CN)
- Filing Date
- 2022-11-22
- Publication Date
- 2026-06-30
AI Technical Summary
Recognition algorithms on existing terminal devices are slow, inefficient, and power-hungry, which limits recognition processing performance.
The terminal device is equipped with a mailbox register, SRAM and NPU. The model data and the data to be identified are pre-stored in the NPU and SRAM using a prefetch mechanism. The data status parameters are recorded through the mailbox register to realize hardware interaction between the first processor and the NPU, reducing software interaction.
It effectively reduces power consumption, shortens data processing time, improves recognition and processing efficiency, and enhances recognition and processing performance.
Figure CN115761742B_ABST
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to an identification method and system, a chip, a terminal device, and a storage medium.
Background Technology
[0002] Currently, with the development of terminal technology, deploying algorithms such as Optical Character Recognition (OCR) on the terminal side has gradually become an inevitable choice. In order to meet the high computing power and power consumption requirements of recognition algorithms, terminal devices need to continuously increase computing power and load capacity.
[0003] However, the recognition algorithms currently running on terminal devices still suffer from drawbacks such as long processing time, low efficiency, and high power consumption, so recognition processing performance cannot be effectively improved.
Summary of the Invention
[0004] This application provides an identification method and system, chip, terminal device and storage medium, which can effectively reduce power consumption, shorten data processing time, improve identification processing efficiency, and thus greatly improve the performance of identification processing.
[0005] The technical solution of this application embodiment is implemented as follows:
[0006] In a first aspect, embodiments of this application provide an identification method, wherein the terminal device is configured with a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor, wherein the mailbox register is used to record a first data state corresponding to the first processor and a second data state corresponding to the NPU; the method includes:
[0007] In response to a recognition instruction, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein the recognition instruction is used to instruct that recognition processing be performed on the data to be recognized.
[0008] The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM.
[0009] Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data;
[0010] The NPU performs identification and processing on the data to be processed based on the model data.
[0011] In a second aspect, embodiments of this application provide an identification system, which includes a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor. The mailbox register is used to record a first data state corresponding to the first processor and a second data state corresponding to the NPU. The identification system is configured to execute:
[0012] In response to a recognition instruction, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein the recognition instruction is used to instruct that recognition processing be performed on the data to be recognized.
[0013] The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM.
[0014] Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data;
[0015] The NPU performs identification and processing on the data to be processed based on the model data.
[0016] In a third aspect, embodiments of this application provide a chip, the chip including a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor, wherein the mailbox register is used to record a first data state corresponding to the first processor and a second data state corresponding to the NPU; the chip is configured to execute:
[0017] In response to a recognition instruction, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein the recognition instruction is used to instruct that recognition processing be performed on the data to be recognized.
[0018] The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM.
[0019] Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data;
[0020] The NPU performs identification and processing on the data to be processed based on the model data.
[0021] In a fourth aspect, embodiments of this application provide a terminal device, which includes a storage unit, a preprocessing unit, and an identification unit.
[0022] The storage unit is configured to, in response to a recognition instruction, pre-store the model data corresponding to the data to be recognized in the NPU and store the data to be recognized in the SRAM; wherein, the recognition instruction is configured to instruct the data to be recognized to undergo recognition processing;
[0023] The preprocessing unit is used to preprocess the data to be identified stored in the SRAM by the first processor;
[0024] The storage unit is further configured to store preprocessed data into the SRAM; and to store data to be processed into the NPU in advance based on the first data status parameters and the second data status parameters recorded in the mailbox register; wherein the data to be processed is part or all of the data in the preprocessed data;
[0025] The identification unit is used to identify and process the data to be processed by the NPU based on the model data.
[0026] In a fifth aspect, embodiments of this application provide a terminal device, the terminal device including a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor, wherein the mailbox register is used to record a first data state corresponding to the first processor and a second data state corresponding to the NPU; the terminal device is used to implement the method described in the first aspect.
[0027] In a sixth aspect, embodiments of this application provide a computer-readable storage medium having a program stored thereon, characterized in that, when the program is executed by a processor, it implements the method described in the first aspect.
[0028] This application provides an identification method and system, chip, terminal device, and storage medium. The terminal device is configured with a mailbox register, SRAM, NPU, and a first processor. The mailbox register is used to record a first data state corresponding to the first processor and a second data state corresponding to the NPU. In response to an identification command, model data corresponding to the data to be identified is pre-stored in the NPU, and the data to be identified is stored in the SRAM. The identification command is used to instruct the data to be identified to undergo identification processing. The first processor pre-processes the data to be identified stored in the SRAM, and stores the pre-processed data in the SRAM. Based on the first and second data state parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU. The data to be processed is part or all of the data in the pre-processed data. The NPU performs identification processing on the data to be processed based on the model data. In other words, in this application's embodiment, based on the pre-fetching mechanism, the terminal device can pre-extract model data, data to be identified, and related data during the identification process, thereby shortening the data processing time during the identification process. At the same time, based on the first and second data state parameters recorded in the mailbox register, the terminal device realizes hardware interaction between the first processor and the NPU, thereby reducing software interaction. As can be seen, the identification method proposed in this application, by utilizing the prefetching mechanism and the Mailbox mechanism, can effectively reduce power consumption, shorten data processing time, improve identification processing efficiency, and thus greatly enhance the performance of identification processing.
Attached Figure Description
[0029] Figure 1 is hardware diagram 1 for common recognition processing;
[0030] Figure 2 is hardware diagram 2 for common recognition processing;
[0031] Figure 3 is schematic diagram 1 of the implementation framework of the identification method;
[0032] Figure 4 is schematic diagram 1 of the implementation process of the identification method;
[0033] Figure 5 is schematic diagram 2 of the implementation framework of the identification method;
[0034] Figure 6 is schematic diagram 3 of the implementation framework of the identification method;
[0035] Figure 7 is schematic diagram 2 of the implementation process of the identification method;
[0036] Figure 8 is schematic diagram 4 of the implementation framework of the identification method;
[0037] Figure 9 is schematic diagram 3 of the implementation process of the identification method;
[0038] Figure 10 is a hardware diagram of the identification system;
[0039] Figure 11 is a schematic diagram of the composition structure of the identification system;
[0040] Figure 12 is a schematic diagram of the structure of the chip;
[0041] Figure 13 is schematic diagram 1 of the composition structure of the terminal device;
[0042] Figure 14 is schematic diagram 2 of the composition structure of the terminal device.
Detailed Implementation
[0043] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are only for explaining the relevant application and not for limiting the application. Furthermore, it should be noted that, for ease of description, only the parts related to the relevant application are shown in the accompanying drawings.
[0044] Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence.
[0045] Artificial intelligence (AI) is a branch of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a way similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Since its inception, AI has matured in both theory and technology, and its applications have expanded continuously. It is conceivable that future AI-driven technological products will serve as "containers" of human wisdom. AI can simulate the information processes of human consciousness and thought. While AI is not human intelligence, it can think like a human and may even surpass human intelligence.
[0046] Optical Character Recognition (OCR) is the process by which electronic devices (such as scanners or digital cameras) examine characters printed on paper and then use character recognition methods to translate the shapes into computer text; that is, scanning text documents and then analyzing and processing image files to obtain text and layout information.
[0047] The AI algorithms involved in free text recognition mainly involve OCR detection. OCR recognition algorithms are primarily deployed in the cloud, utilizing GPU parallel matrix operation chips and NPU dedicated AI chips for inference. As the computing power of edge hardware platforms continues to improve, deployment on the edge has become an inevitable choice for OCR. However, OCR is computationally intensive and has high power consumption requirements, especially for high-precision OCR. Therefore, it is necessary to strengthen the configuration of the edge computing engine and increase its computing power to handle the large OCR load.
[0048] Figure 1 is hardware diagram 1 for common recognition processing. As shown in Figure 1, for dynamic OCR, data from the Mobile Industry Processor Interface (MIPI) is generally stored in double data rate (DDR) memory. The image signal processor (ISP) reads the data, processes it, and writes the resulting data back to DDR. The microcontroller unit (MCU) then notifies the NPU to retrieve and process the data. For static OCR, the ISP retrieves the data directly from DDR.
[0049] In the above scheme, all data transfers among the MIPI, ISP, and NPU pass through DDR. In the OCR scenario, the NPU's model data is also stored in DDR. While the ISP and NPU exchange data, DDR must remain powered even if the MIPI is transmitting nothing. DDR power consumption is therefore high, which makes it difficult to meet the low power consumption requirements of OCR. DDR latency is also high, which makes it difficult to meet the OCR processing time requirements.
[0050] Figure 2 is hardware diagram 2 for common recognition processing. As shown in Figure 2, a smaller static random-access memory (SRAM) can also be used as a cache. Data from the MIPI is stored in DDR; the ISP reads the data for processing and writes the generated data into SRAM first; and the MCU notifies the NPU to retrieve the data for processing.
[0051] In the above scheme, SRAM is used as the storage intermediary for interaction between the NPU and ISP, which reduces power consumption and improves the efficiency of NPU data access. However, dynamic OCR still requires DDR to store data and serve as the storage intermediary for data interaction between the ISP and NPU; in static OCR, images are stored in DDR. Because the capacity of SRAM cannot be too large, the NPU needs to know the amount of data that can be processed in SRAM, which requires MCU participation for control. The interaction between ISP and NPU is relatively frequent, resulting in a high load on the MCU and low interaction efficiency.
[0052] It is evident that current recognition algorithms running on terminal devices still suffer from drawbacks such as long processing time, low efficiency, and high power consumption, failing to effectively improve recognition processing performance.
[0053] To address the aforementioned issues, in embodiments of this application, the terminal device is configured with a mailbox register, SRAM, NPU, and a first processor. The mailbox register records a first data state corresponding to the first processor and a second data state corresponding to the NPU. In response to a recognition instruction, model data corresponding to the data to be recognized is pre-stored in the NPU, and the data to be recognized is stored in the SRAM. The recognition instruction instructs the data to be recognized to undergo recognition processing. The first processor pre-processes the data to be recognized stored in the SRAM and stores the pre-processed data in the SRAM. Based on the first and second data state parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU. The data to be processed is part or all of the data in the pre-processed data. The NPU performs recognition processing on the data to be processed based on the model data. In other words, in embodiments of this application, based on a pre-fetching mechanism, the terminal device can pre-extract model data, data to be recognized, and relevant data during the recognition process, thereby shortening data processing time during the recognition process. Simultaneously, based on the first and second data state parameters recorded in the mailbox register, the terminal device implements hardware interaction between the first processor and the NPU, thereby reducing software interaction. As can be seen, the identification method proposed in this application, by utilizing the prefetching mechanism and the Mailbox mechanism, can effectively reduce power consumption, shorten data processing time, improve identification processing efficiency, and thus greatly enhance the performance of identification processing.
[0054] The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings.
[0055] One embodiment of this application provides an identification method applied in a terminal device, wherein the terminal device may be configured with a mailbox register, SRAM, NPU, and a first processor.
[0056] For example, in the embodiments of this application, Figure 3 is schematic diagram 1 of the implementation framework of the identification method. As shown in Figure 3, the terminal device may include a mailbox register, SRAM, an NPU, and a first processor. The mailbox register records the first data state corresponding to the first processor and the second data state corresponding to the NPU. The SRAM stores the data to be identified and the data generated during the identification process.
[0057] In other words, in the embodiments of this application, on the one hand, SRAM can be used to store relevant data, and on the other hand, the Mailbox mechanism of the mailbox register can be used to record the data processing process and status corresponding to the NPU and the first processor. That is, the mailbox register can record the data generation status of the first processor and the data processing status of the NPU at the same time, thereby achieving data decoupling between the first processor and the NPU.
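The decoupling described above can be illustrated with a minimal software model of the mailbox register. This is only a sketch: the class name `MailboxRegister`, its byte-count fields, and its methods are hypothetical, since the text only states that the register records a first data state for the first processor and a second data state for the NPU.

```python
from dataclasses import dataclass


@dataclass
class MailboxRegister:
    """Hypothetical model of the mailbox register (field names are assumptions)."""
    produced_bytes: int = 0  # first data state: reported by the first processor
    consumed_bytes: int = 0  # second data state: reported by the NPU

    def report_produced(self, nbytes: int) -> None:
        # Written only by the first processor side.
        self.produced_bytes += nbytes

    def report_consumed(self, nbytes: int) -> None:
        # Written only by the NPU side; it cannot consume data not yet produced.
        if self.consumed_bytes + nbytes > self.produced_bytes:
            raise ValueError("NPU cannot consume data that was not produced")
        self.consumed_bytes += nbytes

    def pending_bytes(self) -> int:
        # Data generated by the first processor but not yet processed by the NPU.
        return self.produced_bytes - self.consumed_bytes
```

Because each side writes only its own field and reads the other's, neither side needs to call the other directly, which is one way to picture the data decoupling between the first processor and the NPU.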
[0058] Furthermore, in the embodiments of this application, Figure 4 is schematic diagram 1 of the implementation process of the identification method. As shown in Figure 4, the method for the terminal device to perform identification processing may include the following steps:
[0059] Step 101: In response to the recognition instruction, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein, the recognition instruction is used to instruct the data to be recognized to be processed for recognition.
[0060] In the embodiments of this application, the terminal device may, in response to the received recognition instruction, store the model data corresponding to the data to be recognized in the NPU in advance, and store the data to be recognized in the SRAM.
[0061] It should be noted that, in the embodiments of this application, the recognition instruction received by the terminal device can be used to instruct the data to be recognized to undergo recognition processing. The recognition instruction can be any type of recognition processing instruction. For example, the recognition type of the recognition instruction can include optical character recognition (OCR), gesture recognition, face recognition, voice recognition, biometric recognition, etc., and this application does not specifically limit this.
[0062] Furthermore, in the embodiments of this application, the data to be identified corresponding to the recognition instruction can also be data of any format or form. For example, corresponding to the recognition type of the recognition instruction, when the recognition instruction is OCR, gesture recognition, face recognition, etc., the data to be identified can be image data; when the recognition instruction is speech recognition, the data to be identified can be speech data. Of course, the data to be identified can also be other types of data, and this application does not specifically limit this.
[0063] It is understood that, in the embodiments of this application, the terminal device can receive recognition instructions in a variety of ways. For example, the terminal device can receive recognition instructions through a configured touch screen, through a configured physical button, or through a configured voice acquisition device. This application does not specifically limit the methods used in this regard.
[0064] For example, in an embodiment of this application, the terminal device can run a first application. When the OCR function in the first application is activated by the user's touch operation, the terminal device receives the corresponding recognition instruction, which in this case indicates text recognition.
[0065] Furthermore, in the embodiments of this application, based on the prefetch mechanism, after the terminal device receives the recognition instruction, it can wake up the NPU in advance and store the model data corresponding to the data to be recognized in the NPU in advance before the NPU performs recognition processing.
[0066] It should be noted that, in the embodiments of this application, the terminal device may further include a double-data-rate memory (DDR), wherein the DDR stores model data corresponding to the data to be identified. After receiving the identification instruction, the model data corresponding to the data to be identified can be read from the DDR in advance, and then the model data is stored in the internal storage space of the NPU.
[0067] For example, in the embodiments of this application, Figure 5 is schematic diagram 2 of the implementation framework of the identification method. As shown in Figure 5, the terminal device can be equipped with DDR. The DDR can store model data, which can be pre-stored in the NPU's internal memory before the NPU performs recognition processing. This eliminates the need to retrieve the model data during subsequent recognition processing, thus saving time and reducing power consumption.
[0068] Furthermore, in the embodiments of this application, the terminal device may also include a microcontroller unit (MCU), wherein the MCU can respond to the received recognition command and wake up the NPU in advance, so that the NPU can pre-store the model data corresponding to the data to be recognized.
[0069] It is understood that in the implementation of this application, the terminal device can be any form such as an electronic device, a chip, an integrated circuit (IC), or an application-specific integrated circuit (ASIC). For example, the identification method proposed in the embodiments of this application can be implemented by a chip, which can integrate an MCU, an NPU, a mailbox register, SRAM, a first processor, DDR, etc.
[0070] For example, in the embodiments of this application, Figure 6 is schematic diagram 3 of the implementation framework of the identification method. As shown in Figure 6, the terminal device can be equipped with an MCU. Upon receiving a recognition command, the MCU can instruct the NPU to read and store the model data in advance. That is, before the NPU starts recognition processing, the NPU prefetches the model data into its internal memory, so that the model data does not need to be extracted in the subsequent recognition processing, thereby saving time and reducing power consumption.
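The wake-up and model prefetch sequence can be sketched as follows. The classes `Npu` and `Mcu`, their methods, and the dictionary standing in for DDR are all hypothetical illustrations; the text does not specify any software interface.

```python
class Npu:
    """Hypothetical NPU model with an internal memory slot for model data."""

    def __init__(self):
        self.awake = False
        self.model = None

    def wake(self):
        self.awake = True

    def prefetch_model(self, ddr, model_key):
        # Copy the model data from DDR into NPU internal memory ahead of use.
        if not self.awake:
            raise RuntimeError("NPU must be woken before prefetching")
        self.model = ddr[model_key]

    def infer(self, data):
        if self.model is None:
            raise RuntimeError("model data was not prefetched")
        # Placeholder for real inference: no DDR access is needed here.
        return (self.model["name"], data)


class Mcu:
    """Hypothetical MCU that reacts to a recognition instruction."""

    def __init__(self, npu, ddr):
        self.npu = npu
        self.ddr = ddr

    def on_recognition_instruction(self, model_key):
        # Wake the NPU early, then have it prefetch the matching model data.
        self.npu.wake()
        self.npu.prefetch_model(self.ddr, model_key)
```

In this sketch, `infer` never touches `ddr`, which mirrors the point that the model data does not need to be fetched again once recognition processing starts.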
[0071] Furthermore, in embodiments of this application, in response to a recognition instruction that instructs recognition processing of the data to be recognized, the terminal device may also store the data to be recognized in SRAM. Specifically, before the first processor and the NPU perform their respective processing on the data to be recognized, the terminal device may first store that data in SRAM.
[0072] It should be noted that, in the embodiments of this application, if the recognition instruction is static OCR, based on the prefetch mechanism, the terminal device can read the data to be recognized from DDR in advance, and then store the data to be recognized in SRAM.
[0073] In other words, in the embodiments of this application, for the recognition instruction characterizing static OCR, the data to be recognized is initially stored in DDR. After receiving the recognition instruction, the terminal device can read the corresponding data to be recognized from DDR in advance, and then store the data to be recognized in SRAM. That is, the terminal device can prefetch the data to be recognized from DDR into SRAM.
[0074] It is understood that, in the embodiments of this application, for the recognition instruction characterizing static OCR, the data to be recognized can be at least one frame of image stored in DDR, and the at least one frame of image can be prefetched and stored in SRAM before the recognition process begins, so that the data to be recognized does not need to be read from DDR during the subsequent recognition process, thereby improving the processing speed.
[0075] Furthermore, in the embodiments of this application, Figure 7 is schematic diagram 2 of the implementation process of the identification method. As shown in Figure 7, after the model data corresponding to the data to be identified is pre-stored in the NPU and the data to be identified is stored in the SRAM in response to the recognition command, i.e. after step 101, the method for the terminal device to perform recognition processing may further include the following step:
[0076] Step 105: Turn off DDR to switch DDR to a low-power state.
[0077] In the embodiments of this application, after the terminal device stores the model data corresponding to the data to be identified in the NPU and stores the data to be identified in the SRAM in advance, it can further choose to turn off DDR so that DDR switches to a low-power state.
[0078] In other words, in the embodiments of this application, for static OCR, after the terminal device prefetches the model data corresponding to the data to be identified stored in DDR into the NPU internal memory based on the prefetch mechanism, and prefetches the data to be identified stored in DDR into SRAM, it can choose to put DDR in a low-power state, thereby further reducing power consumption.
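The ordering in step 105 (prefetch everything that lives in DDR first, then put DDR into a low-power state) can be sketched as below. The `Ddr` class, the `DdrState` enum, and `prepare_static_ocr` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class DdrState(Enum):
    ACTIVE = "active"
    LOW_POWER = "low_power"


class Ddr:
    """Hypothetical DDR model that refuses reads while in low power."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.state = DdrState.ACTIVE

    def read(self, key):
        if self.state is not DdrState.ACTIVE:
            raise RuntimeError("DDR is in a low-power state")
        return self.contents[key]

    def enter_low_power(self):
        self.state = DdrState.LOW_POWER


def prepare_static_ocr(ddr, model_key, frame_keys):
    # Prefetch the model data into (a stand-in for) NPU internal memory.
    npu_memory = ddr.read(model_key)
    # Prefetch the frames to be recognized into (a stand-in for) SRAM.
    sram = [ddr.read(key) for key in frame_keys]
    # Only after both prefetches is it safe to switch DDR to low power.
    ddr.enter_low_power()
    return npu_memory, sram
```

The sketch makes the dependency explicit: any read issued after `enter_low_power` fails, so both prefetches have to complete before DDR is switched off.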
[0079] Furthermore, in the embodiments of this application, if the recognition instruction is dynamic OCR, the terminal device can directly store the data to be recognized into SRAM after obtaining the data to be recognized from the data interface.
[0080] In other words, in the embodiments of this application, for a recognition instruction that represents dynamic OCR, after receiving the recognition instruction, the terminal device can directly obtain the data to be recognized from the data interface and store it in SRAM, without using DDR to store the data to be recognized.
[0081] It is understood that, in the embodiments of this application, for the recognition instruction representing dynamic OCR, the data to be recognized can be at least one frame of image collected by the terminal device through the image sensor, and the at least one frame of image can be directly stored in SRAM through the data interface, instead of using DDR, which reduces power consumption to a certain extent.
[0082] It should be noted that, in the embodiments of this application, the terminal device may be configured with a data interface, through which the real-time collected data to be identified can be transmitted. The data interface can be of any type, and this application does not impose any specific limitations.
[0083] For example, in the embodiments of this application, Figure 8 is schematic diagram 4 of the implementation framework of the identification method. As shown in Figure 8, the terminal device can be equipped with a data interface such as the Mobile Industry Processor Interface (MIPI). In dynamic OCR, MIPI data can be stored directly in SRAM, so dynamic OCR no longer uses DDR, which reduces power consumption.
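The difference between the static and dynamic input paths can be summarized in a small sketch. The function `stage_input`, the mode strings, and the frame lists are hypothetical; the return value simply reports whether DDR had to be read to stage the input.

```python
def stage_input(mode, sram, ddr_frames=(), interface_frames=()):
    """Copy the frames to be recognized into `sram`.

    Returns True if DDR was read (static OCR), False if the frames came
    straight from the data interface (dynamic OCR).
    """
    if mode == "static":
        # Static OCR: frames are initially stored in DDR and prefetched to SRAM.
        sram.extend(ddr_frames)
        return True
    if mode == "dynamic":
        # Dynamic OCR: frames go from the data interface directly into SRAM.
        sram.extend(interface_frames)
        return False
    raise ValueError(f"unknown OCR mode: {mode!r}")
```

For example, `stage_input("dynamic", sram, interface_frames=frames)` fills `sram` without touching any DDR-backed data, which corresponds to the power saving described for the dynamic case.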
[0084] It should be noted that, in the embodiments of this application, the terminal device can be any device with storage and data processing capabilities, such as smartphones, tablets, handheld computers, mobile stations (MS), mobile terminals, wearable smart devices, smart TVs, etc.
[0085] Step 102: The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM.
[0086] In the embodiments of this application, after the terminal device responds to the recognition command by storing the model data corresponding to the data to be recognized in the NPU and storing the data to be recognized in the SRAM, it can further preprocess the data to be recognized stored in the SRAM through the first processor and store the preprocessed data in the SRAM.
[0087] It should be noted that, in the embodiments of this application, the first processor can be used to preprocess the data to be identified. The first processor can be any type of processor used to preprocess different types of data to be identified; this application does not impose any specific limitations on this.
[0088] For example, in an embodiment of this application, when the data to be identified is image data, the first processor can be an image signal processor (ISP), that is, the terminal device can preprocess the data to be identified, including image data, through the ISP.
[0089] Furthermore, in the embodiments of this application, after the terminal device preprocesses the data to be identified stored in the SRAM through the first processor to obtain the preprocessed data, it can store the preprocessed data into the SRAM.
[0090] It is understood that, in the embodiments of this application, the preprocessed data generated after the first processor preprocesses the data to be identified will not be stored in DDR, but will be stored in SRAM, thereby reducing power consumption.
[0091] Furthermore, in the embodiments of this application, after the first processor preprocesses the data to be identified stored in the SRAM, the terminal device may choose to update the first data status parameter recorded in the mailbox register.
[0092] It should be noted that, in the embodiments of this application, the terminal device can record the first data status parameters corresponding to the first processor through the mailbox register. These first data status parameters can be used to determine the data generation status of the first processor. For example, the first data status parameters may include, but are not limited to, parameters such as the storage address and size of the preprocessed data after preprocessing by the first processor.
[0093] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can record the status of the preprocessed data corresponding to the first processor, that is, record the first data status parameters. Accordingly, the first processor can update the first data status parameters recorded in the mailbox register.
[0094] For example, in the embodiments of this application, the first data status parameter recorded in the mailbox register may be updated at a preset time interval while the first processor is preprocessing the data to be identified; alternatively, it may be updated once, after the first processor has completed the preprocessing of the data to be identified.
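The two update policies described above can be illustrated with a minimal Python sketch. It is only a software model, not the hardware design of this application: the names `FirstDataStatus`, `MailboxRegister`, and `preprocess` are hypothetical, and the "preset time interval" is approximated here as "every `update_every` chunks", which is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FirstDataStatus:
    """Hypothetical layout of the first data status parameters."""
    address: int = 0  # SRAM storage address of the preprocessed data
    size: int = 0     # amount of preprocessed data written so far


class MailboxRegister:
    """Toy stand-in for the mailbox register (not a real hardware API)."""

    def __init__(self) -> None:
        self.first = FirstDataStatus()

    def update_first(self, address: int, size: int) -> None:
        self.first = FirstDataStatus(address, size)


def preprocess(mailbox, base_address, chunk_sizes, update_every=None):
    """Model the first processor writing chunks of preprocessed data.

    If `update_every` is an integer, the mailbox is updated after every
    `update_every` chunks (assumed stand-in for a preset time interval).
    If it is None, the mailbox is updated once, after preprocessing ends.
    Returns the sequence of sizes that were written to the mailbox.
    """
    written = 0
    history = []
    for index, chunk in enumerate(chunk_sizes, start=1):
        written += chunk
        if update_every is not None and index % update_every == 0:
            mailbox.update_first(base_address, written)
            history.append(written)
    # Always publish the final state so the consumer sees all the data.
    if not history or history[-1] != written:
        mailbox.update_first(base_address, written)
        history.append(written)
    return history
```

With four 64-unit chunks and `update_every=2`, the mailbox is updated twice (at 128 and 256 units); with `update_every=None` it is updated only once, at the end.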
[0095] Step 103: Based on the first and second data status parameters recorded in the mailbox register, store the data to be processed in the NPU in advance; wherein, the data to be processed is part or all of the data in the preprocessed data.
[0096] In the embodiments of this application, after the terminal device preprocesses the data to be identified stored in the SRAM by the first processor and stores the preprocessed data in the SRAM, it can further store the data to be processed in the NPU in advance based on the first data status parameters and the second data status parameters recorded in the mailbox register; wherein, the data to be processed is part or all of the data in the preprocessed data.
[0097] It should be noted that, in the embodiments of this application, during the identification process, based on the prefetch mechanism, the data to be processed in the preprocessed data stored in SRAM can also be prefetched into the NPU. Specifically, the terminal device can determine the amount of data required by the NPU based on the first data status parameters corresponding to the first processor recorded in the mailbox register and the second data status parameters corresponding to the NPU, thereby determining the data to be prefetched for processing.
[0098] It is understood that, in the embodiments of this application, the data to be processed can be part or all of the preprocessed data. Specifically, after the first processor completes the preprocessing of the data to be identified, it stores the preprocessed data in SRAM. The NPU, after determining the required data to be processed based on the first and second data status parameters, can pre-extract the data to be processed from the SRAM and store it in an internal intermediate buffer within the NPU.
[0099] It should be noted that, in the embodiments of this application, the second data status parameter can be used to determine the data processing status of the NPU. For example, the second data status parameter may include, but is not limited to: the amount of data already processed by the NPU, and the amount of data the NPU is expected to process.
[0100] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can simultaneously record the data generation status of the first processor and the data processing status of the NPU, that is, record the first data status parameter and the second data status parameter, thereby achieving data decoupling between the first processor and the NPU. The NPU can flexibly read data from SRAM using the first data status parameter and the second data status parameter, thus satisfying the uneven bandwidth requirements of its own data processing.
[0101] It is understood that, in the embodiments of this application, the terminal device determines the data to be processed based on the first data status parameter and the second data status parameter recorded in the mailbox register, thereby prefetching the data to be processed from the SRAM to the NPU, which can reduce the capacity requirement of the SRAM and also reduce the time delay of the NPU output data.
[0102] Furthermore, in the embodiments of this application, the data to be processed determined based on the first data state parameter and the second data state parameter conforms to both the data generation situation of the first processor and the data processing situation of the NPU, so that the interaction between the first processor and the NPU can achieve small granularity and high efficiency, without waiting for the first processor to process a certain amount of data before the NPU starts processing, thereby solving the problem of the NPU waiting for data.
[0103] It is understood that, in the embodiments of this application, the determination of the amount of data to be processed and the prefetching time can be made by referring to the first data status parameter and the second data status parameter, and can also be combined with the internal processing mechanism of the NPU, so as to meet the uneven bandwidth demand during the NPU data processing process.
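As an illustration of how the data to be processed could be derived from the two status parameters, the following Python sketch computes a prefetch amount. The rule used (prefetch the smaller of "data produced but not yet consumed" and "data the NPU expects to process next") is an assumption chosen for illustration; this application does not fix a specific formula, and the function name `amount_to_prefetch` is hypothetical.

```python
def amount_to_prefetch(produced: int, consumed: int, wanted_next: int) -> int:
    """Return how much preprocessed data to prefetch from SRAM into the NPU.

    produced    -- from the first data status parameter: amount of
                   preprocessed data the first processor has written to SRAM
    consumed    -- from the second data status parameter: amount of data
                   the NPU has already processed
    wanted_next -- from the second data status parameter: amount of data
                   the NPU expects to process next
    """
    if produced < 0 or consumed < 0 or wanted_next < 0:
        raise ValueError("status parameters must be non-negative")
    # Data that is in SRAM but has not been taken by the NPU yet.
    available = max(produced - consumed, 0)
    # Never prefetch more than exists, and never more than the NPU asked for.
    return min(available, wanted_next)
```

For example, with 256 units produced, 128 consumed, and 64 wanted, the sketch prefetches 64 units; if only 32 units remain unconsumed, it prefetches those 32 units, which corresponds to the case where the data to be processed is only part of the preprocessed data.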
[0104] Step 104: The NPU identifies and processes the data to be processed based on the model data.
[0105] In the embodiments of this application, after the terminal device stores the data to be processed in the NPU in advance based on the first data status parameters and the second data status parameters recorded in the mailbox register, the NPU can further identify and process the data to be processed according to the model data.
[0106] Furthermore, in the embodiments of this application, when the terminal device performs recognition processing, the NPU can first generate a recognition model for recognition processing based on the pre-fetched model data, and then use the recognition model to perform recognition processing on the data to be processed, thereby obtaining the recognition result corresponding to the data to be recognized.
[0107] It should be noted that, in the embodiments of this application, the recognition result corresponds to the recognition instruction and the data to be recognized. For example, corresponding to the recognition type of the recognition instruction, when the recognition instruction is OCR and the data to be recognized is image data, the recognition result can be text information; when the recognition instruction is face recognition and the data to be recognized is image data, the recognition result can be person identification information; when the recognition instruction is speech recognition and the data to be recognized is speech data, the recognition result can be text information. This application does not impose specific limitations in this regard.
[0108] Furthermore, in the embodiments of this application, after the NPU performs identification processing on the data to be processed based on the model data, the terminal device may choose to update the second data status parameter recorded in the mailbox register.
[0109] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can record the data processing status corresponding to the NPU, that is, record the second data status parameter. Accordingly, the NPU can update the second data status parameter recorded in the mailbox register.
[0110] For example, in the embodiments of this application, the second data status parameter recorded in the mailbox register may be updated at a preset time interval while the NPU is performing recognition processing on the data to be processed; alternatively, it may be updated after the NPU has completed the recognition processing of the data to be processed.
[0111] In summary, the identification method proposed in steps 101 to 104 can improve identification efficiency and reduce power consumption by applying the Mailbox mechanism and the prefetching mechanism.
[0112] It is understood that the identification method proposed in this application embodiment, on the one hand, is based on a prefetching mechanism, which can prefetch model data, prefetch data to be identified, and prefetch preprocessed data during the identification process, thereby reducing data processing time and improving identification processing efficiency.
[0113] It is understood that the identification method proposed in this application, on the other hand, based on the Mailbox mechanism, can realize hardware interaction between the preprocessing of the first processor and the identification processing of the NPU, thereby reducing software interaction. Compared with the millisecond-level software interaction time, the nanosecond-level hardware interaction time can greatly shorten the identification processing time and improve the identification efficiency. The Mailbox mechanism can also improve the performance of the NPU to a certain extent and reduce the latency in the identification processing pipeline.
[0114] It is understood that, in another aspect, the identification method proposed in this application embodiment stores the data to be identified directly using SRAM instead of DDR during identification processing. Since DDR has a slower access speed compared to SRAM, using SRAM instead of DDR for data storage can effectively speed up data processing and reduce power consumption.
[0115] Furthermore, in the embodiments of this application, the recognition method proposed in the embodiments of this application can be used in scenarios of dynamic OCR or static OCR, as well as in scenarios of pose recognition, face recognition, voice recognition, fingerprint recognition, etc. This application does not make specific limitations in this regard.
[0116] Furthermore, in the embodiments of this application, the system architecture using NPU combined with SRAM proposed in the embodiments of this application can support scenarios for dynamic OCR or static OCR, as well as scenarios for pose recognition, face recognition, speech recognition, fingerprint recognition, etc. This application does not make specific limitations in this regard.
[0117] Furthermore, in the embodiments of this application, the Mailbox mechanism and SRAM storage mechanism proposed in the embodiments of this application can support data interaction between NPU and ISP, as well as data interaction between NPU and other processors such as Graphics Processing Unit (GPU). This application does not specifically limit this.
[0118] Furthermore, in the embodiments of this application, the Mailbox mechanism proposed in this application, combined with the hardware structure of NPU and SRAM, can be applied to any system with unbalanced bandwidth requirements, and this application does not impose any specific limitations on it.
[0119] It is understood that, in the embodiments of this application, the preprocessing operations related to the data to be identified can also be implemented using hardware modules. For example, for data to be processed including images, image preprocessing operations such as size normalization, zero-padding alignment, and pixel value normalization can be implemented using hardware modules, thereby further saving processing time and reducing power consumption.
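For reference, the three preprocessing operations named above can be sketched in software as follows. This is a plain Python illustration only, not the hardware module of this application; the function names, the nearest-neighbor resizing rule, the alignment multiple, and the 8-bit pixel range are all assumptions.

```python
def resize_nearest(image, out_height, out_width):
    """Size normalization by nearest-neighbor sampling (assumed method)."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


def pad_to_multiple(image, multiple):
    """Zero-padding alignment: pad so both dimensions divide by `multiple`."""
    height, width = len(image), len(image[0])
    padded_height = -(-height // multiple) * multiple  # ceiling division
    padded_width = -(-width // multiple) * multiple
    rows = [list(row) + [0] * (padded_width - width) for row in image]
    rows += [[0] * padded_width for _ in range(padded_height - height)]
    return rows


def normalize_pixels(image, max_value=255):
    """Pixel value normalization: map 0..max_value to 0.0..1.0."""
    return [[pixel / max_value for pixel in row] for row in image]
```

Applied in sequence to a 2 x 3 grayscale image, `resize_nearest(image, 2, 2)` yields a 2 x 2 image, `pad_to_multiple(..., 4)` extends it to 4 x 4 with zeros, and `normalize_pixels` scales every value into the range 0.0 to 1.0.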
[0120] It is understood that, in the embodiments of this application, the terminal device may choose to combine with other computing engines on the edge, such as GPU, to achieve multi-tasking and thus accelerate application processing such as OCR.
[0121] This application provides an identification method. A terminal device is configured with a mailbox register, SRAM, NPU, and a first processor. The mailbox register records a first data state corresponding to the first processor and a second data state corresponding to the NPU. In response to an identification command, model data corresponding to the data to be identified is pre-stored in the NPU, and the data to be identified is stored in the SRAM. The identification command instructs the data to be identified to undergo identification processing. The first processor pre-processes the data to be identified stored in the SRAM, and the pre-processed data is stored in the SRAM. Based on the first and second data state parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU. The data to be processed is part or all of the pre-processed data. The NPU performs identification processing on the data to be processed based on the model data. In other words, in this application's embodiment, based on a pre-fetching mechanism, the terminal device can pre-extract model data, data to be identified, and relevant data during the identification process, thereby shortening data processing time during the identification process. Simultaneously, based on the first and second data state parameters recorded in the mailbox register, the terminal device implements hardware interaction between the first processor and the NPU, thereby reducing software interaction. As can be seen, the identification method proposed in this application, by utilizing the prefetching mechanism and the Mailbox mechanism, can effectively reduce power consumption, shorten data processing time, improve identification processing efficiency, and thus greatly enhance the performance of identification processing.
[0122] Based on the above embodiments, another embodiment of this application provides an identification method, which is applied to a terminal device, wherein the terminal device may be configured with a mailbox register, SRAM, NPU, ISP (first processor), DDR, MCU, and MIPI (data interface).
[0123] It is understood that, in the implementation of this application, the terminal device can be any form such as an electronic device, a chip, an integrated circuit (IC), or an application-specific integrated circuit (ASIC).
[0124] It should be noted that, in the embodiments of this application, the mailbox register can be used to record the first data state corresponding to the ISP and the second data state corresponding to the NPU. That is, the mailbox register's mailbox mechanism can be used to record the data processing process and state corresponding to the NPU and ISP. In other words, the mailbox register can simultaneously record the data generation status of the ISP and the data processing status of the NPU, thereby achieving data decoupling between the ISP and the NPU.
[0125] It is understood that, in the embodiments of this application, the Mailbox mechanism enables hardware interaction between the ISP's preprocessing and the NPU's recognition processing, thereby reducing software interaction; at the same time, the Mailbox mechanism can also improve the performance of the NPU to a certain extent and reduce the latency in the recognition processing pipeline, thereby greatly shortening the recognition processing time and improving recognition efficiency.
[0126] It should be noted that, in the embodiments of this application, SRAM can be used to store the data to be identified and the data generated during the identification process. That is, SRAM can be used to store relevant data.
[0127] It is understood that, in the embodiments of this application, when performing identification processing, the data to be identified can be directly stored in SRAM instead of DDR, which can effectively speed up data processing and reduce power consumption.
[0128] Furthermore, in the embodiments of this application, Figure 9 is a third schematic diagram of the implementation process of the identification method. As shown in Figure 9, the method for the terminal device to perform identification processing may include the following steps:
[0129] Step 201: Receive an identification instruction; wherein, the identification instruction is used to instruct that identification processing be performed on the data to be identified.
[0130] In embodiments of this application, the terminal device can receive a recognition instruction that instructs that recognition processing be performed on the data to be recognized. The recognition instruction can indicate any type of recognition processing. For example, the recognition type of the instruction can include OCR, gesture recognition, face recognition, voice recognition, biometric recognition, etc., and this application does not specifically limit this.
[0131] Furthermore, in the embodiments of this application, the data to be identified corresponding to the recognition instruction can also be data of any format or form. For example, corresponding to the recognition type of the recognition instruction, when the recognition instruction is OCR, pose recognition, face recognition, etc., the data to be identified can be image data; when the recognition instruction is speech recognition, the data to be identified can be speech data. Of course, the data to be identified can also be other types of data, and this application does not specifically limit this.
[0132] Step 202: Store the model data corresponding to the data to be identified in the NPU in advance.
[0133] In the embodiments of this application, the terminal device may, in response to the received recognition instruction, store the model data corresponding to the data to be recognized in the NPU in advance.
[0134] Furthermore, in the embodiments of this application, based on the prefetch mechanism, after the terminal device receives the recognition instruction, it can wake up the NPU in advance and store the model data corresponding to the data to be recognized in the NPU in advance before the NPU performs recognition processing.
[0135] For example, in the embodiments of this application, the DDR can store the model data, and the model data can be pre-stored in the NPU's internal memory before the NPU performs the recognition processing, so that the model data does not need to be fetched during the subsequent recognition processing, thereby saving time and reducing power consumption.
[0136] In other words, in the embodiments of this application, after receiving the recognition instruction, the MCU can respond to the received recognition instruction, wake up the NPU in advance, and notify the NPU to read and store the model data in advance. That is, before the NPU starts the recognition process, the NPU prefetches the model data into the NPU's internal memory.
[0137] Step 203: If the recognition instruction is static OCR, read the data to be recognized from DDR in advance and store the data to be recognized in SRAM.
[0138] In the embodiments of this application, in response to a received recognition instruction, if the recognition instruction is a static OCR, the terminal device can pre-read the data to be recognized from DDR based on the prefetch mechanism, and then store the data to be recognized in SRAM.
[0139] In other words, in the embodiments of this application, for a recognition instruction characterizing static OCR, the data to be recognized is initially stored in DDR. After receiving the recognition instruction, the terminal device can pre-read the corresponding data to be recognized from DDR and then store the data to be recognized in SRAM. That is, the terminal device can prefetch the data to be recognized from DDR into SRAM, so that the data to be recognized does not need to be read from DDR during the subsequent recognition processing flow, thereby improving the processing speed.
[0140] Step 204: Turn off DDR.
[0141] In the embodiments of this application, after the terminal device stores the model data corresponding to the data to be identified in the NPU and stores the data to be identified in the SRAM in advance, it can further choose to turn off DDR so that DDR switches to a low-power state.
[0142] In other words, in the embodiments of this application, for static OCR, after the terminal device prefetches the model data corresponding to the data to be identified stored in DDR into the NPU internal memory based on the prefetch mechanism, and prefetches the data to be identified stored in DDR into SRAM, it can choose to put DDR in a low-power state, thereby further reducing power consumption.
[0143] Step 205: If the recognition instruction is dynamic OCR, store the data to be recognized obtained from MIPI in SRAM.
[0144] In the embodiments of this application, if the recognition instruction is dynamic OCR, the terminal device can directly store the data to be recognized into SRAM after obtaining the data to be recognized from MIPI.
[0145] In other words, in the embodiments of this application, for a recognition instruction that characterizes dynamic OCR, after receiving the recognition instruction, the terminal device can directly obtain the data to be recognized from the data interface and store it in SRAM, instead of using DDR to store the data to be recognized, thereby reducing power consumption.
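The data staging of steps 202 to 205 can be summarized with the following Python sketch, which only lists the actions in order and does not drive any hardware. The names `Mode` and `plan_data_staging` and the action strings are hypothetical. Placing the DDR low-power switch last in both branches is an assumption; the embodiments above state only that DDR can be turned off after the model data and the data to be identified have been stored.

```python
from enum import Enum


class Mode(Enum):
    STATIC_OCR = "static"
    DYNAMIC_OCR = "dynamic"


def plan_data_staging(mode):
    """Return the ordered staging actions taken in response to the instruction."""
    # Step 202: wake the NPU early and prefetch the model data.
    actions = [
        "wake NPU",
        "prefetch model data: DDR -> NPU internal memory",
    ]
    if mode is Mode.STATIC_OCR:
        # Step 203: the data to be recognized starts out in DDR.
        actions.append("prefetch data to be recognized: DDR -> SRAM")
    elif mode is Mode.DYNAMIC_OCR:
        # Step 205: the data arrives over the data interface and bypasses DDR.
        actions.append("store data to be recognized: MIPI -> SRAM")
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    # Step 204: DDR is no longer needed for this task.
    actions.append("switch DDR to low-power state")
    return actions
```

In both branches the data to be recognized ends up in SRAM; the branches differ only in whether it is read from DDR (static OCR) or taken directly from MIPI (dynamic OCR).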
[0146] Step 206: Preprocess the data to be identified stored in SRAM using ISP, and store the preprocessed data in SRAM.
[0147] In the embodiments of this application, the terminal device may further preprocess the data to be identified stored in the SRAM through the ISP, and store the preprocessed data in the SRAM.
[0148] It should be noted that, in the embodiments of this application, after the terminal device preprocesses the data to be identified stored in the SRAM through the ISP and obtains the preprocessed data, it can store the preprocessed data in the SRAM.
[0149] It is understood that, in the embodiments of this application, the preprocessed data generated after the ISP preprocesses the data to be identified will not be stored in DDR, but will be stored in SRAM, thereby reducing power consumption.
[0150] Step 207: Update the first data status parameter recorded in the mailbox register.
[0151] In the embodiments of this application, after the data to be identified stored in the SRAM is preprocessed by the ISP, the terminal device may choose to update the first data status parameter recorded in the mailbox register.
[0152] It should be noted that, in the embodiments of this application, the terminal device can record the first data status parameters corresponding to the ISP through the mailbox register. These first data status parameters can be used to determine the data generation status of the ISP. For example, the first data status parameters may include, but are not limited to, parameters such as the storage address and data size of the preprocessed data after preprocessing by the ISP.
[0153] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can record the status of the preprocessed data corresponding to the ISP, that is, record the first data status parameter. Accordingly, the ISP can update the first data status parameter recorded in the mailbox register.
[0154] Step 208: Based on the first and second data status parameters recorded in the mailbox register, store the data to be processed in the NPU in advance; wherein, the data to be processed is part or all of the data in the preprocessed data.
[0155] In the embodiments of this application, after the terminal device preprocesses the data to be identified stored in the SRAM through the ISP and stores the preprocessed data in the SRAM, it can further store the data to be processed in the NPU in advance based on the first data status parameter and the second data status parameter recorded in the mailbox register; wherein, the data to be processed is part or all of the data in the preprocessed data.
[0156] It should be noted that, in the embodiments of this application, the terminal device can determine the amount of data required by the NPU based on the first data status parameter corresponding to the ISP and the second data status parameter corresponding to the NPU recorded in the mailbox register, thereby determining the data to be prefetched and processed.
[0157] It is understood that, in the embodiments of this application, the data to be processed can be part or all of the preprocessed data. Specifically, after the ISP completes the preprocessing of the data to be identified, it stores the preprocessed data in SRAM. The NPU, after determining the required data to be processed based on the first and second data status parameters, can pre-extract the data to be processed from the SRAM and store it in the intermediate buffer inside the NPU.
[0158] Therefore, in the embodiments of this application, based on the prefetching mechanism, the terminal device can not only prefetch model data and data to be identified, but also prefetch preprocessed data during the NPU's identification processing, thereby reducing data processing time and improving identification processing efficiency.
[0159] It should be noted that, in the embodiments of this application, the second data status parameter can be used to determine the data processing status of the NPU. For example, the second data status parameter may include, but is not limited to: the amount of data already processed by the NPU, and the amount of data the NPU is expected to process.
[0160] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can simultaneously record the data generation status of the ISP and the data processing status of the NPU, that is, record the first data status parameter and the second data status parameter, thereby achieving data decoupling between the ISP and the NPU. The NPU can flexibly read data from SRAM using the first data status parameter and the second data status parameter, thus satisfying the uneven bandwidth requirements of its own data processing.
[0161] It is understood that, in the embodiments of this application, the terminal device determines the data to be processed based on the first data status parameter and the second data status parameter recorded in the mailbox register, thereby prefetching the data to be processed from the SRAM to the NPU, which can reduce the capacity requirement of the SRAM and also reduce the time delay of the NPU output data.
[0162] Furthermore, in the embodiments of this application, the data to be processed determined based on the first data state parameter and the second data state parameter conforms to both the data generation situation of the ISP and the data processing situation of the NPU, so that the interaction between the ISP and the NPU can be granular and efficient, without waiting for the ISP to process a certain amount of data before the NPU starts processing, thereby solving the problem of the NPU waiting for data.
[0163] It is understood that, in the embodiments of this application, the determination of the amount of data to be processed and the prefetching time can be made by referring to the first data status parameter and the second data status parameter, and can also be combined with the internal processing mechanism of the NPU, so as to meet the uneven bandwidth demand during the NPU data processing process.
[0164] Step 209: The NPU identifies and processes the data to be processed based on the model data.
[0165] In the embodiments of this application, after the terminal device stores the data to be processed in the NPU in advance based on the first data status parameters and the second data status parameters recorded in the mailbox register, the NPU can further identify and process the data to be processed according to the model data.
[0166] Furthermore, in the embodiments of this application, when the terminal device performs recognition processing, the NPU can first generate a recognition model for recognition processing based on the pre-fetched model data, and then use the recognition model to perform recognition processing on the data to be processed, thereby obtaining the recognition result corresponding to the data to be recognized.
[0167] Step 210: Update the second data status parameter recorded in the mailbox register.
[0168] In the embodiments of this application, after the NPU identifies and processes the data to be processed based on the model data, the terminal device may choose to update the second data status parameter recorded in the mailbox register.
[0169] In other words, in the embodiments of this application, based on the Mailbox mechanism, the mailbox register can record the data processing status corresponding to the NPU, that is, record the second data status parameter. Accordingly, the NPU can update the second data status parameter recorded in the mailbox register.
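The interplay of steps 206 to 210 can be illustrated with a small Python simulation. It is a software model under stated assumptions, not the hardware of this application: time is divided into abstract ticks, the ISP writes a fixed number of rows per tick, the NPU issues requests of uneven size, and a request is served as soon as enough rows are available (or the frame is complete). The function name `run_pipeline` and all parameters are hypothetical.

```python
def run_pipeline(total_rows, isp_rows_per_tick, npu_requests):
    """Simulate ISP production and NPU prefetch coordinated via two counters.

    Returns a list of (tick, rows_prefetched) events.
    """
    if total_rows <= 0 or isp_rows_per_tick <= 0:
        raise ValueError("sizes must be positive")
    if any(request <= 0 for request in npu_requests):
        raise ValueError("requests must be positive")

    produced = 0  # first data status parameter: rows the ISP wrote to SRAM
    consumed = 0  # second data status parameter: rows the NPU has taken
    pending = list(npu_requests)
    events = []
    tick = 0
    while consumed < total_rows:
        tick += 1
        # Steps 206-207: the ISP preprocesses and publishes its progress.
        produced = min(produced + isp_rows_per_tick, total_rows)
        # Step 208: decide what to prefetch from both status parameters.
        wanted = pending[0] if pending else total_rows - consumed
        available = produced - consumed
        if available >= wanted or produced == total_rows:
            take = min(wanted, available)
            if pending:
                pending.pop(0)
            # Steps 209-210: the NPU processes and publishes its progress.
            consumed += take
            events.append((tick, take))
    return events
```

For an 8-row frame produced at 2 rows per tick and NPU requests of 1, 4, and 3 rows, the simulation returns `[(1, 1), (3, 4), (4, 3)]`: the NPU receives its first data at tick 1, well before the ISP finishes the frame at tick 4.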
[0170] This application provides an identification method based on a prefetching mechanism. The terminal device can pre-extract model data, data to be identified, and relevant data during the identification process, thereby shortening data processing time. Simultaneously, based on the first and second data status parameters recorded in the mailbox register, the terminal device implements hardware interaction between the first processor and the NPU, reducing software interaction. Therefore, the identification method proposed in this application, utilizing the prefetching and mailbox mechanisms, can effectively reduce power consumption, shorten data processing time, and improve identification processing efficiency, thus significantly enhancing the performance of the identification process.
[0171] Based on the above embodiments, another embodiment of this application proposes an identification method that can be applied to terminal devices or identification systems. This identification method can save system power consumption and speed up data processing by introducing SRAM. It can also reduce software interaction time and improve NPU performance by introducing a Mailbox mechanism to synchronize data interaction between the ISP and NPU, thereby reducing pipeline latency. Furthermore, it can reduce data processing time by introducing a prefetching mechanism to prefetch network model data, prefetch image data (data to be identified) during static OCR, and prefetch data processed by the ISP (data to be processed) during NPU identification processing.
[0172] In the embodiments of this application, on the one hand, a Mailbox hardware design unit can be added in view of the characteristics of the NPU when processing OCR neural networks, so that the interaction between the ISP and the NPU can be carried out with small granularity and high efficiency, while smoothing the NPU bandwidth and greatly reducing the load on the MCU.
[0173] In the embodiments of this application, on the other hand, a prefetching mechanism is added to prefetch the model data used by the NPU into the NPU's internal memory and prefetch the image data (data to be recognized) from DDR into SRAM, thereby improving the processing speed.
[0174] In another aspect of the embodiments of this application, during network execution, dynamic OCR no longer uses DDR, and static OCR prefetches the image into SRAM in advance. At this time, DDR can be in a low-power state, so that dynamic OCR and static OCR use DDR as little as possible during execution, thereby minimizing power consumption.
[0175] For example, in the embodiments of this application, Figure 10 is a hardware diagram of the identification system. As shown in Figure 10, the identification system may include a Mailbox, SRAM, NPU, ISP, DDR, MCU, and MIPI.
[0176] When an AI task (recognition instruction) arrives, the AI model data (model data) stored in DDR will be prefetched into the NPU's internal storage medium.
[0177] In dynamic OCR, MIPI data (data to be recognized) can be stored in SRAM first. The ISP will directly read the data in SRAM and process it according to the speed of MIPI data transmission. The generated data (preprocessed data) will be stored in SRAM again.
[0178] As can be seen, during dynamic OCR, MIPI data goes directly into SRAM instead of DDR, saving power consumption.
[0179] In static OCR, the image data (data to be recognized) is pre-read from DDR to SRAM, and then DDR is set to a low-power state. During subsequent data processing, the interaction between the ISP and NPU also occurs through SRAM.
[0180] Mailbox records information about the data generated by the ISP, such as the storage address and size of the data. Mailbox also records information about how the NPU processes the data generated by the ISP, such as the amount of data consumed by the NPU and the amount of data required for the next execution.
[0181] In other words, Mailbox enables data decoupling between the ISP and NPU. The ISP can generate data at a fixed rate based on its own characteristics, while the NPU can flexibly read data from SRAM based on the uneven bandwidth requirements of its data processing. This eliminates the need for the ISP to finish processing one frame before the NPU restarts processing, reducing SRAM capacity requirements and minimizing the time latency of NPU output data. This allows subsequent NPU processing to execute faster, thereby reducing the overall pipeline latency.
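The decoupling described above can be illustrated with a minimal software model of the Mailbox. The field names, the byte counts, and the readiness rule below are assumptions made for illustration; the embodiments implement this bookkeeping in a hardware register, not in software.

```python
from dataclasses import dataclass


@dataclass
class Mailbox:
    """Toy model of the mailbox register (field names are assumptions)."""
    # Information about the data generated by the ISP.
    data_addr: int = 0      # SRAM address where the generated data starts
    data_size: int = 0      # bytes generated by the ISP so far
    # Information about how the NPU processes that data.
    consumed: int = 0       # bytes already consumed by the NPU
    next_required: int = 0  # bytes required for the next NPU execution

    def isp_produce(self, nbytes: int) -> None:
        """ISP reports that nbytes more data have been written to SRAM."""
        self.data_size += nbytes

    def npu_can_run(self) -> bool:
        """True when the unconsumed data covers the next NPU execution."""
        return self.data_size - self.consumed >= self.next_required

    def npu_consume(self, next_required: int) -> int:
        """NPU takes its required chunk and records the next requirement."""
        taken = self.next_required
        self.consumed += taken
        self.next_required = next_required
        return taken


mbox = Mailbox(data_addr=0x1000, next_required=3000)
steps_run = 0
for _ in range(4):            # ISP emits fixed-size slices at a steady rate
    mbox.isp_produce(1024)
    if mbox.npu_can_run():    # NPU starts before the whole frame is ready
        mbox.npu_consume(next_required=512)
        steps_run += 1
```

In this run the ISP produces at a constant 1024 bytes per step, while the NPU first waits for 3000 bytes and then needs only 512, which mirrors the uneven bandwidth requirement noted above.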
[0182] As can be seen, using the Mailbox mechanism enables hardware interaction between ISP preprocessing and NPU inference, rather than MCU synchronization, reducing system software interaction time. Compared to millisecond-level software interaction time, nanosecond-level hardware interaction time can significantly shorten recognition processing time and improve recognition efficiency.
[0183] It should be noted that, based on the prefetch mechanism, the MCU in the recognition system wakes up the NPU after an inference task arrives. Before the NPU begins processing data, the NPU model is prefetched into the NPU's internal memory, which saves time compared with reading the model data only after the data to be processed has arrived. Furthermore, during execution of the NPU inference task, the NPU prefetches data from SRAM into its internal intermediate buffer. The amount of data prefetched and the prefetch timing are determined by the Mailbox in combination with the NPU's internal processing mechanism. In other words, data prefetching is handled entirely by hardware, introduces no software overhead, and also solves the problem of the NPU waiting for data.
[0184] It is evident that using a prefetching mechanism (prefetching the network model data, the ISP-processed data, and, in static OCR, the image data) reduces data processing time.
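One possible way to derive the prefetch amount from the Mailbox contents is sketched below. The `min()` policy and the `buffer_free` parameter are assumptions; the embodiments state only that the amount and timing are determined by the Mailbox together with the NPU's internal processing mechanism.

```python
def prefetch_amount(produced: int, consumed: int,
                    next_required: int, buffer_free: int) -> int:
    """Bytes to copy from SRAM into the NPU's intermediate buffer.

    produced, consumed, and next_required mirror values recorded in the
    Mailbox; buffer_free is the free space in the NPU buffer (hypothetical).
    """
    available = produced - consumed
    if available < 0:
        raise ValueError("consumed cannot exceed produced")
    # Never fetch more than exists, than is needed, or than fits.
    return max(0, min(available, next_required, buffer_free))


enough_data = prefetch_amount(4096, 1024, 2048, 8192)   # limited by need
short_data = prefetch_amount(1500, 1024, 2048, 8192)    # limited by supply
small_buffer = prefetch_amount(4096, 1024, 2048, 1000)  # limited by space
```

Because every input is a counter that hardware can maintain, a rule of this kind needs no MCU involvement, which is consistent with the hardware-only prefetching described above.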
[0185] Furthermore, the recognition method proposed in the embodiments of this application can be used for dynamic OCR or static OCR, as well as for pose recognition, face recognition, voice recognition, fingerprint recognition, and similar scenarios. This application does not impose specific limitations in this regard.
[0186] Furthermore, the system architecture combining an NPU with SRAM proposed in the embodiments of this application can support dynamic OCR or static OCR, as well as pose recognition, face recognition, speech recognition, fingerprint recognition, and similar scenarios. This application does not impose specific limitations in this regard.
[0187] Furthermore, in the embodiments of this application, the Mailbox mechanism and SRAM storage mechanism proposed in the embodiments of this application can support data interaction between NPU and ISP, as well as data interaction between NPU and other processors such as Graphics Processing Unit (GPU). This application does not specifically limit this.
[0188] Furthermore, in the embodiments of this application, the Mailbox mechanism proposed in this application, combined with the hardware structure of NPU and SRAM, can be applied to any system with unbalanced bandwidth requirements, and this application does not impose any specific limitations on it.
[0189] It is understood that, in the embodiments of this application, the preprocessing operations related to the data to be identified can also be implemented using hardware modules. For example, for data to be processed including images, image preprocessing operations such as size normalization, zero-padding alignment, and pixel value normalization can be implemented using hardware modules, thereby further saving processing time and reducing power consumption.
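The three image preprocessing operations named above are sketched below in plain Python to show what each one computes. The function names, the nearest-neighbour resampling, the right-side padding, and the 8-pixel alignment are illustrative assumptions; the embodiments propose performing these operations in hardware modules.

```python
def resize_nearest(img, out_h, out_w):
    """Size normalization by nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def pad_to_multiple(img, align):
    """Zero-pad each row on the right so the width is a multiple of align."""
    pad = (-len(img[0])) % align
    return [row + [0] * pad for row in img]


def normalize(img, max_value=255):
    """Map integer pixel values into the range [0.0, 1.0]."""
    return [[p / max_value for p in row] for row in img]


image = [[0, 255, 0], [255, 0, 255]]    # 2x3 grayscale toy image
resized = resize_nearest(image, 4, 6)   # size normalization to 4x6
padded = pad_to_multiple(resized, 8)    # width aligned from 6 to 8
result = normalize(padded)              # pixel values scaled to [0, 1]
```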
[0190] It is understood that, in the embodiments of this application, the terminal device may choose to combine with other computing engines on the edge, such as GPU, to achieve multi-tasking and thus accelerate application processing such as OCR.
[0191] This application provides an identification method based on a prefetching mechanism. The terminal device can prefetch the model data, the data to be identified, and the relevant data used during the identification process, thereby shortening data processing time. Simultaneously, based on the first and second data status parameters recorded in the mailbox register, the terminal device implements hardware interaction between the first processor and the NPU, reducing software interaction. Therefore, the identification method proposed in this application, utilizing the prefetching and mailbox mechanisms, can effectively reduce power consumption, shorten data processing time, and improve identification processing efficiency, thus significantly enhancing the performance of the identification process.
[0192] Based on the above embodiments, in another embodiment of this application, Figure 11 is a schematic diagram of the structural composition of the recognition system. As shown in Figure 11, the recognition system 10 proposed in this application embodiment may include: a mailbox register 40, an SRAM 50, an NPU 60, and a first processor 70; wherein, the mailbox register 40 is used to record a first data state parameter corresponding to the first processor and a second data state parameter corresponding to the NPU; the recognition system 10 is configured to perform: in response to a recognition instruction, pre-store the model data corresponding to the data to be recognized to the NPU, and store the data to be recognized to the SRAM; wherein, the recognition instruction is used to instruct the recognition processing of the data to be recognized; pre-process the data to be recognized stored in the SRAM by the first processor, and store the pre-processed data in the SRAM; pre-store the data to be processed to the NPU based on the first data state parameters and the second data state parameters recorded in the mailbox register; wherein, the data to be processed is part or all of the data in the pre-processed data; and perform recognition processing of the data to be processed by the NPU according to the model data.
[0193] In the embodiments of this application, further, Figure 12 is a schematic diagram of the structure of the chip. As shown in Figure 12, the chip 20 proposed in this application embodiment may include: a mailbox register 40, an SRAM 50, an NPU 60, and a first processor 70; wherein, the mailbox register 40 is used to record a first data state parameter corresponding to the first processor and a second data state parameter corresponding to the NPU; the chip 20 is configured to perform: in response to an identification instruction, pre-store model data corresponding to the data to be identified in the NPU, and store the data to be identified in the SRAM; wherein, the identification instruction is used to instruct the identification processing of the data to be identified; pre-process the data to be identified stored in the SRAM by the first processor, and store the pre-processed data in the SRAM; pre-store the data to be processed in the NPU based on the first data state parameters and the second data state parameters recorded in the mailbox register; wherein, the data to be processed is part or all of the data in the pre-processed data; and perform identification processing of the data to be processed by the NPU according to the model data.
[0194] In the embodiments of this application, further, Figure 13 is a first schematic diagram of the composition structure of the terminal device. As shown in Figure 13, the terminal device 30 proposed in this application embodiment may include: a storage unit 31, a preprocessing unit 32, and an identification unit 33.
[0195] The storage unit 31 is used to, in response to a recognition instruction, store the model data corresponding to the data to be recognized in the NPU in advance, and store the data to be recognized in the SRAM; wherein, the recognition instruction is used to instruct the data to be recognized to be processed for recognition.
[0196] The preprocessing unit 32 is used to preprocess the data to be identified stored in the SRAM by the first processor;
[0197] The storage unit 31 is further configured to store the preprocessed data into the SRAM; and to store the data to be processed into the NPU in advance based on the first data status parameters and the second data status parameters recorded in the mailbox register; wherein the data to be processed is part or all of the data in the preprocessed data;
[0198] The identification unit 33 is used to identify and process the data to be processed by the NPU based on the model data.
[0199] In the embodiments of this application, further, Figure 14 is a second schematic diagram of the composition structure of the terminal device. As shown in Figure 14, the terminal device 30 proposed in this application embodiment may include: a mailbox register 40, an SRAM 50, an NPU 60, and a first processor 70; wherein, the mailbox register 40 is used to record a first data state parameter corresponding to the first processor and a second data state parameter corresponding to the NPU; the terminal device 30 is configured to perform: in response to an identification instruction, pre-store the model data corresponding to the data to be identified to the NPU, and store the data to be identified to the SRAM; wherein, the identification instruction is used to instruct the identification processing of the data to be identified; pre-process the data to be identified stored in the SRAM by the first processor, and store the pre-processed data in the SRAM; pre-store the data to be processed to the NPU based on the first data state parameters and the second data state parameters recorded in the mailbox register; wherein, the data to be processed is part or all of the data in the pre-processed data; and perform identification processing of the data to be processed by the NPU according to the model data.
[0200] This application provides a computer-readable storage medium having a program stored thereon, which, when executed by a processor, implements the identification method described above.
[0201] Specifically, the program instructions corresponding to an identification method in this embodiment can be stored on storage media such as optical discs, hard disks, and USB flash drives. When the program instructions corresponding to the identification method in the storage medium are read or executed by an electronic device, the following steps are performed:
[0202] In response to the recognition command, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein, the recognition command is used to instruct recognition processing of the data to be recognized.
[0203] The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM.
[0204] Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data;
[0205] The NPU performs identification and processing on the data to be processed based on the model data.
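The four steps above can be traced end to end with the toy pipeline below. Everything in it is a stand-in chosen for illustration: the dictionary used as a mailbox, the `chunk` parameter, the final flush of a partial batch, and the lambda functions that play the roles of the first processor and the NPU are assumptions, not part of the embodiments.

```python
def recognize(model, raw, preprocess, infer, chunk):
    """Run the four steps on toy data and return the list of NPU outputs."""
    npu_model = model     # step 1: model data pre-stored in the NPU
    sram_raw = list(raw)  # step 1: data to be recognized stored in SRAM
    sram_pre = []         # SRAM region holding preprocessed data
    mailbox = {"size": 0, "consumed": 0, "next": chunk}
    outputs = []
    for item in sram_raw:
        sram_pre.append(preprocess(item))  # step 2: first processor
        mailbox["size"] = len(sram_pre)    # first data status parameter
        while mailbox["size"] - mailbox["consumed"] >= mailbox["next"]:
            start = mailbox["consumed"]
            # Step 3: pre-store part of the preprocessed data in the NPU.
            batch = sram_pre[start:start + mailbox["next"]]
            # Step 4: recognition processing according to the model data.
            outputs.append(infer(npu_model, batch))
            mailbox["consumed"] += len(batch)  # second data status parameter
    if mailbox["consumed"] < mailbox["size"]:
        # Flush whatever remains as a final, smaller batch.
        batch = sram_pre[mailbox["consumed"]:]
        outputs.append(infer(npu_model, batch))
        mailbox["consumed"] = mailbox["size"]
    return outputs


outputs = recognize(
    model=2,                                # toy "model": a scale factor
    raw=[1, 2, 3, 4, 5],
    preprocess=lambda x: x + 1,             # stand-in for the first processor
    infer=lambda m, batch: m * sum(batch),  # stand-in for NPU inference
    chunk=2,
)
```

The NPU stand-in runs as soon as two preprocessed items are available, so recognition overlaps with preprocessing instead of waiting for all of the data.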
[0206] In the embodiments of this application, the processor can be at least one of the following: Application Specific Integrated Circuit (ASIC), Digital Signal Processor (DSP), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field Programmable Gate Array (FPGA), Central Processing Unit (CPU), Controller, Microcontroller, and Microprocessor. It is understood that for different devices, the electronic device used to implement the above-described processor function can also be other types, and the embodiments of this application do not specifically limit this.
[0207] Furthermore, in this embodiment, the functional modules can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional module.
[0208] If the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the method of this embodiment. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0209] This application provides an identification system, chip, terminal device, and storage medium. In response to an identification command, the system pre-stores the model data corresponding to the data to be identified in an NPU and the data to be identified in an SRAM. The identification command instructs the system to process the data to be identified. A first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM. Based on first and second data status parameters recorded in a mailbox register, the system pre-stores the data to be processed in the NPU. The data to be processed is part or all of the preprocessed data. The NPU performs identification processing on the data to be processed based on the model data. In other words, in this application's embodiments, based on a prefetching mechanism, the terminal device can prefetch the model data, the data to be identified, and the relevant data used during the identification process, thereby shortening data processing time during identification. Simultaneously, based on the first and second data status parameters recorded in the mailbox register, the terminal device implements hardware interaction between the first processor and the NPU, thereby reducing software interaction. Therefore, in this application's embodiments, the prefetching and mailbox mechanisms effectively reduce power consumption, shorten data processing time, and improve identification processing efficiency, thus significantly enhancing the performance of identification processing.
[0210] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of hardware embodiments, software embodiments, or embodiments combining software and hardware aspects. Furthermore, this application can take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage and optical storage) containing computer-usable program code.
[0211] This application is described with reference to schematic flow diagrams and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It should be understood that each flow and/or block of the schematic flow diagrams and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a device for implementing the functions specified in one or more flows of the schematic flow diagrams and/or one or more blocks of the block diagrams.
[0212] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the schematic flow diagrams and/or one or more blocks of the block diagrams.
[0213] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the schematic flow diagrams and/or one or more blocks of the block diagrams.
[0214] The above description is merely a preferred embodiment of this application and is not intended to limit the scope of protection of this application.
Claims
1. An identification method applied to a terminal device, characterized in that, The terminal device is configured with a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor. The mailbox register is used to record a first data status parameter corresponding to the first processor and a second data status parameter corresponding to the NPU. The method includes: In response to the recognition command, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein, the recognition command is used to instruct recognition processing of the data to be recognized. The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM. Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data; The NPU performs identification and processing on the data to be processed based on the model data.
2. The method of claim 1, wherein, The terminal device further includes: a double-data-rate memory (DDR), wherein the pre-storing of the model data corresponding to the data to be identified into the NPU includes: The model data is read from the DDR in advance and stored in the internal storage space of the NPU.
3. The method of claim 2, wherein, When the recognition instruction is a static optical character recognition (OCR) function, the method further includes: The data to be identified is read from the DDR in advance and stored in the SRAM.
4. The method of claim 2, wherein, When the recognition instruction is dynamic OCR, the method further includes: After obtaining the data to be identified from the data interface, the data to be identified is directly stored in the SRAM.
5. The method of claim 3, wherein, In response to the recognition command, after pre-storing the model data corresponding to the data to be recognized in the NPU and storing the data to be recognized in the SRAM, the method further includes: The DDR is turned off to switch the DDR to a low-power state.
6. The method according to claim 1, characterized in that, The first data status parameter includes the storage address and data size corresponding to the preprocessed data; The second data status parameter includes the amount of data already processed and the amount of data planned to be processed.
7. The method of claim 6, wherein, The method further includes: After the first processor preprocesses the data to be identified stored in the SRAM, the first data status parameter recorded in the mailbox register is updated.
8. The method of claim 6, wherein, The method further includes: After the NPU identifies and processes the data to be processed based on the model data, the second data status parameter recorded in the mailbox register is updated.
9. The method according to claim 4, characterized in that, The first processor includes an image signal processor (ISP); The data interface includes the Mobile Industry Processor Interface (MIPI).
10. An identification system characterized by The recognition system includes a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor. The mailbox register is used to record a first data state parameter corresponding to the first processor and a second data state parameter corresponding to the NPU. The recognition system is configured to execute: In response to the recognition command, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein, the recognition command is used to instruct recognition processing of the data to be recognized. The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM. Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data; The NPU performs identification and processing on the data to be processed based on the model data.
11. A chip, characterized by The chip includes a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor. The mailbox register is used to record a first data state parameter corresponding to the first processor and a second data state parameter corresponding to the NPU. The chip is configured to execute: In response to the recognition command, the model data corresponding to the data to be recognized is stored in the NPU in advance, and the data to be recognized is stored in the SRAM; wherein, the recognition command is used to instruct recognition processing of the data to be recognized. The first processor preprocesses the data to be identified stored in the SRAM and stores the preprocessed data in the SRAM. Based on the first data status parameters and the second data status parameters recorded in the mailbox register, the data to be processed is pre-stored in the NPU; wherein, the data to be processed is part or all of the data in the pre-processed data; The NPU performs identification and processing on the data to be processed based on the model data.
12. A terminal device, comprising: The terminal device includes a storage unit, a preprocessing unit, and an identification unit. The storage unit is configured to, in response to a recognition instruction, store the model data corresponding to the data to be recognized in the NPU in advance, and store the data to be recognized in the SRAM; wherein, the recognition instruction is configured to instruct the data to be recognized to be processed for recognition. The preprocessing unit is used to preprocess the data to be identified stored in the SRAM by the first processor; The storage unit is further configured to store preprocessed data into the SRAM; based on the first data status parameters corresponding to the first processor and the second data status parameters corresponding to the NPU recorded in the mailbox register, the data to be processed is pre-stored into the NPU; wherein, the data to be processed is part or all of the data in the preprocessed data; The identification unit is used to identify and process the data to be processed by the NPU based on the model data.
13. A terminal device, comprising: The terminal device includes a mailbox register, a static random access memory (SRAM), a neural network processing unit (NPU), and a first processor. The mailbox register is used to record a first data status parameter corresponding to the first processor and a second data status parameter corresponding to the NPU. The terminal device is used to implement the method as described in any one of claims 1-9.
14. A computer-readable storage medium having stored thereon a program, characterized in that, When the program is executed by the processor, it implements the method as described in any one of claims 1-9.