A cyclic redundancy check code optimization generation method, system, terminal and medium

By constructing a performance prediction model and optimizing hardware resource configuration, the calculation method of cyclic redundancy check (CRC) is dynamically adjusted, solving the computational performance problem of variable data block size and achieving more efficient check code calculation and hardware module design.

CN115686924B · Active · Publication Date: 2026-06-12 · BEIJING YIXIN YIYU MICROELECTRONICS TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING YIXIN YIYU MICROELECTRONICS TECH CO LTD
Filing Date
2022-10-10
Publication Date
2026-06-12

Abstract

The application discloses a cyclic redundancy check code optimization generation method and system, a terminal and a medium, and relates to the fields of data storage and data communication. The technical scheme is as follows: a first performance prediction model and a second performance prediction model are constructed; data block size information is input into the first performance prediction model and the second performance prediction model respectively, and serial delay parameters and parallel delay parameters are obtained respectively; the resource satisfaction degree of searching and assembling optimal hardware modules under different calculation modes from a hardware resource library is analyzed, with the constraint conditions of meeting delay performance indexes, power consumption indexes and data block size information; the priority values of cyclic redundancy check codes under different calculation modes are determined in combination with the resource satisfaction degree and the delay parameters; and the cyclic redundancy check codes are calculated in the corresponding optimal hardware modules in the calculation mode with the larger priority value. The application can improve the performance of a data processing system and the design efficiency of a cyclic redundancy check code hardware module.

Description

Technical Field

[0001] This invention relates to the fields of data storage and data communication, and more specifically, to a method, system, terminal, and medium for optimizing the generation of cyclic redundancy check codes.

Background Technology

[0002] In data storage and communication systems, Cyclic Redundancy Check (CRC) codes are widely used because they are simple to calculate and implement. They protect data integrity and detect errors introduced during transmission, thereby safeguarding the quality of data transmission and communication. However, as data volumes continue to grow, the computation delay of CRC codes grows with them, especially when traditional serial calculation methods are used. As shown in Figure 1, the traditional serial CRC calculation method computes the check code over fixed-size data blocks sequentially, in data-flow order. Parallel computation is not possible in this case: subsequent data can only be processed after the preceding data has been processed. This increases the computation latency of the check code until it fails to meet system requirements. To reduce CRC computation latency, different computing platforms, such as Intel and ARM, have introduced hardware instructions and methods for parallel CRC computation in their processor products. Taking the Intel platform as an example: first, the platform provides the hardware instructions needed for parallel CRC computation, such as CRC32 and pclmulqdq. Second, the data block is divided into sub-blocks, and the CRC check code of each sub-block is calculated independently and in parallel. Third, pclmulqdq is used to merge the independent results into the final result, thereby shortening the CRC computation latency.
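The split-and-merge idea can be illustrated in software. The sketch below is not the pclmulqdq-based merge described above: it swaps in the GF(2) matrix technique (the approach used by zlib's `crc32_combine`), written in plain Python. The helper names and the chunk count are illustrative assumptions, and the per-chunk CRCs are computed in a loop here although they are independent of one another and could run concurrently.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial used by zlib.crc32


def gf2_matrix_times(mat, vec):
    """Multiply a 32x32 GF(2) matrix (list of column ints) by a bit vector."""
    total, i = 0, 0
    while vec:
        if vec & 1:
            total ^= mat[i]
        vec >>= 1
        i += 1
    return total


def gf2_matrix_square(mat):
    """Return the matrix product mat * mat over GF(2)."""
    return [gf2_matrix_times(mat, col) for col in mat]


def crc32_combine(crc1, crc2, len2):
    """Return CRC-32 of A+B given crc1=CRC(A), crc2=CRC(B), len2=len(B)."""
    if len2 <= 0:
        return crc1
    odd = [POLY] + [1 << n for n in range(31)]  # operator for one zero bit
    even = gf2_matrix_square(odd)               # two zero bits
    odd = gf2_matrix_square(even)               # four zero bits
    while True:
        # Apply len2 zero bytes to crc1, one bit of len2 per squaring.
        even = gf2_matrix_square(odd)
        if len2 & 1:
            crc1 = gf2_matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = gf2_matrix_square(even)
        if len2 & 1:
            crc1 = gf2_matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break
    return crc1 ^ crc2


def chunked_crc32(data, chunks=4):
    """Split data, CRC each part independently, then merge the results."""
    size = max(1, -(-len(data) // chunks))  # ceiling division
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    crc = 0
    for part in parts:
        crc = crc32_combine(crc, zlib.crc32(part), len(part))
    return crc
```

The merge step costs a fixed amount of work regardless of how small the data block is, which is one way to see why splitting does not pay off for short inputs.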

[0003] However, the performance of serial and parallel computation of Cyclic Redundancy Check (CRC) codes varies significantly depending on the data block size. This is mainly due to the overhead of data partitioning and final check code merging. Taking the Intel platform as an example, when the data block is small, serial computation has a much lower latency than parallel computation. Conversely, when the data block is large, parallel computation significantly reduces latency. Existing CRC computation methods do not consider this situation, typically employing either serial or parallel computation schemes. A single computation method is suitable when the data block size is fixed. However, when the data block size is variable, a single computation method cannot further improve the computational performance of CRC.

[0004] Therefore, how to design an optimized generation method, system, terminal, and medium for cyclic redundancy check codes that overcomes the above-mentioned defects is a problem in urgent need of a solution.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, the present invention aims to provide an optimized method, system, terminal, and medium for generating cyclic redundancy check codes (CRCs). This method can meet the performance requirements for check code calculation in variable-length data processing systems, shorten the calculation time of check codes, improve the performance of data processing systems, enhance data reliability, and increase the design efficiency of CRC hardware modules.

[0006] The above-mentioned technical objective of the present invention is achieved through the following technical solution:

[0007] Firstly, a method for optimizing the generation of cyclic redundancy check codes is provided, including the following steps:

[0008] Based on the latency performance dataset obtained from the test, a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method are constructed respectively.

[0009] The data block size information is input into the first performance prediction model and the second performance prediction model respectively to obtain the serial delay parameter and the parallel delay parameter respectively;

[0010] With the latency performance indicators, power consumption indicators, and data block size information as constraints, the resource satisfaction obtained when searching the hardware resource library and assembling the optimal hardware module is analyzed for each calculation method.

[0011] The priority value of the Cyclic Redundancy Check (CRC) code in different calculation methods is determined by combining resource satisfaction and delay parameters;

[0012] The calculation method with the higher priority value is selected, and the cyclic redundancy check code is calculated in the corresponding optimal hardware module.

[0013] Furthermore, the process of obtaining the latency performance data is as follows:

[0014] Analyze the specified parallelization instructions provided by the CPU platform for calculating cyclic redundancy check codes to obtain the performance parameters of a single instruction;

[0015] Based on the sum of the performance parameters of all specified parallelization instructions corresponding to different computing methods, the latency performance data of different computing methods at the corresponding data block size is obtained by testing.

[0016] Adjust the data block size to obtain a latency performance dataset consisting of multiple latency performance data.

[0017] Furthermore, the construction process of the first performance prediction model and the second performance prediction model is as follows:

[0018] The latency performance data obtained under the same calculation method in the latency performance dataset is divided into a training set and a test set.

[0019] The first and second performance prediction models under the corresponding calculation methods are constructed by combining a convolutional neural network with a neural network gradient descent training algorithm.

[0020] Furthermore, the process of searching and assembling the optimal hardware module is as follows:

[0021] Input constraints;

[0022] The simulated annealing algorithm is used to automatically search for hardware modules that meet the constraints from the hardware resource library;

[0023] All the hardware modules searched are assembled into the optimal hardware module and packaged into a hardware IP core with an AXI interface.

[0024] Furthermore, the analysis process for resource satisfaction is as follows:

[0025] The completion rate of each indicator is obtained by comparing the actual value of each indicator in the optimal hardware module with the standard value of each indicator in the constraints.

[0026] The resource satisfaction level of the optimal hardware module is determined as the weighted value of the completion rates of all indicators.

[0027] Furthermore, the priority value is determined based on the mean or the weighted value of the resource satisfaction and the delay parameter.

[0028] Furthermore, the process for determining the priority value is as follows:

[0029] Select the more important of the resource satisfaction and the delay parameter as the basic item, and the remaining one as the correction item;

[0030] The basic item is modified according to the correction item to obtain the priority value of the corresponding calculation method.

[0031] Secondly, a cyclic redundancy check (CRC) code optimization generation system is provided, including:

[0032] The model building module is used to construct a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method based on the latency performance dataset obtained from the test.

[0033] The delay prediction module is used to input the data block size information into the first performance prediction model and the second performance prediction model respectively, and obtain the serial delay parameter and the parallel delay parameter respectively.

[0034] The resource analysis module is used to analyze the resource satisfaction of searching the hardware resource library to build the optimal hardware module under different computing methods, with the constraints of latency performance indicators, power consumption indicators and data block size information.

[0035] The comprehensive analysis module is used to determine the priority value of the cyclic redundancy check code under different calculation methods by combining resource satisfaction and delay parameters;

[0036] The optimization calculation module is used to select the calculation method with the higher priority value and calculate the cyclic redundancy check code in the corresponding optimal hardware module.

[0037] Thirdly, a computer terminal is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements a cyclic redundancy check code optimization generation method as described in any one of the first aspects.

[0038] Fourthly, a computer-readable medium is provided having a computer program stored thereon, the computer program being executed by a processor to implement a cyclic redundancy check code optimization generation method as described in any one of the first aspects.

[0039] Compared with the prior art, the present invention has the following beneficial effects:

[0040] 1. The present invention provides an optimized generation method for Cyclic Redundancy Check (CRC) codes. By comparing the computational latency of different computation methods, the optimal computation method is selected, which can reduce the computation time of CRC codes and improve the performance of the data processing system. At the same time, considering that the resources and power consumption occupied by serial and parallel computation methods on dedicated hardware circuits are significantly different, and that as the size of the data block to be processed changes, a single hardware circuit implementation method cannot fully meet the performance requirements and it is difficult to achieve a balance between performance, resources, and power consumption, the computational latency and resource situation are comprehensively considered when selecting the final computation method, which can improve the design efficiency of the CRC hardware module.

[0041] 2. In determining the priority values of different calculation methods, this invention designates the resource satisfaction and the delay parameter as a basic item and a correction item according to their importance. The correction item is then used to modify the basic item, making the calculated priority values more accurate and reliable.

Attached Figure Description

[0042] The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and form part of this application, do not constitute a limitation thereof. In the drawings:

[0043] Figure 1 is a flowchart of an embodiment of the present invention;

[0044] Figure 2 is a system block diagram of an embodiment of the present invention.

Detailed Implementation

[0045] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the embodiments and accompanying drawings. The illustrative embodiments and descriptions of the present invention are only used to explain the present invention and are not intended to limit the present invention.

[0046] Example 1: An optimized method for generating cyclic redundancy check codes which, as shown in Figure 1, includes the following steps:

[0047] S1: Based on the latency performance dataset obtained from the test, construct a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method.

[0048] S2: Input the data block size information into the first performance prediction model and the second performance prediction model respectively to obtain the serial delay parameter and the parallel delay parameter respectively;

[0049] S3: Using latency performance indicators, power consumption indicators, and data block size information as constraints, analyze the resource satisfaction of searching the hardware resource library to build the optimal hardware module under different computing methods;

[0050] S4: Determine the priority value of the Cyclic Redundancy Check code in different calculation methods by combining resource satisfaction and delay parameters;

[0051] S5: Select the calculation method with the larger priority value to calculate the cyclic redundancy check code in the corresponding optimal hardware module.
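Steps S2, S4 and S5 amount to a small selection routine once the two prediction models and the per-mode resource satisfaction are available. The following is a minimal sketch under stated assumptions: the function and variable names are hypothetical, the two lambda "models" and their coefficients are made-up placeholders rather than trained predictors, and the ratio-based priority is only one possible way of combining the two quantities.

```python
from typing import Callable, Dict, Tuple


def select_crc_mode(
    block_size: int,
    predict_delay: Dict[str, Callable[[int], float]],
    resource_satisfaction: Dict[str, float],
    priority: Callable[[float, float], float],
) -> Tuple[str, Dict[str, float]]:
    """Predict the delay per mode, score each mode, return the best one."""
    scores = {}
    for mode, predictor in predict_delay.items():
        delay = predictor(block_size)                                 # S2
        scores[mode] = priority(resource_satisfaction[mode], delay)   # S4
    best = max(scores, key=scores.get)                                # S5
    return best, scores


# Hypothetical stand-ins for the two trained models (arbitrary units):
# the parallel mode pays a fixed split/merge overhead but scales better.
PREDICTORS = {
    "serial": lambda n: 0.01 * n,
    "parallel": lambda n: 2.0 + 0.002 * n,
}
SATISFACTION = {"serial": 1.0, "parallel": 1.0}  # assumed equal here


def ratio_priority(satisfaction: float, delay: float) -> float:
    """Assumed priority: higher satisfaction and lower delay score higher."""
    return satisfaction / delay
```

With these placeholder numbers, a 64-byte block selects the serial mode and an 8192-byte block selects the parallel mode, mirroring the behaviour described in the background section.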

[0052] The process of obtaining latency performance data is as follows: analyze the specified parallelization instructions provided by the CPU platform for calculating the cyclic redundancy check code to obtain the performance parameters of a single instruction; based on the sum of the performance parameters of all specified parallelization instructions corresponding to different calculation methods, test the latency performance data of different calculation methods at the corresponding data block size; adjust the data block size to obtain a latency performance dataset composed of multiple latency performance data.

[0053] For CPU platforms such as Intel and Arm, latency testing is conducted based on different hardware instructions provided by different platforms. For example, on x86, the crc32 and pclmulqdq instructions are used, while on ARM, the crc32cd, crc32cw, crc32ch, and crc32cb instructions are used.

[0054] Serial cyclic redundancy check (CRC) refers to a method where data is not divided into independent sub-blocks, but rather the entire data block is sequentially calculated in fixed-size blocks. Parallel CRC, on the other hand, involves dividing the data into blocks of the required size according to hardware instructions. These blocks are then processed in parallel using the CPU's internal hardware. The results of these parallel calculations are then integrated to accelerate CRC calculation. Therefore, serial and parallel calculation methods share some of the same hardware instructions.

[0055] The data block size ranges from 1 byte to 16384 bytes. The sizes actually sampled are 16 bytes, 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1024 bytes, 2048 bytes, 3072 bytes, 4096 bytes, and 8192 bytes.
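Collecting the latency performance dataset can be sketched as a timing loop over the sampled block sizes. This is an illustration only: it measures wall-clock time of Python callables rather than the per-instruction performance parameters the method analyzes, `zlib.crc32` is used as a stand-in for a serial routine, and the function names and repeat count are assumptions.

```python
import os
import time
import zlib

# Block sizes sampled in the embodiment, in bytes.
BLOCK_SIZES = [16, 32, 64, 128, 256, 512, 1024, 2048, 3072, 4096, 8192]


def measure_latency(func, data, repeats=200):
    """Return the best-of-N wall-clock time in nanoseconds for func(data)."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func(data)
        elapsed = time.perf_counter_ns() - start
        best = elapsed if best is None else min(best, elapsed)
    return best


def build_latency_dataset(modes, sizes=BLOCK_SIZES):
    """Map each mode name to a list of (block_size, latency_ns) samples."""
    dataset = {name: [] for name in modes}
    for size in sizes:
        data = os.urandom(size)  # random payload of the requested size
        for name, func in modes.items():
            dataset[name].append((size, measure_latency(func, data)))
    return dataset


if __name__ == "__main__":
    # A parallel routine would be registered under its own key in the same way.
    for size, latency in build_latency_dataset({"serial": zlib.crc32})["serial"]:
        print(size, latency)
```

Taking the minimum over repeats is a common way to suppress scheduling noise; a real measurement on the target platform would more likely use cycle counters around the hardware instructions themselves.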

[0056] The construction process of the first performance prediction model and the second performance prediction model is as follows: the latency performance data under the same calculation method in the latency performance dataset is divided into training set and test set; the first performance prediction model and the second performance prediction model under the corresponding calculation method are constructed by combining convolutional neural network and using neural network gradient descent training algorithm.
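The train/test split and gradient descent training can be shown on a deliberately simplified model. The sketch below substitutes a two-parameter linear model for the convolutional neural network named above, purely to keep the example short and dependency-free; the sample latencies are synthetic placeholders generated from a made-up formula, not measurements, and all names are hypothetical.

```python
def split_dataset(samples, test_every=4):
    """Hold out every `test_every`-th sample as the test set."""
    train = [s for i, s in enumerate(samples) if (i + 1) % test_every != 0]
    test = [s for i, s in enumerate(samples) if (i + 1) % test_every == 0]
    return train, test


def fit_latency_model(train, scale, lr=0.5, steps=5000):
    """Fit latency ~ a + b * (size / scale) by batch gradient descent on MSE."""
    a, b = 0.0, 0.0
    n = len(train)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for size, latency in train:
            x = size / scale               # normalise sizes to (0, 1]
            err = (a + b * x) - latency
            grad_a += 2.0 * err / n
            grad_b += 2.0 * err * x / n
        a -= lr * grad_a
        b -= lr * grad_b
    return lambda size: a + b * (size / scale)


# Synthetic placeholder data: latency = 50 + 0.1 * size (arbitrary units).
SIZES = [16, 32, 64, 128, 256, 512, 1024, 2048, 3072, 4096, 8192]
SAMPLES = [(s, 50.0 + 0.1 * s) for s in SIZES]

TRAIN, TEST = split_dataset(SAMPLES)
MODEL = fit_latency_model(TRAIN, scale=max(SIZES))
```

One such model would be fitted per calculation method, using only the samples measured under that method; the held-out samples in `TEST` are then used to check the prediction error.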

[0057] The process of searching and assembling the optimal hardware module is as follows: input constraints; use the simulated annealing algorithm to automatically search for hardware modules that meet the constraints from the hardware resource library; assemble all the searched hardware modules into the optimal hardware module and package it into a hardware IP core with an AXI interface.
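A simulated annealing search over a hardware resource library might look like the sketch below. The library contents, the slot names, the indicator values, the cost function (total area plus a penalty for violated latency and power limits) and the cooling schedule are all hypothetical choices made for illustration; packaging the result as an IP core with an AXI interface is not modelled.

```python
import math
import random


def cost(selection, library, limits):
    """Total area plus a large penalty for each violated constraint."""
    chosen = [library[slot][i] for slot, i in selection.items()]
    latency = sum(m["latency"] for m in chosen)
    power = sum(m["power"] for m in chosen)
    area = sum(m["area"] for m in chosen)
    penalty = 1000.0 * (max(0.0, latency - limits["latency"])
                        + max(0.0, power - limits["power"]))
    return area + penalty


def anneal(library, limits, steps=2000, t0=10.0, cooling=0.995, seed=0):
    """Pick one module per slot, minimising cost() by simulated annealing."""
    rng = random.Random(seed)
    current = {slot: rng.randrange(len(opts)) for slot, opts in library.items()}
    current_cost = cost(current, library, limits)
    best, best_cost = dict(current), current_cost
    temperature = t0
    for _ in range(steps):
        # Neighbour move: re-pick the module for one randomly chosen slot.
        candidate = dict(current)
        slot = rng.choice(list(library))
        candidate[slot] = rng.randrange(len(library[slot]))
        candidate_cost = cost(candidate, library, limits)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost


# Hypothetical resource library: two slots, three candidate modules each.
LIBRARY = {
    "crc_core": [
        {"latency": 8.0, "power": 1.0, "area": 10.0},
        {"latency": 4.0, "power": 2.0, "area": 18.0},
        {"latency": 2.0, "power": 3.5, "area": 30.0},
    ],
    "buffer": [
        {"latency": 3.0, "power": 0.5, "area": 4.0},
        {"latency": 2.0, "power": 0.8, "area": 6.0},
        {"latency": 1.0, "power": 1.5, "area": 9.0},
    ],
}
LIMITS = {"latency": 7.0, "power": 3.0}
```

Calling `anneal(LIBRARY, LIMITS)` returns the best selection seen during the run together with its cost; because the best-so-far state is tracked separately, the result is never worse than the random starting point.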

[0058] The resource satisfaction analysis process is as follows: the completion degree of each indicator is obtained by comparing the actual value of each indicator in the optimal hardware module with the standard value of each indicator in the constraints; the weighted value of the completion degrees of all indicators is used as the resource satisfaction degree of the optimal hardware module.
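One way to read this computation is as a weighted sum of capped ratios. The sketch below assumes that every indicator is "lower is better" (as latency and power are), that a completion degree is the ratio of the standard value to the actual value capped at 1.0, and that the weights sum to 1; none of these details are specified in the text, and the function names and numbers are hypothetical.

```python
def completion_rate(actual, standard, lower_is_better=True):
    """Ratio of target to achieved value, capped at 1.0 (fully met)."""
    ratio = standard / actual if lower_is_better else actual / standard
    return min(1.0, ratio)


def resource_satisfaction(actuals, standards, weights):
    """Weighted sum of per-indicator completion rates (weights sum to 1)."""
    return sum(
        weights[name] * completion_rate(actuals[name], standards[name])
        for name in standards
    )


# Hypothetical example: latency target met, power target missed.
ACTUALS = {"latency": 80.0, "power": 2.0}
STANDARDS = {"latency": 100.0, "power": 1.5}
WEIGHTS = {"latency": 0.5, "power": 0.5}
SATISFACTION = resource_satisfaction(ACTUALS, STANDARDS, WEIGHTS)  # 0.875
```

Here the latency indicator is fully met (degree 1.0) and the power indicator reaches 1.5 / 2.0 = 0.75, giving 0.5 × 1.0 + 0.5 × 0.75 = 0.875.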

[0059] As an optional implementation, the priority value is determined based on the mean or the weighted value of the resource satisfaction and the delay parameter.
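Because a lower delay should yield a higher priority while a higher satisfaction should as well, the two quantities need to be put on a common scale before averaging. The sketch below assumes one simple normalisation (the delay of the fastest mode divided by the delay of the mode being scored); the function names, this normalisation and the default weight are assumptions rather than details given in the text.

```python
def delay_score(delay, reference_delay):
    """Map a delay to (0, 1]: the fastest mode scores 1.0."""
    return reference_delay / delay


def priority_weighted(satisfaction, delay, reference_delay, w_delay=0.5):
    """Weighted combination; w_delay=0.5 reduces to the plain mean."""
    score = delay_score(delay, reference_delay)
    return w_delay * score + (1.0 - w_delay) * satisfaction
```

For example, a mode with satisfaction 0.8 whose delay is twice that of the fastest mode gets a delay score of 0.5 and, with equal weights, a priority of 0.65.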

[0060] As another optional implementation method, the priority value determination process is as follows: select the more important one between resource satisfaction and delay parameters as the basic item, and the remaining one as the correction item; modify the basic item according to the correction item to obtain the priority value of the corresponding calculation method.
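The text does not specify how the correction item modifies the basic item. A minimal sketch, assuming both items are normalised to the range 0 to 1 and that the correction scales the basic item by a bounded factor, could look as follows; the function name, the `strength` parameter and the centring at 0.5 are all hypothetical.

```python
def priority_corrected(base, correction, strength=0.2):
    """Scale the basic item by a bounded factor derived from the correction item.

    A correction of 0.5 leaves the basic item unchanged; values above or
    below 0.5 raise or lower it by at most strength / 2.
    """
    factor = 1.0 + strength * (correction - 0.5)
    return base * factor
```

With the default strength, the basic item dominates the result and the correction item can shift it by at most 10 percent in either direction, which keeps the ranking driven mainly by the parameter deemed more important.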

[0061] Example 2: A cyclic redundancy check (CRC) code optimization generation system that implements the CRC code optimization generation method described in Example 1. As shown in Figure 2, it includes a model building module, a delay prediction module, a resource analysis module, a comprehensive analysis module, and an optimization calculation module.

[0062] The system comprises the following modules: a model building module, used to construct a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method based on the latency performance dataset obtained from testing; a latency prediction module, used to input data block size information into the first and second performance prediction models respectively to obtain serial latency parameters and parallel latency parameters; a resource analysis module, used to analyze the resource satisfaction of searching the hardware resource library to build the optimal hardware module under different calculation methods, using latency performance indicators, power consumption indicators, and data block size information as constraints; a comprehensive analysis module, used to determine the priority value of the CRC under different calculation methods by combining resource satisfaction and latency parameters; and an optimization calculation module, used to select the calculation method with the larger priority value to calculate the CRC in the corresponding optimal hardware module.

[0063] Working Principle: This invention selects the optimal computation method by comparing the computation latency of different methods, thereby reducing the computation time of Cyclic Redundancy Check (CRC) codes and improving the performance of the data processing system. Simultaneously, considering the significant differences in resource and power consumption required for serial and parallel computation methods on dedicated hardware circuits, and the fact that a single hardware circuit implementation cannot fully meet performance demands as the size of the required data blocks changes, making it difficult to achieve a balance between performance, resources, and power consumption, this invention comprehensively considers computation latency and resource availability when selecting the final computation method, thus improving the design efficiency of the CRC hardware module.

[0064] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0065] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0066] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0067] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0068] The above specific embodiments further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for optimizing the generation of cyclic redundancy check codes, characterized in that it includes the following steps:
based on the latency performance dataset obtained from the test, a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method are constructed respectively;
the data block size information is input into the first performance prediction model and the second performance prediction model respectively to obtain the serial delay parameter and the parallel delay parameter respectively;
with the latency performance indicators, power consumption indicators, and data block size information as constraints, the resource satisfaction of searching the hardware resource library to build the optimal hardware module under different calculation methods is analyzed;
the priority value of the cyclic redundancy check code under different calculation methods is determined by combining the resource satisfaction and the delay parameters;
the calculation method with the larger priority value is selected to calculate the cyclic redundancy check code in the corresponding optimal hardware module;
the construction process of the first performance prediction model and the second performance prediction model is as follows: the latency performance data obtained under the same calculation method in the latency performance dataset is divided into a training set and a test set; the first performance prediction model and the second performance prediction model under the corresponding calculation method are constructed by combining a convolutional neural network with a neural network gradient descent training algorithm;
the specific process of searching and assembling the optimal hardware module is as follows: input the constraints; use the simulated annealing algorithm to automatically search the hardware resource library for hardware modules that meet the constraints; assemble all the hardware modules found into the optimal hardware module and package it into a hardware IP core with an AXI interface.

2. The method for optimizing and generating cyclic redundancy check codes according to claim 1, characterized in that the process of obtaining the latency performance data is as follows: analyze the specified parallelization instructions provided by the CPU platform for calculating cyclic redundancy check codes to obtain the performance parameters of a single instruction; based on the sum of the performance parameters of all specified parallelization instructions corresponding to different calculation methods, obtain by testing the latency performance data of the different calculation methods at the corresponding data block size; adjust the data block size to obtain a latency performance dataset consisting of multiple latency performance data.

3. The method for optimizing and generating cyclic redundancy check codes according to claim 1, characterized in that the analysis process for resource satisfaction is as follows: the completion rate of each indicator is obtained by comparing the actual value of each indicator in the optimal hardware module with the standard value of each indicator in the constraints; the resource satisfaction level of the optimal hardware module is determined as the weighted value of the completion rates of all indicators.

4. The method for optimizing and generating cyclic redundancy check codes according to claim 1, characterized in that the priority value is determined based on the mean or the weighted value of the resource satisfaction and the delay parameter.

5. The method for optimizing and generating cyclic redundancy check codes according to claim 1, characterized in that the process for determining the priority value is as follows: select the more important of the resource satisfaction and the delay parameter as the basic item, and the remaining one as the correction item; modify the basic item according to the correction item to obtain the priority value of the corresponding calculation method.

6. A cyclic redundancy check (CRC) code optimization generation system, characterized in that it includes:
a model building module, used to construct a first performance prediction model using a serial cyclic redundancy check (CRC) calculation method and a second performance prediction model using a parallel CRC calculation method based on the latency performance dataset obtained from the test;
a delay prediction module, used to input the data block size information into the first performance prediction model and the second performance prediction model respectively, and obtain the serial delay parameter and the parallel delay parameter respectively;
a resource analysis module, used to analyze the resource satisfaction of searching the hardware resource library to build the optimal hardware module under different calculation methods, with the latency performance indicators, power consumption indicators and data block size information as constraints;
a comprehensive analysis module, used to determine the priority value of the cyclic redundancy check code under different calculation methods by combining the resource satisfaction and the delay parameters;
an optimization calculation module, used to select the calculation method with the higher priority value to calculate the cyclic redundancy check code in the corresponding optimal hardware module;
the construction process of the first performance prediction model and the second performance prediction model is as follows: the latency performance data obtained under the same calculation method in the latency performance dataset is divided into a training set and a test set; the first performance prediction model and the second performance prediction model under the corresponding calculation method are constructed by combining a convolutional neural network with a neural network gradient descent training algorithm;
the specific process of searching and assembling the optimal hardware module is as follows: input the constraints; use the simulated annealing algorithm to automatically search the hardware resource library for hardware modules that meet the constraints; assemble all the hardware modules found into the optimal hardware module and package it into a hardware IP core with an AXI interface.

7. A computer terminal comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements a cyclic redundancy check code optimization generation method as described in any one of claims 1-5.

8. A computer-readable medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements a cyclic redundancy check code optimization generation method as described in any one of claims 1-5.