String length function segmentation optimization method and storage medium thereof
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WUXI ADVANCED TECH RES INST
- Filing Date
- 2022-11-23
- Publication Date
- 2026-06-23
Abstract
Description
Technical Field
[0001] This invention relates to a segmented optimization method for a string length function and a storage medium thereof, and belongs to the field of basic library optimization in compilers.
Background Technology
[0002] Currently, performance optimization of library functions is a crucial aspect of compiler optimization, and the string length function is among the most fundamental of these functions. It calculates the length of a string by scanning from the beginning until the first null terminator '\0' is encountered and then returning the length. Because the function is so widely used, its performance has received considerable attention, and major processor architectures such as x86 and ARM provide optimized versions of it. One common implementation uses byte comparison instructions, finding the terminator by comparing each byte with 0. However, this is a scalar implementation and does not fully exploit the architectural features that the processor supports.
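For reference, the scalar byte-comparison approach described above can be sketched in C as follows. This is a minimal illustration, not any particular library's implementation; the function name `scalar_strlen` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline: scan one byte at a time until '\0' is found. */
size_t scalar_strlen(const char *s)
{
    assert(s != NULL);            /* sketch-level precondition */
    const char *p = s;
    while (*p != '\0') {          /* one comparison and one branch per byte */
        ++p;
    }
    return (size_t)(p - s);       /* terminator address minus start address */
}
```

Every byte costs one comparison and one branch, which is the per-byte overhead that the vector method aims to reduce.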
[0003] Existing string length functions on processors suffer from problems such as limited optimization methods and low computational efficiency:
[0004] 1) In the loop segment, the existing technology processes strings of every length in small fixed-size units, comparing 8 bytes per iteration, so loop control and branch decisions account for a large proportion of the overhead.
[0005] 2) In tail processing, the existing technology locates the terminator byte by byte, which is inefficient.
Summary of the Invention
[0006] The technical problem to be solved by the present invention is to overcome the defects of the prior art and provide a string length function segmentation optimization method and its storage medium.
[0007] To achieve the above objectives, this invention provides a method for optimizing string length function segmentation, comprising:
[0008] Step S10: Perform a non-boundary-aligned load of the string, based on the starting address of the string;
[0009] Step S11: Remove the interference of irrelevant data located before the starting address of the string;
[0010] Step S12: Boundary-align the starting address of the string;
[0011] Step S13: Search the boundary-aligned string for the string terminator. If the terminator is not found, search for it in a loop. If it is found, perform tail processing.
[0012] Calculate the length of the string.
[0013] Preferably, in step S10, the non-boundary-aligned load of the string based on its starting address is achieved through the following steps:
[0014] Obtain the address of the first character of the string;
[0015] Take the span from the first address to the first 32-byte-aligned boundary address as the head of the string, and load the head of the string into a register using a non-boundary-aligned vector load.
[0016] Preferably, in step S11, removing the interference of irrelevant data before the starting address of the string is achieved through the following steps:
[0017] Use OR and NOT logical operations to set all irrelevant data located before the first address of the string to 0.
[0018] Preferably, in step S12, boundary-aligning the starting address of the string is achieved through the following steps:
[0019] Set the lower 5 bits of the string's starting address to 0.
[0020] Preferably, in step S13, searching the boundary-aligned string for the string terminator, searching in a loop if no terminator is found, and performing tail processing if a terminator is found, is achieved through the following steps:
[0021] Step 31: XOR the string byte by byte with 0x7f to obtain the first string result;
[0022] Perform two overflow subtraction operations on the first string result, subtracting 0x7f each time, to obtain the second string;
[0023] In the second string, every non-zero byte is thereby replaced with 0x80 and the terminator 0 is replaced with 0x81;
[0024] Step 32: Apply the 1-count instruction, which counts the bits that are 1, to the second string;
[0025] If the second string derives entirely from non-zero bytes and contains no terminator 0, the result of the 1-count instruction on the second string is 32;
[0026] If the second string contains a terminator 0, the result of the 1-count instruction on the second string is greater than 32;
[0027] Step 33: Compare the result of the 1-count instruction with 32. If the result for the second string is equal to 32, search for the string terminator in a loop. If the result for the second string is not equal to 32, perform tail processing.
[0028] Preferably, searching for the string terminator in a loop is achieved through the following steps:
[0029] Step 41: Add 32 bytes to the boundary-aligned starting address of the string from step S12 to obtain the third string;
[0030] Step 42: XOR the third string byte by byte with 0x7f to obtain the fourth string;
[0031] Perform two overflow subtraction operations on the fourth string, subtracting 0x7f each time, to obtain the fifth string;
[0032] In the fifth string, every non-zero byte is thereby replaced with 0x80 and the terminator 0 is replaced with 0x81;
[0033] Step 43: Apply the 1-count instruction to the fifth string to obtain the 1-count result;
[0034] Step 44: If the 1-count result is equal to 0, it is determined that no string terminator has been found, and the method returns to step 41. If the 1-count result is 1, it is determined that the string terminator has been found, and tail processing is performed.
[0035] Preferably, tail processing is performed through the following steps:
[0036] Step 51: If the second string contains a terminator 0, find the 32 bytes that contain the terminator;
[0037] Step 52: Take the 32 bytes containing the string terminator as the end of the string;
[0038] Step 53: Locate the exact address of the terminator, which is achieved through the following steps:
[0039] Using the vector trailing-zero-count instruction, examine the end of the string starting from its lowest address and count zeros until the first 1 appears, thereby obtaining the number of consecutive zeros that precede the first 1.
[0040] Add the number of consecutive zeros to the lowest address of the 32 bytes containing the string terminator to obtain the address of the string terminator.
[0041] Preferably, the string length is calculated through the following steps:
[0042] The length of the string is obtained by subtracting the address of its first character from the address of its terminator.
[0043] An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of any of the methods described above.
[0044] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of any of the methods described above.
[0045] The beneficial effects achieved by this invention are as follows:
[0046] 1) This invention uses vector overflow addition and subtraction instructions to accelerate the search for the terminator: two overflow subtraction operations, each subtracting 0x7f, are applied to the string content, after which non-zero bytes have been replaced with 0x80 and all-zero bytes with 0x81. The 1-count instruction is then used directly to determine whether a terminator is present, which speeds up the search.
[0047] 2) This invention uses vector instructions for optimization: existing technology implements the function with scalar instructions, which can process only 8 bytes per iteration when searching for the terminator in the loop segment. With vector instructions, 32 bytes are processed per iteration. When the input string is long, the vector implementation can significantly improve performance.
[0048] 3) This invention uses the vector trailing-zero-count instruction to accelerate tail processing: existing technologies process the end of a string by byte-by-byte comparison. Because the tail of a scalar implementation is only 8 bytes, the performance penalty of that approach is small. In this invention, however, the vector implementation makes the end of the string 32 bytes long, and byte-by-byte comparison would incur a large overhead. This invention therefore uses the vector trailing-zero-count instruction to locate the address of the terminator. Only two instructions are needed to complete tail processing, which reduces the instruction count and improves the performance of the string length function.
[0049] 4) The string length function implemented with the above optimization method increases the amount of data processed per operation and reduces the number of instructions. Especially when the input string is large, it makes full use of the vector instruction set to calculate the string length, thereby improving the performance of the function.
Attached Figure Description
[0050] Figure 1 is a flowchart of the present invention;
[0051] Figure 2 is a schematic diagram illustrating the use of vector overflow addition and subtraction instructions to process strings in this invention.
Detailed Implementation
[0052] The following embodiments are only used to illustrate the technical solutions of the present invention more clearly, and should not be used to limit the scope of protection of the present invention.
[0053] Figure 1 is a flowchart of the method of the present invention, which includes:
[0054] Step S10: Perform a non-boundary-aligned load of the string based on the first address of the string.
[0055] Obtain the address of the first character of the string;
[0056] Take the span from the first address to the first 32-byte-aligned boundary address as the head of the string, and load the head into a register using a non-boundary-aligned vector load.
[0057] Specifically, the input parameter of the string length function is a string, and the starting address of the input string is obtained. Since the starting address cannot be guaranteed to fall on a 32-byte boundary, the span from the starting address to the first 32-byte boundary address is taken as the head of the string. The head of the string is processed separately first, and a non-boundary-aligned vector load is used to load its contents into a register.
[0058] Step S11: Remove the interference of irrelevant data located before the starting address of the string.
[0059] Use OR and NOT logical operations to set all irrelevant data located before the first address of the string to 0;
[0060] Specifically, because step S10 uses a non-boundary-aligned vector load instruction, irrelevant data located before the first address is loaded into the register together with the string. This step applies OR and NOT logical operations to set all of that irrelevant data to 0, so that it cannot interfere with subsequent processing.
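One way to clear lanes with only OR and NOT is the identity `x AND NOT(m) = NOT(NOT(x) OR m)`. The sketch below models the 32-byte register as a byte array. The function name, the array model and the `offset` parameter (the number of bytes loaded ahead of the first address) are assumptions made for illustration, and the sketch does not model how the cleared lanes are treated by the later terminator test.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 32  /* vector width used throughout the text */

/* Clear the first `offset` lanes of a 32-byte block using only OR and NOT:
   x AND NOT(m) is computed as NOT(NOT(x) OR m). */
void clear_lanes_before(uint8_t block[VEC_BYTES], unsigned offset)
{
    assert(offset <= VEC_BYTES);
    for (unsigned i = 0; i < VEC_BYTES; ++i) {
        /* Mask lane: all ones where the data is irrelevant, zero elsewhere. */
        uint8_t m = (i < offset) ? 0xFFu : 0x00u;
        block[i] = (uint8_t)~((uint8_t)~block[i] | m);
    }
}
```

Lanes at or after `offset` pass through unchanged, because `NOT(NOT(x) OR 0)` is `x`.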
[0061] Step S12: Boundary-align the starting address of the string.
[0062] The boundary alignment consists of setting the lower 5 bits of the first address of the string to 0.
[0063] Specifically, since the starting address of the string obtained in step S10 is not aligned to a boundary, this step aligns it. The lower 5 bits of the address are set to 0 so that the address falls on a 32-byte boundary.
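Clearing the lower 5 bits amounts to masking the address with the complement of 31. A minimal C sketch follows. The helper names are hypothetical, and `head_offset` is an added convenience that reports how far the original address lies past the aligned one.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 32u  /* 2^5, hence the lower 5 bits */

/* Clear the lower 5 bits so the address falls on a 32-byte boundary. */
uintptr_t align_down_32(uintptr_t addr)
{
    return addr & ~(uintptr_t)(VEC_BYTES - 1u);
}

/* Distance in bytes from the aligned address up to the original address. */
unsigned head_offset(uintptr_t addr)
{
    unsigned off = (unsigned)(addr & (VEC_BYTES - 1u));
    assert(align_down_32(addr) + off == addr);  /* the two parts recompose addr */
    return off;
}
```

For example, address 0x1007 aligns down to 0x1000 with an offset of 7, while an already aligned address such as 0x1020 is left unchanged.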
[0064] Step S13: Search the boundary-aligned string for the string terminator.
[0065] Specifically, the head of the string is processed using vector overflow addition and subtraction instructions. Each non-zero byte is replaced with the fixed value 0x80, and the string terminator 0 is replaced with a different value, 0x81. The 1-count instruction can then be used to make the determination. The specific implementation is as follows (see Figure 2):
[0066] Step 31: XOR the string content byte by byte with 0x7f to obtain the first string result;
[0067] Perform two overflow subtraction operations on the first string result obtained in step 31, subtracting 0x7f each time, to obtain the second string; in the second string, every non-zero byte is thereby replaced with 0x80 and the terminator 0 is replaced with 0x81.
[0068] Step 32: Apply the 1-count instruction, which counts the bits that are 1, to the second string;
[0069] If the second string derives entirely from non-zero bytes and contains no terminator 0, the result of the 1-count instruction on the second string is 32;
[0070] If the second string contains a terminator 0, the result of the 1-count instruction on the second string is greater than 32;
[0071] Step 33: Compare the result of the 1-count instruction with 32. If the result for the second string is equal to 32, search for the string terminator in a loop. If the result for the second string is not equal to 32, perform tail processing.
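Steps 31 to 33 can be emulated lane by lane in portable C. The sketch below assumes that "overflow subtraction" means signed saturating subtraction on each byte; under that reading a zero byte ends up as 0x81 and every other byte as 0x80, as the text states. All function names are hypothetical, and the loops stand in for single vector instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 32

/* Signed saturating "x - 0x7f" on one byte lane, computed in int.
   Assumption: this is what the text calls an overflow subtraction. */
static int sat_sub_7f(int x)
{
    int r = x - 0x7f;
    return (r < -128) ? -128 : r;   /* only the lower bound can be reached */
}

/* Step 31 plus the two subtractions: 0x00 -> 0x81, any other byte -> 0x80. */
uint8_t transform_byte(uint8_t b)
{
    int x = b ^ 0x7f;               /* XOR with 0x7f, value in 0..255 */
    if (x > 127) x -= 256;          /* reinterpret as a signed byte */
    x = sat_sub_7f(sat_sub_7f(x));  /* subtract 0x7f twice with saturation */
    return (uint8_t)x;              /* -128 -> 0x80, -127 -> 0x81 */
}

/* Step 32: transform all 32 lanes and count the bits that are 1. */
unsigned count_ones_32(const uint8_t block[VEC_BYTES])
{
    assert(block != NULL);
    unsigned ones = 0;
    for (size_t i = 0; i < VEC_BYTES; ++i) {
        for (uint8_t v = transform_byte(block[i]); v != 0; v >>= 1) {
            ones += v & 1u;
        }
    }
    return ones;
}

/* Step 33: a count other than 32 means at least one terminator is present. */
bool block_has_terminator(const uint8_t block[VEC_BYTES])
{
    return count_ones_32(block) != 32u;
}
```

Each 0x80 lane contributes exactly one set bit, so 32 terminator-free lanes give a count of 32, and every 0x81 lane adds one more.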
[0072] Step S14: Loop segment processing.
[0073] Specifically, since no string terminator was found in the second string, this step searches for the string terminator in a loop, 32 bytes at a time. The specific implementation is as follows:
[0074] Step 41: Add 32 bytes to the boundary-aligned starting address of the string from step S12, and use the result as the address for the first pass of the loop segment to obtain the third string. The address is advanced by 32 bytes in each iteration.
[0075] Step 42: XOR the third string byte by byte with 0x7f to obtain the fourth string;
[0076] Perform two overflow subtraction operations on the fourth string, subtracting 0x7f each time, to obtain the fifth string;
[0077] In the fifth string, every non-zero byte is thereby replaced with 0x80 and the terminator 0 is replaced with 0x81;
[0078] Step 43: Apply the 1-count instruction to the fifth string to obtain the 1-count result;
[0079] Step 44: If the 1-count result is equal to 0, it is determined that no string terminator has been found, and the method returns to step 41. If the 1-count result is 1, it is determined that the string terminator has been found, and tail processing is performed.
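The loop segment can be sketched as below. The sketch assumes signed saturating subtraction for the "overflow subtraction", and it tests each block with the comparison against 32 from step 33 rather than the 0/1 outcome named in step 44. The function names are hypothetical. Like the vector loads it models, the code reads whole aligned 32-byte blocks, so the caller must supply a buffer that is padded out to a 32-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 32

/* 1-count of one 32-byte block after the XOR / double-subtraction transform.
   Assumption: "overflow subtraction" is signed saturating subtraction. */
static unsigned ones_after_transform(const unsigned char *block)
{
    unsigned ones = 0;
    for (int i = 0; i < VEC_BYTES; ++i) {
        int x = block[i] ^ 0x7f;           /* step 42: XOR with 0x7f */
        if (x > 127) x -= 256;             /* reinterpret as a signed byte */
        for (int k = 0; k < 2; ++k) {      /* subtract 0x7f twice, saturating */
            x -= 0x7f;
            if (x < -128) x = -128;
        }
        for (uint8_t v = (uint8_t)x; v != 0; v >>= 1) {
            ones += v & 1u;                /* step 43: count the 1 bits */
        }
    }
    return ones;
}

/* Steps 41 to 44: starting from an aligned block already known to hold no
   terminator, advance 32 bytes at a time until a block contains one. */
const unsigned char *find_terminator_block(const unsigned char *aligned_block)
{
    assert(((uintptr_t)aligned_block % VEC_BYTES) == 0);
    const unsigned char *p = aligned_block;
    do {
        p += VEC_BYTES;                        /* step 41: next 32 bytes */
    } while (ones_after_transform(p) == 32u);  /* 32 means no terminator yet */
    return p;
}
```

The returned pointer is the lowest address of the 32 bytes that contain the terminator, which is the input that tail processing needs.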
[0080] Step S15: Tail processing.
[0081] Step 51: If the second string contains a terminator 0, find the 32 bytes that contain the terminator;
[0082] Step 52: Take the 32 bytes containing the string terminator as the end of the string;
[0083] Step 53: Locate the exact address of the terminator, which is achieved through the following steps:
[0084] Using the vector trailing-zero-count instruction, examine the end of the string starting from its lowest address and count zeros until the first 1 appears. This gives the number of consecutive zeros that precede the first 1.
[0085] Add the number of consecutive zeros to the lowest address of the 32 bytes containing the string terminator to obtain the address of the string terminator.
[0086] Specifically, the preceding steps have located the 32 bytes that contain the string terminator, and this portion is treated as the end of the string. This step pinpoints the exact address of the terminator. The trailing-zero-count instruction (LTC) counts consecutive zeros starting from the lowest address of these 32 bytes until a 1 is encountered (the position of the string terminator). Adding this count to the lowest address of the 32 bytes yields the address of the string terminator.
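The idea behind step 53 can be sketched with a per-byte bit mask. The text does not spell out how the 0x80/0x81 lanes are reduced to single bits before the trailing zeros are counted, so the sketch simply assumes one mask bit per byte lane, set where the lane holds the terminator. The function names are hypothetical, and the loops stand in for the vector instruction.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 32

/* One bit per byte lane: bit i is 1 when block[i] is the terminator. */
static uint32_t terminator_mask(const unsigned char *block)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < VEC_BYTES; ++i) {
        if (block[i] == 0) {
            mask |= (uint32_t)1 << i;
        }
    }
    return mask;
}

/* Portable trailing-zero count; returns 32 for an all-zero mask. */
static unsigned count_trailing_zeros32(uint32_t mask)
{
    unsigned n = 0;
    while (n < 32u && ((mask >> n) & 1u) == 0u) {
        ++n;
    }
    return n;
}

/* Step 53: lowest address of the block plus the number of leading
   non-terminator lanes gives the address of the terminator. */
const unsigned char *locate_terminator(const unsigned char *block)
{
    unsigned zeros = count_trailing_zeros32(terminator_mask(block));
    assert(zeros < VEC_BYTES);   /* the block must contain a terminator */
    return block + zeros;
}
```

If a block holds more than one zero byte, the count stops at the lowest-addressed one, which is the terminator that determines the string length.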
[0087] Step S16: Calculate the string length.
[0088] The length of the string is obtained by subtracting the address of the string's first character from the address of the string's terminator.
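The length is the terminator address minus the first address, in that order. A short C sketch with a hypothetical helper name follows; both pointers are assumed to refer to the same string.

```c
#include <assert.h>
#include <stddef.h>

/* Length = address of the terminator minus address of the first character. */
size_t length_from_addresses(const char *first, const char *terminator)
{
    assert(first != NULL && terminator != NULL);
    assert(terminator >= first);   /* the terminator never precedes the start */
    return (size_t)(terminator - first);
}
```

For the string "vector", the terminator lies 6 bytes past the first character, so the result is 6. An empty string has its terminator at the first address, which gives 0.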
[0089] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0090] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0091] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0092] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the technical principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.
Claims
1. A segmented optimization method for a string length function, characterized in that it comprises: Step S10: perform a non-boundary-aligned load of the string based on the starting address of the string; step S10 is achieved through the following steps: obtain the address of the first character of the string; take the span from the first address to the first 32-byte-aligned boundary address as the head of the string, and load the head into a register using a non-boundary-aligned vector load; Step S11: remove the interference of irrelevant data located before the starting address of the string; Step S12: boundary-align the starting address of the string; Step S13: search the boundary-aligned string for the string terminator; if the terminator is not found, search for it in a loop; if it is found, perform tail processing; calculate the string length; step S13 is achieved through the following steps: Step 31: XOR the string byte by byte with 0x7f to obtain the first string result; perform two overflow subtraction operations on the first string result, subtracting 0x7f each time, to obtain the second string; in the second string, every non-zero byte is thereby replaced with 0x80 and the terminator is replaced with 0x81; Step 32: apply the 1-count instruction to the second string, the 1-count instruction being used to count all bits that are 1 in the string; if the second string derives entirely from non-zero bytes and contains no terminator, the result of the 1-count instruction on the second string is 32; if the second string contains a terminator, the result of the 1-count instruction on the second string is greater than 32; Step 33: compare the result of the 1-count instruction with 32; if the result for the second string is equal to 32, search for the string terminator in a loop; if the result for the second string is not equal to 32, perform tail processing.
2. The segmented optimization method for a string length function according to claim 1, characterized in that step S11, removing the interference of irrelevant data before the starting address of the string, is achieved through the following steps: use OR and NOT logical operations to set all irrelevant data located before the first address of the string to 0.
3. The segmented optimization method for a string length function according to claim 1, characterized in that step S12, boundary-aligning the starting address of the string, is achieved through the following steps: set the lower 5 bits of the string's starting address to 0.
4. The segmented optimization method for a string length function according to claim 1, characterized in that searching for the string terminator in a loop is achieved through the following steps: Step 41: add 32 bytes to the boundary-aligned starting address of the string from step S12 to obtain the third string; Step 42: XOR the third string byte by byte with 0x7f to obtain the fourth string; perform two overflow subtraction operations on the fourth string, subtracting 0x7f each time, to obtain the fifth string; in the fifth string, every non-zero byte is thereby replaced with 0x80 and the terminator is replaced with 0x81; Step 43: apply the 1-count instruction to the fifth string to obtain the 1-count result; Step 44: if the 1-count result is equal to 0, it is determined that no string terminator has been found, and the method returns to step 41; if the 1-count result is 1, it is determined that the string terminator has been found, and tail processing is performed.
5. The segmented optimization method for a string length function according to claim 1, characterized in that tail processing is performed through the following steps: Step 51: if the second string contains a terminator, find the 32 bytes that contain the terminator; Step 52: take the 32 bytes containing the string terminator as the end of the string; Step 53: locate the exact address of the terminator, which is achieved through the following steps: using the vector trailing-zero-count instruction, examine the end of the string starting from its lowest address and count zeros until the first 1 appears, thereby obtaining the number of consecutive zeros that precede the first 1; add the number of consecutive zeros to the lowest address of the 32 bytes containing the string terminator to obtain the address of the string terminator.
6. The segmented optimization method for a string length function according to claim 5, characterized in that the string length is calculated through the following steps: the length of the string is obtained by subtracting the address of its first character from the address of its terminator.
7. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the steps of the method according to any one of claims 1 to 6.
8. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the steps of the method according to any one of claims 1 to 6.