An instruction customization method and system
The method converts binary executable code into assembly code, analyzes it, and generates custom instructions. This solves the problem of accelerating binary code on embedded processors and makes the acceleration transparent and efficient for the user.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NAT UNIV OF DEFENSE TECH
- Filing Date
- 2022-07-06
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to effectively accelerate user-provided binary executable code, especially in the field of embedded processors, where user privacy or trade secrets prevent the embedding of custom instruction assembly code in the source code or the recompilation of the source code using a modified compiler.
The method acquires the user's binary executable code, converts it into assembly code, analyzes the data dependencies in the instruction stream, filters the resulting data-related instruction sequence groups, generates custom instructions, and implements these instructions on the target machine. The custom instructions then replace the original assembly code, and new binary executable code is generated.
The acceleration process is transparent and effective for users: they need not be familiar with custom instructions or modify the compiler, and the execution speed of the binary executable code improves significantly.
Figure CN115185585B_ABST
Description
Technical Field
[0001] This invention relates to the field of embedded processor instruction customization, and in particular to an instruction customization method and system.
Background Technology
[0002] As applications increasingly demand higher performance from embedded processors, accelerating application execution through custom instructions has become a common practice. This is especially true with the development of domain-specific architectures, where embedded processors designed for specific domains and applications often achieve better performance by adding custom instructions. However, traditional methods for accelerating application execution with custom instructions typically involve embedding custom instruction assembly code into the source code or modifying the compiler to support the custom instructions. Embedding custom instruction assembly code into the source code is inefficient and requires programmers to be highly familiar with the custom instructions and application characteristics, making it unsuitable for large-scale programs. Modifying the compiler involves a complex modification process and requires the source code to be recompiled to generate new code.
[0003] Traditional custom instruction acceleration methods are a forward approach, working from source code to executable code. However, in the embedded processor field, some applications involve user privacy or trade secrets, requiring users to provide only binary executable code. Traditional custom instruction acceleration methods require users to embed custom instruction assembly code into the source code or recompile the source code using a modified compiler, failing to address the issue of accelerating the user's binary executable code.
[0004] Therefore, providing an instruction customization method and system that can effectively accelerate the execution process of user binary executable code is a problem that urgently needs to be solved by those skilled in the art.
Summary of the Invention
[0005] The purpose of this invention is to provide an instruction customization method and system. This method is logically clear, safe, effective, reliable and easy to operate. It can effectively accelerate the execution process of the user's binary executable code. This acceleration process is completely transparent to the user. The user does not need to be familiar with the characteristics of the customized instructions and application programs, nor does the user need to modify the compiler for the customized instructions.
[0006] Based on the above objectives, the technical solution provided by the present invention is as follows:
[0007] A method for customizing instructions includes the following steps:
[0008] S1. Obtain the binary executable code input by the user, and convert the binary executable code into first assembly code;
[0009] S2. Analyze the data correlation of the instruction stream in the first assembly code to obtain the first data-related instruction sequence group;
[0010] S3. Filter the first data-related instruction sequence group to obtain the second data-related instruction sequence group;
[0011] S4. Generate a customized instruction in the second data-related instruction sequence group according to a preset first rule;
[0012] S5. Implement the customized instructions and obtain the second assembly code;
[0013] S6. According to the second preset rule, the first assembly code is replaced with the second assembly code in sequence. After all data-related instructions are replaced, the third assembly code is obtained.
[0014] S7. Convert the third assembly code into new binary executable code and send the new binary executable code to the user.
[0015] Preferably, the step of analyzing the data correlation of the instruction stream in the first assembly code and obtaining the first data-related instruction sequence group includes the following steps:
[0016] A1. Determine whether two adjacent instructions are a data-dependent instruction sequence according to the instruction flow order in the first assembly code;
[0017] A2. Obtain and count all data-related instruction sequences in the first assembly code;
[0018] A3. Based on the statistical results, generate the first data-related instruction sequence group.
[0019] Preferably, the step of determining whether two adjacent instructions are a data-dependent instruction sequence according to the instruction flow order in the first assembly code includes the following steps:
[0020] B1. Select adjacent instructions A and B according to the instruction flow order in the first assembly code;
[0021] B2. Determine whether the source register of instruction B contains the destination register of instruction A;
[0022] B3. Determine whether the destination register of instruction A is used as the destination register when it first appears after instruction A;
[0023] B4. If all the above judgment results are yes, then instruction A and instruction B are related in terms of data;
[0024] B5. Define instruction A and instruction B as an AB type data-related instruction sequence.
[0025] Preferably, the steps of acquiring and statistically analyzing all data-related instruction sequences in the first assembly code and generating a first data-related instruction sequence group based on the statistical results include the following steps:
[0026] C1. Repeat steps B1 to B5 until several AB-type data-related instruction sequences are obtained;
[0027] C2. Count each of the AB-type data-related instruction sequences and its frequency of occurrence;
[0028] C3. Integrate each of the AB-type data-related instruction sequences to generate the first data-related instruction sequence group.
[0029] Preferably, filtering the first data-related instruction sequence group to obtain the second data-related instruction sequence group includes the following steps:
[0030] D1. Select instruction A from a plurality of the AB-type data-related instruction sequences;
[0031] D2. Determine whether instruction A contains a jump instruction;
[0032] D3. Determine whether instruction A contains a branch instruction;
[0033] D4. If any of the above judgment results are true, then delete the AB type data-related instruction sequence that contains the jump instruction or the branch instruction in instruction A, and form a preliminary screening data-related instruction sequence group.
[0034] Preferably, if any of the above judgment results are true, after deleting the AB-type data-related instruction sequence containing the jump instruction or the branch instruction in instruction A to form the initial screening data-related instruction sequence group, the following steps are further included:
[0035] E1. Obtain and count the total number of source registers of instruction A and the source register of instruction B in each data-related instruction sequence in the initial screening data-related instruction sequence group;
[0036] E2. Determine whether the total number of source registers of instruction A and the source registers of instruction B in the instruction sequence group related to the initial screening data is greater than the preset number of source registers;
[0037] E3. If the above judgment result is yes, then delete the instruction sequence in the initial screening data related instruction sequence group where the total number of source registers is greater than the preset number of source registers, and generate the second data related instruction sequence group.
[0038] Preferably, the step of generating a custom instruction and implementing the custom instruction according to a preset first rule in the second data-related instruction sequence group, and obtaining the second assembly code, includes the following steps:
[0039] F1. Preset a standard frequency for the second data-related instruction sequence group;
[0040] F2. Arrange the AB-type data-related instruction sequences whose frequency is above the preset standard frequency in descending order of frequency;
[0041] F3. Generate customized instructions sequentially in order of frequency from high to low, wherein the customized instructions are used to implement the function of the AB-type data-related instruction sequences in the second data-related instruction sequence group;
[0042] F4. Implement the custom instructions on the user's target machine and obtain the second assembly code.
[0043] Preferably, the step of sequentially replacing the first assembly code with the second assembly code according to the second preset rule, and obtaining the third assembly code after all data-related instructions have been replaced, includes the following steps:
[0044] G1. Replace the first assembly code with the second assembly code in descending order of frequency;
[0045] G2. Repeat step G1 until all data-related instruction sequences in the first assembly code have been replaced;
[0046] G3. Integrate and generate the third assembly code.
[0047] Preferably,
[0048] The conversion tool for converting the binary executable code into first assembly code is a disassembler.
[0049] The conversion tool used to transform the third assembly code into new binary executable code is an assembler.
[0050] An instruction customization system, comprising:
[0051] The conversion module is used to convert between binary executable code and assembly code;
[0052] The analysis module is used to analyze the data correlation of the instruction stream in order to obtain the first related instruction sequence group;
[0053] The filtering module is used to filter the first related instruction sequence group to obtain the second related instruction sequence group;
[0054] The instruction customization module is used to generate customized instructions in the second related instruction sequence group according to preset rules;
[0055] The acquisition module is used to acquire the second assembly code;
[0056] The replacement module is used to replace the first assembly code with the second assembly code and obtain the third assembly code;
[0057] The output module is used to send new binary executable code to the user.
[0058] This invention discloses an instruction customization method. The method converts the user's binary executable code into first assembly code, analyzes the data correlation within the instruction stream of the first assembly code to obtain a first data-related instruction sequence group, and filters that group to obtain a second data-related instruction sequence group. Customized instructions are then generated from the second data-related instruction sequence group according to a first preset rule and sent to the user, who implements them, yielding second assembly code. According to a second preset rule, the first assembly code is replaced with the second assembly code in sequence; after all data-related instructions have been replaced, third assembly code is obtained, converted into new binary executable code, and sent to the user. The method requires the user to provide only binary executable code. By converting the binary executable code into assembly code, identifying the data-related instruction sequences in it, and replacing them with customized instructions whose assembly code implements the same functions, the method effectively accelerates the execution of the user's binary executable code. The system provided by this invention achieves the same technical effect.
Attached Figure Description
[0059] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0060] Figure 1 is a flowchart of the instruction customization method provided in an embodiment of the present invention;
[0061] Figure 2 is a flowchart of step S2 provided in an embodiment of the present invention;
[0062] Figure 3 is a flowchart of step A1 provided in an embodiment of the present invention;
[0063] Figure 4 is a flowchart of steps A2 and A3 provided in an embodiment of the present invention;
[0064] Figure 5 is a flowchart of step S3 provided in an embodiment of the present invention;
[0065] Figure 6 is a flowchart of the steps following step D4 provided in an embodiment of the present invention;
[0066] Figure 7 is a flowchart of steps S4 and S5 provided in an embodiment of the present invention;
[0067] Figure 8 is a flowchart of step S6 provided in an embodiment of the present invention;
[0068] Figure 9 is a schematic structural diagram of the instruction customization system provided in an embodiment of the present invention.
Detailed Implementation
[0069] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0070] The embodiments of this invention are written in a progressive manner.
[0071] This invention provides a method and system for customizing instructions. It primarily addresses the following technical problem in the prior art: for reasons of user privacy or trade secrets, users provide only binary executable code, whereas existing approaches require embedding custom instruction assembly code in the source code or recompiling the source code with a modified compiler, which is inconvenient and inefficient.
[0072] A method for customizing instructions includes the following steps:
[0073] S1. Obtain the binary executable code input by the user and convert the binary executable code into first assembly code;
[0074] S2. Analyze the data dependencies of the instruction stream in the first assembly code and obtain the first data-dependent instruction sequence group;
[0075] S3. Filter the first data-related instruction sequence group to obtain the second data-related instruction sequence group;
[0076] S4. Generate customized instructions according to the preset first rule in the second data-related instruction sequence group;
[0077] S5. Implement custom instructions and obtain the second assembly code;
[0078] S6. According to the second preset rule, replace the first assembly code with the second assembly code in sequence. After all data-related instructions have been replaced, obtain the third assembly code.
[0079] S7. Convert the third assembly code into new binary executable code and send the new binary executable code to the user.
[0080] In step S1, the user inputs binary executable code, which is then converted into first assembly code;
[0081] In step S2, the data dependencies of the instruction stream in the first assembly code are analyzed, and the instructions with data dependencies are grouped together to form the first data-dependent instruction sequence group.
[0082] In step S3, the first data-related instruction sequence group is filtered to remove relevant instruction sequences that do not meet the conditions, thus forming the second data-related instruction sequence group;
[0083] In step S4, customized instructions are generated from the second data-related instruction sequence group according to the preset first rule;
[0084] In step S5, the custom instructions are implemented on the user's target machine, and the second assembly code is obtained;
[0085] In step S6, according to the second preset rule, the first assembly code is replaced with the second assembly code in sequence. After all data-related instructions have been replaced, the resulting assembly code forms the third assembly code.
[0086] In step S7, the third assembly code is converted into new binary executable code and sent to the user's target machine. At this time, the execution process of the new binary executable code has a significant speed improvement compared to the execution process of the original binary executable code.
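The analysis steps that follow assume that each instruction in the first assembly code exposes its destination and source registers. A minimal Python sketch of such a representation is given here for illustration only. The RISC-V-like `op rd, rs1, rs2` syntax, the `Insn` record, the `parse_line` helper, and the `NO_DEST_OPS` set are assumptions introduced for this example and are not part of the disclosed method.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    op: str                  # mnemonic, e.g. "add"
    dest: Optional[str]      # destination register, or None if there is none
    srcs: Tuple[str, ...]    # source registers (immediates and labels dropped)


# Hypothetical mnemonics that write no destination register.
NO_DEST_OPS = {"beq", "bne", "j"}
# Assumed register naming: x0, x1, ... (memory operands like 0(x2) are not handled).
REG = re.compile(r"x\d+")


def parse_line(line: str) -> Insn:
    """Parse one disassembled line of the form 'op rd, rs1, rs2'."""
    op, _, rest = line.strip().partition(" ")
    tokens = (tok.strip() for tok in rest.split(","))
    regs = [tok for tok in tokens if REG.fullmatch(tok)]
    if op in NO_DEST_OPS or not regs:
        return Insn(op, None, tuple(regs))
    return Insn(op, regs[0], tuple(regs[1:]))


first_assembly = [parse_line(l) for l in ("add x3, x1, x2", "addi x4, x3, 8")]
```

In practice the text would come from a disassembler, and the operand syntax would follow the target instruction set.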
[0087] Preferably, step S2 includes the following steps:
[0088] A1. Determine whether two adjacent instructions are a data-dependent instruction sequence based on the instruction flow order in the first assembly code;
[0089] A2. Obtain and count all data-related instruction sequences in the first assembly code;
[0090] A3. Based on the statistical results, generate the first data-related instruction sequence group.
[0091] In step A1, the instruction sequence in the first assembly code is analyzed one by one to determine whether two adjacent instructions are a data-dependent instruction sequence.
[0092] In step A2, based on the judgment result, all data-related instruction sequences in the first assembly code are obtained and their frequency of occurrence is counted;
[0093] In step A3, a first data-related instruction sequence group is formed based on the statistical results.
[0094] Preferably, step A1 includes the following steps:
[0095] B1. Select adjacent instructions A and B according to the instruction flow order in the first assembly code;
[0096] B2. Determine whether the source register of instruction B contains the destination register of instruction A;
[0097] B3. Determine whether the destination register of instruction A is used as the destination register when it first appears after instruction A;
[0098] B4. If all the above judgment results are yes, then instruction A and instruction B are related in terms of data;
[0099] B5. Merge instruction A and instruction B into a sequence of AB type data-related instructions.
[0100] In step B1, adjacent instructions A and B are selected according to the instruction flow order in the first assembly code. A and B do not denote specific instructions; they are placeholders for any pair of adjacent instructions with the characteristics described below.
[0101] In step B2, it is determined whether the source register of instruction B contains the destination register of instruction A, that is, whether the destination register of instruction A is one of the source registers of instruction B.
[0102] It should be noted that the source register refers to the source address / operand register, and the destination register refers to the destination address / operand register. Both are registers used to store the addresses and data of the memory locations currently being accessed by the CPU. Because memory operates more slowly than the CPU, registers must hold the address and data information until the memory read / write operation is complete. As the names suggest, the source register stores the source address and operands, and the destination register stores the destination address and operands.
[0103] In step B3, it is determined whether the destination register of instruction A, at its first appearance after instruction A, is used as a destination register; that is, whether the destination address / operand of instruction A serves as the source address / operand of instruction B only, and not as the source address / operand of any other instruction.
[0104] In step B4, if all the above judgment results are yes, then it is determined that instruction A and instruction B are related.
[0105] In step B5, data-related instructions A and B are merged and defined as an AB-type data-related instruction sequence.
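Steps B1 to B4 can be sketched as a single predicate over a straight-line instruction list. This is one possible reading, not the definitive implementation: the `Insn` record and the `is_data_related` name are hypothetical, B3 is interpreted as "the next instruction after B that mentions A's destination register overwrites it rather than reads it", and a register that is never mentioned again is conservatively treated as still live.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Insn(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def is_data_related(stream: Sequence[Insn], i: int) -> bool:
    """Check whether stream[i] (A) and stream[i + 1] (B) form a data-related pair."""
    a, b = stream[i], stream[i + 1]           # B1: adjacent instructions A and B
    if a.dest is None or a.dest not in b.srcs:
        return False                          # B2 fails: B does not read A's result
    for later in stream[i + 2:]:              # B3: look at later uses of A's destination
        if a.dest in later.srcs:
            return False                      # another instruction reads A's result
        if later.dest == a.dest:
            return True                       # overwritten first: only B consumed it
    return False                              # assumption: never overwritten, treat as live


stream = [
    Insn("mul", "x3", ("x1", "x2")),   # A
    Insn("add", "x4", ("x3", "x5")),   # B reads x3
    Insn("li", "x3", ()),              # x3 overwritten before any other read
    Insn("sub", "x6", ("x4", "x3")),
]
related = is_data_related(stream, 0)   # True for this stream
```

A production analysis would also need to respect basic-block boundaries, which this sketch ignores.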
[0106] Preferably, steps A2 and A3 include the following steps:
[0107] C1. Repeat steps B1 to B5 until several AB-type data-related instruction sequences are obtained;
[0108] C2. Count each AB-type data-related instruction sequence and its frequency of occurrence;
[0109] C3. Integrate each AB-type data-related instruction sequence to generate the first data-related instruction sequence group.
[0110] In step C1, steps B1 to B5 above are repeated until all AB-type data-related instruction sequences in the first assembly code are obtained, for example CD, EF, GH, and so on.
[0111] In step C2, each AB-type data-related instruction sequence is counted, and its frequency of occurrence is recorded.
[0112] In step C3, each AB-type data-related instruction sequence is integrated to generate the first data-related instruction sequence group.
[0113] It should be noted that if two AB, one CD, one EF, one GH, one IJ, and one KL appear in the instruction stream, then the first data-related instruction sequence group is [AB, CD, EF, GH, IJ, KL], where the frequency of AB is 2.
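The counting in steps C1 to C3 can be illustrated with a short Python sketch that reproduces the example above. The `build_first_group` name and the use of two-letter strings to stand for sequence types are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def build_first_group(pairs: Iterable[str]) -> Tuple[List[str], Counter]:
    """Return the distinct sequence types in first-seen order, plus their frequencies."""
    freq = Counter(pairs)        # C2: count each AB-type sequence
    group = list(freq)           # C3: one entry per type, in order of first occurrence
    return group, freq


# C1: data-related pairs detected in instruction-stream order (AB occurs twice).
detected = ["AB", "CD", "AB", "EF", "GH", "IJ", "KL"]
group, freq = build_first_group(detected)
# group is ["AB", "CD", "EF", "GH", "IJ", "KL"] and freq["AB"] is 2
```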
[0114] Preferably, step S3 includes the following steps:
[0115] D1. Select instruction A from several AB-type data-related instruction sequences;
[0116] D2. Determine whether instruction A contains a jump instruction;
[0117] D3. Determine whether instruction A contains a branch instruction;
[0118] D4. If any of the above judgment results are true, then delete the AB type data-related instruction sequence that contains jump instructions or branch instructions in instruction A, and form a preliminary screening data-related instruction sequence group.
[0119] In step D1, instruction A is selected from the first data-related instruction sequence group. For example, in the above example [AB,CD,EF,GH,IJ,KL], instructions A, C, E, G, I, and K are selected.
[0120] In steps D2 and D3, it is determined whether instruction A, C, E, G, I, or K is a jump instruction or a branch instruction. For example, the check may find that, among A, C, E, G, I, and K, C is a jump instruction and E is a branch instruction.
[0121] In step D4, if any of the judgment results is yes, the AB-type data-related instruction sequence whose instruction A is a jump or branch instruction is deleted, forming the preliminary screening data-related instruction sequence group (also referred to below as the initial screening data-related instruction sequence group). For example, if C is a jump instruction, CD is deleted from [AB,CD,EF,GH,IJ,KL]; if E is a branch instruction, EF is deleted from [AB,EF,GH,IJ,KL]. The remaining sequences are then integrated to form the preliminary screening data-related instruction sequence group [AB,GH,IJ,KL].
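A minimal sketch of the screening in steps D1 to D4 follows. The mnemonic sets `JUMP_OPS` and `BRANCH_OPS`, the `prescreen` name, and the representation of each sequence as a pair of mnemonics are hypothetical; a real implementation would take the sets from the target instruction set.

```python
from typing import List, Tuple

# Hypothetical mnemonic sets for a RISC-V-like target.
JUMP_OPS = {"j", "jal", "jalr"}
BRANCH_OPS = {"beq", "bne", "blt", "bge"}


def prescreen(group: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop every sequence whose first instruction is a jump or a branch."""
    return [
        (first, second)
        for first, second in group           # D1: take the first instruction of each pair
        if first not in JUMP_OPS              # D2: not a jump
        and first not in BRANCH_OPS           # D3: not a branch
    ]                                         # D4: what remains is the pre-screened group


first_group = [("mul", "add"), ("jal", "addi"), ("beq", "add"), ("slli", "add")]
prescreened = prescreen(first_group)
# prescreened is [("mul", "add"), ("slli", "add")]
```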
[0122] Preferably, after step D4, the following steps are also included:
[0123] E1. Obtain and count the total number of source registers for instruction A and instruction B in each data-related instruction sequence in the initial screening data-related instruction sequence group;
[0124] E2. Determine whether the total number of source registers of instruction A and instruction B in the instruction sequence group related to the initial screening data is greater than the preset number of source registers;
[0125] E3. If the above judgment result is yes, then delete the instruction sequence in the initial data-related instruction sequence group where the total number of source registers is greater than the preset number of source registers, and generate the second data-related instruction sequence group.
[0126] In step E1, for each data-related instruction sequence in the initial screening data-related instruction sequence group, the total number of source registers of instruction A and instruction B is obtained. In this embodiment, a register that appears as a source more than once is counted once, and distinct source registers are counted separately.
[0127] In step E2, it is determined whether the total number of source registers of instruction A and instruction B is greater than the number of source registers preset according to the instruction set.
[0128] In step E3, if the total number of source registers for instructions A and B in a data-related instruction sequence is greater than the preset number of source registers, then the data-related instruction sequence is deleted, and a second data-related instruction sequence group is generated. For example, suppose that for the initial screening data-related instruction sequence group [AB,GH,IJ,KL] the preset number of source registers is 2, while the total number of source registers of instruction G and instruction H in the data-related instruction sequence GH is 3. Since the total number of source registers in the data-related instruction sequence GH is greater than the preset number of source registers, GH is deleted from the initial screening data-related instruction sequence group [AB,GH,IJ,KL], and a second data-related instruction sequence group [AB,IJ,KL] is generated.
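The source-register check of steps E1 to E3 can be sketched as follows. The `Pair` record, the register names, and the limit of 3 used in the usage lines are illustrative assumptions chosen only for this sketch. The count follows a literal reading of the text: it is the number of distinct registers among the sources of both instructions, including the register through which B reads A's result.

```python
from typing import List, NamedTuple, Tuple


class Pair(NamedTuple):
    name: str                    # label of the sequence, e.g. "GH"
    a_srcs: Tuple[str, ...]      # source registers of the first instruction
    b_srcs: Tuple[str, ...]      # source registers of the second instruction


def count_sources(pair: Pair) -> int:
    """E1: distinct source registers across both instructions (duplicates count once)."""
    return len(set(pair.a_srcs) | set(pair.b_srcs))


def filter_by_sources(group: List[Pair], max_srcs: int) -> List[Pair]:
    """E2/E3: delete sequences whose total exceeds the preset limit."""
    return [p for p in group if count_sources(p) <= max_srcs]


prescreened = [
    Pair("AB", ("x1",), ("x2", "x1")),        # distinct sources: x1, x2
    Pair("GH", ("x1", "x2"), ("x3", "x4")),   # distinct sources: x1, x2, x3, x4
]
second_group = filter_by_sources(prescreened, max_srcs=3)
# second_group keeps only "AB" because "GH" needs four distinct source registers
```

The limit would normally be derived from how many source operands the instruction encoding of the target machine can carry.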
[0129] Preferably, steps S4 and S5 include the following steps:
[0130] F1. Preset a standard frequency for the second data-related instruction sequence group;
[0131] F2. Arrange the AB-type data-related instruction sequences whose frequency is above the preset standard frequency in descending order of frequency;
[0132] F3. Generate the customized instructions sequentially in order of frequency from high to low. The customized instructions are used to implement the functions of the AB-type data-related instruction sequences in the second data-related instruction sequence group.
[0133] F4. Implement custom instructions in the user's target machine and obtain the second assembly code.
[0134] In step F1, a standard frequency is preset for the second data-related instruction sequence group; its value is chosen according to the actual situation.
[0135] In step F2, the AB-type data-related instruction sequences whose frequency is above the preset standard frequency are sorted in descending order of frequency.
[0136] In step F3, the customized instructions are generated sequentially in order of frequency from highest to lowest. These customized instructions are used to implement the functions of the AB-type data-related instruction sequences in the second data-related instruction sequence group.
[0137] In step F4, the customized instructions are sent to the user's target machine. The user implements the customized instructions and obtains the second assembly code. For example, the second data-related instruction sequence group [AB, IJ, KL], after being sorted by frequency, is [AB, IJ, KL]. The customized instructions are [X, Y, Z]. Among them, the function implemented by instruction X is the same as the function jointly implemented by instructions A and B; the function implemented by instruction Y is the same as the function jointly implemented by instructions I and J; and the function implemented by instruction Z is the same as the function jointly implemented by instructions K and L.
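The thresholding and ordering in steps F1 to F3 can be sketched in Python as below. The frequency values, the threshold, the `select_candidates` and `assign_custom` names, and the extra sequence "MN" are invented for illustration; only the sequence labels AB, IJ, KL and the custom names X, Y, Z come from the example above.

```python
from typing import Dict, List


def select_candidates(freq: Dict[str, int], standard_freq: int) -> List[str]:
    """F1/F2: keep sequences above the preset frequency, most frequent first."""
    kept = [(seq, n) for seq, n in freq.items() if n > standard_freq]
    kept.sort(key=lambda item: item[1], reverse=True)   # stable: ties keep input order
    return [seq for seq, _ in kept]


def assign_custom(ordered: List[str], names: List[str]) -> Dict[str, str]:
    """F3: pair each selected sequence with one customized instruction name."""
    return dict(zip(ordered, names))


# Hypothetical frequencies for the second data-related instruction sequence group.
freq = {"IJ": 3, "AB": 5, "MN": 1, "KL": 2}
ordered = select_candidates(freq, standard_freq=1)
custom = assign_custom(ordered, ["X", "Y", "Z"])
# ordered is ["AB", "IJ", "KL"]; custom maps AB to X, IJ to Y, KL to Z
```

Generating candidates in frequency order means that, if the target machine can host only a few new instructions, the most frequent sequences are covered first.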
[0138] Preferably, step S6 includes the following steps:
[0139] G1. Replace the first assembly code with the second assembly code in descending order of frequency;
[0140] G2. Repeat step G1 until all data-related instruction sequences in the first assembly code have been replaced;
[0141] G3. Integrate and generate the third assembly code.
[0142] In step G1, the assembly code of the custom instructions, i.e., the second assembly code, replaces the first assembly code in sequence.
[0143] In step G2, the above steps are repeated until all data-related instruction sequences in the first assembly code are replaced by the second assembly code.
[0144] In step G3, all the data-related instruction sequences that have been replaced are used to generate the third assembly code.
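A toy sketch of steps G1 to G3 is shown below, with each instruction reduced to a single letter. The `replace_pairs` name and the rule format are assumptions. A real replacement would also rewrite operands and confirm, per occurrence, that the pair is data-related; this sketch matches on mnemonics only.

```python
from typing import List, Sequence, Tuple

Rule = Tuple[Tuple[str, str], str]   # ((first, second), customized instruction)


def replace_pairs(stream: Sequence[str], ordered_rules: Sequence[Rule]) -> List[str]:
    """Apply rules in the given order (highest frequency first) over the whole stream."""
    out = list(stream)
    for (first, second), custom in ordered_rules:          # G1: one rule at a time
        rewritten, i = [], 0
        while i < len(out):                                # G2: replace every occurrence
            if i + 1 < len(out) and out[i] == first and out[i + 1] == second:
                rewritten.append(custom)
                i += 2
            else:
                rewritten.append(out[i])
                i += 1
        out = rewritten
    return out                                             # G3: the third assembly code


first_assembly = ["A", "B", "Q", "I", "J", "A", "B", "K", "L"]
rules = [(("A", "B"), "X"), (("I", "J"), "Y"), (("K", "L"), "Z")]
third_assembly = replace_pairs(first_assembly, rules)
# third_assembly is ["X", "Q", "Y", "X", "Z"]
```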
[0145] Preferably,
[0146] The conversion tool for converting the binary executable code into first assembly code is a disassembler;
[0147] The conversion tool for converting the third assembly code into new binary executable code is an assembler.
[0148] In practical applications, the tool used to convert binary executable code into first assembly code is a disassembler; the tool used to convert third assembly code into new binary executable code is an assembler.
[0149] An instruction customization system, comprising:
[0150] The conversion module is used to convert between binary executable code and assembly code;
[0151] The analysis module is used to analyze the data correlation of the instruction stream in order to obtain the first related instruction sequence group;
[0152] The filtering module is used to filter the first relevant instruction sequence group in order to obtain the second relevant instruction sequence group;
[0153] The instruction customization module is used to generate customized instructions in the second related instruction sequence group according to preset rules;
[0154] The acquisition module is used to acquire the second assembly code;
[0155] The replacement module is used to replace the first assembly code with the second assembly code and obtain the third assembly code;
[0156] The output module is used to send new binary executable code to the user.
[0157] In practical application, the conversion module converts the user-input binary executable code into first assembly code and sends it to the analysis module; the analysis module analyzes the data correlation of the instruction stream of the first assembly code to obtain a first related instruction sequence group and sends it to the filtering module; the filtering module filters the first related instruction sequence group to obtain a second related instruction sequence group and sends it to the instruction customization module; the instruction customization module generates customized instructions from the second related instruction sequence group according to preset rules and sends them to the user's target machine; the customized instructions are implemented on the user's target machine, and the acquisition module obtains the second assembly code from it and sends it to the replacement module; the replacement module replaces the first assembly code with the second assembly code, and after all the contents of the first assembly code have been replaced, it obtains the third assembly code and sends it to the conversion module; the conversion module converts the third assembly code into new binary executable code and sends it to the output module; the output module outputs the new binary executable code to the user.
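The data flow between the modules can be sketched as a small Python class in which each module is a pluggable callable. The class name `InstructionCustomizer`, the field names, and the stand-in lambdas are all hypothetical; they only show the order in which the modules hand data to one another and do not implement the modules themselves.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InstructionCustomizer:
    disassemble: Callable[[bytes], Any]    # conversion module: binary -> first assembly
    analyze: Callable[[Any], Any]          # analysis module -> first related group
    filter_group: Callable[[Any], Any]     # filtering module -> second related group
    customize: Callable[[Any], Any]        # instruction customization module
    acquire: Callable[[Any], Any]          # acquisition module -> second assembly
    replace: Callable[[Any, Any], Any]     # replacement module -> third assembly
    assemble: Callable[[Any], bytes]       # conversion module: third assembly -> binary

    def run(self, binary: bytes) -> bytes:
        first = self.disassemble(binary)
        group1 = self.analyze(first)
        group2 = self.filter_group(group1)
        custom = self.customize(group2)
        second = self.acquire(custom)
        third = self.replace(first, second)
        return self.assemble(third)        # handed to the output module


# Stand-in modules over a toy "binary" of space-separated letters.
toy = InstructionCustomizer(
    disassemble=lambda b: b.decode().split(),
    analyze=lambda asm: [("A", "B")],
    filter_group=lambda g: g,
    customize=lambda g: {pair: "X" for pair in g},
    acquire=lambda c: c,
    replace=lambda first, second: " ".join(first).replace("A B", "X").split(),
    assemble=lambda asm: " ".join(asm).encode(),
)
new_binary = toy.run(b"A B C")   # b"X C" with these stand-ins
```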
[0158] In the embodiments provided in this application, it should be understood that the disclosed methods and apparatus can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods, such as: multiple modules or components can be combined, or integrated into another system, or some features can be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the various components shown or discussed can be through some interfaces, indirect coupling or communication connection of devices or modules, and can be electrical, mechanical, or other forms.
[0159] Furthermore, in the various embodiments of the present invention, each functional module can be fully integrated into a processor, or each module can be a separate device, or two or more modules can be integrated into a device; each functional module in the various embodiments of the present invention can be implemented in hardware or in the form of hardware plus software functional units.
[0160] Those skilled in the art will understand that all or part of the steps of the above method embodiments can be implemented by program instructions and related hardware. The program instructions can be stored in a computer-readable storage medium and, when executed, perform the steps of the above method embodiments. The storage medium includes various media that can store program code, such as removable storage devices, read-only memory (ROM), magnetic disks, or optical disks.
[0161] It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0162] The foregoing has provided a detailed description of a method and system for instruction customization provided by the present invention. The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims
1. An instruction customization method, characterized by comprising the following steps: S1. Obtain the binary executable code input by the user, and convert the binary executable code into first assembly code; S2. Analyze the data correlation of the instruction stream in the first assembly code to obtain a first data-related instruction sequence group; S3. Filter the first data-related instruction sequence group to obtain a second data-related instruction sequence group; S4. Generate customized instructions in the second data-related instruction sequence group according to a preset first rule; S5. Implement the customized instructions and obtain second assembly code; S6. According to a preset second rule, replace the first assembly code with the second assembly code in sequence, and obtain third assembly code after all data-related instructions have been replaced; S7. Convert the third assembly code into new binary executable code and send the new binary executable code to the user.
2. The instruction customization method of claim 1, wherein the step of analyzing the data correlation of the instruction stream in the first assembly code and obtaining the first data-related instruction sequence group includes the following steps: A1. Determine whether two adjacent instructions form a data-related instruction sequence according to the instruction flow order in the first assembly code; A2. Obtain and statistically analyze all data-related instruction sequences in the first assembly code; A3. Based on the statistical results, generate the first data-related instruction sequence group.
3. The instruction customization method of claim 2, wherein the step of determining whether two adjacent instructions form a data-related instruction sequence according to the instruction flow order in the first assembly code includes the following steps: B1. Select adjacent instructions A and B according to the instruction flow order in the first assembly code; B2. Determine whether the source registers of instruction B contain the destination register of instruction A; B3. Determine whether the destination register of instruction A, when it first appears after instruction A, is used as a destination register; B4. If all of the above judgment results are yes, then instruction A and instruction B are data-related; B5. Define instruction A and instruction B as an AB-type data-related instruction sequence.
4. The instruction customization method of claim 3, wherein the steps of obtaining and statistically analyzing all data-related instruction sequences in the first assembly code and generating the first data-related instruction sequence group based on the statistical results include the following: C1. Repeat steps B1 to B5 until several AB-type data-related instruction sequences are obtained; C2. Count each AB-type data-related instruction sequence and its frequency of occurrence; C3. Integrate the AB-type data-related instruction sequences to generate the first data-related instruction sequence group.
5. The instruction customization method of claim 4, wherein filtering the first data-related instruction sequence group to obtain the second data-related instruction sequence group includes the following steps: D1. Select instruction A from each of the AB-type data-related instruction sequences; D2. Determine whether instruction A is a jump instruction; D3. Determine whether instruction A is a branch instruction; D4. If either of the above judgment results is yes, delete the AB-type data-related instruction sequence in which instruction A is the jump instruction or the branch instruction, and form a preliminary screening data-related instruction sequence group from the remaining sequences.
6. The instruction customization method of claim 5, wherein, after the AB-type data-related instruction sequences in which instruction A is the jump instruction or the branch instruction are deleted and the preliminary screening data-related instruction sequence group is formed, the method further includes the following steps: E1. For each data-related instruction sequence in the preliminary screening data-related instruction sequence group, obtain and count the total number of source registers of instruction A and instruction B; E2. Determine whether that total number is greater than a preset number of source registers; E3. If so, delete that instruction sequence from the preliminary screening data-related instruction sequence group, and generate the second data-related instruction sequence group from the remaining instruction sequences.
7. The instruction customization method of claim 6, wherein generating customized instructions in the second data-related instruction sequence group according to the preset first rule, implementing the customized instructions, and obtaining the second assembly code includes the following steps: F1. Set a preset standard frequency for the second data-related instruction sequence group; F2. Arrange the AB-type data-related instruction sequences whose frequency is above the preset standard frequency in descending order of frequency; F3. Customize instructions sequentially in order of frequency from high to low, wherein each customized instruction implements the function of an AB-type data-related instruction sequence in the second data-related instruction sequence group; F4. Implement the customized instructions on the user's target machine and obtain the second assembly code.
8. The instruction customization method of claim 7, wherein the step of sequentially replacing the first assembly code with the second assembly code according to the preset second rule, and obtaining the third assembly code after all data-related instructions have been replaced, includes the following steps: G1. Replace the first assembly code with the second assembly code in descending order of frequency; G2. Repeat step G1 until all data-related instruction sequences in the first assembly code have been replaced; G3. Integrate the results to generate the third assembly code.
9. The instruction customization method of claim 1, wherein the conversion tool for converting the binary executable code into the first assembly code is a disassembler, and the conversion tool for converting the third assembly code into the new binary executable code is an assembler.
10. An instruction customization system, comprising: a conversion module, used to convert between binary executable code and assembly code; an analysis module, used to analyze the data correlation of the instruction stream to obtain a first related instruction sequence group; a filtering module, used to filter the first related instruction sequence group to obtain a second related instruction sequence group; an instruction customization module, used to generate customized instructions in the second related instruction sequence group according to preset rules; an acquisition module, used to acquire second assembly code; a replacement module, used to replace first assembly code with the second assembly code and obtain third assembly code; and an output module, used to send new binary executable code to the user.
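By way of non-limiting illustration, the filtering of claims 5 and 6 and the frequency-ordered customization and replacement of claims 7 and 8 can be sketched in Python as follows. All names here are assumptions made for the sketch: the `Instr` and `PairType` records, the opcode set `CONTROL_FLOW`, the thresholds `MAX_SOURCES` and `MIN_FREQUENCY`, the `cust_` opcode prefix, and the `sites` map from each pair type to the stream positions where the analysis step found it. The claims do not fix any of these values or data structures.

```python
from dataclasses import dataclass
from typing import Optional

CONTROL_FLOW = {"j", "jal", "beq", "bne"}  # hypothetical jump/branch opcodes
MAX_SOURCES = 4     # hypothetical preset number of source registers
MIN_FREQUENCY = 2   # hypothetical preset standard frequency


@dataclass(frozen=True)
class Instr:
    """Hypothetical simplified instruction: one destination, several sources."""
    opcode: str
    dest: Optional[str]
    srcs: tuple = ()


@dataclass(frozen=True)
class PairType:
    """Summary of one AB-type sequence, as an analysis step might report it."""
    op_a: str
    op_b: str
    n_sources: int  # source registers of A plus source registers of B
    frequency: int


def filter_pairs(pairs):
    """D1-D4, then E1-E3: drop control-flow pairs and pairs with too many sources."""
    kept = [p for p in pairs if p.op_a not in CONTROL_FLOW]
    return [p for p in kept if p.n_sources <= MAX_SOURCES]


def rank_pairs(pairs):
    """F1-F2: keep pairs above the standard frequency, most frequent first."""
    frequent = [p for p in pairs if p.frequency > MIN_FREQUENCY]
    return sorted(frequent, key=lambda p: p.frequency, reverse=True)


def replace_pairs(stream, ranked, sites):
    """G1-G3: fuse matched pairs, handling the most frequent pair type first."""
    fused_at = {}     # index of instruction A -> fused custom instruction
    consumed = set()  # indices already absorbed into a custom instruction
    for p in ranked:
        for i in sites.get((p.op_a, p.op_b), []):
            if i in consumed or i + 1 in consumed:
                continue  # overlaps a pair that was already replaced
            a, b = stream[i], stream[i + 1]
            # Inputs of the fused instruction: A's sources plus B's sources,
            # minus the intermediate register that A passed to B.
            srcs = a.srcs + tuple(r for r in b.srcs if r != a.dest)
            fused_at[i] = Instr(f"cust_{p.op_a}_{p.op_b}", b.dest, srcs)
            consumed.update((i, i + 1))
    out = []
    for i, ins in enumerate(stream):
        if i in fused_at:
            out.append(fused_at[i])
        elif i not in consumed:
            out.append(ins)
    return out
```

Processing pair types in descending order of frequency means that, where two candidate pairs overlap on the same instruction, the more frequent type claims it first. In this sketch the `consumed` set simply skips the less frequent candidate, so it is left unreplaced in the output.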