Code modification strategy determination method, apparatus, device, medium, and program product

By extracting and classifying code features using random forest and support vector machine models, and combining them with a retrieval-enhanced generative model for automated code adjustment, this approach solves the problem of low accuracy in traditional code optimization tools and achieves highly efficient code optimization.

CN119513709B · Active · Publication Date: 2026-06-23 · CHINA LIFE INSURANCE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHINA LIFE INSURANCE CO LTD
Filing Date
2024-11-27
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Traditional code optimization tools have low accuracy and require significant human intervention, resulting in low efficiency.

Method used

We use a random forest model to extract key code features, combine them with a support vector machine model for classification, and use a retrieval-enhanced generative model to adjust the code, automatically locating and modifying optimization points.

Benefits of technology

It improves the accuracy and efficiency of code optimization, reduces interference from noise and redundant data, and automates the code optimization process.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to a code modification strategy determination method, device, equipment, medium and program product. The code modification strategy determination method comprises the following steps: extracting key code features in to-be-optimized code based on a random forest model; classifying the key code features based on a support vector machine model to obtain a target optimization suggestion to which the to-be-optimized code belongs; and adjusting the to-be-optimized code according to the target optimization suggestion based on a retrieval enhancement generation model to obtain target code. Through the above method, automatic positioning and automatic modification of to-be-optimized points of to-be-optimized code are realized, and the accuracy and optimization efficiency of the to-be-optimized code are improved.

Description

Technical Field

[0001] This application relates to the field of code optimization technology, and in particular to a method, apparatus, device, medium, and program product for determining code modification strategies.

Background Technology

[0002] With the continuous advancement of science and technology, the complexity of business code is increasing day by day, making code debugging and optimization work more and more critical.

[0003] In traditional approaches, rule engines and static analysis are widely used to identify common problems when optimizing business code. Although these tools can detect basic business rule conflicts and non-standard coding practices, their optimization accuracy is low, and extensive human intervention is often needed to complete the work, which reduces the efficiency of code optimization.

Summary of the Invention

[0004] In view of the above technical problems, it is necessary to provide a method, apparatus, device, medium, and program product for determining code modification strategies that can improve the efficiency and accuracy of code optimization.

[0005] In a first aspect, this application provides a method for determining a code modification strategy, including:

[0006] Based on the random forest model, key code features are extracted from the code to be optimized.

[0007] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0008] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0009] In one embodiment, the random forest model is trained as follows: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label, the pre-built random forest model is trained.

[0010] In one embodiment, the support vector machine model is trained as follows: obtaining a second training sample and the optimization suggestion category to which the second training sample belongs; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; extracting features from the second training sample according to the random forest model to obtain key sample features; using the key sample features as model input and the optimization suggestion category to which the second training sample belongs as a label, the pre-built support vector machine model is trained.

[0011] In one embodiment, based on the retrieval enhancement generation model and according to the target optimization suggestions, the code to be optimized is adjusted to obtain the target code, including: based on the retrieval enhancement generation model and according to the target optimization suggestions, determining the code segment to be modified and the code adjustment strategy of the code segment to be modified corresponding to the code to be optimized; and adjusting the code segment to be modified according to the code adjustment strategy of the code segment to be modified to obtain the target code.

[0012] In one embodiment, determining the code segment to be modified and the code adjustment strategy of the code segment to be modified, based on the retrieval enhancement generation model and according to the target optimization suggestion, includes: performing feature encoding on the target optimization suggestion based on the encoding network in the retrieval enhancement generation model to obtain the suggestion vector feature of the target optimization suggestion; performing feature encoding on the code to be optimized to obtain the code vector feature of the code to be optimized; and, based on the enhanced retrieval network in the retrieval enhancement generation model, performing retrieval in the code vector feature according to the suggestion vector feature to obtain the code segment to be modified and the code adjustment strategy of the code segment to be modified in the code to be optimized.

[0013] In one embodiment, feature encoding is performed on the code to be optimized to obtain code vector features of the code to be optimized, including: dividing the code to be optimized into at least one code block according to the code structure of the code to be optimized; performing feature encoding on each code block to obtain code block features of the corresponding code block; wherein, the code vector features of the code to be optimized include code block features of at least one code block; correspondingly, based on the suggested vector features, searching in the code vector features to obtain the code segment to be modified in the code to be optimized includes: performing feature matching in the code block features of at least one code block according to the suggested vector features, and determining the code segment to be modified in the code to be optimized based on the matching result.

[0014] In a second aspect, this application also provides a code modification strategy determination device, comprising:

[0015] The first processing module is used to extract key code features from the code to be optimized based on the random forest model;

[0016] The second processing module is used to classify key code features based on the support vector machine model to obtain target optimization suggestions for the code to be optimized.

[0017] The third processing module is used to adjust the code to be optimized based on the retrieval enhancement generation model and the target optimization suggestions to obtain the target code.

[0018] In a third aspect, this application also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:

[0019] Based on the random forest model, key code features are extracted from the code to be optimized.

[0020] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0021] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0022] In a fourth aspect, this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the following steps:

[0023] Based on the random forest model, key code features are extracted from the code to be optimized.

[0024] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0025] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0026] In a fifth aspect, this application also provides a computer program product, including a computer program that, when executed by a processor, performs the following steps:

[0027] Based on the random forest model, key code features are extracted from the code to be optimized.

[0028] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0029] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0030] In the above code modification strategy determination method, apparatus, device, medium, and program product, key code features are extracted from the code to be optimized using a random forest model, which effectively reduces interference from noise and redundant data. A support vector machine model then classifies these key code features, so that target optimization suggestions for the code to be optimized are generated more accurately and effectively. The random forest model's ability to identify complex nonlinear relationships improves the accuracy of key code feature extraction, while the support vector machine model's strong classification and regression capabilities improve the efficiency of generating target optimization suggestions. Furthermore, based on a retrieval-enhanced generation model, the code to be optimized is adjusted according to the target optimization suggestions, achieving automatic location and modification of its optimization points and thus improving optimization efficiency.

Attached Figure Description

[0031] To more clearly illustrate the technical solutions in the embodiments of this application or related technologies, the drawings used in the description of the embodiments of this application or related technologies will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0032] Figure 1 is a flowchart illustrating a method for determining code modification strategies in one embodiment;

[0033] Figure 2 is a flowchart illustrating the steps for determining the code adjustment strategy in one embodiment;

[0034] Figure 3 is a flowchart illustrating the code modification strategy determination method in another embodiment;

[0035] Figure 4 is a structural block diagram of a code modification strategy determination device in one embodiment;

[0036] Figure 5 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0037] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0038] In one embodiment, as shown in Figure 1, a method for determining code modification strategies is provided. This embodiment is described with the method applied to a terminal. It is understood that the method can also be applied to a server, or to a system including both a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

[0039] S110. Based on the random forest model, extract key code features from the code to be optimized.

[0040] Random Forest Model (RFM) is an ensemble learning method that combines the predictions of multiple decision trees to improve the model's accuracy and stability, and has the ability to identify complex nonlinear relationships. Key code features can be understood as code features in the code to be optimized that are of the same type as pre-determined key sample features. Key sample features can be selected from all sample features based on the feature contribution of each sample feature in the splitting decision calculated by the Random Forest Model.

[0041] In an optional embodiment, the random forest model is trained as follows: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label, the pre-built random forest model is trained.

[0042] Optionally, based on a random forest model, the feature contribution of each sample feature in the splitting decision can be calculated; based on the feature contribution of each sample feature, a ranking of the importance of sample features can be generated, and the importance of sample features is used to measure the magnitude of the influence of sample features on the prediction results; based on the ranking of the importance of sample features, key sample features can be selected from all sample features. Correspondingly, based on the random forest model, extracting key code features from the code to be optimized includes: extracting key code features from the code to be optimized that are of the same type as the key sample features.
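The ranking-and-selection step described above can be sketched as follows. This is a minimal illustration only, assuming scikit-learn's `RandomForestClassifier` and its impurity-based `feature_importances_` as the measure of feature contribution in splitting decisions. The feature names, the synthetic data, and the `top_k` cutoff are hypothetical and do not come from the patent.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical column names; the patent lists feature families, not concrete columns.
FEATURE_NAMES = [
    "code_complexity", "execution_time", "module_dependencies",
    "memory_usage", "exception_frequency", "claim_frequency",
    "modification_frequency", "usage_frequency",
]


def select_key_features(X, y, names, top_k=4, seed=0):
    """Rank features by random-forest importance and keep the top_k names."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    # feature_importances_ aggregates each feature's contribution to the splits.
    ranking = sorted(zip(names, forest.feature_importances_),
                     key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranking[:top_k]], ranking


# Synthetic stand-in for the training samples and their suggestion categories.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)
key_features, ranking = select_key_features(X, y, FEATURE_NAMES)
```

At inference time, only the columns named in `key_features` would be extracted from the code to be optimized and passed on to the classifier.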

[0043] Optionally, feature extraction from the first training sample to obtain target sample features may include the following steps: cleaning and preprocessing the first training sample; extracting key target sample features from the cleaned and preprocessed first training sample. Target sample features may include at least one of the following: code complexity features, execution time features, module dependency features, memory usage features, and frequent anomaly features.

[0044] Optionally, the pre-built random forest model is obtained through the following steps: obtaining an initial random forest model; setting initial parameters for the initial random forest model, which may include the number of decision trees, maximum tree depth, and minimum number of leaf samples; determining the parameter combinations of the initial random forest model using a grid search method; testing the performance of the initial random forest model with each parameter combination using cross-validation, including the model's accuracy and generalization ability; and selecting the parameter combination corresponding to the optimal test performance to generate the random forest model. The parameter combinations include the number of decision trees, maximum tree depth, minimum number of leaf samples, and other relevant parameters.
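A grid search with cross-validation over the three parameters named above might look like the following sketch. It assumes scikit-learn's `GridSearchCV`; the candidate values, the 3-fold split, and the synthetic data are illustrative assumptions rather than values given in the patent.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the target sample features and their labels.
X, y = make_classification(n_samples=150, n_features=8, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)

# Hypothetical candidate values for the parameters the text names.
param_grid = {
    "n_estimators": [20, 50],      # number of decision trees
    "max_depth": [3, None],        # maximum tree depth (None means unlimited)
    "min_samples_leaf": [1, 5],    # minimum number of samples per leaf
}

# Every combination is scored by 3-fold cross-validated accuracy.
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
forest = search.best_estimator_   # refit on all data with the best combination
```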

[0045] Optionally, training the pre-built random forest model, with the target sample features as model input and the optimization suggestion category of the first training sample as the label, can include: constructing multiple decision trees, training each tree on a random sample of the target sample features, and stopping the growth of a decision tree once it reaches the preset maximum depth or minimum number of leaf nodes. Optionally, before the target sample features are used as model input, they can be normalized and encoded, and then fed to the model in the form of feature vectors.

[0046] For example, generic code can be obtained from production code repositories. Generic code may include at least one of the following: adopted recommended code, modified and optimized code content, the latest generic code from external websites, and high-quality code with a rating greater than a preset level.

[0047] For example, business characteristic features may include at least one of the following: policy type features, policy effective date and validity period features, claim frequency features, claim amount features, and customer segmentation features. Among these, policy type features refer to the type of policy, and different types of policies typically have different optimization requirements; policy effective date and validity period features affect the business logic and processing efficiency of the code; claim frequency and claim amount features are used to optimize the processing efficiency of the code for high-frequency or high-value claims; customer segmentation features are obtained by segmenting customers based on factors such as age and risk preference.

[0048] For example, system performance characteristics may include at least one of execution time characteristics, memory usage characteristics, and processing complexity characteristics. Execution time characteristics refer to the execution time of a code block or specific function, which can be used to identify performance bottlenecks; memory usage characteristics are key performance characteristics in large-scale data processing; and processing complexity characteristics refer to the complexity of the code logic, used to identify parts that may require optimization.

[0049] For example, code structure features may include at least one of module dependency features, exception handling frequency features, and code modification frequency features. Module dependency features are used to identify dependencies between critical code modules; exception handling frequency features, i.e., the frequency and distribution of exception handling, are used to identify unstable parts of the code; and code modification frequency features are used to identify frequently modified code modules for further optimization.

[0050] For example, user behavior characteristics may include at least one of the following: recommendation adoption rate characteristics, optimization feedback characteristics, and usage frequency characteristics. The recommendation adoption rate characteristic records the business team's adoption of code optimization suggestions; the optimization feedback characteristic records the effectiveness of user feedback on the adoption of optimization suggestions; and the usage frequency characteristic records the frequency of use of code modules—a higher usage frequency indicates a greater impact on the business and a higher priority for optimization.

[0051] S120. Based on the support vector machine model, the key code features are classified to obtain the target optimization suggestions for the code to be optimized.

[0052] Support Vector Machine (SVM) is a supervised learning algorithm primarily used for classification and regression analysis. Its basic model is defined as a linear classifier with the largest margin in a specific space. Its learning strategy is margin maximization, which can ultimately be transformed into solving a convex quadratic programming problem.
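For reference, the convex quadratic program mentioned above is usually written in the following textbook soft-margin form. The symbols are standard notation and do not appear in the patent: $x_i$ is a feature vector, $y_i \in \{-1, +1\}$ its label, $w$ and $b$ define the separating hyperplane, $\xi_i$ are slack variables, and $C$ is the regularization parameter.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,n
\end{aligned}
```

The margin between the two classes has width $2/\lVert w\rVert$, so maximizing the margin is equivalent to minimizing $\lVert w\rVert^{2}$. Setting all $\xi_i = 0$ recovers the hard-margin case for linearly separable data.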

[0053] In an optional implementation, the support vector machine model is trained as follows: obtaining a second training sample and the optimization suggestion category to which the second training sample belongs; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; extracting features from the second training sample according to the random forest model to obtain key sample features; using the key sample features as model input and the optimization suggestion category to which the second training sample belongs as a label, training the pre-built support vector machine model.

[0054] The first training sample and the second training sample can be the same or different training samples. Optionally, extracting key sample features from the second training sample using the random forest model may include the following steps: calculating the feature contribution of each sample feature in the second training sample to the splitting decision based on the random forest model; generating a ranking of the importance of sample features based on the feature contribution of each sample feature, where the importance of the sample features is used to measure the magnitude of the influence of the sample features on the prediction results; and selecting key sample features from all sample features based on the ranking of the importance of the sample features.

[0055] Optionally, the pre-built support vector machine (SVM) model can be obtained through the following steps: based on key sample features, select a suitable kernel function for the SVM model; the kernel function may include at least one of linear kernels, polynomial kernels, and Gaussian kernels; determine the relevant parameters of the selected kernel function, which may include the order of the polynomial kernel or the γ parameter in the Gaussian kernel; determine the combination of regularization parameter values for the SVM model using a grid search method; fine-tune the regularization parameter values of the SVM model using cross-validation to determine the optimal regularization parameter values; and obtain the pre-built SVM model based on the selected kernel function and the optimal regularization parameter values. Here, the γ parameter is an adjustable parameter used to control the width of the Gaussian kernel. In the above steps, cross-validation is used, and the performance of the SVM model is tested through combinations of different regularization parameter values, thereby finding the optimal balance between the complexity and generalization ability of the SVM model and avoiding overfitting or underfitting.
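The kernel and regularization search described above could be sketched as follows, assuming scikit-learn's `SVC` inside a pipeline with feature scaling. The candidate values for `C`, `degree`, and `gamma` are illustrative assumptions; the patent does not specify any. In scikit-learn's RBF (Gaussian) kernel, a larger `gamma` corresponds to a narrower kernel.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the key sample features selected by the random forest.
X, y = make_classification(n_samples=150, n_features=4, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)

# One sub-grid per kernel so kernel-specific parameters are only tried where they apply.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["poly"], "svc__degree": [2, 3], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__gamma": [0.01, 0.1, 1], "svc__C": [0.1, 1, 10]},
]

# Scaling is placed in the pipeline so it is refit inside each cross-validation fold.
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=3)
search.fit(X, y)

svm_model = search.best_estimator_
```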

[0056] Optionally, training the pre-built support vector machine (SVM) model may include: learning support vectors and decision boundaries based on the optimization suggestions associated with the second training samples and the second training samples, and constructing the optimal classification or regression boundary based on maximizing the distance from the support vectors to the decision boundary. Optionally, after training the pre-built SVM model, the process may also include testing the performance metrics of the SVM model on a validation dataset, and evaluating the classification or regression performance of the SVM model based on the performance metrics. These performance metrics may include the precision and recall of the SVM model. Optionally, after evaluating the classification or regression performance of the SVM model, the process may also include: saving the final SVM model and embedding it into an intelligent code generation tool for generating code optimization suggestions or business-related (e.g., insurance, finance) analyses.
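As a small worked illustration of the precision and recall metrics mentioned above, the following sketch computes both per suggestion category on a validation set. The category labels and predictions are made up for the example.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Per-class precision and recall for a multi-class classifier."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1    # predicted this class, but it was wrong
            fn[truth] += 1   # missed the true class
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        report[label] = {
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }
    return report


# Hypothetical optimization suggestion categories.
y_true = ["refactor", "cache", "refactor", "index", "cache", "index"]
y_pred = ["refactor", "cache", "cache", "index", "cache", "refactor"]
report = precision_recall(y_true, y_pred)
```

Here the `cache` category has perfect recall but only two of its three predictions are correct, which is the kind of imbalance these metrics are meant to expose.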

[0057] S130. Based on the retrieval enhancement generation model, adjust the code to be optimized according to the target optimization suggestions to obtain the target code.

[0058] Retrieval-Augmented Generation (RAG) is a technique that combines retrieval with generation.

[0059] In an optional embodiment, adjusting the code to be optimized based on the retrieval-enhanced generation model and the target optimization suggestions to obtain the target code includes: determining the code segment to be modified and the code adjustment strategy for the code segment to be modified based on the retrieval-enhanced generation model and the target optimization suggestions; and adjusting the code segment to be modified according to the code adjustment strategy to obtain the target code. Through the above steps, automatic location and modification of the code segment to be modified are achieved, improving the optimization efficiency of the code to be optimized. For example, the code segment corresponding to a high-complexity function in the code to be optimized may be located and then optimized for performance based on the target optimization suggestions. As another example, a code segment with potential security issues in the code to be optimized may be located and then adjusted based on the target optimization suggestions.

[0060] In an optional implementation, after adjusting the code to be optimized to obtain the target code, the method further includes: generating an automatic modification log to facilitate subsequent review and tracking. The modification log may include at least one of the following: the specific content of the adjustments to the code to be optimized, the reason for the modifications, and the location of the modifications.
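One possible shape for such a modification log entry is sketched below. The record layout, field names, and JSON serialization are assumptions made for illustration; the patent only states which kinds of information the log may contain.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ModificationRecord:
    location: str  # where the change was made (hypothetical "file:line-range" form)
    reason: str    # why it was made, e.g. the target optimization suggestion
    before: str    # original code segment
    after: str     # adjusted code segment


def to_log_line(record):
    """Serialize one record as a single JSON line for later review and tracking."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


record = ModificationRecord(
    location="claims/processor.py:42-57",
    reason="nested loop over claims; suggestion category: reduce complexity",
    before="for a in claims: ...",
    after="by_policy = group_by_policy(claims)",
)
line = to_log_line(record)
```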

[0061] In an optional implementation, after adjusting the code to be optimized to obtain the target code, the method further includes: verifying and evaluating the target code. For example, verifying and evaluating the target code may include the following steps: marking automatically modified code areas in the target code on the IDE (Integrated Development Environment); running the code to be optimized and the target code separately in the IDE and collecting performance data during actual runtime; comparing the performance data of the code to be optimized and the target code to evaluate the optimization effect. The marking may include comments and tags, etc., to determine the accuracy of the modifications and ensure that all modifications are clearly recorded. Performance data may include at least one of execution time, memory consumption, and error logs, used to verify the optimization effect of the code modifications. Evaluating the optimization effect includes, for example, checking whether execution time has decreased, memory consumption has decreased, or the number of exceptions in the error log has decreased. Optionally, when the optimization effect does not meet expectations, feedback information is generated, and this feedback information is used as the basis for optimizing the retrieval enhancement generation model, and the retrieval enhancement generation model is continuously updated and iterated to further improve its accuracy.
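The before-and-after comparison described above can be illustrated with a small harness. This is a sketch under stated assumptions: it measures a single call with the standard-library `time.perf_counter` and `tracemalloc`, and the two functions are stand-ins for the code to be optimized and the target code. A real evaluation would repeat runs and also inspect error logs.

```python
import time
import tracemalloc


def measure(func, *args):
    """Return (result, elapsed_seconds, peak_bytes) for one call of func."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


def compare(before, after):
    """Map each metric to True if it decreased (lower is better for all metrics)."""
    return {name: after[name] < before[name] for name in before}


def original(n):   # stand-in for the code to be optimized
    total = 0
    for i in range(n):
        total += i
    return total


def optimized(n):  # stand-in for the target code
    return n * (n - 1) // 2


res_a, t_a, mem_a = measure(original, 100_000)
res_b, t_b, mem_b = measure(optimized, 100_000)
verdict = compare({"execution_time": t_a, "peak_memory": mem_a},
                  {"execution_time": t_b, "peak_memory": mem_b})
```

Checking that `res_a == res_b` before comparing metrics guards against an "optimization" that changes behavior.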

[0062] In an optional implementation, the code modification strategy determination method further includes: pushing target code, recording the push and adoption status of the target code, and generating a user behavior log. Optionally, a comment file is used to record user code adoption behavior, such as comments, changes, and timestamp information for specific lines of the target code. For example, this log can be saved in binary format.

[0063] In an optional implementation, the code modification strategy determination method further includes: continuously iteratively optimizing the random forest model, support vector machine model, and retrieval enhancement generation model based on the optimization effect evaluation results of the target code and the push adoption status of the target code, so as to improve the quality of the target optimization suggestions.

[0064] In this embodiment, a random forest model is used to extract key code features from the code to be optimized, effectively reducing the interference of noise and redundant data. Then, a support vector machine model is used to classify these key code features, enabling more accurate and effective generation of target optimization suggestions for the code to be optimized. The random forest model has the ability to identify complex nonlinear relationships, which helps improve the accuracy of key code feature extraction, while the support vector machine model has strong classification and regression capabilities, improving the efficiency of target optimization suggestion generation. Furthermore, based on a retrieval-enhanced generation model, the code to be optimized is adjusted according to the target optimization suggestions, achieving automatic location and modification of the optimization points of the code to be optimized, thus improving the optimization efficiency.

[0065] Based on the technical solutions of the above embodiments, this application also provides an optional embodiment in which the steps for determining the code adjustment strategy are refined.

[0066] Referring to Figure 2, the steps for determining the code adjustment strategy include:

[0067] S210. Based on the encoding network in the retrieval enhancement generative model, feature encoding is performed on the target optimization suggestions to obtain the suggestion vector features of the target optimization suggestions.

[0068] S220. Perform feature encoding on the code to be optimized to obtain the code vector features of the code to be optimized.

[0069] In an optional embodiment, feature encoding is performed on the code to be optimized to obtain code vector features of the code to be optimized, including: dividing the code to be optimized into at least one code block according to the code structure of the code to be optimized; and performing feature encoding on each code block to obtain the code block features of the corresponding code block. The code vector features of the code to be optimized include the code block features of at least one code block.

[0070] For example, the code structure of the code to be optimized can be the code framework of the code to be optimized, so that the target optimization suggestions can be directly applied to the corresponding code framework, automatically modifying some parts of the code framework, which helps to improve optimization efficiency.
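Dividing code into blocks according to its structure can be sketched as follows for the case where the code to be optimized is Python source. The choice of top-level functions and classes as block boundaries is an assumption for illustration; the patent does not fix a granularity or a language. Each returned block would then be passed to the encoding network.

```python
import ast


def split_into_blocks(source):
    """Return (name, code) pairs for each top-level function or class."""
    tree = ast.parse(source)
    blocks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the exact text of the node.
            blocks.append((node.name, ast.get_source_segment(source, node)))
    return blocks


# Hypothetical code to be optimized.
SOURCE = '''
def premium(age):
    return 100 + age * 2

class Claim:
    def total(self, items):
        return sum(items)
'''
blocks = split_into_blocks(SOURCE)
```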

[0071] S230. Based on the enhanced retrieval network in the retrieval enhancement generative model, a retrieval is performed in the code vector features according to the suggestion vector features to obtain the code segment to be modified and the code adjustment strategy of the code segment to be modified in the code to be optimized.

[0072] In an optional embodiment, the process of retrieving code segments to be modified in the code to be optimized based on the suggestion vector features includes: performing feature matching in the code block features of at least one code block based on the suggestion vector features, and determining the code segments to be modified in the code to be optimized based on the matching results.

[0073] Optionally, performing feature matching in the code block features of at least one code block based on the suggestion vector features includes: calculating the vector similarity between the suggestion vector features and each code block feature; determining that the suggestion vector features match a code block feature if the vector similarity is greater than a preset similarity threshold; and identifying the code segment to be modified in the code to be optimized by locating the code block whose features match the suggestion vector features.
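The threshold-based matching described above can be sketched as follows. This application does not fix a particular similarity measure, so cosine similarity is used here as an assumption; the function names and the default threshold of 0.8 are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_blocks(suggestion_vec, block_vecs, threshold=0.8):
    """Return (block index, similarity) pairs whose similarity exceeds the
    preset threshold, best match first."""
    scored = [(i, cosine_similarity(suggestion_vec, v))
              for i, v in enumerate(block_vecs)]
    hits = [(i, s) for i, s in scored if s > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

The indices returned by `match_blocks` identify the code blocks treated as code segments to be modified; an empty result means no block matched the suggestion at the chosen threshold.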

[0074] In this embodiment, by introducing an encoding network from the retrieval enhancement generative model and performing feature encoding on the target optimization suggestions and the code to be optimized, a matching basis is provided for locating the code segment to be modified in the code to be optimized. Based on the enhanced retrieval network in the retrieval enhancement generative model, a search is performed on the code vector features according to the suggestion vector features, thereby obtaining the code segment to be modified in the code to be optimized and the code adjustment strategy for the code segment to be modified, which improves the efficiency of locating the code segment to be modified and facilitates the optimization and modification of the code segment to be modified.

[0075] Based on the technical solutions of the above embodiments, this application also provides an optional embodiment, in which another specific implementation of the code modification strategy determination method is provided.

[0076] As shown in Figure 3, the code modification strategy determination method includes:

[0077] S301. Obtain the first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code;

[0078] S302. Extract features from the first training sample to obtain the target sample features.

[0079] S303. Use the target sample features as model input and the optimization suggestion category to which the first training sample belongs as label to train the pre-built random forest model.

[0080] S304. Obtain the second training sample and the optimization suggestion category to which the second training sample belongs; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features and user behavior features; the second sample code includes business code and general code.

[0081] S305. Extract features from the second training sample using the random forest model to obtain key sample features.

[0082] S306. Use the key sample features as model input and the optimization suggestion category to which the second training sample belongs as label to train the pre-built support vector machine model.

[0083] S307. Based on the random forest model, extract key code features from the code to be optimized.

[0084] S308. Based on the support vector machine model, the key code features are classified to obtain the target optimization suggestions for the code to be optimized.

[0085] S309. Based on the retrieval enhancement generation model, and according to the target optimization suggestions, determine the code segment to be modified and the code adjustment strategy for the code segment to be modified in the code to be optimized.

[0086] S310. Adjust the code segment to be modified according to the code adjustment strategy to obtain the target code.

[0087] The detailed technical content of steps S301-S310 has been described in the above embodiments and will not be repeated here.
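Steps S301 to S308 can be sketched with scikit-learn as one possible realization. The sketch assumes that the training samples have already been converted to numeric feature matrices, that key features are chosen as the `top_k` features with the highest impurity-based importance, and that an RBF kernel is used; the function names, `top_k`, and the integer category labels are hypothetical and not specified by this application.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def train_models(X1, y1, X2, y2, top_k=3, seed=0):
    """Train the random forest (S301-S303), then the SVM on key features (S304-S306)."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X1, y1)
    # Feature contribution in the splitting decisions, here taken to be
    # scikit-learn's impurity-based importance (an assumption).
    importances = forest.feature_importances_
    key_idx = np.sort(np.argsort(importances)[::-1][:top_k])
    svm = SVC(kernel="rbf", C=1.0)
    svm.fit(np.asarray(X2)[:, key_idx], y2)
    return forest, key_idx, svm


def suggest(svm, key_idx, features):
    """Keep only the key code features (S307) and classify them (S308)."""
    key_features = np.asarray(features)[key_idx].reshape(1, -1)
    return svm.predict(key_features)[0]
```

The returned category would then be mapped to a target optimization suggestion and passed to the retrieval step (S309).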

[0088] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0089] Based on the same inventive concept, this application also provides a code modification strategy determination apparatus for implementing the code modification strategy determination method described above. The solution provided by this apparatus is similar to the implementation scheme described in the above method; therefore, the specific limitations in one or more embodiments of the code modification strategy determination apparatus provided below can be found in the limitations of the code modification strategy determination method described above, and will not be repeated here.

[0090] In one exemplary embodiment, as shown in Figure 4, a code modification strategy determination device is provided, comprising: a first processing module 410, a second processing module 420, and a third processing module 430, wherein:

[0091] The first processing module 410 is used to extract key code features from the code to be optimized based on the random forest model;

[0092] The second processing module 420 is used to classify key code features based on the support vector machine model to obtain target optimization suggestions for the code to be optimized.

[0093] The third processing module 430 is used to adjust the code to be optimized based on the retrieval enhancement generation model and the target optimization suggestions to obtain the target code.
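The cooperation of the three processing modules can be sketched as a simple composition. The class name, field names, and callable signatures below are hypothetical; the sketch shows only the data flow between modules 410, 420, and 430, not their internals.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class StrategyDeterminationDevice:
    # Module 410: code to be optimized -> key code features
    extract_key_features: Callable[[str], Sequence[float]]
    # Module 420: key code features -> target optimization suggestion
    classify: Callable[[Sequence[float]], str]
    # Module 430: (code to be optimized, suggestion) -> target code
    adjust: Callable[[str, str], str]

    def run(self, code: str) -> str:
        features = self.extract_key_features(code)
        suggestion = self.classify(features)
        return self.adjust(code, suggestion)
```

Because each module is injected as a callable, the same flow works whether the modules are implemented in software, hardware-backed services, or a combination of both.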

[0094] In one embodiment, the code modification strategy determination device further includes: a first acquisition module, configured to acquire a first training sample and the optimization suggestion category to which the first training sample belongs; wherein the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; a first extraction module, configured to extract features from the first training sample to obtain target sample features; and a first training module, configured to use the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label to train a pre-built random forest model.

[0095] In one embodiment, the code modification strategy determination device further includes: a second acquisition module, configured to acquire a second training sample and the optimization suggestion category to which the second training sample belongs; wherein the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; a second extraction module, configured to extract features from the second training sample according to a random forest model to obtain key sample features; and a second training module, configured to use the key sample features as model input and the optimization suggestion category to which the second training sample belongs as a label to train a pre-built support vector machine model.

[0096] In one embodiment, the third processing module 430 includes: a first determining unit, configured to determine, based on the retrieval enhancement generation model and according to the target optimization suggestion, the code segment to be modified corresponding to the code to be optimized and the code adjustment strategy of the code segment to be modified; and a first adjusting unit, configured to adjust the code segment to be modified according to the code adjustment strategy of the code segment to be modified, to obtain the target code.
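The work of the first adjusting unit can be illustrated with a minimal sketch. It assumes that the code adjustment strategy has already been resolved to a line range and a replacement text; the `Adjustment` structure and `apply_adjustment` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Adjustment:
    start: int        # 1-based first line of the code segment to be modified
    end: int          # 1-based last line of the segment, inclusive
    replacement: str  # new text for the segment


def apply_adjustment(code: str, adj: Adjustment) -> str:
    """Replace lines start..end of the code with the replacement text."""
    lines = code.splitlines()
    if not (1 <= adj.start <= adj.end <= len(lines)):
        raise ValueError("segment out of range")
    new_lines = (lines[:adj.start - 1]
                 + adj.replacement.splitlines()
                 + lines[adj.end:])
    return "\n".join(new_lines)
```

Only the located segment changes; all lines outside the range are carried over unmodified into the target code. This sketch does not preserve a trailing newline.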

[0097] In one embodiment, the first determining unit includes: a first encoding subunit, configured to perform feature encoding on the target optimization suggestion based on the encoding network in the retrieval-enhanced generative model to obtain the suggestion vector features of the target optimization suggestion; a second encoding subunit, configured to perform feature encoding on the code to be optimized to obtain the code vector features of the code to be optimized; and a first retrieval subunit, configured to perform retrieval in the code vector features according to the suggestion vector features, based on the enhanced retrieval network in the retrieval-enhanced generative model, to obtain the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified.

[0098] In one embodiment, the second encoding subunit is specifically used to divide the code to be optimized into at least one code block according to the code structure of the code to be optimized; perform feature encoding on each code block to obtain the code block features of the corresponding code block; wherein, the code vector features of the code to be optimized include the code block features of at least one code block; the first retrieval subunit is specifically used to perform feature matching in the code block features of at least one code block according to the suggestion vector features, and determine the code segment to be modified in the code to be optimized according to the matching result.

[0099] Each module in the code modification strategy determination device described above can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in, or be independent of, the processor in a computer device in hardware form, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.

[0100] In one exemplary embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 5. The computer device includes a processor, memory, input/output interfaces, a communication interface, a display unit, and an input device. The processor, memory, and input/output interfaces are connected via a system bus, and the communication interface, display unit, and input device are also connected to the system bus via the input/output interfaces. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The input/output interfaces are used for exchanging information between the processor and external devices. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, Near Field Communication (NFC), or other technologies. When the computer program is executed by the processor, it implements a code modification strategy determination method. The display unit is used to present a visible image and can be a display screen, a projection device, or a virtual reality imaging device. The display screen can be an LCD screen or an e-ink screen. The input device of the computer device can be a touch layer covering the display screen; buttons, a trackball, or a touchpad set on the casing of the computer device; or an external keyboard, touchpad, or mouse.

[0101] Those skilled in the art will understand that the structure shown in Figure 5 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0102] In one exemplary embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:

[0103] Based on the random forest model, key code features are extracted from the code to be optimized.

[0104] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0105] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0106] In one embodiment, when the processor executes the computer program, it further performs the following steps: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label, and training the pre-built random forest model.

[0107] In one embodiment, when the processor executes the computer program, it further performs the following steps: obtaining a second training sample and the optimization suggestion category to which the second training sample belongs; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; extracting features from the second training sample according to the random forest model to obtain key sample features; using the key sample features as model input and the optimization suggestion category to which the second training sample belongs as a label to train the pre-built support vector machine model.

[0108] In one embodiment, when the processor executes the computer program, it further performs the following steps: based on the retrieval-enhanced generative model, and according to the target optimization suggestion, determines the code segment to be modified and the code adjustment strategy of the code segment to be modified corresponding to the code to be optimized; and adjusts the code segment to be modified according to the code adjustment strategy of the code segment to be modified to obtain the target code.

[0109] In one embodiment, based on the encoding network in the retrieval-enhanced generative model, feature encoding is performed on the target optimization suggestion to obtain the suggestion vector features of the target optimization suggestion, and feature encoding is performed on the code to be optimized to obtain the code vector features of the code to be optimized; based on the enhanced retrieval network in the retrieval-enhanced generative model, retrieval is performed in the code vector features according to the suggestion vector features to obtain the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified.

[0110] In one embodiment, the code to be optimized is divided into at least one code block according to the code structure of the code to be optimized; feature encoding is performed on each code block to obtain the code block features of the corresponding code block, wherein the code vector features of the code to be optimized include the code block features of the at least one code block; feature matching is performed in the code block features of the at least one code block according to the suggestion vector features, and the code segment to be modified in the code to be optimized is determined according to the matching result.

[0111] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor:

[0112] Based on the random forest model, key code features are extracted from the code to be optimized.

[0113] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0114] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0115] In one embodiment, when the computer program is executed by the processor, it further performs the following steps: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label to train a pre-built random forest model.

[0116] In one embodiment, the second training sample and the optimization suggestion category to which the second training sample belongs are obtained; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; features are extracted from the second training sample according to the random forest model to obtain key sample features; the key sample features are used as model input, and the optimization suggestion category to which the second training sample belongs is used as a label to train the pre-built support vector machine model.

[0117] In one embodiment, based on the retrieval-enhanced generation model, the code segment to be modified and the code adjustment strategy for the code segment to be modified are determined according to the target optimization suggestions; the code segment to be modified is adjusted according to the code adjustment strategy to obtain the target code.

[0118] In one embodiment, based on the encoding network in the retrieval-enhanced generative model, feature encoding is performed on the target optimization suggestion to obtain the suggestion vector features of the target optimization suggestion, and feature encoding is performed on the code to be optimized to obtain the code vector features of the code to be optimized; based on the enhanced retrieval network in the retrieval-enhanced generative model, retrieval is performed in the code vector features according to the suggestion vector features to obtain the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified.

[0119] In one embodiment, the code to be optimized is divided into at least one code block according to the code structure of the code to be optimized; feature encoding is performed on each code block to obtain the code block features of the corresponding code block, wherein the code vector features of the code to be optimized include the code block features of the at least one code block; feature matching is performed in the code block features of the at least one code block according to the suggestion vector features, and the code segment to be modified in the code to be optimized is determined according to the matching result.

[0120] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, performs the following steps:

[0121] Based on the random forest model, key code features are extracted from the code to be optimized.

[0122] Based on the support vector machine model, key code features are classified to obtain target optimization suggestions for the code to be optimized;

[0123] Based on the retrieval enhancement generation model, the code to be optimized is adjusted according to the target optimization suggestions to obtain the target code.

[0124] In one embodiment, when the computer program is executed by the processor, it further performs the following steps: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs; wherein, the first training sample includes at least one of the following features corresponding to the first sample code: business characteristic features, system performance features, code structure features, and user behavior features; the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as a label to train a pre-built random forest model.

[0125] In one embodiment, the second training sample and the optimization suggestion category to which the second training sample belongs are obtained; wherein, the second training sample includes at least one of the following features corresponding to the second sample code: business characteristic features, system performance features, code structure features, and user behavior features; the second sample code includes business code and general code; features are extracted from the second training sample according to the random forest model to obtain key sample features; the key sample features are used as model input, and the optimization suggestion category to which the second training sample belongs is used as a label to train the pre-built support vector machine model.

[0126] In one embodiment, based on the retrieval-enhanced generation model, the code segment to be modified and the code adjustment strategy for the code segment to be modified are determined according to the target optimization suggestions; the code segment to be modified is adjusted according to the code adjustment strategy to obtain the target code.

[0127] In one embodiment, based on the encoding network in the retrieval-enhanced generative model, feature encoding is performed on the target optimization suggestion to obtain the suggestion vector features of the target optimization suggestion, and feature encoding is performed on the code to be optimized to obtain the code vector features of the code to be optimized; based on the enhanced retrieval network in the retrieval-enhanced generative model, retrieval is performed in the code vector features according to the suggestion vector features to obtain the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified.

[0128] In one embodiment, the code to be optimized is divided into at least one code block according to the code structure of the code to be optimized; feature encoding is performed on each code block to obtain the code block features of the corresponding code block, wherein the code vector features of the code to be optimized include the code block features of the at least one code block; feature matching is performed in the code block features of the at least one code block according to the suggestion vector features, and the code segment to be modified in the code to be optimized is determined according to the matching result.

[0129] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile memory and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, artificial intelligence (AI) processors, etc., and are not limited to these.

[0130] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this application.

[0131] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A method for determining a code modification strategy, characterized in that it comprises: extracting key code features from the code to be optimized based on a random forest model, wherein the key code features are code features in the code to be optimized that are of the same type as pre-determined key sample features, the key sample features are selected from the sample features according to the feature contribution of each sample feature in the splitting decisions as calculated based on the random forest model, and the sample features include at least one of code complexity features, execution time features, module dependency features, memory usage features, and exception frequency features; classifying the key code features based on a support vector machine model to obtain the target optimization suggestion for the code to be optimized; determining, based on a retrieval-enhanced generative model and according to the target optimization suggestion, the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified; and adjusting the code segment to be modified according to the code adjustment strategy of the code segment to be modified to obtain the target code.

2. The method according to claim 1, characterized in that the random forest model is trained as follows: obtaining a first training sample and the optimization suggestion category to which the first training sample belongs, wherein the first training sample includes at least one of the following features corresponding to first sample code: business characteristic features, system performance features, code structure features, and user behavior features, and the first sample code includes business code and general code; performing feature extraction on the first training sample to obtain target sample features; and training a pre-built random forest model using the target sample features as model input and the optimization suggestion category to which the first training sample belongs as the label.

3. The method according to claim 1, characterized in that the support vector machine model is trained as follows: obtaining a second training sample and the optimization suggestion category to which the second training sample belongs, wherein the second training sample includes at least one of the following features corresponding to second sample code: business characteristic features, system performance features, code structure features, and user behavior features, and the second sample code includes business code and general code; performing feature extraction on the second training sample based on the random forest model to obtain key sample features; and training a pre-built support vector machine model using the key sample features as model input and the optimization suggestion category to which the second training sample belongs as the label.

4. The method according to claim 3, characterized in that the pre-built support vector machine model is determined according to the following steps: selecting the kernel function of the support vector machine model based on the key sample features; determining the relevant parameters of the kernel function, and determining the combination of regularization parameter values of the support vector machine model based on a grid search method; tuning the regularization parameter values of the support vector machine model based on a cross-validation method to determine the optimal regularization parameter value; and determining the pre-built support vector machine model based on the kernel function and the optimal regularization parameter value.
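By way of illustration only, and not as part of the claims, the grid search and cross-validation steps above can be sketched with scikit-learn. The RBF kernel, the candidate values of the regularization parameter `C`, the fixed `gamma="scale"` setting, and the 5-fold split are all assumptions chosen for the example; the function name `build_svm` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def build_svm(X_key, y, c_values=(0.1, 1.0, 10.0), folds=5):
    """Pick the regularization parameter C by grid search with k-fold
    cross-validation, for a fixed kernel (assumed here to be RBF)."""
    X_key = np.asarray(X_key)
    base = SVC(kernel="rbf", gamma="scale")  # kernel and its parameter fixed up front
    search = GridSearchCV(base, {"C": list(c_values)}, cv=folds, scoring="accuracy")
    search.fit(X_key, y)
    # best_estimator_ is refit on all of the data with the best C.
    return search.best_estimator_, search.best_params_
```

`GridSearchCV` evaluates every candidate `C` on each fold and keeps the value with the highest mean validation accuracy, which corresponds to the "optimal regularization parameter value" in the text.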

5. The method according to any one of claims 1-4, characterized in that determining, based on the retrieval-enhanced generative model and according to the target optimization suggestion, the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified comprises: performing feature encoding on the target optimization suggestion based on the encoding network in the retrieval-enhanced generative model to obtain the suggestion vector features of the target optimization suggestion; performing feature encoding on the code to be optimized to obtain the code vector features of the code to be optimized; and performing retrieval in the code vector features according to the suggestion vector features, based on the enhanced retrieval network in the retrieval-enhanced generative model, to obtain the code segment to be modified in the code to be optimized and the code adjustment strategy of the code segment to be modified.

6. The method according to claim 5, characterized in that performing feature encoding on the code to be optimized to obtain the code vector features of the code to be optimized includes: dividing the code to be optimized into at least one code block based on the code structure of the code to be optimized; and performing feature encoding on each code block to obtain the code block features of the corresponding code block, wherein the code vector features of the code to be optimized include the code block features of the at least one code block; accordingly, searching the code vector features according to the suggestion vector features to obtain the code segment to be modified in the code to be optimized includes: performing feature matching on the code block features of the at least one code block based on the suggestion vector features, and determining the code segment to be modified in the code to be optimized based on the matching results.
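
The block-level encoding and matching in claim 6 can be sketched with Python's standard `ast` module, assuming the code to be optimized is Python and that one block corresponds to one top-level function or class. The two structural features and the weight-style suggestion vector are illustrative assumptions, not the patent's encoding.

```python
import ast

SOURCE = '''
def load(path):
    with open(path) as fh:
        return fh.read().splitlines()

def match(orders, payments):
    hits = []
    for order in orders:
        for payment in payments:
            if order == payment:
                hits.append(order)
    return hits
'''

def split_into_blocks(source):
    """Split source into one block per top-level function or class."""
    tree = ast.parse(source)
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return {node.name: node for node in tree.body if isinstance(node, kinds)}

def loop_depth(node, depth=0):
    """Maximum nesting depth of for/while loops at or below node."""
    here = depth + 1 if isinstance(node, (ast.For, ast.While)) else depth
    return max([here] + [loop_depth(c, here) for c in ast.iter_child_nodes(node)])

def encode_block(node):
    """Toy code block features: [loop nesting depth, number of calls]."""
    calls = sum(isinstance(n, ast.Call) for n in ast.walk(node))
    return [loop_depth(node), calls]

def match_blocks(suggestion_vec, block_features):
    """Score each block by dot product with the suggestion vector."""
    scores = {
        name: sum(s * f for s, f in zip(suggestion_vec, feats))
        for name, feats in block_features.items()
    }
    return max(scores, key=scores.get), scores

blocks = split_into_blocks(SOURCE)
features = {name: encode_block(node) for name, node in blocks.items()}

# Hypothetical suggestion vector that weights loop nesting only.
suggestion_vec = [1.0, 0.0]
target, scores = match_blocks(suggestion_vec, features)
```

With these inputs the `match` function scores highest because it contains the doubly nested loop, so it would be selected as the code segment to be modified.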

7. A code modification strategy determination device, characterized in that the device includes: a first processing module, used to extract key code features from the code to be optimized based on a random forest model, wherein the key code features are code features in the code to be optimized that are of the same type as pre-determined key sample features, the key sample features are selected from the sample features by filtering based on the feature contribution of each sample feature in the splitting decisions calculated by the random forest model, and the sample features include at least one of code complexity features, execution time features, module dependency features, memory usage features, and abnormally frequent features; a second processing module, used to classify the key code features based on the support vector machine model to obtain the target optimization suggestion for the code to be optimized; and a third processing module, used to determine, based on the retrieval-enhanced generative model and the target optimization suggestion, the code segment to be modified in the code to be optimized and the code adjustment strategy for the code segment to be modified, and to adjust the code segment to be modified according to the code adjustment strategy to obtain the target code.

8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 6.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.

10. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.