A front-end code smell refactoring method, device, equipment and medium
By generating an abstract syntax tree and combining multiple expert models and large language models for front-end code smell detection and refactoring, this technology solves the problems of time-consuming and laborious manual detection and insufficient tool recognition in existing technologies, and achieves efficient and accurate code quality improvement.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TIANFU JIANGXI LAB
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, front-end code smell detection and refactoring mainly rely on manual methods, which are time-consuming, labor-intensive, and difficult to cover all problems. Existing tools cannot accurately identify design flaws specific to the framework and lack sufficient automation.
The method parses the front-end source code to generate an abstract syntax tree, detects bad smells by combining multiple expert models (a rule model, a machine learning model, and a specific front-end framework model), generates a refactoring strategy in collaboration with a large language model, and then performs automated refactoring and verification.
The stated benefits are higher accuracy and coverage of code problem identification, reasonable and feasible refactoring strategies, lower false positive rates and regression error risks, and better development efficiency and code quality transparency.
[Figure: abstract drawing, CN122240166A]
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and specifically to a method, apparatus, device, and medium for refactoring front-end code smells.
Background Technology
[0002] In front-end development, as projects grow in size and complexity, various "code smells" often appear in the code, such as code duplication, excessively long functions or components, and overly complex conditional logic. These code smells are not direct program errors, but they often indicate deeper design problems, potentially leading to maintenance difficulties, poor scalability, and reduced readability. If not addressed promptly, code smells increase software maintenance costs and quality risks, hindering team collaboration and project evolution.
[0003] The industry has recognized the dangers of code smells and proposed refactoring as a way to eliminate them. Refactoring refers to adjusting the internal structure of software without changing its external behavior to improve maintainability and readability. Good refactoring practices can reduce code size, reorganize messy code structures, and make the code more concise and clear. However, traditional refactoring work mainly relies on manual work, where developers need to rely on experience to find code smells and manually optimize them. This approach is not only time-consuming and labor-intensive but also prone to errors, especially in large projects where manual review and refactoring often fail to cover all issues.
[0004] In recent years, automation tools have begun to be used to assist in the detection and repair of code smells. For example, static code analysis tools (such as ESLint and SonarQube) can scan code and mark potential problem patterns, and some IDE plugins can also provide refactoring suggestions for common smells. However, existing tools can usually detect only a limited number of patterns, have limited understanding of context and semantics, and cannot automatically perform complex refactoring operations. In addition, front-end code (such as React components and Vue components) often has framework-specific structures and semantics, making it difficult for general-purpose tools to accurately identify design flaws. Therefore, there is an urgent need for a more intelligent and comprehensive method that automatically discovers the various code smells in front-end code and performs the corresponding refactoring and repair, thereby improving the quality and development efficiency of front-end software.
Summary of the Invention
[0005] The purpose of this invention is to provide a method, apparatus, device, and medium for refactoring front-end code smells, which solves the problems in the prior art.
[0006] This invention is achieved through the following technical solution:
[0007] In a first aspect, embodiments of the present invention provide a method for refactoring front-end code smells, including:
[0008] Parse the front-end source code and generate an abstract syntax tree;
[0009] Perform bad smell detection on the abstract syntax tree and output the detection results containing bad smell type, location, severity and context information;
[0010] Based on the detection results, generate a corresponding refactoring strategy, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
[0011] Based on the refactoring strategy, the abstract syntax tree is modified to generate refactored code;
[0012] Perform automated testing and code smell elimination verification on the refactored code, and output a refactoring report that includes a comparison of the code before and after refactoring, a list of code smells, and verification results.
[0013] Preferably, the step of performing bad smell detection on the abstract syntax tree and outputting a detection result containing bad smell type, location, severity, and contextual information includes:
[0014] Static code features are extracted from the abstract syntax tree, and the static code features include at least one of node type distribution, code metrics, and framework identifier features;
[0015] The extracted static code features are input into the gating network, and the output is a matching expert model, which includes a rule model, a machine learning model, and a specific front-end framework model.
[0016] The abstract syntax tree is detected based on the expert model. If the detection result indicates that at least one code smell exists, the detection result of the smell type, location, severity, and context information of each code smell is determined.
[0017] Preferably, when the expert model is a rule-based model, the step of detecting the abstract syntax tree based on the expert model includes:
[0018] Load a bad smell rule base that matches the code type of the abstract syntax tree, wherein each rule in the bad smell rule base includes a bad smell type identifier, detection conditions, and the node type of the associated abstract syntax tree;
[0019] Traverse the abstract syntax tree and calculate metrics, including lines of code, number of statements, cyclomatic complexity, nesting depth, number of parameters, and frequency of variable usage;
[0020] Based on the metrics and the detection conditions in the rules, determine whether a code smell is detected and, if so, its severity;
[0021] If a code smell is detected, determine the type, location, and context of the code smell.
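The rule-model steps above can be sketched as follows. This is a minimal illustration in Python over a hand-built ESTree-style tree, not the implementation described in the application: the `Rule` fields, the thresholds, the severity cut-off, and the helper names `walk`, `compute_metrics`, and `detect` are all assumptions, and the cyclomatic complexity is a simplified count of decision points plus one.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List

# Node types counted as decision points (a simplified cyclomatic complexity).
BRANCH_TYPES = {"IfStatement", "ForStatement", "WhileStatement",
                "ConditionalExpression", "SwitchCase"}

def walk(node: Any) -> Iterator[dict]:
    """Yield every AST node (a dict with a "type" key) in pre-order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)

def compute_metrics(fn: dict) -> Dict[str, int]:
    """Compute two of the metrics named above for one function node."""
    branches = sum(1 for n in walk(fn["body"]) if n["type"] in BRANCH_TYPES)
    return {"cyclomatic": branches + 1, "params": len(fn.get("params", []))}

@dataclass
class Rule:
    smell_type: str   # smell type identifier
    node_type: str    # associated AST node type
    metric: str       # metric read by the detection condition
    threshold: int    # hypothetical limit; flagged when metric > threshold

RULES = [
    Rule("LongParameterList", "FunctionDeclaration", "params", 4),
    Rule("ComplexFunction", "FunctionDeclaration", "cyclomatic", 10),
]

def detect(ast: dict) -> List[dict]:
    """Apply every rule to every matching node and collect findings."""
    findings = []
    for node in walk(ast):
        for rule in RULES:
            if node["type"] != rule.node_type:
                continue
            value = compute_metrics(node)[rule.metric]
            if value > rule.threshold:
                findings.append({
                    "type": rule.smell_type,
                    "line": node.get("loc", {}).get("start", {}).get("line"),
                    # Assumed severity scheme: twice the limit counts as high.
                    "severity": "high" if value >= 2 * rule.threshold else "medium",
                    "context": node["id"]["name"],
                })
    return findings

def ident(name: str) -> dict:
    return {"type": "Identifier", "name": name}

# Hand-built tree for a function with five parameters and an empty body.
sample = {"type": "Program", "body": [{
    "type": "FunctionDeclaration",
    "id": ident("renderChart"),
    "params": [ident(p) for p in ("data", "width", "height", "theme", "locale")],
    "body": {"type": "BlockStatement", "body": []},
    "loc": {"start": {"line": 3}},
}]}
findings = detect(sample)
```

Keeping each rule as data (type identifier, node type, condition) rather than as code is what allows a rule base to be loaded per code type, as the steps describe.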
[0022] Preferably, when the expert model is a machine learning model, the step of detecting the abstract syntax tree based on the expert model includes:
[0023] Multidimensional feature vectors are extracted from the subtrees of the abstract syntax tree, and the multidimensional feature vectors include structural features, semantic features and contextual features;
[0024] The multidimensional feature vector is input into the trained machine learning model to obtain at least one candidate bad smell type and the probability of each candidate bad smell type output by the machine learning model.
[0025] If the probability of any candidate bad smell type exceeds the probability threshold, that candidate bad smell type is determined to be a bad smell type of the abstract syntax tree.
[0026] Determine the location and context information of the smell node corresponding to the bad smell, based on the bad smell type and the corresponding subtree;
[0027] Calculate a severity score based on the metrics of the smell node, and assign a severity level according to preset thresholds.
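The thresholding and severity steps of the machine learning path might look like the sketch below. The trained model itself is out of scope here, so the probabilities are passed in directly; the threshold value, the level boundaries, the metric limits, and the averaging formula for the score are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

PROBABILITY_THRESHOLD = 0.7           # assumed preset probability threshold
LEVEL_BOUNDS = [0.33, 0.66]           # assumed preset severity thresholds
LEVELS = ["low", "medium", "high"]

def select_smells(probabilities: Dict[str, float],
                  threshold: float = PROBABILITY_THRESHOLD) -> List[str]:
    """Keep the candidate smell types whose probability exceeds the threshold."""
    return [smell for smell, p in probabilities.items() if p > threshold]

def severity(metrics: Dict[str, float],
             limits: Dict[str, float]) -> Tuple[float, str]:
    """Score a smell node from its metrics and map the score to a level.

    Each metric is divided by a hypothetical upper limit, capped at 1.0,
    and the ratios are averaged into a score between 0 and 1.
    """
    ratios = [min(metrics[name] / limit, 1.0) for name, limit in limits.items()]
    score = sum(ratios) / len(ratios)
    return score, LEVELS[bisect_right(LEVEL_BOUNDS, score)]

# Probabilities as a trained classifier might output them for one subtree.
model_output = {"LongFunction": 0.91, "DuplicateCode": 0.40, "DeepNesting": 0.75}
selected = select_smells(model_output)
score, level = severity({"lines": 120, "nesting": 3}, {"lines": 200, "nesting": 6})
```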
[0028] Preferably, when the expert model is a specific front-end framework model, the step of detecting the abstract syntax tree based on the expert model includes:
[0029] The front-end framework type is identified based on the feature nodes in the abstract syntax tree;
[0030] Load the rule engine corresponding to the identified front-end framework type;
[0031] Based on the rule engine, pattern matching is performed on the abstract syntax tree to identify and output bad smell types;
[0032] Extract source code location and context information from the abstract syntax tree nodes that trigger rules in pattern matching;
[0033] Analyze the impact of the identified bad smell types on framework runtime performance, code maintainability, and functional risk, calculate a quantitative score, and assign a severity level.
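As a rough sketch of the first and last steps above, the code below identifies the framework from feature nodes and turns three impact estimates into a severity level. It assumes the tree uses `JSXElement` (the JSX extension to ESTree) and `VElement` / `VDirectiveKey` (node types produced by vue-eslint-parser); treating any JSX as React is a heuristic, and the weights and level cut-offs are hypothetical.

```python
from typing import Any, Dict, Iterator

def walk(node: Any) -> Iterator[dict]:
    """Yield every AST node (a dict with a "type" key) in pre-order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)

def identify_framework(ast: dict) -> str:
    """Guess the framework from characteristic node types (heuristic)."""
    types = {n["type"] for n in walk(ast)}
    if types & {"VElement", "VDirectiveKey"}:
        return "vue"
    if "JSXElement" in types:
        return "react"
    return "unknown"

# Hypothetical weights for the three impact dimensions named in the text.
IMPACT_WEIGHTS = {"performance": 0.4, "maintainability": 0.3, "functional_risk": 0.3}

def impact_score(impact: Dict[str, float]) -> float:
    """Weighted sum of impact estimates, each expected in the range 0..1."""
    return sum(weight * impact[name] for name, weight in IMPACT_WEIGHTS.items())

def severity_level(score: float) -> str:
    """Map a score to a level using assumed cut-offs."""
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"

react_tree = {"type": "Program", "body": [{
    "type": "ExpressionStatement",
    "expression": {"type": "JSXElement", "children": []},
}]}
framework = identify_framework(react_tree)
score = impact_score({"performance": 0.2, "maintainability": 0.3, "functional_risk": 0.9})
level = severity_level(score)
```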
[0034] Preferably, generating a corresponding refactoring strategy based on the detection results includes:
[0035] Based on the bad smell type in the detection results, look up the corresponding refactoring strategy template in the preset refactoring rule template library;
[0036] Instantiate the refactoring strategy template using the context information in the detection results to generate a first refactoring strategy.
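Template lookup and instantiation could be sketched as below. The library contents, the parameter names, and the shape of the detection record are hypothetical; the point is only that the smell type selects a template and the context information fills its parameters.

```python
from typing import Dict, Optional

# Hypothetical template library keyed by smell type.
TEMPLATE_LIBRARY: Dict[str, dict] = {
    "DuplicateCode": {"action": "extract_function",
                      "params": ["function_name", "ranges"]},
    "LongFunction": {"action": "split_function",
                     "params": ["function_name", "split_points"]},
}

def instantiate(detection: dict) -> Optional[dict]:
    """Build a first strategy from the template matching the smell type.

    Returns None when the library has no template for this smell type,
    and raises ValueError when the context lacks a required parameter.
    """
    template = TEMPLATE_LIBRARY.get(detection["type"])
    if template is None:
        return None
    context = detection["context"]
    missing = [p for p in template["params"] if p not in context]
    if missing:
        raise ValueError(f"context lacks template parameters: {missing}")
    strategy = {"action": template["action"], "smell": detection["type"]}
    strategy.update({p: context[p] for p in template["params"]})
    return strategy

detection = {
    "type": "DuplicateCode",
    "context": {"function_name": "formatDate",
                "ranges": [("a.js", 10, 20), ("b.js", 15, 25)]},
}
first_strategy = instantiate(detection)
```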
[0037] Preferably, the method further includes:
[0038] Input the detection results as a prompt into the large language model to obtain a second refactoring strategy;
[0039] Verify the second refactoring strategy. If the second refactoring strategy is not feasible, adopt the first refactoring strategy as the final refactoring strategy.
[0040] If the second refactoring strategy is feasible, adopt the second refactoring strategy as the final refactoring strategy.
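The selection logic between the two strategies can be sketched as follows. The large language model is represented by an injected callable (`ask_llm`), so no particular model API is implied, and the feasibility check shown here (a known action plus a non-empty target) is only a placeholder for whatever verification an implementation performs.

```python
from typing import Callable, Optional

# Hypothetical set of transformations the AST modifier knows how to apply.
ALLOWED_ACTIONS = {"extract_function", "split_function", "rename_variable"}

def is_feasible(strategy: Optional[dict]) -> bool:
    """Placeholder feasibility check for a proposed strategy."""
    return (isinstance(strategy, dict)
            and strategy.get("action") in ALLOWED_ACTIONS
            and bool(strategy.get("target")))

def choose_strategy(detection: dict,
                    first_strategy: dict,
                    ask_llm: Callable[[dict], Optional[dict]]) -> dict:
    """Prefer the model's strategy when feasible, else use the template one."""
    try:
        second_strategy = ask_llm(detection)
    except Exception:
        second_strategy = None  # a failed model call counts as infeasible
    return second_strategy if is_feasible(second_strategy) else first_strategy

detection = {"type": "LongFunction", "context": {"function_name": "renderChart"}}
template_strategy = {"action": "split_function", "target": "renderChart"}

# Stub model replies, standing in for real model output.
good_reply = {"action": "extract_function", "target": "renderChart"}
bad_reply = {"action": "rewrite_everything", "target": "renderChart"}

chosen_good = choose_strategy(detection, template_strategy, lambda d: good_reply)
chosen_bad = choose_strategy(detection, template_strategy, lambda d: bad_reply)
```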
[0041] In a second aspect, embodiments of the present invention provide a front-end code smell refactoring device, comprising:
[0042] The parsing module is used to parse the front-end source code and generate an abstract syntax tree;
[0043] The detection module is used to perform bad smell detection on the abstract syntax tree and output the detection results including bad smell type, location, severity and context information;
[0044] The strategy module is used to generate a corresponding refactoring strategy based on the detection results, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
[0045] The modification module is used to modify the abstract syntax tree according to the refactoring strategy and generate refactored code;
[0046] The reporting module is used to perform automated testing and code smell elimination verification on the refactored code, and output a refactoring report that includes a comparison before and after refactoring, a list of code smells, and verification results.
[0047] In a third aspect, embodiments of the present invention provide an electronic device, including: at least one processor, at least one memory, and computer program instructions stored in the memory, which, when executed by the processor, implement the method of the first aspect described above.
[0048] In a fourth aspect, embodiments of the present invention provide a storage medium storing computer program instructions, which, when executed by a processor, implement the method of the first aspect described above.
[0049] Compared with the prior art, the present invention has the following advantages and beneficial effects:
[0050] By introducing a code smell detection mechanism using multiple expert models, the accuracy and coverage of code problem identification are significantly improved. Traditional single detection methods often have limitations in dealing with diverse and deep-seated code smells. This solution, however, uses a gating network to dynamically schedule rule models, machine learning models, and specific front-end framework models for collaborative analysis, allowing both common code problems and framework-specific defects to be handled by the most specialized models. This division of labor and integration not only improves the detection rate of various code smells (such as code duplication, excessively long functions, and React / Vue-specific anti-patterns), but also reduces false positives through comprehensive evaluation of multi-model results, providing a more reliable basis for problem diagnosis in subsequent refactoring.
[0051] By generating refactoring strategies through the collaboration of preset refactoring rule templates and a large language model, the system gains adaptability and intelligence while keeping refactoring operations safe. The preset template library ensures that fixes for common code smells conform to industry best practices and are predictable, while the large language model handles special cases that the template library does not cover or whose context is extremely complex, providing more creative optimization suggestions. The combination of feasibility verification and a fallback mechanism avoids the uncertainties and risks of relying solely on the large language model, while overcoming the inflexibility of purely rule-based systems. This allows reasonable, feasible, and effective refactoring strategies to be generated across various scenarios.
[0052] An end-to-end automated refactoring pipeline was constructed, using an abstract syntax tree (AST) as the unified operational foundation. This pipeline achieves a closed loop from code parsing, code smell detection, strategy generation to code modification and verification. Since all structural transformations are performed directly on the AST, the accuracy and reliability of the refactoring operations at the syntactic level are ensured, avoiding errors that may arise from text-based replacement. The integration of automated testing and code smell elimination verification allows for immediate verification of functional consistency and refactoring effectiveness after modifications, forming a quality safety net. This significantly reduces the risk of regression errors introduced by automated refactoring and increases developers' trust in the results.
[0053] This solution enhances the transparency and auditability of the entire process by outputting structured, traceable refactoring reports. The reports detail detected code smells, applied refactoring strategies, code change differences, and verification results, making the refactoring process no longer a "black box" and facilitating developer understanding, review, and intervention. This not only facilitates human-machine collaboration but also provides the necessary data foundation for integrating this method into continuous integration processes, thereby supporting continuous and automated code quality maintenance throughout the software lifecycle, reducing maintenance costs and improving engineering efficiency in the long term.
Attached Figure Description
[0054] To more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly described below. It should be understood that the following drawings only show some embodiments of the present invention and should not be considered as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort. In the drawings:
[0055] Figure 1 is a flowchart of the front-end code smell refactoring method provided by the present invention;
[0056] Figure 2 is a schematic diagram of the front-end code smell refactoring device provided by the present invention;
[0057] Figure 3 is a schematic diagram of the structure of the electronic device provided by the present invention.
Detailed Implementation
[0058] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the embodiments and accompanying drawings. The illustrative embodiments and descriptions of the present invention are only used to explain the present invention and are not intended to limit the present invention.
[0059] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes said element.
[0060] It should be noted that all actions involving the acquisition of signals, information, or data in this invention are carried out in compliance with the relevant data protection laws and regulations of the locality and with authorization from the owner of the relevant device.
[0061] Example 1
[0062] Referring to Figure 1, this invention provides a method for refactoring front-end code smells, including:
[0063] S1. Parse the front-end source code and generate an abstract syntax tree;
[0064] Specifically, the process involves obtaining the front-end source code files to be analyzed. These files can be JavaScript or TypeScript source code from a project, or they can contain code formats specific to front-end frameworks such as JSX, TSX, or Vue single-file components. The source code is then input into a parser for lexical and syntactic analysis, generating an Abstract Syntax Tree (AST). For example, JavaScript parsers such as Espree and Acorn can be used to convert JS / TS code into an ESTree-formatted AST, or a framework-specific parser can be used to process the JSX / Vue template portion. By constructing the AST, the structure and elements of the code (functions, variables, statements, expressions, etc.) are represented as tree nodes, facilitating subsequent analysis and manipulation.
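To make the tree representation concrete, here is a small sketch of an ESTree-style AST and a pre-order traversal over it. The tree is written by hand for `function add(a, b) { return a + b; }` instead of being produced by a parser, and Python dictionaries stand in for the parser's node objects; the `walk` helper is an illustrative name, not part of any parser's API.

```python
from typing import Any, Dict, Iterator

# Hand-built ESTree-style AST for: function add(a, b) { return a + b; }
# A real pipeline would obtain this tree from a parser such as Acorn or Espree.
AST: Dict[str, Any] = {
    "type": "Program",
    "body": [{
        "type": "FunctionDeclaration",
        "id": {"type": "Identifier", "name": "add"},
        "params": [{"type": "Identifier", "name": "a"},
                   {"type": "Identifier", "name": "b"}],
        "body": {"type": "BlockStatement", "body": [{
            "type": "ReturnStatement",
            "argument": {"type": "BinaryExpression", "operator": "+",
                         "left": {"type": "Identifier", "name": "a"},
                         "right": {"type": "Identifier", "name": "b"}},
        }]},
    }],
}

def walk(node: Any) -> Iterator[dict]:
    """Yield every AST node (a dict with a "type" key) in pre-order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)

node_types = [node["type"] for node in walk(AST)]
function_names = [node["id"]["name"] for node in walk(AST)
                  if node["type"] == "FunctionDeclaration"]
```

Every later stage (metrics, rule matching, modification) can be expressed as a traversal of this kind, which is why the AST serves as the shared working representation.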
[0065] S2. Perform bad smell detection on the abstract syntax tree and output the detection results containing bad smell type, location, severity and context information.
[0066] Specifically, after obtaining the AST, the system scans and detects code smells using predefined rules. These rules can be divided into two categories: general code smell detection rules and specific rules tailored to the characteristics of front-end frameworks.
[0067] General bad code smell detection rules are used to identify various common bad code structures. For example, by traversing the AST and counting the number of lines of code and branches of a function, bad smells such as "overly long functions" or "excessively high cyclomatic complexity" can be detected; by comparing the structure or content of different code snippets, duplicate code blocks (code duplication) can be found; by analyzing the number of attributes and methods of a class or module, it can be determined whether there are problems such as "overly large classes" or "data clumps". These rules can be implemented based on simple pattern matching or combined with code metrics (such as Halstead complexity, McCabe cyclomatic complexity, etc.) for judgment.
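One way to find duplicate code blocks by comparing structure is to fingerprint each subtree after discarding identifier names, literal values, and positions. The sketch below does this for function bodies; the ignored-key list and the grouping by SHA-256 digest are choices made for this illustration, not details given in the text.

```python
import hashlib
import json
from collections import defaultdict
from typing import Any, List

# Keys dropped before hashing, so renamed variables still match (assumed list).
IGNORED_KEYS = {"name", "value", "raw", "loc", "range", "start", "end"}

def shape(node: Any) -> Any:
    """Return a copy of the subtree that keeps structure and operators only."""
    if isinstance(node, list):
        return [shape(item) for item in node]
    if isinstance(node, dict):
        return {k: shape(v) for k, v in node.items() if k not in IGNORED_KEYS}
    return node

def fingerprint(node: Any) -> str:
    """Hash the normalized subtree into a stable digest."""
    text = json.dumps(shape(node), sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_duplicates(functions: List[dict]) -> List[List[str]]:
    """Group function names whose bodies share a structural fingerprint."""
    groups = defaultdict(list)
    for fn in functions:
        groups[fingerprint(fn["body"])].append(fn["id"]["name"])
    return [names for names in groups.values() if len(names) > 1]

def ident(name: str) -> dict:
    return {"type": "Identifier", "name": name}

def returning(op: str, left: str, right: str, fn_name: str) -> dict:
    """Build: function fn_name() { return left <op> right; }"""
    return {"type": "FunctionDeclaration", "id": ident(fn_name), "params": [],
            "body": {"type": "BlockStatement", "body": [{
                "type": "ReturnStatement",
                "argument": {"type": "BinaryExpression", "operator": op,
                             "left": ident(left), "right": ident(right)}}]}}

functions = [returning("+", "a", "b", "sumPrices"),
             returning("+", "x", "y", "sumTotals"),
             returning("*", "a", "b", "area")]
duplicates = find_duplicates(functions)
```

The first two functions differ only in variable names, so they share a fingerprint, while the third uses a different operator and does not.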
[0068] Front-end framework-specific bad smell rules target typical problems within a particular framework. For example, in React components, they detect unnecessary `useEffect` dependencies, unused state or props, or missing `key` attributes in list rendering; in Vue components, they detect excessively long expressions in templates, overly coupled prop passing between components, or improper use of lifecycle hooks. These rules can leverage framework-specific AST node types and structures for matching, such as identifying a list rendered in JSX that is missing a key, or complex calculation logic in a Vue template.
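The missing `key` case can be illustrated with a small matcher over JSX-extended ESTree nodes. This sketch only recognizes the simplest shape, an arrow function passed to `.map(...)` whose body is directly a JSX element; callbacks with block bodies, fragments, and other list-building patterns are not handled. The function names and the finding format are assumptions.

```python
from typing import Any, Iterator, List

def walk(node: Any) -> Iterator[dict]:
    """Yield every AST node (a dict with a "type" key) in pre-order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)

def lacks_key(callback: dict) -> bool:
    """True when the callback directly returns a JSX element without `key`."""
    body = callback.get("body") or {}
    if body.get("type") != "JSXElement":
        return False
    attributes = body["openingElement"]["attributes"]
    return not any(a.get("type") == "JSXAttribute" and a["name"]["name"] == "key"
                   for a in attributes)

def find_missing_keys(ast: dict) -> List[dict]:
    """Report `.map(...)` calls that render JSX elements without a key."""
    findings = []
    for node in walk(ast):
        if node["type"] != "CallExpression":
            continue
        callee = node.get("callee", {})
        is_map = (callee.get("type") == "MemberExpression"
                  and callee.get("property", {}).get("name") == "map")
        args = node.get("arguments", [])
        if is_map and args and args[0].get("type") == "ArrowFunctionExpression" \
                and lacks_key(args[0]):
            findings.append({"type": "MissingKey",
                             "line": node.get("loc", {}).get("start", {}).get("line")})
    return findings

def ident(name: str) -> dict:
    return {"type": "Identifier", "name": name}

def jsx(tag: str, attributes: list) -> dict:
    return {"type": "JSXElement", "children": [],
            "openingElement": {"type": "JSXOpeningElement",
                               "name": {"type": "JSXIdentifier", "name": tag},
                               "attributes": attributes}}

def map_call(array: str, element: dict, line: int) -> dict:
    return {"type": "ExpressionStatement", "expression": {
        "type": "CallExpression",
        "callee": {"type": "MemberExpression", "object": ident(array),
                   "property": ident("map"), "computed": False},
        "arguments": [{"type": "ArrowFunctionExpression",
                       "params": [ident("item")], "body": element}],
        "loc": {"start": {"line": line}}}}

key_attr = {"type": "JSXAttribute",
            "name": {"type": "JSXIdentifier", "name": "key"}, "value": None}
tree = {"type": "Program", "body": [
    map_call("users", jsx("li", []), 12),          # users.map(item => <li/>)
    map_call("posts", jsx("li", [key_attr]), 20),  # posts.map(item => <li key=.../>)
]}
findings = find_missing_keys(tree)
```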
[0069] In a preferred embodiment, the present invention introduces a machine learning model to assist in bad smell detection, thereby improving the accuracy and coverage of detection. For example, a trained deep neural network model can be used, taking a code snippet or AST subtree as input and outputting the probability that it belongs to a certain bad smell. Such models can be trained on a large dataset of labeled code and can learn bad smell patterns that are difficult to describe with explicit rules. For example, by labeling a large number of open-source front-end projects, the model can be trained to identify patterns such as "inappropriate comments" (e.g., comments that are inconsistent with code logic) or "complex nested callbacks". During actual detection, the code snippet is input into the model, and if the probability output by the model exceeds a preset threshold, the corresponding bad smell is marked.
[0070] During the detection process, once a pattern matching the rules or model for identifying "smells" is found, the system records relevant information about the smell, including the file it appears in, line number, smell type, and severity level. For example, if duplicate code is detected, it is recorded that it appears repeatedly in lines 10-20 of file A and lines 15-25 of file B, with the type "Duplicate Code" and a severity level of "Medium". All detection results are compiled into a list of smells for subsequent processing.
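A detection record of this kind could be modeled as a small data structure. The class and field names below are illustrative, and the sample reuses the duplicate-code example from the paragraph above, with `A.js` and `B.js` as placeholder file names.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class Location:
    file: str
    start_line: int
    end_line: int

@dataclass
class SmellRecord:
    smell_type: str
    severity: str
    locations: List[Location] = field(default_factory=list)
    context: str = ""

# The duplicate-code example from the text, expressed as one record.
record = SmellRecord(
    smell_type="Duplicate Code",
    severity="Medium",
    locations=[Location("A.js", 10, 20), Location("B.js", 15, 25)],
)

smell_list = [record]                              # compiled list of smells
serialized = json.dumps([asdict(r) for r in smell_list], indent=2)
```

Serializing the list keeps the results usable by later stages, such as strategy generation and the final report.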
[0071] S3. Based on the detection results, generate a corresponding refactoring strategy, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
[0072] Specifically, refactoring strategy generation aims to provide concrete code modification solutions for detected code smells. The system maintains a predefined refactoring rule template library, which defines standard repair patterns for common code smell types. For example, for code duplication smells, a predefined template for extracting common functions is provided. When generating a strategy, the system first attempts to match the code smell type in the detection results against the templates in the library, and uses information such as position and context from the detection results to instantiate the template parameters, thereby obtaining a deterministic refactoring solution. In parallel, the system can format the detection results, including the code smell description and context, and input them into a large language model. Because the large language model is trained on a broad code corpus, its generalization and generation capabilities can yield creative refactoring suggestions that go beyond the scope of the predefined templates. The two generation methods complement each other: the predefined rules ensure the basic feasibility and safety of the solution, while the large language model provides the flexibility to adapt to complex or special contexts. The system can evaluate and select from the generated strategies. If a strategy generated by the large language model is deemed too risky or infeasible, the system automatically adopts the strategy generated from the predefined rules as a reliable alternative, thus balancing innovation and robustness.
[0073] S4. Modify the abstract syntax tree according to the refactoring strategy to generate refactored code;
[0074] Specifically, this step translates the logical refactoring strategy into actual modifications to the code structure. Refactoring strategies are typically expressed as a series of instructions executed on the abstract syntax tree (AST), such as inserting a new function declaration child node under a specific parent node, or moving a subtree from its original position to a new one. Because the AST completely and precisely encodes the syntactic structure of the source code, adding, deleting, modifying, and moving its nodes ensures that the generated new tree remains syntactically correct. The modification process strictly follows the definition of the refactoring strategy, executing each instruction one by one. For example, when executing the function extraction strategy, the system locates the subtree node corresponding to the target code block, creates a new function node and sets it as the parent node of that subtree, then replaces it with a function call node in its original position, and properly handles variable passing within the scope. After all modifications are complete, the code generator converts the modified AST back to the source code text format that conforms to the language specification. This tree-based approach allows complex code transformations to be completed in a structured and predictable manner, avoiding the syntactic breakdowns or accidental modifications that might occur with text-based replacement.
[0075] Specifically, for each refactoring strategy, the system traverses the AST to find the corresponding nodes and makes modifications. For example, for the "extract function" strategy, the system finds the AST subtree corresponding to the code block to be extracted, deletes it from its original position, inserts a new function definition node at the appropriate position, and inserts a call node for the new function at the original position; for the "rename variable" strategy, the system finds all declaration and reference nodes of the variable and modifies its name attribute to the new name; for the "split component" strategy, the system may split a large component AST into multiple smaller component ASTs and adjust the calling relationships between them.
[0076] Because the AST carries complete structural information about the code, these operations can be performed with absolute precision. For example, when extracting functions, the system can automatically handle variable scope issues, passing external variables referenced in the original code block as parameters to the new function, and replacing the result of the original code block with the return value of the new function. These detailed processing steps can be completed automatically by manipulating AST nodes and their attributes, without manual intervention.
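The scope handling in function extraction can be sketched as a free-variable computation: identifiers that the block reads but does not declare become the parameters of the new function and the arguments of the replacement call. This simplified version treats every `VariableDeclarator` in the block as a local declaration, skips non-computed member property names, and does not handle nested function scopes, globals, or assignments back to outer variables; the helper names are illustrative.

```python
from typing import Any, List, Tuple

def free_variables(block: Any) -> List[str]:
    """Names used in the block but not declared in it, in first-use order."""
    declared = set()
    used: List[str] = []

    def visit(node: Any) -> None:
        if isinstance(node, list):
            for item in node:
                visit(item)
        elif isinstance(node, dict):
            kind = node.get("type")
            if kind == "VariableDeclarator":
                declared.add(node["id"]["name"])
                visit(node.get("init"))
            elif kind == "MemberExpression":
                visit(node["object"])
                if node.get("computed"):      # obj[expr] reads expr; obj.prop does not
                    visit(node["property"])
            elif kind == "Identifier":
                if node["name"] not in used:
                    used.append(node["name"])
            else:
                for value in node.values():
                    visit(value)

    visit(block)
    return [name for name in used if name not in declared]

def ident(name: str) -> dict:
    return {"type": "Identifier", "name": name}

def extract_function(statements: list, name: str) -> Tuple[dict, dict]:
    """Return (new function declaration, call node to put at the old position)."""
    params = free_variables(statements)
    new_function = {"type": "FunctionDeclaration", "id": ident(name),
                    "params": [ident(p) for p in params],
                    "body": {"type": "BlockStatement", "body": statements}}
    call = {"type": "CallExpression", "callee": ident(name),
            "arguments": [ident(p) for p in params]}
    return new_function, call

def binary(op: str, left: str, right: str) -> dict:
    return {"type": "BinaryExpression", "operator": op,
            "left": ident(left), "right": ident(right)}

# Block to extract:  const total = price * qty;  return total + tax;
block = [
    {"type": "VariableDeclaration", "kind": "const", "declarations": [
        {"type": "VariableDeclarator", "id": ident("total"),
         "init": binary("*", "price", "qty")}]},
    {"type": "ReturnStatement", "argument": binary("+", "total", "tax")},
]
new_function, call = extract_function(block, "computeTotal")
param_names = [p["name"] for p in new_function["params"]]
```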
[0077] After completing all planned refactoring operations, the system converts the modified AST back into source code text. This process can be achieved using a code generator (printer), such as using tools like escodegen to regenerate the ESTree AST into JavaScript code strings. Since the AST retains comments, formatting, and other information (which can be selectively preserved during parsing), the generated source code is as consistent as possible with the original code in terms of format and comments, only changing the parts that need refactoring. This reduces unnecessary formatting changes caused by automatic refactoring and makes it easier for developers to see the differences.
[0078] S5. Perform automated testing and code smell elimination verification on the refactored code, and output a refactoring report that includes a comparison before and after refactoring, a list of code smells, and verification results.
[0079] Specifically, automated test execution involves running unit tests, integration tests, or end-to-end tests if they already exist in the project. The system will automatically run these test cases to check for test failures. If all tests pass, the refactoring has not broken known functionality. If any tests fail, the reasons for the failures need to be analyzed to determine if they are related to the refactoring. For example, improper logic handling during the refactoring process might have caused a change in the behavior of a certain function. In this case, the issue needs to be recorded, and the corresponding refactoring should be considered for rollback.
[0080] Smell Removal Check: The system re-runs the smell detection module to re-examine previously detected smells and confirm whether they have been successfully fixed. For example, has previously detected duplicate code been extracted into common functions to eliminate duplication? Have excessively long functions been split and shortened? If any unfixed smells remain, the system can record them and indicate that further processing is needed.
[0081] Static analysis and code inspection: Static code analysis tools are used to scan the refactored code to check for any new problems (such as unused variables, potential type errors, etc.). If new problems are found, the system will record them as refactoring side effects for developers to evaluate.
[0082] Optional manual verification: In some cases, especially for changes related to the front-end interface, the system can generate previews or difference reports of the interface before and after the refactoring, which can be manually checked by developers or testers to ensure that the interface behavior and visual effects meet expectations.
[0083] Through the aforementioned multi-layered verification, the correctness of the refactoring is guaranteed to the greatest extent possible. If problems are found during the verification process, the system can perform a refactoring rollback operation: replacing the refactored AST back with the original AST, regenerating the source code, and thus restoring the system to its state before refactoring. This avoids defective refactoring from damaging the codebase. In some implementations, the system saves a snapshot of the code before refactoring; if verification fails, the code can be directly restored from the snapshot.
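The accept-or-rollback decision might be organized as in the sketch below. Test execution and smell detection are injected as callables so the example stays self-contained; the stub implementations, file contents, and result fields are hypothetical, and the snapshot is a plain copy of the file map taken before verification.

```python
from typing import Callable, Dict, List, Tuple

Files = Dict[str, str]  # path -> source text

def verify_or_rollback(
    original: Files,
    refactored: Files,
    run_tests: Callable[[Files], bool],
    detect_smells: Callable[[Files], List[str]],
    target_smell: str,
) -> Tuple[Files, dict]:
    """Keep the refactored files only if tests pass and the smell is gone."""
    snapshot = dict(original)  # strings are immutable, so a shallow copy suffices
    tests_passed = run_tests(refactored)
    smell_removed = target_smell not in detect_smells(refactored)
    accepted = tests_passed and smell_removed
    result = {"tests_passed": tests_passed,
              "smell_removed": smell_removed,
              "rolled_back": not accepted}
    return (refactored if accepted else snapshot), result

# Stub detector: reports duplication when the same call appears more than once.
def stub_detect(files: Files) -> List[str]:
    return ["DuplicateCode"] if files["a.js"].count("toISOString") > 1 else []

before = {"a.js": "log(x.toISOString()); log(y.toISOString());"}
after = {"a.js": "log(fmt(x)); log(fmt(y)); function fmt(d) { return d.toISOString(); }"}

kept, ok_result = verify_or_rollback(before, after, lambda f: True,
                                     stub_detect, "DuplicateCode")
restored, failed_result = verify_or_rollback(before, after, lambda f: False,
                                             stub_detect, "DuplicateCode")
```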
[0084] The refactoring report may include the following items. First, a list of the detected code smells, where each record contains the smell type, the file and location where it occurs, and its severity.
[0085] A description of the refactoring strategy applied to each code smell, such as "extracted duplicate code into the utils.formatDate() function" or "split the long function renderChart() into fetchData() and drawChart()".
[0086] The code differences before and after refactoring, displayed in diff format to show the specific lines changed in each file.
[0087] A summary of the test verification results, such as "All unit tests passed" or "X tests failed (unrelated to refactoring)".
[0088] Other statistics, such as the time spent refactoring, the number of lines of code modified, and the number of code smells eliminated.
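A report containing these items could be assembled as in the following sketch, which uses Python's standard `difflib.unified_diff` for the before/after comparison. The report fields, the file name `app.js`, and the sample sources are assumptions for illustration.

```python
import difflib
from typing import List

def build_report(path: str, before: str, after: str,
                 smells: List[dict], tests_passed: bool) -> dict:
    """Assemble a refactoring report with a unified diff and simple statistics."""
    diff_lines = list(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))
    # The first two lines are the ---/+++ file headers; skip them when counting.
    changed = sum(1 for line in diff_lines[2:] if line.startswith(("+", "-")))
    return {
        "smells": smells,
        "diff": "".join(diff_lines),
        "tests": "All unit tests passed" if tests_passed else "Some tests failed",
        "lines_changed": changed,
        "smells_eliminated": sum(1 for s in smells if s.get("fixed")),
    }

before_src = "function f(a) {\n  return a + 1;\n}\n"
after_src = "function f(a) {\n  return increment(a);\n}\n"
report = build_report(
    "app.js", before_src, after_src,
    smells=[{"type": "DuplicateCode", "file": "app.js", "line": 2,
             "severity": "medium", "fixed": True}],
    tests_passed=True,
)
```

The same dictionary can be rendered as plain text, as HTML, or inside an IDE panel, depending on where the report is shown.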
[0089] The output can be a text report, an HTML report, or a visual presentation directly within an Integrated Development Environment (IDE). In one embodiment, this method can be implemented as an IDE plugin. When a developer triggers code optimization, a list of code smell detection results and a button to apply the refactoring with one click will pop up on the IDE interface. The developer can review the report, confirm its accuracy, and then click "Apply" to write the refactoring results to the code file.
[0090] In another embodiment, this method can be integrated into a continuous integration / continuous delivery (CI / CD) process. For example, after code is committed to a version control system, the CI server runs this method to scan and automatically refactor the code. The generated refactoring patches can then be submitted as pull requests for team review. This approach helps to automatically clean up code smells before the code enters the main branch, improving codebase quality.
[0091] In some implementations, step S2, performing bad smell detection on the abstract syntax tree and outputting detection results containing bad smell type, location, severity, and context information, includes:
[0092] S21. Extract static code features from the abstract syntax tree, wherein the static code features include at least one of node type distribution, code metrics, and framework identifier features;
[0093] Specifically, in the process of bad smell detection, the first step is to extract static features reflecting the inherent properties of the code from the abstract syntax tree. Node type distribution describes the frequency and proportion of various syntax structures (such as function declarations, loop statements, and conditional expressions) in the tree, providing a macro view of the code's structural complexity. Code metrics further characterize code quality through quantitative analysis; for example, cyclomatic complexity measures the complexity of control flow, lines of code reflect module size, and nesting depth indicates logical hierarchy. Framework identifiers are used to identify whether the code uses a specific front-end framework and its version, for example, by detecting whether it contains React's JSX syntax nodes or Vue's directive nodes. Extracting these features provides a basis for subsequent expert routing.
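The feature extraction described above can be sketched as follows. This is a minimal illustration, assuming a simplified AST node with only a type and a list of children; the node type names follow common ESTree conventions, and AstNode, StaticFeatures, and extractFeatures are hypothetical names.

```typescript
// Minimal, hypothetical AST node shape (real parsers expose richer nodes).
interface AstNode {
  type: string;
  children: AstNode[];
}

interface StaticFeatures {
  nodeTypeCounts: Record<string, number>; // node type distribution
  maxNestingDepth: number;                // one simple code metric
  usesJsx: boolean;                       // framework identifier feature
}

const NESTING_TYPES = ["IfStatement", "ForStatement", "WhileStatement"];

function extractFeatures(root: AstNode): StaticFeatures {
  const counts: Record<string, number> = {};
  let maxDepth = 0;
  let usesJsx = false;

  function visit(node: AstNode, depth: number): void {
    counts[node.type] = (counts[node.type] ?? 0) + 1;
    const nests = NESTING_TYPES.indexOf(node.type) !== -1;
    const here = nests ? depth + 1 : depth;
    if (here > maxDepth) maxDepth = here;
    if (node.type === "JSXElement") usesJsx = true;
    for (const child of node.children) visit(child, here);
  }

  visit(root, 0);
  return { nodeTypeCounts: counts, maxNestingDepth: maxDepth, usesJsx };
}

// Small hand-built tree standing in for a parsed function.
const n = (type: string, ...children: AstNode[]): AstNode => ({ type, children });
const tree = n(
  "FunctionDeclaration",
  n("IfStatement", n("ForStatement", n("ReturnStatement", n("JSXElement")))),
  n("IfStatement"),
);
const features = extractFeatures(tree);
console.log(features.maxNestingDepth, features.usesJsx); // 2 true
```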
[0094] S22. Input the extracted static code features into the gating network and output a matching expert model, wherein the expert model includes a rule model, a machine learning model and a specific front-end framework model;
[0095] Specifically, the extracted feature vectors are input into a gating network. A gating network is typically a trainable, lightweight classifier or routing function that computes a set of weights based on the input features to dynamically select one or more expert models most likely to perform the current detection task. These expert models include rule-based models for pattern matching based on predefined logical rules, machine learning models that inductively extract patterns from data based on statistical learning, and specific front-end framework models that perform deep semantic analysis for frameworks such as React or Vue. This division of labor allows different types or sources of bad smells to be handled by the most specialized detection units.
[0096] S23. Detect the abstract syntax tree according to the matched expert model. If at least one code smell is detected, determine the smell type, location, severity, and context information of each code smell as the detection result.
[0097] Specifically, the selected expert model performs in-depth analysis of the abstract syntax tree or its substructures. Rule-based models determine code smells by traversing tree nodes and applying threshold rules (such as function length exceeding a set value); machine learning models map the tree structure or extracted features to smell probabilities; and specific front-end framework models perform semantic-level matching based on the framework's best practices and anti-pattern libraries. When any model determines a code smell exists, the system integrates the model's output with the feature analysis results to determine the precise type of the smell. Location information is determined by mapping back to the source code row and column numbers of the corresponding nodes in the abstract syntax tree. Severity is typically calculated or graded based on the inherent harm of the smell type, the degree of deviation from relevant metrics, and the scope of the smell's influence within the code context. Context information is constructed by extracting the syntactic environment surrounding the smell node, associated variables and functions, and its role within the module or component. Finally, all this information is integrated into a structured detection result object, completing the entire detection chain from code feature extraction to multi-dimensional smell information output.
[0098] This embodiment employs a Mixture-of-Experts (MoE) architecture to integrate multiple detection methods, leveraging the strengths of different tools and models. The MoE model comprises a gating network (router) and multiple expert models. Based on the characteristics of the input code, the gating network determines which expert model or models should perform detection. Each expert model focuses on detecting one or more types of code smells; it can be a rule-based analyzer or a specific machine learning model. For example, one expert might specialize in detecting "excessively long functions," another in "duplicate code," and yet another in "React-specific code smells." Through the MoE architecture, the system can automatically select the most suitable detection method for different code features, thereby improving detection efficiency and accuracy. In practice, the gating network can be trained on training data so that it learns to select the appropriate expert based on the code's structural features (such as AST depth and node type distribution). Compared with a single model, this ensemble detection approach covers more types of code smells and is better able to identify complex code smells.
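One possible form of the gating network is a linear scoring layer followed by a softmax, sketched below. The weights are hand-set for illustration (in practice they would be learned), the feature order is an assumption, and the names ExpertGate, GATES, and route are hypothetical.

```typescript
type ExpertName = "rule" | "ml" | "framework";

interface ExpertGate {
  expert: ExpertName;
  weights: number[]; // one weight per input feature
}

// Illustrative, hand-set weights; in practice they would be learned from training data.
// Assumed feature order: [normalized size, normalized complexity, framework flag (0 or 1)].
const GATES: ExpertGate[] = [
  { expert: "rule", weights: [1.0, 0.2, 0.0] },
  { expert: "ml", weights: [0.2, 1.0, 0.0] },
  { expert: "framework", weights: [0.0, 0.0, 2.0] },
];

// Numerically stable softmax over raw gate scores.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

interface Routed {
  expert: ExpertName;
  weight: number;
}

// Select every expert whose gate weight reaches `minWeight`, highest first.
function route(features: number[], minWeight: number): Routed[] {
  const scores = GATES.map((g) =>
    g.weights.reduce((acc, w, i) => acc + w * (features[i] ?? 0), 0),
  );
  const probs = softmax(scores);
  return GATES
    .map((g, i) => ({ expert: g.expert, weight: probs[i] ?? 0 }))
    .filter((r) => r.weight >= minWeight)
    .sort((a, b) => b.weight - a.weight);
}

const reactHeavy = route([0.1, 0.1, 1.0], 0.3); // small code that uses a framework
const plainLarge = route([1.0, 0.9, 0.0], 0.3); // large, complex, framework-free code
console.log(reactHeavy.map((r) => r.expert), plainLarge.map((r) => r.expert));
```

With these illustrative weights, the first input is routed only to the framework expert, while the second is routed to both the rule model and the machine learning model, which mirrors the "one or more expert models" selection described above.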
[0099] In some implementations, when the expert model is a rule-based model, the step of detecting the abstract syntax tree based on the expert model includes:
[0100] Load a bad smell rule base that matches the code type of the abstract syntax tree, wherein each rule in the bad smell rule base includes a bad smell type identifier, detection conditions, and the node type of the associated abstract syntax tree;
[0101] Traverse the abstract syntax tree and calculate metrics, including lines of code, number of statements, cyclomatic complexity, nesting depth, number of parameters, and frequency of variable usage;
[0102] Based on the calculated metrics and the detection conditions in the rules, determine whether a code smell is detected and, if so, its severity;
[0103] If a code smell is detected, determine the type, location, and context of the code smell.
[0104] Specifically, when the expert model is a rule-based model, its detection process begins with the matching and loading of the rule base. The system first filters and loads an appropriate bad smell rule base based on the programming language category (e.g., JavaScript, TypeScript) and possible framework context corresponding to the abstract syntax tree (AST). Each rule in this base is a structured detection unit containing three core elements: a bad smell type identifier that uniquely identifies the specific bad smell category; detection conditions that define the decision logic, typically expressed as thresholds or Boolean combinations based on code metrics; and the type of AST node that the rule targets, such as a function declaration node or a block statement node, to narrow down the detection scope. Subsequently, the system traverses the AST according to a predetermined strategy. When it encounters a node that matches the node type specified in the rule, it uses the subtree rooted at that node as the analysis object and calculates a series of predefined quantitative metrics. These metrics aim to characterize code snippets from different dimensions: the number of lines of code and statements reflects the size; cyclomatic complexity assesses the complexity of control flow by calculating the number of linear independent paths; nesting depth describes the hierarchy of the logical structure; and the number of parameters and the frequency of variable usage reveal the complexity of the interface and internal coupling. The values of these metrics provide numerical basis for subsequent conditional judgments. Next, the system executes the conditional judgment of the rules. It substitutes the calculated metric values into the detection condition expression of the corresponding rule in the rule base for calculation and comparison. If the condition is met, it is determined that a code smell defined by that rule exists at the current node. 
Simultaneously, the severity judgment is also based on this process; it can be an independent derived rule based on the metric value range, or it can be directly embedded in the detection conditions, for example, by setting different threshold levels to correspond to different severity levels. Finally, for each determined code smell, the system aggregates the results and extracts information. The smell type is directly obtained from the identifier of the triggering rule. Location information is determined by extracting the source code location metadata (such as the starting line number, ending line number, column number, and filename) stored in the abstract syntax tree for the current target node. The extraction of contextual information revolves around the problematic node, which may include obtaining the types of its parent and sibling nodes to understand its structural context, analyzing the variables and functions defined within its scope, or extracting relevant code snippets to form a description of the environment in which the problematic node arose. At this point, the rule model has completed a full detection cycle from rule matching, metric calculation, condition judgment to result output.
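The rule-based detection cycle can be illustrated with the following sketch. It assumes a simplified AST node, approximates cyclomatic complexity as the number of decision points plus one, and uses illustrative thresholds; SmellRule, measure, detect, and the rule identifiers are hypothetical.

```typescript
// Minimal, hypothetical AST node; real parsers expose richer structures.
interface AstNode {
  type: string;
  line: number;
  params?: string[];
  children: AstNode[];
}

interface Metrics {
  statements: number;
  cyclomatic: number;
  params: number;
}

type Severity = "low" | "high";

interface SmellRule {
  smellType: string; // bad smell type identifier
  nodeType: string;  // AST node type the rule targets
  check: (m: Metrics) => Severity | null; // detection condition with severity tiers
}

interface Finding {
  smellType: string;
  line: number;
  severity: Severity;
}

const DECISION_TYPES = ["IfStatement", "ForStatement", "WhileStatement", "ConditionalExpression"];

// Compute metrics over the subtree rooted at `node`.
function measure(node: AstNode): Metrics {
  let statements = 0;
  let decisions = 0;
  const walk = (n: AstNode): void => {
    if (/Statement$/.test(n.type)) statements += 1;
    if (DECISION_TYPES.indexOf(n.type) !== -1) decisions += 1;
    n.children.forEach(walk);
  };
  node.children.forEach(walk);
  // Cyclomatic complexity approximated as decision points + 1.
  return { statements, cyclomatic: decisions + 1, params: (node.params ?? []).length };
}

// Traverse the tree and apply every rule whose node type matches.
function detect(root: AstNode, rules: SmellRule[]): Finding[] {
  const findings: Finding[] = [];
  const visit = (node: AstNode): void => {
    for (const rule of rules) {
      if (rule.nodeType !== node.type) continue;
      const severity = rule.check(measure(node));
      if (severity !== null) {
        findings.push({ smellType: rule.smellType, line: node.line, severity });
      }
    }
    node.children.forEach(visit);
  };
  visit(root);
  return findings;
}

// Illustrative thresholds only.
const RULES: SmellRule[] = [
  {
    smellType: "long-function",
    nodeType: "FunctionDeclaration",
    check: (m) => (m.statements > 6 ? "high" : m.statements > 3 ? "low" : null),
  },
  {
    smellType: "too-many-parameters",
    nodeType: "FunctionDeclaration",
    check: (m) => (m.params > 4 ? "low" : null),
  },
];

const stmt = (type: string, line: number, ...children: AstNode[]): AstNode => ({ type, line, children });
const fn: AstNode = {
  type: "FunctionDeclaration",
  line: 10,
  params: ["a", "b"],
  children: [
    stmt("IfStatement", 11, stmt("ReturnStatement", 12)),
    stmt("ForStatement", 13, stmt("ExpressionStatement", 14), stmt("IfStatement", 15, stmt("ExpressionStatement", 16))),
    stmt("ReturnStatement", 18),
  ],
};
const findings = detect({ type: "Program", line: 1, children: [fn] }, RULES);
console.log(findings);
```

Here the severity tiers are embedded directly in the detection condition, which is one of the two options mentioned above; a separate derived rule over the metric values would serve equally well.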
[0105] The bad smell rule base includes, but is not limited to, the following types:
[0106] Code duplication: Detects duplicate code snippets or logic;
[0107] Overly long functions / components: Detect functions, methods, or front-end components whose length or complexity exceeds a threshold;
[0108] Excessive nesting conditions: Detect conditional structures such as if-else statements or ternary operators with excessive nesting levels;
[0109] Overly large classes or modules: Detect classes or modules with overly concentrated responsibilities and excessive lines of code;
[0110] Data Clumps: Detect data items or parameter groups that frequently appear together;
[0111] Inappropriate comments: Detect redundant or inconsistent comments;
[0112] Front-end framework-specific "smells": such as unnecessary side effects and unoptimized list rendering in React, and improper use of lifecycle methods in Vue.
[0113] In some implementations, when the expert model is a machine learning model, the step of detecting the abstract syntax tree based on the expert model includes:
[0114] Multidimensional feature vectors are extracted from the subtrees of the abstract syntax tree, and the multidimensional feature vectors include structural features, semantic features and contextual features;
[0115] The multidimensional feature vector is input into the trained machine learning model to obtain at least one candidate bad smell type and the probability of each candidate bad smell type output by the machine learning model;
[0116] If the probability of any candidate bad smell type exceeds the probability threshold, that candidate bad smell type is determined as a bad smell type of the abstract syntax tree;
[0117] Determine the location and context information of the bad smell node based on the bad smell type and the corresponding subtree;
[0118] Calculate a severity score based on the metrics of the bad smell node, and assign a severity level according to preset thresholds.
[0119] Specifically, when the expert model is a machine learning model, the core of its detection process lies in transforming the abstract syntax tree representation of the code into machine-understandable features and performing probabilistic reasoning. The system does not directly process the entire abstract syntax tree, but rather uses subtrees with independent syntactic meaning as analysis units. For each subtree to be analyzed, the system needs to extract a multi-dimensional feature vector that comprehensively represents its characteristics. This feature vector typically covers three levels: structural features, used to describe the subtree's topological attributes, such as the tree's depth and width, the distribution ratio of different node types, and the number of control flow edges; semantic features, attempting to capture the code's naming habits and logical intent, for example, by analyzing identifier naming patterns, the presence and density of comments, and the call sequence of specific APIs or methods; and contextual features, focusing on the subtree's role in a larger code environment, such as the type of file it resides in, the module's import list, and the structural features of adjacent functions or components. The extraction of these features aims to convert unstructured tree data into standardized numerical vectors, providing input for model reasoning.
[0120] Subsequently, the extracted multidimensional feature vector is input into a pre-trained machine learning model. This model is typically trained on a large dataset of labeled code smells and is essentially a multi-class classifier. After receiving the feature vector, the model outputs a probability vector through its learned nonlinear mapping. Each dimension of this vector corresponds to a candidate code smell type, and its value represents the predicted probability that the current input subtree exhibits that type of code smell. The system presets a probability threshold to filter out low-confidence predictions. It traverses all candidate code smell types output by the model; if the probability of at least one type exceeds the threshold, the system identifies that type as a code smell type present in the current subtree; otherwise, it considers that no obvious code smell has been detected in the subtree.
[0121] Once the smell type is determined, the system backtracks to determine the specific location and context of the smell. Location information is obtained by locating, within the complete abstract syntax tree, the root node of the subtree from which the feature vector was generated. This allows the system to read the location data associated with that node, such as the source file path and the start and end line numbers. Context information is constructed around the smell node, which may include extracting the type of its direct parent node, obtaining summary information about its sibling nodes, or analyzing the list of symbols defined within its scope, thus forming a description of the syntactic environment in which the smell exists.
[0122] Finally, to assess the impact of the code smell, the system needs to calculate its severity. This calculation does not directly rely on the probability output of the machine learning model, but rather on more objective code metrics. For the identified code smell node, the system calculates a series of predefined metrics, such as the number of lines of code covered by the node, cyclomatic complexity, and nesting depth. These metric values are calculated using a pre-defined severity assessment function, which may be a simple weighted summation formula or a more complex regression model, ultimately outputting a quantified severity score. Based on pre-defined score interval thresholds, this score is mapped to different severity levels, such as high, medium, and low, thus completing the full transformation from probabilistic prediction to structured detection results with location, context, and severity assessment.
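The thresholding and severity steps can be sketched as follows. The model itself is not shown: its probability output is taken as given. The weights, cut-off values, and the names selectSmells, severityScore, and severityLevel are assumptions for illustration, and the weighted sum stands in for whatever severity assessment function an implementation actually uses.

```typescript
type Severity = "low" | "medium" | "high";

interface NodeMetrics {
  lines: number;
  cyclomatic: number;
  nestingDepth: number;
}

// Keep candidate smell types whose predicted probability exceeds the threshold.
function selectSmells(probabilities: Record<string, number>, threshold: number): string[] {
  return Object.keys(probabilities).filter((t) => (probabilities[t] ?? 0) > threshold);
}

// Weighted-sum severity score computed from code metrics, not from the
// model's probability. The weights are illustrative assumptions.
function severityScore(m: NodeMetrics): number {
  return 0.01 * m.lines + 0.05 * m.cyclomatic + 0.1 * m.nestingDepth;
}

// Map the score to a level using preset (here: assumed) interval thresholds.
function severityLevel(score: number): Severity {
  if (score >= 2.0) return "high";
  if (score >= 1.0) return "medium";
  return "low";
}

// Stand-in for the probability vector a trained classifier would output.
const modelOutput: Record<string, number> = {
  "long-function": 0.91,
  "duplicate-code": 0.12,
  "deep-nesting": 0.64,
};
const selected = selectSmells(modelOutput, 0.5);
const level = severityLevel(severityScore({ lines: 150, cyclomatic: 12, nestingDepth: 4 }));
console.log(selected, level);
```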
[0123] In some implementations, when the expert model is a specific front-end framework model, the step of detecting the abstract syntax tree based on the expert model includes:
[0124] The front-end framework type is identified based on the feature nodes in the abstract syntax tree;
[0125] Load the rule engine corresponding to the identified front-end framework type;
[0126] Based on the rule engine, pattern matching is performed on the abstract syntax tree to identify and output bad smell types;
[0127] Extract source code location and context information from the abstract syntax tree nodes that trigger rules in pattern matching;
[0128] Analyze the impact of the identified bad smell types on framework runtime performance, code maintainability, and functional risk, calculate quantitative scores, and classify severity levels.
[0129] Specifically, when the expert model is a specific front-end framework model, the initial step in the detection is the automatic identification of the framework type. The system accomplishes this task by analyzing the abstract syntax tree to determine if specific feature nodes exist. For example, it detects the presence of nodes representing JSX syntax to infer the React framework, or identifies template, script, and style block nodes specific to Vue single-file components. This identification process provides a prerequisite for subsequently loading targeted analysis components.
[0130] Once the framework type is determined, the system loads the corresponding dedicated rule engine. This engine is not a general-purpose code analyzer, but rather has a built-in deep knowledge base of common design patterns, best practices, and typical anti-patterns for that framework. For example, the rule engine for React might embed detection logic for Hook usage rules, component lifecycle, state management, and performance optimization patterns. The core operation of the engine is to perform a deep traversal and pattern matching on the input abstract syntax tree based on these predefined rule sets. The matching process aims to discover code structures that violate the framework's recommended practices or constitute known "bad smells." When a part of the abstract syntax tree successfully matches the pattern description of a rule, the bad smell type identifier associated with that rule is triggered and output.
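Framework identification and rule matching can be sketched as follows, using a simplified AST. The single React rule shown (a list item rendered inside a map() call without a key attribute) is an assumed example of a framework anti-pattern; the node type "VueTemplate", the rule identifier, and all function names are hypothetical.

```typescript
// Minimal, hypothetical AST node for illustration.
interface AstNode {
  type: string;
  name?: string;         // element or callee name, when applicable
  attributes?: string[]; // JSX attribute names
  line: number;
  children: AstNode[];
}

type Framework = "react" | "vue" | "unknown";

interface FrameworkFinding {
  smellType: string;
  line: number;
}

function hasNodeType(node: AstNode, type: string): boolean {
  return node.type === type || node.children.some((c) => hasNodeType(c, type));
}

// Identify the framework from feature nodes in the tree.
function identifyFramework(root: AstNode): Framework {
  if (hasNodeType(root, "JSXElement")) return "react";
  if (hasNodeType(root, "VueTemplate")) return "vue"; // assumed node type for a template block
  return "unknown";
}

// Assumed example rule: the outermost JSX element produced inside map() needs a key.
function findMissingKeys(node: AstNode, inMap: boolean, out: FrameworkFinding[]): void {
  let childInMap = inMap;
  if (node.type === "CallExpression" && node.name === "map") {
    childInMap = true;
  } else if (node.type === "JSXElement") {
    if (inMap && (node.attributes ?? []).indexOf("key") === -1) {
      out.push({ smellType: "react/list-item-missing-key", line: node.line });
    }
    childInMap = false; // nested elements of the same list item are not checked
  }
  for (const child of node.children) findMissingKeys(child, childInMap, out);
}

// Load the rules for the identified framework and run them.
function runFrameworkRules(root: AstNode): FrameworkFinding[] {
  const out: FrameworkFinding[] = [];
  if (identifyFramework(root) === "react") findMissingKeys(root, false, out);
  return out;
}

// Hand-built tree for a list whose items are rendered via map() without a key.
const listTree: AstNode = {
  type: "Program", line: 1, children: [{
    type: "JSXElement", name: "ul", attributes: [], line: 3, children: [{
      type: "CallExpression", name: "map", line: 4, children: [{
        type: "JSXElement", name: "li", attributes: ["className"], line: 5, children: [
          { type: "JSXElement", name: "span", attributes: [], line: 6, children: [] },
        ],
      }],
    }],
  }],
};
const frameworkFindings = runFrameworkRules(listTree);
console.log(identifyFramework(listTree), frameworkFindings);
```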
[0131] Once the smell type is identified, its specific location in the source code and its context information must be obtained. Location information can be extracted directly from the abstract syntax tree nodes that triggered the rule; these nodes typically carry line numbers, column numbers, and other positional metadata attached during parsing. Context information extraction goes deeper; it may include analyzing the smell node's hierarchical position in the framework component tree, tracing its associated Props or State data flow, and identifying the lifecycle hooks or reactive functions involved, thereby reconstructing the technical scenario in which the smell occurred.
[0132] Finally, the system performs an impact assessment on the identified code smells to determine their severity. This assessment is framework-aware; it goes beyond general code metrics and focuses on analyzing the impact of the smell on the framework's specific runtime behavior. Assessment dimensions primarily include: impact on framework runtime performance, such as whether it leads to unnecessary component re-rendering or inefficient dependency tracking; impact on code maintainability, such as whether it violates design principles advocated by the framework, such as unidirectional data flow, thereby increasing the difficulty of understanding and modification; and risks to functional stability, such as whether improper state updates might cause inconsistent UI states or memory leaks. Based on a predefined assessment model, the system transforms the above analysis into one or more quantitative scores, and then maps these scores to different severity levels (high, medium, low) according to preset threshold ranges, thus forming a framework-relevance judgment on the degree of impact of the smell.
[0133] In some implementations, generating a corresponding refactoring strategy based on the detection results includes:
[0134] Based on the bad smell type in the detection results, search the preset refactoring rule template library for the corresponding refactoring strategy template;
[0135] Instantiate the refactoring strategy template using the context information in the detection results to generate a first refactoring strategy.
[0136] Specifically, the process of generating corresponding refactoring strategies based on detection results begins with querying and adapting a pre-built knowledge base. The system maintains a structured pre-built refactoring rule template library, which uses code smell types as key indexes and predefines standardized repair scheme templates for various known code smells. Each refactoring strategy template not only specifies the macro-level category of the refactoring operation, such as "extracting functions" or "splitting components," but also defines the specific parameter slots required for the operation and the applicable preconditions and constraints. When a detection result containing a specific code smell type is received, the generation module first uses this code smell type as the query key to search the template library and locate one or more matching basic refactoring strategy templates. This invention pre-defines refactoring rules and templates for common code smells. For example:
[0137] For code duplication, generate refactoring strategies that extract common functions or components;
[0138] For excessively long functions / components, generate strategies such as splitting functions or components, extracting sub-components, or creating custom Hooks;
[0139] For excessively nested conditions, generate strategies such as converting nested conditions into guard clauses or replacing them with polymorphism;
[0140] For excessively large classes or modules, generate strategies such as splitting them into smaller classes, extracting interfaces, or using composition instead of inheritance;
[0141] For data clumps, generate strategies that encapsulate related data as objects or use parameter objects;
[0142] For inappropriate comments, generate strategies that delete redundant comments or improve the content of comments;
[0143] For smells specific to front-end frameworks, generate refactoring strategies that conform to the framework's best practices (e.g., adding React.memo optimization to React components, or adjusting lifecycle methods for Vue components).
[0144] After obtaining the base template, the generation process enters the instantiation phase. The goal of this phase is to transform the abstract template into an executable refactoring scheme for the current specific code context. The system extracts rich contextual information from the detection results, which may include the precise location of the code smell, related variable and function names, the module or component structure, and framework-specific environmental data. This contextual data is used to populate the parameter slots reserved in the refactoring strategy template. For example, for a template designed to "extract duplicate code as functions," the system uses the range of duplicate code blocks identified in the detection results to determine the code segment to be extracted and may automatically generate the name and parameter list of the new function based on code semantics or naming conventions. Through this parameter binding and logical deduction, a generalized template is concretized into a clear and actionable first refactoring strategy. This strategy details where, how, and what modifications should be made to the source code, thus providing a clear instruction blueprint for subsequent automated code transformation. This template-instantiation-based approach ensures that the generated refactoring strategy follows proven best practices while also fitting the specific realities of the current code.
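Template lookup and instantiation can be sketched as follows. The template library contents, slot names, and the types Detection, StrategyTemplate, and RefactoringStrategy are hypothetical; the sketch assumes that the detection context already carries a suggested name for each slot.

```typescript
interface Detection {
  smellType: string;
  file: string;
  startLine: number;
  endLine: number;
  context: Record<string, string>; // e.g. suggested names derived from the code
}

interface StrategyTemplate {
  operation: string; // macro-level category of the refactoring operation
  slots: string[];   // parameter slots that must be filled
  precondition: (d: Detection) => boolean;
}

interface RefactoringStrategy {
  operation: string;
  file: string;
  startLine: number;
  endLine: number;
  params: Record<string, string>;
}

// Hypothetical template library keyed by smell type.
const TEMPLATE_LIBRARY: Record<string, StrategyTemplate> = {
  "duplicate-code": {
    operation: "extract-function",
    slots: ["functionName"],
    precondition: (d) => d.endLine > d.startLine,
  },
  "data-clump": {
    operation: "introduce-parameter-object",
    slots: ["objectName"],
    precondition: () => true,
  },
};

// Look up the template for the smell type and bind its slots from the context.
function instantiate(d: Detection): RefactoringStrategy | null {
  const template = TEMPLATE_LIBRARY[d.smellType];
  if (template === undefined || !template.precondition(d)) return null;
  const params: Record<string, string> = {};
  for (const slot of template.slots) {
    const value = d.context[slot];
    if (value === undefined) return null; // a required slot cannot be filled
    params[slot] = value;
  }
  return {
    operation: template.operation,
    file: d.file,
    startLine: d.startLine,
    endLine: d.endLine,
    params,
  };
}

const firstStrategy = instantiate({
  smellType: "duplicate-code",
  file: "src/date.js",
  startLine: 12,
  endLine: 20,
  context: { functionName: "formatDate" },
});
const noTemplate = instantiate({
  smellType: "unknown-smell",
  file: "src/date.js",
  startLine: 1,
  endLine: 2,
  context: {},
});
console.log(firstStrategy, noTemplate);
```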
[0145] Specifically, this embodiment includes corresponding refactoring rule templates for common code smells. For example, when a "long function" smell is detected, the system can suggest splitting the function into multiple smaller functions or extracting a portion of its logic as an independent helper function; when "duplicate code" is detected, it suggests extracting the duplicated parts as common functions or components; when "too many nested conditions" are detected, it suggests using guard clauses to return early or using polymorphism to replace conditional statements; when "data clumps" are detected, it suggests encapsulating related parameters into an object for passing; and so on. These rule templates can be pre-written by senior developers and architects based on industry best practices to ensure that the generated refactoring strategies are effective.
[0146] In an alternative embodiment, this invention introduces a Large Language Model (LLM) to enhance the generation of refactoring strategies. Specifically, the context of a detected code smell (e.g., the code snippet containing the smell, a description of the smell type, etc.) is input as a prompt to a trained LLM, which then generates possible refactoring suggestions. Drawing on what it has learned from large amounts of code and refactoring knowledge, the LLM can provide context-appropriate and creative suggestions. For example, for a piece of complex front-end logic, the LLM might suggest adopting a certain design pattern or using new features provided by the framework. This invention preferably uses advanced LLMs such as GPT-4 and Claude, as they perform well at understanding code semantics and generating high-quality code modification suggestions.
[0147] To improve the accuracy and relevance of LLM recommendations, this invention incorporates expert tool detection results as supplementary information into the prompts. That is, when sending a code snippet to the LLM, it simultaneously informs the LLM of any code smells, their severity, and possible remediation directions. This information essentially provides the LLM with "clues," helping it generate more targeted refactoring solutions. For example, the prompt could be: "The following code has been found to contain an excessively long function smell (150 lines of code, cyclomatic complexity 12). Please provide refactoring suggestions to split this function." In this way, the LLM can combine the specific type and severity of the smell to provide a more tailored refactoring strategy, rather than just general code optimization suggestions.
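Prompt construction from a detection result can be sketched as follows. The sketch only assembles the prompt text and does not call any LLM API; the field names, the DetectionResult type, and the sample snippet are assumptions, and the wording is adapted from the example prompt above.

```typescript
interface DetectionResult {
  smellType: string;
  severity: string;
  metrics: Record<string, number>;
  hint: string;    // possible remediation direction from the expert model
  snippet: string; // code containing the smell
}

// Assemble a natural language prompt that carries the expert model's findings.
function buildPrompt(d: DetectionResult): string {
  const metricText = Object.keys(d.metrics)
    .map((k) => `${k}: ${d.metrics[k]}`)
    .join(", ");
  return [
    `The following code has been found to contain a "${d.smellType}" smell`,
    `(severity: ${d.severity}; ${metricText}).`,
    `Suggested direction: ${d.hint}`,
    "Please provide refactoring suggestions.",
    "",
    "Code:",
    d.snippet,
  ].join("\n");
}

const promptText = buildPrompt({
  smellType: "excessively long function",
  severity: "high",
  metrics: { "lines of code": 150, "cyclomatic complexity": 12 },
  hint: "split this function into smaller functions",
  snippet: "function renderChart() { /* body omitted */ }",
});
console.log(promptText);
```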
[0148] The refactoring strategy generated by LLM may be given in the form of natural language descriptions or code snippets. The system needs to translate this into executable refactoring operations. For example, if LLM suggests "extracting the logic for handling user input into a separate function handleInput()", the system can then locate the corresponding code block on the AST and execute the function extraction operation. In implementation, a set of refactoring operation templates can be predefined (e.g., the "extract function" operation requires determining the function name, parameters, and scope of code to be extracted), and then the parameters of these templates can be filled in according to the LLM's suggestions to generate specific refactoring steps.
[0149] It should be noted that LLM suggestions may sometimes be inaccurate or contain errors. Therefore, while utilizing LLM, this invention also retains rule-based deterministic refactoring strategies as an alternative. For LLM-generated solutions, the system can perform basic semantic checks and feasibility analyses, such as ensuring that extracted functions have clear responsibilities and that renaming does not lead to scope conflicts. If the LLM suggestion is not feasible, a fallback solution generated by built-in rules is used to ensure that at least one safe refactoring option is available.
[0150] In some embodiments, the method further includes:
[0151] The detection results are input into the large language model as prompts to obtain a second refactoring strategy;
[0152] The second refactoring strategy is verified. If the second refactoring strategy is not feasible, the first refactoring strategy is adopted as the final refactoring strategy;
[0153] If the second refactoring strategy is feasible, the second refactoring strategy is adopted as the final refactoring strategy.
[0154] Specifically, in addition to generating the first refactoring strategy, the method introduces a large language model as an auxiliary decision source. The system formats the structured detection results, which include the code smell type, location, severity, and context information, into natural language prompts that the large language model can understand. These prompts aim to clearly describe the discovered code problems and their technical background, thereby guiding the large language model to reason based on its extensive programming knowledge. Subsequently, these prompts are input into the large language model, which analyzes the prompt content and generates a suggestion that may contain novel refactoring ideas or more complex optimization solutions; this is the second refactoring strategy.
[0155] However, the output of the large language model has a certain degree of uncertainty, and its suggestions may contain logical contradictions, syntactic errors, or inconsistencies with project-specific constraints. Therefore, the system establishes a subsequent verification step. The verification process mainly evaluates the feasibility of the second refactoring strategy, including but not limited to checking whether it is syntactically correct, whether it would cause variable conflicts within the current code scope, whether it meets the version compatibility requirements of the front-end framework in use, and whether it fundamentally conflicts with project architectural constraints. If the verification results indicate that the second refactoring strategy is not feasible, the system initiates a fallback mechanism: it discards the model's suggestion and instead adopts the first refactoring strategy, previously generated from the predefined rule template, as the final refactoring strategy to be executed, thereby ensuring that at least one reliable and safe alternative exists.
[0156] Conversely, if the verification passes, confirming that the second refactoring strategy is feasible in the current context and does not pose significant risks, the system prioritizes this strategy as the final refactoring strategy. This process design achieves synergy and complementarity between rule-driven and AI-driven strategy generation: pre-built templates ensure the basic reliability and coverage of the solution, while large language models offer the potential to handle complex or rare scenarios and scenarios that require deep understanding of code semantics. Finally, through the decision of the verification gate, the system outputs a feasible and definitive refactoring strategy for subsequent code modification steps to execute.
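The verification gate can be sketched as follows. Only one feasibility check (identifier conflicts) is shown; the Strategy type, noNameConflicts, and chooseFinalStrategy are hypothetical names, and a real implementation would add the syntax, framework version, and architecture checks described above.

```typescript
interface Strategy {
  source: "template" | "llm";
  description: string;
  newIdentifiers: string[]; // names the refactoring would introduce
}

// A feasibility check returns null on success or a reason string on failure.
type FeasibilityCheck = (s: Strategy) => string | null;

// Example check: new identifiers must not collide with names already in scope.
function noNameConflicts(existingNames: string[]): FeasibilityCheck {
  return (s) => {
    for (const id of s.newIdentifiers) {
      if (existingNames.indexOf(id) !== -1) {
        return `identifier "${id}" already exists in scope`;
      }
    }
    return null;
  };
}

// Prefer the second (LLM) strategy when it passes every check;
// otherwise fall back to the first (template-based) strategy.
function chooseFinalStrategy(
  first: Strategy,
  second: Strategy | null,
  checks: FeasibilityCheck[],
): Strategy {
  if (second === null) return first;
  for (const check of checks) {
    if (check(second) !== null) return first;
  }
  return second;
}

const templateStrategy: Strategy = {
  source: "template",
  description: "Split UserDashboard into smaller components",
  newIdentifiers: [],
};
const llmStrategy: Strategy = {
  source: "llm",
  description: "Extract data fetching into a useUsers Hook",
  newIdentifiers: ["useUsers"],
};

const accepted = chooseFinalStrategy(templateStrategy, llmStrategy, [noNameConflicts(["UserDashboard"])]);
const rejected = chooseFinalStrategy(templateStrategy, llmStrategy, [noNameConflicts(["useUsers"])]);
console.log(accepted.source, rejected.source); // llm template
```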
[0157] The following specific examples illustrate this embodiment.
[0158] Suppose we have a React component file UserDashboard.js, which contains a large function UserDashboard. This function fetches data directly within the component using useEffect and renders a complex UI structure in the returned JSX, including <header></header>, the user list <UserList />, and <footer></footer>. This component has verbose code and takes on too many responsibilities, making it a typical "God component" smell.
[0159] The component is processed using the method of this invention:
[0160] Code analysis: UserDashboard.js is parsed into an AST, which contains Hooks such as useState and useEffect, as well as JSX element nodes.
[0161] Bad smell detection: The system detected that the UserDashboard function body is very long (e.g., more than 200 lines) and handles data fetching and complex rendering simultaneously in a single component, which is a "too large component" bad smell. At the same time, it was detected that useEffect directly calls the API to fetch data, which may cause performance issues (a bad smell specific to the React framework, i.e., fetching data directly during rendering without optimization).
[0162] Refactoring Strategy Generation: To address the overly large component, the system's rules recommend extracting the data fetching logic into a custom Hook and breaking the UI down into smaller components. Likewise, after receiving the prompt "large component, containing data fetching and complex UI," the LLM also recommends moving the data fetching logic to a custom Hook and extracting the user list rendering into a separate <UserList> component (in fact, the original code already partially does this, but it can be further optimized). Combining the rules and LLM recommendations, the final refactoring strategy is: extract the data retrieval logic into the useUsers Hook, keep <UserList> as a standalone component, and ensure that useEffect has the correct dependencies.
[0163] Code refactoring execution: The system performs the following operations on the AST: remove useEffect and its related state logic from the UserDashboard function; create a new custom Hook, useUsers, to encapsulate data retrieval, and call this Hook in UserDashboard to obtain the data; confirm that <UserList> is already a child component and requires no modification; and simplify the UserDashboard component so that it contains only layout and child component calls. After the refactored AST is converted back to code, the UserDashboard component is concise and retains only the necessary structure.
[0164] Testing and verification: The project tests are run to ensure that user data is still loaded and displayed correctly and that the refactoring has not introduced any component rendering errors. Because the useUsers Hook is newly introduced, it can be unit tested separately to verify the data retrieval logic. Static analysis reveals no new issues, and the previously detected code smells have been eliminated.
[0165] Results: The report shows that the line count of the UserDashboard component has been reduced from over 200 to fewer than 100 and that its cyclomatic complexity has been significantly reduced; the detected "oversized component" smell has been fixed; and the suggested refactoring (extracting Hooks and child components) has been successfully applied. The developers review the report and the diff, confirm that the changes meet expectations, and then commit the changes to the codebase.
[0166] Through the above process, the previously bloated UserDashboard component is successfully refactored, making the code more modular and maintainable. This example illustrates how the method of this invention can be applied in a real-world project.
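The Hook extraction described above can be sketched on a deliberately simplified function model in which each statement is tagged with a kind and its source text is kept opaque. The Stmt and FunctionNode types and the extractHook function are hypothetical illustrations; an actual implementation would operate on real AST nodes and would also have to handle imports, dependency arrays, and naming conflicts.

```typescript
// Simplified statement model (assumption): source text is kept as an opaque string.
interface Stmt {
  kind: "state" | "effect" | "hookCall" | "return";
  text: string;
}

interface FunctionNode {
  name: string;
  body: Stmt[];
}

// Move state and effect statements out of a component into a new custom Hook,
// and replace them in the component with a single call to that Hook.
// The input component is not mutated.
function extractHook(
  component: FunctionNode,
  hookName: string,
  returnedValue: string,
): { component: FunctionNode; hook: FunctionNode } {
  const isDataLogic = (s: Stmt): boolean => s.kind === "state" || s.kind === "effect";
  const moved = component.body.filter(isDataLogic);
  const kept = component.body.filter((s) => !isDataLogic(s));
  return {
    hook: {
      name: hookName,
      body: [...moved, { kind: "return", text: `return ${returnedValue};` }],
    },
    component: {
      name: component.name,
      body: [
        { kind: "hookCall", text: `const ${returnedValue} = ${hookName}();` },
        ...kept,
      ],
    },
  };
}

// Toy stand-in for the UserDashboard function body.
const originalDashboard: FunctionNode = {
  name: "UserDashboard",
  body: [
    { kind: "state", text: "const [users, setUsers] = useState([]);" },
    { kind: "effect", text: "useEffect(() => { /* fetch users, then setUsers */ }, []);" },
    { kind: "return", text: "return <div><header /><UserList users={users} /><footer /></div>;" },
  ],
};

const extraction = extractHook(originalDashboard, "useUsers", "users");
```

After the call, extraction.hook holds the state and effect statements plus a return of users, while extraction.component contains only the Hook call and the JSX return.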
[0167] Example 2
[0168] Referring to Figure 2, this embodiment provides a front-end code smell refactoring device, comprising:
[0169] a parsing module 201, configured to parse the front-end source code and generate an abstract syntax tree;
[0170] a detection module 202, configured to perform code smell detection on the abstract syntax tree and output detection results including the smell type, location, severity, and context information;
[0171] a strategy module 203, configured to generate a corresponding refactoring strategy based on the detection results, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
[0172] a modification module 204, configured to modify the abstract syntax tree according to the refactoring strategy and generate refactored code; and
[0173] a report module 205, configured to perform automated testing and smell elimination verification on the refactored code, and to output a refactoring report that includes a before-and-after comparison, a list of smells, and verification results.
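The five modules listed above might be wired together as in the following sketch. The RefactoringModules interface, the runPipeline function, the early exit when no smell is found, and the toy stub modules are all assumptions for illustration; the text does not prescribe these signatures, and the stub "refactoring" (keeping at most three lines) merely stands in for a real transformation.

```typescript
interface Detection {
  smellType: string;
  location: string;
  severity: string;
  context: string;
}

interface RefactoringReport {
  before: string;
  after: string;
  smells: Detection[];
  verified: boolean;
}

// One member per module 201 to 205. The AST and strategy types are left
// generic because the text does not fix their representation.
interface RefactoringModules<TAst, TStrategy> {
  parse(source: string): TAst;                              // parsing module 201
  detect(ast: TAst): Detection[];                           // detection module 202
  plan(detections: Detection[]): TStrategy;                 // strategy module 203
  modify(ast: TAst, strategy: TStrategy): string;           // modification module 204
  report(before: string, after: string, smells: Detection[]): RefactoringReport; // report module 205
}

function runPipeline<TAst, TStrategy>(
  modules: RefactoringModules<TAst, TStrategy>,
  source: string,
): RefactoringReport {
  const ast = modules.parse(source);
  const smells = modules.detect(ast);
  if (smells.length === 0) {
    // Assumption: nothing to refactor, so the code is reported unchanged.
    return modules.report(source, source, smells);
  }
  const strategy = modules.plan(smells);
  const refactored = modules.modify(ast, strategy);
  return modules.report(source, refactored, smells);
}

// Toy stubs: the "AST" is a list of lines and the "strategy" is a line limit.
const stubModules: RefactoringModules<string[], number> = {
  parse: (source) => source.split("\n"),
  detect: (lines) =>
    lines.length > 3
      ? [{ smellType: "too-long", location: "line 1", severity: "medium", context: `${lines.length} lines` }]
      : [],
  plan: () => 3,
  modify: (lines, maxLines) => lines.slice(0, maxLines).join("\n"),
  report: (before, after, smells) => ({
    before,
    after,
    smells,
    verified: after.split("\n").length <= 3, // stand-in for tests plus re-detection
  }),
};

const pipelineReport = runPipeline(stubModules, "a\nb\nc\nd\ne");
```

Keeping each module behind a narrow interface lets the parser, detectors, and strategy generator be replaced independently, which matches the one-to-one correspondence between modules and method steps.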
[0174] It should be noted that each module and unit in the front-end code smell refactoring device in this embodiment corresponds one-to-one with each step in the front-end code smell refactoring method in the aforementioned embodiment. Therefore, the specific implementation of this embodiment can refer to the implementation of the aforementioned front-end code smell refactoring method, and will not be repeated here.
[0175] Example 3
[0176] Referring to Figure 3, this embodiment provides an electronic device, including at least one processor 301 and a memory 302. Optionally, the device further includes a communication component 303. The processor 301, the memory 302, and the communication component 303 are connected via a bus 304.
[0177] In a specific implementation, the at least one processor 301 executes computer-executable instructions stored in the memory 302, causing the at least one processor 301 to perform the above-described method.
[0178] The specific implementation process of processor 301 can be found in the above method embodiments, and its implementation principle and technical effect are similar. It will not be repeated here.
[0179] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.
[0180] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.
[0181] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.
[0182] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.
[0183] This application also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described method.
[0184] The aforementioned readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The readable storage medium can be any available medium accessible to a general-purpose or special-purpose computer.
[0185] An exemplary readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the readable storage medium. Of course, the readable storage medium can also be a component of the processor. The processor and the readable storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the processor and the readable storage medium can exist as discrete components in the device.
[0186] The division of units is merely a logical functional division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or other forms.
[0187] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0188] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0189] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0190] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.
[0191] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for refactoring front-end code smells, characterized in that it comprises:
parsing front-end source code to generate an abstract syntax tree;
performing code smell detection on the abstract syntax tree and outputting detection results containing the smell type, location, severity, and context information;
generating a corresponding refactoring strategy based on the detection results, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
modifying the abstract syntax tree according to the refactoring strategy to generate refactored code; and
performing automated testing and code smell elimination verification on the refactored code, and outputting a refactoring report that includes a comparison of the code before and after refactoring, a list of code smells, and verification results.
2. The method according to claim 1, characterized in that the step of performing code smell detection on the abstract syntax tree and outputting detection results containing the smell type, location, severity, and context information comprises:
extracting static code features from the abstract syntax tree, the static code features including at least one of node type distribution, code metrics, and framework identifier features;
inputting the extracted static code features into a gating network, the gating network outputting a matching expert model, wherein the expert model includes a rule model, a machine learning model, and a specific front-end framework model; and
detecting the abstract syntax tree based on the expert model and, if the detection result indicates that at least one code smell exists, determining the smell type, location, severity, and context information of each code smell.
3. The method according to claim 2, characterized in that, when the expert model is a rule model, the step of detecting the abstract syntax tree based on the expert model comprises:
loading a code smell rule base that matches the code type of the abstract syntax tree, wherein each rule in the code smell rule base includes a smell type identifier, detection conditions, and the associated abstract syntax tree node type;
traversing the abstract syntax tree and calculating metrics, including lines of code, number of statements, cyclomatic complexity, nesting depth, number of parameters, and frequency of variable usage; and
determining, based on the metrics and the detection conditions in the rules, whether a code smell is detected and its severity, and, if a code smell is detected, determining the type, location, and context information of the code smell.
4. The method according to claim 2, characterized in that, when the expert model is a machine learning model, the step of detecting the abstract syntax tree based on the expert model comprises:
extracting multidimensional feature vectors from the subtrees of the abstract syntax tree, the multidimensional feature vectors including structural features, semantic features, and contextual features;
inputting the multidimensional feature vectors into the trained machine learning model to obtain at least one candidate smell type and the probability of each candidate smell type output by the machine learning model;
if the probability of any candidate smell type exceeds a probability threshold, determining that candidate smell type as a smell type of the abstract syntax tree;
determining the location and context information of the smell node corresponding to the code smell based on the smell type and the corresponding subtree; and
calculating a severity score based on the metrics of the smell node, and classifying the severity level according to preset thresholds.
5. The method according to claim 1, characterized in that, when the expert model is a specific front-end framework model, the step of detecting the abstract syntax tree based on the expert model comprises:
identifying the front-end framework type based on feature nodes in the abstract syntax tree;
loading the rule engine corresponding to the identified front-end framework type;
performing pattern matching on the abstract syntax tree based on the rule engine to identify and output smell types;
extracting source code location and context information from the abstract syntax tree nodes that trigger rules during pattern matching; and
analyzing the impact of the smell types on framework runtime performance, code maintainability, and functional risk, calculating quantitative scores, and classifying severity levels.
6. The method according to claim 1, characterized in that the step of generating a corresponding refactoring strategy based on the detection results comprises:
searching a preset refactoring rule template library for the refactoring strategy template corresponding to the smell type in the detection results; and
instantiating the refactoring strategy template using the context information in the detection results to generate a first refactoring strategy.
7. The method according to claim 6, characterized in that the method further comprises:
inputting the detection results as a prompt into the large language model to obtain a second refactoring strategy; and
verifying the second refactoring strategy, wherein, if the second refactoring strategy is not feasible, the first refactoring strategy is adopted as the final refactoring strategy, and, if the second refactoring strategy is feasible, the second refactoring strategy is adopted as the final refactoring strategy.
8. A front-end code smell refactoring device, characterized in that it comprises:
a parsing module, configured to parse front-end source code and generate an abstract syntax tree;
a detection module, configured to perform code smell detection on the abstract syntax tree and output detection results including the smell type, location, severity, and context information;
a strategy module, configured to generate a corresponding refactoring strategy based on the detection results, wherein the refactoring strategy is generated collaboratively by a preset refactoring rule template and a large language model;
a modification module, configured to modify the abstract syntax tree according to the refactoring strategy and generate refactored code; and
a reporting module, configured to perform automated testing and code smell elimination verification on the refactored code, and to output a refactoring report that includes a comparison before and after refactoring, a list of code smells, and verification results.
9. An electronic device, characterized in that it comprises: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the computer program instructions, when executed by the processor, implement the method according to any one of claims 1-7.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1-7.