A mutation-based automatic test case generation system
By using an automated test case generation system for XML and JSON formats, and employing randomization and semantic-based mutation strategies, highly versatile test cases are generated without damaging the main structure. This solves the problems of insufficient versatility and easily corrupted formats in existing technologies, thereby improving testing efficiency and accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING INST OF TECH
- Filing Date
- 2022-12-08
- Publication Date
- 2026-06-23
AI Technical Summary
Existing mutation-based automatic test case generation technologies cannot produce versatile test cases without disrupting the original test case structure. Furthermore, mutation operations can easily corrupt test case formats, which increases the number of unrecognizable test cases and reduces the efficiency and accuracy of software testing.
A mutation-based automatic test case generation system was designed for XML and JSON formats. Through test case parsing, test case structure analysis, content set construction, and test case generation modules, new test cases are generated using randomization and semantic-based mutation strategies. This ensures that the main structure of the test cases is not damaged, while enhancing the relevance and effectiveness of the generated test cases.
The generated test cases are targeted and effective: they cover the various behavioral branches of the software, increase the diversity and richness of the test cases, resolve inconsistencies between test case generation tasks from different sources, and improve the efficiency and accuracy of software testing.
Figure CN116010245B_ABST
Description
Technical Field
[0001] This invention belongs to the field of automated software testing, and in particular relates to a mutation-based automatic test case generation system.
Background Technology
[0002] Software testing is the process of operating a program under specified conditions and comparing the actual output with the expected output in order to discover program errors, measure software quality, and assess whether the software meets its design requirements. In the field of software testing, automatic test case generation is an important research area, especially for black-box testing. Black-box testing treats the software under test as an opaque "black box": data is fed in and the output is observed to check whether the internal functions of the software behave normally.
[0003] With the widespread application of software systems across various fields, the correctness and reliability of software are becoming increasingly important. Currently, software correctness and reliability are primarily ensured through software testing. The software testing process requires writing a large number of test cases to cover as many behavioral branches of the software as possible. As the scale of software systems increases, the manpower required for manually writing test cases becomes increasingly demanding and costly. Therefore, methods for obtaining effective test cases through automation are constantly emerging. One such method is mutation-based automatic test case generation technology, which generates a large number of test cases by mutating several initial test cases. However, existing mutation-based automatic test case generation technologies suffer from problems such as disrupting the original test case structure and insufficient generality in generating test cases for specific objectives.
[0004] Specifically, existing automated test case generation methods are typically applicable only to test case generation tasks for a specific programming language or framework. As software systems become increasingly powerful and component relationships more complex, a single test case generation method often suits only a portion of the testing objectives involved in the software testing process and is difficult to migrate to new application environments, so it lacks versatility. Furthermore, existing mutation-based automated test case generation methods are insufficiently aware of test case formats, and mutation operations easily corrupt those formats, resulting in numerous unrecognizable test cases. These unrecognizable test cases are invalid: they neither pass the validation of the program under test nor trigger new program paths, which reduces the efficiency and accuracy of software testing.
Summary of the Invention
[0005] To address the aforementioned issues, this invention provides a mutation-based automatic test case generation system. It designs mutation-based test case generation methods for two common data exchange formats, XML and JSON, solving the problem of how to generate highly versatile test cases without disrupting the original test case structure.
[0006] A mutation-based automatic test case generation system includes a test case parsing module, a test case structure analysis module, a content set construction module, and a test case generation module;
[0007] The test case parsing module is used to generate a test case tree corresponding to each seed test case based on the subordinate and parallel relationships of the nodes in each seed test case. The seed test cases are test cases in XML or JSON format.
[0008] The test case structure analysis module is used to traverse each test case tree to obtain the composition structure and field type information of each seed test case;
[0009] The content set construction module is used to traverse the test case tree to obtain all values that appear in each corresponding seed test case, and divide all values of all seed test cases into multiple sets according to their respective field types;
[0010] The test case generation module is used to generate new test cases sequentially based on each test case tree without changing the non-leaf nodes of the test case tree.
[0011] Furthermore, the test case generation module is used to generate new test cases sequentially based on each test case tree without changing the non-leaf nodes of the test case tree. Specifically, it selects a corresponding mutation strategy to mutate the leaf nodes and leaf node values of the current test case tree according to the characteristics of the composition structure and field type information corresponding to the current test case tree. The mutation of the values is based on the values of the leaf nodes or on randomly selected values from each content set.
[0012] Furthermore, the test case parsing module parses the seed test cases to generate a test case tree. When the seed test case is in XML format, the information mapped to each node in the test case tree includes the node name, node value, and attribute value of each node in the XML seed test case; when the seed test case is in JSON format, the information mapped to each node in the test case tree includes the key and value of each node in the JSON seed test case, where the key is the attribute name string and the value is the attribute value.
[0013] Furthermore, the workflow of the test case structure analysis module is divided into two steps: field type analysis and composition structure analysis.
[0014] First, field type analysis is performed. This step involves traversing the test case tree and, for all nodes with values, attempting to parse the node values as specific types or special values. The specific types and special values to be attempted include integer types, floating-point types, string types, array types, Base64 encoded string types, boolean types, values of 0, values of 1, and values of -1.
[0015] Next, a structural analysis is performed. This step involves traversing the test case tree starting from the root node and counting each node in the test case tree. The count includes: the total number of child nodes contained in the node, and the category count of the different types of child nodes contained in the node. The types of child nodes include integer type, floating-point type, and string type. The category count is obtained by counting the number of child nodes of different types, that is, counting the number of child nodes with a specific type of value, and using it as the category count value.
[0016] Furthermore, the content set construction module reads the test case tree to construct the test case content sets. The workflow of the content set construction module is divided into two steps: test case content collection, and test case content set creation and updating.
[0017] In the test case content collection step, the test case tree is traversed from the root node to collect the values of all nodes with values that appear in the test case tree, and the values of the nodes are used as content. Furthermore, the content is collected according to the type of content, namely integer type, floating-point type, and string type.
[0018] In the test case content set creation and update step, the content of the corresponding type collected in the test case content collection step is added to the corresponding test case content set. The test case content sets fall into three categories: integer data set, floating-point data set, and string data set. The string data set is further divided into m subsets with a length step of 5, where m>4.
[0019] Furthermore, the workflow of the test case generation module is divided into two steps: mutation strategy construction and mutation operation execution. The mutation strategy construction includes randomized mutation strategy construction. Specifically, under the randomized mutation strategy construction, a randomly selected mutation strategy is executed with a random probability during the traversal of the test case tree. The mutation strategies used in the randomized mutation strategy construction include node insertion and replacement, splicing, and addition.
[0020] The mutation operation execution step executes the mutation strategy constructed in the randomization-based mutation strategy construction step. Starting from the root node, the test case tree is traversed. When the leaf node of the test case tree is accessed, the leaf node of the current test case tree is mutated according to the mutation strategy randomly selected in the randomization-based mutation strategy construction.
[0021] Furthermore, the workflow of the test case generation module is divided into two steps: mutation strategy construction and mutation operation execution, and the mutation strategy construction includes mutation strategy construction based on type inference.
[0022] The mutation strategy based on type inference is constructed from the results of test case structure analysis, adding semantic mutation strategies for nodes to the generated strategy set. Specifically, the mutation strategies used in the type inference-based mutation strategy construction step are each associated with one or more specific types, which are the field types inferred by the test case structure analysis module. The mutation strategies based on type inference include bitwise flip, arithmetic operation, arithmetic special value, and node copying. The bitwise flip strategy is applicable to string types, the arithmetic operation strategy is applicable to integer and floating-point types, the arithmetic special value strategy is applicable to integer and floating-point types, and the node copying strategy is applicable to array types.
[0023] The mutation operation execution step executes the mutation strategy constructed in the type inference-based mutation strategy construction step. Starting from the root node, the test case tree is traversed. When a leaf node of the test case tree is accessed, the mutation strategy that matches the composition structure and field type information of the current leaf node is selected and executed to mutate the value of that leaf node.
[0024] Furthermore, the test case generation module combines type information with mutation by employing both randomization-based and semantic-based mutation, while the randomized mutation strategy enhances versatility.
[0025] Furthermore, the mutation strategy adopted by the system mutates at both the test case tree node level and the test case tree node value level. The mutation strategy constructed by the mutation strategy construction method conforms to the original semantics and will not damage the main structure of the test cases. That is, the connection relationship of nodes with child nodes in the tree structure will not change, only the value of the leaf nodes will change.
[0026] Beneficial effects:
[0027] 1. This invention provides a mutation-based automatic test case generation system. First, the initial test cases are parsed. Then, based on the parsed test case node data, the corresponding composition structure and field type information are obtained. Next, based on the corresponding child node data type, targeted mutation operations with correct semantics are applied to generate new test cases. This solves the problem of mutation operations damaging the test case format and failing to generate valid test cases. Simultaneously, this invention uses randomization and semantic-based mutation, employing mutation strategies with corresponding semantics based on data type to enhance the relevance and effectiveness of the generated test cases. Furthermore, the randomization mutation strategy enhances the diversity and richness of the generated test cases. Additionally, multiple mutation strategies designed specifically for XML and JSON formats are used to minimize the differences in test case generation tasks from different sources, enhancing the universality of the test case generation method.
[0028] 2. This invention provides a mutation-based automatic test case generation system. Based on the characteristics of the composition structure and field type information of the test case tree, the system selects the corresponding mutation strategy to perform mutation on the test cases. That is, this invention constructs mutation strategies with correct semantics for different data types and executes the mutation strategies probabilistically based on random numbers. These mutation strategies can modify test cases on a large scale without damaging the test case format.
[0029] 3. This invention provides a mutation-based automatic test case generation system. Based on a randomized mutation strategy, mutation strategies are randomly selected. These mutation strategies can modify test cases on a large scale, thereby enhancing the diversity and richness of the test cases generated during the mutation operation. At the same time, the type inference-based mutation strategy adds semantic mutation strategies for nodes to the generated strategy set based on the results of test case structure analysis, thereby enhancing the targeting and effectiveness of the test case mutation operation.
[0030] 4. This invention provides a mutation-based automatic test case generation system. The test case generation module combines type information and mutation by employing randomization and semantic-based mutation to enhance the relevance and effectiveness of the generated test cases. At the same time, the randomization mutation strategy enhances the diversity and richness of the generated test cases, thereby enabling this invention to eliminate the differences between test cases from different sources.
[0031] 5. This invention provides a mutation-based automatic test case generation system. The mutation strategy adopted will not damage the main structure of the test cases. That is, the connection relationship of nodes with child nodes in the tree structure will not change. Only the value of the leaf nodes will be changed.
Attached Figure Description
[0032] Figure 1: Overall framework of the mutation-based automatic test case generation system;
[0033] Figure 2: Workflow of the test case parsing module;
[0034] Figure 3: Workflow of the test case structure analysis module;
[0035] Figure 4: Workflow of the content set construction module;
[0036] Figure 5: Workflow of the test case generation module.
Detailed Implementation
[0037] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
[0038] First, the terminology is explained:
[0039] Software testing: Software testing is the process of verifying the correctness, integrity, security, and quality of software by reviewing or comparing the actual output with the expected output. In software testing, testers operate the program under specified conditions to discover program errors, measure software quality, and evaluate whether it meets design requirements.
[0040] Test case: A test case is a set of test inputs, execution conditions, and expected results designed for a specific goal, used to verify whether a particular software requirement is met.
[0041] Automatic test case generation: Automatic test case generation is a crucial step in automated testing, involving the automatic generation of new test cases through programs or rules. Automatic test case generation techniques are mainly divided into rule-based generation methods and mutation-based generation methods. Rule-based generation methods model the specific program under test and construct test cases according to the model or syntax rules; mutation-based generation methods generate new test cases by modifying well-formatted seed inputs. Rule-based generation methods can generate inputs that easily pass integrity or syntax checks; mutation-based generation methods can effectively test programs with compact and unstructured data formats.
[0042] Seed test cases: Seed test cases are the test cases that undergo mutation during the test case mutation process. They are the basis for new test cases. In this invention, seed test cases include the manually provided initial test cases (used in the first round of generation) and several test cases randomly selected from the set of new test cases obtained during mutation (used in the second and subsequent rounds).
[0043] Test case mutation: Test case mutation refers to the process of generating new test cases by performing mutation operations on seed test cases in mutation-based test case generation. This process applies random or specified mutation operations to the selected seed test cases, such as flipping, copying, replacing, and adding / deleting / modifying, in order to obtain new test cases that can trigger new paths and improve program test coverage. In mutation-based test case generation methods, the choice of test case mutation strategy determines the quality of the resulting new test case set. A high-quality test case set has strong path coverage of the tested program and can detect more program defects and vulnerabilities.
[0044] Black-box testing: Black-box testing treats the software under test as an opaque "black box": data is fed in and the output is observed to check whether the internal functions of the software behave normally.
[0045] XML format: XML stands for Extensible Markup Language, a structured markup language commonly used to describe and store data. It is one of the standard formats for describing the content and structure of program data and is widely used in various configuration documents, data documents, and network interfaces. An XML file consists of several nested individual XML fragments, each of which is called an XML node.
[0046] JSON format: JSON stands for JavaScript Object Notation, a lightweight data-interchange format. Built on a subset of JavaScript, JSON uses a text format completely independent of programming languages to store and represent data, featuring a concise and clear hierarchical structure. JSON is easy for people to read and write and easy for machines to parse and generate, which helps network transmission efficiency. It is widely used in various configuration documents, data documents, and network interfaces. A JSON object consists of several key-value pairs, each key-value pair being a JSON node, where the key is a string and the value is a string, number, boolean, null, object, or array.
[0047] Test case tree: The test case tree refers to the unified tree-like data structure generated after parsing XML and JSON format test cases in this invention. This structure is a logical tree-like representation used internally by the program and can represent test case information.
[0048] Content set: In this invention, the content set refers to the set of values (i.e., content) of all leaf nodes with values in the test case tree.
[0049] This invention provides a mutation-based automatic test case generation system for XML and JSON format files. As shown in Figure 1, it includes a test case parsing module, a test case structure analysis module, a content set construction module, and a test case generation module;
[0050] The test case parsing module is used to generate test case trees corresponding to each seed test case based on the subordinate and parallel relationships in each seed test case. The seed test cases are in XML or JSON format. When the seed test case is in XML format, the information mapped to each leaf node of the test case tree includes the node name, node value, and attribute value of each XML node in the seed test case. When the seed test case is in JSON format, the information mapped to each leaf node of the test case tree includes the key and value of each JSON node in the seed test case, where the key is the attribute name string and the value is the attribute value.
[0051] The test case structure analysis module is used to traverse each test case tree to obtain the composition structure and field type information of all values for each seed test case. The composition structure information refers to the overall structure of the test case tree, including the field structure and field inclusion relationships; the field type information refers to the data type of the values of the leaf nodes in the test case tree, such as integer or floating-point numbers.
[0052] The content set construction module is used to traverse each test case tree to obtain all values appearing in each corresponding seed test case, and divide all values of all seed test cases into multiple sets according to their respective field types.
[0053] The test case generation module is used to generate new test cases sequentially based on each test case tree without changing the non-leaf nodes of each test case tree. Specifically, it selects a corresponding mutation strategy to mutate the values of the leaf nodes of the current test case tree according to the characteristics of the composition structure and field type information corresponding to the current test case tree. The mutation of the values is based on the values of the leaf nodes, or on randomly selected values from the content set of the corresponding type.
[0054] Furthermore, the test case generation module reads the analysis results of the test case structure analysis module and the content set construction module. Based on these results, it selects a semantically appropriate mutation strategy for each node of the test case. Then, according to the constructed generation strategy, it performs mutation operations on the leaf nodes of the test case tree. After execution is completed, the mutated test case tree is restored to XML or JSON format, thereby generating new test cases.
[0055] Furthermore, using the feedback provided by the test case structure analysis module and the content set construction module, the test case mutation operation can automatically identify the syntax markers and high-level semantics of the test case tree, and can detect whether the content constitutes a specific valid syntax. The test case generation module can therefore perform targeted mutation operations based on the content and structure of the test cases.
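The leaf-only mutation and the restoration to text described above can be illustrated with a minimal Python sketch for the JSON case. It is not the patented implementation: the names `mutate_leaves`, `shape`, and `demo_mutator` are hypothetical, arrays are treated as interior nodes for simplicity, and the per-leaf mutation probability is an assumed parameter.

```python
import json
import random


def mutate_leaves(node, mutate_value, rng, probability=0.5):
    """Return a copy of a parsed JSON tree in which only leaf values may change."""
    if isinstance(node, dict):
        # Interior node: keys and nesting are copied unchanged.
        return {key: mutate_leaves(child, mutate_value, rng, probability)
                for key, child in node.items()}
    if isinstance(node, list):
        # Assumption: arrays are treated as interior nodes in this sketch.
        return [mutate_leaves(child, mutate_value, rng, probability)
                for child in node]
    # Leaf node: mutate its value with the given probability.
    if rng.random() < probability:
        return mutate_value(node, rng)
    return node


def shape(node):
    """Skeleton of a tree: keys and nesting with all leaf values erased."""
    if isinstance(node, dict):
        return {key: shape(child) for key, child in node.items()}
    if isinstance(node, list):
        return [shape(child) for child in node]
    return None


def demo_mutator(value, rng):
    """Hypothetical stand-in for a type-appropriate mutation strategy."""
    if isinstance(value, bool):
        return not value
    if isinstance(value, (int, float)):
        return value + rng.choice([-1, 1])
    if isinstance(value, str):
        return value + "x"
    return value


seed_text = '{"user": {"id": 7, "name": "alice"}, "tags": ["a", "b"]}'
seed_tree = json.loads(seed_text)
new_tree = mutate_leaves(seed_tree, demo_mutator, random.Random(0), probability=1.0)
new_text = json.dumps(new_tree)  # restore the mutated tree to JSON text
```

Comparing `shape(seed_tree)` with `shape(new_tree)` gives a quick check that the non-leaf structure survived the mutation.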
[0056] Therefore, the entire process of automatic test case generation begins with several initial input test cases provided manually. These test cases are then parsed by the test case parsing module. Next, the test case structure analysis module and the content set construction module work in parallel to analyze the overall structure of the test cases. The analyzed information is then passed to the test case generation module to construct and execute a mutation strategy, generating new test cases. The system employs a cyclical structure. After performing one round of mutation on all initial test cases, several test cases are randomly selected from the newly generated test cases as seed test cases and sent back to the test case parsing module, marking the start of a new round of test case generation. Once the number of test cases generated by the system reaches the user-specified number, the test case generation loop ends, and the system returns the newly generated test cases to the user.
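The cyclical generation process above can be sketched roughly as follows. This is an illustrative outline only: `generate_test_cases`, `mutate_once`, and `seeds_per_round` are hypothetical names, and drawing the next seeds from the most recent round's output is one reading of "randomly selected from the newly generated test cases".

```python
import random


def generate_test_cases(initial_seeds, mutate_once, target_count,
                        seeds_per_round=2, rng=None):
    """Run mutation rounds until target_count new test cases exist."""
    if not initial_seeds:
        raise ValueError("at least one initial seed test case is required")
    rng = rng or random.Random()
    seeds = list(initial_seeds)   # round 1 uses the manually provided seeds
    generated = []
    while len(generated) < target_count:
        # One round: every current seed is parsed, analyzed, and mutated once.
        round_output = [mutate_once(seed, rng) for seed in seeds]
        generated.extend(round_output)
        # Assumption: later rounds draw seeds from this round's new test cases.
        seeds = rng.sample(round_output, min(seeds_per_round, len(round_output)))
    return generated[:target_count]
```

Here `mutate_once` stands in for the whole parse, analyze, and mutate pipeline applied to a single seed.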
[0057] The following section provides a detailed description of each module of the automatic test case generation system, with reference to the accompanying diagrams.
[0058] The workflow of the test case parsing module is shown in Figure 2. The test case parsing module parses seed test cases, converting test cases in text form (XML or JSON) into a test case tree. This is the foundation for solving the problem of weak generality in automatic test case generation.
[0059] For XML test cases, the test case parsing module uses an XML parser to parse the seed test cases in XML text format into an operable tree-structured XML document object model. The XML parser is provided by the standard library. After the test case parsing module generates the tree-structured XML document object model from the XML text, other modules of the automatic test case generation system can access each child node of the XML document object through the tree structure, including the node name (nodeName), node value (nodeValue), and attribute values (attributes). Furthermore, starting from a node in the document object model, it can access its parent and child nodes, thus allowing traversal of the entire document object model from any node. The tree structure processed by the test case parsing module serves as the foundational data structure for subsequent XML test case structure analysis and content set construction.
[0060] For JSON test cases, the test case parsing module uses a JSON parser to parse the seed test cases in JSON text format into operable JSON objects. The JSON parser is provided by the standard library. A JSON object is a dictionary with attribute names as keys and attribute values as values, and attribute values can be nested JSON objects, resulting in a tree-like structure. After the test case parsing module parses the text-format JSON object into a tree-like structure, other modules of the test case automatic generation system can access the leaf nodes of the JSON object and, starting from a given node, access its parent and child nodes, thus traversing the entire JSON object from any node. The JSON object processed by the test case parsing module forms the foundational data structure for subsequent analysis of the JSON test case structure and the construction of content sets.
[0061] As can be seen, the test case parsing module abstracts both XML and JSON into a tree structure, which will be referred to as the test case tree structure in the following text.
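As a rough illustration of this unified tree, the sketch below parses both formats with Python's standard-library parsers. The `TreeNode` class, its field names, the `kind` labels, and the rule of detecting XML by a leading `<` are all assumptions made for the example, not details given in the source.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TreeNode:
    """Hypothetical unified node for XML and JSON test cases."""
    name: str                      # XML node name or JSON key
    value: Any = None              # leaf value, if any
    kind: str = "leaf"             # "leaf", "object", or "array" (assumed labels)
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def from_xml_element(elem):
    text = (elem.text or "").strip()
    children = [from_xml_element(child) for child in elem]
    return TreeNode(name=elem.tag, value=text or None,
                    kind="object" if children else "leaf",
                    attributes=dict(elem.attrib), children=children)


def from_json_value(key, value):
    if isinstance(value, dict):
        return TreeNode(key, kind="object",
                        children=[from_json_value(k, v) for k, v in value.items()])
    if isinstance(value, list):
        # Array elements are named by their index in this sketch.
        return TreeNode(key, kind="array",
                        children=[from_json_value(str(i), v) for i, v in enumerate(value)])
    return TreeNode(key, value=value)


def parse_test_case(text):
    """Parse an XML or JSON seed test case into a TreeNode tree."""
    if text.lstrip().startswith("<"):   # assumption: XML starts with '<'
        return from_xml_element(ET.fromstring(text))
    return from_json_value("$root", json.loads(text))
```

With both formats mapped onto the same node type, the later analysis and mutation steps can be written once against `TreeNode`.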
[0062] The workflow of the test case structure analysis module is shown in Figure 3. Test case structure analysis consists of two steps: field type analysis and composition structure analysis.
[0063] First, field type analysis is performed. This step involves traversing the test case tree and, for all nodes with values, attempting to parse those values as specific types or special values. The types and special values attempted include integers, floating-point numbers, strings, arrays, Base64-encoded strings, booleans, values of 0, values of 1, and values of -1.
[0064] Next, a structural analysis is performed. This step involves traversing the test case tree starting from the root node, counting each node in the tree. This counting includes: the total number of child nodes contained in that node, and the category count of the different types of child nodes contained in that node. The types of child nodes include integer, floating-point, and string types. The category count is obtained by counting the number of child nodes of different types, that is, counting the number of child nodes with values of a specific type, and using this count as the category count value.
[0065] Therefore, the test case structure analysis module analyzes the test case structure and generates test case structure information containing field types and composition structure, providing sufficient information for subsequent construction of mutation strategies.
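The two analysis steps might look like the following sketch, which works on a parsed JSON-style tree of dicts and lists. The type labels, the regular expressions, the path notation, and the Base64 heuristic (length at least 8 and a multiple of 4) are assumptions chosen for illustration; the source does not specify how each type is detected.

```python
import base64
import binascii
import re
from collections import Counter

INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def looks_base64(text):
    """Heuristic Base64 check (assumed thresholds)."""
    if len(text) < 8 or len(text) % 4 != 0:
        return False
    try:
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def infer_field_type(raw):
    """Field type analysis: try each candidate type in turn."""
    if isinstance(raw, bool):
        return "boolean"
    if isinstance(raw, int):
        return "integer"
    if isinstance(raw, float):
        return "float"
    if isinstance(raw, list):
        return "array"
    text = str(raw).strip()
    if text.lower() in ("true", "false"):
        return "boolean"
    if INT_RE.fullmatch(text):
        return "integer"
    if FLOAT_RE.fullmatch(text):
        return "float"
    if looks_base64(text):
        return "base64"
    return "string"


def is_special_value(raw):
    """True for the special values 0, 1, and -1."""
    return str(raw).strip() in ("0", "1", "-1")


def analyze_structure(node, path="$", out=None):
    """Composition structure analysis: child totals and per-type counts."""
    out = {} if out is None else out
    if isinstance(node, dict):
        children = list(node.items())
    elif isinstance(node, list):
        children = [(str(i), v) for i, v in enumerate(node)]
    else:
        return out
    counts = Counter(infer_field_type(v) for _, v in children
                     if not isinstance(v, (dict, list)))
    out[path] = {"total_children": len(children), "type_counts": dict(counts)}
    for key, child in children:
        analyze_structure(child, f"{path}.{key}", out)
    return out
```

The result maps each interior node's path to its total child count and its per-type leaf counts, which is the information a strategy selector would consult.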
[0066] The workflow of the content set construction module is shown in Figure 4. It is divided into two steps: test case content collection, and test case content set creation and updating.
[0067] In the test case content collection step, the values (i.e., content) of all nodes with values appearing in the test case tree are collected by traversing the test case tree from the root node. Then, the content is grouped by type, namely integer type, floating-point type, and string type.
[0068] In the test case content set creation and update step, the content of the corresponding type collected in the test case content collection step is added to the corresponding test case content set. The test case content sets fall into three categories: integer data set, floating-point data set, and string data set. The string data set is divided by string length into m (m>4) subsets with a length step of 5. The resulting subsets are given by the formula {str|(i*5+1)≤str.length<(i*5+5)}, where i∈N.
[0069] The test case generation system uses different seed test cases to mutate and generate new test cases. Therefore, the content sets built by the content set construction module are updated incrementally as new seed test cases arrive, and new content is continuously added to them. Existing content is not deleted when a single loop ends.
[0070] The content set construction module categorizes and stores the content of test cases by building content sets, thereby enabling quick retrieval of the required content values within mutation strategies. These content sets are used in some of the mutation strategies described below.
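A minimal sketch of such content sets is given below, assuming a parsed JSON-style tree as input. The class name `ContentSets` and its methods are hypothetical. Two further assumptions are made: occurrence counts are kept for numeric values (so that frequently occurring values can be looked up later), and each string bucket's upper bound is treated as inclusive, so lengths 1 to 5 go to bucket 0, 6 to 10 to bucket 1, and so on, which lets every non-empty string land in some bucket.

```python
from collections import Counter, defaultdict

BUCKET_STEP = 5  # length step for string subsets, as stated in the text


class ContentSets:
    """Hypothetical container for integer, float, and string content."""

    def __init__(self):
        self.integers = Counter()        # value -> number of occurrences
        self.floats = Counter()
        self.strings = defaultdict(set)  # bucket index -> strings

    @staticmethod
    def bucket_index(text):
        # Assumption: lengths 1..5 -> 0, 6..10 -> 1, 11..15 -> 2, ...
        return max(len(text) - 1, 0) // BUCKET_STEP

    def add(self, value):
        if isinstance(value, bool):
            return  # booleans are not one of the three content categories
        if isinstance(value, int):
            self.integers[value] += 1
        elif isinstance(value, float):
            self.floats[value] += 1
        elif isinstance(value, str):
            self.strings[self.bucket_index(value)].add(value)

    def update_from_tree(self, node):
        """Collect leaf values; existing content is kept across calls."""
        if isinstance(node, dict):
            for child in node.values():
                self.update_from_tree(child)
        elif isinstance(node, list):
            for child in node:
                self.update_from_tree(child)
        else:
            self.add(node)
```

Calling `update_from_tree` once per seed test case gives the incremental behavior described above, since nothing is removed between calls.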
[0071] The workflow of the test case generation module is shown in Figure 5. It includes two steps: constructing the mutation strategy and executing the mutation operation.
[0072] The mutation strategy construction steps include randomization-based mutation strategy construction and type inference-based mutation strategy construction.
[0073] The randomization-based mutation strategy construction selects mutation strategies at random. These strategies can modify test cases on a large scale, thereby enhancing the diversity and richness of the generated test cases.
[0074] The mutation strategy based on type inference is constructed based on the results of test case structure analysis. It adds semantic mutation strategies for nodes to the generated strategy set, thereby enhancing the targeting of test case mutation operations.
[0075] Specifically, the mutation strategies used in the type inference-based mutation strategy construction step are all associated with one or more specific types. The specific types are the field types inferred in the use case structure analysis module. The type inference-based mutation strategies include bitwise reversal, arithmetic operations, arithmetic special values, and node copying. The bitwise reversal strategy is applicable to string type mutation, the arithmetic operation strategy is applicable to integer type and floating-point type, the arithmetic special value strategy is applicable to integer type and floating-point type, and the node copying strategy is applicable to array type.
[0076] The mutation strategies based on type inference are described as follows:
[0077] (1) Bitwise Flip: The bitwise flip strategy performs bitwise flips on random positions in a string. The bitwise flip is a binary-level operation performed on the encoded value of a character in the string. The flip length of the bitwise flip operation is randomly selected from 1 bit to 32 bits. When the string conforms to the Base64 encoding format, it is first Base64 decoded. After decoding, the bitwise flip operation is applied to the Base64 decoded string. Then, the mutated string is re-encoded using Base64.
[0078] (2) Arithmetic operations: Arithmetic operations perform addition and subtraction operations on integers and floating-point numbers. The value of the operand for the addition and subtraction operations is determined based on the field being mutated. The value of the operand will be set within the range of k% (10<=k<=50) of the base value to prevent the value from being set to an obviously invalid value.
[0079] (3) Arithmetic special values: Special values may have special meanings in child nodes of integer and floating-point types. Therefore, setting the value of a child node to a special value may cause the program to enter special behavior. The special values include 1, 0, -1, -1.0, 0.0, 1.0 and values that appear more than 10 times in the integer and floating-point sets of the content set.
[0080] (4) Node copy: Select a random number of elements from an array-type node, make a random number of copies of each selected element, and insert the copies into the array as new values.
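The four type inference-based strategies above can be sketched as follows. This is an illustrative reading under stated assumptions, not the patented implementation: all function names are hypothetical, the flip is taken to be a consecutive run of bits, integer deltas are kept at a minimum of 1 so that small values still change, and `max_copies` is an invented cap.

```python
import base64
import copy
import random
from collections import Counter


def flip_bits(data: bytes, rng: random.Random) -> bytes:
    """Flip a run of 1 to 32 consecutive bits at a random bit offset."""
    if not data:
        return data
    total_bits = len(data) * 8
    run = min(rng.randint(1, 32), total_bits)
    start = rng.randrange(total_bits - run + 1)
    out = bytearray(data)
    for bit in range(start, start + run):
        out[bit // 8] ^= 0x80 >> (bit % 8)
    return bytes(out)


def try_base64(value: str):
    """Return the decoded bytes if value is non-empty valid Base64, else None."""
    try:
        return base64.b64decode(value, validate=True) or None
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


def mutate_string(value: str, rng: random.Random) -> str:
    payload = try_base64(value)
    if payload is not None:
        # Decode, flip, then re-encode so the result is still valid Base64.
        return base64.b64encode(flip_bits(payload, rng)).decode("ascii")
    # Assumption: bytes that are no longer valid UTF-8 are replaced, not raised.
    return flip_bits(value.encode("utf-8"), rng).decode("utf-8", errors="replace")


def mutate_number(value, rng: random.Random):
    """Add or subtract a delta bounded by k% of the base value, 10 <= k <= 50."""
    k = rng.randint(10, 50)
    bound = abs(value) * k / 100
    if isinstance(value, int):
        delta = rng.randint(1, max(1, int(bound)))  # assumption: at least 1
    else:
        delta = rng.uniform(0, bound)
    return value + rng.choice((-1, 1)) * delta


def special_values(numeric_pool):
    """Fixed special values plus pool values that occur more than 10 times."""
    frequent = [v for v, n in Counter(numeric_pool).items() if n > 10]
    return [1, 0, -1, -1.0, 0.0, 1.0] + frequent


def copy_nodes(array, rng: random.Random, max_copies=3):
    """Duplicate a random selection of elements and insert the copies."""
    out = list(array)
    if not array:
        return out
    for item in rng.sample(array, rng.randint(1, len(array))):
        for _ in range(rng.randint(1, max_copies)):
            out.insert(rng.randrange(len(out) + 1), copy.deepcopy(item))
    return out
```

Decoding before flipping is what keeps a Base64 field parseable by the target program: flipping bits in the encoded text directly could produce characters outside the Base64 alphabet.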
[0081] Specifically, each mutation strategy used in the randomization-based mutation strategy construction step is selected with a random probability during the traversal of the test case tree. The mutation strategies are described as follows:
[0082] (1) Node insertion and replacement: Values from the content collections built by the content collection construction module are inserted into, or used to replace, node values in the test case. Values are inserted for array-type nodes, and values are replaced for integer, floating-point, and string nodes.
[0083] (2) Concatenation: A new node is obtained by concatenating two test case tree nodes. The value of the concatenated node is a random number of values selected from the previously constructed content set. This strategy is only executed on nodes with more than 10 child nodes.
[0084] (3) Add: Randomly select a value from the content collections built by the content collection construction module and add it at a random node position in the test case tree. The node containing the new value becomes a child node of the node at that position. This strategy is only executed on nodes with fewer than 5 child nodes.
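The three randomized strategies above might look like the following on a dict-based tree. This is one possible reading rather than the patented implementation: the function names, the `pools` mapping from Python type to value list, the 0.3 selection probability, and the generated child names are all assumptions, and the concatenation strategy in particular is described only loosely in the text. The pools are assumed to be non-empty.

```python
import random

MUTATION_PROBABILITY = 0.3  # assumption: the text only says "random probability"


def all_values(pools):
    return [v for pool in pools.values() for v in pool]


def insert_or_replace(value, pools, rng):
    """Insert a pooled value into an array, or replace a scalar with a same-typed one."""
    if isinstance(value, list):
        out = list(value)
        out.insert(rng.randrange(len(out) + 1), rng.choice(all_values(pools)))
        return out
    pool = pools.get(type(value))
    return rng.choice(pool) if pool else value


def concatenate(node, pools, rng):
    """Join two child names into a new child; only for nodes with more than 10 children."""
    if len(node) <= 10:
        return False
    first, second = rng.sample(list(node), 2)
    values = all_values(pools)
    # The new node holds a random number of values drawn from the pools.
    node.setdefault(first + second, rng.sample(values, rng.randint(1, len(values))))
    return True


def add_child(node, pools, rng):
    """Add a pooled value as a new child; only for nodes with fewer than 5 children."""
    if len(node) >= 5:
        return False
    name = f"added_{len(node)}"  # hypothetical naming scheme
    while name in node:
        name += "_"
    node[name] = rng.choice(all_values(pools))
    return True


def randomized_pass(node, pools, rng, p=MUTATION_PROBABILITY):
    """Walk the tree; at each node, apply a random strategy with probability p."""
    for key in list(node):
        child = node[key]
        if isinstance(child, dict):
            randomized_pass(child, pools, rng, p)
        elif rng.random() < p:
            node[key] = insert_or_replace(child, pools, rng)
    if rng.random() < p:
        rng.choice((concatenate, add_child))(node, pools, rng)
```

The child-count guards (more than 10 for concatenation, fewer than 5 for add) come from the text; they keep large nodes from growing further through additions and small nodes from being concatenated.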
[0085] The mutation operation execution step executes the mutation strategies constructed in the mutation strategy construction step. For strategies constructed in the type inference-based mutation strategy construction step, it traverses the test case tree starting from the root node; on reaching a leaf node, it selects the mutation strategy that matches the composition structure and field type information of that leaf node and applies it to the value of the leaf node. For strategies constructed in the randomization-based mutation strategy construction step, it likewise traverses the test case tree starting from the root node; on reaching a leaf node, it applies the mutation strategy randomly selected in that construction step to the leaf node. In both cases, after the mutation operation is completed, the mutated test case tree is restored to XML or JSON format, thereby generating new test cases.
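The type-driven execution step can be sketched for the JSON case as follows. The function names and the `STRATEGIES` table are hypothetical, and the lambdas are deliberately trivial placeholders standing in for the real strategies, so that the traversal and the restore-to-JSON step are the focus.

```python
import json
import random

# Placeholder strategies keyed by leaf type (assumptions, for illustration only).
STRATEGIES = {
    int: [lambda v, rng: v + rng.choice((-1, 1)),
          lambda v, rng: rng.choice((1, 0, -1))],
    float: [lambda v, rng: v * 1.1],
    str: [lambda v, rng: v[::-1]],
}


def mutate_leaves(node, strategies, rng):
    """Return a copy of the tree with leaf values mutated and containers preserved."""
    if isinstance(node, dict):
        return {key: mutate_leaves(child, strategies, rng) for key, child in node.items()}
    if isinstance(node, list):
        return [mutate_leaves(child, strategies, rng) for child in node]
    # Leaf: pick one strategy registered for this exact type, if any.
    candidates = strategies.get(type(node), [])
    return rng.choice(candidates)(node, rng) if candidates else node


def generate(seed_text, strategies, rng):
    """Parse a JSON seed, mutate its leaves, and serialize the result back to JSON."""
    tree = json.loads(seed_text)
    return json.dumps(mutate_leaves(tree, strategies, rng))
```

For example, `generate('{"size": 10, "name": "abc"}', STRATEGIES, random.Random(0))` returns a JSON document with the same keys and mutated values. Leaves whose type has no registered strategy, such as booleans and nulls here, pass through unchanged.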
[0086] It can therefore be seen that the test case generation module uses semantic-based mutation to combine type information with mutation, which enhances the relevance and effectiveness of the generated test cases, while the randomization mutation strategy enhances their diversity and richness. Thus, this invention can smooth out the differences between test cases from different sources. Based on the two common data exchange formats, JSON and XML, it generates test cases for software from multiple sources, addressing the limited universality of automatic test case generation.
[0087] Meanwhile, the mutation strategy used in this invention mutates at both the test case tree node level and the test case tree node value level. The mutation strategy constructed by the mutation strategy construction method proposed in this invention conforms to the original semantics and will not damage the main structure of the test case. That is, the connection relationship of nodes with child nodes in the tree structure will not change, only the value of the leaf nodes will change. Therefore, it solves the problem that mutation-based test case generation schemes will destroy the test case structure.
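The structural guarantee stated above can be expressed as a check that compares the shape of a tree before and after mutation. This is a hypothetical verification helper, not part of the described system; treating array length as part of the shape is an assumption.

```python
def skeleton(node):
    """Reduce a tree to its shape: keys and array lengths, with leaf values erased."""
    if isinstance(node, dict):
        return {key: skeleton(child) for key, child in node.items()}
    if isinstance(node, list):
        return [skeleton(child) for child in node]
    return None


def structure_preserved(seed_tree, mutated_tree):
    """True if only leaf values differ between the two trees."""
    return skeleton(seed_tree) == skeleton(mutated_tree)
```

Under this strict definition, strategies that change the number of array elements or child nodes would be reported as structural changes, so a real check would need to decide which of those count as the "main structure".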
[0088] In summary, this invention designs a mutation-based test case generation method for two common data exchange formats, XML and JSON. Both formats are widely used in configuration documents, data documents, and network interfaces, and are not limited to any specific programming language, framework, or generation target, so the test cases generated by this invention have universality. In addition, this invention uses semantic-based mutation to combine type information with mutation, enhancing relevance and effectiveness, and uses the randomization mutation strategy to enhance diversity and richness, smoothing out as far as possible the differences among test case generation tasks from various sources.
[0089] Meanwhile, this invention addresses the problem of mutation-based methods disrupting test case format by designing a targeted mutation strategy. First, the initial test case is parsed. Then, based on the parsed test case node data, the corresponding data type is inferred. Finally, based on the corresponding child node data type, targeted mutation operations with correct semantics are applied, thus solving the problem of mutation disrupting test case format.
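The type inference step in this pipeline matters most for XML, where every node value arrives as text. A minimal sketch of such inference is shown below, assuming a fixed checking order and a minimum length of 8 before a string is treated as Base64; the function names, the order, and the threshold are all assumptions rather than details given in the source.

```python
import base64
import re

INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(
    r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?|[+-]?[0-9]+[eE][+-]?[0-9]+"
)
MIN_BASE64_LENGTH = 8  # assumption: avoids classifying short words as Base64


def looks_like_base64(text: str) -> bool:
    if len(text) < MIN_BASE64_LENGTH:
        return False
    try:
        base64.b64decode(text, validate=True)
        return True
    except ValueError:  # covers binascii.Error and non-ASCII input
        return False


def infer_type(text: str) -> str:
    """Classify a textual node value; the most specific type is tried first."""
    stripped = text.strip()
    if stripped in ("true", "false"):
        return "boolean"
    if INT_RE.fullmatch(stripped):
        return "integer"
    if FLOAT_RE.fullmatch(stripped):
        return "float"
    if looks_like_base64(stripped):
        return "base64"
    return "string"
```

Numeric checks run before the Base64 check because a string of digits whose length is a multiple of 4 is also valid Base64; any such heuristic will misclassify some ordinary strings, which is acceptable for choosing a mutation strategy but not for validation.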
[0090] Of course, the present invention may have other various embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art can make various corresponding changes and modifications according to the present invention, but these corresponding changes and modifications should all fall within the protection scope of the appended claims.
Claims
1. A mutation-based automatic test case generation system, characterized in that it includes a test case parsing module, a test case structure analysis module, a content collection construction module, and a test case generation module; the test case parsing module is used to generate a test case tree corresponding to each seed test case based on the subordinate and parallel relationships of the nodes in each seed test case. The seed test cases are test cases in XML or JSON format. The test case structure analysis module is used to traverse each test case tree to obtain the composition structure and field type information of each corresponding seed test case; the content collection construction module is used to traverse the test case tree to obtain all values that appear in each seed test case, and divide all values of all seed test cases into multiple sets according to their respective field types; the test case generation module is used to generate new test cases sequentially based on each test case tree without changing the non-leaf nodes of the test case tree; the workflow of the test case generation module is divided into two steps: mutation strategy construction and mutation operation execution. The mutation strategy construction includes mutation strategy construction based on type inference. The mutation strategy based on type inference is constructed based on the results of test case structure analysis, adding semantic mutation strategies for nodes to the generated strategy set. Specifically, the mutation strategies used in the mutation strategy construction step based on type inference are all associated with one or more specific types, which are field types inferred by the test case structure analysis module. The mutation strategies based on type inference include bitwise flip, arithmetic operation, arithmetic special value, and node copy.
The bitwise flip strategy is applicable to string type mutation, the arithmetic operation strategy is applicable to integer type and floating-point type, the arithmetic special value strategy is applicable to integer type and floating-point type, and the node copy strategy is applicable to array type. In bitwise flip, when the string conforms to the Base64 encoding format, the string is first Base64 decoded, and then the bitwise flip operation is applied to the Base64 decoded string. After that, the mutated string is re-encoded using Base64. The mutation operation execution step executes the mutation strategy constructed in the type inference-based mutation strategy construction step. Starting from the root node, the test case tree is traversed. When a leaf node of the test case tree is accessed, the mutation strategy that matches the composition structure and field type information of the current leaf node is selected and executed to mutate the value of that leaf node. The test case generation module combines type information and mutation by employing randomization and semantic-based mutation, while enhancing versatility through a randomization mutation strategy. The mutation strategy adopted by the system mutates at both the node level and the value level of the test case tree. The mutation strategy constructed by the mutation strategy construction method conforms to the original semantics and will not damage the main structure of the test cases. That is, the connection relationship of nodes with child nodes in the tree structure will not change, only the value of the leaf nodes will be changed.
2. The automatic test case generation system based on mutation as described in claim 1, characterized in that, The test case generation module is used to generate new test cases sequentially based on each test case tree without changing the non-leaf nodes of the test case tree. Specifically, it selects a corresponding mutation strategy to mutate the leaf nodes and leaf node values of the current test case tree according to the characteristics of the composition structure and field type information corresponding to the current test case tree. The mutation of the values is based on the values of the leaf nodes or on randomly selected values from each content set.
3. The automatic test case generation system based on mutation as described in claim 1, characterized in that the test case parsing module parses the seed test cases to generate a test case tree; when the seed test case is in XML format, the information mapped to each node of the test case tree includes the node name, node value, and attribute value of each node in the XML seed test case; when the seed test case is in JSON format, the information mapped to each node of the test case tree includes the key and value of each node in the JSON seed test case, where the key is the attribute name string and the value is the attribute value.
4. The automatic test case generation system based on mutation as described in claim 1, characterized in that the workflow of the test case structure analysis module is divided into two steps: field type analysis and composition structure analysis. First, field type analysis is performed. This step involves traversing the test case tree and, for all nodes with values, attempting to parse the node values as specific types or special values. The specific types and special values to be attempted include integer types, floating-point types, string types, array types, Base64 encoded string types, boolean types, values of 0, values of 1, and values of -1. Next, composition structure analysis is performed. This step involves traversing the test case tree starting from the root node and counting each node in the test case tree. The count includes: the total number of child nodes contained in the node, and the category count of the different types of child nodes contained in the node. The types of child nodes include integer type, floating-point type, and string type. The category count is obtained by counting the number of child nodes of different types, that is, counting the number of child nodes with a specific type of value, and using it as the category count value.
5. The automatic test case generation system based on mutation as described in claim 1, characterized in that the content collection construction module reads the test case tree to construct test case content sets. The workflow of the content collection construction module is divided into two steps: test case content collection, and test case content set creation and updating. In the test case content collection step, the test case tree is traversed from the root node to collect the values of all nodes with values that appear in the test case tree, and the values of the nodes are used as content. Furthermore, the content is collected according to the type of content, namely integer type, floating-point type, and string type. In the test case content set creation and updating step, the content of each type collected in the test case content collection step is added to the corresponding test case content set. The test case content sets fall into three categories: integer data set, floating-point data set, and string data set. The string data set is further divided into m subsets with a length step of 5, where m > 4.
6. The automatic test case generation system based on mutation as described in claim 1, characterized in that the workflow of the test case generation module is divided into two steps: mutation strategy construction and mutation operation execution. The mutation strategy construction includes randomization-based mutation strategy construction. Specifically, during the traversal of the test case tree, the randomization-based mutation strategy construction executes a randomly selected mutation strategy with a random probability. The mutation strategies used in the randomization-based mutation strategy construction include node insertion and replacement, concatenation, and addition. The mutation operation execution step executes the mutation strategy constructed in the randomization-based mutation strategy construction step. Starting from the root node, the test case tree is traversed. When a leaf node of the test case tree is accessed, that leaf node is mutated according to the mutation strategy randomly selected in the randomization-based mutation strategy construction.