Software testing method and software testing system

By using vector databases and feature vectors to generate supplementary test cases in the software testing system, the problems of incomplete coverage and inaccuracy in incremental code testing in existing technologies are solved, thereby improving the efficiency of full coverage and automated testing.

WO2026123673A1 · PCT designated stage · Publication Date: 2026-06-18 · DIGIWIN CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: DIGIWIN CO LTD
Filing Date: 2025-07-11
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing software testing systems cannot achieve full-coverage, automated testing and lack unified testing standards, which results in low testing efficiency. In particular, they cannot test incremental code accurately.

Method used

Test cases are obtained by the processor, and supplementary test cases are generated using feature vectors from the vector database based on the coverage of incremental code, thus automatically completing full coverage and accurate testing.

Benefits of technology

It achieves full-coverage automated testing, improves the efficiency of testing operations, and ensures the accuracy and completeness of testing.

✦ Generated by Eureka AI based on patent content.


Abstract

A software testing method and a software testing system. The software testing method comprises the following steps: a processor acquires a test case; and when the test case has an incremental code, the processor generates a supplementary test case on the basis of coverage of the incremental code, a plurality of feature vectors in a vector database, and the test case. The step of generating a supplementary test case further comprises the following steps: the processor extracts key information of the test case to obtain a plurality of first data vector units; the processor performs similarity search on the basis of the plurality of first data vector units and the plurality of feature vectors to obtain a plurality of second data vector units; and the processor combines the plurality of second data vector units to generate the supplementary test case. Therefore, the performance of incremental code-based testing operations can be improved.

Description

Software testing methodologies and software testing systems

[0001]-[0002] This application claims priority to Chinese Patent Application No. 202411801653.X, filed with the China Patent Office on December 9, 2024 and entitled "Software Testing Method and Software Testing System", the entire contents of which are incorporated herein by reference.

Technical Field

[0003] This invention relates to a testing method, and more particularly to a software testing method and a software testing system.

Background Technology

[0004] During software development or application, developers can modify the code to improve its functionality. Generally, software testing systems generate test cases and test the added or modified code (i.e., incremental code). In this way, the software testing system can evaluate the scope and effectiveness of the incremental code's impact on the software based on the test case results.

[0005] However, with changes in software requirements and versions, the codebase used by the software contains a large amount of redundant code. Furthermore, current software testing systems rely on developers' experience to judge the impact of incremental code, lacking unified testing standards. As a result, current software testing systems not only fail to achieve full coverage and automated testing based on incremental code, but also fail to achieve accurate testing, leading to low testing efficiency.

Summary of the Invention

[0006] This invention relates to a software testing method that can improve the efficiency of incremental code-based testing operations.

[0007] According to an embodiment of the present invention, the software testing method of the present invention includes the following steps: A processor acquires test cases. When a test case has incremental code, the processor generates supplementary test cases based on the coverage of the incremental code, according to multiple feature vectors in a vector database and the test cases. The aforementioned steps regarding generating supplementary test cases further include the following steps: The processor extracts key information of the test cases to obtain multiple first data vector units. The processor performs a similarity search based on the multiple first data vector units and multiple feature vectors to obtain multiple second data vector units. The processor combines the multiple second data vector units to generate supplementary test cases.

[0008] According to an embodiment of the present invention, the software testing system of the present invention includes memory and a processor. The memory is used to store a vector database. The processor is coupled to the memory. The processor is used to execute the software testing method described above.

[0009] Based on the above, the software testing method and system of the present invention acquire multiple second data vector units corresponding to test cases through a processor, and can break down the vector data corresponding to the test cases to automatically collect all code logic and data to be tested. By combining these second data vector units through the processor, the software testing system can automatically present test cases in a new combination. In this way, the software testing system can automatically complete full-coverage testing to achieve accurate testing, thereby improving the efficiency of testing operations.

[0010] To make the above features and advantages of the present invention more apparent and understandable, specific embodiments are described below in conjunction with the accompanying drawings.

Attached Figure Description

[0011] Figure 1 is a block diagram of a software testing system according to an embodiment of the present invention;

[0012] Figure 2 is a flowchart of a software testing method according to an embodiment of the present invention;

[0013] Figure 3 is a block diagram of a software testing system according to another embodiment of the present invention;

[0014] Figures 4A and 4B are schematic diagrams of the operation of the software testing system of the embodiment of Figure 3 of the present invention;

[0015] Figure 5 is a partial flowchart of the software testing system of the embodiment of Figure 3 of the present invention;

[0016] Figure 6 is a partial flowchart of the software testing system of the embodiment of Figure 3 of the present invention;

[0017] Figure 7 is a flowchart of a software testing method according to another embodiment of the present invention.

[0018] Figure Label Explanation:

100, 300: Software testing system
110, 310: Processor
120, 320: Memory
121, 321: Vector database
322: Test case database
324: Quality database
325: Configuration table
331: Relationship acquisition module
332: Code mapping module
333: Difference detection module
334: Test case generation module
335: Test case execution module
336: Result analysis module
340: Large model
400: System under test
410: Relationship system
420: Recommendation system
430: Monitoring system
B340: Block
EV: Feature vector
S210~S220, S221~S223, S411~S414, S421~S425, A, S430, S441~S445, S451~S459, S510~S550, S610~S660, S710~S790: Steps
TS: Test cases
TS': Supplementary test cases
UR1: User
UR2: Tester

Detailed Implementation

[0019] Reference will now be made in detail to exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same component symbols are used in the drawings and description to denote the same or similar parts.

[0020] Figure 1 is a block diagram of a software testing system according to an embodiment of the present invention. Referring to Figure 1, the software testing system 100 is suitable for testing software and is used to achieve automated, full-coverage testing and accurate testing. Users can operate electronic devices to call the software testing system 100 through an Application Programming Interface (API). Electronic devices may be, for example, mobile phones, tablet computers, laptops, and desktop computers.

[0021] In this embodiment, the software testing system 100 includes a processor 110 and a memory 120. The processor 110 is coupled to the memory 120. The memory 120 stores a vector database 121. The vector database 121 includes vectorized data, such as multiple feature vectors EV. These feature vectors EV indicate various source code, logic code, input parameters, and output parameters.

[0022] In this embodiment, memory 120 also stores computational software and other related algorithms, programs, and data used to implement the functions of feature extraction, combination, and various calculations of this invention. Memory 120 may be, for example, Dynamic Random Access Memory (DRAM), Flash memory, Non-Volatile Random Access Memory (NVRAM), or a combination of these.

[0023] In this embodiment, processor 110 accesses memory 120. Processor 110 may be, for example, a server, signal converter, field programmable gate array (FPGA), central processing unit (CPU), or other programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), programmable controller, application-specific integrated circuit (ASIC), programmable logic device (PLD), or other similar device or combination of these devices, which can load and execute computer program-related firmware or software to perform functions such as feature extraction, combination, and various calculations.

[0024] Figure 2 is a flowchart of a software testing method according to an embodiment of the present invention. Referring to Figures 1 and 2, the software testing system 100 can execute steps S210-S220 and S221-S223. The order of these steps S210-S220 and S221-S223 is merely illustrative and is not intended to be limiting.

[0025] In step S210, the processor 110 obtains the test case TS. In this embodiment, the processor 110 may invoke the software system coupled to the software testing system 100 to obtain the test case TS. Alternatively, the processor 110 may access memory 120 to obtain the temporarily stored test case TS.

[0026] When the acquired test case TS contains incremental code, it means that, compared to the original software code, the test case TS includes code that has been added, deleted, or modified. At this time, processor 110 calculates the coverage of the incremental code. The coverage of the incremental code can, for example, indicate the completeness of the test case TS, such as whether some code has not been executed and is therefore not covered. Processor 110 continues with step S220.
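The coverage calculation described above can be illustrated with a minimal sketch. The patent does not define the coverage metric, so line-based coverage is assumed here, and the function name `incremental_coverage` and its inputs (sets of changed and executed line numbers) are hypothetical.

```python
def incremental_coverage(changed_lines, executed_lines):
    """Return (coverage ratio, sorted uncovered lines) for the incremental code.

    Assumption: coverage is measured per line; the source does not fix a metric.
    """
    changed = set(changed_lines)
    if not changed:
        return 1.0, []  # nothing changed, so nothing is left uncovered
    covered = changed & set(executed_lines)
    uncovered = sorted(changed - covered)
    return len(covered) / len(changed), uncovered


# Lines 10-13 were changed; the test case executed lines 1, 2, 10, 11 and 13.
ratio, missing = incremental_coverage({10, 11, 12, 13}, {1, 2, 10, 11, 13})
print(ratio, missing)
```

A ratio below 1.0, together with the list of uncovered lines, is the kind of signal that would lead to generating a supplementary test case.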

[0027] In step S220, processor 110 generates supplementary test cases TS' based on the coverage of incremental code, according to multiple feature vectors EV and test cases TS. Supplementary test cases TS' may, for example, be another test case that includes code not covered in test cases TS.

[0028] In other words, when the incremental code of test case TS is not fully covered, processor 110 accesses vector database 121 and generates new test cases (i.e., supplementary test cases TS') based on multiple feature vectors EV and test case TS.

[0029] In detail, step S220 includes multiple steps S221 to S223. In step S221, the processor 110 extracts key information of the test case TS to obtain multiple first data vector units. These first data vector units indicate the characteristics of various codes and parameters in the test case TS and are vectorized data.

[0030] In step S222, the processor 110 performs a similarity search based on a plurality of first data vector units and a plurality of feature vectors EV to obtain a plurality of second data vector units. These second data vector units include various codes and parameters similar to the plurality of first data vector units, and are vectorized data.

[0031] In other words, processor 110 performs feature extraction on test case TS and converts the extracted key information into multiple vectors (i.e., multiple first data vector units). Next, processor 110 calculates the similarity between these vectors and various known source code, logic code, input parameters, and output parameters (i.e., multiple feature vectors EV). Processor 110 selects the most similar feature vectors EV from each set as multiple second data vector units.
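As a rough illustration of this similarity search, the sketch below picks, for each first data vector unit, the most similar entry among a set of feature vectors. Cosine similarity is one possible measure (it is named later in the description as an example); the function names, the toy three-dimensional vectors, and the dictionary standing in for the vector database are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(first_units, feature_vectors):
    """For each first data vector unit, return the key of the closest feature vector."""
    return [
        max(feature_vectors, key=lambda name: cosine(unit, feature_vectors[name]))
        for unit in first_units
    ]


# Hypothetical feature vectors EV, keyed by what they describe.
feature_vectors = {
    "login_param": [1.0, 0.0, 0.0],
    "order_param": [0.0, 1.0, 0.0],
    "return_code": [0.0, 0.0, 1.0],
}
first_units = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
second_units = most_similar(first_units, feature_vectors)
print(second_units)
```

A real vector database would typically use an approximate nearest-neighbour index rather than this exhaustive scan.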

[0032] In step S223, the processor 110 combines multiple second data vector units to generate a supplementary test case TS'. That is, the processor 110 collects multiple logical codes and multiple parameters that are operable by the software testing system 100 and correspond to the test case TS, and reassembles these logical codes and parameters to generate another test case (i.e., the supplementary test case TS').

[0033] It is worth mentioning that, based on the incremental code coverage, the software testing system 100 performs similarity searches on multiple vectors (i.e., multiple first data vector units) and multiple feature vectors EV corresponding to the test case TS through the processor 110. This allows the system to filter out some feature vectors EV (i.e., multiple second data vector units) corresponding to these vectors. Since the multiple second data vector units represent various codes and parameters known and operable by the software testing system 100, the processor 110 reassembles these second data vector units to generate supplementary test cases TS' with full coverage. In this way, the software testing system 100 can automatically complete full-coverage testing to achieve accurate testing, thereby improving the efficiency of the testing operation.

[0034] Figure 3 is a block diagram of a software testing system according to another embodiment of the present invention. Referring to Figure 3, the software testing system 300 includes a processor 310 and a memory 320. The memory 320 stores a vector database 321. The vector database 321 includes a plurality of feature vectors EV. The processor 310, memory 320, and vector database 321 can be deduced by referring to the relevant description of the software testing system 100.

[0035] In the embodiment of Figure 3, the software testing system 300 is coupled to multiple systems to which the software is applied. The software testing system 300 invokes these systems to operate collaboratively with them. The multiple systems include a relationship system 410, a recommendation system 420, and a monitoring system 430. In some embodiments, the relationship system 410, the recommendation system 420, and the monitoring system 430 are integrated into the software testing system 300.

[0036] In this embodiment, the relationship system 410 stores multiple use case sets, multiple use case groups, multiple use cases, and multiple source codes in a relational database or file system (not shown). The recommendation system 420 executes a difference engine module (e.g., a diff engine) to identify differences among the multiple source codes. The recommendation system 420 also recommends regression use cases based on the mapping relationship between source codes and use cases. The monitoring system 430 executes a coverage tool (e.g., JaCoCo) to calculate the code coverage of each use case in the relationship system 410.

[0037] In this embodiment, memory 320 also stores a test case database 322. The test case database 322 includes multiple test cases (e.g., test case TS). Memory 320 also stores multiple modules and a large model 340. The multiple modules include a relationship acquisition module 331, a code mapping module 332, a difference detection module 333, a test case generation module 334, a test case execution module 335, and a result analysis module 336. These modules 331-336 can be implemented, for example, in firmware or software. Processor 310 executes these modules 331-336 to implement various functions.

[0038] Specifically, the relationship acquisition module 331 is used to collect data from the relationship system 410. This data includes various test cases such as manual test cases, automated test cases, performance test cases, and security test cases. Furthermore, the relationship acquisition module 331 also stores the collected data in the relational database or file system within the relationship system 410. Thus, by invoking the relationship system 410 through the relationship acquisition module 331, the processor 310 further processes and analyzes the aforementioned various test cases.

[0039] In this embodiment, the code mapping module 332 is used to associate the data (i.e., use cases) collected by the relationship acquisition module 331 with the source code and establish a mapping relationship between the two. In this embodiment, the code mapping module 332 is implemented using a reverse engineering tool. The code mapping module 332 may be, for example, an APK decompiler for an Android application or a disassembler for a Windows program.

[0040] In this embodiment, the difference detection module 333 is used to periodically scan for changes in the source code and automatically update the mapping relationship established by the code mapping module 332 accordingly. In this embodiment, the difference detection module 333 is implemented using a version control system (e.g., a Git system).

[0041] In other words, when the source code is modified or contains new code, the difference detection module 333 automatically identifies the changed parts of the source code. The difference detection module 333 adjusts the mapping relationship between the use cases and the source code based on the aforementioned parts of the source code to update this mapping relationship.

[0042] In this embodiment, the test case generation module 334 is used to automatically generate new test cases (e.g., supplementary test cases TS') based on changes in the source code. In this embodiment, the test case generation module 334 infers the possible input and output points of the program by statically analyzing the structure, variable assignments, and function calls of the changed source code, and generates corresponding test cases accordingly.

[0043] In this embodiment, the test case execution module 335 executes the test cases generated by the test case generation module 334 according to a specific strategy and sequence. The aforementioned strategy and sequence are related to the simulation of scenarios such as concurrent testing and stress testing.

[0044] In this embodiment, the result analysis module 336 is used to analyze and evaluate the results of the test cases and determine whether the results meet the expected behavior. In this embodiment, the result analysis module 336 generates a judgment result by comparing the actual output of the test cases with the expected output. Alternatively, the result analysis module 336 generates a judgment result by statistically analyzing information such as the test coverage and error rate of the test cases. Thus, the result analysis module 336 evaluates the quality of the test operation based on the judgment result.
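The comparison of actual and expected output, together with an error-rate statistic, might look like the following sketch. The tuple layout of `results` and the returned dictionary are hypothetical; the description does not specify a data format.

```python
def analyze(results):
    """Compare expected and actual outputs for a list of executed test cases.

    results: list of (case_id, expected, actual) tuples (an assumed layout).
    """
    failed = [case_id for case_id, expected, actual in results if expected != actual]
    error_rate = len(failed) / len(results) if results else 0.0
    return {"failed": failed, "error_rate": error_rate, "passed": not failed}


report = analyze([("TC-1", 200, 200), ("TC-2", 200, 500)])
print(report)
```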

[0045] In this embodiment, the large model 340 is implemented as a trained large language model (LLM). The large model 340 is used to generate another use case (e.g., a supplementary test case TS') based on the use case (e.g., test case TS) and the vector database 321. The large model 340 may be, for example, a Chat Generative Pre-trained Transformer (ChatGPT).

[0046] Figure 4A is a schematic diagram of the operation of the software testing system according to the embodiment of Figure 3 of the present invention. Referring to Figures 3 and 4A, the software testing system 300 and the system under test 400 can operate collaboratively to perform test operations on the software in the system under test 400. The software testing system 300 can execute steps S411-S414, S421-S425, S430, and step A.

[0047] In this embodiment, the software testing system 300 can perform test operations and generate supplementary test cases TS' non-automatically through the operation of user UR1 or tester UR2. Alternatively, the software testing system 300 can periodically and automatically perform test operations and generate supplementary test cases TS'.

[0048] In detail, user UR1 can operate the system under test 400 and, through the system under test 400, call the software testing system 300 to initiate test operations. User UR1 dynamically analyzes and adjusts the tool scripts of the system under test 400 by operating the software testing system 300, and generates a configuration table 325 accordingly. User UR1 manually generates supplementary test cases TS' based on the configuration table 325 by operating the software testing system 300. In addition, user UR1 operates the software testing system 300 to store the configuration table 325 in the quality database 324. The configuration table 325 can be used to store relevant values such as Business Importance (BIM), Historical Failure Rate (HFM), and adjustment factors (WeightFactorCC, WeightFactorBI, WeightFactorHF). These values are initialized manually and subsequently automatically analyzed and adjusted using a machine learning model.

[0049] On the other hand, in step S411, the software testing system 300 accesses the system under test 400 and the test case database 322 therein through the processor 310 to perform automated testing operations on the test cases TS in the test case database 322.

[0050] In step S412, the processor 310 standardizes the test platform used to execute the test operations. Based on this test platform, the processor 310 executes test operations on the test case TS to continue steps S413-S414. Alternatively, based on this test platform, the processor 310 detects whether the test case TS has been modified to have incremental code to continue steps S421-S425, S430, and step A, and continues steps S413-S414.

[0051] Specifically, when a test case TS has an error code and is considered a failed test case, in step S413, the processor 310 executes the test case TS to generate a test report and analyzes the cause of the execution error. The test report also includes the level corresponding to the aforementioned cause to indicate the priority of the error.

[0052] In step S414, the software testing system 300 sends a notification message to the tester UR2 via the processor 310 according to the level. The notification message can be delivered, for example, via email or electronic bulletin. Thus, the tester UR2 receives the test case TS indicated as failed and the corresponding test report. The tester UR2 then accesses the configuration table 325 through the software testing system 300 to generate a supplementary test case TS'.

[0053] In this embodiment, the processor 310 sends notification messages to the tester UR2 at different frequencies based on different priorities. For example, when the test case TS has the highest priority level (e.g., indicated as a very high priority P0 test case level), the processor 310 sends an email notification to the tester UR2 in real time. When the test case TS has other priorities (e.g., indicated as a low priority P3 test case level), the processor 310 sends an email notification to the tester UR2 once a day at a set time.
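The priority-dependent notification behaviour can be sketched as a simple split between failures that are mailed immediately and failures held for a daily digest. The function name and the `(case_id, priority)` layout are assumptions; only the P0 and P3 labels come from the description.

```python
def plan_notifications(failed_cases):
    """Split failed cases into immediate alerts and a once-a-day digest.

    failed_cases: list of (case_id, priority) pairs, e.g. ("TC-7", "P0").
    Only the highest priority (P0) is sent in real time in this sketch.
    """
    immediate = [case_id for case_id, priority in failed_cases if priority == "P0"]
    digest = [case_id for case_id, priority in failed_cases if priority != "P0"]
    return immediate, digest


immediate, digest = plan_notifications([("TC-7", "P0"), ("TC-8", "P3"), ("TC-9", "P1")])
print(immediate, digest)
```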

[0054] When test case TS has incremental code, in step S421, processor 310 detects which functions the modified code (i.e., the incremental code) involves, and calculates a regression test case set based on the aforementioned functions. The regression test case set includes one or more regression test cases. Regression test cases may be, for example, test cases used to confirm whether the incremental code introduces new errors or causes errors in the original code.
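One way to picture the calculation of the regression test case set is a lookup over a function-to-test-case mapping, as in the sketch below. The mapping structure, the function name `regression_set`, and the identifiers are hypothetical; the description only states that the set is derived from the functions touched by the incremental code.

```python
def regression_set(changed_functions, function_to_cases):
    """Collect every test case mapped to at least one changed function."""
    selected = set()
    for function_name in changed_functions:
        # Functions without mapped test cases contribute nothing.
        selected.update(function_to_cases.get(function_name, ()))
    return sorted(selected)


mapping = {
    "create_order": ["TC-1", "TC-2"],
    "cancel_order": ["TC-2", "TC-3"],
    "login": ["TC-4"],
}
print(regression_set(["create_order", "cancel_order"], mapping))
```

Test cases shared by several changed functions appear once, so each regression test case is executed a single time.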

[0055] In step S424, processor 310 executes each regression test case in the regression test case set to generate a corresponding test report, and filters the corresponding regression test cases based on the test reports indicating errors. Processor 310 stores the filtered regression test cases as newly generated test cases in test case database 322.

[0056] In step S422, the processor 310 detects which code has been deleted in the incremental code. In step S425, the processor 310 accesses the test case database 322 to delete the test cases corresponding to the code detected as deleted in step S422.

[0057] In step S423, the processor 310 executes test case TS and detects which code was not executed during that run, that is, which code is not covered by test case TS. In addition, the processor 310 automatically supplements test cases based on the unexecuted code to continue step A, and then generates supplementary test case TS' in step S430.

[0058] Referring also to Figure 4B, which is a schematic diagram of the operation of the software testing system of the embodiment of Figure 3 of the present invention, when a test case TS has incremental code, and when the coverage of this incremental code indicates that the test case TS has uncovered code, the software testing system 300 can execute the large model 340 through the processor 310 to perform steps S441 to S445 and S451 to S459 in block B340. The embodiment in Figure 4B is used to illustrate how the software testing system 300 automatically supplements test cases based on the large model 340 to generate supplementary test cases TS'.

[0059] In steps S441 to S445, the processor 310 periodically executes a timed task to access the test case database 322 and to create and update multiple feature vectors EV in the vector database 321 based on the test cases (e.g., test case TS).

[0060] In step S441, the processor 310 extracts key information about the test case TS. This key information includes the input parameters, output parameters, call request parameters, return value, and call order of the test case TS.

[0061] In step S442, the processor 310 breaks down the key information from step S441. The processor 310 uses the broken-down key information as the smallest unit and vectorizes each smallest unit to generate a corresponding feature vector EV. The processor 310 stores these feature vectors EV in the vector database 321 to update the vector database 321. These feature vectors EV indicate various source code, logic code, input parameters, and output parameters.

[0062] For example, processor 310 splits the input parameters of test case TS into individual parameter values and individual parameter names. Processor 310 splits the output parameters of test case TS into individual parameter values, individual parameter names, and at least one return value. Processor 310 vectorizes the aforementioned various parameter values, parameter names, and return values to generate multiple feature vectors EV.
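The splitting of a test case into smallest units can be sketched as follows. The dictionary layout of the test case and the unit labels are assumptions, and the vectorization of each unit is left out.

```python
def split_units(test_case):
    """Break a test case into (kind, content) pairs, one per smallest unit."""
    units = []
    for name, value in test_case.get("input", {}).items():
        units.append(("input_name", name))
        units.append(("input_value", value))
    for name, value in test_case.get("output", {}).items():
        units.append(("output_name", name))
        units.append(("output_value", value))
    if "return_value" in test_case:
        units.append(("return_value", test_case["return_value"]))
    return units


case = {"input": {"user": "alice"}, "output": {"status": "ok"}, "return_value": 0}
for unit in split_units(case):
    print(unit)
```

Each resulting pair would then be vectorized separately, so that a name, a value, and a return value each obtain their own feature vector.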

[0063] Simultaneously, in step S443, the processor 310 standardizes the key information from step S441. That is, the processor 310 performs preprocessing operations on the extracted key information. These preprocessing operations include removing irrelevant information from the key information, standardizing the data format of the key information, and handling missing values in the key information.
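The three preprocessing operations can be illustrated with a small sketch. Which fields count as relevant, how text is normalized, and the `"unknown"` placeholder for missing values are all assumptions made for illustration.

```python
def standardize(record, required=("name", "type", "value")):
    """Keep only the required fields, normalize text, and fill missing values."""
    cleaned = {}
    for key in required:  # fields outside `required` are dropped as irrelevant
        value = record.get(key)
        if isinstance(value, str):
            value = value.strip().lower()  # one possible uniform text format
        cleaned[key] = value if value not in (None, "") else "unknown"
    return cleaned


print(standardize({"name": " UserId ", "type": "Integer", "debug": "x"}))
```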

[0064] In step S444, processor 310 extracts feature vectors from the standardized key information in step S443 to generate multiple first data vector units. Processor 310, for example, executes a word2vec model to perform vector feature extraction on metadata, converting the standardized key information into computable and structured vector data (i.e., multiple first data vector units). The feature vectors (i.e., multiple first data vector units) indicate various standardized source code, logic code, input parameters, and output parameters. The metadata refers to structured information extracted from the dataset obtained from the Application Programming Interface (API) and used to describe the data. The metadata may contain various types of information, such as field names (i.e., labels or titles of each data field), data types (i.e., the type of each field, such as string, integer, or floating-point number), field length / size (i.e., the maximum allowed value or number of characters for certain fields), constraints (e.g., whether it can be nullable or uniqueness requirements), and / or relationship information. If there are relationships between datasets (such as relationships between tables), it may also include details of foreign keys and relationships with other tables.

[0065] In step S445, processor 310 creates multiple indices for the feature vectors (i.e., multiple first data vector units) from step S444. Processor 310 assigns these indices to the multiple first data vector units respectively. Thus, each first data vector unit has its own index as a unique identifier.

[0066] In addition, processor 310 stores multiple first data vector units with identifiers into vector database 321. Processor 310 also constructs associations between these first data vector units and metadata.
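Assigning an index to each first data vector unit and associating it with metadata can be sketched with an in-memory store. The class `VectorStore` and its sequential integer identifiers are hypothetical stand-ins for the vector database, not its actual interface.

```python
import itertools


class VectorStore:
    """Toy in-memory store: each vector gets a unique index linked to its metadata."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.vectors = {}
        self.metadata = {}

    def add(self, vector, meta):
        index = next(self._next_id)  # unique identifier for this vector unit
        self.vectors[index] = list(vector)
        self.metadata[index] = dict(meta)
        return index


store = VectorStore()
first = store.add([0.1, 0.2], {"field": "user_id", "type": "integer"})
second = store.add([0.3, 0.4], {"field": "status", "type": "string"})
print(first, second, store.metadata[first])
```

Keeping metadata under the same index allows a later step to look up the descriptive information for any vector returned by a similarity search.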

[0067] In steps S451 to S459, processor 310 begins operation in response to the incremental code and coverage of test case TS. Unlike the periodic timed task of steps S441 to S445, processor 310 here performs a one-time task based on the incremental code and coverage of test case TS to generate supplementary test case TS' based on test case TS.

[0068] In step S451, the processor 310 extracts key information of the test case TS.

[0069] In step S452, processor 310 standardizes the key information from step S451.

[0070] In step S453, processor 310 extracts feature vectors of the standardized key information from step S452 to generate multiple first data vector units.

[0071] In this embodiment, steps S451 to S453 can be deduced by referring to the relevant descriptions of steps S441, S443 to S444.

[0072] In step S454, processor 310 accesses vector database 321 to obtain the latest plurality of feature vectors EV. Furthermore, processor 310 performs a similarity search (e.g., cosine similarity) based on the plurality of first data vector units and the plurality of feature vectors EV in step S453, thereby recommending and obtaining a plurality of vectors similar to the plurality of first data vector units from these feature vectors EV. Processor 310 uses these similar vectors as a plurality of second data vector units.

[0073] Next, the processor 310 determines, based on the matching degree among the multiple second data vector units, whether to proceed to step S456 to generate a supplementary test case TS' according to the test case template in the test case database 322 and the incremental code of the test case TS, or to proceed to step S457 to combine the multiple second data vector units to generate the supplementary test case TS'. In this embodiment, the matching degree indicates the correlation among the multiple second data vector units, that is, the degree to which the known source code, logic code, input parameters, and output parameters correspond to shared functions.

[0074] Specifically, in step S455, the processor 310 determines whether the matching degree among the multiple similar vectors (i.e., multiple second data vector units) recommended in step S454 is greater than a preset value.

[0075] When the matching degree is greater than the preset value (e.g., 70%), it indicates that the multiple second data vector units obtained in step S454 indicate the same components and / or code to achieve shared functionality. Processor 310 continues with step S456.

[0076] In step S456, processor 310 retrieves corresponding metadata from vector database 321 based on the multiple similar vectors (i.e., multiple second data vector units) recommended in step S454. Furthermore, processor 310 generates new test cases according to the test case template of the metadata. Processor 310 uses the generated test cases as supplementary test cases TS' and stores the supplementary test cases TS' in test case database 322.

[0077] On the other hand, when the matching degree is less than or equal to a preset value (e.g., 70%), it indicates that in step S454, the multiple second data vector units indicate different components and / or codes to achieve their respective functions. The processor 310 continues with step S457.

[0078] In step S457, the processor 310 divides the multiple similar vectors (i.e., multiple second data vector units) recommended in step S454 into at least one group to shuffle the data, and determines whether the matching degree between the multiple second data vector units in each group is greater than a preset value (e.g., 70%), and then determines whether the matching operation corresponding to the shuffled data is successful.

[0079] When the shuffled data is successfully matched, it indicates that at least one group of multiple second data vector units in step S457 indicates the same components and / or code to achieve shared functionality. Processor 310 continues to step S458.

[0080] In step S458, the processor 310, according to preset logic, uses the shuffled data (i.e., multiple second data vector units) that indicate a successful match to reorganize the test cases to generate new test cases. The processor 310 uses the generated test cases as supplementary test cases TS' and stores the supplementary test cases TS' in the test case database 322. The preset logic includes logic for combining parameters and logic for arranging requests.

[0081] On the other hand, when matching of the shuffled data fails, it indicates that in step S457, the multiple second data vector units of each group indicate different components and / or codes to achieve their respective functions. Processor 310 continues with step S459.

[0082] In step S459, processor 310 executes large model 340 to generate new test cases based on the similar multiple vectors (i.e., multiple second data vector units) recommended in step S454. Processor 310 uses the generated test cases as supplementary test cases TS' and stores the supplementary test cases TS' in test case database 322.
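
As a non-limiting illustration, the branching among steps S446, S458, and S459 described above might be sketched as follows. The text does not define how the matching degree is computed, so the mean pairwise cosine similarity used here is an assumption, as are the function names and the `groups` argument that stands in for the grouping of step S457.

```python
import math
from itertools import combinations

PRESET = 0.70  # the "e.g., 70%" preset value mentioned in the text

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def matching_degree(vectors):
    """Assumed metric: mean pairwise cosine similarity (0.0 when there are no pairs)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)

def choose_step(second_units, groups):
    """Return the step that would generate TS' for the given second data vector units.

    `groups` is a hypothetical partition (lists of indices into second_units)
    standing in for the grouping and shuffling of step S457.
    """
    if matching_degree(second_units) > PRESET:
        return "S446"  # generate from the test case template
    for group in groups:
        if matching_degree([second_units[i] for i in group]) > PRESET:
            return "S458"  # recombine the successfully matched group
    return "S459"  # fall back to the large model
```

For example, three nearly parallel vectors would lead to step S446, whereas two nearly parallel vectors plus one orthogonal vector would lead to step S458 when the two similar vectors are placed in the same group.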

[0083] Figure 5 is a partial flowchart of the software testing system of the embodiment of Figure 3 of the present invention. Referring to Figures 3 and 5, the software testing system 300 can execute steps S510 to S550 to illustrate how the processor 310 breaks down the data of the test case TS to obtain the corresponding vector data (e.g., multiple first data vector units, or multiple feature vectors EV), and further illustrate the implementation details of steps S451 to S453 or S441 to S445.

[0084] In this embodiment, in response to a test case TS with incremental code, the processor 310 executes steps S510 to S550 to obtain a plurality of first data vector units and update the vector database 321 accordingly.

[0085] In step S510, the processor 310 accesses the test case database 322 to obtain test cases TS, and extracts key information of the test cases TS. The key information includes the input parameters, output parameters, call request parameters, return values, and call order of the test cases TS.

[0086] In step S520, the processor 310 performs data preprocessing operations on the key information to generate key data based on the key information. Specifically, the processor 310 removes irrelevant information from the key information, standardizes the data format of the key information, and processes missing values in the key information to obtain the key data.

[0087] In step S530, the processor 310 performs a data shuffling operation on the key data to split the key data into multiple data units based on multiple identifiers. These data units are multiple parts of the key data and each has multiple identifiers. Each data unit may be, for example, the smallest unit of data such as a single parameter value, a single parameter name, or a single return value. Each data unit has a unique identifier.

[0088] In step S540, processor 310 vectorizes the multiple data units to generate a plurality of first data vector units. That is, processor 310 uses word embedding operations to convert the shuffled data (i.e., the multiple data units) into a vector data format. Alternatively, processor 310 uses other operations that convert text data into numerical data to generate a plurality of first data vector units based on the multiple data units.

[0089] In step S550, the processor 310 stores a plurality of first data vector units into the vector database 321 to update the vector database 321. Since each first data vector unit has its own identifier, the processor 310 can search the vector database 321 based on the identifier (e.g., similarity search) to obtain the corresponding first data vector unit.
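
As a non-limiting illustration, steps S520 to S550 might be sketched as follows. All function names, the underscore convention for irrelevant fields, and the identifier format are assumptions made for this sketch. In particular, `embed` is a deterministic hashing stand-in and not a real word embedding model.

```python
import hashlib

def preprocess(key_info):
    """S520: drop irrelevant fields, normalize names and values, fill missing values."""
    key_data = {}
    for name, value in key_info.items():
        if name.startswith("_"):  # assumed marker for irrelevant fields
            continue
        key_data[name.strip().lower()] = "" if value is None else str(value).strip()
    return key_data

def split_into_units(key_data, case_id):
    """S530: split into minimal data units, each tagged with a unique identifier."""
    return [
        {"id": f"{case_id}:{name}", "text": f"{name}={value}"}
        for name, value in sorted(key_data.items())
    ]

def embed(text, dim=8):
    """S540 stand-in: count tokens into hashed buckets (NOT a real word embedding)."""
    vec = [0.0] * dim
    for token in text.replace("=", " ").split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec

def update_vector_db(vector_db, units):
    """S550: store each vector under its unit identifier."""
    for unit in units:
        vector_db[unit["id"]] = embed(unit["text"])
    return vector_db

# Hypothetical key information extracted from a test case TS in step S510.
key_info = {" UserId ": 42, "amount": None, "_trace": "abc", "status": "OK"}
vector_db = update_vector_db({}, split_into_units(preprocess(key_info), "TS-1"))
```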

[0090] In this embodiment, based on the current test case database 322, the processor 310 periodically executes steps S510 to S550 to obtain multiple feature vectors EV, and updates the vector database 321 accordingly. The details of this operation can be deduced from the above description of steps S510 to S550 by treating the multiple feature vectors EV in the same way as the multiple first data vector units.

[0091] Figure 6 is a partial flowchart of the software testing system of the embodiment of Figure 3 of the present invention. Referring to Figures 3 and 6, the software testing system 300 can execute steps S610 to S660 to illustrate how the processor 310 uses shuffled data (i.e., a portion of multiple second data vector units) to reorganize test cases to generate supplementary test cases TS', and further illustrate the implementation details of steps S454 to S458.

[0092] In step S610, the processor 310 accesses the test case database 322 to obtain test cases TS with incremental code, and vectorizes the extracted key information in the test cases TS to generate multiple first data vector units. That is, for the test case TS, the processor 310 performs data preprocessing and vectorization operations on the key information of this test case TS, as described in the explanation of steps S510 to S540.

[0093] In step S620, processor 310 accesses vector database 321 to obtain multiple feature vectors EV, and performs a similarity search (e.g., cosine similarity) based on multiple first data vector units and multiple feature vectors EV to generate similarity search results.

[0094] In step S630, the processor 310 obtains multiple feature vectors EV that are most similar to the query input data (i.e., multiple first data vector units) based on the similarity search results, and uses them as second data vector units.

[0095] In other words, the processor 310 uses vectorized query input data (i.e., multiple first data vector units) in the vector database 321 to perform a similarity search operation. Based on the similarity search results, the processor 310 obtains multiple feature vectors EV (i.e., multiple second data vector units) that are most similar to the multiple first data vector units respectively.

[0096] In step S640, the processor 310 performs a data recombination operation to combine multiple second data vector units according to preset logic to obtain a recombination result. The preset logic includes logic for combining parameters and logic for arranging requests.

[0097] In step S650, the processor 310 uses the reorganization result as a new test case to generate a supplementary test case TS'. Furthermore, the processor 310 stores the supplementary test case TS' in the test case database 322 to update the test case database 322.

[0098] In other words, the processor 310 recombines the shuffled data (i.e., the multiple second data vector units in step S630) according to specific logic, and presents the test cases (i.e., supplementary test cases TS') in a new combination manner.
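
As a non-limiting illustration, the preset logic of step S640 (logic for combining parameters and logic for arranging requests) might be sketched as follows. The text does not specify either logic, so the Cartesian product and the call-order sort used here are assumptions, as are all names and sample values.

```python
from itertools import product

def combine_parameters(param_candidates):
    """Parameter-combination logic (assumed): Cartesian product of candidate values."""
    names = sorted(param_candidates)
    return [
        dict(zip(names, values))
        for values in product(*(param_candidates[n] for n in names))
    ]

def arrange_requests(requests, call_order):
    """Request-arrangement logic (assumed): order requests by a recorded call order."""
    rank = {name: i for i, name in enumerate(call_order)}
    return sorted(requests, key=lambda r: rank[r])

def recombine(param_candidates, requests, call_order):
    """Build one candidate test case per parameter combination."""
    ordered = arrange_requests(requests, call_order)
    return [
        {"requests": ordered, "params": params}
        for params in combine_parameters(param_candidates)
    ]

# Hypothetical content recovered from the second data vector units.
cases = recombine(
    {"qty": [1, 100], "currency": ["USD", "EUR"]},
    ["pay", "login", "create_order"],
    ["login", "create_order", "pay"],
)
```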

[0099] In step S660, the processor 310 verifies and adjusts the supplementary test case TS'. In this way, the processor 310 ensures that the supplementary test case TS' meets the test requirements and functional objectives. Furthermore, if the verification result indicates that the objectives are not met, the processor 310 adjusts the supplementary test case TS' based on multiple feature vectors EV in the vector database 321, and repeats step S660 until the verification result indicates that the objectives are met.

[0100] Figure 7 is a flowchart of a software testing method according to another embodiment of the present invention. Referring to Figures 3 and 7, the software testing system 300 can execute steps S710 to S790 via the processor 310.

[0101] In step S710, processor 310 executes the diff engine to obtain the incremental code of test case TS. That is, processor 310 executes the diff engine to compare the differences between two versions of test case TS. The diff engine executes the `diff-git` command to query the line numbers corresponding to the aforementioned differences. Based on these line numbers, the diff engine performs a difference analysis operation on the two versions of test case TS. The difference analysis operation includes steps such as file splitting, file filtering, removing redundant data, and parsing line numbers from the difference sections.

[0102] In this embodiment, the processor 310 can perform a chained call operation to sequentially call multiple modules or devices applied by the diff engine (e.g., a slicing module, a filtering device, a difference analysis device, and a stubbing device). Thus, the processor 310 generates incremental code coverage based on the test case TS.

[0103] In detail, the processor 310 obtains the JSON file corresponding to the test case TS through the diff engine. The processor 310 then uses the diff engine to split the JSON file corresponding to the test case TS based on the differences between the two versions. The diff engine determines whether the split JSON file contains invalid files.

[0104] When the above judgment result is yes, the diff engine filters out invalid files from the split JSON file to generate the file to be analyzed. On the other hand, when the above judgment result is no, the diff engine uses the split JSON file as the file to be analyzed. Next, the diff engine obtains the change details information in the file to be analyzed. The diff engine analyzes the changed line numbers based on the change details information to calculate the incremental code coverage.
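
As a non-limiting illustration, parsing changed line numbers and computing incremental code coverage might be sketched as follows. The sketch assumes the difference sections are in the unified diff format (hunk headers of the form `@@ -a,b +c,d @@`) and handles a single file. The function names and the sample diff are hypothetical.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

def changed_lines(diff_text):
    """Return the new-file line numbers of added or modified lines in a unified diff."""
    changed, line_no = set(), None
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            line_no = None  # file header lines follow; wait for the next hunk
            continue
        match = HUNK_HEADER.match(line)
        if match:
            line_no = int(match.group(1))
            continue
        if line_no is None:
            continue  # skip "---" / "+++" headers before the first hunk
        if line.startswith("+"):
            changed.add(line_no)
            line_no += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers have no new line number
        else:
            line_no += 1  # context line
    return changed

def incremental_coverage(changed, covered):
    """Fraction of changed lines that were executed; 1.0 when nothing changed."""
    if not changed:
        return 1.0
    return len(changed & covered) / len(changed)

sample_diff = "\n".join([
    "diff --git a/A.java b/A.java",
    "--- a/A.java",
    "+++ b/A.java",
    "@@ -10,3 +10,4 @@",
    " int a = 1;",
    "-int b = 2;",
    "+int b = 3;",
    "+int c = 4;",
    " return a;",
])
lines = changed_lines(sample_diff)
ratio = incremental_coverage(lines, covered={11})
```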

[0105] In this embodiment, the JSON file includes one or more changed line numbers. The diff engine performs an Abstract Syntax Tree (AST) operation based on the aforementioned changed line numbers to map the content of the JSON file to concrete methods in a Java file.

[0106] For example, the diff engine retrieves method-level difference information from Java files. The diff engine defines several fields for each method to form a specific method. These fields include startLine, endLine, change flag (isChange), requestmapping, changed method (requestMethod), method parameters (paramList), and changeperson (changePerson).
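
As a non-limiting illustration, the method-level difference record might be sketched as the following data structure. The field names are taken from the paragraph above. The class name, the field types, and the rule that a method counts as changed when any changed line falls within its line range are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class MethodDiff:
    startLine: int
    endLine: int
    requestmapping: str
    requestMethod: str
    paramList: list = field(default_factory=list)
    changePerson: str = ""
    isChange: bool = False

def mark_changed(methods, changed_line_numbers):
    """Set isChange on each method overlapping a changed line; return the changed ones."""
    for m in methods:
        m.isChange = any(m.startLine <= n <= m.endLine for n in changed_line_numbers)
    return [m for m in methods if m.isChange]

# Hypothetical methods parsed from a Java file.
methods = [
    MethodDiff(5, 20, "/orders", "createOrder", ["userId", "qty"], "alice"),
    MethodDiff(25, 40, "/pay", "pay"),
]
changed = mark_changed(methods, {11, 12})
```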

[0107] In step S720, processor 310 performs instrumentation on the code of test case TS. Processor 310 may, for example, perform on-the-fly instrumentation.

[0108] In other words, processor 310 executes the Java Virtual Machine (JVM) to start an instrumented agent program by specifying a particular JAR file through a program (e.g., a Java Agent). Processor 310 also uses the Java Class Loader to determine whether the code includes transformations or modifications to class files before loading a class into the agent program, and inserts statistical code into this class file. Thus, incremental code coverage analysis can be performed during the execution of the code by the Java Virtual Machine.

[0109] In step S730, processor 310 isolates incremental code based on its coverage. Processor 310 can achieve intelligent isolation of incremental code, for example, by adding labels to the incremental code using traffic annotations or by adding custom fields or specific parameters to the HTTP request header. In this way, processor 310 distinguishes between code coverage generated by manual test cases, automated test cases, or traffic replay test cases, thus ensuring that testing operations for different test cases do not interfere with each other.
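
As a non-limiting illustration, isolating coverage by a custom HTTP request header might be sketched as follows. The header name `X-Test-Source`, the source labels, and the class are hypothetical. The text only states that a custom field or specific parameter is added to the request header.

```python
from collections import defaultdict

SOURCE_HEADER = "X-Test-Source"  # hypothetical custom header name

class CoverageBuckets:
    """Keep executed line numbers separate for each kind of test traffic."""

    def __init__(self):
        self._lines = defaultdict(set)

    def record(self, headers, executed_lines):
        """Attribute executed lines to the source named in the request header."""
        source = headers.get(SOURCE_HEADER, "untagged")
        self._lines[source].update(executed_lines)

    def coverage(self, source, incremental_lines):
        """Incremental coverage contributed by one source only."""
        if not incremental_lines:
            return 1.0
        return len(self._lines[source] & incremental_lines) / len(incremental_lines)

buckets = CoverageBuckets()
buckets.record({"X-Test-Source": "manual"}, {11})
buckets.record({"X-Test-Source": "automated"}, {11, 12})
```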

[0110] In step S740, processor 310 applies data-driven and automation technology to the aforementioned steps S710 to S730 to improve the efficiency of the test operation and the coverage of the code.

[0111] In step S750, processor 310 generates an incremental code coverage report based on the operation results of steps S710 to S740.

[0112] In step S760, processor 310 executes large model 340 to intelligently generate supplementary test cases TS'. Operational details of step S760 can be found in the relevant description of the embodiment shown in Figure 4B.

[0113] In other words, based on various applications including model-based, data-driven, natural language processing, reinforcement learning, and code analysis operations, processor 310 executes large model 340 to generate one or more strategies based on test cases TS and multiple feature vectors EV. Among various strategies, processor 310 executes large model 340 to analyze the acquired incremental code based on the correlation between test cases TS and the code within them, and automatically generates one or more test cases as supplementary test cases TS'.

[0114] In addition, the processor 310 automatically executes each supplementary test case TS'. If the executed supplementary test case TS' fails multiple times consecutively (e.g., 3 times), the processor 310 automatically sends a notification message of the corresponding test report to the tester.
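
As a non-limiting illustration, the consecutive-failure notification might be sketched as follows. The immediate-retry policy, the function name, and the callback signatures are assumptions. The text only states that a notification is sent after multiple consecutive failures (e.g., 3 times).

```python
def run_with_notification(run_case, notify, max_consecutive_failures=3):
    """Run a supplementary test case until it passes or fails N times in a row.

    `run_case` returns True on success; `notify` receives a message for the tester.
    """
    failures = 0
    while failures < max_consecutive_failures:
        if run_case():
            return True
        failures += 1
    notify(f"supplementary test case failed {failures} times in a row")
    return False
```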

[0115] In step S770, processor 310 performs continuous integration (CI) / continuous delivery / deployment (CD) operations to automate the integration, testing, and final deployment of the software for various use cases.

[0116] In step S780, processor 310 sorts the uncovered code included in test cases TS according to multiple factors to define test case levels. Specifically, processor 310 collects source data from other application systems such as code repositories, continuous integration systems, and defect tracking systems. Processor 310 calculates a comprehensive score based on the source data to classify test case levels.

[0117] In this embodiment, the source data includes the number of lines of code used by the software, code complexity (e.g., cyclomatic complexity), business importance, and historical failure rate, which are represented by parameters R1 to R4, respectively. Business importance can be evaluated, for example, based on business requirements and priorities. The historical failure rate is evaluated, for example, based on the number of failures that have occurred in the past.

[0118] For example, processor 310 calculates the number of lines of code that are not covered for each method corresponding to the uncovered code to generate a calculation result, and represents the calculation result as parameter R1.

[0119] In this embodiment, the processor 310 also configures corresponding weight values for multiple data such as code complexity (e.g., cyclomatic complexity), business importance, and historical failure rate according to the uncovered code, and represents them respectively by multiple code weights W2 to W4.

[0120] For example, processor 310 assigns a weight W2 to the code complexity of each method corresponding to the uncovered code. The higher the code complexity (i.e., parameter R2), the larger the weight W2. Processor 310 assigns a weight W3 to the impact of each method corresponding to the uncovered code on the business logic. The higher the business importance (i.e., parameter R3), the larger the weight W3. Processor 310 assigns a weight W4 to the number of past failures of each method corresponding to the uncovered code. The higher the failure rate (i.e., parameter R4), the larger the weight W4.

[0121] Continuing from the above explanation, the processor 310 calculates the weighted sum of the uncovered code based on multiple code weights W2 to W4 to obtain a comprehensive score, and classifies the test case TS into test case levels based on the comprehensive score. The test case levels include multiple test case levels P0 to P3, which respectively indicate that the test case TS is of the highest level P0, the high level P1, the medium level P2, and the low level P3.

[0122] In this embodiment, the multiple code weights W2 to W4 include the code complexity weight W2, business importance weight W3, and historical failure rate weight W4 corresponding to the uncovered code, respectively.

[0123] In detail, processor 310 calculates the code complexity weight W2 of the individual method corresponding to the uncovered code, which can be expressed by the following formula (1). In formula (1), W_{CC,M} represents the code complexity weight W2, CC_M represents the code complexity of this method, Min_{CC} represents the minimum code complexity among all methods, Max_{CC} represents the maximum code complexity among all methods, and WF_{CC} represents the adjustment factor. The adjustment factor is used to control the proportion of the code complexity weight W2 among all weights W2 to W4.

[0124] In this embodiment, the processor 310 calculates the business importance weight W3 of the individual method corresponding to the uncovered code, which can be expressed by the following formula (2). In formula (2), W_{BI,M} represents the business importance weight W3, BI_M represents the business importance of this method, Min_{BI} represents the minimum business importance among all methods, Max_{BI} represents the maximum business importance among all methods, and WF_{BI} represents the adjustment factor of business importance.

[0125] In this embodiment, the processor 310 calculates the historical failure rate weight W4 of the individual method corresponding to the uncovered code, which can be expressed by the following formula (3). In formula (3), W_{HF,M} represents the historical failure rate weight W4, HF_M represents the historical failure rate of this method, Min_{HF} represents the minimum historical failure rate among all methods, Max_{HF} represents the maximum historical failure rate among all methods, and WF_{HF} represents the adjustment factor of the historical failure rate.

[0126] In this embodiment, based on formulas (1) to (3), the minimum and maximum values of code complexity, business importance, and historical failure rate for all methods, as well as the multiple adjustment factors, can be stored in memory 320, for example in the form of a configuration table (e.g., configuration table 325 shown in Figure 4A). In this embodiment, processor 310 can execute large model 340 to automatically analyze the data applied to test case TS based on the number of covered lines of code (i.e., parameter R1), thereby generating and adjusting the initial values of the aforementioned multiple parameters, the business importance of a single method, and the historical failure rate of a single method.

[0127] In this embodiment, based on formulas (1) to (3), the processor 310 calculates the weighted total value of the uncovered code, which can be expressed by formula (4). In formula (4), Score_M represents the weighted total value (i.e., the comprehensive score), W_{CC,i} represents the code complexity weight of each method, W_{BI,i} represents the business importance weight of each method, and W_{HF,i} represents the historical failure rate weight of each method.

[0128] In this embodiment, when the weighted total value (i.e., the comprehensive score) is greater than or equal to a first preset value, the processor 310 defines the test case TS as a highest-level test case (P0). When the weighted total value is less than the first preset value but greater than or equal to a second preset value, the processor 310 defines the test case TS as a high-level test case (P1). When the weighted total value is less than the second preset value but greater than or equal to a third preset value, the processor 310 defines the test case TS as a medium-level test case (P2). When the weighted total value is less than the third preset value, the processor 310 defines the test case TS as a low-level test case (P3).
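
As a non-limiting illustration, the weight calculation and level classification might be sketched as follows. Formulas (1) to (4) are not reproduced in this text, so the min-max normalization scaled by an adjustment factor and the plain sum of the three weights are assumptions inferred from the symbol definitions. The ranges, adjustment factors, and preset values in the example are likewise hypothetical.

```python
def normalized_weight(value, lo, hi, factor):
    """Assumed form of formulas (1)-(3): min-max normalization times an adjustment factor."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo) * factor

def comprehensive_score(method, ranges, factors):
    """Assumed form of formula (4): sum of the three weights of one method."""
    return sum(
        normalized_weight(method[k], ranges[k][0], ranges[k][1], factors[k])
        for k in ("CC", "BI", "HF")
    )

def classify(score, first, second, third):
    """Map a comprehensive score to a test case level using three preset values."""
    if score >= first:
        return "P0"
    if score >= second:
        return "P1"
    if score >= third:
        return "P2"
    return "P3"

# Hypothetical configuration table: (minimum, maximum) per factor and adjustment factors.
ranges = {"CC": (1, 21), "BI": (1, 5), "HF": (0.0, 0.5)}
factors = {"CC": 0.4, "BI": 0.4, "HF": 0.2}
method = {"CC": 11, "BI": 5, "HF": 0.25}

score = comprehensive_score(method, ranges, factors)
level = classify(score, first=0.8, second=0.6, third=0.3)
```

With these hypothetical values, the three weights are 0.2, 0.4, and 0.1, so the score is about 0.7 and the method falls into level P1.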

[0129] In this embodiment, the aforementioned preset values can be stored in memory 320. Processor 310 can execute large model 340 to automatically analyze the data applied by test case TS based on the number of lines of code covered (i.e., parameter R1), and then define these preset values.

[0130] In step S790, processor 310 executes at least one strategy to generate supplementary test cases TS' based on the test case level corresponding to the weighted total value (i.e., the comprehensive score), according to the test case database 322, the vector database 321, and the uncovered code. The operation of generating supplementary test cases TS' can be deduced from the relevant descriptions of step S430 in Figures 4A and 4B.

[0131] In addition, processor 310 sends notifications to testers based on the test case level. These notifications instruct software testing system 300 to generate supplementary test cases TS'. The notifications may be sent, for example, via email or electronic bulletin.

[0132] In this embodiment, before generating supplementary test cases TS', the processor 310 first executes the large model 340 to search the test case database 322 to determine whether the test case database 322 includes test cases with uncovered code (e.g., pre-test cases and / or post-test cases).

[0133] When the above determination result is yes, the processor 310 accesses the corresponding test case as at least a part of the supplementary test case TS', thereby avoiding the generation of duplicate test cases. Furthermore, the processor 310 may execute the large model 340 to generate the supplementary test case TS' based on the vector database 321 and the uncovered code. On the other hand, when the above determination result is no, the processor 310 executes the large model 340 to generate the supplementary test case TS' based on the vector database 321 and the uncovered code.

[0134] In this embodiment, when the test case level is the highest (P0), the processor 310 executes the first and second strategies. When the test case level is high (P1), the processor 310 executes the second and third strategies. When the test case level is medium (P2), the processor 310 executes the third strategy. When the test case level is low (P3), the processor 310 executes the fourth strategy.
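
As a non-limiting illustration, the mapping from test case level to strategies might be expressed as a simple lookup table. The strategy labels below are shorthand chosen for this sketch for the first to fourth strategies (functionality, boundary testing, state testing, and anomaly testing).

```python
STRATEGIES_BY_LEVEL = {
    "P0": ["functional", "boundary"],  # first and second strategies
    "P1": ["boundary", "state"],       # second and third strategies
    "P2": ["state"],                   # third strategy
    "P3": ["anomaly"],                 # fourth strategy
}

def strategies_for(level):
    """Return the strategies to execute for a test case level; raise on unknown levels."""
    try:
        return list(STRATEGIES_BY_LEVEL[level])
    except KeyError:
        raise ValueError(f"unknown test case level: {level!r}") from None
```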

[0135] In this embodiment, the first strategy instructs the generation of supplementary test cases TS' based on functionality. In the first strategy, the processor 310 calculates the cyclomatic complexity of a single method corresponding to the uncovered code. The cyclomatic complexity is equal to the number of decision nodes in the flowchart of the method (e.g., represented by parameter P) plus 1. That is, the cyclomatic complexity is equal to P+1.

[0136] Continuing from the above explanation, the number of decision nodes is the number of nodes that change the control flow of the method, such as the nodes corresponding to if statements, while loops, etc. Based on the first strategy, the processor 310 generates one or more (e.g., 4) supplementary test cases TS' according to the cyclomatic complexity (e.g., P+1 = 4).
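
As a non-limiting illustration, counting decision nodes to obtain P+1 might be sketched as follows. For brevity the sketch analyzes Python source with the standard `ast` module as a stand-in for the Java methods discussed in this embodiment, and it counts only `if`, `while`, `for`, and conditional expressions. Boolean operators and exception handlers are deliberately ignored, which is a simplification.

```python
import ast

DECISION_NODES = (ast.If, ast.While, ast.For, ast.IfExp)

def cyclomatic_complexity(source):
    """Return P + 1, where P is the number of decision nodes found in the source."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
    return decisions + 1

# Hypothetical method with three decision nodes (if, while, if).
sample = """
def ship(order):
    if order.paid:
        while order.items:
            order.items.pop()
    if order.express:
        return "fast"
    return "normal"
"""
complexity = cyclomatic_complexity(sample)
```

Under the first strategy, a method with this complexity would receive four supplementary test cases.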

[0137] In this embodiment, the second strategy instructs the generation of supplementary test cases TS' based on boundary tests. In the second strategy, the processor 310 generates at least four supplementary test cases TS'. The supplementary test cases respectively indicate multiple boundary values such as slightly less than the minimum value, the minimum value, the maximum value, and slightly greater than the maximum value.

[0138] For example, suppose the uncovered code instructs a numeric input box to require the user to enter an integer between 10 and 50. Processor 310 may generate four supplementary test cases TS'. The first supplementary test case TS' generated by processor 310 indicates that the input is slightly less than the minimum value (e.g., 9) and indicates whether the system under test (e.g., software testing system 300 or other application systems it calls) provides a prompt indicating that the input is less than the minimum value.

[0139] Furthermore, the second supplementary test case TS' generated by processor 310 indicates that the input is the minimum value (e.g., 10) and indicates whether the system under test accepts the minimum value as input. The third supplementary test case TS' generated by processor 310 indicates that the input is the maximum value (e.g., 50) and indicates whether the system under test accepts the maximum value as input. The fourth supplementary test case TS' generated by processor 310 indicates that the input is slightly larger than the maximum value (e.g., 51) and indicates whether the system under test provides a prompt indicating that the input is larger than the maximum value.
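
As a non-limiting illustration, the four boundary cases of this example might be generated as follows. The expected-result labels and the `validate` function, which stands in for the system under test, are hypothetical.

```python
def boundary_cases(minimum, maximum, step=1):
    """Return the four boundary test cases for an inclusive integer range."""
    return [
        {"input": minimum - step, "expect": "reject_below_min"},
        {"input": minimum, "expect": "accept"},
        {"input": maximum, "expect": "accept"},
        {"input": maximum + step, "expect": "reject_above_max"},
    ]

def validate(value, minimum=10, maximum=50):
    """Hypothetical system under test: an integer input limited to [minimum, maximum]."""
    if value < minimum:
        return "reject_below_min"
    if value > maximum:
        return "reject_above_max"
    return "accept"

cases = boundary_cases(10, 50)
results = [validate(case["input"]) == case["expect"] for case in cases]
```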

[0140] In this embodiment, the third strategy instructs the generation of supplementary test cases TS' based on state testing. In the third strategy, the processor 310 sets the initial state, inputs or events, expected results, and verification steps. The initial state indicates the initial state required for the supplementary test case TS'. The inputs or events indicate the necessary inputs or triggering events provided by the supplementary test case TS' to perform state transitions. The expected results indicate the defined target state of the supplementary test case TS' after executing the inputs or events, and any expected outputs or behaviors. The verification steps indicate how the supplementary test case TS' demonstrates whether the system under test has achieved the expected state and results.

[0141] In this embodiment, the fourth strategy instructs the generation of supplementary test cases TS' based on anomaly testing. In the fourth strategy, the processor 310 writes supplementary test cases TS' to verify behavior under abnormal or error conditions. These conditions include, for example, parameter errors, format errors, and errors in required fields.

[0142] In summary, the software testing method and system of the present invention, through a processor performing similarity searches on multiple vectors (i.e., multiple first data vector units) and multiple feature vectors corresponding to test cases, can filter out some feature vectors as multiple second data vector units. Thus, by reorganizing these second data vector units by the processor, the software testing system can generate fully covered supplementary test cases, thereby automatically completing full-coverage testing to achieve accurate testing and improve the efficiency of testing operations.

[0143] Furthermore, based on the above description, the software testing method and system of the present invention support software regression and reduce reliance on human resources. The software testing system can create scenarios and data that are difficult to achieve manually, and can execute supplementary test cases beyond the capabilities of manual testing, thereby significantly improving the coverage of test operations. The software testing system can execute test operations in advance, allowing test operations and development operations to be executed simultaneously, thereby improving the agility of development projects and reducing the time spent in the testing phase.

[0144] Based on a thorough understanding of the requirements of testing operations, the corresponding system architecture, and implementation details, the software testing system can clearly define the scope of testing operations by focusing on code-level development. The software testing system can accurately determine the concepts of continuous integration, continuous deployment, and continuous testing that need to be integrated into the testing operations. Simultaneously, by employing white-box and black-box testing methods, the software testing system can ensure code standardization, quality, and security. The aforementioned testing methods include executing unit tests and evaluating test coverage, as well as utilizing automated testing operations to complete functional verification.

[0145] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A software testing method, characterized by comprising:
obtaining, by a processor, a test case; and
when the test case has incremental code, generating, by the processor, a supplementary test case based on coverage of the incremental code and according to a plurality of feature vectors in a vector database and the test case, comprising:
extracting key information of the test case to obtain a plurality of first data vector units;
performing a similarity search based on the plurality of first data vector units and the plurality of feature vectors to obtain a plurality of second data vector units; and
combining the plurality of second data vector units to generate the supplementary test case.

2. The software testing method according to claim 1, characterized in that the step of extracting the key information of the test case to obtain the plurality of first data vector units comprises:
generating, by the processor, key data based on the key information;
dividing, by the processor, the key data into a plurality of data units based on a plurality of identifiers, wherein each of the plurality of data units has the plurality of identifiers; and
vectorizing, by the processor, the plurality of data units to generate the plurality of first data vector units.

3. The software testing method according to claim 2, characterized in that the processor combines the plurality of second data vector units according to preset logic to generate the supplementary test case, wherein the preset logic includes parameter combination logic and request arrangement logic.

4. The software testing method according to claim 1, characterized in that the step of generating the supplementary test case based on the coverage of the incremental code and according to the plurality of feature vectors in the vector database and the test case further comprises:
determining, based on a matching degree between the plurality of second data vector units, whether to combine the plurality of second data vector units to generate the supplementary test case, or to generate the supplementary test case based on a test case template in a test case database and the incremental code.

5. The software testing method according to claim 1, characterized by further comprising:
storing, by the processor, the plurality of first data vector units into the vector database to update the vector database.

6. The software testing method according to claim 1, characterized by further comprising:
accessing, by the processor, a test case database to obtain the test case; and
storing, by the processor, the supplementary test case in the test case database to update the test case database.

7. The software testing method according to claim 1, characterized by further comprising:
calculating, by the processor, a weighted total value of uncovered code included in the test case based on a plurality of code weights; and
executing, by the processor, at least one strategy, based on a test case level corresponding to the weighted total value, to generate the supplementary test case according to a test case database, the vector database, and the uncovered code.

8. The software testing method according to claim 7, characterized by further comprising:
issuing, by the processor, notification information based on the test case level to instruct generation of the supplementary test case.

9. The software testing method according to claim 7, characterized in that the plurality of code weights include a code complexity weight, a business importance weight, and a historical failure rate weight corresponding to the uncovered code, respectively.

10. A software testing system, characterized by comprising:
a memory, configured to store a vector database; and
a processor, coupled to the memory and configured to execute the software testing method according to claim 1.