Assertion information generation method and apparatus, computer device, and storage medium

Assertion information is determined automatically by converting the output dataset of a test case into vectors with a numeric dictionary and selecting the vector closest to the centroid vector. This addresses the tedious, time-consuming nature of manual assertion writing and improves testing efficiency and assertion accuracy.

CN114780392B · Active · Publication Date: 2026-06-23 · INDUSTRIAL AND COMMERCIAL BANK OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INDUSTRIAL AND COMMERCIAL BANK OF CHINA
Filing Date
2022-04-08
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Manually writing assertions is tedious, time-consuming, and inefficient.

Method used

The output dataset of a test case is obtained, its field values are converted using a numeric dictionary, the centroid vector is determined, and assertion information is generated and stored in a database for later retrieval.

Benefits of technology

It enables automatic determination of test case assertions, improving the efficiency of test script writing and the sufficiency of assertions.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to an assertion information generation method and apparatus, a computer device, and a storage medium, applied in the technical field of big data. The method comprises the following steps: acquiring an output dataset of a test case, where the output dataset comprises multiple groups of output data, each group of output data is the output of one execution of the test case, and each group of output data comprises field values of multiple fields; converting the field values of the fields in each group of output data using a numeric dictionary to obtain an output vector corresponding to each group of output data; determining, from the output vectors, a target output vector closest to a centroid vector, where the centroid vector is the center of the vector set formed by the output vectors; and restoring the target output vector to original field values and generating assertion information of the test case based on the original field values. The method automatically determines the data to be asserted in the test case and can improve the efficiency of writing test scripts.

Description

Technical Field

[0001] This application relates to the field of big data technology, and in particular to an assertion information generation method, apparatus, computer device, storage medium, and computer program product.

Background Technology

[0002] An assertion is a first-order logic statement in a program (e.g., a logical condition that evaluates to either true or false). Its purpose is to express and verify the result expected by the software developer: when execution reaches the assertion, the assertion should be true. If the assertion is false, the program stops executing and reports an error. Current testing standards require that automated scripts written by testers include assertions, and testers usually write these assertions by hand.

[0003] However, manually writing assertions is tedious, time-consuming, and inefficient.

Summary of the Invention

[0004] In view of the above, it is necessary to provide an assertion information generation method, apparatus, computer device, computer-readable storage medium, and computer program product that address the tedious, time-consuming, and inefficient nature of manual assertion writing.

[0005] In a first aspect, this application provides a method for generating assertion information. The method includes:

[0006] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0007] The field values of each field in each set of output data are converted using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0008] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0009] The target output vector is restored to its original field values, and assertion information for the test case is generated based on these original field values.

[0010] In one embodiment, the method further includes: obtaining the number of times each field value of each field appears in the output dataset;

[0011] Based on the number of occurrences, determining, from the plurality of fields, a target field that meets the preset conditions;

[0012] The process of converting the field values of each field in each set of output data using a numeric dictionary includes:

[0013] Converting the field values of the target field in each set of output data using the numeric dictionary.

[0014] In one embodiment, determining the target field that meets preset conditions from the plurality of fields based on the number of occurrences includes:

[0015] For each field, based on the number of occurrences, determine the probability of occurrence of each field value, and take the largest of these probabilities as the target occurrence probability of the field;

[0016] The target field is obtained by removing, from the plurality of fields, the fields whose target occurrence probability is less than a threshold.

[0017] In one embodiment, before determining the target output vector closest to the centroid vector from each set of output vectors, the method further includes:

[0018] Combine the output vectors of each group into a determinant and obtain the value of the determinant;

[0019] The ratio of the sum of the output vectors of each group to the value of the determinant is obtained and used as the centroid vector.

[0020] In one embodiment, before determining the target output vector closest to the centroid vector from each set of output vectors, the method further includes:

[0021] Normalize the output vectors of each group to obtain the normalized vectors corresponding to each group of output vectors;

[0022] The step of determining the target output vector closest to the centroid vector from each set of output vectors includes:

[0023] From each group of normalized vectors, determine the target normalized vector that is closest to the centroid vector;

[0024] The step of restoring the target output vector to its original field values includes:

[0025] The target normalized vector is restored to the original field values.

[0026] In one embodiment, after restoring the target output vector to its original field values and generating the assertion information for the test case based on those original field values, the method further includes:

[0027] Store the test cases and their assertion information in the database;

[0028] When the test case is executed again, the assertion information of the test case is retrieved from the database;

[0029] When the actual output data of the test case does not match the assertion information, an error message is fed back to the test terminal, enabling the test terminal to perform error analysis based on the error message.
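The store, retrieve, and compare steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory dictionary `assertion_store` stands in for the database, and the names `save_assertion`, `check_case`, and the identifier `case_A` are hypothetical and do not appear in the application.

```python
from typing import Dict, List

# Hypothetical stand-in for the database: test case identifier -> assertion information.
assertion_store: Dict[str, Dict[str, str]] = {}


def save_assertion(case_id: str, expected: Dict[str, str]) -> None:
    """Store the assertion information (expected field values) of a test case."""
    assertion_store[case_id] = dict(expected)


def check_case(case_id: str, actual: Dict[str, str]) -> List[str]:
    """Compare actual output with the stored assertion information.

    Returns one message per mismatching field; an empty list means a match.
    Fields that are not part of the assertion information are ignored.
    """
    expected = assertion_store[case_id]
    errors = []
    for field, expected_value in expected.items():
        actual_value = actual.get(field)
        if actual_value != expected_value:
            errors.append(f"{field}: expected {expected_value!r}, got {actual_value!r}")
    return errors


save_assertion("case_A", {"InfoCommV10.trxCode": "2631"})
errors = check_case("case_A", {"InfoCommV10.trxCode": "2632", "time": "10:00"})
```

In this sketch the returned list plays the role of the error message fed back to the test terminal; how the message is actually transported is left open.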

[0030] In a second aspect, this application also provides an assertion information generation apparatus. The apparatus includes:

[0031] The acquisition module is used to acquire the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0032] The conversion module is used to convert the field values of each field in each set of output data using a numeric dictionary, so as to obtain the output vector corresponding to each set of output data.

[0033] The determination module is used to determine the target output vector that is closest to the centroid vector from each group of output vectors; the centroid vector is the center of the vector set formed by each group of output vectors.

[0034] The generation module is used to restore the target output vector to its original field values and generate assertion information for the test case based on the original field values.

[0035] In one embodiment, the apparatus further includes a filtering module for obtaining the number of times each field value of each field appears in the output dataset, and for determining a target field that meets preset conditions from the plurality of fields based on the number of occurrences.

[0036] The conversion module is also used to convert the field values of the target field in each set of output data using the numeric dictionary.

[0037] In one embodiment, the filtering module is further configured to, for each field, determine the probability of occurrence of each field value based on the number of occurrences, and determine the target occurrence probability with the largest value from the occurrence probabilities of each field value; and remove fields from the plurality of fields whose target occurrence probability is less than a threshold to obtain the target field.

[0038] In one embodiment, the apparatus further includes a centroid determination module, configured to combine the output vectors into a determinant and obtain the value of the determinant; and to obtain the ratio of the sum of the output vectors to the value of the determinant as the centroid vector.

[0039] In one embodiment, the device further includes a normalization module for normalizing each group of output vectors to obtain normalized vectors corresponding to each group of output vectors.

[0040] The determination module is also used to determine the target normalized vector that is closest to the centroid vector from each group of normalized vectors;

[0041] The generation module is also used to restore the normalized target vector back to the original field value.

[0042] In one embodiment, the device further includes:

[0043] A storage module is used to store the test cases and the assertion information of the test cases in a database;

[0044] The assertion verification module is used to retrieve the assertion information of the test case from the database when the test case is executed again; when the actual output data of the test case does not match the assertion information, it feeds back error information to the test terminal, so that the test terminal can perform error analysis based on the error information.

[0045] In a third aspect, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to perform the following steps:

[0046] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0047] The field values of each field in each set of output data are converted using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0048] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0049] The target output vector is restored to its original field values, and assertion information for the test case is generated based on these original field values.

[0050] In a fourth aspect, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, performs the following steps:

[0051] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0052] The field values of each field in each set of output data are converted using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0053] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0054] The target output vector is restored to its original field values, and assertion information for the test case is generated based on these original field values.

[0055] In a fifth aspect, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, performs the following steps:

[0056] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0057] The field values of each field in each set of output data are converted using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0058] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0059] The target output vector is restored to its original field values, and assertion information for the test case is generated based on these original field values.

[0060] In the aforementioned assertion information generation method, apparatus, computer device, storage medium, and computer program product, after the output dataset of the test case is acquired, the field values of each field in each group of output data are converted using a numeric dictionary to obtain the output vector corresponding to each group of output data. Then, from the output vectors, the target output vector closest to the centroid vector is determined, where the centroid vector is the center of the vector set formed by the output vectors of each group. The target output vector is restored to its original field values, and assertion information for the test case is generated based on these original field values. By analyzing the output dataset obtained from multiple executions of the test case, the method determines the benchmark data that can be used for assertion verification of the test case, so that the data to be asserted is determined automatically. Testers do not need to spend much time writing assertion content; they only need to call the corresponding assertion comparison interface, which improves the efficiency of test script writing and the sufficiency of assertions.

Brief Description of the Drawings

[0061] Figure 1 is a flowchart of an assertion information generation method in one embodiment;

[0062] Figure 2 is a flowchart of the field value conversion process in one embodiment;

[0063] Figure 3 is a flowchart of the target field determination step in one embodiment;

[0064] Figure 4 is a block diagram of an assertion information generation system in one embodiment;

[0065] Figure 5 is a complete flowchart of the assertion information generation method in another embodiment;

[0066] Figure 6 is a structural block diagram of an assertion information generation apparatus in one embodiment;

[0067] Figure 7 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0068] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0069] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.

[0070] In one embodiment, as shown in Figure 1, an assertion information generation method is provided. This embodiment is described with the method applied to a terminal. It is understood that the method can also be applied to a server, or to a system including both a terminal and a server, in which case it is implemented through interaction between the terminal and the server. The terminal can be, but is not limited to, various personal computers, laptops, smartphones, tablets, IoT devices, and portable wearable devices. IoT devices can include smart speakers, smart TVs, smart air conditioners, smart in-vehicle devices, etc. Portable wearable devices can include smartwatches, smart bracelets, head-mounted devices, etc. The server can be implemented as a standalone server or as a server cluster consisting of multiple servers. In this embodiment, the method includes the following steps:

[0071] Step S110: Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields.

[0072] A test case, or test scenario, describes the testing task for a specific software product and mainly includes the test objectives, the test environment, and the test scripts. It should be noted that, in this application, assertion information can only be generated for test cases that have been executed successfully.

[0073] The output data can be the data output during the execution of the test script in the test case. During the execution of the test script, multiple data items are output, each corresponding to a field. Therefore, when the test script is executed once, the field values of multiple fields are output, forming one set of output data.

[0074] In this context, a field can be understood as a data attribute, and a field value represents the value that the attribute can take. For example, a field can be "name" or "age," and the field value under "age" can be "20 years old" or "30 years old," etc.

[0075] In the specific implementation, for ease of description, this application takes the generation of assertion information for a test case as an example. Let the test case be A. The output data of each execution of test case A is saved in advance. The output data of each execution is used as a set of output data to form the output dataset of test case A.

[0076] The data structure for saving output data can be: script class name, @test method name, case description, and interface return data (i.e., output data). The first three fields uniquely identify a test case for an automated script. The interface return data saves the data returned by the interface being tested, stored in JSON format, as shown in Table 1 below:

[0077] Table 1. Data Structure for Saving Output Data

[0078]

[0079] Table 1 shows that the data returned by the interface contains four fields: Inprivate.from represents a field with the value "online banking"; Inprivate.currtype represents a field with the value "1.00"; InfoCommV10.trxCode represents a field with the value "2631"; and InfoCommV10.brno represents a field with the value "00998".
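As a rough illustration of the storage structure described above, the following Python sketch models one saved record. The class name `SavedOutput`, its attribute names, and the sample values for the script class name, method name, and case description are hypothetical; only the four interface fields and their values are taken from the description of Table 1.

```python
import json
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SavedOutput:
    """One saved execution result of a test case (hypothetical structure)."""
    script_class_name: str
    test_method_name: str
    case_description: str
    interface_return_data: str  # JSON text returned by the interface under test

    def case_key(self) -> Tuple[str, str, str]:
        # The first three fields together identify the test case.
        return (self.script_class_name, self.test_method_name, self.case_description)

    def fields(self) -> Dict[str, str]:
        # Parse the stored JSON into a field -> field value mapping.
        return json.loads(self.interface_return_data)


record = SavedOutput(
    script_class_name="TransferScript",    # hypothetical value
    test_method_name="testTransfer",       # hypothetical value
    case_description="transfer succeeds",  # hypothetical value
    interface_return_data=json.dumps({
        "Inprivate.from": "online banking",
        "Inprivate.currtype": "1.00",
        "InfoCommV10.trxCode": "2631",
        "InfoCommV10.brno": "00998",
    }),
)
```

Grouping records by `case_key()` would then yield the output dataset of one test case, with `fields()` giving one set of output data per execution.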

[0080] Step S120: The field values of each field in each set of output data are converted using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0081] In the implementation, a field may take several different values (e.g., the `Inprivate.from` field could have values such as "online banking," "counter service," and "online service"). To avoid data errors and confusion, the possible values of each field can be listed and represented using a numeric dictionary. For example, 1 represents online banking, 2 represents counter service, and 3 represents online service. The values of the other fields are converted in the same way, giving a numeric dictionary that covers the different values of all fields. By replacing the field values in each set of output data with the corresponding numbers in the numeric dictionary, the converted value of each field in each set of output data is obtained. The converted values of the fields then form the output vector corresponding to each set of output data.

[0082] For example, suppose the output data includes three fields: field A, field B, and field C. Field A has three possible values: A1, A2, and A3, corresponding to 1, 2, and 3 in the numeric dictionary. Field B has two possible values: B1 and B2, corresponding to 1 and 2 in the numeric dictionary. Field C has one possible value: C1, corresponding to 1 in the numeric dictionary. For the output data: {field A: A2, field B: B1, field C: C1}, after conversion using the numeric dictionary, the output vector will be: [2 1 1].
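The conversion in this example can be sketched in Python as follows. The function names `build_numeric_dictionary` and `encode` are hypothetical, and assigning codes in sorted order of the field values is an assumption made for the sketch; the text does not say how codes are assigned.

```python
from typing import Dict, List


def build_numeric_dictionary(records: List[Dict[str, str]]) -> Dict[str, Dict[str, int]]:
    """Map every distinct value of every field to an integer code starting at 1.

    Codes are assigned in sorted order of the values (an assumption of this sketch).
    """
    values_per_field: Dict[str, set] = {}
    for record in records:
        for field, value in record.items():
            values_per_field.setdefault(field, set()).add(value)
    return {
        field: {value: code for code, value in enumerate(sorted(values), start=1)}
        for field, values in values_per_field.items()
    }


def encode(record: Dict[str, str],
           numeric_dict: Dict[str, Dict[str, int]],
           field_order: List[str]) -> List[int]:
    """Convert one set of output data into its output vector."""
    return [numeric_dict[field][record[field]] for field in field_order]


records = [
    {"field A": "A1", "field B": "B1", "field C": "C1"},
    {"field A": "A2", "field B": "B1", "field C": "C1"},
    {"field A": "A3", "field B": "B2", "field C": "C1"},
]
numeric_dict = build_numeric_dictionary(records)
vector = encode(records[1], numeric_dict, ["field A", "field B", "field C"])
```

With these three records, `vector` is `[2, 1, 1]`, matching the example above. A fixed `field_order` is needed so that every output vector uses the same component layout.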

[0083] Step S130: Determine the target output vector that is closest to the centroid vector from each set of output vectors; the centroid vector is the center of the vector set formed by each set of output vectors.

[0084] In the specific implementation, since assertions are used for verification, the data used to generate assertions should be the baseline data. Therefore, after obtaining the output vectors corresponding to each group of output data, the center vector of the vector set formed by each group of output vectors can be determined as the centroid vector. The output vector closest to the centroid vector from each group of output vectors is determined as the target output vector for generating assertion information.

[0085] More specifically, the output vectors of each group can be combined into a determinant, the value of the determinant can be obtained, and the centroid vector of the vector set formed by each group of output vectors can be determined based on the value of the determinant and the sum of the output vectors of each group.

[0086] Step S140: Restore the target output vector to the original field values and generate assertion information for the test case based on the original field values.

[0087] In the specific implementation, after the target output vector closest to the centroid vector is determined, the target output vector is restored to the original field values it had before conversion with the numeric dictionary. These original field values serve as the benchmark data for assertion verification of the test case, so assertion information for the test case can be generated from them. Based on steps S110-S140 above, assertion information can be obtained for any successfully executed test case, and the test case and its assertion information can be stored in the database. When the test case is executed later, the corresponding assertion comparison interface can be called directly to determine whether the assertions hold.

[0088] In the aforementioned assertion information generation method, after the output dataset of the test case is obtained, the field values of each field in each group of output data are converted using a numeric dictionary to obtain the output vector corresponding to each group of output data. Then, from the output vectors, the target output vector closest to the centroid vector is determined, where the centroid vector is the center of the vector set formed by the output vectors of each group. The target output vector is restored to its original field values, and the assertion information of the test case is generated based on the original field values. By analyzing the output dataset obtained from multiple executions of the test case, the method determines the benchmark data that can be used for assertion verification of the test case, so that the data to be asserted is determined automatically. Testers do not need to spend much time writing assertion content; they only need to call the corresponding assertion comparison interface, which improves the efficiency of test script writing and the sufficiency of assertions.
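Restoring a target output vector to its original field values amounts to inverting the numeric dictionary. A minimal Python sketch, assuming a dictionary of the shape `field -> {value: code}`, is shown below; the function name `restore` and the single-entry dictionary for `InfoCommV10.trxCode` are hypothetical.

```python
from typing import Dict, List

# Example numeric dictionary; the codes for Inprivate.from follow the text,
# the entry for InfoCommV10.trxCode is assumed for illustration.
numeric_dict: Dict[str, Dict[str, int]] = {
    "Inprivate.from": {"online banking": 1, "counter service": 2, "online service": 3},
    "InfoCommV10.trxCode": {"2631": 1},
}
field_order: List[str] = ["Inprivate.from", "InfoCommV10.trxCode"]


def restore(vector: List[int],
            numeric_dict: Dict[str, Dict[str, int]],
            field_order: List[str]) -> Dict[str, str]:
    """Map each component of an output vector back to its original field value."""
    restored = {}
    for field, code in zip(field_order, vector):
        inverse = {c: value for value, c in numeric_dict[field].items()}
        restored[field] = inverse[code]
    return restored


assertion_info = restore([1, 1], numeric_dict, field_order)
```

Here `assertion_info` holds the expected field values that a later run of the test case would be compared against.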

[0089] In one exemplary embodiment, as shown in Figure 2, before step S120, in which the field values of each field in each set of output data are converted using a numeric dictionary, the following steps are also included:

[0090] Step S111: Obtain the number of times each field value of each field appears in the output dataset;

[0091] Step S112: Based on the number of occurrences, determine the target field that meets the preset conditions from multiple fields;

[0092] The conversion in step S120 of the field values of each field in each set of output data using a numeric dictionary can then be implemented as follows:

[0093] Step S113: Convert the field values of the target field in each set of output data using a numeric dictionary.

[0094] In practical implementation, since a field may take multiple values, the number of times each field value appears in the output dataset is obtained. If every field value of a field appears only rarely, the field changes frequently and is only weakly correlated with the test case. A time field is an example: because time is constantly changing, the occurrence probability of each of its field values is low. Such a field need not be checked during assertion. The fields that need to be asserted and checked are therefore those whose values are relatively constant, that is, fields for which the occurrence probability of a field value is greater than or equal to a threshold. Such fields are identified as target fields, and the target fields that meet this condition are determined from the multiple fields in the output data. When the field values in each set of output data are converted using the numeric dictionary, only the field values of the target fields need to be converted, which improves the efficiency of the subsequent determination of the target output vector.

[0095] In this embodiment, the target field is determined from multiple fields based on the number of times each field value appears in the output dataset. Only the field values of the target field in each set of output data are converted using the numeric dictionary; the field values of the other fields are not converted. This improves the efficiency of the subsequent determination of the target output vector and saves computing resources.

[0096] In one exemplary embodiment, as shown in Figure 3, step S112 above, which determines the target field that meets the preset conditions from multiple fields based on the number of occurrences, can be implemented through the following steps:

[0097] Step S112A: For each field, based on the number of occurrences, determine the probability of occurrence of each field value, and take the largest of these probabilities as the target occurrence probability of the field;

[0098] Step S112B: Remove, from the multiple fields, the fields whose target occurrence probability is less than a threshold to obtain the target field.

[0099] The probability of occurrence is the ratio of the number of times a given value of a field appears to the total number of times all values of that field appear.

[0100] In practical implementation, for a field with only one value, the corresponding target occurrence probability is 100%. For a field with multiple values, the ratio of the number of occurrences of each field value to the total number of occurrences of all field values of that field is obtained as the occurrence probability of each field value. The highest of these occurrence probabilities is selected as the target occurrence probability and compared with a threshold. If the target occurrence probability is less than the threshold, the field value changes frequently, and the field can be removed. The fields that remain after removing the fields whose target occurrence probability is less than the threshold are determined as the target fields.

[0101] In this embodiment, the target occurrence probability, i.e., the highest occurrence probability among the field values of a field, is compared with a threshold, and fields whose target occurrence probability is less than the threshold are removed from the multiple fields to obtain the target field. This filters the fields so that the resulting target field is a field whose value is relatively constant and needs to be asserted and verified.
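Steps S112A and S112B can be sketched in Python as follows. The function name `select_target_fields`, the `time` field, and the threshold value 0.8 are hypothetical; the text does not fix a threshold. The sketch also assumes every record contains the same fields.

```python
from collections import Counter
from typing import Dict, List


def select_target_fields(records: List[Dict[str, str]], threshold: float) -> List[str]:
    """Keep the fields whose most frequent value reaches the threshold probability."""
    targets = []
    for field in records[0].keys():
        counts = Counter(record[field] for record in records)
        total = sum(counts.values())
        # Target occurrence probability: the largest occurrence probability
        # among all values of this field.
        target_probability = max(counts.values()) / total
        if target_probability >= threshold:
            targets.append(field)
    return targets


records = [
    {"InfoCommV10.trxCode": "2631", "time": "10:00:01"},
    {"InfoCommV10.trxCode": "2631", "time": "10:00:02"},
    {"InfoCommV10.trxCode": "2631", "time": "10:00:03"},
    {"InfoCommV10.trxCode": "2631", "time": "10:00:04"},
]
target_fields = select_target_fields(records, threshold=0.8)
```

In this data the transaction code has a target occurrence probability of 1.0 and is kept, while each `time` value occurs once (probability 0.25), so the `time` field is removed.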

[0102] In an exemplary embodiment, before determining the target output vector closest to the centroid vector from each set of output vectors in step S130, the method further includes:

[0103] Step S121: Combine the output vectors of each group into a determinant and obtain the value of the determinant;

[0104] Step S122: Obtain the ratio of the sum of the output vectors of each group to the value of the determinant, and use it as the centroid vector.

[0105] In practice, after determining the output vectors corresponding to each set of output data, each set of output vectors can correspond to a point in multidimensional space. For the vector set formed by each set of output vectors, there are multiple points in multidimensional space. The process of determining the centroid vector is the process of determining the centroids of these points. The determination process can be represented by the following relational expression:

[0106] $$\mu = \frac{1}{|C|} \sum_{x \in C} x$$

[0107] Where μ represents the centroid vector, C represents the vector set consisting of the output vectors of each group, |C| represents the value of the determinant obtained by combining the output vectors of each group, and x represents an output vector in C.

[0108] For example, given two sets of output data, with output vectors (a, b) and (c, d) corresponding to each set, the centroid vector of the vector set formed by the output vectors of each set can be determined by the following formula:

[0109]

[0110] After obtaining the centroid vector, the Euclidean distance between each set of output vectors and the centroid vector can be calculated. The output vector with the smallest Euclidean distance to the centroid vector is determined as the target output vector.

[0111] In this embodiment, the centroid vector of the vector set composed of each group of output vectors is determined so that the target output vector can be determined from each group of output vectors based on the centroid vector, and the benchmark data for assertion verification can be obtained.
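The selection of the target output vector can be sketched in Python as follows. The sketch makes one explicit assumption that the text does not confirm: it reads |C| as the number of output vectors in the set, so that the centroid is the component-wise mean. The text itself describes |C| as the value of a determinant, and a different denominator would change the centroid accordingly. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise sum of the vectors divided by |C|.

    Assumption of this sketch: |C| is taken as the number of vectors.
    """
    size = len(vectors)
    dimension = len(vectors[0])
    return [sum(v[i] for v in vectors) / size for i in range(dimension)]


def nearest_to_centroid(vectors: Sequence[Sequence[float]]) -> Sequence[float]:
    """Return the output vector with the smallest Euclidean distance to the centroid."""
    mu = centroid(vectors)
    return min(vectors, key=lambda v: math.dist(v, mu))


output_vectors = [[2, 1, 1], [2, 1, 1], [3, 2, 1]]
target = nearest_to_centroid(output_vectors)
```

For these three vectors the centroid is approximately (2.33, 1.33, 1), and `[2, 1, 1]` is the closest output vector. When several vectors are equally close, `min` returns the first one encountered.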

[0112] In an exemplary embodiment, before determining the target output vector closest to the centroid vector from each set of output vectors in step S130, the method further includes: normalizing each set of output vectors to obtain the normalized vectors corresponding to each set of output vectors;

[0113] Step S130 includes: determining the target normalized vector that is closest to the centroid vector from each group of normalized vectors;

[0114] In step S140, restoring the target output vector to its original field values includes: restoring the target normalized vector to its original field values.

[0115] Normalization is a method of transforming a dimensional expression into a dimensionless expression.

[0116] In the specific implementation, after converting each set of output data into an output vector using a digital dictionary, before determining the target output vector from each set of output vectors, it is necessary to normalize each set of output vectors to obtain the normalized vectors corresponding to each set of output vectors. From each set of normalized vectors, the target normalized vector closest to the centroid vector is determined, and the target normalized vector is restored to its original field value.

[0117] More specifically, the normalization process for each group of output vectors is as follows: For each group of output vectors, the difference between each transformed value in the group and the minimum transformed value in the group is obtained as the first difference; the difference between the maximum and minimum transformed values in the group is obtained as the second difference; the ratio of the first difference to the second difference is obtained as the normalized value of each transformed value; based on the normalized value of each transformed value, the normalized vector corresponding to each group of output vectors is obtained.
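The min-max normalization described in the preceding paragraph, together with the inverse step needed to restore a normalized vector, can be sketched as follows. The sketch makes two assumptions that the text does not state: a vector whose values are all equal is mapped to zeros to avoid division by zero, and restored values are rounded back to integers because dictionary codes are integers. The names `normalize` and `denormalize` are hypothetical.

```python
def normalize(vector):
    """Min-max normalize one output vector; also return (min, max) for later restoration."""
    lo, hi = min(vector), max(vector)
    span = hi - lo  # the "second difference" in the text
    if span == 0:
        # Assumption: a constant vector maps to all zeros.
        return [0.0] * len(vector), (lo, hi)
    # (v - lo) is the "first difference" for each transformed value.
    return [(v - lo) / span for v in vector], (lo, hi)


def denormalize(normalized, bounds):
    """Invert normalize(); rounding assumes the original values were integer codes."""
    lo, hi = bounds
    return [round(v * (hi - lo) + lo) for v in normalized]


normalized, bounds = normalize([1, 3, 2, 1])
restored = denormalize(normalized, bounds)
```

Keeping the minimum and maximum alongside each normalized vector is one simple way to make the restoration step possible; the source does not say how this bookkeeping is done.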

[0118] In this embodiment, before the target output vector is determined from multiple output vectors, the output vectors of each group are normalized, and the centroid vector of the normalized output vectors is determined. This reduces computational complexity and speeds up determination of the centroid vector, and therefore of the target output vector.

[0119] In an exemplary embodiment, after step S140, which restores the target output vector to its original field values and generates assertion information for test cases based on the original field values, the method further includes:

[0120] Step S150: Store the test cases and their assertion information in the database;

[0121] Step S160: When the test case is executed again, retrieve the assertion information of the test case from the database;

[0122] Step S170: When the actual output data of the test case does not match the assertion information, an error message is fed back to the test terminal, so that the test terminal can perform error analysis based on the error message.

[0123] In practice, after obtaining the test case and its assertion information, the test case and its assertion information can be stored in the database. When the test script of the test case is executed again through the interface, the assertion information of the test case can be retrieved from the database based on the identifier of the test case and compared with the actual output data returned by the interface. If they do not match, it means that the actual output data of this test is different from the data when it was successfully executed before, and the assertion fails. The content of the assertion failure is fed back to the test terminal as an error message, so that the test terminal can perform error analysis based on the error message.
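Steps S150 to S170 can be illustrated with the following sketch. It is only an assumption-laden example: an in-memory dictionary named `assertion_store` stands in for the database, the test case identifier is a hypothetical tuple of script class name, test method name, and case description, and only the fields present in the stored assertion are compared, so fields removed before assertion generation are ignored.

```python
assertion_store = {}  # hypothetical stand-in for the database


def store_assertion(case_id, expected):
    """Step S150: save the assertion information under the test case identifier."""
    assertion_store[case_id] = dict(expected)


def verify(case_id, actual):
    """Steps S160-S170: return mismatch messages; an empty list means the assertion passed."""
    expected = assertion_store[case_id]
    errors = []
    for field, want in expected.items():
        got = actual.get(field)
        if got != want:
            errors.append(f"{field}: expected {want!r}, got {got!r}")
    return errors


# Hypothetical identifier and field names, for illustration only.
case_id = ("TransferTest", "testTransfer", "normal transfer")
store_assertion(case_id, {"code": "0000", "channel": "online banking"})

passed = verify(case_id, {"code": "0000", "channel": "online banking", "time": "10:00:00"})
failed = verify(case_id, {"code": "9999", "channel": "online banking"})
```

In this sketch the list returned by `verify` plays the role of the error message that is fed back to the test terminal.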

[0124] In this embodiment, test cases and their assertion information are stored in a database so that when the test case is executed again in a later stage, the assertion information of the test case can be directly obtained from the database for assertion verification, and the test result can be determined based on the verification result.

[0125] In one embodiment, to facilitate understanding of the embodiments of this application by those skilled in the art, specific examples will be described below in conjunction with the accompanying drawings. It should be noted that this application mainly focuses on automatically asserting the interface return data or database registration status after each execution of an automated test script. The assertion information generation method of this application will be described in detail below using an automated test script for interface testing as an example. It is understood that this assertion information generation method can also be applied to the generation and verification of assertions for database tables.

[0126] Referring to Figure 4, a structural block diagram of an assertion information generation system is shown. The system mainly comprises three parts: an output data collection unit 410, an assertion data training unit 420, and an assertion verification unit 430, wherein:

[0127] The output data collection unit 410 is used to collect the interface return data (i.e., output data) of the test scripts of successfully executed test cases. When saving, it can be saved using the data structure shown in Table 1: script class name, @test method name, case description, interface return data (i.e. output data). The first three fields uniquely identify the test case of an automated script. The interface return data saves the data returned by the tested interface in JSON format.

[0128] Assertion data training unit 420 is used to train data on JSON fields saved during the execution of test cases for a single interface. After multiple executions of the interface test script, output data returned by many interfaces is saved in the database, forming an output dataset. For each field value in the output data, its occurrence probability is calculated. If the highest occurrence probability of a field value is less than a threshold (e.g., 50%), it indicates that the field value changes frequently and has little correlation with the test case, so no assertion is needed. Therefore, fields that do not need assertion can be removed from each field to obtain the target fields. The target fields in each set of output data are transformed using a numeric dictionary to obtain output vectors. The center of the vector set formed by each set of output vectors is determined as the centroid vector. The target output vector closest to the centroid vector is determined from each set of output vectors. Based on the restoration processing of the target output vector, the verification benchmark data corresponding to the test case is obtained, forming the assertion of the test case.

[0129] The assertion verification unit 430 is used to register the obtained assertions and test cases in the database. When the test case is executed subsequently, the assertion information of the test case is retrieved from the database and compared with the actual output data returned by the interface. If they do not match, it means that the actual output data of this test is different from the data when it was successfully executed before. In this case, the assertion fails and the content of the assertion failure is fed back to the test terminal as an error message, so that the test terminal can perform error analysis based on the error message.

[0130] Referring to Figure 5, the complete process of an assertion information generation method in an exemplary embodiment is shown, including the following steps:

[0131] Step S510: Obtain the set of output data for each successful execution of a test case. The output data collection unit 410 has already saved the output data returned by the interface each time the test case is successfully executed. Therefore, this step can query all the returned output data saved for the test case based on the script class name, the @test method name, and the case description, primarily obtaining JSON format data.

[0132] Step S520: Determine if there is a next set of output data in the output data set. If there is, proceed to step S530; otherwise, proceed to step S540.

[0133] Step S530: Count the number of times each field value appears. Specifically, the interface returns data stored in JSON format, which contains many fields. Here, we count the number of times each field value appears and record all occurrences of that field value.

[0134] Step S540: Calculate the highest probability of occurrence of each field's value. Specifically, based on the frequency of occurrence of each field's value, calculate the highest probability of occurrence for that field's value.

[0135] Step S550: Remove fields that do not need to be compared. Specifically, determine whether the highest probability of occurrence of each field's value is less than 50%. If it is less than 50%, it means that the field's value changes frequently and is not strongly related to the test case. It is advisable not to check it during assertion. For example, the value of the time in the interface return value can be left unchecked. For fields that do not need to be checked, remove the field from the JSON string.

[0136] Step S560: Perform dictionary conversion on the field values of each field to obtain the converted values of each field. Specifically, after removing fields that do not need to be compared, the field values of the remaining fields are represented by vectors. Each field may take several possible values. For example, the Inprivate.from field may take the values online banking, counter service, online, etc. All possible values of the field are listed, and each is represented by an entry in a numeric dictionary. For example, 1 represents online banking, 2 represents counter service, 3 represents online, etc. The number of dictionary entries equals the number of possible values. Finally, each set of output data becomes a vector of dictionary codes, such as [1,1,2,1].

[0137] Step S570: Normalize the converted values of each field. Specifically, the converted values of each set of output data constitute a set of output vectors. Normalize each set of output vectors. The main algorithm is to subtract the minimum converted value from each converted value in the output vector, and then divide by the difference between the maximum and minimum converted values, thereby converting the initial output vector into a normalized vector.

[0138] Step S580: Calculate the centroid. Specifically, each normalized vector corresponds to a point in a multidimensional space. For the set of vectors formed by the normalized vectors, there are many points in the multidimensional space. Calculate the centroid of the vector set.

[0139] Step S590: Determine the point closest to the centroid. Specifically, calculate the Euclidean distance from the centroid to the point corresponding to each normalized vector, and determine the point closest to the centroid; restore the normalized vector corresponding to this point to the original field values, which are the benchmark data for assertion verification of the test case.
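The dictionary conversion of step S560 and the restoration of a vector to its original field values can be sketched as follows. This is an illustrative example only: it assumes that each set of output data is a flat mapping from field name to a hashable value, and that codes are assigned from 1 in order of first appearance. The helper names and the `status` field are hypothetical; `Inprivate.from` and its values come from the example in step S560.

```python
def build_dictionaries(records, fields):
    """Map each distinct value of each field to an integer code starting at 1."""
    dictionaries = {}
    for field in fields:
        codes = {}
        for record in records:
            value = record[field]
            if value not in codes:
                codes[value] = len(codes) + 1
        dictionaries[field] = codes
    return dictionaries


def encode(record, fields, dictionaries):
    """Convert one set of output data into its output vector."""
    return [dictionaries[f][record[f]] for f in fields]


def decode(vector, fields, dictionaries):
    """Restore an output vector to its original field values."""
    restored = {}
    for f, code in zip(fields, vector):
        reverse = {c: v for v, c in dictionaries[f].items()}
        restored[f] = reverse[code]
    return restored


fields = ["Inprivate.from", "status"]
records = [
    {"Inprivate.from": "online banking", "status": "ok"},
    {"Inprivate.from": "counter service", "status": "ok"},
    {"Inprivate.from": "online banking", "status": "ok"},
]
dictionaries = build_dictionaries(records, fields)
vectors = [encode(r, fields, dictionaries) for r in records]
benchmark = decode(vectors[0], fields, dictionaries)
```

Since the target vector is chosen from the encoded vectors themselves, every code it contains has an entry in the reverse mapping, so decoding cannot fail on an unknown code.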

[0140] This embodiment obtains assertion comparison benchmark data based on big data analysis. When writing test scripts, it is not necessary to spend too much time writing assertion content. It is only necessary to call the corresponding assertion comparison interface, which helps to improve the efficiency of script writing and the sufficiency of assertions.

[0141] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0142] Based on the same inventive concept, this application also provides an assertion information generation apparatus for implementing the assertion information generation method described above. The solution provided by this apparatus is similar to the implementation described in the above method; therefore, the specific limitations in one or more embodiments of the assertion information generation apparatus provided below can be found in the limitations of the assertion information generation method described above, and will not be repeated here.

[0143] In one embodiment, as shown in Figure 6, an assertion information generation device is provided, including: an acquisition module 610, a conversion module 620, a determination module 630, and a generation module 640, wherein:

[0144] The acquisition module 610 is used to acquire the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0145] The conversion module 620 is used to convert the field values of each field in each set of output data through a numeric dictionary to obtain the output vector corresponding to each set of output data;

[0146] The determination module 630 is used to determine the target output vector that is closest to the centroid vector from each group of output vectors; the centroid vector is the center of the vector set formed by each group of output vectors;

[0147] The generation module 640 is used to restore the target output vector to the original field values and generate assertion information for test cases based on the original field values.

[0148] In one embodiment, the above apparatus further includes a filtering module for obtaining the number of times each field value of each field appears in the output dataset; and determining the target field that meets the preset conditions from multiple fields based on the number of occurrences.

[0149] The conversion module 620 is also used to convert the field values of the target field in each set of output data using a numeric dictionary.

[0150] In one embodiment, the filtering module is further configured to determine the probability of occurrence of each field value based on the number of occurrences for each field, and determine the target occurrence probability with the largest value from the occurrence probabilities of each field value; and remove fields whose target occurrence probability is less than a threshold from multiple fields to obtain the target field.
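The filtering described above can be sketched as follows. The example is a simplified assumption: each set of output data is a flat mapping with the same fields, the threshold defaults to the 50% figure given as an example elsewhere in this description, and the function name `select_target_fields` and the sample field names are hypothetical.

```python
from collections import Counter


def select_target_fields(records, threshold=0.5):
    """Keep fields whose most frequent value occurs in at least `threshold` of the records."""
    total = len(records)
    fields = list(records[0].keys()) if records else []
    targets = []
    for field in fields:
        counts = Counter(record.get(field) for record in records)
        # Target occurrence probability: the largest probability among the field's values.
        highest_probability = counts.most_common(1)[0][1] / total
        if highest_probability >= threshold:
            targets.append(field)
    return targets


records = [
    {"code": "0000", "channel": "online banking", "time": "10:00:01"},
    {"code": "0000", "channel": "online banking", "time": "10:00:02"},
    {"code": "0000", "channel": "counter service", "time": "10:00:03"},
    {"code": "0000", "channel": "online banking", "time": "10:00:04"},
]
target_fields = select_target_fields(records)
```

Here `time` is dropped because each of its values occurs in only 25% of the executions, while `code` (100%) and `channel` (75%) are kept as target fields.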

[0151] In one embodiment, the above-mentioned apparatus further includes a centroid determination module, which is used to combine the output vectors of each group into a determinant and obtain the value of the determinant; and obtain the ratio of the sum of the output vectors of each group to the value of the determinant as the centroid vector.

[0152] In one embodiment, the above-mentioned device further includes a normalization module, which is used to normalize each group of output vectors to obtain the normalized vectors corresponding to each group of output vectors.

[0153] The determination module 630 is also used to determine the target normalized vector that is closest to the centroid vector from each group of normalized vectors;

[0154] The generation module 640 is also used to restore the target normalized vector to the original field values.

[0155] In one embodiment, the above-mentioned apparatus further includes:

[0156] The storage module is used to store test cases and their assertion information in the database.

[0157] The assertion verification module is used to retrieve the assertion information of the test cases from the database when the test cases are executed again; when the actual output data of the test cases does not match the assertion information, it sends an error message to the test terminal, enabling the test terminal to perform error analysis based on the error message.

[0158] Each module in the aforementioned assertion information generation device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.

[0159] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 7. The computer device includes a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When executed by the processor, the computer program implements an assertion information generation method. The display screen can be an LCD screen or an e-ink screen. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad mounted on the computer device casing, or an external keyboard, touchpad, or mouse.

[0160] Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0161] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:

[0162] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0163] The field values of each field in each set of output data are transformed using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0164] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0165] The target output vector is restored to its original field values, and assertion information for test cases is generated based on these original field values.

[0166] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.

[0167] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor:

[0168] Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields;

[0169] The field values of each field in each set of output data are transformed using a numeric dictionary to obtain the output vector corresponding to each set of output data.

[0170] From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors.

[0171] The target output vector is restored to its original field values, and assertion information for test cases is generated based on these original field values.

[0172] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the steps in the above method embodiments.

[0173] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.

[0174] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0175] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0176] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A method for generating assertion information, characterized in that, The method includes: Obtain the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields; The field values of each field in each set of output data are transformed using a numeric dictionary to obtain the output vector corresponding to each set of output data. From each set of output vectors, determine the target output vector that is closest to the centroid vector; the centroid vector is the center of the vector set formed by each set of output vectors. The target output vector is restored to its original field values using the numeric dictionary. Assertion information for the test case is generated based on the original field values. The original field values serve as the baseline data for assertion verification of the test case.

2. The method according to claim 1, characterized in that, Before converting the field values of each field in each set of output data using a numeric dictionary, the process also includes: Obtain the number of times each field value appears in the output dataset; Based on the number of occurrences, a target field that meets the preset conditions is determined from the plurality of fields; The process of converting the field values of each field in each set of output data using a numeric dictionary includes: The value of the target field in each set of output data is transformed using the numeric dictionary.

3. The method according to claim 2, characterized in that, The step of determining the target field that meets the preset conditions from the plurality of fields based on the occurrence frequency includes: For each field, based on the number of occurrences, determine the probability of occurrence of each field value, and determine the target occurrence probability with the largest value from the probability of occurrence of each field value; The target field is obtained by removing fields from the plurality of fields whose probability of occurrence is less than a threshold.

4. The method according to claim 1, characterized in that, Before determining the target output vector closest to the centroid vector from each set of output vectors, the following steps are also included: Combine the output vectors of each group into a determinant and obtain the value of the determinant; The ratio of the sum of the output vectors of each group to the value of the determinant is obtained and used as the centroid vector.

5. The method according to claim 1, characterized in that, Before determining the target output vector closest to the centroid vector from each set of output vectors, the following steps are also included: Normalize the output vectors of each group to obtain the normalized vectors corresponding to each group of output vectors; The step of determining the target output vector closest to the centroid vector from each set of output vectors includes: From each group of normalized vectors, determine the target normalized vector that is closest to the centroid vector; The step of restoring the target output vector to its original field values includes: The target normalized vector is restored to the original field values.

6. The method according to claim 1, characterized in that, After restoring the target output vector to its original field values and generating the assertion information for the test case based on those original field values, the process further includes: Store the test cases and their assertion information in the database; When the test case is executed again, the assertion information of the test case is retrieved from the database; When the actual output data of the test case does not match the assertion information, an error message is fed back to the test terminal, enabling the test terminal to perform error analysis based on the error message.

7. An assertion information generation device, characterized in that, The device includes: The acquisition module is used to acquire the output dataset of the test case; the output dataset includes multiple sets of output data, each set of output data is the output data of the test case executed once, and each set of output data includes the field values of multiple fields; The conversion module is used to convert the field values of each field in each set of output data using a numeric dictionary, so as to obtain the output vector corresponding to each set of output data. The determination module is used to determine the target output vector that is closest to the centroid vector from each group of output vectors; the centroid vector is the center of the vector set formed by each group of output vectors. The generation module is used to restore the target output vector to its original field values using the numeric dictionary, and to generate assertion information for the test cases based on the original field values. The original field values are the baseline data for assertion verification of the test cases.

8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, When the processor executes the computer program, it implements the steps of the assertion information generation method according to any one of claims 1 to 6.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the assertion information generation method according to any one of claims 1 to 6.

10. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by a processor, it implements the steps of the assertion information generation method according to any one of claims 1 to 6.