Test case generation method and device, computer device, and storage medium
By acquiring database table information and utilizing schema document generation models and large language models, high-quality test cases that conform to business scenarios are generated, solving the problem of poor quality test case generation in Text-to-SQL systems and achieving precise control and efficient generation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGZHOU QUWAN NETWORK TECH CO LTD
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-12
AI Technical Summary
The existing Text-to-SQL system generates poor-quality performance test cases, which cannot meet testing requirements.
By acquiring data information from database tables and using a preset schema document generation model together with a large language model, a dimension matrix is constructed and test cases that conform to specific business scenarios are generated. This allows precise control over the SQL query complexity and the business analysis dimension of each test case, avoiding information loss and confusion.
High-quality test cases were generated, meeting the performance testing requirements of the Text-to-SQL system and improving the quality and efficiency of test case generation.
Figure CN122195841A_ABST
Description
Technical Field
[0001] This application relates to the fields of artificial intelligence and database technology, and in particular to a test case generation method, apparatus, computer device, and computer-readable storage medium.
Background Technology
[0002] Text-to-SQL systems can convert natural language queries into corresponding SQL queries, enabling business users lacking programming skills to directly retrieve data relevant to their data analysis needs from the database. Performance testing of Text-to-SQL systems requires a large number of high-quality test cases.
[0003] In related technologies, test cases mostly come from publicly available test case sets (such as Spider and WikiSQL) or are written manually. These test cases are of poor quality and cannot meet the performance testing requirements of Text-to-SQL systems.
Summary of the Invention
[0004] Therefore, it is necessary to provide a test case generation method, apparatus, computer equipment, and computer-readable storage medium that can improve the quality of test case generation in order to address the above-mentioned technical problems.
[0005] In a first aspect, this application provides a test case generation method, the method comprising:
[0006] Obtaining data information of a database table; the data information includes field information of each table field in the database table;
[0007] Obtaining a schema document from the data information of the database table through a preset schema document generation model; the schema document includes the table structure of the database table and a field description of each table field;
[0008] Constructing a dimension matrix, and obtaining a test case generation scheme through a large language model based on the schema document, a preset list of business scenarios, and the dimension matrix; the list of business scenarios includes multiple business scenarios, the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of test cases, and the test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario;
[0009] Obtaining, through the large language model, a list of target test cases corresponding to the database table based on the schema document and the test case generation scheme.
[0010] In one embodiment, obtaining the test case generation scheme through the large language model based on the schema document, the preset list of business scenarios, and the dimension matrix includes:
[0011] Obtaining a preset number of test cases to generate;
[0012] Generating a first prompt based on the schema document, the business scenario list, the dimension matrix, and the number of test cases to generate;
[0013] Inputting the first prompt into the large language model to obtain the test case generation scheme.
[0014] In one embodiment, after obtaining the test case generation scheme, the method further includes:
[0015] Obtaining the number of test cases allocated to each business scenario from the test case generation scheme;
[0016] Summing the numbers of test cases allocated to the business scenarios to obtain the total number of test cases;
[0017] If the total number of test cases is inconsistent with the preset number of test cases to generate, correcting the first prompt to obtain a corrected first prompt;
[0018] Inputting the corrected first prompt into the large language model to obtain a new test case generation scheme.
[0019] In one embodiment, obtaining, through the large language model, the list of target test cases corresponding to the database table based on the schema document and the test case generation scheme includes:
[0020] Generating a second prompt based on the schema document and the test case generation scheme;
[0021] Inputting the second prompt into the large language model to obtain the list of target test cases.
[0022] In one embodiment, the target test case list includes target test cases and tag information corresponding to each target test case; the tag information includes the business scenario, SQL query complexity dimension, and business analysis dimension of the target test case; after obtaining the target test case list, the method further includes:
[0023] Generating a target file based on the target test case list;
[0024] Receiving review results for the target file; the review results are obtained by evaluating the business scenario, SQL query complexity dimension, and business analysis dimension of the target test cases against the test case generation scheme;
[0025] Correcting the second prompt based on the review results to obtain a corrected second prompt;
[0026] Inputting the corrected second prompt into the large language model to obtain a new list of target test cases.
[0027] In one embodiment, the field information of each table field includes the field name, data type, field constraint information, enumeration values, and statistical characteristics of the table field; obtaining the schema document from the data information of the database table through the preset schema document generation model includes:
[0028] Obtaining preset task description information; the task description information specifies the document content of the schema document;
[0029] Generating a third prompt based on the field name, data type, field constraint information, enumeration values, and statistical characteristics of each table field and on the task description information;
[0030] Inputting the third prompt into the schema document generation model to obtain the schema document.
[0031] In one embodiment, after obtaining the schema document, the method further includes:
[0032] Verifying the table structure in the schema document and the field description of each table field to obtain a verification result;
[0033] If the verification result indicates errors in the table structure or in the field descriptions of the table fields, correcting the third prompt to obtain a corrected third prompt;
[0034] Inputting the corrected third prompt into the schema document generation model to obtain a new schema document.
[0035] In a second aspect, this application also provides a test case generation apparatus, the apparatus comprising:
[0036] A data information acquisition module, used to obtain data information of a database table; the data information includes field information of each table field in the database table;
[0037] A schema document acquisition module, used to obtain a schema document from the data information of the database table through a preset schema document generation model; the schema document includes the table structure of the database table and a field description of each table field;
[0038] A test case generation scheme acquisition module, used to construct a dimension matrix and to obtain a test case generation scheme through a large language model based on the schema document, a preset list of business scenarios, and the dimension matrix; the list of business scenarios includes multiple business scenarios, the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of test cases, and the test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario;
[0039] A target test case list acquisition module, used to obtain, through the large language model, a list of target test cases corresponding to the database table based on the schema document and the test case generation scheme.
[0040] In a third aspect, this application also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the method steps of the first aspect.
[0041] In a fourth aspect, this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method steps of the first aspect.
[0042] In the above test case generation method, apparatus, computer device, and computer-readable storage medium, data information of a database table is obtained, the data information including field information of each table field in the database table. A schema document is obtained from the data information through a preset schema document generation model; the schema document includes the table structure of the database table and a field description of each table field. A dimension matrix is constructed, and a test case generation scheme is obtained through a large language model based on the schema document, a preset list of business scenarios, and the dimension matrix. The list of business scenarios includes multiple business scenarios, the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of test cases, and the test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario. Finally, a list of target test cases corresponding to the database table is obtained through the large language model based on the schema document and the test case generation scheme. Because the schema document is produced from the data information of the database table by the preset schema document generation model, missing or incomplete information in the schema document is avoided. Because the test case generation scheme is derived from the schema document, the business scenario list, and the dimension matrix, and the target test case list is derived from the schema document and that scheme, multiple test cases that closely match specific business scenarios can be generated. The SQL query complexity dimension and the business analysis dimension of each test case can also be controlled precisely, which avoids confusing the two dimensions and improves the generation quality of the target test cases.
Description of the Drawings
[0043] To more clearly illustrate the technical solutions in the embodiments of this application or related technologies, the drawings used in the description of the embodiments of this application or related technologies will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0044] Figure 1 is a diagram of the application environment of a test case generation method in one embodiment;
[0045] Figure 2 is a flowchart of a test case generation method in one embodiment;
[0046] Figure 3 is a flowchart of the process of obtaining a test case generation scheme in one embodiment;
[0047] Figure 4 is a structural block diagram of a test case generation apparatus in one embodiment;
[0048] Figure 5 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0049] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0050] It should be noted that the terms "first," "second," etc., used in this application can be used to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish the first element from the second element. The terms "comprising" and "having," and any variations thereof, used in this application, are intended to cover non-exclusive inclusion. The term "multiple" used in this application refers to two or more. The term "and / or" used in this application refers to one of the embodiments, or any combination of multiple embodiments.
[0051] The test case generation method provided in the embodiments of this application can be applied in the application environment shown in Figure 1, in which terminal 102 communicates with server 104 via a network. A data storage system can store the data that server 104 needs to process; it can be integrated into server 104 or placed on a cloud or other network server. Terminal 102 obtains data information of a database table, the data information including field information of each table field in the database table. Based on the data information of the database table, a schema document is obtained through a preset schema document generation model; the schema document includes the table structure of the database table and a field description of each table field. A dimension matrix is constructed, and a test case generation scheme is obtained through a large language model based on the schema document, a preset list of business scenarios, and the dimension matrix. The list of business scenarios includes multiple business scenarios, the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of test cases, and the test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario. Based on the schema document and the test case generation scheme, a list of target test cases corresponding to the database table is obtained through the large language model. Terminal 102 can be, but is not limited to, a personal computer, laptop, smartphone, tablet, drone, low-altitude aircraft, IoT device, or portable wearable device. IoT devices can include smart speakers, smart TVs, smart air conditioners, smart in-vehicle devices, and projection equipment. Portable wearable devices can include smartwatches, smart bracelets, and head-mounted displays. Head-mounted displays can be virtual reality (VR) devices, augmented reality (AR) devices, or smart glasses. Server 104 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud computing services.
[0052] In one embodiment, as shown in Figure 2, a test case generation method is provided. Taking the application of this method to terminal 102 in Figure 1 as an example, the method includes the following steps:
[0053] Step S210: Obtain data information from the database tables; the data information includes field information for each table field in the data tables.
[0054] Database tables are the most basic building blocks in a database, used to organize and store data. A database table consists of rows and columns, with each column containing specific types of data information; a column is a field. A database can contain one or more database tables.
[0055] The field information includes, but is not limited to, field name, field meaning, field data type, field data coverage, field constraint information, enumeration values, and statistical characteristics. Field constraint information refers to whether the field is a primary key of the database table, and enumeration values refer to the value range of the category field (such as status, type). Statistical characteristics include, but are not limited to, maximum value, minimum value, non-null rate, and time range.
[0056] In this embodiment, the database table can be obtained from a database or from a CSV (Comma-Separated Values) file, and a data processing and analysis tool is then used to process and analyze the data in the table to obtain the data information. A CSV file is a plain text file in which fields are separated by commas (or other delimiters) and each line represents one data record. In Text-to-SQL testing, CSV files are typically used to store the data of simulated database tables, such as order information or user information. The data processing and analysis tool can be, for example, Pandas.
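The field profiling described above might be sketched as follows. This is a minimal illustration, not the application's implementation: it uses Python's standard `csv` module as a stand-in for Pandas so that it runs without dependencies, and the sample data, the `profile_table` helper, the `is_unique` flag, and the `enum_threshold` heuristic are all assumptions introduced here.

```python
import csv
import io

# Hypothetical sample standing in for a CSV export of a database table.
SAMPLE_CSV = """txn_id,channel,amount,txn_date
1,Direct Sales,10.50,2023-01-01
2,Channel A,250.00,2023-03-15
3,Channel A,,2023-07-04
4,E-commerce,9999.80,2023-12-31
"""


def _as_numbers(values):
    """Return the values as floats, or None if any value is not numeric."""
    try:
        return [float(v) for v in values]
    except ValueError:
        return None


def profile_table(csv_text, enum_threshold=3):
    """Collect simple per-field statistics; the heuristics are illustrative."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    profile = {}
    for name in rows[0]:
        # Empty strings are treated as nulls.
        values = [row[name] for row in rows if row[name] != ""]
        numbers = _as_numbers(values)
        comparable = numbers if numbers is not None else values
        distinct = sorted(set(values))
        # Assumption: a non-numeric field with few distinct values is a category field.
        is_category = numbers is None and len(distinct) <= enum_threshold
        profile[name] = {
            "non_null_rate": len(values) / len(rows),
            # Fully populated with no repeats: a primary key candidate.
            "is_unique": len(distinct) == len(rows),
            "min": min(comparable),
            "max": max(comparable),
            "enum_values": distinct if is_category else None,
        }
    return profile


profile = profile_table(SAMPLE_CSV)
for field, stats in profile.items():
    print(field, stats)
```

ISO-formatted dates compare correctly as strings, which is why the sketch can report a time range for `txn_date` without parsing it; other date formats would need explicit parsing.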
[0057] Step S220: Obtain a schema document from the data information of the database table through a preset schema document generation model; the schema document includes the table structure of the database table and the field description of each table field.
[0058] The preset schema document generation model includes, but is not limited to, a large language model, a machine learning model, or a knowledge graph.
[0059] The schema document, also known as the database schema document, is a metadata file that describes the database structure. It is typically in the form of JSON, YAML, or SQL script. It defines key information such as table names, field names, data types, primary keys, and foreign keys in the database, providing the Text-to-SQL system with the context to understand the database structure.
[0060] The table structure of a database table includes, but is not limited to, field names, field data types, field constraint information, and the business meaning of the fields.
[0061] The field descriptions include, but are not limited to, descriptions of key enumeration values, field definitions (such as the unit of the field, time definition, etc.), and data coverage.
[0062] In this embodiment, the data information of the database table can be input into a large language model to obtain the schema document. Alternatively, the data information can be input into a machine learning model to obtain the schema document; the machine learning model can be a trained schema inference model (such as a BERT-based field semantic classification model). A domain knowledge graph can also be used to infer field semantics and table relationships from the data information of the database table to obtain the schema document.
[0063] Step S230: Construct a dimension matrix. Based on the schema document, the preset business scenario list, and the dimension matrix, a test case generation scheme is obtained using the large language model. The business scenario list includes multiple business scenarios, and the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of the test cases. The test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario.
[0064] The SQL query complexity dimensions of the test cases include, but are not limited to, single-table no aggregation (simple SELECT + WHERE), single-table light aggregation (SUM / COUNT / AVG + GROUP BY), multi-table JOIN (2-3 table join queries), window functions / subqueries / CTEs (advanced SQL techniques), and cross-source / recursive / complex calculations (the highest level of SQL capability).
[0065] The business analysis dimensions of test cases include, but are not limited to, description (querying raw data or simple statistics), comparison (comparing different entities / time periods), attribution (exploring causes and analyzing influencing factors), prediction (inferring future trends based on historical data), and strategy (proposing actionable business suggestions).
[0066] The pre-defined list of business scenarios includes several specific business scenarios, including but not limited to sales analysis, channel analysis, and user retention analysis scenarios.
[0067] In this embodiment, SQL query complexity and business analysis dimensions are extracted from multiple sample test cases. These dimensions are then combined to obtain a dimension matrix. Specifically, a dimension matrix QnIm is established with the SQL query complexity dimension as the Q-axis and the business analysis dimension as the I-axis. Here, Q0 represents a single table with no aggregation, Q1 represents a single table with light aggregation, Q2 represents a multi-table JOIN, Q3 represents window functions / subqueries / CTEs, and Q4 represents cross-source / recursive / complex computations. I0 represents description, I1 represents comparison, I2 represents attribution, I3 represents prediction, and I4 represents strategy. The dimension matrix QnIm includes 25 different dimension combinations.
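As a minimal sketch of the QnIm matrix just described, the 25 combinations can be enumerated as the Cartesian product of the two axes. The level labels come from the text above; the dictionary layout and the `build_dimension_matrix` name are assumptions made for illustration.

```python
from itertools import product

# Levels taken from the description; the data layout is a hypothetical choice.
Q_LEVELS = {
    "Q0": "single table, no aggregation",
    "Q1": "single table, light aggregation",
    "Q2": "multi-table JOIN",
    "Q3": "window functions / subqueries / CTEs",
    "Q4": "cross-source / recursive / complex calculations",
}
I_LEVELS = {
    "I0": "description",
    "I1": "comparison",
    "I2": "attribution",
    "I3": "prediction",
    "I4": "strategy",
}


def build_dimension_matrix():
    """Map each combination key such as 'Q1I1' to its two dimension labels."""
    return {
        f"{q}{i}": {"q_level": q, "q_desc": Q_LEVELS[q],
                    "i_level": i, "i_desc": I_LEVELS[i]}
        for q, i in product(Q_LEVELS, I_LEVELS)
    }


matrix = build_dimension_matrix()
print(len(matrix), "combinations, e.g.", matrix["Q2I1"])
```

Keeping the two axes as separate keys in each cell makes it straightforward to tag a generated test case with its Q level and I level independently.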
[0068] The business domain is divided into multiple specific business scenarios to obtain the preset list of business scenarios. Based on the schema document and the preset list of business scenarios, the large language model uses the dimension matrix to produce the test case generation scheme. This scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario. For example, 10 Q1I1 test cases and 5 Q2I2 test cases are assigned to the sales analysis scenario. A Q1I1 test case is a test case whose SQL query complexity dimension is Q1 and whose business analysis dimension is I1.
[0069] Step S240: Obtain, through the large language model, the list of target test cases corresponding to the database table based on the schema document and the test case generation scheme.
[0070] The target test case list includes multiple target test cases and tag information for each target test case. The tag information includes the business scenario, SQL query complexity, and business analysis dimensions of the target test case.
[0071] In this embodiment of the application, the schema document and the test case generation scheme are input into the large language model to obtain the list of target test cases corresponding to the database table.
[0072] With the above test case generation method, data information of a database table is obtained, the data information including field information of each table field in the database table. A schema document is obtained from the data information through a preset schema document generation model; the schema document includes the table structure of the database table and a field description of each table field. A dimension matrix is constructed, and a test case generation scheme is obtained through a large language model based on the schema document, a preset list of business scenarios, and the dimension matrix. The list of business scenarios includes multiple business scenarios, the dimension matrix includes combinations of the SQL query complexity dimension and the business analysis dimension of test cases, and the test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario. Finally, a list of target test cases corresponding to the database table is obtained through the large language model based on the schema document and the test case generation scheme. Because the schema document is produced from the data information of the database table by the preset schema document generation model, missing or incomplete information in the schema document is avoided. Because the test case generation scheme is derived from the schema document, the business scenario list, and the dimension matrix, and the target test case list is derived from the schema document and that scheme, multiple test cases that closely match specific business scenarios can be generated. The SQL query complexity dimension and the business analysis dimension of each test case can also be controlled precisely, which avoids confusing the two dimensions and improves the generation quality of the target test cases.
[0073] In one embodiment, the field information of each table field includes the field name, data type, field constraint information, enumeration values, and statistical characteristics of the table field; obtaining the schema document from the data information of the database table through the preset schema document generation model includes:
[0074] Step S221: Obtain preset task description information; the task description information specifies the document content of the schema document.
[0075] The preset task description information indicates the specific format of the schema document that the schema document generation model should output.
[0076] Step S222: Generate a third prompt based on the field name, data type, field constraint information, enumeration values, and statistical characteristics of each table field and on the task description information.
[0077] In this embodiment, the field name, data type, field constraint information, enumeration values, and statistical characteristics of each table field, together with the task description information, are filled into a preset third prompt template to generate the third prompt. For example, the content of the third prompt is as follows:
[0078] Table name: sales_txn (Sales Transaction Table)
[0079] Field list:
[0080] - txn_id: INT, primary key candidate (100% unique), range 1-10000
[0081] - product_id: INT, enumeration values [101, 102, 103, 201, 202]
[0082] - channel: VARCHAR, enumeration values ["Direct Sales", "Channel A", "Channel B", "E-commerce"]
[0083] - amount: DECIMAL, range 10.5-9999.8
[0084] - txn_date: DATETIME, time range: 2023-01-01 to 2023-12-31
[0085] Please generate a Markdown-formatted schema document, including:
[0086] 1. Table structure (field names, types, constraints, business meanings)
[0087] 2. Explanation of key enumeration values
[0088] 3. Field definition (units, time period, etc.)
[0089] 4. Data Coverage
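The template filling of step S222 might look like the sketch below. The source only states that a preset third prompt template exists; the template text, the `format_field` helper, and the field dictionary keys used here are hypothetical, chosen so that the output resembles the example prompt above.

```python
import json

# Hypothetical template; the source does not disclose the actual template text.
THIRD_PROMPT_TEMPLATE = (
    "Table name: {table_name}\nField list:\n{field_lines}\n{task_description}"
)

TASK_DESCRIPTION = (
    "Please generate a Markdown-formatted schema document, including:\n"
    "1. Table structure (field names, types, constraints, business meanings)\n"
    "2. Explanation of key enumeration values\n"
    "3. Field definition (units, time period, etc.)\n"
    "4. Data Coverage"
)


def format_field(field):
    """Render one field as a bullet line: type, constraint, enum values, range."""
    parts = [field["type"]]
    if field.get("constraint"):
        parts.append(field["constraint"])
    if field.get("enum_values"):
        rendered = json.dumps(field["enum_values"], ensure_ascii=False)
        parts.append("enumeration values " + rendered)
    if field.get("range"):
        low, high = field["range"]
        parts.append(f"range {low}-{high}")
    return "- " + field["name"] + ": " + ", ".join(parts)


def build_third_prompt(table_name, fields, task_description):
    field_lines = "\n".join(format_field(f) for f in fields)
    return THIRD_PROMPT_TEMPLATE.format(
        table_name=table_name,
        field_lines=field_lines,
        task_description=task_description,
    )


fields = [
    {"name": "txn_id", "type": "INT",
     "constraint": "primary key candidate (100% unique)", "range": (1, 10000)},
    {"name": "product_id", "type": "INT",
     "enum_values": [101, 102, 103, 201, 202]},
    {"name": "channel", "type": "VARCHAR",
     "enum_values": ["Direct Sales", "Channel A", "Channel B", "E-commerce"]},
    {"name": "amount", "type": "DECIMAL", "range": (10.5, 9999.8)},
]
third_prompt = build_third_prompt(
    "sales_txn (Sales Transaction Table)", fields, TASK_DESCRIPTION
)
print(third_prompt)
```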
[0090] Step S223: Input the third prompt into the schema document generation model to obtain the schema document.
[0091] In this embodiment of the application, the content of the schema document is as follows:
[0092] Markdown
[0093] ## Table: sales_txn (Sales Transaction Table)
[0094] ### Table Structure
[0095] | Field Name | Data Type | Constraint | Business Meaning |
[0096] |--------|----------|------|----------|
[0097] | txn_id | INT | PRIMARY KEY | Transaction serial number, unique identifier |
[0098] | product_id | INT | NOT NULL | Product ID, foreign key associated with the product table |
[0099] | channel | VARCHAR(50) | NOT NULL | Sales Channel |
[0100] | amount | DECIMAL(10,2) | NOT NULL | Transaction amount (unit: yuan) |
[0101] | txn_date | DATETIME | NOT NULL | Transaction time |
[0102] ### Enumeration Value Description
[0103] - **channel (sales channel)**:
[0104]   - "Direct Sales": Company-owned stores
[0105]   - "Channel A": Partner Channel A
[0106]   - "Channel B": Partner Channel B
[0107]   - "E-commerce": Sales on e-commerce platforms
[0108] ### Field Definitions
[0109] - **amount**: Unit is Chinese Yuan, rounded to two decimal places.
[0110] - **txn_date**: Transaction time, accurate to the second.
[0111] ### Data Coverage
[0112] - Time range: The entire year of 2023 (January 1, 2023 to December 31, 2023)
[0113] - Data volume: Approximately 10,000 transaction records
[0114] This embodiment uses the preset schema document generation model to output schema documents automatically and quickly, eliminating the need to write them manually and improving the efficiency of schema document generation.
[0115] In one embodiment, after obtaining the schema document, the method further includes:
[0116] Step S224: Verify the table structure of the schema document and the field descriptions of each table field to obtain the verification results.
[0117] In this embodiment of the application, the field names, data types, constraint information, and business meanings in the table structure of the schema document are verified, as are the key enumeration values, field definitions (such as units and time definitions), and data coverage in the field descriptions, to obtain the verification results.
[0118] Step S225: If the verification results indicate errors in the table structure or in the field descriptions of the table fields, the third prompt is corrected to obtain a corrected third prompt.
[0119] In this embodiment of the application, if field descriptions are missing, enumeration values are wrong, or descriptions of relationships between tables are missing, new task description information is added to the third prompt to obtain the corrected third prompt.
[0120] Step S226: Input the corrected third prompt into the schema document generation model to obtain a new schema document.
[0121] In this embodiment of the application, the corrected third prompt is input into the schema document generation model so that the model outputs the schema document again, ensuring that the information in the schema document is correct and complete.
[0122] By verifying the schema document, this embodiment avoids missing information in the schema document, thereby improving the quality of subsequent target test case generation.
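One narrow piece of the verification in steps S224 to S226 can be sketched as a check that every expected field appears in the Markdown table of the schema document, followed by appending a corrective instruction to the third prompt. The source does not specify how verification is performed (it could equally be manual), so the parsing rules, function names, and wording of the appended instruction below are assumptions.

```python
# Hypothetical model output that omits the "amount" field.
SCHEMA_DOC = """## Table: sales_txn
| Field Name | Data Type | Constraint | Business Meaning |
|--------|----------|------|----------|
| txn_id | INT | PRIMARY KEY | Transaction serial number |
| channel | VARCHAR(50) | NOT NULL | Sales channel |
"""


def fields_in_schema_doc(markdown):
    """Extract the first cell of every data row in the Markdown tables."""
    names = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            continue
        first = stripped.strip("|").split("|")[0].strip()
        # Skip the header row and the |----| separator row.
        if first == "Field Name" or set(first) <= set("-: "):
            continue
        names.append(first)
    return names


def find_missing_fields(markdown, expected_fields):
    found = set(fields_in_schema_doc(markdown))
    return sorted(set(expected_fields) - found)


def correct_prompt(prompt, missing):
    """Append new task description text when the check reports omissions."""
    if not missing:
        return prompt
    return (prompt + "\nThe previous schema document omitted these fields; "
            "describe every one of them: " + ", ".join(missing))


missing = find_missing_fields(SCHEMA_DOC, ["txn_id", "channel", "amount"])
corrected = correct_prompt("<third prompt>", missing)
print(missing)
print(corrected)
```

Checks for wrong enumeration values or missing table relationships would follow the same pattern: compare the document against the profiled data, then describe each discrepancy in the appended instruction.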
[0123] In one embodiment, as shown in Figure 3, obtaining the test case generation scheme through the large language model based on the schema document, the preset list of business scenarios, and the dimension matrix includes:
[0124] Step S310: Obtain the preset number of test cases to generate.
[0125] The preset number of test cases to generate is the total number of test cases to be produced and can be set according to actual needs.
[0126] Step S320: Generate a first prompt based on the schema document, the business scenario list, the dimension matrix, and the number of test cases to generate;
[0127] Step S330: Input the first prompt into the large language model to obtain the test case generation scheme.
[0128] In this embodiment, the schema document, business scenario list, dimension matrix, and number of test cases to generate are concatenated according to a preset first prompt template to obtain the first prompt. The first prompt is then input into the large language model to obtain the test case generation scheme. An exemplary test case generation scheme is as follows:
[0129] JSON
[0130] {
[0131] "plan": [
[0132] {
[0133] "business_scenario": "Sales Analysis",
[0134] "q_level": "Q2",
[0135] "i_level": "I1",
[0136] "test_point": "aggregate.multi_table_join.clear",
[0137] "count": 10
[0138] },
[0139] {
[0140] "business_scenario": "Channel Analysis",
[0141] "q_level": "Q1",
[0142] "i_level": "I2",
[0143] "test_point": "group_by.single_table.colloquial",
[0144] "count": 8
[0145] }, ... ]
[0148] }
[0149] This test case generation scheme assigns 10 Q2I1 test cases to the sales analysis scenario and 8 Q1I2 test cases to the channel analysis scenario.
[0150] This application embodiment uses a large language model to automatically and quickly generate test case generation schemes. The test case generation scheme can allocate multiple test cases for various business scenarios, which facilitates the subsequent large-scale generation of test cases and improves the efficiency of test case generation.
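Steps S310 to S330 can be sketched in Python as follows. This is only an illustration: the template wording, the function name `build_first_prompt`, and the sample pattern document are assumptions, since the application states only that a preset first prompt word template is used for concatenation.

```python
import json

# Hypothetical template; the application only says that a preset
# "first prompt word template" exists, not what it contains.
FIRST_PROMPT_TEMPLATE = (
    "You are planning test cases for a Text-to-SQL system.\n"
    "Pattern document:\n{pattern_document}\n\n"
    "Business scenarios:\n{scenarios}\n\n"
    "Dimension matrix (q_level, i_level):\n{matrix}\n\n"
    'Return JSON of the form {{"plan": [...]}} whose "count" values '
    "sum to exactly {total}."
)


def build_first_prompt(pattern_document, scenarios, dimension_matrix, total):
    """Concatenate the four inputs of step S320 into the first prompt word."""
    return FIRST_PROMPT_TEMPLATE.format(
        pattern_document=pattern_document,
        scenarios="\n".join(f"- {name}" for name in scenarios),
        matrix=json.dumps(dimension_matrix),
        total=total,
    )


# Illustrative inputs; the table and its columns are made up for the example.
prompt = build_first_prompt(
    pattern_document="table orders(order_id, channel, amount, order_date)",
    scenarios=["Sales Analysis", "channel analysis"],
    dimension_matrix=[["Q1", "I1"], ["Q1", "I2"], ["Q2", "I1"], ["Q2", "I2"]],
    total=18,
)
```

The resulting string would then be passed to whichever large language model client is in use (step S330); that call is omitted here because the application does not name a specific model or API.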
[0151] In one embodiment, after obtaining the test case generation scheme, the method further includes:
[0152] Step S340: Obtain the number of test cases allocated to each business scenario from the test case generation scheme;
[0153] Step S350: Calculate the total number of test cases for each business scenario to obtain the total number of test cases;
[0154] Step S360: If the total number of test cases is inconsistent with the preset number of test cases generated, the first prompt word is corrected to obtain the corrected first prompt word.
[0155] Step S370: Input the corrected first prompt word into the large language model to obtain a new test case generation scheme.
[0156] In this embodiment, the total number of test cases is obtained by summing the number of test cases allocated to all business scenarios in the test case generation scheme, and this total is compared with the preset number of test cases to be generated. If the two match, the test case generation scheme output by the large language model is correct. If they do not match, the scheme is incorrect: new task description information is added to the first prompt word, instructing the large language model to re-output the test case generation scheme so that the total number of test cases matches the preset number.
[0157] This application embodiment ensures the correctness of the test case generation scheme and improves the quality of subsequent target test case generation by statistically analyzing the total number of test cases in the test case generation scheme.
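The consistency check of steps S340 to S370 might look like the sketch below. The correction wording, the `call_llm` callable, and the `max_rounds` limit are assumptions added for illustration; the application describes only summing the counts, comparing them with the preset number, and re-prompting on a mismatch.

```python
import json


def total_planned(scheme):
    """Steps S340/S350: sum the "count" of every entry in the plan."""
    return sum(item["count"] for item in scheme["plan"])


def correct_first_prompt(first_prompt, scheme, expected_total):
    """Step S360: return None if the totals match, else a corrected prompt."""
    actual = total_planned(scheme)
    if actual == expected_total:
        return None
    return (
        f"{first_prompt}\n\nThe previous plan allocated {actual} test cases, "
        f"but exactly {expected_total} are required. Re-output the plan so "
        f'that the "count" values sum to {expected_total}.'
    )


def plan_with_retry(call_llm, first_prompt, expected_total, max_rounds=3):
    """Step S370 in a loop. call_llm is any function: prompt -> JSON string."""
    prompt = first_prompt
    for _ in range(max_rounds):  # max_rounds is an assumed safety limit
        scheme = json.loads(call_llm(prompt))
        corrected = correct_first_prompt(first_prompt, scheme, expected_total)
        if corrected is None:
            return scheme
        prompt = corrected
    raise ValueError("plan total still inconsistent after retries")


scheme = {
    "plan": [
        {"business_scenario": "Sales Analysis", "q_level": "Q2",
         "i_level": "I1", "count": 10},
        {"business_scenario": "channel analysis", "q_level": "Q1",
         "i_level": "I2", "count": 8},
    ]
}
```

With the exemplary scheme above, `total_planned(scheme)` is 10 + 8 = 18, so a preset number of 18 passes the check and any other preset number triggers a corrected first prompt word.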
[0158] In one embodiment, obtaining, through the large language model, the list of target test cases corresponding to the database table based on the pattern document and the test case generation scheme includes:
[0159] Step S241: Generate the second prompt word based on the pattern document and the test case generation scheme;
[0160] Step S242: Input the second prompt word into the large language model to obtain a list of target test cases corresponding to the database table.
[0161] In this embodiment, the pattern document and test case generation scheme are concatenated according to a preset second prompt word template to obtain a second prompt word. The second prompt word is then input into a large language model to obtain a target test case list. For example, the target test case list is as follows:
[0162] JSON
{
  "test_cases": [
    {
      "question": "Compare the total sales of direct sales and channel A last month?",
      "tags": {
        "business_scenario": "Sales Analysis",
        "q_level": "Q2",
        "i_level": "I1",
        "test_point": "aggregate.multi_table_join.clear",
        "query_pattern": "Multi-table JOIN + Aggregation",
        "interaction_style": "Explicit instruction"
      }
    }, ...
  ]
}
[0178] Here, "question" represents the generated target test case, and "tags" represents the tag information of the target test case.
[0179] This application embodiment uses a large language model to automatically and quickly generate a list of target test cases, enabling large-scale generation of target test cases and meeting the performance testing requirements of Text-To-SQL systems.
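Once the large language model returns the target test case list, the reply can be parsed and tallied per business scenario and dimension combination. The sketch below assumes the reply is a JSON string with the "test_cases", "question", and "tags" fields of the example above; the function names and the set of tags treated as required are illustrative choices, not part of the application.

```python
import json
from collections import Counter

# Tags treated as mandatory in this sketch (an assumption).
REQUIRED_TAGS = ("business_scenario", "q_level", "i_level", "test_point")


def parse_target_test_cases(reply):
    """Parse the model reply and check each case has a question and tags."""
    cases = json.loads(reply)["test_cases"]
    for case in cases:
        missing = [key for key in ("question", "tags") if key not in case]
        missing += [t for t in REQUIRED_TAGS if t not in case.get("tags", {})]
        if missing:
            raise ValueError(f"test case is missing {missing}: {case}")
    return cases


def tally_by_cell(cases):
    """Count cases per (business scenario, q_level, i_level) combination."""
    return Counter(
        (c["tags"]["business_scenario"], c["tags"]["q_level"], c["tags"]["i_level"])
        for c in cases
    )


reply = json.dumps({
    "test_cases": [{
        "question": "Compare the total sales of direct sales and channel A last month?",
        "tags": {
            "business_scenario": "Sales Analysis",
            "q_level": "Q2",
            "i_level": "I1",
            "test_point": "aggregate.multi_table_join.clear",
        },
    }]
})
cases = parse_target_test_cases(reply)
counts = tally_by_cell(cases)
```

The tally can be compared against the "count" values of the test case generation scheme to see which combinations are under- or over-represented.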
[0180] In one embodiment, the target test case list includes target test cases and corresponding tag information for each target test case; the tag information includes the business scenario, SQL query complexity dimension, and business analysis dimension of the target test cases; after obtaining the target test case list, the method further includes:
[0181] Step S250: Generate the target file based on the target test case list.
[0182] In this embodiment of the application, the openpyxl library can be used to convert the target test case list into a structured Excel file (target file).
[0183] Step S260: Receive the review results for the target file; the review results are obtained by evaluating the business scenario, SQL query complexity dimension, and business analysis dimension of the target test cases based on the test case generation scheme.
[0184] In this embodiment, the business scenario of the target test case can be manually evaluated to determine whether it conforms to the business scenario in the test case generation scheme, whether the SQL query complexity dimension and business analysis dimension of the target test case conform to the SQL query complexity dimension and business analysis dimension specified in the test case generation scheme, and whether the target test case depends on fields or enumeration values that do not exist in the schema document. The manual review results are then obtained. These results are received and added to the corresponding review fields in the target file.
[0185] Step S270: Based on the review results, revise the second prompt word to obtain the revised second prompt word;
[0186] Step S280: Input the corrected second prompt word into the large language model to obtain a new list of target test cases.
[0187] In this embodiment of the application, when the review results indicate that the business scenario of a target test case does not conform to the business scenario in the test case generation scheme, or that its SQL query complexity dimension or business analysis dimension does not conform to the corresponding dimension specified in the test case generation scheme, a new task description is added to the second prompt word. The new task description instructs the large language model to regenerate the list of target test cases so that the business scenario, SQL query complexity dimension, and business analysis dimension of every target test case conform to the test case generation scheme.
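Steps S270 and S280 can be illustrated with a small sketch that turns review results into an added task description. The `Review` record layout and the wording of the added instruction are assumptions; the application states only that a new task description is appended to the second prompt word when the review finds non-conforming test cases.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One reviewer verdict per target test case (hypothetical layout)."""
    question: str
    scenario_ok: bool
    q_level_ok: bool
    i_level_ok: bool


def revise_second_prompt(second_prompt, reviews):
    """Append a task description listing every non-conforming test case."""
    problems = []
    for review in reviews:
        checks = (
            ("business scenario", review.scenario_ok),
            ("SQL query complexity dimension", review.q_level_ok),
            ("business analysis dimension", review.i_level_ok),
        )
        failed = [name for name, ok in checks if not ok]
        if failed:
            problems.append(
                f"- {review.question!r}: does not match the planned "
                + ", ".join(failed)
            )
    if not problems:
        return second_prompt  # nothing to correct
    return (
        second_prompt
        + "\n\nReviewers rejected these test cases:\n"
        + "\n".join(problems)
        + "\nRegenerate the list so that every test case conforms to the "
        "test case generation scheme."
    )
```

The returned string plays the role of the revised second prompt word, which is then input into the large language model again.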
[0188] By exporting the target test case list to Excel and conducting manual review, this application embodiment can optimize the target test case list and improve the quality of target test case generation.
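A minimal sketch of the Excel export with openpyxl is given below. The column layout, the sheet title, and the empty "review_result" column are assumptions; the application says only that openpyxl can be used to produce a structured Excel file with review fields. openpyxl is a third-party package, so it is imported inside the writer function and the row flattening remains usable without it.

```python
HEADER = ["question", "business_scenario", "q_level", "i_level",
          "test_point", "review_result"]


def to_rows(test_cases):
    """Flatten each test case and its tags into one spreadsheet row."""
    rows = [HEADER]
    for case in test_cases:
        tags = case["tags"]
        rows.append([
            case["question"],
            tags.get("business_scenario", ""),
            tags.get("q_level", ""),
            tags.get("i_level", ""),
            tags.get("test_point", ""),
            "",  # left blank for the manual reviewer to fill in
        ])
    return rows


def write_xlsx(test_cases, path):
    """Write the rows to an .xlsx file (requires: pip install openpyxl)."""
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    sheet.title = "test_cases"
    for row in to_rows(test_cases):
        sheet.append(row)
    workbook.save(path)


cases = [{
    "question": "Compare the total sales of direct sales and channel A last month?",
    "tags": {"business_scenario": "Sales Analysis", "q_level": "Q2",
             "i_level": "I1", "test_point": "aggregate.multi_table_join.clear"},
}]
```

Calling `write_xlsx(cases, "target_test_cases.xlsx")` would produce a one-sheet workbook with a header row and one row per target test case; the file name is only an example.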
[0189] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages in other steps. It is understood that the steps in different embodiments can be freely combined as needed, and all non-contradictory solutions formed by such combinations are within the scope of protection of this application.
[0190] Based on the same inventive concept, this application also provides a test case generation apparatus for implementing the test case generation method described above. The solution provided by this apparatus is similar to the implementation scheme described in the above method; therefore, the specific limitations in one or more test case generation apparatus embodiments provided below can be found in the limitations of the test case generation method described above, and will not be repeated here.
[0191] In one exemplary embodiment, as shown in Figure 4, a test case generation device is provided, the device comprising:
[0192] The data information acquisition module 410 is used to acquire data information from database tables; the data information includes field information of each table field in the data table.
[0193] The schema document acquisition module 420 is used to obtain a schema document based on the data information of the database table through a preset schema document generation model; the schema document includes the table structure of the database table and the field description of each table field;
[0194] The test case generation scheme acquisition module 430 is used to construct a dimension matrix. Based on the pattern document, the preset business scenario list and the dimension matrix, the test case generation scheme is obtained through the large language model. The business scenario list includes multiple business scenarios, and the dimension matrix includes a combination of the SQL query complexity dimension and the business analysis dimension of the test cases. The test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each business scenario.
[0195] The target test case list acquisition module 440 is used to obtain a list of target test cases corresponding to database tables based on the pattern document and test case generation scheme using a large language model.
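The dimension matrix built by module 430 combines the SQL query complexity dimension with the business analysis dimension. One way to picture it is as the Cartesian product of the two sets of levels, sketched below. Only the level names Q1, Q2, I1, and I2 appear in the examples of this application; the actual number of levels and the dictionary layout used here are assumptions.

```python
from itertools import product

# Levels taken from the examples in the description; the full sets of
# levels are not given in this section, so these lists are assumptions.
Q_LEVELS = ["Q1", "Q2"]  # SQL query complexity dimension
I_LEVELS = ["I1", "I2"]  # business analysis dimension


def build_dimension_matrix(q_levels, i_levels):
    """Return every (q_level, i_level) combination as one matrix cell."""
    return [
        {"q_level": q, "i_level": i}
        for q, i in product(q_levels, i_levels)
    ]


matrix = build_dimension_matrix(Q_LEVELS, I_LEVELS)
```

Each cell of `matrix` is one combination that the test case generation scheme can assign to a business scenario together with a count.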
[0196] In one embodiment, obtaining the test case generation scheme through the large language model based on the pattern document, the preset business scenario list, and the dimension matrix includes:
[0197] Get the preset number of test cases generated;
[0198] Generate the first prompt word based on the pattern document, business scenario list, dimension matrix, and number of test cases generated;
[0199] Input the first prompt word into the large language model to obtain the test case generation scheme.
[0200] In one embodiment, after obtaining the test case generation scheme, the method further includes:
[0201] Obtain the number of test cases allocated to each business scenario from the test case generation scheme;
[0202] The total number of test cases is obtained by summing the number of test cases for each business scenario.
[0203] If the total number of test cases is inconsistent with the preset number of test cases generated, the first prompt word is corrected to obtain the corrected first prompt word;
[0204] The revised first prompt word is input into the large language model to obtain a new test case generation scheme.
[0205] In one embodiment, obtaining, through the large language model, the list of target test cases corresponding to the database table based on the pattern document and the test case generation scheme includes:
[0206] Generate a second prompt word based on the pattern document and the test case generation scheme;
[0207] Input the second prompt word into the large language model to obtain a list of target test cases corresponding to the database table.
[0208] In one embodiment, the target test case list includes target test cases and corresponding tag information for the target test cases; the tag information includes the business scenario, SQL query complexity dimension, and business analysis dimension of the target test cases; after obtaining the target test case list, the method further includes:
[0209] Generate the target file based on the target test case list;
[0210] Receive the review results for the target file; the review results are obtained by evaluating the business scenario, SQL query complexity dimension, and business analysis dimension of the target test cases based on the test case generation scheme;
[0211] Based on the review results, the second prompt word is revised to obtain the revised second prompt word;
[0212] The revised second prompt word is input into the large language model to obtain a new list of target test cases.
[0213] In one embodiment, the field information of each table field includes the field name, data type, field constraint information, enumeration value, and statistical characteristics of each table field; based on the data information of the database tables, a schema document is obtained through a preset schema document generation model, including:
[0214] Retrieve preset task description information; the task description information represents the document content of the pattern document;
[0215] A third prompt word is generated based on the field name, data type, field constraint information, enumeration value, statistical characteristics, and task description information of each table field.
[0216] Input the third prompt word into the pattern document generation model to obtain the pattern document.
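The construction of the third prompt word from the field information can be sketched as follows. The dictionary keys (`name`, `data_type`, `constraints`, `enum_values`, `stats`), the sample table, and the prompt layout are all assumptions made for illustration; the application lists which pieces of field information are used but not how they are arranged.

```python
def describe_field(field):
    """Render one table field's information as a single prompt line."""
    parts = [f"{field['name']} ({field['data_type']})"]
    if field.get("constraints"):
        parts.append("constraints: " + ", ".join(field["constraints"]))
    if field.get("enum_values"):
        parts.append("enum values: " + ", ".join(map(str, field["enum_values"])))
    if field.get("stats"):
        stats = ", ".join(f"{k}={v}" for k, v in field["stats"].items())
        parts.append("stats: " + stats)
    return "; ".join(parts)


def build_third_prompt(table_name, fields, task_description):
    """Combine the task description and per-field lines into one prompt."""
    lines = [task_description, "", f"Table: {table_name}"]
    lines += [f"- {describe_field(field)}" for field in fields]
    return "\n".join(lines)


# Illustrative field information for a made-up "orders" table.
fields = [
    {"name": "channel", "data_type": "VARCHAR", "constraints": ["NOT NULL"],
     "enum_values": ["direct", "channel_A"], "stats": {"distinct_count": 2}},
    {"name": "amount", "data_type": "DECIMAL", "stats": {"min": 0, "max": 9999}},
]
third_prompt = build_third_prompt(
    "orders", fields,
    "Write a pattern document with the table structure and a description "
    "of every field.",
)
```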
[0217] In one embodiment, after obtaining the schema document, the method further includes:
[0218] The table structure of the schema document and the field descriptions of each table field are validated, and the validation results are obtained.
[0219] If the validation results indicate that there are errors in the table structure and the field descriptions of each table field, the third prompt word is corrected to obtain the corrected third prompt word.
[0220] Input the revised third prompt word into the pattern document generation model to obtain a new pattern document.
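One possible form of the validation and correction is sketched below. It assumes the pattern document has been parsed into a dictionary that maps each field name to its description, and it checks only for missing fields, fields that do not exist in the table, and empty descriptions. Both the document layout and this choice of checks are assumptions, not the validation rules of the application.

```python
def validate_pattern_document(doc, fields):
    """Compare the described fields with the real table fields."""
    expected = {field["name"] for field in fields}
    described = doc.get("fields", {})
    described_names = set(described)
    errors = []
    for name in sorted(expected - described_names):
        errors.append(f"field {name!r} is missing")
    for name in sorted(described_names - expected):
        errors.append(f"field {name!r} does not exist in the table")
    for name in sorted(expected & described_names):
        if not described[name].strip():
            errors.append(f"field {name!r} has an empty description")
    return errors


def correct_third_prompt(third_prompt, errors):
    """Append the validation errors as an additional task description."""
    if not errors:
        return third_prompt
    return (
        third_prompt
        + "\n\nThe previous pattern document had these problems:\n"
        + "\n".join(f"- {error}" for error in errors)
        + "\nRe-output the pattern document with these problems fixed."
    )


fields = [{"name": "channel"}, {"name": "amount"}]
doc = {"table": "orders", "fields": {"channel": "Sales channel", "price": "?"}}
errors = validate_pattern_document(doc, fields)
```

An empty `errors` list leaves the third prompt word unchanged, so the same function can be called on every round of generation.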
[0221] Each module in the aforementioned test case generation device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.
[0222] In one exemplary embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 5. The computer device includes a processor, memory, an input/output (I/O) interface, and a communication interface. The processor, memory, and I/O interface are connected via a system bus, and the communication interface is also connected to the system bus via the I/O interface. The processor provides computational and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores the operating system, computer programs, and a database. The internal memory provides the environment for running the operating system and computer programs stored in the non-volatile storage medium. The database stores test case generation data. The I/O interface is used for exchanging information between the processor and external devices. The communication interface is used for communicating with external terminals via a network connection. When the computer program is executed by the processor, it implements a test case generation method.
[0223] Those skilled in the art will understand that the structure shown in Figure 5 is merely a block diagram of the parts of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement. In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, which, when executed by the processor, causes the processor to perform the steps of the aforementioned test case generation method. The steps of the test case generation method described here may be the steps of the test case generation method in any of the embodiments described above.
[0224] In one embodiment, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, causes the processor to perform the steps of the test case generation method described above. The steps of the test case generation method described here may be steps from one of the test case generation methods in the various embodiments described above.
[0225] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, causes the processor to perform the steps of the test case generation method described above. The steps of the test case generation method described here may be steps from one of the test case generation methods in the various embodiments described above.
[0226] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with relevant regulations.
[0227] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile memory and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, artificial intelligence (AI) processors, etc., and are not limited to these.
[0228] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this application.
[0229] The above embodiments merely illustrate several implementation methods of this application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of this application's patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A test case generation method, characterized in that, The method includes: Obtain data information from database tables; the data information includes field information for each table field in the database tables; Obtain a pattern document based on the data information of the database table through a preset pattern document generation model; the pattern document includes the table structure of the database table and the field description of each table field; Construct a dimension matrix, and obtain a test case generation scheme through a large language model based on the pattern document, the preset business scenario list, and the dimension matrix. The business scenario list includes multiple business scenarios, and the dimension matrix includes a combination of the SQL query complexity dimension and the business analysis dimension of the test cases. The test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each of the business scenarios. Based on the pattern document and the test case generation scheme, obtain, through the large language model, a list of target test cases corresponding to the database table.
2. The method according to claim 1, characterized in that, The step of obtaining a test case generation scheme based on the pattern document, the preset business scenario list, and the dimension matrix using a large language model includes: Get the preset number of test cases generated; Based on the pattern document, the business scenario list, the dimension matrix, and the number of test cases generated, a first prompt word is generated; The first prompt word is input into the large language model to obtain the test case generation scheme.
3. The method according to claim 2, characterized in that, After obtaining the test case generation scheme, the method further includes: Obtain the number of test cases allocated to each of the aforementioned business scenarios from the test case generation scheme; The total number of test cases is obtained by summing the number of test cases for each of the aforementioned business scenarios. If the total number of test cases is inconsistent with the preset number of test cases generated, the first prompt word is corrected to obtain the corrected first prompt word; The revised first prompt word is input into the large language model to obtain a new test case generation scheme.
4. The method according to claim 1, characterized in that, The step of obtaining a list of target test cases corresponding to the database table based on the pattern document and the test case generation scheme using the large language model includes: Based on the pattern document and the test case generation scheme, generate a second prompt word; The second prompt word is input into the large language model to obtain the target test case list.
5. The method according to claim 4, characterized in that, The target test case list includes target test cases and corresponding tag information for each target test case; the tag information includes the business scenario, SQL query complexity dimension, and business analysis dimension of the target test case. After obtaining the target test case list, the method further includes: Generate a target file based on the target test case list; Receive the review results for the target file; the review results are obtained by evaluating the business scenario, SQL query complexity dimension, and business analysis dimension of the target test case based on the test case generation scheme; Based on the review results, the second prompt word is revised to obtain the revised second prompt word; The revised second prompt word is input into the large language model to obtain a new list of target test cases.
6. The method according to claim 1, characterized in that, The field information of each table field includes the field name, data type, field constraint information, enumeration value, and statistical characteristics of each table field; The step of obtaining a schema document based on the data information of the database table using a preset schema document generation model includes: Obtain preset task description information; the task description information represents the document content of the pattern document; A third prompt word is generated based on the field name, data type, field constraint information, enumeration value, and statistical characteristics of each table field, as well as the task description information. The third prompt word is input into the pattern document generation model to obtain the pattern document.
7. The method according to claim 6, characterized in that, After obtaining the pattern document, the method further includes: The table structure of the schema document and the field descriptions of each table field are validated to obtain the validation results. If the verification result indicates that there are errors in the table structure and the field descriptions of each table field, the third prompt word is corrected to obtain the corrected third prompt word. The corrected third prompt word is input into the pattern document generation model to obtain a new pattern document.
8. A test case generation device, characterized in that, The device includes: The data information acquisition module is used to acquire data information from database tables; the data information includes field information of each table field in the database tables; The pattern document acquisition module is used to obtain a pattern document based on the data information of the database table through a preset pattern document generation model; the pattern document includes the table structure of the database table and the field description of each table field; The test case generation scheme acquisition module is used to construct a dimension matrix and, based on the pattern document, the preset business scenario list, and the dimension matrix, to obtain a test case generation scheme through a large language model. The business scenario list includes multiple business scenarios, and the dimension matrix includes a combination of the SQL query complexity dimension and the business analysis dimension of the test cases. The test case generation scheme is used to assign test cases with different SQL query complexity dimensions and different business analysis dimensions to each of the business scenarios. The target test case list acquisition module is used to obtain a target test case list corresponding to the database table based on the pattern document and the test case generation scheme using the large language model.
9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, When the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 7.