Database paging query method, device, medium and product

By using a list of row identifiers in database pagination queries to perform sorting operations at the beginning, transforming them into a subset of target row identifiers for extraction, the problem of slow speed in deep pagination queries in databases is solved, achieving fast response and pagination queries with imperceptible latency.

CN122285702APending Publication Date: 2026-06-26SHENZHEN JINGYUAN INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHENZHEN JINGYUAN INFORMATION TECH CO LTD
Filing Date
2026-03-17
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

When performing deep pagination queries, especially when accessing deep pagination such as page 2000, the existing database needs to sequentially scan and skip a large number of records, resulting in slow query speed and failing to meet the needs of real-time interactive analysis.

Method used

By prioritizing the sorting operation and utilizing the list of row identifiers in the target dataset, pagination queries are transformed into truncation of a subset of the target row identifiers, thereby constructing a data query statement to obtain complete data records, avoiding database scanning and discarding large amounts of data.

Benefits of technology

It improves the speed of database pagination queries, ensuring consistent response time. Regardless of which page is requested, there is no noticeable delay when turning or jumping between pages, meeting the needs of real-time interaction.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122285702A_ABST
    Figure CN122285702A_ABST
Patent Text Reader

Abstract

This application discloses a database pagination query method, device, medium, and product, relating to the field of database technology. The method includes: receiving a pagination query request for a target dataset, the pagination query request including query conditions, the amount of data per page, and a requested page number; the query conditions including a sorting field and a sorting direction; obtaining a list of row identifiers for the target dataset corresponding to the query conditions, the list including row identifiers of all data records in the target dataset, and the row identifiers being arranged in the order corresponding to the query conditions; extracting a subset of target row identifiers corresponding to the requested page number from the list of row identifiers based on the requested page number and the amount of data per page; constructing a data query statement based on the subset of target row identifiers; and retrieving complete data records corresponding to each target row identifier in the subset of target row identifiers from the target database storing the target dataset based on the data query statement. This application can improve the speed of database pagination queries.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This application belongs to the field of database technology, and in particular relates to a database pagination query method, device, medium and product. Background Technology

[0002] In the semiconductor integrated circuit manufacturing process, to ensure product yield, optical or electron beam defect inspection equipment is often used to scan the entire surface of the wafer. A single inspection can generate millions to billions of defect records (including coordinates, dimensions, types, etc.) and store them in relational databases (such as MySQL and SQLite). Engineers need to frequently perform multi-dimensional sorting and pagination of the defect data to locate process problems. For example, for a specific inspection batch, the data can be sorted in descending order by defect size and returned to a specified page.

[0003] Existing databases commonly use LIMIT and OFFSET to implement pagination queries. However, when deep pagination is required, such as querying page 2000 and needing to skip the first 100,000 records using OFFSET 100,000, even if the sorting field has an index, the database still needs to sequentially scan and skip the first 100,000 records that meet the criteria before retrieving subsequent records. The complexity of this process increases linearly with OFFSET. For tables with hundreds of millions of defects, a single deep query can take several seconds to tens of seconds, causing interface lag and a surge in database load, failing to meet the needs of real-time interactive analysis. Summary of the Invention

[0004] This application provides a database pagination query method, device, medium, and product, which can improve the query speed of database pagination queries and meet the needs of real-time interaction.

[0005] A first aspect of this application provides a database pagination query method, including: Receive a pagination query request for a target dataset. The pagination query request includes query conditions, the amount of data per page, and the requested page number. The query conditions include a sorting field and a sorting direction. Obtain a list of row identifiers for the target dataset corresponding to the query conditions. The list of row identifiers includes the row identifiers of all data records in the target dataset, and the row identifiers are arranged in the order corresponding to the query conditions. Based on the requested page number and the amount of data per page, extract the target row identifier subset corresponding to the requested page number from the row identifier list; Construct a data query statement based on the target row identifier subset; Based on the data query statement, complete data records corresponding to each target row identifier in the target row identifier subset are obtained from the target database storing the target dataset.

[0006] A second aspect of this application provides a database pagination query apparatus, comprising: The receiving module is used to receive pagination query requests for a target dataset. The pagination query request includes query conditions, the amount of data per page, and the requested page number. The query conditions include a sorting field and a sorting direction. The acquisition module is used to acquire a list of row identifiers of the target dataset corresponding to the query conditions. The list of row identifiers includes the row identifiers of all data records in the target dataset, and the row identifiers are arranged in the order corresponding to the query conditions. The extraction module is used to extract a subset of target row identifiers corresponding to the requested page number from the row identifier list based on the requested page number and the amount of data per page; The construction module is used to construct a data query statement based on the subset of target row identifiers; The acquisition module is used to acquire, based on the data query statement, complete data records corresponding to each target row identifier in the target row identifier subset from the target database storing the target dataset.

[0007] A third aspect of the embodiments of this application provides an electronic device, including: a memory and a program or instructions stored in the memory and executable on a processor, wherein when the program or instructions are executed by the processor, they implement the database pagination query method provided in any of the embodiments of this application described above.

[0008] A fourth aspect of the embodiments of this application provides a computer-readable storage medium storing a computer program or instructions, which, when executed by a processor, implements the database pagination query method provided in any of the embodiments of this application described above.

[0009] A fifth aspect of the embodiments of this application provides a computer program product in which instructions, when executed by a processor of an electronic device, cause the electronic device to perform a database pagination query method as provided in any of the embodiments of this application described above.

[0010] The database pagination query method provided in this application provides a method that prioritizes sorting by using a list of row identifiers for the target dataset corresponding to the query conditions. It transforms the pagination query into a subset of target row identifiers by considering the amount of data per page and the requested page number. A data query statement is then constructed using this subset, and complete data records corresponding to each target row identifier in the subset are retrieved from the database. This avoids the slow query speed caused by the database needing to scan and discard large amounts of data during pagination, reducing the computational load on the database. The response time remains consistent regardless of the requested page, with no perceptible delay during page turning or jumping, thus improving the query speed of database pagination and meeting the needs of real-time interaction. Attached Figure Description

[0011] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments of this application will be briefly introduced below. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0012] Figure 1 This is a schematic flowchart of a database pagination query method provided in one embodiment of this application; Figure 2 This is a schematic flowchart of a database pagination query method provided in one embodiment of this application; Figure 3 This is a schematic diagram of the structure of a database pagination query device provided in one embodiment of this application; Figure 4 This is a schematic diagram of an electronic device provided in one embodiment of this application. Detailed Implementation

[0013] The features and exemplary embodiments of various aspects of this application will be described in detail below. To make the objectives, technical solutions, and advantages of this application clearer, the application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are only intended to explain this application and not to limit it. For those skilled in the art, this application can be implemented without some of these specific details. The following description of the embodiments is merely to provide a better understanding of this application by illustrating examples.

[0014] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes said element.

[0015] It should be noted that the acquisition, storage, use, and processing of data in the technical solution of this application all comply with the relevant provisions of national laws and regulations. In the embodiments of this application, certain existing industry solutions such as software, components, and models may be mentioned. These should be considered exemplary, intended only to illustrate the feasibility of implementing the technical solution of this application, and do not imply that the applicant has already used or necessarily used such solutions.

[0016] In the field of semiconductor integrated circuit technology, to ensure product yield, optical or electron beam defect inspection equipment is often used to scan the entire surface of wafers. A single inspection can generate millions to billions of defect records (including coordinates, size, type, etc.) and store them in relational databases (such as MySQL and SQLite). Engineers need to frequently perform multi-dimensional sorting and pagination of defect data to comprehensively evaluate defect distribution and ensure process stability. For example, when sorting defect records for a specific batch of inspections by defect size from largest to smallest, large defect sizes often have strong randomness, so defect records at the beginning of the sequence are difficult to reflect systematic process deviations. In a stable process, the size of most defects on the wafer will be concentrated within a typical range. In the defect records in the middle of the sequence, the changes in defect size can more accurately reflect the drift of process parameters such as etching and deposition, and can be used to monitor process stability. Small defect sizes that are close to the detection limit, and defect records at the end of the sequence, can be used to reflect the substrate noise of the inspection equipment and minor differences in the process. Therefore, engineers need to be able to quickly access any position in the sorted list to comprehensively evaluate defect distribution and monitor process stability.

[0017] However, existing databases commonly use LIMIT and OFFSET to implement pagination queries. But when accessing deep pagination, such as page 2000 with OFFSET 100000, even if the sorting field has an index, the database still needs to sequentially scan and skip the first 100000 records that meet the criteria before retrieving subsequent records. The complexity of this process increases linearly with OFFSET. For a defect table with hundreds of millions of records, a single deep query can take several seconds to tens of seconds, causing interface lag and a surge in database load, failing to meet the needs of real-time interactive analysis.

[0018] In view of this, this application provides a database pagination query method, device, medium, and product. The data pagination method uses a list of row identifiers of the target dataset corresponding to the query conditions to perform a sorting operation beforehand. It transforms the pagination query into a subset of target row identifiers by considering the amount of data per page and the requested page number. Then, it constructs a data query statement using the subset of target row identifiers and retrieves complete data records corresponding to each target row identifier from the database. This avoids the slow query speed caused by the database needing to scan and discard large amounts of data during pagination operations, reduces the computational load on the database, and ensures consistent response time regardless of the requested page. Page turning and jumping are seamless, improving the query speed of database pagination and meeting the needs of real-time interaction.

[0019] The database pagination query method provided in this application embodiment is described below. In practical applications, the executing entity of the database pagination query method in this application embodiment can be an electronic device. This electronic device can have a display screen, thereby displaying the complete data records obtained from the query on the display screen.

[0020] The following describes specific embodiments of the database pagination query method, apparatus, device, medium, and product provided in this application. First, the database pagination query method will be introduced.

[0021] Figure 1 This is a flowchart illustrating a database pagination query method provided in one embodiment of this application. Figure 1 As shown, the database pagination query method includes steps 101 to 105.

[0022] Step 101: Receive a pagination query request for the target dataset. The pagination query request includes query conditions, the amount of data per page, and the requested page number. The query conditions include the sorting field and the sorting direction. Step 102: Obtain the row identifier list of the target dataset corresponding to the query conditions. The row identifier list includes the row identifiers of all data records in the target dataset, and the row identifiers are arranged in the order corresponding to the query conditions. Step 103: Based on the requested page number and the amount of data per page, extract the target row identifier subset corresponding to the requested page number from the row identifier list; Step 104: Construct a data query statement based on the subset of target row identifiers; Step 105: Based on the data query statement, retrieve the complete data records corresponding to each target row identifier in the target row identifier subset from the target database storing the target dataset.

[0023] The target dataset is the collection of data stored in the database that the user currently wants to query. For example, the target dataset can be all defect records generated by a certain inspection task, and the defect records can include defect coordinates, defect size, defect type, etc.

[0024] A pagination query request is an instruction initiated by an application that interacts with a database to view data records in a target dataset. A pagination query request can include query criteria, the requested page number, and the number of data records per page.

[0025] The query conditions include the sorting field and the sorting direction, which are the conditions for filtering and sorting the target dataset. The sorting field is a field in the data records of the target dataset, and the sorting direction can be ascending or descending. For example, when the sorting field is defect size and the sorting direction is descending, it means that the defect records in the target dataset are sorted in descending order of defect size.

[0026] The number of data records displayed per page is the number of data records displayed on each page. It is used to paginate and display the data records in the target dataset after they have been sorted according to the query conditions.

[0027] For a target dataset sorted according to query criteria and paginated according to the amount of data per page, a request page number is used to identify the page number of the data records the user wishes to view. Request page numbers are typically positive integers starting from 1; for example, a request page number of 1 indicates viewing page 1, and a request page number of 100 indicates viewing page 100. A larger request page number indicates that the user expects to skip more pages of data. The amount of data per page and the request page number together determine the number of data records that need to be skipped and returned in a single query.

[0028] In step 102, the row identifier is a number or code used in the database to uniquely identify a data record. The row identifier can be used to quickly and accurately locate the complete data record corresponding to the row identifier in the database.

[0029] In some embodiments, the memory cache of an electronic device may pre-store a list of row identifiers for different datasets corresponding to multiple different query conditions. Therefore, the list of row identifiers for the target dataset corresponding to the query conditions can be obtained from the memory cache according to the target dataset and the query conditions.

[0030] The row identifier list of the target dataset corresponding to the query conditions stores the row identifiers of all data records in the target dataset, and the order of the stored row identifiers is arranged according to the order corresponding to the query conditions. For example, when the sorting field in the query conditions is defect size and the sorting direction is descending, then in the row identifier list of the target dataset corresponding to the query conditions, the first row identifier points to the data record with the largest defect size in the target dataset, and the last row identifier points to the data record with the smallest defect size in the target dataset.

[0031] In step 103, the target row identifier subset is a portion of the row identifiers extracted from the complete list of row identifiers, based on the requested page number and the amount of data per page. The row identifiers included in the target row identifier subset are the row identifiers of the data records in the requested page number.

[0032] In some embodiments, the page number and the amount of data per page can be requested to calculate the starting position and number of line identifiers to be extracted from the line identifier list. For example, if the requested page number is page 2, and each page displays 10 data entries, then 10 line identifiers are extracted consecutively starting from the 11th position of the line identifier list to obtain the target subset of line identifiers corresponding to the requested page number. If the requested page numbers are pages 1000-1005, and each page displays 10 data entries, then 50 line identifiers are extracted consecutively starting from the 10000th position of the line identifier list to obtain the target subset of line identifiers corresponding to the requested page number.

[0033] In step 104, since row identifiers can uniquely identify a data record in the database, after determining the target subset of row identifiers, a data query statement can be constructed based on the row identifiers included in the target subset. For example, the filtering condition of the data query statement can be all row identifiers in the target subset of row identifiers.

[0034] In step 105, a data query statement can be sent to the database to retrieve the data records corresponding to each row identifier in the target row identifier subset. Since the query conditions are constructed based on row identifiers, the database does not need to perform any additional sorting or full table scans, and can return query results at extremely high speed. After obtaining the query results, the data records corresponding to each retrieved row identifier can be sorted according to the order of the row identifiers in the target row identifier subset to obtain the complete data records corresponding to each target row identifier in the target row identifier subset.

[0035] The database pagination query method provided in this application provides a method that prioritizes sorting by using a list of row identifiers for the target dataset corresponding to the query conditions. It transforms the pagination query into a subset of target row identifiers by considering the amount of data per page and the requested page number. A data query statement is then constructed using this subset, and complete data records corresponding to each target row identifier in the subset are retrieved from the database. This avoids the slow query speed caused by the database needing to scan and discard large amounts of data during pagination, reducing the computational load on the database. The response time remains consistent regardless of the requested page, with no perceptible delay during page turning or jumping, thus improving the query speed of database pagination and meeting the needs of real-time interaction.

[0036] In some embodiments, in order to improve the efficiency of database pagination queries, steps 201 to 203 are included before step 101 receives the pagination query request for the target dataset.

[0037] Step 201: Divide the target dataset into multiple consecutive and non-overlapping data fragments; Step 202: For each data shard, obtain the intermediate results of each data shard sorted according to the query conditions from the target database. The intermediate results include row identifiers and their corresponding sorting field values. Step 203: Perform multi-way merge sort on all intermediate results according to the query conditions to obtain a list of row identifiers corresponding to the query conditions.

[0038] Data sharding refers to a subset of the target dataset. In step 201, the target dataset can be divided into multiple contiguous and non-overlapping data shards according to the row identifier of the data records or any field in the data records. Here, "contiguous" means that all data records in the multiple data shards can cover all data records in the target dataset, and "non-overlapping" means that there are no duplicate data records in the multiple data shards.

[0039] To ensure that the list of row identifiers of the target dataset corresponding to the query conditions can be obtained, in step 202, a database query statement is constructed for each data shard to sort the data records corresponding to that shard in the target dataset according to the query conditions, namely the sorting field and sorting direction.

[0040] Send the database query statement corresponding to each data shard to the database, and determine the query result returned by the database as the intermediate result of that data shard sorted according to the query conditions.

[0041] For each data shard, the intermediate results include row identifiers and sort field values ​​for all data records corresponding to that shard. The sort field value refers to the numerical value of the sort field within the data record. For example, when the sort field is defect size, the sort field value could be 100 nanometers, 95 nanometers, 88 nanometers, etc. In the intermediate query results, the order of all data records corresponding to the data shard is determined by the sort field value and the sorting direction. For example, when the sorting direction is ascending, the data records in the intermediate query results are arranged in ascending order of the sort field value.

[0042] In step 202, since the amount of data in each data shard is much smaller than the amount of data in the target dataset, and the database query statement limits the query scope to the data records in the data shards, and the intermediate results can include only the row identifiers and sorting field values ​​of all data records corresponding to the data shards, the intermediate results of the data shards sorted according to the query conditions can be obtained quickly.

[0043] In step 203, since the intermediate results have already been sorted according to the sorting field value, and the multi-way merge sort algorithm can utilize the existing local ordering to efficiently merge the globally ordered results of the sorting field in multiple data shards, i.e., the target dataset, for each data shard, the intermediate results of each data shard sorted according to the query conditions are first obtained from the target database, and then all intermediate results are sorted according to the query conditions based on the sorting field value. Compared with directly sorting all data records in the target dataset in the database, this method has lower computational complexity and can quickly obtain the sorting results of all data records in the target dataset according to the query conditions, and obtain the row identifier list corresponding to the query conditions.

[0044] In one implementation, to reduce memory usage, the list of row identifiers corresponding to the query conditions may only include the row identifiers of the data records and not other fields.

[0045] The database pagination query method provided in this application divides the target dataset into multiple continuous and non-overlapping data shards. For each data shard, intermediate results sorted by query conditions for each data shard are obtained from the target database. This decomposes the sorting task of all data records into multiple parallel subtasks, reducing the amount of data and complexity of a single query. By performing multi-way merge sort on all intermediate results according to the query conditions based on the sorting field value, the sorting results of all data records in the target dataset according to the query conditions can be quickly obtained, thereby obtaining the row identifier list corresponding to the query conditions, thus improving the query efficiency for pagination query requests for the target dataset.

[0046] In some embodiments, the electronic device's memory cache pre-stores multiple lists of row identifiers corresponding to different datasets for different query conditions. Each list of row identifiers corresponds to a cache key determined based on the target dataset and the query conditions. Step 102, obtaining the list of row identifiers for the target dataset corresponding to the query conditions, may include: determining a query cache key based on the target dataset and the query conditions; searching the memory cache for the list of row identifiers for the target dataset corresponding to the query conditions based on the query cache key; if the cache is hit, it means that the list of row identifiers for the target dataset corresponding to the query conditions has been successfully found in the memory cache, and the list of row identifiers for the target dataset corresponding to the query conditions is obtained; if the cache is not hit, it means that the list of row identifiers for the target dataset corresponding to the query conditions has not been found in the memory cache. In this case, the list of row identifiers for the query conditions can be generated according to the methods shown in steps 201 to 203, and after obtaining the list of row identifiers for the target dataset corresponding to the query conditions, it can be stored in the memory cache so that the list of row identifiers for the target dataset corresponding to the query conditions can be obtained from the memory cache later.

[0047] In some embodiments, to further improve the efficiency of database pagination queries, after step 203, step 301 is included, which involves storing the query conditions and their corresponding row identifier list into a memory cache. The processing procedure of step 102 includes step 302, retrieving the row identifier list corresponding to the query conditions from the memory cache.

[0048] Memory cache is a dedicated storage area in a computer system's main memory used for temporary storage of intermediate data generated during computation. Memory read / write speeds are significantly faster than disks or solid-state drives (SSDs). Storing data in memory greatly improves the response speed of subsequent accesses. Therefore, storing query conditions and their corresponding row identifier lists in the memory cache allows for faster pagination queries when retrieving the row identifier list for the target dataset corresponding to the query conditions, as the list is retrieved directly from the memory cache.

[0049] The database pagination query method provided in this application caches the query conditions and their corresponding row identifier list in memory. When querying data records with the same sorting condition on different page numbers, the row identifier list can be directly retrieved from the cache without waiting for any database operations or sorting calculations, thereby improving the speed of obtaining pagination query results.

[0050] In some embodiments, to further improve the pagination query speed, the processing of step 201 may include steps 401 to 403.

[0051] Step 401: Obtain the total number of records in the target dataset; Step 402: Determine the number of shards based on the total number of records. The number of shards is positively correlated with the total number of records. Step 403: Based on the number of shards and the row identifiers of all data records in the target dataset, divide the target dataset into multiple data shards. Each data shard includes data records in the target dataset whose row identifiers are a continuous value range. The row identifiers of the data records included in all data shards are continuous and do not overlap.

[0052] In some embodiments, the total number of records in the target dataset can be obtained by executing a lightweight database aggregation query. For example, the standard COUNT function can be used to count all records. Alternatively, the total number of records can be obtained by querying the maximum primary key value, leveraging the continuity of the auto-incrementing primary key in the database table. For example, executing SELECT MAX auto-incrementing primary key FROM table_name can quickly yield an approximate total number of records in the target dataset.

[0053] In step 402, the total number of records refers to the total number of data records in the target dataset.

[0054] In some embodiments, after obtaining the total number of records, the number of data shards to be divided into can be determined according to preset rules, with the number of shards being positively correlated with the total number of records. For example, a threshold can be set; when the total number of records is less than the threshold, only one shard is used to avoid unnecessary parallel overhead; when the total number of records exceeds the threshold, a fixed number of shards is used, such as ten shards when the number of records exceeds ten thousand. As another example, the number of shards can be calculated based on the expected number of data records contained in each data shard. For instance, when the expected number of data records in each data shard is set to ten thousand, the number of shards is equal to the total number of records divided by ten thousand, rounded up.

[0055] In step 403, since row identifiers are usually primary keys in database tables, and have uniqueness and index support, the target dataset can be divided into multiple data shards based on the number of shards and the row identifiers of all data records in the target dataset.

[0056] For example, the minimum and maximum values ​​of row identifiers in the target dataset can be obtained, or the continuity of the auto-incrementing primary key can be directly utilized to divide the total range of row identifiers into several continuous intervals based on the number of shards, with each interval corresponding to a data shard. For instance, if the minimum row identifier is 1, the maximum is 10000, and the number of shards is 5, then the interval width of each shard is 2000, resulting in data shards corresponding to row identifier intervals of 1 to 2000, 2001 to 4000, 4001 to 6000, 6001 to 8000, and 8001 to 10000, respectively. Each data shard contains all data records whose row identifiers fall within their corresponding intervals. Dividing data into shards based on continuous intervals of row identifiers allows the query conditions for each shard to be written in the form of BETWEEN start value AND end value. This enables the database to quickly locate data blocks using indexes when retrieving intermediate results from each data shard sorted by query conditions, avoiding full table scans and improving the speed of obtaining intermediate results. The execution speed of each shard query is also improved.

[0057] The database pagination query method provided in this application determines the number of shards based on the total number of records, enabling the method to adapt to datasets of different sizes. This avoids the additional overhead of creating multiple threads, establishing multiple database connections, and executing multiple queries when the total number of records in the target dataset is small. At the same time, when the total number of records in the target dataset is large, the method can increase the number of shards and quickly obtain the intermediate results of all data shards by querying the intermediate results of each data shard sorted by the query conditions in parallel, thereby improving the speed of generating the row identifier list corresponding to the query conditions.

[0058] In some embodiments, in order to improve the speed of generating a list of row identifiers corresponding to query conditions, step 202 may include steps 501 to 503, in which intermediate results of each data shard sorted by query conditions are obtained from the target database.

[0059] Step 501: Generate a corresponding query task for each data shard based on the query conditions and the data range corresponding to each data shard in the target dataset. Step 502: Based on the database file path where the target dataset is stored in the target database and the identifier of the current execution thread of each query task, create an independent database connection for each query task. Step 503: Execute multiple query tasks concurrently through each database connection to obtain intermediate results for each data shard sorted by the query conditions.

[0060] In some embodiments, when generating a corresponding query task for each data shard, the query conditions can be transformed into SELECT and ORDER BY clauses in an SQL statement. The SELECT clause includes a sorting field, used to retrieve the value of the sorting field from the data records. The ORDER BY clause includes a sorting field and a sorting condition, used to sort the query results based on the value of the sorting field. The data range corresponding to each data shard in the target dataset can be transformed into a BETWEEN condition in an SQL statement, used to limit the query range.

[0061] The database file path is the complete path of the database file storing the target dataset in the operating system. It is used to locate the database file that needs to be accessed and is a parameter that must be provided when creating a database connection.

[0062] A thread identifier is a unique code assigned to each thread by the operating system or programming language, usually called a thread ID. A thread identifier can uniquely identify a thread during program execution; different threads have different identifiers, and the identifier of the same thread remains unchanged throughout its lifetime.

[0063] In step 502, when the database driver has thread safety restrictions—that is, a connection cannot be used by multiple threads simultaneously—a database connection is created for the query task based on the database file path and the identifier of the currently executing thread. This ensures that the database connection created for each thread is unique, preventing multiple threads from sharing the same connection. Therefore, even if multiple threads access the same database file simultaneously, because they are using different connections, the database internally treats them as different client sessions, processing them independently without interference. Consequently, parallel queries can be implemented even in database drivers that do not support multi-threaded concurrency, reducing the instantaneous pressure on database resources.

[0064] For example, in the Qt framework, a QSqlDatabase connection can only be used by one thread at a time. This means that if multiple threads access the same database connection simultaneously, it may lead to data races and other unpredictable behavior. Therefore, to avoid thread safety issues with Qt database connections, a unique database connection is created for each thread by combining the current thread ID with the path to form a unique connection name. This ensures that each thread has its own database connection, avoiding data races and other unpredictable problems.

[0065] In step 503, by creating an independent database connection for each query task and executing multiple query tasks concurrently, when a thread completes a query task, if there are other query tasks waiting to be executed, the thread can continue to take on new tasks and execute them until all shard queries are completed. This improves the speed of obtaining intermediate results of all data shards sorted by query conditions. At the same time, it avoids the situation where a single large query occupies database locks and I / O resources for a long time.

[0066] The database pagination query method provided in this application improves the speed of obtaining intermediate results of each data segment sorted by query conditions by generating corresponding query tasks for each data segment, creating an independent database connection for each query task, and executing multiple query tasks concurrently, thereby improving the response speed of pagination queries.

[0067] In some embodiments, in order to improve the utilization efficiency of database connection resources, step 601 is included after step 503.

[0068] Step 601: For each query task, after obtaining the intermediate results of the corresponding data shards sorted by the query conditions, close and cancel the database connection corresponding to the query task.

[0069] Closing a connection means actively disconnecting the communication channel between the application and the database. Closing a connection causes the database server to release all resources associated with that connection, including server-side memory buffers, lock objects, transaction context, etc. Simultaneously, the application's connection object is marked as unavailable, preventing any further database operations from being performed through that connection.

[0070] Unregistering a connection refers to removing a reference to and record of that connection from the application's connection management mechanism. In many database access frameworks, each created connection needs to be managed by name or handle, and the connection registry maintains an index of all active connections. Unregistering a connection removes it from the registry, allowing the connection name to be reused by subsequent new connections, while also preventing other parts of the program from accidentally referencing a closed connection.

[0071] The database pagination query method provided in this application, by promptly closing and canceling the corresponding data connection after obtaining the intermediate results of the corresponding data shards sorted according to the query conditions, releases related resources, which can prevent database connection leakage, ensure the stable operation of the database, avoid connection name conflicts, guarantee the success rate of connection creation, ensure that database connection resources are only occupied when the query is actually executed, and improve the resource utilization of the database.

[0072] In some embodiments, in order to improve the efficiency of generating a list of row identifiers corresponding to query conditions, steps 701 to 703 are included after step 203.

[0073] Step 701: Reverse the order of the row identifiers in the row identifier list to generate a reversed row identifier list; Step 702: Construct reverse query conditions corresponding to the query conditions. The sorting direction of the reverse query conditions is opposite to the sorting direction of the query conditions. Step 703: Determine the list of row identifiers in reverse order as the list of row identifiers corresponding to the reverse query conditions.

[0074] In some embodiments, after determining the list of row identifiers corresponding to the query conditions, since the row identifiers in the list are arranged according to the sorting field and sorting direction, reversing the order of the row identifiers in the list can obtain a list of row identifiers arranged in the reverse direction of the sorting field and sorting direction, i.e., a reverse-order list of row identifiers.

[0075] For example, if the list of row identifiers corresponding to the query conditions is sorted in descending order of defect size, then the list of row identifiers in reverse order is sorted in ascending order of defect size.

[0076] A reverse query condition is a query condition that, compared to the query condition, only changes the sorting direction.

[0077] In one implementation, the order of the line identifiers in the list of line identifiers can be reversed in place using standard library functions provided by the programming language, such as the std::reverse function in C++, to obtain a reversed list of line identifiers.

[0078] In step 703, the reverse row identifier list is determined as the row identifier list corresponding to the reverse query conditions. This allows the row identifier list corresponding to the reverse query conditions to be obtained without performing sharded queries and multi-way merge sort. Furthermore, the row identifier list corresponding to the reverse query conditions can be stored in the memory cache. Thus, when the sorting direction in the query conditions changes, the corresponding row identifier list can be directly retrieved from the cache based on the new query conditions, thereby improving the speed of pagination queries.

[0079] In some scenarios, switching the sorting direction is a common operation. For example, if you first check the defect size from largest to smallest and then switch to checking it from smallest to largest, re-sorting may take a long time. However, if there is a list of row identifiers corresponding to the query conditions, you can quickly obtain the list of row identifiers corresponding to the reverse query conditions by reversing the order of the row identifiers in the list.

[0080] The database pagination query method provided in this application reverses the order of row identifiers in the row identifier list to generate a reversed row identifier list, and constructs a reversed query condition corresponding to the query condition. This allows the row identifier list corresponding to the reversed query condition to be obtained without repeatedly executing the database query operation, thereby improving the response speed of pagination queries.

[0081] Figure 2 This is a flowchart illustrating a database pagination query method provided in one embodiment of this application. Figure 2 As shown, in some embodiments, the database pagination query method includes steps 801 to 807.

[0082] Step 801: Receive a pagination query request for the target dataset. The pagination query request includes query conditions, the amount of data per page, and the requested page number. The query conditions include the sorting field and the sorting direction. Step 802: Divide the target dataset into multiple consecutive and non-overlapping data fragments; Step 803: For each data shard, obtain the intermediate results of each data shard sorted according to the query conditions from the target database. The intermediate results include the row identifier and sorting field value of the data record. Step 804: Perform multi-way merge sort on all intermediate results according to the query conditions based on the sorting field value to obtain a list of row identifiers corresponding to the query conditions; Step 805: Based on the requested page number and the amount of data per page, extract the target row identifier subset corresponding to the requested page number from the row identifier list; Step 806: Construct a data query statement based on the subset of target row identifiers; Step 807: Based on the data query statement, retrieve the complete data records corresponding to each target row identifier in the target row identifier subset from the target database storing the target dataset.

[0083] The specific implementation methods of steps 801 to 807 can be found in the descriptions in the above embodiments, and will not be repeated here.

[0084] Based on the same inventive concept as the database pagination query method, this application also provides specific embodiments of a database pagination query device.

[0085] Figure 3 This is a schematic diagram of the structure of the database pagination query device provided in the embodiments of this application. For example... Figure 3 As shown, the database pagination query device 30 provided in this application embodiment includes: The receiving module 31 is used to receive a pagination query request for the target dataset. The pagination query request includes query conditions, the amount of data per page, and the requested page number. The query conditions include a sorting field and a sorting direction. The acquisition module 32 is used to acquire a list of row identifiers of the target dataset corresponding to the query conditions. The list of row identifiers includes the row identifiers of all data records in the target dataset, and the row identifiers are arranged in the order corresponding to the query conditions. The extraction module 33 is used to extract the target row identifier subset corresponding to the requested page number from the row identifier list based on the requested page number and the amount of data per page; Module 34 is used to construct a data query statement based on a subset of target row identifiers; The acquisition module 35 is used to retrieve, based on the data query statement, the complete data records corresponding to each target row identifier in the target row identifier subset from the target database storing the target dataset.

[0086] As an optional embodiment, the database pagination query device further includes: The partitioning module is used to divide the target dataset into multiple contiguous and non-overlapping data partitions. The second acquisition module is used to obtain intermediate results of each data shard from the target database, sorted according to the query conditions. The intermediate results include the row identifier and sorting field value of the data record. The sorting module is used to perform multi-way merge sorting on all intermediate results according to the query conditions based on the sorting field values, and obtain a list of row identifiers corresponding to the query conditions.

[0087] As an optional embodiment, the database pagination query device further includes: The storage module is used to store the query conditions and their corresponding row identifier list into the memory cache. The retrieval module is specifically used to retrieve a list of row identifiers corresponding to the query conditions from the memory cache.

[0088] As an optional embodiment, the partitioning module is specifically used for: Get the total number of records in the target dataset; The number of shards is determined based on the total number of records, and the number of shards is positively correlated with the total number of records. Based on the number of shards and the row identifiers of all data records in the target dataset, the target dataset is divided into multiple data shards. Each data shard includes data records in the target dataset whose row identifiers are a continuous range of values. The row identifiers of the data records included in all data shards are continuous and do not overlap.

[0089] As an optional embodiment, the second acquisition module is specifically used for: Based on the query conditions and the data range corresponding to each data shard in the target dataset, generate a corresponding query task for each data shard. Based on the database file path where the target dataset is stored in the target database and the identifier of the current execution thread for each query task, create an independent database connection for each query task; Multiple query tasks are executed concurrently through each database connection to obtain intermediate results for each data shard sorted by the query conditions.

[0090] As an optional embodiment, the second acquisition module is further configured to: For each query task, after obtaining the intermediate results of the corresponding data shards sorted by the query conditions, close and unregister the database connection corresponding to the query task.

[0091] As an optional embodiment, the database pagination query device further includes: The generation module is used to reverse the order of the row identifiers in the row identifier list and generate a reversed row identifier list. The second construction module is used to construct reverse query conditions corresponding to the query conditions. The sorting direction of the reverse query conditions is opposite to the sorting direction of the query conditions. The determination module is used to determine the list of row identifiers corresponding to the reversed row query conditions.

[0092] Figure 4 This is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. Figure 4 As shown, the electronic device provided in this application embodiment may include: a processor 41 and a memory 42 storing computer program instructions.

[0093] Specifically, the processor 41 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits that can be configured to implement the embodiments of this application.

[0094] Memory 42 may include mass storage for data or instructions. For example, and not limitingly, memory 42 may include a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, memory 42 may include removable or non-removable (or fixed) media. Where appropriate, memory 42 may be internal or external to the integrated gateway disaster recovery device. In a particular embodiment, memory 42 is non-volatile solid-state memory.

[0095] The processor 41 implements any of the database paging query methods in the above embodiments by reading and executing computer program instructions stored in the memory 42.

[0096] In one example, the electronic device may also include a communication interface 43 and a bus 44. Wherein, as... Figure 4 As shown, the processor 41, memory 42, and communication interface 43 are connected through bus 44 and complete communication with each other.

[0097] Communication interface 43 is mainly used to realize communication between various modules, devices, units and / or equipment in the embodiments of this application.

[0098] Bus 44 includes hardware, software, or both, that couples the components of the electronic device together. For example, and not limitingly, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an Infinite Bandwidth Interconnect, a Low Pin Count (LPC) bus, a memory bus, a Microchannel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local (VLB) bus, or other suitable buses, or combinations of two or more of these. Where appropriate, bus 44 may include one or more buses. Although specific buses are described and illustrated in embodiments of this application, this application contemplates any suitable bus or interconnect.

[0099] Furthermore, in conjunction with the database pagination query method in the above embodiments, this application embodiment can provide a computer storage medium for implementation. This computer storage medium stores computer program instructions; when these computer program instructions are executed by a processor, they implement any of the database pagination query methods in the above embodiments.

[0100] In addition, in conjunction with the database pagination query method in the above embodiments, this application embodiment can provide a computer program product for implementation. When the instructions in the computer program product are executed by the processor of an electronic device, the electronic device executes the database pagination query method provided by any aspect of the above embodiments of this application.

[0101] It should be clarified that this application is not limited to the specific configurations and processes described above and shown in the figures. For the sake of brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method process of this application is not limited to the specific steps described and shown. Those skilled in the art can make various changes, modifications, and additions, or change the order of steps, after understanding the spirit of this application.

[0102] The functional blocks shown in the above-described structural diagram can be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they can be, for example, electronic circuits, application-specific integrated circuits (ASICs), appropriate firmware, plug-ins, function cards, etc. When implemented in software, the elements of this application are programs or code segments used to perform the required tasks. Programs or code segments can be stored on a machine-readable medium or transmitted over a transmission medium or communication link via data signals carried on a carrier wave. "Machine-readable medium" can include any medium capable of storing or transmitting information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio frequency (RF) links, etc. Code segments can be downloaded via computer networks such as the Internet, intranets, etc.

[0103] It should also be noted that the exemplary embodiments mentioned in this application describe methods or systems based on a series of steps or apparatus. However, this application is not limited to the order of the above steps; that is, the steps can be performed in the order mentioned in the embodiments, or in a different order, or several steps can be performed simultaneously.

[0104] The aspects of this disclosure have been described above with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block in the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that these instructions, executable via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions / actions specified in one or more blocks of the flowchart illustrations and / or block diagrams. Such a processor can be, but is not limited to, a general-purpose processor, a special-purpose processor, a special application processor, or a field-programmable logic circuit. It is also understood that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can also be implemented by special-purpose hardware performing the specified functions or actions, or can be implemented by a combination of special-purpose hardware and computer instructions.

[0105] The above description is merely a specific implementation of this application. Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, modules, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here. It should be understood that the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and these modifications or substitutions should all be covered within the protection scope of this application.

Claims

1. A method for database paging query, characterized in that, The method comprises the following steps: receiving a paging query request for a target data set, the paging query request comprising a query condition, a data amount per page, and a request page number, the query condition comprising a sorting field and a sorting direction; obtaining a list of row identifiers of the target data set corresponding to the query condition, the list of row identifiers comprising row identifiers of all data records in the target data set, and the row identifiers being arranged in an order corresponding to the query condition; according to the request page number and the data amount per page, cutting out a target subset of row identifiers corresponding to the request page number from the list of row identifiers; constructing a data query statement according to the target subset of row identifiers; based on the data query statement, obtaining complete data records corresponding to each target row identifier in the target subset of row identifiers from a target database storing the target data set.

2. The database paging query method of claim 1, wherein, Before the step of receiving the paging query request for the target data set, the method further comprises the following steps: dividing the target data set into a plurality of continuous and mutually non-overlapping data shards; for each data shard, obtaining an intermediate result of each data shard sorted according to the query condition from the target database, the intermediate result comprising row identifiers of data records and sorting field values; performing multi-way merge sorting on all intermediate results according to the sorting field values and the query condition to obtain a list of row identifiers corresponding to the query condition.

3. The database paging query method of claim 2, wherein, After the step of performing multi-way merge sorting on all intermediate results according to the sorting field values and the query condition to obtain a list of row identifiers corresponding to the query condition, the method further comprises the following steps: storing the query condition and the list of row identifiers corresponding thereto in a memory cache; the step of obtaining a list of row identifiers corresponding to the query condition comprises: obtaining the list of row identifiers corresponding to the query condition from the memory cache.

4. The database paging query method of claim 2, wherein, The step of dividing the target data set into a plurality of continuous and mutually non-overlapping data shards comprises the following steps: obtaining a total number of records of the target data set; determining a number of shards according to the total number of records, the number of shards being positively correlated with the total number of records; dividing the target data set into a plurality of data shards according to the number of shards and row identifiers of all data records in the target data set, each data shard comprising data records in the target data set whose row identifiers are in a continuous value domain interval, and all data shards comprising data records whose row identifiers are continuous and non-overlapping.

5. The database paging query method of claim 2, wherein, The step of obtaining an intermediate result of each data shard sorted according to the query condition from the target database comprises the following steps: generating a corresponding query task for each data shard according to the query condition and a data range of each data shard in the target data set; creating an independent database connection for each query task according to a database file path of the target data set stored in the target database and an identifier of a current execution thread of each query task; concurrently executing a plurality of query tasks through each database connection to obtain an intermediate result of each data shard sorted according to the query condition.

6. The database paging query method of claim 5, wherein, The method further comprises: For each query task, after obtaining the intermediate result of the corresponding data shard sorted according to the query condition, closing and logging out the database connection corresponding to the query task.

7. The database paging query method according to any one of claims 2-6, wherein, The method further comprises: sequentially reversing the row identifiers in the row identifier list to generate a reversed row identifier list; constructing a reversed query condition corresponding to the query condition, the sorting direction of the reversed query condition being opposite to that of the query condition; determining the reversed row identifier list as the row identifier list corresponding to the reversed query condition.

8. An electronic device, comprising: The device comprises a processor and a memory storing computer program instructions; The processor executes the computer program instructions stored in the memory to implement the database pagination query method according to any one of claims 1-7.

9. A computer-readable storage medium, characterized in that, The computer readable storage medium stores computer program instructions, and the computer program instructions are executed by the processor to implement the database pagination query method according to any one of claims 1-7.

10. A computer program product, characterised in that, The instructions in the computer program product are executed by the processor of the electronic device to enable the electronic device to perform the database pagination query method according to any one of claims 1-7.