Serverless data lake indexing subsystem and application programming interface
By building a multi-engine interoperable indexing system on a data lake, the problem of different query engines being unable to share indexes is solved, resulting in improved query performance and reduced costs. This system is suitable for index management of various data formats and user types.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- MICROSOFT TECHNOLOGY LICENSING LLC
- Filing Date
- 2021-03-16
- Publication Date
- 2026-06-23
Smart Images

Figure CN115552390B_ABST
Abstract
Description
Background Technology
[0001] The volume of raw data in all forms generated by business organizations, scientific researchers, and others can be enormous, ranging from hundreds of gigabytes. Modern systems often collect and generate data many times faster than they can effectively classify and manage such data. In this context, data lakes are increasingly being adopted. A data lake is a data storage platform configured to store such a large amount of raw data, whether structured or unstructured, locally. The sheer size of data lakes, and the often unstructured nature of the data, can make it difficult to effectively utilize all the information gathered from them.
[0002] On the other hand, "data warehouses" typically store structured or processed data that is more easily manipulated for various business intelligence or research needs. However, data warehouses do not have a similar scale to data lakes, so the information that can be retrieved may be more limited.
[0003] However, the overall trend is towards the convergence of data lakes and data warehouses. For example, data warehouse products are increasingly offering enhanced capabilities in terms of data diversity and scale, approaching the capabilities of data lakes. Data lake products are demonstrating support for efficient, updatable, and versioned relational data within data lakes, featuring change tracking and highly competitive ultra-large-scale relational query capabilities. Similarly, data lake products are increasingly supporting relational toolchains for reporting, data harmonization, security, sharing, compliance, and governance.
[0004] The need for index support on data lake systems has historically been significant for traditional data warehouse systems. While there are various ways to improve query performance in database systems, indexes are particularly effective at providing substantial speedups for certain workloads because they can reduce the amount of data retrieved for a given query. However, providing indexing solutions in the context of distributed database systems and / or cloud-based architectures presents several challenges. For example, a key driver of adopting a cloud-based model is the flexibility to store and subsequently query data using any query engine. Unfortunately, different query engines often cannot use common indexes. Summary of the Invention
[0005] This summary is provided to introduce selected concepts in a simplified form, which will be further described in the detailed embodiments below. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter.
[0006] This paper describes a query engine and system that optimizes multi-engine data workloads via index specifications and APIs consuming different query engines to create, discover, and use indexes that conform to the index specifications and are stored in discoverable locations on a data lake. In one example aspect, a system is configured to accept workloads in the form of multiple data queries, extract indexable columns from the queries, generate candidate indexes from the indexable columns, select and then build the best candidate index, wherein the built indexes conform to the index specifications and are stored in predetermined locations on the data lake. In an additional aspect, the query engine in the system can receive queries, generate a query plan for those queries, which is configured to use one or more of the built candidate indexes, and execute the query plan to generate query results.
[0007] In another example, building an index includes index metadata that describes the content and lineage of each corresponding built index and reflects the index's state. On the other hand, building an index lineage includes query plan information that corresponds to the query used to create the built index.
[0008] On the other hand, the system includes a second query engine configured to search for built indexes stored in a predetermined location in the data lake, receive queries, generate query plans for the queries, determine whether the corresponding index can be used to optimize the query plan based on the index metadata for each built index, and if so, optimize the query plan to use the corresponding built index, and execute the optimized query plan to provide query results.
[0009] Other features and advantages, as well as the structure and operation of various examples, are described in detail below with reference to the accompanying drawings. It should be noted that these ideas and techniques are not limited to the specific examples described herein. These examples are presented herein for illustrative purposes only. Based on the teachings contained herein, the additional examples will be clear to those skilled in the art. Attached Figure Description
[0010] The accompanying drawings, which are incorporated herein and form part of the specification, illustrate embodiments of the present application and, together with the specification, further serve to explain the principles of the embodiments and enable those skilled in the art to make and use the embodiments.
[0011] Figure 1 An example data lake system according to an embodiment is shown.
[0012] Figure 2 A hierarchical view of a data lake indexing and querying system according to an embodiment is shown.
[0013] Figure 3 An exemplary hierarchical organization diagram of index metadata for a data lake, according to an embodiment, is shown.
[0014] Figure 4 An example data lake indexing and query system application programming interface (API) according to an embodiment is shown.
[0015] Figure 5A An example index metadata specification according to an embodiment is shown.
[0016] Figure 5B The following is illustrated according to an embodiment: Figure 5A Example instances of covering indexes described in the example index metadata specification.
[0017] Figure 6 An example index state machine supporting serverless stateful indexing operations is shown according to an embodiment.
[0018] Figure 7 An example log operation for supporting concurrency control of serverless multi-user indexing operations, according to an embodiment, is shown.
[0019] Figure 8 An example timeline according to an embodiment is shown, illustrating the use of Figure 7 The example log operations are used to manage concurrent index operations for two users.
[0020] Figure 9 An example application of filtering index rules to an SQL query is shown according to an embodiment.
[0021] Figure 10 An example application of the join index rule to an SQL query is shown according to an embodiment.
[0022] Figure 11 An example workload according to an embodiment is shown, along with recommended example steps for generating an index for that workload.
[0023] Figure 12A An architecture overview of cost-based index tuning using the "what if" utility is shown according to an embodiment.
[0024] Figure 12B An example application of the "what if" utility to an SQL query is shown according to an embodiment.
[0025] Figure 13 A detailed schematic diagram of a workload optimization system according to an embodiment is shown.
[0026] Figure 14 A flowchart is shown of a method for workload optimization performed at a first query engine in a distributed query processing system according to an embodiment.
[0027] Figure 15 The example illustrates the use of a method according to an embodiment. Figure 14 The flowchart is a multi-engine refinement flowchart, in which the second query engine optimizes the query plan to utilize pre-existing indexes found at predetermined locations.
[0028] Figure 16 A schematic diagram of an example workload optimization system according to an embodiment is shown, illustrating various aspects of multi-engine interoperability.
[0029] Figure 17 This is a block diagram of an example computer system in which embodiments can be implemented.
[0030] The features and advantages of the embodiments will become more apparent from the detailed description set forth below, taken in conjunction with the accompanying drawings, in which similar reference numerals identify corresponding elements throughout the drawings. In the drawings, similar reference numerals generally denote identical, functionally similar, and / or structurally similar elements. The first appearance of an element in the drawings is identified by the leftmost numeral(s) in the corresponding reference numeral(s). Detailed Implementation
[0031] I. Introduction
[0032] This specification and accompanying drawings disclose one or more embodiments incorporating features of the invention. The scope of the invention is not limited to the disclosed embodiments. The disclosed embodiments are merely illustrative, and modifications of the disclosed embodiments are also included in the invention. The embodiments of the invention are defined by the appended claims.
[0033] In this specification, references to "an embodiment," "embodiment," "example embodiment," etc., indicate that the described embodiment may include a specific feature, structure, or characteristic, but each embodiment does not necessarily include that specific feature, structure, or characteristic. Furthermore, these phrases do not necessarily refer to the same embodiment. Additionally, when a specific feature, structure, or characteristic is described in connection with an embodiment, it is proposed that, whether explicitly described or not, implementing such a feature, structure, or characteristic in conjunction with other embodiments is within the knowledge of those skilled in the art.
[0034] In this discussion, unless otherwise stated, adjectives describing the conditional or relational characteristics of one or more features of an embodiment of the invention, such as “substantially” and “approximately”, are understood to mean that the condition or characteristic is defined within an acceptable tolerance for the operation of that embodiment for its intended application.
[0035] Several exemplary embodiments are described below. It should be noted that any section / headings provided herein are not intended to be limiting. Embodiments are described throughout this document, and embodiments of any type may be included under any section / subsection. Furthermore, embodiments disclosed in any section / subsection may be combined in any way with any other embodiments described in the same section / subsection and / or different sections / subsections.
[0036] II. Example Implementation
[0037] The embodiments disclosed herein bring multi-engine interoperability to data lakes and can include guided, semi-guided, or automatic index selection to allow users to optimize their workloads. Furthermore, to reduce operating costs and further enhance multi-engine interoperability while also enabling multi-user concurrency, the embodiments disclosed herein implement a "serverless" index management strategy. In this section, we present as follows... Figure 1 An overview of the operating environment of the disclosed embodiments shown.
[0038] Figure 1 An example data lake system 100 according to an embodiment is shown. The data lake system 100 is shown to include general-purpose components dedicated to various high-level functions. The data lake system 100 includes a data collector 112, a data digester 114, a data modeler, and a server 116. The data lake system 100 also includes a data lake 110 that interfaces with the data collector 112, the data digester 114, the data modeler, and the server 116, thereby enabling data to be written to or read from the data lake 110.
[0039] Conceptually, data collector 112 is configured to accept structured or unstructured data and store this data in data lake 110. Such data may include, for example, log files 102 (unstructured), media 104 (unstructured), files 106 (unstructured), and / or business applications 108 including any underlying schema (structured). Note that such data categories are merely exemplary. Data lakes such as data lake 110 are typically configured to accept and store any type of data in its native format. Through data collector 112, data lake 110 can be populated with massive amounts of data, hundreds of gigabytes or even more.
[0040] Effective utilization of such a large volume of data can be achieved through the combined operation of data decomposer 114, data modeler, and server 116. In an embodiment, data decomposer 114 is configured to process unstructured data on data lake 110 to provide structured or semi-structured and curated views of at least some of the data contained therein. The data modeler and server 116 can then use these views for various purposes, including generating business intelligence 118 or other useful outputs. Essentially, the data modeler and server 116 can be configured to operate in a manner similar to a traditional data warehouse, but across the entire data lake as processed by data decomposer 114.
[0041] The embodiments described herein can be implemented in various ways. For example, embodiments can be implemented within / on top of a data modeler and server 116 to provide a data lake indexing and querying system. However, other structural and operational embodiments will be clear to those skilled in the art.
[0042] Further embodiments and their advantages are described below with reference to the example API. Additionally, auxiliary data structures provided by the embodiments are described. The workload optimization embodiments disclosed herein can be implemented and operated using multiple different query engines. However, in the context of this disclosure, embodiments are described in conjunction with Apache Spark. However, those skilled in the art will understand that Apache Spark is merely an example query engine and not a necessary component of any embodiment. Other types of query engines may exist.
[0043] The embodiments of the indexing system described herein may include one or more of the following advantages:
[0044] 1. Data format agnostic. To support the most diverse scenarios, the indexing subsystem should be able to index data stored in the lake in any format, including text data (e.g., CSV, JSON, Parquet, ORC, Avro, etc.) and binary data (e.g., video, audio, images, etc.). Furthermore, the data is considered... External Management That is, it does not assume control over the dataset throughout its lifecycle.
[0045] 2. Low-cost index metadata management. To avoid burdening the query optimizer and end users, index metadata should be lightweight, fast to retrieve, and independent of third-party catalog operations. In other words, the index subsystem should rely solely on the data lake for operation and should not assume the existence of any other services for proper functioning.
[0046] 3: Multi-engine interoperability. The indexing subsystem should facilitate the integration of third-party engines. To achieve this, the examples demonstrate (a) index state management and (b) index metadata in a manner that is as transparent as possible.
[0047] 4. Simple and guided user experience. The indexing subsystem should support a wide variety of users, including data scientists, data engineers, and data enthusiasts. Therefore, it should provide the simplest possible experience.
[0048] 5: Scalable Indexes. Since providing all possible auxiliary data structures to help speed up queries is usually impractical, our indexing subsystem should provide a pluggable mechanism for updating auxiliary data structures (related to the index).
[0049] 6: Security, privacy, and compliance. Since auxiliary structures (such as indexes, views, and statistics) partially or completely replicate the original dataset, the indexing subsystem should meet the necessary security, privacy, and compliance standards.
[0050] These advantages, particularly multi-engine interoperability, are enhanced by rethinking indexes as “derived datasets.” While indexes are traditionally built and maintained as auxiliary data structures within a database management system (“DBMS”), in a data lake, where a single “database system” does not exist, the embodiments described herein treat indexes as… group raw data The form of an index is that the data is derived from one or more datasets and can optionally be used by any query optimizer to improve the speed of data retrieval. There are likely only a few basic assumptions to consider an index as a derived dataset: (1) such derived datasets support basic lifecycle operations such as creation, deletion, (full or incremental) rebuilding and recovery, and (2) they can be used for query acceleration (specifically, they can be easily utilized and / or integrated into the query optimizer and execution runtime). Therefore, the implementation actually supports any type of index, including, for example, covering indexes, region mappings, materialized views, indexed views (i.e., indexes on materialized views), statistical and block elimination indexes. Therefore, when the term “index” is used below, each of the examples listed above is appropriately considered, but “index” can also include any other type of derived dataset that satisfies the above assumptions. However, for the sake of context and completeness, several examples of derived datasets are described below.
[0051] Covering indexes. Covering indexes are very effective when certain select and filter columns frequently co-occur in a query. They have the following properties:
[0052] 1. Non-clustered – Indexes and data are separate.
[0053] 2. Covering -- An index contains the key column (i.e., the term "key column" as used below). Index column ") and the data / payload column (i.e., the term "" also used below) Included columns These data / payload columns are copied from the original data (for the "index only" query access path).
[0054] 3. Columnar – The index is stored in a columnar format (e.g., Parquet) instead of a row-oriented format such as a B-tree. This allows for faster index scans by leveraging vectorization and min-max pruning techniques.
[0055] Query performance can be significantly improved by including all columns in a covering index, whether as key or non-key columns. Adding physical layout attributes (such as chunking, partitioning, and sort order) can speed up the main filtering and union operations, which typically dominate query execution time. In this example, all columns that the user marks as “index columns” can be chunked and (optionally) sorted.
[0056] Block-based index elimination. For highly selective queries (e.g., searching for a single GUID among billions), a method known as "block elimination index" can be advantageously employed. Block Elimination Index The index category is "block elimination index". Block elimination indexes are similar to traditional inverted indexes, except that pointers are references. piece For any URI (not row_id), a block is a reasonable unit of addressable data stored in the data lake (e.g., a range of offsets within a single Parquet file or a large CSV file). The optimizer can leverage this index to quickly reduce irrelevant blocks in queries.
[0057] Materialized views. For expensive queries with unions or aggregations, materialized views can be created as derived datasets. The underlying query optimizer can then use these materialized views transparently.
[0058] Statistical data. In an environment with a cost-based query optimizer, an implementation may collect statistical data (e.g., histograms) of columns of interest a priori. A capable optimizer can then utilize this statistical data at runtime to optimize resources.
[0059] Benefiting from the advantages and background described above, a more detailed description now turns to an architectural overview of the data lake indexing and querying system according to an embodiment. More specifically, Figure 2 An example hierarchical view 200 of a data lake indexing and querying system according to an embodiment is shown. For example... Figure 2As shown, the data lake indexing and querying system 200 includes a data lake 202, index infrastructure 208, query infrastructure 224, and a set of user-facing index management APIs 226. The index infrastructure 200 provides and includes a log management API 214, an index specification 212, a concurrency model 216, and a set of index creation and maintenance APIs 210 built upon these. The query infrastructure 224 of the data lake indexing and querying system 200 includes a set of optimizer extensions 222, an index recommendation system 220, and a "what-if" utility 218. Based on... Figure 2 The following discussion of the data lake indexing and query system 200, other structural and operational embodiments will be clear to those skilled in the art.
[0060] In this embodiment, the user can utilize the index infrastructure 208 (as...) Serve or Library The index infrastructure 208 can be used to create and maintain indexes (or “derived datasets”) on its data via the index creation and maintenance API (described further below). For example, the index infrastructure 208 can be implemented as one or more extensions to the sparkSession object of Apache Spark, where users can use a suitable data access client (e.g., spark-shell) to create non-clustered column-covered indexes, specify the columns to be indexed, and the columns to be included as data columns, i.e., using a query like: CREATE INDEX myCoveringIndex ON dirLocation1 INDEX(a, b) INCLUDE (c). Note that the implementation does not require a separate “index service” because the index infrastructure 208 can, in principle, utilize any available query engine (e.g., Spark) for index building. As described in more detail below, the indexes and their metadata are stored on the data lake itself, so users can parallelize index scans to the scale of their query engine and to the extent that their environment / business allows.
[0061] In this embodiment, the index metadata maintenance is handled by... Index Manager ( Figure 2 (not shown in the image) to manage, Index Manager This is controlled through the Index Creation and Maintenance API 210. The Index Manager is responsible for creating, updating, and deleting index metadata as index data is modified, thus managing the consistency between index data and index metadata. The Index Manager also provides utility functions for reading index metadata from its serialized format. For example, the query optimizer can read all index metadata and then find the best index for a given query.
[0062] The implementation may also enable primitive components that form the basis of the index creation and maintenance API 210. For example, such primitive components may include any or all of the log management API 214, index specification 212, or concurrency model 216.
[0063] As described above and in more detail below, support for multi-engine interoperability has spurred the need to store all indexes and their metadata on the lake. To track the genealogy of operations occurring on the indexes, the implementation logs user actions in... Operation Log This is described in more detail below and can be done via the Log Management API 214.
[0064] Index specification 212 supports the aforementioned scalability advantages because the embodiments involve index specification 212 that reflects the properties of the corresponding underlying index (or derived dataset). These advantages are demonstrated via index creation and maintenance API 210, and those who wish to extend the system to include other types of indexes / derived datasets must implement support for these APIs.
[0065] Finally, concurrency model 216 demonstrates primitives for using optimistic concurrency control to support multi-user and incremental maintenance scenarios (as described further below).
[0066] The discussion now turns to another major layer on top of the data lake: the query infrastructure 224. Without loss of generality, the components of the query infrastructure 224 are described in this paper as implemented as a Scala version of the Apache Spark query optimizer (aka, Catalyst), extending it to be aware of indexes. That is, given a query and existing indexes, the implementation using Spark can execute... Transparent query rewriting This allows the use of existing indexes. To enable optimizer extensions on the user side, `sparkSession.enableIndexingSubSystem()` needs to be executed after creating a Spark session. Because the implementation treats the index as another dataset on the lake, users can leverage Spark's distributed nature to automatically scale index scans. Although the implementation has been described above and below in terms of Spark and Scala, it should be understood that other implementations can use programming languages other than Scala and query engines other than Spark.
[0067] While the embodiments described herein introduce the concept of indexing on a data lake, a crucial aspect of big data management that significantly impacts performance is the ability to select which indexes to build for a given query or workload. To determine the correct index for a workload, it is essential for users to be able to perform cost-benefit analyses on existing indexes and any "hypothetical" indexes in their minds. Therefore, query infrastructure 224 includes a "what if" utility 218 that allows users to quantitatively analyze the impact of existing or hypothetical indexes on system performance. Furthermore, query infrastructure 224 includes an index recommendation module 220, which demonstrates automatic index recommendations for automatically selecting indexes in query acceleration for big data workloads. This tool takes the SQL query workload as input and suggests a suitable set of indexes. The implementation details of index recommendation module 220 and the "what if" utility 218 are described in more detail below.
[0068] As described above, the embodiment stores all index data and metadata on the data lake itself without any external dependencies. Figure 3 A sample hierarchical organization diagram 300 of index metadata on a data lake is shown according to an embodiment. Based on, as Figure 3 The following discussion of the hierarchical organization diagram 300, other structural and operational embodiments will be clear to those skilled in the art(s) related to the subject.
[0069] In one embodiment, such as Figure 3 As shown, all indexes can be stored at file system root 302. Depending on the indexes used, this arrangement can be advantageous. For example, an index may include materialized views that themselves span the dataset, which in turn requires decoupling the dataset from the index. However, in another embodiment, the index may be co-located with the dataset. Embodiments can implement fine-grained access control mechanisms, such as copying the strictest access control lists (ACLs) from the dataset used as the index for stricter security and compliance. Because we allow the concept of disabling public indexes, users are free to provide hints to the optimizer using "index sets," thereby allowing A / B testing.
[0070] However, it should be understood that the use of file system root 302 is merely exemplary, and another default index location can be specified, for example, in index specification 212 as described above. More specifically, multi-engine interoperability is achieved in part through index specification 212, which enables canonical query engines and other clients that know and comply with the canonical specifications outlined in index specification 212 to know in advance the default location of index storage, thereby allowing these engines and clients to discover the availability of indexes that may already exist and subsequently build query plans that incorporate those indexes.
[0071] Continue to refer to Figure 3 / indexes / / <index name> Each of the indexes listed below has two components:
[0072] 1) Directory 308 named _indexing_subsystem_log contains the index operation log, which is a list of all operations that have occurred since the index was created; and
[0073] 2) The actual content of the index is 322.
[0074] Please note that content is captured across multiple directories. This is to support features such as concurrent index management (e.g., snapshot isolation) and incremental maintenance (e.g., the latest index is a union of the contents of multiple directories).
[0075] Figure 4 The example data lake indexing and querying system application programming interface (API) 400 is shown in the context of Apache Spark, as illustrated by the embodiments. Please note that this list is merely exemplary and should not be construed as a requirement for every engine or every embodiment. Based on... Figure 4 The following discussion of API 400, other structural and operational embodiments will be clear to those skilled in the art(s) related to it.
[0076] API 400 includes index maintenance APIs in lines 2 through 8, which include APIs corresponding to actions such as create, delete, restore, clear, rebuild (sometimes referred to as "refresh," especially when rebuilding is incremental), and cancel. In an embodiment, the deleteIndex API in line 4 corresponds to a "soft delete," which tells the optimizer not to consider the index during optimization. The actual index referenced by the API call is not permanently deleted, thus allowing the user to restore the deleted index using the restoreIndex API, as shown in line 5. Alternatively, the user can permanently delete an index that is in a soft delete state using the vacuumIndex API, as shown in line 6. The rebuildIndex API in line 7 implements the aforementioned rebuild / refresh operations. As shown in line 8, the user can cancel an ongoing index maintenance operation using the cancelIndex API, which can be useful if the user suspects that the maintenance job has stalled or failed.
[0077] API 400 also includes utility APIs for debugging and recommendation, as shown in lines 11 through 15. These APIs are called Explain, WhatIf, and Recommend, as shown in lines 11, 12, and 14, respectively. The Explain API allows users to obtain various useful information from the optimizer, such as which parts of the plan were modified, which indexes were selected, why they were selected, and so on. The WhatIf API allows users to provide the indexing subsystem with sample index configurations and understand how useful it would be to build the indexes. The Recommendation API allows users to obtain ranking recommendations for building indexes / views for their chosen workloads.
[0078] API 400 also includes storage and query optimizer customization configuration settings, as shown in lines 18–21. These settings allow users to override the behavior of the query optimizer and index management. For example, by default, each created index is discoverable and stored in a public folder or file system root directory, as described above, making it accessible to all users at the workspace level. If this is unacceptable, and because the indexes are only accessible to the user who created them, the user can choose a private index location and namespace, then create their private indexes and provide hints to the optimizer during optimization (e.g., by setting configuration variables).
[0079] indexing_subsystem.index.creation.[path|namespace]
[0080] and / or
[0081] indexing_subsystem.index.search.disablePublicIndexes.
[0082] Having described the various APIs available in the embodiments, the discussion now turns to serverless index management implemented by the disclosed embodiments. As mentioned above, one advantage is the low-cost multi-engine index subsystem, which allows concurrent index maintenance operations to be performed on indexes that are callable by multiple engines. Although the embodiments could be implemented with a server to mediate such operations, other embodiments described herein can simplify implementation by making index management “serverless,” i.e., the embodiments do not require a separate server dedicated to index management tasks. Serverless functionality is achieved in part by storing all index information (e.g., metadata, operations on the indexes) in a data lake, and by having the index track its own state through index operation logs incorporated into the index and through additional updates to its own metadata. Despite being serverless, the embodiments achieve concurrent updates through optimistic concurrency control (as described in more detail below). Further description of the embodiments now turns to a further description of one aspect of these aspects: index metadata on the lake.
[0083] Interoperability is complex because each query engine must agree on what constitutes an index, which may require consensus among developers (and organizations / companies) working in different siloed ecosystems. Because the latter is far more difficult in reality, the embodiments described in this paper prioritize low-friction configurations for displaying index-related metadata (e.g., content, state, etc.) in a way that allows for easy integration. Displaying the state of an index or a list of operations invoked on an index through traditional methods (such as directory services or transaction manager services) ensures strong consistency. However, this approach has several major operational impacts. First, it introduces service dependencies and on-site support management costs. Second, it complicates integration because each new engine must now depend on a third-party service. Finally, it introduces the operational cost of running that service.
[0084] Considering these drawbacks, the embodiments described herein balance metadata consistency for ease of operation and maintenance, meaning the baseline truth of the indexed information is stored on a data lake. There are several ways to specify the index information that needs to be stored. For example, Figure 5A An example index metadata specification 500 according to an embodiment is shown. The index metadata specification 500 includes three parts: content 504, lineage 506, and status 508. Based on... Figure 5A The following discussion of the Index Metadata Specification 500, other structural and operational embodiments will be clear to those skilled in the art(s) related to it.
[0085] Content 504 may include type and type-specific information about the derived dataset that is useful when instantiating appropriate index interpretation logic, such as name, type, configuration (e.g., indexes and included columns and their types), and content (e.g., physical location and layout).
[0086] Lineage 506 may include information for tracking the lineage of derived datasets, such as the indexed HDFS data source, information required to refresh the index with minimal information from users, information required to perform index / view selection, and a descriptive history of the index. Lineage may also include information about any additional transformations applied to one or more data sources during index construction (e.g., filters applied before indexing, such as WHERE Col1 IN (“user1”, “user2”) or similar filters).
[0087] State 508 may include state information about the derived dataset, such as global information like activity and disabling, and transient information like creation and deletion.
[0088] Figure 5A This also includes Operation Log 510. Operation Log 510 will be described in more detail below in conjunction with the description of multi-user concurrency control.
[0089] Figure 5B The following is illustrated according to the embodiment. Figure 5A The example instance of the covering index 502 described in the example index metadata specification 500 is shown below. Those skilled in the art will easily understand most of the metadata for the covering index 502. However, it is important to first note that all metadata is stored in an easy-to-read JSON format, which reduces reliance on query engine integration and provides support for specification evolution through versioning fields. Also note the plan node 514 under the source node 512. The plan node 514 also includes a rawPlan node 516 as an attribute of the plan node 514. The rawPlan node 516 contains the raw query plan information for the covering index 502. For example, in the case of Spark, the rawPlan node 516 is a serialized representation of the logical plan.
[0090] Including raw query plan information in rawPlan node 516 of covering index 502 offers several advantages. First, the raw query plan enables support for transparent index refresh (e.g., by calling the above-mentioned...). Figure 4The described `rebuild()` API allows the query engine to decide whether to utilize the index during optimization, without requiring the user to provide the original query used when creating the index. Secondly, it allows the query engine to decide whether to use the index during optimization. For example, and recalling embodiments that include multi-engine support, the query engine can examine the original query plan of the index and find that the index was created using an unsupported hash function, thus ignoring the index from any optimization plan. Thirdly, including the original query plan is useful for debugging.
[0091] The covering index 502 also includes a state node 518, which, as described above, tracks the state of the index, thereby enabling the embodiment to be serverless. In a serverless paradigm, there are various ways to manage the index state. For example, Figure 6 An example index state machine 600 supporting serverless stateful indexing operations according to an embodiment is shown. State machine 600 includes the following transient states: Create 602, Restore 604, Delete 606, Refresh 608, and Optimize 610. State machine 600 also includes the following stable states: Active 612, Deleted 614, and Empty / DNE 616 (where 'DNE' = 'Does not exist'). Based on... Figure 6 The following discussion of the index state machine 600, other structural and operational embodiments will be clear to those skilled in the art.
[0092] Because the embodiments are implemented in a serverless paradigm, there is naturally no server to maintain or track the index state. Therefore, the embodiments are implemented according to... Figure 6 The index state machine 600 illustrates state transitions to manage the index state. Note that states Active 612, Deleted 614, and Empty / DNE 616 are stable states, while the other states are transition states. These state transitions are described below:
[0093] Create 602 Assuming no index exists, the state machine begins in state null / DNE 616. When the user combines the above... Figure 4 When the createIndex() API is called as described, the created index moves from state null / DNE 616 to state creation 602. If, for some reason, the user cancels index creation by issuing the cancelIndex() API while the index is still in state creation 602, the index returns to state null / DNE 616.
[0094] Activity 612 Once the index is successfully created, it transitions to state activity 612 and becomes visible (it is invisible when in state creation 602). Indexes typically spend most of their time in state activity 612.
[0095] Refresh 608 Existing indexes can be refreshed / rebuilt via the rebuildIndex() API described above. Although the terms refresh and rebuild are largely used interchangeably in this article, the term "refresh" is generally used for incremental rebuilds. Note that refreshing does not prevent index visibility—consumers of the index can continue to access the currently active copy of the index until the refresh is complete.
[0096] Delete 606 Users can use the deleteIndex() API described above to delete an index. During the deletion operation, the index enters a delete status 606. As mentioned above, the deletion operation is only a soft delete (for speed) and has the effect of making the index invisible / unusable.
[0097] 614 have been deleted After the deleteIndex() call is completed, the index enters the deleted state 614.
[0098] Restore 604 Because deletion is a soft delete, the `restoreIndex()` command can restore the index. At this point, the index enters the recovery state (604), and upon completion, it enters the active state (612). Similarly, the index is not visible while in the recovery state.
[0099] Optimize 610 Users can further optimize the index via the optimizeIndex() API. For example, one optimization is... Index compression In this process, incrementally generated (small) index blocks are merged into larger index blocks to improve index read efficiency.
[0100] In multi-user scenarios, some index states obviously conflict with each other (i.e., an index cannot simultaneously represent certain states of different users). For example, if an index is in state Delete 606, Refresh 608, or Optimize 610 in one user session, it cannot be in state Restore in another concurrent user session. This is understandable because an index can only move from active 612 to delete 606, refresh 608, or optimize 610, and it can only move from delete 606 to restore 604. If two API calls could cause index state conflicts, they are incompatible. Table 1 shows the APIs disclosed in this paper. Compatibility Matrix It also shows that an API call by one user (e.g., when reading) is incompatible with an API call by a second user (reading downwards), where C=create, D=delete, O=optimize, RF=refresh, RS=restore, and V=clear.
[0101]
[0102] Table 1: Compatibility Matrix of Index Management API
[0103] While Table 1 prevents the index from reaching an incompatible state in two different user sessions, it does not prevent two different users from attempting conflicting changes to the index. To address this issue, the implementation uses optimistic concurrency control to ensure index consistency. As described above, the implementation utilizes... Figure 7 The example logging operation 700 illustrates an optimistic concurrency control scheme. Example logging operation 700 includes LogOp() 702, Validate() 704, Begin() 706, RunOp() 708, Commit() 710, and commit protocol 712. Such operations use... Figure 5A The operation was executed using operation log 510, and is described in this article as follows:
[0104] • LogOp() - Records index operations that the user will attempt to perform on log 510.
[0105] • Validate() - Verifies whether an index is in a suitable state to allow the required operation (e.g., an index that does not exist cannot be deleted).
[0106] • Begin() - Assigns the id to the index operation with the corresponding transformation state.
[0107] • RunOp() - Logs the currently running operation to the operation log. 510
[0108] • Commit() - Records the ID of the completed index operation in the operation log (510), including the corresponding final stable state.
[0109] Commit() relies on renaming files in a cloud file system (such as HDFS, Azure Storage, or Azure Data Lake). atomicity This ensures that changing the index state from a transition state to a stable state during Commit() is atomic. For example, if a file corresponding to an index transition state is renamed during commit, the commit and transaction as a whole can be aborted (as described in Commit Protocol 712). After receiving the abort message, the transaction can be retried later.
[0110] Figure 8 The timeline 800 shows the use of Figure 7This example log operation is used to manage a sample transaction of concurrent index operations by two users. Timeline 800 progresses from left to right, where the index is initially created at step 802 and data is added at step 804. At step 806, user (user 1) attempts to refresh the existing index. Almost simultaneously, at step 808, another user (user 2) attempts to delete the same index. Assuming deletion is expensive (i.e., time-consuming), the refresh from user 1 successfully completes first at step 810. When user 2's delete attempt commits, the log reflects the fact that the index state has changed (due to user 1's refresh), and the deletion fails at step 812.
[0111] The implementation is designed to allow the use of the above-described concurrency control mechanism. Multiple writers and Multiple readers For index readers, any stable snapshot of the committed index data is usually sufficient. To ensure consistency between the index and the corresponding indexed data, an implementation may employ a signature-based mechanism, thereby using the latest timestamp of the data file to generate the index. sign (For example, such as) Figure 5A (Signature 520 shown). As described in more detail below, during query processing, a query may access both the table and its indexes simultaneously. Using the signature of the data stored in the index metadata, the optimizer can ensure that the index does not become outdated. In serverless architectures, perfect external consistency is often impossible to guarantee. For example, a query might be accessing data that has been updated since the data was validated using signature 520 of the index, and there is no locking of resources. Instead, external consistency must be supported at the query engine level—that is, external consistency can only be supported if the query engine can expose such an API.
[0112] The above description provides a framework for indexes and their corresponding lifecycle APIs. Of course, these indexes are useless unless they can be utilized at query time. Therefore, the query engine's query optimizer must be aware of the existence and format of the indexes and be able to process them correctly. As described in more detail below, an example leverages indexes by incorporating new rules into Spark's rule-based query optimizer. Specifically, filtering and composite index rules are defined and can be used to optimize the query plan for a given query to utilize available indexes. In the next section, we first discuss the impact of indexes on query execution. Then, we introduce the details of index rule implementation and integration. Integrating index rules into other types of query optimizers, such as those following Starburst, Volcano, or Cascades architectures, would be similar.
[0113] As described above, the embodiments described herein focus on the two main Spark operators in query processing, namely filtering and joining, merely to illustrate the concepts (i.e., other operators, such as aggregation and grouping, can also be optimized in a similar way). More specifically, the embodiments implement two query optimizer rules, namely FilterIndexRule and JoinIndexRule, which aim to accelerate the filtering and joining operators in the Spark query execution plan that uses indexes as described above. The embodiments benefit the performance of filtering operators by producing filtering indexes by eliminating data partitioning and by sharding the index data by index columns (i.e., partitioning using a hash function). Therefore, such indexes can significantly reduce the amount of data that filters need to access when using equality predicates (i.e., point lookups) that reference index columns.
[0114] Such an index can also benefit the performance of the join operator if the indexed column matches the join key and the optimizer chooses to use a shuffle-based join (e.g., a hash join or a sort-merge join). In this case, the shuffling phase of the join can be completely avoided due to the block partitioning of the index. It is well known that shuffling operations are expensive in distributed big data processing, and eliminating or minimizing such shuffling often provides significant performance advantages.
[0115] Although not described in this article, such indexes are also beneficial for other operators, such as aggregations on top of grouping. Below, we formally define FilterIndexRule and JoinIndexRule.
[0116] The FilterIndexRule works as follows: If there is a filter at the top of the table scan... f If the following conditions are met, we will replace it with an index. I :
[0117] • I The index (key) column Prelude Listed f In a certain Predicate reference;
[0118] • By f All columns referenced by the predicate in the text are I Coverage, that is, appearing I The indexed column or included column.
[0119] consider Figure 9The example shown illustrates an example application 900 of filtering index rules for an SQL query by a query optimizer according to an embodiment. Example application 900 assumes an input configuration 902, which includes a table 904, a query 906, and an index 908. Index 908 includes a single index F1, which is the filtering index. Index F1 includes index columns... Ra (That is, Ra is the key column of index F1, represented by the underscore) and the included column Rb. Upon receiving query 906, the query optimizer can generate the original query plan 912 in response. The index rules first search for pattern matches within the query plan.
[0120] Scan -> Filter.
[0121] In the original query plan 912 generated from query 906, there are two such matches:
[0122] (M1) Scan (R) -> Filter (Ra = 5);
[0123] (M2) Scan (S) -> Filter (Sd>200).
[0124] For each match, the indexing rule instructs the query optimizer to further check if there is an index that satisfies the conditions, and if so, replace the table scan with the corresponding index. In our example, only the match (M1) has such an index: index F1, which is defined as having an index column equal to Ra. As a result, the scan operator at the top of table R is replaced with the scan operator at the top of index F1, resulting in an optimized query plan 914.
[0125] JoinIndexRule works similarly, finding candidate indices through pattern matching. However, unlike FilterIndexRule, it's impossible to match a specific pattern except for matching only a single union operator. When a matching union operator is found, it's checked whether it satisfies... Equal joint conditions That is, restricted to equality predicates between cohorts. conjunction The combined conditions.
[0126] Join operators that meet the conditions O With joint conditions c After matching, the next step is to find targets O The available indexes. Given that both the left and right subplans of a union are linear, then in the plan tree... O There are only two basic tables. For each basic table... T ,exist T The top for each candidate index I Check the following conditions:
[0127] cQuoted T All joint columns should be with I The index columns are the same;
[0128] access T All other columns referenced by the left or right subplan are included in I The included columns.
[0129] More formally, assuming and These are candidate indexes found for the left and right child plans, respectively. Index pairing can be further performed by conducting the following compatibility tests:
[0130] Consider indexes and index .if index column and index column It is a pairwise association, meaning each column pair... Appears in conjunctive predicates In the middle, the pair of two indices It is compatible. Please note that... and They must have the same number of indexed columns.
[0131] The above compatibility test can be understood through examples:
[0132] Consider the following query :
[0133] SELECT A, D FROM T1, T2
[0134] WHERE T1.A = T2.B AND T1.C = T2.D
[0135] Suppose we have two indices, I on T1 1 (A,C);() and I on T2 2 (B,D);() . Then (I 1 , I 2 ) It is compatible. If it's not I2, we have index I′. 2 (D,B);() , then (I 1 , I′ 2 ) is incompatible .
[0136] There may be more than one compatible index pair. In one embodiment, the index pair that results in the lowest execution cost can be selected based on the following criteria:
[0137] • If a compatible index pair exists , making and If there are the same number of blocks, then choose the one with the most blocks. maximum A number of block index pairs.
[0138] Otherwise, if no such index pair exists, select any index pair from the matching index pairs.
[0139] There are several reasons for using these criteria. First, when two indexes have the same number of blocks, there is no shuffling during a (sort-merge) union. That is, if the number of blocks is different, one index is reshuffled to have the same number of blocks as the other. Second, in general, the more blocks there are, the better the parallelism of the union execution (assuming no resource constraints).
[0140] Finally, JoinIndexRule replaces the scan operator at the top of the table with the scan operator at the top of the corresponding index in the best-compatible index pair. For example, consider... Figure 10 This illustrates an example application 1000 of the JoinIndexRule for SQL queries according to an embodiment. Example application 1000 includes... Figure 9 The input configuration 1002 of configuration 902. Specifically, input configuration 1002 includes inputs from... Figure 9 Table 904 and query 906. Input configuration 1002 also includes based on Figure 9 It has index 908 and index 1008, but it now includes two composite indexes J1 and J2.
[0141] Then, the JoinIndexRule is applied according to the description above. Specifically, since there are now two composite indexes J1 and J2, query plan 912 can check for composite operators that meet the conditions. Here, query plan 912 includes composite operator 1010 with the composite condition Rb=Sc. Next, it is checked whether there are candidate indexes in the left and right subplans of the composite. We find that index J1 is suitable for the left subplan (due to the presence of column Ra in filter operation 1012), and index J2 is suitable for the right subplan (due to the presence of column Sd in filter operation 1016). Obviously, (J1, J2) is the only compatible candidate index pair. The scan operators on the resulting tables R and S can be replaced by the scan operators on top of J1 and J2, respectively, resulting in an optimized query plan 1014.
[0142] As mentioned above, FilterIndexRule and JoinIndexRule define rules that allow the query engine to optimize query plans to utilize corresponding indexes. However, the use of such rules must be integrated into the query engine. As described above, the embodiments in this paper are based on the Spark query engine (aka, Catalyst). As a rule-based query optimizer, integrating these rules is straightforward and primarily involves merging index rules into the rules adopted by the optimizer. This requires making two decisions: where to include the new rules and in what order to apply them.
[0143] Where should rules be included? Placing new rules in the wrong place can lead to undesirable consequences due to potential interactions and side effects between rules. However, the embodiments described herein simply replace the base table with indexes that meet the criteria, which has no effect on downstream operators in the logical plan. Therefore, rules can be included in all other optimizer rules. after (That is, after the query optimizer has already completed the logical query plan in other ways) apply the new rules.
[0144] What is the order of the rules? Since both FilterIndexRule and JoinIndexRule are applied after all other rules, the order is somewhat arbitrary and can be either. However, implementations can benefit from placing JoinIndexRule before FilterIndexRule, as the index used for joining can be expected to lead to further improvements.
[0145] Having described the architecture of the index subsystem, the lifecycle management of indexes, and how to effectively utilize such indexes in queries, the following description turns to the question: Given a known query workload, what index(s) will be most beneficial to create?
[0146] The embodiments described below provide an index recommendation framework that operates in two main steps:
[0147] 1) Candidate generation - Generate candidate indexes based on the characteristics of the query workload; and
[0148] 2) Choosing the best index – Use rule-based or cost-based methods to select the best index from the candidate indexes.
[0149] Therefore, given a query workload, a set of candidate indexes is created, and the best index in the set is selected and recommended for construction (or, alternatively, automatically constructed).
[0150] refer to Figure 11Further exploration of this two-step process, Figure 11 Example workload 1100 and example step 1102 for generating index recommendations for workload 1100 are shown. Workload 1100 includes query 1104, which includes query Q1 and query Q2. The data source being queried is also shown with workload 1100, presented in the form of table 1106. Example step 1102 includes steps 1108 through 1112, pointing to step 1: candidate generation, step 2: searching for the best index configuration, and step 3: recommended output, respectively. Step 1108 includes sub-steps 1a and 1b. Similarly, step 1110 includes sub-steps 2a and 2b. Based on... Figure 11 The following discussion of example step 1102, other structural and operational embodiments will be clear to those skilled in the art(s).
[0151] Figure 11 The main idea behind step 1108, candidate generation, is as follows:
[0152] 1. Examine each query with a workload of 1100 to determine Indexable columns (For example, in step 1a); and
[0153] 2. Use 'a' to construct candidate indexes (e.g., at step 1b).
[0154] In this embodiment, candidate generation can be performed according to Algorithm 1 as shown below:
[0155]
[0156]
[0157] Algorithm 1
[0158] Algorithm 1 continues to be referenced Figure 11 The candidate generation step 1108 is described below. Embodiments can be configured to determine different types of indexable columns known in the art. In one particular embodiment, Algorithm 1 starts from the query... q Extract the following indexable columns:
[0159] - Appeared q Equal predicates (i.e., x=a ,in x Yes, a The column in Algorithm 1 (line 4, which is a constant);
[0160] - Appeared q Equal predicates (i.e., x ≤ a orx ≥ a ,in x Yes, a The column in Algorithm 1 (line 5, which is a constant);
[0161] -appeared q The co-predicates of (i.e.,) Rx = Sy ,in R and S The column in the table (row 6 of Algorithm 1);
[0162] -appeared q Grouped list (i.e., grouped as) x 1 , …, x m The column in ) (row 7 of Algorithm 1);
[0163] - Appeared q The list of projections (i.e., the selection) x 1 , …, x m The column in (row 8 of Algorithm 1)
[0164] q The set of indexable columns Simply put, it is the union of the sets of the above columns (and as reflected in line 9 of Algorithm 1).
[0165] After enumerating the indexable columns in rows 3 to 9 of Algorithm 1, rows 13 to 21 of Algorithm 1 generate the workload from the indexable columns. Each query The candidate indexes are described below.
[0166] To construct candidate indexes from indexable columns, the indexable columns are grouped together in row 13 by their corresponding tables (accessed by queries). Algorithm 1 then iterates through each such group of indexable columns corresponding to a specific table, generating candidate indexes for each group in rows 15-20.
[0167] Specifically, create one or more indexes for the indexable columns corresponding to the filter (line 16 of Algorithm 1), and create an index (if any) for the indexable columns corresponding to the union (line 19 of Algorithm 1).
[0168] The candidate index corresponding to the filter is represented in line 16 as I filter Each candidate index includes Index column and Included columns The index column is composed of... The equality filter column in the middle and The range filtering columns are concatenated to form the index, while the remaining indexable columns form its included columns (as shown in the helper function GenerateFilterIndex() in lines 24-26 of Algorithm 1). The candidate index corresponding to the concatenation is represented in line 19 as follows. I join And, similar to the filtering indexes, each candidate index includes... Index column and Included columns . The columns that are joined together form the index columns, while the remaining indexable columns form the columns it includes (as shown in the helper function GenerateJoinIndex() in lines 30 to 31 of Algorithm 1).
[0169] After Algorithm 1 is completed, Figure 11 Step 1108 is also completed, and the next step in the embodiment is to determine which index configuration is optimal. This is done in order to obtain the index candidate set returned by Algorithm 1. One way to select the best index is by enumeration. All subsets, and find those that cause a decrease in workload in terms of query execution time. The best improved subset. However, when When the size is large, this approach may not be feasible in practice. Therefore, the embodiments disclosed herein employ a heuristic approach.
[0170] The first type of heuristic is a rule-based approach that compares deterministic statistics of candidate indices. Specifically, an embodiment can implement the frequency-based method shown in Algorithm 2 below:
[0171]
[0172]
[0173] Algorithm 2
[0174] Algorithm 2 can be summarized in the following three steps:
[0175] 1. Statistical index frequency (Lines 2 to 5), where we count the number of times each candidate index appears in the query workload;
[0176] 2. Merge candidate indexes (Lines 7-13), where we combine indexes with the same indexed columns into a single index by combining their included columns; and
[0177] 3. Rank the merged index(Line 15), where we sort the merged indexes in descending order of frequency.
[0178] Then, Algorithm 2 returns the previous results from the sorted candidates. K There are indexes (in row 16), among which K It is a pre-defined value given by the user.
[0179] While frequency-based approaches generally provide good indexing recommendations, they may not be suitable for all workloads. First, just because a workload's queries frequently use a candidate index does not necessarily mean that the index provides a significantly reduced query execution time. For example, such an index might reside on a frequently accessed table with small references (i.e., dimensions) where access time is negligible. Second, while merging candidate indexes with the same indexed columns has the advantage of reducing index storage and maintenance overhead, efficiency can be difficult to measure without a proper understanding of the overhead. To address these and other issues, implementations can employ cost-based approaches that rely on modeling query execution costs.
[0180] One embodiment of the cost-based index selection method relies on several basic building blocks:
[0181] 1. Cost Model It estimates the execution cost of a given query plan;
[0182] 2. "what if" utility Its return Assumption The query plan and its cost without actually building an index; and
[0183] 3. High efficiency Search Algorithm It searches for the index that generates the lowest workload execution cost.
[0184] Figure 12A A schematic diagram 1200 illustrates an architecture for cost-based index tuning using the "what if" utility 1210 according to an embodiment. Schematic 1200 includes an index tuning module 1204 and a query optimizer 1212, which includes the "what if" utility 1210. Based on... Figure 12A The following discussion of the schematic diagram 1200, other structural and operational embodiments will be clear to those skilled in the art.
[0185] At higher levels, Figure 12A The cost-based approach implemented in the schematic diagram 1200 works as follows. The index tuning module 1204 will include query... q and hypothetical index configuration CThe ordered pair 1206 is provided to the "what-if" utility 1210. It is assumed that the index configuration can include the candidate index set as described above. The "what-if" utility 1210 can combine with the query optimizer 1212 to generate an ordered pair 1208, which includes the query plan. P and implementation plan P Estimated cost (in) Figure 12A (Represented as 'cost-est(P)'). Query plan P The plan is generated by the query optimizer, assuming that the configuration has actually been built. C The index. Similarly, the cost estimate cost-est(P) reflects the plan. P The estimated execution cost. (See below for details.) Figure 12B The context includes a more detailed description of this high-level overview. Figure 12B Example 1202 of the application of the "what-if" utility 1210 to an SQL query is shown according to an embodiment. Example 1202 includes table 1214, SQL query 1216, filtering index F11218, and plan. P 1220, Plan P '1222, Cost' R 1224 and cost R '1226. Based on such Figure 12A The “what-if” utility 1210 discussed below, along with other structural and operational embodiments, will be clear to those skilled in the art.
[0186] like Figure 12B As shown, the data source has two tables, 1214, to be queried: R ( a , b )and S ( c , d And query 1216 is a simple filter-join query applied to them. Suppose a user wants to understand the performance impact on query 1216 by building an index F11218. In one embodiment, the user can call the "what-if" utility 1210 without building the index F11218. Such a call typically proceeds as follows:
[0187] 1. Receive the plan returned by the query optimizer. P 1220, and the search can use the filtering predicates evaluated by index F11218.
[0188] 2. Index F11218 was determined to be beneficial for accelerating the processing of Ra=5.
[0189] 3.R The table scan above was replaced with access to the "hypothesis" filter F11218, and a new plan was generated. P '1222. Plan' P '1222 is not executable (because filter F11218 has not been built yet).
[0190] 4. Regarding the new plan P '1222 invokes the cost estimation program and generates its estimated cost. (Again, this cost is hypothetical, and it makes sense if F1 is constructed.)'
[0191] To determine the cost P Cost models can be applied to planning. P 1220. For example, suppose the cost model is configured to estimate the cost of each operator. Output size Assuming table R is 2 GB in size, this means a table scan of R incurs a cost of retrieving and processing 2 GB (denoted as cost R 1224). On the other hand, the cost model employed by the "what-if" utility 1210 determines that when the table scan of R is replaced by index F11218, the size can be reduced to 0.8 GB, which is the cost... R '1226, as Figure 12B As shown.
[0192] In this way, the cost model can estimate the plan. P The output size of all operators in 1220, and similarly, the plan is estimated. P’ The output of the operators in 1222 sums the individual costs and determines the estimated cost for executing each plan. After doing this, the plans can now be compared. P 1220 and the plan P’ The cost of 1222 is calculated, and improvements are determined. For the example here, assume the cost ( P ) = size( P = 2.5GB, and cost ( P′ ) = size( P′ ) = 1.5GB, which are respectively Plan P 1220 and Plan P’ The sum of the output sizes of all operators in 1222. As a result, if index F11218 were constructed, the improvement would be... Algorithm 3 shown in this article illustrates an embodiment of the "what if" utility 1210:
[0193]
[0194] Cost models can estimate the planned query cost in several ways. In the size-based cost model described above, the size of the base table / index file can be obtained by relying on metadata stored in the file system. Combined with the above... Figure 12B In the examples described, it can be obtained in this way. R And the size of index F11218. Estimating the output size of other operators can often be done by estimating the cardinality and / or selectivity of the operators (e.g., estimating the fraction of rows in a table that satisfy the predicate filtering conditions). Operator selectivity can be determined, statistics about the workload can be collected, and the statistics can be summarized using compact data structures (e.g., histograms, sketches, random samples). However, collecting statistics from large amounts of data is very time-consuming. Furthermore, it can be difficult to maintain these statistics to prevent them from becoming outdated due to data updates.
[0195] Alternatively, a heuristic approach can be used to assign selectivity values to operators. For example, for operators whose output size is the same as their input size, such as... Sort Its selectivity is only 1.0; for other operators, such as filter or Alliance combine Its selectivity can be set to 0.1.
[0196] It should be noted that the exact cost estimate produced by the cost model is not particularly important for the purposes of the "what-if" utility 1210. That is, having a true and accurate cost estimate is not as important as the comparability of any two estimates. It is sufficient if the cost model can accurately determine which of the two query plans has a higher cost.
[0197] exist Figure 12A Having provided a high-level description of the "what-if" utility 1210 within the context of the architectural diagram 1200, we now turn to a detailed description of an embodiment of a cost-based method for index selection using the "what-if" utility 1210. However, it should be noted that the "what-if" utility 1210 has its own advantages beyond its role in index recommendation. Using the "what-if" utility 1210, users are able to... quantitatively Evaluate the potential improvements in building the hypothesis index, for example, in percentage terms. However, these benefits are not free—they lead to more accurate cost modeling, which in turn increases the overhead of collecting statistics.
[0198] Algorithm 4 utilizes, for example, the "what if" utility 1210 implemented by Algorithm 3, to implement an embodiment of a cost-based method for index selection, as shown below:
[0199]
[0200] Algorithm 4 accepts query workload The candidate index set listed by Algorithm 1 and the number of indexes to be returned. K As input. Enumerate up to size K Each subset (in line 3), and for each such subset Call the "what if" utility 1210 to get each The estimated cost is like The assumed indexes in the previous example are constructed the same way (line 4). If this is for workload... estimated cost If the sum is lower than the lowest cost recorded so far, then we will Mark it as the best index and update the lowest cost so far (lines 5-7). Finally, return the overall best subset found with the minimum estimated cost (line 8). Note that it will call... The sum of the returned costs, as shown in line 4, is just one example of combining query costs to calculate workload costs. For example, in an alternative embodiment, each query could be further weighted (e.g., relative to frequency), and a “weighted sum” could then be calculated when combining query costs.
[0201] Implementations of workload optimization systems can be carried out in various ways to use information obtained from queries about the workload to generate index recommendations that will benefit the workload, and to build and use such indexes to serve the queries. For example, Figure 13 A detailed schematic diagram 1300 of a workload optimization system 1302 according to an embodiment is shown. (As...) Figure 13 As shown, the workload optimization system 1302 includes a candidate index generator 1304, an index selector 1308, and a query processor 1312. Based on... Figure 13 The following discussion of the workload optimization system 1302, other structural and operational embodiments will be clear to those skilled in the art.
[0202] As a preliminary matter, as stated above, such as Figure 13The workload optimization system 1302 shown is configured to receive a workload 1104 comprising multiple queries and to pass the workload 1104 to a candidate index generator 1304. The candidate index generator 1304 is configured to extract a set of indexable columns from the queries of the workload 1104 in various ways. For example, in an embodiment, indexable columns may be extracted from the queries of the workload 1104 according to Algorithm 1, as described above. The candidate index generator 1304 is also configured to subsequently generate a set of candidate indexes 1306 based on the set of indexable columns. For example, candidate indexes 1306 may be generated from the set of indexable columns according to Algorithm 1.
[0203] Subsequently, the generated candidate indexes 1306 are passed to the index selector 1308. The index selector 1308 is configured to select which index(s) among the candidate indexes 1306 will provide the greatest performance benefit when executing the query for workload 1104, and provides the thus selected indexes to the query processor 1312. The index selector 1308 can select the best index in several ways. For example, the index selector 1308 can employ a frequency-based method as described in Algorithm 2 and above. Alternatively, the index selector 1308 can employ a cost-based method, which utilizes… Figure 12A The "what-if" utility 1210 determines which indexes have the lowest cost. For example, the index selector 1308 may employ an embodiment of the "what-if" utility 1210 implementing Algorithm 3 as described above. Furthermore, the index selector 1308 may then employ Algorithm 4 to select the optimal index using the "what-if" utility 1210 and the methods described above. The index selector 1308 then passes the selected index 1310 to the query processor 1312.
[0204] Query processor 1312 is configured to accept a selected index 1310, construct an index included in the selected index 1310 to provide a constructed index 1314, receive a query 1316, generate a query plan optimized to use one or more constructed indexes 1314, and execute the query plan to produce query results. For example, constructed index 1314 can be constructed from the selected index 1310 by constructing a table that includes one or more key columns (i.e., “index columns” as described in detail above in the description of Algorithm 1) and one or more data columns corresponding to the “included columns”.
[0205] Upon receiving query 1316, query processor 1312 is configured to generate a query plan for that query, wherein, where possible, FilterIndexRule and JoinIndexRule are used, and in combination as described above. Figure 9 and10 The query plan is modified to reference one or more built indexes 1314. The query processor 1312 can then execute the modified query plan to produce the final query results.
[0206] According to one embodiment, in combination Figure 14 Described Figure 13 Other operational aspects of the workload optimization system 1302 Figure 14 A flowchart 1400 illustrates a method for workload optimization performed at a first query engine in a distributed query processing system. In an embodiment, flowchart 1400 may be derived from... Figure 13 The workload optimization system 1302 is executed. (Although reference...) Figure 13 The workload optimization system 1302 shown is described, but Figure 14 The method is not limited to this implementation. In fact, as further described below, flowchart 1400 can also be derived from... Figure 16 The workload optimization system 1602 is executed. Based on the following... Figure 14 The discussion of flowchart 1400, other structural and operational embodiments will be clear to those skilled in the art.
[0207] Please note that flowchart 1400 can be triggered to optimize the workload of the distributed query processing system in various ways. For example, optimization can be triggered in response to an explicit request from a system administrator or automatically (e.g., based on changes in average system workload over time or substantial changes to the underlying data). Flowchart 1400 begins at step 1402. In step 1402, a set of candidate indexes is generated based on multiple queries. For example, and referring to... Figure 13 The workload optimization system 1302, and the candidate index generator 1304, can be configured to generate candidate indexes 1306 based on a set of indexable columns extracted from a workload comprising multiple queries, as described above. Figure 13 Workload optimization system 1302 Figure 11 A detailed description of Algorithm 1 and the manner in which it is described in detail. Flowchart 1400 continues in step 1404.
[0208] At step 1404, based on the determination of the estimated performance improvement for the workload provided by the candidate index set, a predetermined number of candidate indexes are selected from the candidate index set. For example, and continuing to refer to... Figure 13 The workload optimization system 1302 includes an index selector 1308 configured to select a predetermined number of candidate indices 1310 from a candidate index set 1306 based on performance improvements estimated for the workload provided by the selected set. More specifically, the index selector 1308 can be configured as described above. Figure 13 The workload optimization system 1302, Algorithm 2 or Algorithm 3 in combination with Algorithm 4, and their respective detailed descriptions are described in detail. Flowchart 1400 continues in step 1406.
[0209] At step 1406, the selected candidate indexes are constructed according to the indexing specification and stored at a predetermined location on the data lake. The predetermined location and the index metadata included in the index construction conform to the indexing specification. For example, and continuing to refer to... Figure 13 The workload optimization system 1302, query processor 1312 can be configured to accept the selected index 1310 from index selector 1308 and build such an index to provide index 1314, according to the above regarding Figure 13 The workload optimization system 1302 and / or methods known in the art. Furthermore, such an index can conform to index specification 212, thereby including metadata such as content 504 and lineage 506 as described in detail above. By conforming to index specification 212, such an index can also be stored in a predetermined default location specified in index specification 212. Flowchart 1400 continues at step 1412.
[0210] At step 1408, a query is received. For example, and continue to refer to... Figure 13 The workload optimization system 1302, query processor 1312 can receive query 1316, which may include a query on workload 1104 or any other query. Flowchart 1400 continues in step 1410.
[0211] In step 1410, a query plan is generated for the query, which is optimized to use the built index. For example, and continue to refer to Figure 13 The workload optimization system 1302 and query processor 1312 can be configured to generate intermediate query plans that are not configured to use any indexes, and then apply FilterIndexRule and / or JoinIndexRule to the intermediate query plans to provide the services configured to use the indexes as described above. Figure 13 The workload optimization system 1302 details the query plan for building at least one of the indexes, FilterIndexRule and / or JoinIndexRule respectively, as combined with Figure 9 and 10 Examples 900 and 1000 are described.
[0212] Figure 14 Flowchart 1400 ends at step 1412. In step 1412, the query plan is executed to generate the final query results. For example, and continue to refer to... Figure 13The workload optimization system 1302, query processor 1312, can be configured to execute the query plan generated in step 1414 to generate query result 1318. As is known in the art, because the query plan is configured to use at least one of the building indexes in building index 1314, generating query result 1318 requires fewer system resources (e.g., storage space, CPU cycles, and / or memory) compared to executing the intermediate query plan described above that is not configured to use building index 1314.
[0213] In the foregoing discussion of steps 1402 to 1412 of flowchart 1400, it should be understood that these steps may sometimes be performed in a different order or even simultaneously with other steps. For example, once at least one index has been built at step 1406, the embodiment may perform steps 1408 to 1412 while the system continues to build other indexes of a predetermined number of candidate indexes. Other operational embodiments will be clear to those skilled in the art. It should also be noted that... Figure 13 The foregoing general description of the operation of the workload optimization system 1302 is provided for illustrative purposes only, and embodiments of the workload optimization system 1302 may include different hardware and / or software and may operate in a different manner than described above. In fact, the steps of flowchart 1300 can be performed in various ways.
[0214] Combination Figure 15 and 16 Described Figure 13 Other operational aspects of the workload optimization system 1302. Figure 15 It shows the Figure 14 The flowchart 1400 is a multi-engine refinement flowchart, in which, according to one embodiment, a second query engine optimizes the query plan to utilize a pre-existing index found at a predetermined location. Figure 16 A schematic diagram of an example workload optimization system 1600, including aspects of multi-engine interoperability according to an embodiment, is shown. Figure 16 As shown, the workload optimization system 1600 includes a data lake 202, a data lake access client 1606, a first query engine 1608, and a second query engine 1614. The data lake 202 includes an index 1602 and a dataset 1604 stored thereon. As described below, index 1602 contains a covering index 502 created thereon.
[0215] In one embodiment, Figure 15 The steps of flowchart 1500 are, for example... Figure 16 The second query engine 1614 shown is used to perform these steps, and these steps assume that there are built indexes (e.g., covering index 502) on data lake 202 constructed by different query engines. Therefore, in the description Figure 15Before flowchart 1500, now refer to Figure 16 The workload optimization system 1600 describes the creation of such indexes by such a query engine.
[0216] In this embodiment, the first query engine 1608 can be configured to create an index on the data lake 202 by receiving and executing an appropriate query on the dataset 1604. For example, the first query engine 1608 can receive a CREATE INDEX query 1610 from the data lake access client 1606 and execute that query on the dataset 1604 to create a covering index 502. In this embodiment, the index corresponding to the CREATE INDEX query 1610 is based on the above reference. Figure 14 Flowchart 1400 and Figure 13 The workload optimization system 1302 describes the process for determining the indexes. Specifically, and referring to the workload optimization system 1302, the index to be created by the create index query 1610 may include one of the selected indexes 1310 chosen by the index selector 1308 from the candidate indexes 1306.
[0217] Upon receiving the create index query 1610, the first query engine 1608 can execute a query on the dataset 1604, and more specifically, on the indexed data 1612, to generate a covering index 502. This covering index 502 is then stored at the default predetermined location specified in the index specification 214 on the data lake 202, as described in detail above, unless an alternative storage location is specified at the time of index creation (also as described above). In this embodiment, the first query engine 1608 may correspond to... Figure 13 The query processor 1312. However, it should be understood that in some embodiments, the candidate index generator 1304 and the index selector 1308 may be separate from the first query engine 1608.
[0218] After describing the index creation of the workload optimization system 1600, we will now combine... Figure 15 describe Figure 16 Other operational aspects of the workload optimization system 1600 Figure 15 It shows the Figure 14 The flowchart 1400 is a multi-engine refinement flowchart 1500, wherein, according to an embodiment, the second query engine optimizes the query plan to utilize a pre-existing index found at a predetermined location. In an embodiment, flowchart 1500 may be derived from... Figure 16 The workload optimization system 1600 is running. (Although reference...) Figure 16 The workload optimization system 1600 shown is described, but... Figure 15 The method is not limited to this implementation. In fact, as further described below, flowchart 1500 can also be derived from... Figure 13 The workload optimization system 1302 is executed. Based on the following... Figure 15 The discussion of flowchart 1500, other structural and operational embodiments will be clear to those skilled in the art.
[0219] Flowchart 1500 begins at step 1502. In step 1502, a query is received. For example, the second query engine 1614 can receive query 1316. Flowchart 1500 continues at step 1504.
[0220] At step 1504, a query plan is generated for this query. For example, Figure 16 The second query engine 1614 can be used according to the above information. Figure 14 The executable query plan is generated in the manner described in step 1410 of flowchart 1400. More specifically, the second query engine 1614 can be configured similar to... Figure 13 The query processor 1312 operates to generate an intermediate query plan that is not configured to use any building indexes. Flowchart 1500 continues in step 1506.
[0221] At step 1506, the built indexes stored in a predetermined location are searched within the data lake. For example, in an embodiment, the second query engine 1614 is configured to know the index specification 214 and implement an API such as the user-facing index management API 226 described above, thereby discovering any indexes that may exist on the data lake 202 and are stored in a predetermined default location specified in the index specification 214. Therefore, the second query engine 1614 can discover the covering index 502 at the predetermined default location on the data lake 202, and can subsequently retrieve the raw query plan information for that index's index metadata from the rawPlan node 516 of the covering index 502 as described above. The second query engine 1614 can examine the raw query plan to determine whether it can use the covering index 502. As mentioned above, the second query engine 1614 may not be able to use the index because it requires capabilities not available on the second query engine 1614 (e.g., unsupported hash functions).
[0222] The second query engine 1614 can also access index metadata, including content 504 and lineage 506 (both described herein), before or after analyzing the original query plan information described above, to determine, for example, the indexed columns and included columns (and their types), the indexed data source, and / or the physical location and layout of the covering index 502. Some or all of this information may be useful in determining whether and how the second query engine 1614 utilizes such an index. Flowchart 1500 continues at step 1508.
[0223] At step 1508, based on the index metadata of each found built index, it is determined whether the corresponding built index can be used to optimize the query plan, and if so, the query plan is optimized to use the corresponding built index. For example, and continuing to refer to... Figure 16 The workload optimization system 1600, having received query 1316 and also having discovered the covering index 502, determines that the covering index 502 can be used by the second query engine 1614 and the data source indexed therefrom. The second query engine 1614 can use the covering index 502 when constructing a query plan for query 1316, wherein such a query plan, where possible, uses the covering index 502 in the manner described above with respect to the workload optimization system 1302 to produce query result 1318. For example, the second query engine 1614 can apply FilterIndexRule and / or JoinIndexRule to the query plan generated in step 1504 to provide an optimized query plan configured as described above with respect to the workload optimization system 1302. Figure 13 The workload optimization system 1302 generally uses a covering index 502, FilterIndexRule and / or JoinIndexRule respectively, as combined as... Figure 9 and 10 Examples 900 and 1000 are described.
[0224] Flowchart 1500 ends at step 1510. In step 1510, an optimized query plan is executed to provide query results. For example, and continue to refer to... Figure 16 The workload optimization system 1600, and the second query engine 1614, can be configured to execute the query plan generated in step 1512 to generate query results, such as... Figure 13 The query result shown is 1318. As is well known in the art, because the query plan is configured to use the covering index 502, generating the query result requires fewer system resources (e.g., storage space, CPU cycles, and / or memory) compared to executing the intermediate query plan described above, which is not configured to use the building index.
[0225] In the foregoing discussion of steps 1502 to 1410 of flowchart 1500, it should be understood that these steps may sometimes be performed in a different order or even concurrently with other steps. For example, step 1506, where the embodiment searches and builds an index in the data lake, may be performed at any time before step 1508, including before receiving the query in step 1502 or before generating the query plan in step 1504. Similarly, in some embodiments, it may be determined based on index metadata whether one or more of the indexes found at step 1506 are incompatible with the query engine (e.g., the second query engine 1614), thereby allowing step 1508 to be performed before steps 1502 and 1504 in certain circumstances. Other operational embodiments will be clear to those skilled in the art. It should also be noted that... Figure 13 and Figure 16 The foregoing general description of the operation of the workload optimization systems 1302 and 1600 is provided for illustrative purposes only, and embodiments of the workload optimization systems 1302 and 1600 may include different hardware and / or software and may operate in a different manner than described above.
[0226] III. Example Computer System Implementation
[0227] Each of the data collector 112, data decomposer 114, data modeler and server 116, query optimizer 1212, candidate index generator 1304, index selector 1308, query processor 1312, data lake access client 1606, first query engine 1608 and / or second query engine 1614, and flowcharts 1400 and / or 1500 can be implemented in hardware or hardware combined with software and / or firmware. For example, the data collector 112, data decomposer 114, data modeler and server 116, query optimizer 1212, candidate index generator 1304, index selector 1308, query processor 1312, data lake access client 1606, first query engine 1608 and / or second query engine 1614, and flowcharts 1400 and / or 1500 can be implemented as computer program code / instructions configured to execute in one or more processors and stored in a computer-readable storage medium. Alternatively, the data collector 112, data decomposer 114, data modeler and server 116, query optimizer 1212, candidate index generator 1304, index selector 1308, query processor 1312, data lake access client 1606, first query engine 1608 and / or second query engine 1614, and flowchart 1400 and / or 1500 can be implemented as hardware logic / electronic circuits.
[0228] For example, in one embodiment, one or more of the following components—data collector 112, data decomposer 114, data modeler and server 116, query optimizer 1212, candidate index generator 1304, index selector 1308, query processor 1312, data lake access client 1606, first query engine 1608 and / or second query engine 1614, and flowcharts 1400 and / or 1500—can be implemented together in an SoC in any combination. The SoC may include an integrated circuit chip that includes a processor (e.g., a central processing unit (CPU), microcontroller, microprocessor, digital signal processor (DSP), etc.), memory, one or more communication interfaces, and / or other circuitry, and may optionally execute received program code and / or include embedded firmware for performing functions.
[0229] Figure 17 An exemplary implementation of the computing device 1700 in which embodiments can be implemented is shown. For example, a data collector 112, a data decomposer 114, a data modeler and server 116, a query optimizer 1212, a candidate index generator 1304, an index selector 1308, a query processor 1312, a data lake access client 1606, a first query engine 1608 and / or a second query engine 1614, and flowcharts 1400 and / or 1500 may be implemented in one or more computing devices similar to the computing device 1700 in fixed or mobile computer embodiments, including one or more features and / or alternative features of the computing device 1700. The description of the computing device 1700 provided herein is for illustrative purposes and is not intended to be limiting. Embodiments may be implemented in other types of computer systems, as are well known to those skilled in the art.
[0230] like Figure 17As shown, computing device 1700 includes one or more processors, referred to as processor circuitry 1702, system memory 1704, and a bus 1706 coupling various system components, including system memory 1704, to processor circuitry 1702. Processor circuitry 1702 is electronic and / or optical circuitry implemented in one or more physical hardware circuitry device elements and / or integrated circuit devices (semiconductor chip or die) as a central processing unit (CPU), microcontroller, microprocessor, and / or other physical hardware processor circuitry. Processor circuitry 1702 can execute program code stored in a computer-readable medium, such as program code for operating system 1730, application program 1732, other programs 1734, etc. Bus 1706 represents any one or more bus structures of various types, including memory bus or memory controller, peripheral bus, accelerated graphics port, and any processor or local bus using various bus architectures. System memory 1704 includes read-only memory (ROM) 1708 and random access memory (RAM) 1710. The Basic Input / Output System 1712 (BIOS) is stored in ROM 1708.
[0231] The computing device 1700 also includes one or more of the following drives: a hard disk drive 1714 for reading and writing to the hard disk; a disk drive 1716 for reading or writing to a removable disk 1718; and an optical disk drive 1720 for reading or writing to a removable optical disk 1722, such as a CD-ROM, DVD-ROM, or other optical media. The hard disk drive 1714, disk drive 1716, and optical disk drive 1720 are connected to the bus 1706 via a hard disk drive interface 1724, a disk drive interface 1726, and an optical disk drive interface 1728, respectively. The drives and their associated computer-readable media provide the computer with non-volatile storage of computer-readable instructions, data structures, program modules, and other data. While hard disks, removable disks, and removable optical disks have been described, other types of hardware-based computer-readable storage media may also be used to store data, such as flash memory cards, digital video discs, RAM, ROM, and other hardware storage media.
[0232] Multiple program modules may be stored on a hard disk, magnetic disk, optical disk, ROM, or RAM. These programs include an operating system 1730, one or more application programs 1732, other programs 1734, and program data 1736. Application program 1732 or other program 1734 may include, for example, computer program logic (e.g., computer program code or instructions) for implementing a data collector 112, a data resolver 114, a data modeler and server 116, a query optimizer 1212, a candidate index generator 1304, an index selector 1308, a query processor 1312, a data lake access client 1606, a first query engine 1608 and / or a second query engine 1614, and flowcharts 1400 and / or 1500 (including any appropriate steps of flowcharts 1400 and / or 1500) and / or other embodiments described herein.
[0233] Users can input commands and information into computing device 1700 through input devices such as keyboard 1738 and pointing device 1740. Other input devices (not shown) may include microphone, joystick, gamepad, satellite antenna, scanner, touch screen and / or touchpad, voice recognition system for receiving voice input, gesture recognition system for receiving gesture input, etc. These and other input devices are typically connected to processor circuitry 1702 via serial interface 1742 coupled to bus 1706, but may also be connected via other interfaces such as parallel port, game port, or universal serial bus (USB).
[0234] Display screen 1744 is also connected to bus 1706 via an interface such as video adapter 1746. Display screen 1744 may be external to or incorporated into computing device 1700. Display screen 1744 may display information and serve as a user interface for receiving user commands and / or other information (e.g., via touch, finger gestures, virtual keyboard, etc.). In addition to display screen 1744, computing device 1700 may also include other peripheral output devices (not shown), such as speakers and printers.
[0235] Computing device 1700 is connected to network 1748 (e.g., the Internet) via an adapter or network interface 1750, modem 1752, or other means for establishing communication on the network. Figure 17 As shown, the modem 1752, which can be built-in or external, can be connected to the bus 1706 via the serial interface 1742, or it can be connected to the bus 1706 using another interface type that includes a parallel interface.
[0236] As used herein, the terms “computer program medium,” “computer-readable medium,” and “computer-readable storage medium” are used to refer to physical hardware media, such as a hard disk associated with hard disk drive 1714, removable disk 1718, removable optical disk 1722, other physical hardware media such as RAM, ROM, flash memory cards, digital video disks, zip disks, MEM, nanotechnology-based storage devices, and other types of physical / tangible hardware storage media. Such computer-readable storage media are distinct from and do not overlap with communication media (which exclude communication media). Communication media contain computer-readable instructions, data structures, program modules, or other data in modulated data signals such as carrier waves. The term “modulated data signal” refers to a signal whose one or more characteristics are set or altered in a manner that encodes information in a signal. By way of example and not limitation, communication media include wireless media, such as acoustic, RF, infrared, and other wireless media, as well as wired media. Embodiments also relate to such communication media that are separate from and do not overlap with embodiments relating to computer-readable storage media.
[0237] As described above, computer programs and modules (including application program 1732 and other programs 1734) can be stored on a hard disk, magnetic disk, optical disk, ROM, RAM, or other hardware storage media. Such computer programs can also be received via network interface 1750, serial interface 1742, or any other interface type. When executed or loaded by an application program, such computer programs enable computing device 1700 to implement the features of the embodiments described herein. Therefore, such computer programs represent a controller of computing device 1700.
[0238] The embodiments also relate to computer program products that include computer code or instructions stored on any computer-readable medium. Such computer program products include hard disk drives, optical disk drives, storage device packages, memory sticks, memory cards, and other types of physical storage hardware.
[0239] IV. Additional Example Implementations
[0240] This paper presents a first query engine. The first query engine is configured to be coupled to a data lake, which is configured to store datasets and indexes based on the datasets on the data lake. The first query engine is also configured to: receive a workload comprising multiple queries pointing to the datasets; generate a set of candidate indexes based on the multiple queries; select a predetermined number of candidate indexes from the set of candidate indexes based on an estimated performance improvement for the workload provided by the determined set of candidate indexes; and construct the indexes from the selected candidate indexes according to an index specification, and store the constructed indexes at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed indexes conforming to the index specification.
[0241] In another embodiment of the first query engine, the index metadata describes one or more of the following: the content that builds the index; the lineage of the index; or the state of the index.
[0242] In another embodiment of the first query engine, the index metadata describing the content of the index includes one or more of the following: the name of the index; the type of the index; the configuration of the index, including the identifiers of the index columns and the identifiers of the included columns and the type of each column; or the physical location and layout of the index.
[0243] In another embodiment of the first query engine, the index metadata describing the lineage of the index construction includes one or more of the following: one or more identifiers, each identifier corresponding to one or more data sources being indexed; the time of indexing the data sources; or a descriptive history of the index construction.
[0244] In another embodiment of the first query engine, the index metadata describing the state of index building includes one or more state descriptors from a set of enabled, disabled, created, or deleted.
[0245] In another embodiment of the first query engine, the index metadata describing the genealogy of the index building also includes query plan information for the original query used to create the index building process.
[0246] In another embodiment of the first query engine, the data lake is also configured to be coupled to a second query engine different from the first query engine. The second query engine is configured to: search for built indexes stored in a predetermined location in the data lake; receive queries; determine, based on the index metadata found by the search for each built index, whether the corresponding built index can be used to optimize the query plan for executing the query, and if so, optimize the query plan to use the corresponding built index; and execute the optimized query plan to provide query results.
[0247] In another embodiment of the first query engine, the first and second query engines are further configured to determine, based on metadata describing the lineage of the index, whether the data source has been updated since the index was built with respect to the data source corresponding to the index, and if so, to rebuild the index using the original query plan information.
[0248] This paper presents a query processing workload optimization system configured to receive workloads comprising multiple queries. The system includes: one or more processors; and one or more storage devices accessible to the processors, storing program code for execution by the processors. The program code includes: a first query processor coupled to a data lake, a candidate index generator, and an index selector, wherein: the candidate index generator is configured to generate a set of candidate indexes based on the multiple queries; and the index selector is configured to select a predetermined number of candidate indexes from the set based on an estimated performance improvement for the workload provided by the determined set of candidate indexes; the first query processor is configured to: construct indexes from the selected candidate indexes according to an index specification, and store the constructed indexes at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed indexes conforming to the index specification; receive queries; generate a query plan for the queries, wherein the query plan is optimized to use the constructed indexes; and execute the query plan to generate final query results.
[0249] In another embodiment of the query processing workload optimization system, the index metadata describes one or more of the following: the content that builds the index; the lineage of the index; or the state of the index.
[0250] In another embodiment of the query processing workload optimization system, the index metadata describing the content of the index includes one or more of the following: the name of the index; the type of the index; the configuration of the index, including the identifiers of the indexed columns and the included columns, and the type of each column; or the physical location and layout of the index.
[0251] In another embodiment of the query processing workload optimization system, the index metadata describing the lineage of index building includes one or more of the following: one or more identifiers, each corresponding to one or more data sources being indexed; the time of indexing the data sources; or a descriptive history of index building.
[0252] In another embodiment of the query processing workload optimization system, the index metadata describing the state of index building includes one or more status descriptors from a set of enabled, disabled, created, or deleted.
[0253] In another embodiment of the query processing workload optimization system, the index metadata describing the genealogy of the indexes to be built also includes original query plan information for the original queries used to create the indexes.
[0254] In another embodiment of the query processing workload optimization system, the system further includes a second query processor coupled to the data lake, the second query processor being configured to receive queries; generate query plans for the queries; search for built indexes stored in predetermined locations in the data lake; determine, based on the index metadata of each found built index, whether the corresponding built index can be used to optimize the query plan, and if so, optimize the query plan to use the corresponding built index; and execute the optimized query plan to provide query results.
[0255] In another embodiment of the query processing workload optimization system, the first query processor and the second query processor are further configured to determine, based on metadata describing the lineage of the index being built, whether the data source has been updated since the index was built with respect to the data source corresponding to the index, and if so, to rebuild the index using the original query plan information.
[0256] This paper presents a query processing workload optimization system configured to receive a workload comprising multiple queries. The system includes: a data lake configured to store a dataset and dataset-based indexes thereon; at least one processor configured to be coupled to the data lake; and at least one memory storing program code configured to be executed by the at least one processor to perform operations including: receiving a workload comprising multiple queries pointing to the dataset; generating a set of candidate indexes based on the multiple queries; selecting a predetermined number of candidate indexes from the set of candidate indexes based on an estimated performance improvement for the workload provided by the set of candidate indexes; and constructing indexes from the selected candidate indexes according to an index specification, and storing the constructed indexes at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed indexes conforming to the index specification.
[0257] In another embodiment of the query processing workload optimization system, the index metadata describes one or more of the following: the content that builds the index; the lineage of the index; or the state of the index.
[0258] In another embodiment of the query processing workload optimization system, the index metadata describing the lineage of index building includes one or more of the following: one or more identifiers, each corresponding to one or more data sources being indexed; the time of indexing the data sources; or a descriptive history of index building.
[0259] In another embodiment of the query processing workload optimization system, the operation further includes determining, based on metadata describing the lineage of the index being built, whether the data source has been updated since the index was built with respect to the data source corresponding to the index, and if so, rebuilding the index using the original query plan information.
[0260] V. Conclusion
[0261] While various embodiments of the disclosed subject matter have been described above, it should be understood that they are presented by way of example only and not as a limitation. Those skilled in the art will understand that various changes in form and detail may be made therein without departing from the spirit and scope of the embodiments as defined in the appended claims. Therefore, the breadth and scope of the disclosed subject matter should not be limited by any of the exemplary embodiments described above, but should be defined solely by the appended claims and their equivalents.
Claims
1. A system for querying data, comprising: One or more processors; as well as One or more storage devices, accessible by the one or more processors, storing program code for execution by the one or more processors, the program code including: A first query engine is configured to be coupled to a data lake, the data lake being configured to store datasets and indexes based on the datasets on the data lake, and the first query engine is further configured to: Receive workloads that include multiple queries pointing to the dataset; A candidate index set is generated based on the multiple queries; Based on the estimated performance improvement for the workload provided by the candidate index set, a predetermined number of candidate indexes are selected from the candidate index set; and The selected candidate indexes are constructed according to the index specification, and the constructed indexes are stored at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed index conforming to the index specification, the index metadata describing the lineage of the constructed index, which can be used by another query engine to determine whether the data source corresponding to the constructed index has been updated.
2. The system of claim 1, wherein the index metadata further describes one or more of the following: The content of the constructed index; or The state of index construction.
3. The system of claim 2, wherein the index metadata describing the content of the constructed index includes one or more of the following: The name of the index to be constructed; The type of the index being constructed; The configuration for building the index includes the identifiers of the indexed columns and the identifiers of the included columns, as well as the type of each column; or The physical location and layout of the constructed index.
4. The system of claim 2, wherein the index metadata describing the genealogy for which the index is constructed includes one or more of the following: One or more identifiers, each corresponding to one or more data sources being indexed; The time when the data source was indexed; or The descriptive history of the index construction.
5. The system of claim 4, wherein the index metadata describing the genealogy of the constructed index further includes query plan information for the original query used to create the constructed index.
6. The system of claim 5, further comprising a second query engine different from the system, the second query engine being configured to: Search the data lake for the constructed index stored at the predetermined location; Receive queries; Based on the index metadata found through the search for each built index, it is determined whether the corresponding built index can be used to optimize the query plan for executing the query, and if so, the query plan is optimized to use the corresponding built index. as well as Execute an optimized query plan to provide query results.
7. The system of claim 6, wherein at least one of the first query engine or the second query engine is further configured to determine, based on the index metadata describing the lineage of the constructed index, whether the data source corresponding to the constructed index has been updated since it was indexed, and if so, to rebuild the index using the query plan information.
8. The system of claim 2, wherein the index metadata describing the state of the constructed index includes one or more state descriptors from a set of enabled, disabled, created, or deleted states.
9. A query processing workload optimization system, comprising: The first query engine includes: One or more processors; and One or more storage devices, accessible by the one or more processors, storing program code for execution by the one or more processors, the program code including: A candidate index generator is configured to generate a set of candidate indexes based on multiple queries; and An index selector is configured to select a predetermined number of candidate indexes from the candidate index set based on an estimated performance improvement for the workload provided by the candidate index set; and A first query processor is configured to construct an index from the selected candidate indexes according to an index specification, and to store the constructed index at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed index conforming to the index specification; and The second query engine includes: One or more processors; and One or more storage devices, accessible by the one or more processors, storing second program code for execution by the one or more processors, the second program code comprising: The second query processor is configured as follows: Receive queries; Generate a query plan for the query; and Search the indexed data lake stored at the predetermined location. Based on the index metadata for each found built index, determine whether the corresponding built index can be used to optimize the query plan, and if so, optimize the query plan to use the corresponding built index; and Execute the optimized query plan to provide query results.
10. The query processing workload optimization system according to claim 9, wherein the index metadata describes one or more of the following: The content of the constructed index; The genealogy for constructing the index; or The state of index construction.
11. The query processing workload optimization system of claim 10, wherein the index metadata describing the content of the constructed index includes one or more of the following: The name of the index to be constructed; The type of the index being constructed; The configuration for building the index includes the identifiers of the indexed columns and the identifiers of the included columns, as well as the type of each column; or The physical location and layout of the constructed index.
12. The query processing workload optimization system of claim 10, wherein the index metadata describing the genealogy of the constructed index includes one or more of the following: One or more identifiers, each corresponding to one or more data sources being indexed; The time when the data source was indexed; or The descriptive history of the index construction.
13. The query processing workload optimization system of claim 12, wherein the index metadata describing the genealogy of the constructed index further includes query plan information for the original query used to create the constructed index.
14. The query processing workload optimization system according to claim 13, wherein, At least one of the first query engine or the second query engine is further configured to determine, based on the index metadata describing the lineage of the constructed index, whether the data source corresponding to the constructed index has been updated since it was indexed, and if so, to rebuild the index using the query plan information.
15. The query processing workload optimization system of claim 10, wherein the index metadata describing the state of the constructed index includes one or more status descriptors from a set of enabled, disabled, created, or deleted.
16. A query processing workload optimization system, comprising: A data lake configured to store datasets and indexes based on the datasets; At least one processor is configured to be coupled to the data lake; as well as At least one memory storing program code configured to be executed by the at least one processor to perform operations, the operations including: Receive workloads that include multiple queries pointing to the dataset; A candidate index set is generated based on the multiple queries; Based on the estimated performance improvement for the workload provided by the candidate index set, a predetermined number of candidate indexes are selected from the candidate index set; and The selected candidate indexes are constructed according to the index specification, and the constructed indexes are stored at a predetermined location on the data lake, the predetermined location and the index metadata included in the constructed index conforming to the index specification, the index metadata describing the lineage of the constructed index, which can be used by another query engine to determine whether the data source corresponding to the constructed index has been updated.
17. The query processing workload optimization system of claim 16, wherein the index metadata further describes one or more of the following: The content of the constructed index; or The state of index construction.
18. The query processing workload optimization system of claim 16, wherein the index metadata describing the genealogy in which the index is built includes one or more of the following: One or more identifiers, each corresponding to one or more data sources being indexed; The time when the data source was indexed; The descriptive history of the index construction; or Query plan information for the original query used to build the index.
19. The query processing workload optimization system according to claim 18, wherein, The operation further includes: determining, based on the index metadata describing the lineage of the constructed index, whether the data source corresponding to the constructed index has been updated since it was indexed, and if so, using the query plan information to rebuild the index.
20. The query processing workload optimization system of claim 17, wherein the index metadata describing the content of the constructed index includes one or more of the following: The name of the index to be constructed; The type of the index being constructed; The configuration for building the index includes the identifiers of the indexed columns and the identifiers of the included columns, as well as the type of each column; or The physical location and layout of the constructed index.