A distributed database management method, device, equipment, system and medium

By introducing a suspension and wake-up message mechanism in the distributed database, the low efficiency of SQL execution plan feedback and the concurrency problems of handle switching are solved, achieving efficient management of result feedback and retention of handle environment.

CN115062041B · Active · Publication Date: 2026-06-12 · SHANGHAI DAMENG DATABASE

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI DAMENG DATABASE
Filing Date
2022-06-08
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In distributed databases, existing technologies have low SQL execution plan feedback efficiency, and when switching statement handles within the same session, multi-threaded operations have concurrency issues, making it impossible to switch back to the original handle's execution environment.

Method used

A suspension and wake-up message mechanism is introduced between instance nodes and data storage nodes. Instance nodes receive management results from data storage nodes and generate suspension messages to suspend unfinished worker threads. At the same time, statement handles are allowed to be executed alternately within the same session, and thread states are managed through SUSPEND and RESUME messages.

Benefits of technology

It improves the efficiency of distributed database management, supports the alternation of handles, preserves the execution context of subtasks, solves concurrency issues, and enables timely feedback of management results.



Abstract

Embodiments of the present application provide a distributed database management method, device, equipment, system and medium. The method is applied to an instance node in a distributed database and comprises: receiving a first SQL statement sent by a client; determining, according to the first SQL statement, a first subtask corresponding to the first SQL statement and sending it to a data storage node; receiving a first number of first management results fed back by the data storage node and generating a first suspension message; and feeding back the first management results to the client and sending the first suspension message to the data storage node to suspend the first worker threads in the data storage node that have not yet ended. With this method, the instance node can receive a batch of management results returned by the data storage node during subtask scheduling, without waiting for the subtask to finish executing. This improves management efficiency, makes it possible to suspend the subtasks in the plan, and preserves the execution context of the subtasks.

Description

Technical Field

[0001] This invention relates to the field of computer technology, and in particular to a distributed database management method, apparatus, device, system, and medium.

Background Technology

[0002] In existing technologies, the execution plan of a Structured Query Language (SQL) statement in a distributed cluster is typically divided into multiple subtasks that are sent to data storage nodes (Backend Processor, BP) for execution. Each subtask is executed by one or more threads in a defined execution order. A common implementation relies on the subtask scheduling mechanism built into the data storage node: once scheduling and execution begin, the subtasks are always executed in the scheduled order. When a distributed transaction is committed, the BP must wait for all subtask threads to finish before responding to the instance node (SQL Processor, SP) on which the user connected and executed the SQL statement. As a result, query results are returned inefficiently.

[0003] If a client switches between statement handles within the same session, then, because of the concurrency of multi-threaded transactions, execution of the new handle on the BP side must wait until all child threads of the other handles have finished or been forcibly terminated. The execution context of the original handle therefore ends completely, and it is impossible to switch back to that handle's previous execution environment. For example, if the original handle was executing a query and a new handle is then executed, the original handle's execution has already been ended by that point, so the remaining query data of the original handle can no longer be retrieved.

Summary of the Invention

[0004] This invention provides a distributed database management method, apparatus, device, system, and medium to achieve timely feedback of management results while preserving the context of task execution.

[0005] In a first aspect, an embodiment of the present invention provides a distributed database management method, applied to an instance node in a distributed database, including:

[0006] Receive the first SQL statement sent by the client;

[0007] Based on the first SQL statement, determine the first subtask corresponding to the first SQL statement and send it to the data storage node;

[0008] Receive a first quantity of first management results from the data storage node and generate the first suspension message;

[0009] Feed back the first management results to the client and send the first suspension message to the data storage node to suspend the unfinished first worker threads in the data storage node.

[0010] In a second aspect, an embodiment of the present invention provides a distributed database management method, applied to a data storage node in a distributed database, including:

[0011] Receive the first subtask sent by the instance node;

[0012] Assign a preset number of first worker threads to each first subtask and start execution;

[0013] Send a first quantity of first management results obtained from execution to the instance node;

[0014] Receive the first suspension message sent by the instance node and suspend the first worker threads that have not yet ended.

[0015] In a third aspect, an embodiment of the present invention provides a distributed database management device, configured in an instance node of a distributed database, including:

[0016] The first statement receiving module is used to receive the first SQL statement sent by the client.

[0017] The first task determination module is used to determine the first subtask corresponding to the first SQL statement based on the first SQL statement and send it to the data storage node;

[0018] The first result receiving module is used to receive the first number of first management results fed back by the data storage node and generate a first suspension message;

[0019] The first result feedback module is used to feed back the first management result to the client and send the first suspension message to the data storage node to suspend the first unfinished worker thread in the data storage node.

[0020] In a fourth aspect, an embodiment of the present invention provides a distributed database management device, configured in a data storage node of a distributed database, including:

[0021] The first task receiving module is used to receive the first subtask sent by the instance node;

[0022] The first thread allocation module is used to allocate a preset number of first worker threads to each first subtask and start execution.

[0023] The first result sending module is used to send the first number of first management results after execution to the instance node;

[0024] The first thread suspension module is used to receive the first suspension message sent by the instance node and suspend the first working thread that has not yet ended.

[0025] In a fifth aspect, an embodiment of the present invention provides a computer device serving as an instance node and/or a data storage node in a distributed database, including:

[0026] At least one processor; and

[0027] A memory communicatively connected to the at least one processor; wherein,

[0028] The memory stores a computer program that can be executed by the at least one processor, which enables the at least one processor to perform the distributed database management method according to any embodiment of the present invention.

[0029] In a sixth aspect, an embodiment of the present invention provides a distributed database system including at least one instance node and a data storage node;

[0030] The instance node includes: a first statement receiving module, a first task determining module, a first result receiving module, and a first result feedback module; wherein...

[0031] The first statement receiving module is used to receive the first SQL statement sent by the client.

[0032] The first task determination module is used to determine the first subtask corresponding to the first SQL statement based on the first SQL statement and send it to the data storage node;

[0033] The first result receiving module is used to receive the first number of first management results fed back by the data storage node and generate a first suspension message;

[0034] The first result feedback module is used to feed back the first management result to the client and send the first suspension message to the data storage node to suspend the first unfinished worker thread in the data storage node.

[0035] The data storage node includes: a first task receiving module, a first thread allocation module, a first result sending module, and a first thread suspension module; wherein...

[0036] The first task receiving module is used to receive the first subtask sent by the instance node;

[0037] The first thread allocation module is used to allocate a preset number of first worker threads to each first subtask and start execution.

[0038] The first result sending module is used to send the first number of first management results after execution to the instance node;

[0039] The first thread suspension module is used to receive the first suspension message sent by the instance node and suspend the first working thread that has not yet ended.

[0040] In a seventh aspect, this embodiment provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the distributed database management method described in any embodiment of the present invention.

[0041] This invention provides a distributed database management method, apparatus, device, system, and medium. The method is applied to an instance node in a distributed database and includes: receiving a first SQL statement sent by a client; determining a first subtask corresponding to the first SQL statement and sending it to a data storage node; receiving a first number of first management results from the data storage node and generating a first suspension message; and sending the first management results back to the client and sending the first suspension message to the data storage node to suspend the unfinished first worker threads in the data storage node. In this technical solution, while the instance node schedules the data storage node to execute the planned subtasks, it can receive a batch of management results from the data storage node and return them to the client first. A suspension message is also added, so that the instance node can initiate the suspension of subtasks on the data storage node. In existing technologies, by contrast, the subtask scheduling mechanism of the data storage node executes subtasks strictly in the scheduled order once scheduling begins, and the data storage node must wait for all subtask threads to finish before responding to the instance node. This invention allows the instance node to receive a batch of management results during subtask scheduling, without waiting for the subtasks to be fully completed. This improves management efficiency and also enables the suspension of planned subtasks, preserving their execution context.

[0042] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of the present invention, nor is it intended to limit the scope of the invention. Other features of the invention will become readily apparent from the following description.

Attached Figure Description

[0043] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0044] Figure 1 is a flowchart of a distributed database management method provided in Embodiment 1 of the present invention;

[0045] Figure 2 is a flowchart of a distributed database management method provided in Embodiment 2 of the present invention;

[0046] Figure 3 is a flowchart of a distributed database management method provided in Embodiment 3 of the present invention;

[0047] Figure 4 is a schematic diagram of the structure of a distributed database management device provided in Embodiment 4 of the present invention;

[0048] Figure 5 is a schematic diagram of the structure of a distributed database management device provided in Embodiment 5 of the present invention;

[0049] Figure 6 is a structural block diagram of a computer device provided in Embodiment 6 of the present invention;

[0050] Figure 7 is a schematic diagram of the structure of a distributed database system provided in Embodiment 7 of the present invention.

Detailed Implementation

[0051] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0052] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0053] Embodiment 1

[0054] Figure 1 is a flowchart of a distributed database management method provided in Embodiment 1 of the present invention. This embodiment is applicable to the management of distributed databases. The method is applied to an instance node in the distributed database and can be executed by a distributed database management device, which can be implemented in hardware and/or software and can be configured in a computer device.

[0055] As shown in Figure 1, Embodiment 1 of the present invention provides a distributed database management method, the method comprising:

[0056] S110: Receive the first SQL statement sent by the client.

[0057] Structured Query Language (SQL) is a database query and programming language used to access, query, update, and manage database systems. In this embodiment, the executing entity can be considered as the instance node SP in the distributed database, which can be understood as the node where the user executes SQL statements.

[0058] From the user's perspective, when a user needs to manage the database, they can connect to the instance node using client tools or an application programming interface (API), and enter SQL statements in the interface provided by the client tools or API. For example, to perform a query operation on the database, one can connect to the corresponding instance node in the client tool and execute the SQL statement. Connecting to the corresponding instance node can be done by selecting the hostname and port. After connecting to the instance node, the user can enter SQL statements in the client's text editor and execute the SQL statements by clicking the execute button or other similar actions.

[0059] Specifically, from the perspective of the instance node, the instance node, as the executing entity in this embodiment, receives the first SQL statement sent by the client. Here, "first" simply labels this statement so that it can be distinguished from SQL statements received later, such as the second SQL statement.

[0060] S120. Based on the first SQL statement, determine the first subtask corresponding to the first SQL statement and send it to the data storage node.

[0061] In this context, the data storage node (BP) can be understood as the user data storage node. A distributed database has multiple data storage nodes, and dynamic addition and deletion of data storage nodes are supported. When an instance node receives the first SQL statement, it parses the statement to generate a plan tree, which is then divided into one or more subtasks. Here, the subtask corresponding to the first SQL statement is denoted as the first subtask. Each first subtask is then sent to a data storage node, enabling the data storage node to perform database management based on the subtask.

[0062] Furthermore, the step of determining the first subtask corresponding to the first SQL statement and sending it to the data storage node based on the first SQL statement can be described as follows:

[0063] a11) The query optimizer parses the first SQL statement to generate the first plan tree.

[0064] The query optimizer is a database engine component responsible for generating efficient execution plans for SQL statements. Specifically, the query optimizer internally optimizes user requests, generates an execution plan, and transmits it to the storage engine for data manipulation, ultimately returning the results to the user. It is one of the core components of the database management system, determining which indexes and join algorithms to use for a specific query to ensure its efficient execution.

[0065] Specifically, the query optimizer parses the first SQL statement, such as retrieving the table name to be queried and the operations to be performed, and generates a plan tree based on the parsing results. Here, the plan tree corresponding to the first SQL statement is denoted as the first plan tree.

[0066] b11) Divide the first plan tree into at least one first subtask and send the first subtask to the data storage node.

[0067] For example, if two table names are parsed out and the two table names are connected by "join", then the corresponding task can be divided into two sub-tasks.

[0068] Specifically, the first plan tree is divided into multiple subtasks. Here, the subtask corresponding to the first SQL statement is denoted as the first subtask. Then, the first subtask is sent to the data storage node so that the data storage node can perform database management according to the subtask.
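
The division described in a11) and b11) can be sketched as follows. This is a minimal illustration, not the patent's actual algorithm: the `PlanNode` class, the `split_into_subtasks` function, and the rule of one subtask per scanned table are assumptions chosen to match the two-table join example in [0067].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """One operator in a hypothetical plan tree."""
    op: str                       # e.g. "JOIN" or "SCAN"
    table: Optional[str] = None   # set only for SCAN nodes
    children: List["PlanNode"] = field(default_factory=list)


def split_into_subtasks(root: PlanNode) -> List[str]:
    """Return one subtask label per SCAN leaf, in left-to-right order."""
    if root.op == "SCAN":
        return [f"scan:{root.table}"]
    subtasks: List[str] = []
    for child in root.children:
        subtasks.extend(split_into_subtasks(child))
    return subtasks


# Assumed plan tree for a statement of the form: SELECT ... FROM t1 JOIN t2 ON ...
first_plan_tree = PlanNode(
    "JOIN",
    children=[PlanNode("SCAN", "t1"), PlanNode("SCAN", "t2")],
)
first_subtasks = split_into_subtasks(first_plan_tree)
```

With this assumed rule, a join of two tables yields two first subtasks, each of which would then be sent to a data storage node.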

[0069] S130: Receive a first quantity of first management results from the data storage node and generate the first suspension message.

[0070] In this embodiment, when a data storage node receives a subtask, the scheduler in the data storage node allocates a batch of parallel worker threads to each subtask and executes them. After execution, a batch of management results is returned to the instance node. Here, the first quantity of the first management results can be understood as returning the first batch of management results, and the value of the first quantity is not specifically limited. For example, the first management result can be a query result.

[0071] In this embodiment, during the process of the instance node scheduling the data storage node to execute planned subtasks, a SUSPEND message is added. This SUSPEND message can be sent to the data storage node to suspend the subtask currently being executed within the data storage node.

[0072] In this embodiment, one occasion on which the instance node generates a suspension message is when it receives the first batch of management results from the data storage node; the message generated at this point is denoted the first suspension message.

[0073] S140: Feed back the first management results to the client and send the first suspension message to the data storage node to suspend the unfinished first worker threads in the data storage node.

[0074] In this step, when the instance node receives the first management results from the data storage node, it sends them back to the client for the user to view and use. At the same time, the instance node sends the first suspension message to the data storage node. At this point, each subtask in the data storage node is being executed in parallel by a batch of worker threads allocated according to its degree of parallelism. When the data storage node receives the first suspension message, the first worker threads that are still running and have not yet completed are suspended.
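
Steps S110 to S140 can be traced with the single-threaded sketch below. It is an illustration under stated assumptions rather than the patented implementation: `FakeStorageNode`, `execute_first_statement`, and the message strings `"SUBTASK:..."` and `"SUSPEND"` are hypothetical names, and the data storage node is simulated in-process instead of over a network.

```python
from typing import List


class FakeStorageNode:
    """Hypothetical stand-in for a data storage node (BP)."""

    def __init__(self, rows: List[int], first_quantity: int) -> None:
        self.rows = rows
        self.first_quantity = first_quantity
        self.cursor = 0              # position kept as part of the execution context
        self.suspended = False
        self.messages: List[str] = []

    def receive(self, message: str) -> None:
        """Record a message from the instance node and act on SUSPEND."""
        self.messages.append(message)
        if message == "SUSPEND":
            self.suspended = True

    def first_batch(self) -> List[int]:
        """Return the first quantity of results without finishing the subtask."""
        batch = self.rows[self.cursor:self.cursor + self.first_quantity]
        self.cursor += len(batch)
        return batch


def execute_first_statement(sql: str, bp: FakeStorageNode) -> List[int]:
    """Walk through S110 to S140 from the instance node's point of view."""
    # S110: the first SQL statement has been received as `sql`.
    # S120: determine the first subtask and send it to the data storage node.
    bp.receive(f"SUBTASK:{sql}")
    # S130: receive the first batch of results and generate the suspension message.
    results = bp.first_batch()
    suspension_message = "SUSPEND"
    # S140: send SUSPEND to the data storage node and hand the batch to the client.
    bp.receive(suspension_message)
    return results


bp = FakeStorageNode(rows=list(range(10)), first_quantity=3)
first_results = execute_first_statement("SELECT * FROM t1", bp)
```

In this sketch the client gets three rows immediately, while the simulated node keeps its cursor at position 3, which stands in for the preserved execution context of the suspended subtask.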

[0075] This invention provides a distributed database management method applied to instance nodes in a distributed database. The method includes: receiving a first SQL statement sent by a client; determining a first subtask corresponding to the first SQL statement and sending it to a data storage node; receiving a first number of first management results from the data storage node and generating a first suspension message; sending the first management results back to the client and sending the first suspension message to the data storage node to suspend an unfinished first worker thread in the data storage node. Using this method, during the process of an instance node scheduling a data storage node to execute planned subtasks, a batch of management results from the data storage node can be received and sent back to the client first, while a suspension message is added, allowing the instance node to initiate the suspension of the subtask at the data storage node. Compared to existing technologies where the data storage node's subtask scheduling mechanism ensures that subtasks are executed in the scheduled order once scheduling begins, and the data storage node must wait for all subtask threads to finish before responding to the instance node when submitting management results, this invention allows the data storage node to receive a batch of management results during the subtask scheduling process, without waiting for the subtasks to be fully completed. This improves management efficiency and also enables the suspension of planned subtasks, preserving the execution context of the subtasks.

[0076] As a first optional embodiment of Embodiment 1 of the present invention, this first optional embodiment further defines and optimizes Embodiment 1, and the method further includes:

[0077] a12) When the second SQL statement is received from the client, the second subtask corresponding to the second SQL statement is determined, and the second subtask and the generated second suspension message are sent to the data storage node. The second suspension message is used to suspend the first working thread that has not ended in the data storage node.

[0078] In this step, the second SQL statement is used to distinguish it from the first SQL statement, indicating that the second SQL statement is a different statement from the first SQL statement. This step involves scenarios where statement handles are switched alternately within the same session.

[0079] In existing technologies, if statement handles are switched alternately within the same client session, then, because of the concurrency of multi-threaded transactions, execution of the new handle on the data storage node must wait until all child threads of the other handles have finished or been forcibly terminated. The execution context of the original handle therefore terminates completely, and it is impossible to switch back to that handle's previous execution environment.

[0080] This embodiment introduces a scheduling mechanism that allows for the alternating execution of statement handles within a session. When a new SQL statement, i.e., the second SQL statement, is received, the instance node parses the second SQL statement to determine its corresponding second subtask. Simultaneously, the instance node generates a suspension message, denoted as the second suspension message. Then, the instance node sends the second subtask and the generated second suspension message to the data storage node. The second suspension message suspends the first worker thread that has not yet finished in the data storage node. This ensures that the first worker thread corresponding to the first SQL statement in the data storage node is suspended and its execution is paused. Correspondingly, the subtask corresponding to the second SQL statement is executed, thus enabling handle switching.

[0081] b12) Receives the first number of second management results from the data storage node and sends the second management results back to the client.

[0082] Similarly, after the second SQL statement is executed, in this step, when the instance node receives the second management result sent by the data storage node, it will feed back the second management result to the client for the user to view or use.
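
The handle switch in a12) and b12) can be sketched as a state change on the data storage node. This is a simplified model under assumptions: `WorkerState`, `StorageNode`, and `switch_handle` are hypothetical names, and worker threads are represented only by their states rather than by real threads.

```python
from enum import Enum
from typing import Dict, List


class WorkerState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    FINISHED = "finished"


class StorageNode:
    """Hypothetical data storage node tracking worker states per handle."""

    def __init__(self) -> None:
        self.workers: Dict[int, List[WorkerState]] = {}

    def start_subtask(self, handle_id: int, states: List[WorkerState]) -> None:
        self.workers[handle_id] = list(states)

    def suspend(self, handle_id: int) -> int:
        """Suspend only the workers that have not ended; return how many."""
        states = self.workers[handle_id]
        count = 0
        for i, state in enumerate(states):
            if state is WorkerState.RUNNING:
                states[i] = WorkerState.SUSPENDED
                count += 1
        return count


def switch_handle(bp: StorageNode, old_handle: int, new_handle: int, degree: int) -> int:
    """Deliver the second suspension message together with the second subtask."""
    suspended = bp.suspend(old_handle)                             # second suspension message
    bp.start_subtask(new_handle, [WorkerState.RUNNING] * degree)   # second subtask
    return suspended


node = StorageNode()
node.start_subtask(1, [WorkerState.FINISHED, WorkerState.RUNNING, WorkerState.RUNNING])
suspended_count = switch_handle(node, old_handle=1, new_handle=2, degree=2)
```

The point of the model is that the workers of handle 1 are parked rather than terminated, so their state remains available when the session returns to that handle.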

[0083] As a second optional embodiment of Embodiment 1 of the present invention, this second optional embodiment further defines and optimizes Embodiment 1. After the first working thread that has not ended in the data storage node is suspended, the method further includes:

[0084] a13) When a data retrieval command associated with the first SQL statement is received from the client, a wake-up message generated based on the data retrieval command is sent to the data storage node so that the suspended first worker threads in the data storage node can continue to execute.

[0085] From the user's perspective, when a user wants to continue retrieving data, they can send a fetch command to the instance node by clicking the fetch button on the client.

[0086] From the perspective of the instance node, when it receives the data retrieval command associated with the first SQL statement from the client, it generates a wake-up message, RESUME, based on that command. The wake-up message is sent to the data storage node to wake up the suspended first worker threads, which continue execution once woken.

[0087] Understandably, users can execute desired SQL statements by clicking on the corresponding SQL statement in the client session. For example, if a user wants the result of the first SQL statement, they click to execute the first SQL statement; if they want the result of the second SQL statement, they click to execute the second SQL statement. Correspondingly, for instance nodes, the corresponding subtask and worker thread can be determined based on the handle identifier in the received execution statement. Similarly, the management results returned by data storage nodes can also be returned to the corresponding page based on the handle identifier. For example, the execution result of the first SQL statement is returned on the first SQL statement page, and the execution result of the second SQL statement is returned on the second SQL statement page.
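
Routing by handle identifier, as described above, can be sketched in a few lines. The tuple layout `(handle_id, rows)` and the function name `route_results` are assumptions made for illustration; the source does not specify how the handle identifier is carried.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple


def route_results(
    tagged_batches: Iterable[Tuple[int, List[str]]],
) -> DefaultDict[int, List[str]]:
    """Append each batch of results to the page of the handle it belongs to."""
    pages: DefaultDict[int, List[str]] = defaultdict(list)
    for handle_id, rows in tagged_batches:
        pages[handle_id].extend(rows)
    return pages


# Batches from two statement handles arrive interleaved.
pages = route_results([(1, ["a1", "a2"]), (2, ["b1"]), (1, ["a3"])])
```

Because each batch carries its handle identifier, interleaved results from the first and second SQL statements end up on their own pages in arrival order.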

[0088] b13) Receives the second number of third management results from the data storage node and sends the third management results back to the client.

[0089] It can be understood that the first quantity of first management results refers to the batch of results produced when a batch of worker threads, allocated according to the degree of parallelism, executes each subtask in parallel. The second quantity can be set to a multiple of the first quantity, that is, N batches of management results, where the value of N can be set by the user or administrator.

[0090] After the instance node sends a wake-up message to the data storage node, the suspended first worker threads in the data storage node resume execution. When the number of execution results reaches the second quantity, the data storage node returns that second quantity of management results to the instance node. These results are denoted the third management results. The instance node receives the third management results and sends them back to the client for the user to view or use.
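
The relationship between the first and second quantities can be made concrete with a small calculation. The sketch below assumes a fixed total number of result rows and a hypothetical helper `batch_sizes`; the source only states that the second quantity may be N times the first.

```python
from typing import Iterator


def batch_sizes(total_rows: int, first_quantity: int, n: int) -> Iterator[int]:
    """Yield the size of each batch returned to the instance node.

    The first batch holds `first_quantity` rows; every later batch, returned
    after a RESUME, holds up to `n * first_quantity` rows (the second quantity).
    """
    if first_quantity <= 0 or n <= 0:
        raise ValueError("first_quantity and n must be positive")
    second_quantity = n * first_quantity
    remaining = total_rows
    size = min(first_quantity, remaining)
    while size > 0:
        yield size
        remaining -= size
        size = min(second_quantity, remaining)


sizes = list(batch_sizes(total_rows=11, first_quantity=2, n=2))
```

For 11 rows with a first quantity of 2 and N = 2, the batches have sizes 2, 4, 4 and 1, so the client sees data after the first two rows instead of waiting for all eleven.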

[0091] It should be noted that in this embodiment, the management results are set as a first number of first management results and a second number of third management results. The purpose is to allow the execution results to be returned to the instance nodes in batches during process execution, and the instance nodes will then return the received management results to the client. Compared to the prior art where, during the submission of distributed transactions, the data storage node must wait for all subtask threads to finish before responding to the instance node, this execution result feedback method improves database management efficiency.

[0092] It can be understood that when there is a first SQL statement and a second SQL statement, or more SQL statements, the worker threads of the corresponding subtasks can be put to sleep or woken up, that is, paused or resumed, by the two interaction messages SUSPEND and RESUME, so that execution can switch between different SQL statements.

[0093] This optional embodiment introduces a scheduling mechanism that allows for alternating execution of statement handles within the same session. By adding a management mechanism that allows running sub-threads to be suspended and resumed, the execution context on different handles is preserved, resolving the concurrency issue of the same transaction object across different sub-threads, and thus solving the problem of alternating execution during handle switching. This invention adds two interactive messages, SUSPEND and RESUME, during the SP's scheduling of BP's execution of planned sub-tasks. The SP initiates the suspension and resumption of sub-tasks on the BP side, enabling the suspension and resumption of the planned sub-task context, thereby supporting handle switching.
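
One common way to park and wake a worker thread is a gate built on `threading.Event`. The sketch below uses that technique to mimic SUSPEND and RESUME; it is an assumption about how such a mechanism could look, not the mechanism the patent discloses, and the class `SuspendableWorker` is hypothetical.

```python
import threading
from typing import List


class SuspendableWorker:
    """Worker that produces rows one at a time and can be parked between rows."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = list(rows)
        self.produced: List[int] = []
        self._may_run = threading.Event()
        self._may_run.set()                      # workers start in the running state
        self.thread = threading.Thread(target=self._run, daemon=True)

    def suspend(self) -> None:
        """Handle a SUSPEND message: block the worker before its next row."""
        self._may_run.clear()

    def resume(self) -> None:
        """Handle a RESUME message: let the worker continue where it stopped."""
        self._may_run.set()

    def _run(self) -> None:
        for row in self.rows:
            self._may_run.wait()                 # parks here while suspended
            self.produced.append(row)


worker = SuspendableWorker([1, 2, 3])
worker.suspend()                  # SUSPEND arrives before any row is produced
worker.thread.start()
worker.thread.join(timeout=0.1)   # returns after the timeout; the worker is parked
parked = worker.thread.is_alive() and worker.produced == []
worker.resume()                   # RESUME wakes the worker
worker.thread.join()
```

The worker's loop position and partial output survive the suspension because the thread is blocked rather than terminated, which is the property the SUSPEND and RESUME messages rely on.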

[0094] Embodiment 2

[0095] Figure 2 is a flowchart of a distributed database management method provided in Embodiment 2 of the present invention. This embodiment is applicable to the management of distributed databases. The method is applied to a data storage node in the distributed database and can be executed by a distributed database management device, which can be implemented in hardware and/or software and can be configured in a computer device.

[0096] As shown in Figure 2, Embodiment 2 of the present invention provides a distributed database management method, the method comprising:

[0097] S210, Receive the first subtask sent by the instance node.

[0098] In this embodiment, the execution entity can be considered as the data storage node BP in the distributed database. The data storage node is the actual storage node for user data. A cluster has multiple data storage nodes and supports dynamic addition and deletion of data storage nodes.

[0099] When an instance node receives the first SQL statement, it parses it to generate a plan tree and divides it into one or more subtasks. Here, the subtask corresponding to the first SQL statement is denoted as the first subtask. Each first subtask is then sent to a data storage node, enabling the data storage node to perform database management based on the subtask. The data storage node receives the first subtasks sent by the instance node.

[0100] S220. Assign a preset number of first worker threads to each first subtask and start execution.

[0101] Specifically, after receiving the first subtask, the data storage node generates a corresponding handle object and a scheduler structure to manage the execution context. The scheduler then begins scheduling the subtasks, allocating a batch of worker threads to each subtask according to its degree of parallelism for concurrent execution.
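
The allocation step can be sketched as follows. This is an illustrative Python sketch only: the classes `Subtask`, `Handle`, and `Scheduler`, the `parallelism` field, and the placeholder work done by each thread are assumptions, since the original text does not define the data structures on the node that receives the subtask.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Subtask:
    name: str
    parallelism: int  # hypothetical degree of parallelism for this subtask


@dataclass
class Handle:
    handle_id: str
    results: list = field(default_factory=list)


class Scheduler:
    """Hypothetical scheduler that owns the worker threads of one handle."""

    def __init__(self, handle: Handle):
        self.handle = handle
        self.threads = []
        self._lock = threading.Lock()

    def _run(self, subtask: Subtask, worker_no: int) -> None:
        # Placeholder for real plan execution: record which worker ran.
        with self._lock:
            self.handle.results.append((subtask.name, worker_no))

    def schedule(self, subtasks) -> None:
        # One batch of worker threads per subtask, sized by its parallelism.
        for subtask in subtasks:
            for worker_no in range(subtask.parallelism):
                t = threading.Thread(target=self._run, args=(subtask, worker_no))
                self.threads.append(t)
                t.start()

    def join(self) -> None:
        for t in self.threads:
            t.join()


scheduler = Scheduler(Handle("h1"))
scheduler.schedule([Subtask("scan", 3), Subtask("aggregate", 1)])
scheduler.join()
```

Keeping the thread list on the scheduler rather than on the subtask means the scheduler is the single place that later has to find and suspend the threads that have not yet finished.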

[0102] S230, Send the first management result of the first quantity after execution to the instance node.

[0103] Specifically, for each first subtask, a batch of worker threads are allocated according to the degree of parallelism to begin parallel execution. After execution, a batch of management results is returned to the instance node. This execution result is recorded as the first management result.

[0104] S240, Receive the first suspend message sent by the instance node and suspend the first unfinished worker thread.

[0105] In this embodiment, after the data storage node executes, it returns a batch of query results to the instance node. After receiving this batch of data, the instance node returns it to the user and sends a suspension message to notify the data storage node to temporarily suspend the task executed by the handle and stop execution.

[0106] Specifically, after receiving the first suspend message from the instance node, the data storage node checks and counts the worker threads that have not yet finished on the handle, and suspends these worker threads. Simultaneously, it saves the execution context of the current handle through the scheduler. The execution context of the handle can be understood as the handle's execution result, the number of subtask threads, etc.

[0107] For example, the steps to suspend a worker thread can be described as follows: the scheduler checks and counts the worker threads on the handle that have not yet finished. While running, these worker threads periodically check whether they need to be suspended. Once a worker thread detects the suspension flag, it waits on the event.
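
The periodic check and the wait on an event can be illustrated with a cooperative suspension point. In this minimal Python sketch, a single `threading.Event` plays both roles (cleared means the suspension flag is raised, set means the worker may run); the names `SuspendControl`, `checkpoint`, and `worker` are hypothetical, and the original text does not state how the flag or the event is implemented.

```python
import threading


class SuspendControl:
    """Hypothetical suspension flag plus the event that workers wait on."""

    def __init__(self):
        self._resume_event = threading.Event()
        self._resume_event.set()          # set means "not suspended"

    @property
    def suspended(self) -> bool:
        return not self._resume_event.is_set()

    def suspend(self) -> None:
        self._resume_event.clear()        # raise the suspension flag

    def resume(self) -> None:
        self._resume_event.set()          # wake every waiting worker

    def checkpoint(self) -> None:
        # Called periodically by workers; blocks while the flag is raised.
        self._resume_event.wait()


def worker(control: SuspendControl, rows, out: list) -> None:
    for row in rows:
        control.checkpoint()              # periodic suspension check
        out.append(row)


control = SuspendControl()
control.suspend()                          # flag raised before the worker starts
out = []
t = threading.Thread(target=worker, args=(control, [1, 2, 3], out))
t.start()
produced_while_suspended = list(out)       # worker is waiting at its first checkpoint
control.resume()
t.join()
```

The worker is never interrupted in the middle of producing a row; it stops only at a checkpoint, so its local state is intact when it is woken.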

[0108] This invention provides a distributed database management method applied to data storage nodes in a distributed database. The method includes: receiving a first subtask sent by an instance node; allocating a preset number of first worker threads to each first subtask and starting execution; sending a first number of first management results obtained from execution to the instance node; and receiving a first suspension message sent by the instance node and suspending any unfinished first worker threads. With this method, a portion of the execution results is sent to the instance node first, and a suspension message is introduced so that the data storage node suspends any unfinished first worker threads when it receives the first suspension message. In the existing technology, once scheduling begins, the subtask scheduling mechanism on the data storage node side always runs the subtasks to completion in scheduling order, and the data storage node must wait for all subtask threads to finish before responding to the instance node with management results. By contrast, this invention returns a batch of management results to the instance node during subtask scheduling, without waiting for the subtasks to be fully executed, which improves management efficiency. It also allows planned subtasks to be suspended while preserving their execution context.

[0109] As a first optional embodiment of Embodiment 2 of the present invention, this first optional embodiment further defines and optimizes Embodiment 2, and the method further includes:

[0110] a21) Receive the second subtask sent by the instance node.

[0111] In this step, the second subtask corresponds to the second SQL statement, which is used to distinguish it from the first SQL statement, indicating that the second SQL statement is a different statement from the first SQL statement. This step involves scenarios where statement handles are switched alternately within the same session.

[0112] In existing technologies, if statement handles are switched alternately within the same client session, due to the concurrency of multi-threaded transactions, the execution of the new handle on the data storage node will inevitably wait for all child threads on other handles to finish or be forcibly terminated. This means the execution context of the original handle being switched to will completely terminate, and it will be impossible to switch back to the previous execution environment of that handle.

[0113] This embodiment introduces a scheduling mechanism that allows for the alternating execution of statement handles within a session. When a new SQL statement, i.e., the second SQL statement, is received, the instance node will parse the second SQL statement to determine the corresponding second subtask. Then, the instance node sends the second subtask to the data storage node, which receives the second subtask sent by the instance node.

[0114] b21) Assign a preset number of second worker threads to each second subtask and start execution.

[0115] Specifically, after receiving the second subtask, the data storage node generates a corresponding handle object and a scheduler structure to manage the execution context. The scheduler then begins scheduling the subtasks, allocating a batch of worker threads to each second subtask according to its degree of parallelism for concurrent execution.

[0116] c21) Send the first number of second management results after execution to the instance node.

[0117] Specifically, for each second subtask, a batch of worker threads are allocated according to the degree of parallelism to begin parallel execution. After execution, a batch of management results is returned to the instance node. This execution result is denoted as the second management result.

[0118] d21) Receive the second suspend message sent by the instance node and suspend the second worker thread that has not yet ended.

[0119] Specifically, the instance node generates a suspension message, denoted as the second suspension message. Then, the instance node sends the second subtask and the generated second suspension message to the data storage node. The second suspension message suspends the first worker thread that has not yet finished in the data storage node. This ensures that the first worker thread corresponding to the first SQL statement in the data storage node is suspended and its execution is paused. Correspondingly, the subtask corresponding to the second SQL statement is executed, thus enabling handle switching. The data storage node receives the second suspension message from the instance node and suspends the second worker thread that has not yet finished.
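
The instance-node side of handle switching can be sketched as a small message generator. This Python sketch is illustrative only: the `InstanceSession` class, its `outbox` list, the tuple message format, and the choice to queue the suspension message before the new subtask are assumptions; the original text only states that the second subtask and the second suspension message are both sent to the data storage node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InstanceSession:
    """Hypothetical instance-node view of one client session."""
    active_handle: Optional[str] = None
    outbox: list = field(default_factory=list)   # messages bound for the data storage node

    def execute(self, handle_id: str, subtask: str) -> None:
        # A statement on a different handle suspends the handle that was running.
        if self.active_handle is not None and self.active_handle != handle_id:
            self.outbox.append(("SUSPEND", self.active_handle))
        self.outbox.append(("SUBTASK", handle_id, subtask))
        self.active_handle = handle_id

    def fetch(self, handle_id: str) -> None:
        # A fetch command on a handle becomes a wake-up message for that handle.
        self.outbox.append(("RESUME", handle_id))
        self.active_handle = handle_id


session = InstanceSession()
session.execute("h1", "subtask of first SQL statement")
session.execute("h2", "subtask of second SQL statement")
session.fetch("h1")
kinds = [msg[0] for msg in session.outbox]
```

In this sketch the session never sends a message that terminates the first handle, so a later fetch on that handle can still be answered from its preserved context.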

[0120] As a second optional embodiment of Embodiment 2 of the present invention, this second optional embodiment further limits and optimizes Embodiment 2. After suspending the unfinished first working thread, the method further includes:

[0121] a22) When a wake-up message associated with the first SQL statement sent by the instance node is received, the first worker thread that has been suspended in the first handle object associated with the wake-up message will be woken up to continue execution.

[0122] From the user's perspective, when a user wants to continue retrieving data, they can send a fetch command to the instance node by clicking the fetch button on the client. From the instance node's perspective, upon receiving the fetch command associated with the first SQL statement sent by the client, the instance node generates a wake-up message RESUME based on the fetch command. The wake-up message is sent to the data storage node to wake up the suspended first worker thread in the data storage node, allowing it to continue execution. The wake-up message can also wake up all currently sleeping execution threads on the current handle, allowing them to continue execution and send the retrieved data back to the instance node.

[0123] From the perspective of the data storage node, the data storage node receives the wake-up message associated with the first SQL statement sent by the instance node, and wakes up the first worker thread that has been suspended in the first handle object associated with the wake-up message to continue execution.

[0124] b22) Send the third management result after execution to the instance node.

[0125] When a data storage node receives a wake-up message, the first suspended worker thread in the data storage node resumes execution. When the number of execution results reaches a second threshold, the data storage node returns the second number of management results to the instance node. This management result is recorded as the third management result.

[0126] It should be noted that in this embodiment, the management results are set as a first number of first management results and a second number of third management results. The purpose is to allow the execution results to be returned to the instance nodes in batches during process execution, and the instance nodes will then return the received management results to the client. Compared to the prior art where, during the submission of distributed transactions, the data storage node must wait for all subtask threads to finish before responding to the instance node, this execution result feedback method improves database management efficiency.
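
The batched return of a first number of results followed by a second number of results can be illustrated in a single thread. In this Python sketch a generator stands in for the suspended worker threads, because a generator also keeps its position between calls; this is a simplification and a substituted technique, not the thread-based mechanism of the embodiment. The names `execute_subtask` and `next_batch` and the values of `FIRST_NUMBER` and `SECOND_NUMBER` are hypothetical.

```python
from itertools import islice


def execute_subtask(rows):
    """Stand-in for a subtask: yields one result row at a time."""
    for row in rows:
        yield row


def next_batch(cursor, size):
    """Pull up to `size` rows; the generator keeps its position in between."""
    return list(islice(cursor, size))


FIRST_NUMBER = 3    # hypothetical size of the first batch
SECOND_NUMBER = 2   # hypothetical size of each later batch

cursor = execute_subtask(range(10))
first_results = next_batch(cursor, FIRST_NUMBER)    # returned first, then SUSPEND
third_results = next_batch(cursor, SECOND_NUMBER)   # returned after RESUME on fetch
```

The client sees `first_results` without waiting for all ten rows, and the later batch continues exactly where the first one stopped.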

[0127] This optional embodiment introduces a scheduling mechanism that allows for alternating execution of statement handles within the same session. By adding a management mechanism that allows running sub-threads to be suspended and resumed, the execution context on different handles is preserved, thus solving the concurrency problem of the same transaction object between different sub-threads and resolving the issue of alternating execution when handles are switched.

[0128] Example 3

[0129] Figure 3 is a flowchart of a distributed database management method provided in Embodiment 3 of the present invention. This embodiment is a further optimization of Embodiment 2 above. In this embodiment, the limitation of "allocating a preset number of first worker threads to each subtask and starting execution" is further optimized, and the limitation of "receiving the suspension message sent by the instance node and suspending the unfinished first worker thread" is also optimized.

[0130] As shown in Figure 3, Embodiment 3 provides a distributed database management method, which specifically includes the following steps:

[0131] S310, Receive the first subtask sent by the instance node.

[0132] S320. Based on the first subtask, generate the corresponding first handle object and first scheduler. The first scheduler is used to manage the execution context of the first handle object.

[0133] Specifically, after receiving the first subtask, the data storage node generates a corresponding handle object and a scheduler structure to manage the execution context of the handle execution.

[0134] S330: Based on the first scheduler, a preset number of first worker threads are allocated to each subtask and execution begins.

[0135] Specifically, the data storage node scheduling manager starts scheduling subtasks, allocating a batch of parallel worker threads to each subtask according to the degree of parallelism, and the worker thread corresponding to the handle is recorded as the first worker thread.

[0136] S340: Send the first management result of the first quantity after execution to the instance node.

[0137] Specifically, a batch of execution results are returned to the instance node.

[0138] S350. When the first suspend message sent by the instance node is received, a suspend flag is generated on the first scheduler.

[0139] Specifically, when the first suspension message is received from the instance node, a suspension flag is generated on the first scheduler for other worker threads to identify.

[0140] S360, Determine the first unfinished worker thread on the first handle object based on the scheduler manager.

[0141] Specifically, the scheduler will check and count the worker threads that have not yet finished on the handle.

[0142] S370. When the first unfinished worker thread detects the suspension flag, the first unfinished worker thread is suspended, and the execution context of the first handle object is saved based on the first scheduler.

[0143] Specifically, the scheduler will check and count the child threads that have not yet finished on the handle. These child threads will periodically check whether they need to be suspended during their operation. Once a suspension flag is detected, they will wait on the event. After suspension, the scheduler saves the execution context of the current handle and stops execution. This operation does not require returning a response message to the instance node.
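
Steps S350 to S370 can be sketched with a condition variable. This is a minimal Python sketch under stated assumptions: the class `HandleScheduler`, its fields, the round-robin split of rows among workers, and the dictionary used as the saved context are hypothetical. The original text names only the suspension flag, the count of unfinished threads, the wait on an event, and a context that includes the handle execution result and the number of subtask threads.

```python
import threading


class HandleScheduler:
    """Hypothetical per-handle scheduler following steps S350 to S370."""

    def __init__(self):
        self._cond = threading.Condition()
        self.suspend_flag = False
        self.unfinished = 0       # worker threads that have not ended
        self.parked = 0           # worker threads currently waiting
        self.results = []
        self.saved_context = None
        self._threads = []

    def start(self, rows, workers: int) -> None:
        with self._cond:
            self.unfinished = workers
        for i in range(workers):
            t = threading.Thread(target=self._work, args=(rows[i::workers],))
            self._threads.append(t)
            t.start()

    def _work(self, rows) -> None:
        for row in rows:
            with self._cond:
                if self.suspend_flag:             # periodic check of the flag
                    self.parked += 1
                    self._cond.notify_all()
                    while self.suspend_flag:      # wait until woken
                        self._cond.wait()
                    self.parked -= 1
                self.results.append(row)
        with self._cond:
            self.unfinished -= 1
            self._cond.notify_all()

    def on_suspend(self) -> dict:
        with self._cond:
            self.suspend_flag = True              # S350: generate the flag
            while self.parked < self.unfinished:  # S360: track unfinished workers
                self._cond.wait()
            # S370: save the context; no reply is sent to the instance node.
            self.saved_context = {
                "results": list(self.results),
                "subtask_threads": self.unfinished,
            }
            return self.saved_context

    def on_resume(self) -> None:
        with self._cond:
            self.suspend_flag = False
            self._cond.notify_all()

    def join(self) -> None:
        for t in self._threads:
            t.join()


scheduler = HandleScheduler()
scheduler.start(list(range(8)), workers=3)
context = scheduler.on_suspend()
frozen = list(scheduler.results)   # unchanged while every unfinished worker waits
scheduler.on_resume()
scheduler.join()
```

How many rows are produced before the suspension takes effect depends on thread timing, so the sketch makes no claim about the size of the saved result set, only that it stops changing until the handle is resumed.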

[0144] Understandably, if a handle on a data storage node completes its execution normally, the scheduler containing that handle will automatically release the worker thread that executed the subtask.

[0145] This optional embodiment specifies the steps of allocating a preset number of first worker threads to each subtask and starting execution, as well as the steps of receiving a suspension message sent by the instance node and suspending the unfinished first worker threads. During the subtask scheduling process, a batch of management results returned by the data storage node can be received, without waiting for the subtasks to be fully executed, thus improving management efficiency. It also enables the suspension and wake-up of planned subtasks, preserving the execution context of the subtasks through the scheduling manager and facilitating handle switching.

[0146] Example 4

[0147] Figure 4 is a schematic diagram of a distributed database management device provided in Embodiment 4 of the present invention. It is applicable to the management of distributed databases. The device is configured in the instance nodes of the distributed database. The device can be implemented in hardware and/or software and is generally integrated into a computer device. As shown in Figure 4, the device includes: a first statement receiving module 41, a first task determination module 42, a first result receiving module 43, and a first result feedback module 44, wherein:

[0148] The first statement receiving module 41 is used to receive the first SQL statement sent by the client.

[0149] The first task determination module 42 is used to determine the first subtask corresponding to the first SQL statement based on the first SQL statement and send it to the data storage node;

[0150] The first result receiving module 43 is used to receive the first number of first management results fed back by the data storage node and generate a first suspension message;

[0151] The first result feedback module 44 is used to feed back the first management result to the client and send the first suspension message to the data storage node to suspend the first unfinished worker thread in the data storage node.

[0152] Optionally, the first task determination module 42 is specifically used for:

[0153] The query optimizer parses the first SQL statement and generates the first plan tree.

[0154] The first plan tree is divided into at least one first subtask, and the first subtask is sent to the data storage node.

[0155] Optionally, the device may also include:

[0156] The second task sending module is used to determine the second subtask corresponding to the second SQL statement when it receives the second SQL statement sent by the client, and send the second subtask and the generated second suspension message to the data storage node. The second suspension message is used to suspend the first working thread that has not ended in the data storage node.

[0157] The second result feedback module is used to receive a first number of second management results from the data storage nodes and then feed the second management results back to the client.

[0158] Optionally, the device may also include:

[0159] The wake-up message sending module is used to send a wake-up message generated based on the data retrieval command to the data storage node when it receives the data retrieval command associated with the first SQL statement sent by the client, so that the first worker thread that has been suspended in the data storage node can continue to execute.

[0160] The third result feedback module is used to receive the second number of third management results from the data storage nodes and then feed the third management results back to the client.

[0161] The distributed database management device provided in this embodiment of the invention can execute the distributed database management method provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the method execution.

[0162] Example 5

[0163] Figure 5 is a schematic diagram of a distributed database management device provided in Embodiment 5 of the present invention. It is applicable to the management of distributed databases. The device is configured in the data storage nodes of the distributed database. The device can be implemented in hardware and/or software and is generally integrated into a computer device. As shown in Figure 5, the device includes: a first task receiving module 51, a first thread allocation module 52, a first result sending module 53, and a first thread suspension module 54, wherein:

[0164] The first task receiving module 51 is used to receive the first subtask sent by the instance node;

[0165] The first thread allocation module 52 is used to allocate a preset number of first worker threads to each first subtask and start execution;

[0166] The first result sending module 53 is used to send the first number of first management results after execution to the instance node;

[0167] The first thread suspension module 54 is used to receive the first suspension message sent by the instance node and suspend the first worker thread that has not yet ended.

[0168] Optionally, the first thread allocation module 52 is specifically used for:

[0169] Based on the first subtask, a corresponding first handle object and a first scheduler are generated. The first scheduler is used to manage the execution context of the first handle object.

[0170] The first scheduler allocates a preset number of first worker threads to each subtask and begins execution.

[0171] Optionally, the first thread suspension module 54 is specifically used for:

[0172] When the first suspend message is received from the instance node, a suspend flag is generated on the first scheduler.

[0173] The scheduler determines the first unfinished worker thread on the first handle object;

[0174] When the first unfinished worker thread detects the suspension flag, the first unfinished worker thread is suspended, and the execution context of the first handle object is saved based on the first scheduler.

[0175] Optionally, the device may also include:

[0176] The second task receiving module is used to receive the second subtask sent by the instance node;

[0177] The second thread allocation module is used to allocate a preset number of second worker threads to each second subtask and start execution.

[0178] The second result sending module is used to send the first number of second management results after execution to the instance nodes;

[0179] The second thread suspension module is used to receive the second suspension message sent by the instance node and suspend the second worker thread that has not yet ended.

[0180] Optionally, the device may also include:

[0181] The wake-up module is used to wake up the first worker thread that has been suspended in the first handle object associated with the wake-up message when it receives the wake-up message associated with the first SQL statement sent by the instance node and continue execution.

[0182] The third result sending module is used to send the third management result after execution to the instance node.

[0183] The distributed database management device provided in this embodiment of the invention can execute the distributed database management method provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the method execution.

[0184] Example 6

[0185] Figure 6 is a structural block diagram of a computer device provided in Embodiment 6 of the present invention. This computer device serves as an instance node and/or data storage node in a distributed database. As shown in Figure 6, the computer device includes a processor 61, a memory 62, an input device 63, and an output device 64. The number of processors 61 in the computer device can be one or more; Figure 6 takes one processor 61 as an example. The processor 61, memory 62, input device 63, and output device 64 in the computer device can be connected via a bus or by other means; Figure 6 takes connection via a bus as an example.

[0186] The memory 62, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the modules corresponding to the distributed database management method in this embodiment of the invention (e.g., the first statement receiving module 41, the first task determination module 42, the first result receiving module 43, and the first result feedback module 44 in the distributed database management device). The processor 61 executes the various functional applications and data processing of the computer device by running the software programs, instructions, and modules stored in the memory 62, thereby implementing the aforementioned distributed database management method.

[0187] The memory 62 may primarily include a program storage area and a data storage area. The program storage area may store the operating system and at least one application program required for a given function; the data storage area may store data created based on terminal usage. Furthermore, the memory 62 may include high-speed random access memory and non-volatile memory, such as at least one disk storage device, flash memory, or other non-volatile solid-state storage device. In some instances, the memory 62 may further include memory remotely located relative to the processor 61, which can be connected to the computer device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0188] Input device 63 can be used to receive input digital or character information, and to generate key signal inputs related to user settings and function control of the computer device. Output device 64 may include display devices such as a display screen.

[0189] Example 7

[0190] Figure 7 is a schematic diagram of the structure of a distributed database system provided in Embodiment 7 of the present invention. As shown in Figure 7, the system includes at least one instance node 4 and a data storage node 5;

[0191] Instance node 4 includes: a first statement receiving module 41, a first task determination module 42, a first result receiving module 43, and a first result feedback module 44; wherein,

[0192] The first statement receiving module 41 is used to receive the first SQL statement sent by the client.

[0193] The first task determination module 42 is used to determine the first subtask corresponding to the first SQL statement based on the first SQL statement and send it to the data storage node;

[0194] The first result receiving module 43 is used to receive the first number of first management results fed back by the data storage node and generate a first suspension message;

[0195] The first result feedback module 44 is used to feed back the first management result to the client and send the first suspension message to the data storage node to suspend the first unfinished worker thread in the data storage node.

[0196] Data storage node 5 includes: a first task receiving module 51, a first thread allocation module 52, a first result sending module 53, and a first thread suspension module 54; wherein,

[0197] The first task receiving module 51 is used to receive the first subtask sent by the instance node;

[0198] The first thread allocation module 52 is used to allocate a preset number of first worker threads to each first subtask and start execution;

[0199] The first result sending module 53 is used to send the first number of first management results after execution to the instance node;

[0200] The first thread suspension module 54 is used to receive the first suspension message sent by the instance node and suspend the first worker thread that has not yet ended.

[0201] The distributed database management system provided in this embodiment of the invention can execute the distributed database management method provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

[0202] Example 8

[0203] Embodiment 8 of the present invention also provides a storage medium containing computer-executable instructions, which, when executed by a computer processor, are used to perform a distributed database management method, the method comprising:

[0204] Receive the first SQL statement sent by the client;

[0205] Based on the first SQL statement, determine the first subtask corresponding to the first SQL statement and send it to the data storage node;

[0206] Receive the first management result of the first quantity from the data storage node and generate the first suspension message;

[0207] The first management result is fed back to the client, and the first suspension message is sent to the data storage node to suspend the first unfinished worker thread in the data storage node.

[0208] or,

[0209] Receive the first subtask sent by the instance node;

[0210] A preset number of first worker threads are assigned to each of the first subtasks and execution begins.

[0211] The first management result of the first quantity after execution will be sent to the instance node;

[0212] Receive the first suspend message sent by the instance node and suspend the first worker thread that has not yet ended.

[0213] Of course, the computer-executable instructions provided in the embodiments of the present invention are not limited to the method operations described above, but can also perform related operations in the distributed database management method provided in any embodiment of the present invention.

[0214] Based on the above description of the implementation methods, those skilled in the art can clearly understand that the present invention can be implemented using software and necessary general-purpose hardware, and of course, it can also be implemented using hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory, hard disk, or optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0215] It is worth noting that in the embodiments of the above-mentioned distributed database management device, the various units and modules included are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, the specific names of each functional unit are only for easy distinction between each other and are not used to limit the scope of protection of the present invention.

[0216] Note that the above description is merely a preferred embodiment of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in detail through the above embodiments, the present invention is not limited to the above embodiments, and may include many other equivalent embodiments without departing from the concept of the present invention, the scope of which is determined by the scope of the appended claims.

Claims

1. A distributed database management method, characterized in that it is applied to an instance node in a distributed database and comprises: receiving a first SQL statement sent by a client; determining, based on the first SQL statement, a first subtask corresponding to the first SQL statement and sending it to a data storage node; receiving a first number of first management results fed back by the data storage node, and generating a first suspension message when the first batch of management results is received, wherein the first number of first management results is the first batch of management results; feeding the first batch of management results back to the client and sending the first suspension message to the data storage node to suspend the first worker thread that has not yet ended on the first handle object in the data storage node, wherein the execution context of the first worker thread is saved, the context including the handle execution result and the number of subtask threads; the method further comprising: when a second SQL statement sent by the client is received, determining a second subtask corresponding to the second SQL statement, and sending the second subtask and a generated second suspension message to the data storage node, wherein the second suspension message is used to suspend the first worker thread that has not ended in the data storage node; and receiving a first number of second management results from the data storage node, and feeding the first number of second management results back to the client.

2. The method according to claim 1, characterized in that, The step of determining the first subtask corresponding to the first SQL statement and sending it to the data storage node based on the first SQL statement includes: The query optimizer parses the first SQL statement to generate a first plan tree; The first plan tree is divided into at least one first subtask, and the first subtask is sent to the data storage node.

3. The method according to claim 1, characterized in that, After the first worker thread that has not yet ended on the first handle object in the data storage node is suspended, the process further includes: When a data retrieval command associated with the first SQL statement sent by the client is received, a wake-up message generated based on the data retrieval command is sent to the data storage node so that the first worker thread that has been suspended in the data storage node can continue to execute. Receive a second number of third management results from the data storage node, and then send the second number of third management results back to the client.

4. A distributed database management method, characterized in that it is applied to a data storage node in a distributed database and comprises: receiving a first subtask sent by an instance node; allocating a preset number of first worker threads to each first subtask and starting execution; sending a first number of first management results obtained from the execution to the instance node; receiving a first suspension message sent by the instance node and suspending the first worker thread that has not yet ended; wherein the step of receiving the first suspension message sent by the instance node and suspending the first worker thread that has not yet ended comprises: upon receiving the first suspension message from the instance node, generating a suspension flag on a first scheduler, and determining, based on the first scheduler, the first worker thread that has not yet ended on a first handle object; and when the first worker thread that has not yet ended detects the suspension flag, suspending that first worker thread and saving, based on the first scheduler, an execution context of the first handle object, the execution context including a handle execution result and a number of subtask threads; the method further comprising: receiving a second subtask sent by the instance node; allocating a preset number of second worker threads to each second subtask and starting execution; sending a first number of second management results obtained from the execution to the instance node; and receiving a second suspension message sent by the instance node and suspending the second worker thread that has not yet ended.
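
The suspension-flag mechanism of claim 4 can be sketched with Python's standard `threading` module. This is a hedged illustration under stated assumptions: `Scheduler`, `on_suspend_message`, and the context keys are hypothetical names; squaring an integer stands in for real subtask work; and "number of subtask threads" is interpreted here as the count of worker threads assigned to the subtask.

```python
import threading


class Scheduler:
    """Manages worker threads and the saved context of one handle object."""

    def __init__(self, work_items, num_threads):
        self.pending = list(work_items)
        self.results = []
        self.num_threads = num_threads
        self.cond = threading.Condition()
        self.suspend_flag = False
        self.parked = 0      # unfinished workers that have seen the flag
        self.finished = 0    # workers that ran out of work and ended
        self.context = None

    def start(self):
        for _ in range(self.num_threads):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            with self.cond:
                if self.suspend_flag:            # worker detects the flag
                    self.parked += 1
                    self.cond.notify_all()
                    while self.suspend_flag:
                        self.cond.wait()         # suspended at a safe point
                    self.parked -= 1
                if not self.pending:
                    self.finished += 1
                    self.cond.notify_all()
                    return
                item = self.pending.pop(0)
            value = item * item                  # placeholder for real work
            with self.cond:
                self.results.append(value)

    def on_suspend_message(self):
        with self.cond:
            self.suspend_flag = True             # flag set on the scheduler
            # Wait until every worker has either parked or already ended.
            self.cond.wait_for(
                lambda: self.parked + self.finished == self.num_threads)
            self.context = {
                "handle_result": list(self.results),
                "subtask_threads": self.num_threads,
            }
        return self.context


scheduler = Scheduler(range(1000), num_threads=4)
scheduler.start()
context = scheduler.on_suspend_message()
```

Workers check the flag only between work items, so every item that was taken from `pending` has its result recorded before the context is saved. How many items complete before the flag is seen depends on thread timing.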

5. The method according to claim 4, characterized in that the step of allocating a preset number of first worker threads to each first subtask and starting execution comprises: generating, based on the first subtask, a corresponding first handle object and a first scheduler, wherein the first scheduler is used to manage the execution context of the first handle object; and allocating, by the first scheduler, the preset number of first worker threads to each first subtask and starting execution.

6. The method according to claim 4, characterized in that, after suspending the first worker thread that has not yet ended, the method further comprises: upon receiving, from the instance node, a wake-up message associated with the first SQL statement, waking up the suspended first worker thread in the first handle object associated with the wake-up message so that it continues to execute; and sending a second number of third management results obtained from the execution to the instance node.
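
The wake-up path of claim 6 can be illustrated as follows. This Python sketch assumes a worker that is already suspended and keeps its position as preserved context; `SuspendedWorker`, `StorageNode`, `on_wake_up`, and `BATCH_END` are hypothetical names, and the `count` argument plays the role of the "second number" of results requested by the wake-up message.

```python
import queue
import threading

BATCH_END = object()   # marks the end of one batch on the output queue


class SuspendedWorker(threading.Thread):
    """Worker thread that stays suspended until a wake-up arrives."""

    def __init__(self, rows):
        super().__init__(daemon=True)
        self.rows = rows
        self.position = 0            # preserved context: index of next row
        self.quota = 0               # rows to produce after the next wake-up
        self.wake_event = threading.Event()
        self.out = queue.Queue()

    def run(self):
        while self.position < len(self.rows):
            self.wake_event.wait()   # suspended here until woken
            self.wake_event.clear()
            end = min(self.position + self.quota, len(self.rows))
            for row in self.rows[self.position:end]:
                self.out.put(row)
            self.position = end
            self.out.put(BATCH_END)  # then loop back and suspend again


class StorageNode:
    def __init__(self):
        self.handles = {}            # handle id -> suspended worker

    def register(self, handle, worker):
        self.handles[handle] = worker
        worker.start()

    def on_wake_up(self, handle, count):
        worker = self.handles[handle]   # worker tied to the wake-up message
        if worker.position >= len(worker.rows):
            return []                   # nothing left to produce
        worker.quota = count
        worker.wake_event.set()
        # Collect rows until the worker signals the end of this batch.
        return list(iter(worker.out.get, BATCH_END))


node = StorageNode()
node.register("h1", SuspendedWorker(list(range(7))))
batch_a = node.on_wake_up("h1", 3)
batch_b = node.on_wake_up("h1", 3)
batch_c = node.on_wake_up("h1", 3)
batch_d = node.on_wake_up("h1", 3)
```

Because the worker resumes from its saved `position`, each wake-up continues where the previous batch stopped instead of restarting the subtask.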

7. A distributed database management device, characterized in that it is configured in an instance node of a distributed database and comprises: a first statement receiving module, used to receive a first SQL statement sent by a client; a first task determination module, used to determine, based on the first SQL statement, a first subtask corresponding to the first SQL statement and send the first subtask to a data storage node; a first result receiving module, used to receive a first number of first management results fed back by the data storage node, and to generate a first suspension message when a first batch of management results is received, wherein the first number of first management results constitutes the first batch of management results; a first result feedback module, used to feed back the first batch of management results to the client and send the first suspension message to the data storage node, so as to suspend a first worker thread that has not yet ended on a first handle object in the data storage node and to save an execution context of the first worker thread, the execution context including a handle execution result and a number of subtask threads; the device further comprising: a second task sending module, used to determine, upon receiving a second SQL statement sent by the client, a second subtask corresponding to the second SQL statement, and to send the second subtask and a generated second suspension message to the data storage node, wherein the second suspension message is used to suspend the first worker thread that has not yet ended in the data storage node; and a second result feedback module, used to receive a first number of second management results fed back by the data storage node and feed back the first number of second management results to the client.

8. A distributed database management device, characterized in that it is configured in a data storage node of a distributed database and comprises: a first task receiving module, used to receive a first subtask sent by an instance node; a first thread allocation module, used to allocate a preset number of first worker threads to each first subtask and start execution; a first result sending module, used to send a first number of first management results obtained from the execution to the instance node; a first thread suspension module, used to receive a first suspension message sent by the instance node and suspend the first worker thread that has not yet ended; wherein the first thread suspension module is specifically used to: upon receiving the first suspension message from the instance node, generate a suspension flag on a first scheduler, and determine, based on the first scheduler, the first worker thread that has not yet ended on a first handle object; and when the first worker thread that has not yet ended detects the suspension flag, suspend that first worker thread and save, based on the first scheduler, an execution context of the first handle object, the execution context including a handle execution result and a number of subtask threads; the device further comprising: a second task receiving module, used to receive a second subtask sent by the instance node; a second thread allocation module, used to allocate a preset number of second worker threads to each second subtask and start execution; a second result sending module, used to send a first number of second management results obtained from the execution to the instance node; and a second thread suspension module, used to receive a second suspension message sent by the instance node and suspend the second worker thread that has not yet ended.

9. A computer device, characterized in that it serves as an instance node and/or a data storage node in a distributed database and comprises: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, enables the at least one processor to perform the distributed database management method according to any one of claims 1-6.

10. A distributed database system, characterized in that the system comprises at least one instance node and a data storage node; the instance node comprises a first statement receiving module, a first task determination module, a first result receiving module, and a first result feedback module, wherein: the first statement receiving module is used to receive a first SQL statement sent by a client; the first task determination module is used to determine, based on the first SQL statement, a first subtask corresponding to the first SQL statement and send the first subtask to the data storage node; the first result receiving module is used to receive a first number of first management results fed back by the data storage node, and to generate a first suspension message when a first batch of management results is received, wherein the first number of first management results constitutes the first batch of management results; the first result feedback module is used to feed back the first batch of management results to the client and send the first suspension message to the data storage node, so as to suspend a first worker thread that has not yet ended on a first handle object in the data storage node and to save an execution context of the first worker thread, the execution context including a handle execution result and a number of subtask threads; the instance node is further configured to, upon receiving a second SQL statement sent by the client, determine a second subtask corresponding to the second SQL statement, and send the second subtask and a generated second suspension message to the data storage node, wherein the second suspension message is used to suspend the first worker thread that has not yet ended in the data storage node, and to receive a first number of second management results fed back by the data storage node and feed back the first number of second management results to the client; the data storage node comprises a first task receiving module, a first thread allocation module, a first result sending module, and a first thread suspension module, wherein: the first task receiving module is used to receive the first subtask sent by the instance node; the first thread allocation module is used to allocate a preset number of first worker threads to each first subtask and start execution; the first result sending module is used to send the first number of first management results obtained from the execution to the instance node; the first thread suspension module is used to receive the first suspension message sent by the instance node and suspend the first worker thread that has not yet ended; the first thread suspension module is specifically used to: upon receiving the first suspension message from the instance node, generate a suspension flag on a first scheduler, and determine, based on the first scheduler, the first worker thread that has not yet ended on the first handle object; and when the first worker thread that has not yet ended detects the suspension flag, suspend that first worker thread and save, based on the first scheduler, the execution context of the first handle object; the data storage node is further used to receive the second subtask sent by the instance node, allocate a preset number of second worker threads to each second subtask and start execution, send the first number of second management results obtained from the execution to the instance node, and receive a second suspension message sent by the instance node and suspend the second worker thread that has not yet ended.

11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a processor to perform the distributed database management method according to any one of claims 1-6.