The present invention implements parallel processing in a Database Management System. The present invention provides the ability to locate transaction and
recovery information at one location and eliminates the need for read locks and two-phase commits. The present invention provides the ability to dynamically partition row sources for
parallel processing. Parallelism is based on the ability to parallelize a row source, the partitioning requirements of consecutive row sources and the entire row source tree, and any specification in the
SQL statement. A Query Coordinator assumes control of the
processing of an entire query and can execute serial row sources. Additional threads of control, Query Servers, execute parallel operators. Parallel operators are called data flow operators (DFOs). A DFO is represented as structured
query language (SQL) statements and can be executed concurrently by multiple processes, or query slaves. A central scheduling mechanism, a data flow scheduler, controls a parallelized portion of an
execution plan, and can become invisible for serial execution. Table queues are used to partition and transport rows between sets of processes. Node linkages provide the ability to divide the plan into independent lists that can each be executed by a set of query slaves. The present invention maintains a bit vector that is used by a subsequent producer to determine whether any rows need to be produced for its consumers. The present invention uses states and a count of the slaves that have reached these states to perform its scheduling tasks.
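
The table queue and state-counting ideas described above can be illustrated with a minimal sketch. It is not the implementation of the invention: the class names `TableQueue` and `SlaveStateTracker`, the state label `"READY"`, and the choice of hash partitioning on a row key are all assumptions made for illustration only.

```python
from collections import Counter


class TableQueue:
    """Hypothetical table queue: routes each produced row to one consumer slave."""

    def __init__(self, num_consumers, key_fn):
        self.num_consumers = num_consumers
        self.key_fn = key_fn  # extracts the partitioning key from a row
        # One in-memory buffer per consumer slave (assumed; a real system
        # would transport rows between processes).
        self.partitions = [[] for _ in range(num_consumers)]

    def put(self, row):
        # Hash partitioning is an assumption; the text only says that
        # table queues partition and transport rows.
        target = hash(self.key_fn(row)) % self.num_consumers
        self.partitions[target].append(row)

    def rows_for(self, consumer_id):
        return self.partitions[consumer_id]


class SlaveStateTracker:
    """Hypothetical scheduler helper: counts slaves that reached each state."""

    def __init__(self, num_slaves):
        self.num_slaves = num_slaves
        self.counts = Counter()

    def report(self, state):
        # Called once per slave when that slave reaches `state`.
        self.counts[state] += 1

    def all_reached(self, state):
        # The scheduler can act once every slave has reported the state.
        return self.counts[state] == self.num_slaves


if __name__ == "__main__":
    tq = TableQueue(num_consumers=2, key_fn=lambda row: row["dept"])
    for dept in (10, 11, 12, 13):
        tq.put({"dept": dept})
    print(tq.rows_for(0))
    print(tq.rows_for(1))

    tracker = SlaveStateTracker(num_slaves=2)
    tracker.report("READY")
    tracker.report("READY")
    print(tracker.all_reached("READY"))
```

In this sketch the scheduler never inspects individual slaves; it compares a per-state count with the number of slaves, which mirrors the count-based scheduling that the abstract describes.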