Query scheduling in a parallel-processing database system

A database system and query scheduling technology, applied in the field of database management. It addresses the problems that the growing amount of data generated by companies, agencies, and other organizations taxes the capabilities of current relational database management systems, and that the resulting processing lag inhibits the client in its use of requested information. Stated benefits include cost savings, economy of design and resources, and transparent operation.

Inactive
Publication Date: 2007-09-06
LEXISNEXIS RISK DATA MANAGEMENT
10 Cites, 53 Cited by

AI Technical Summary

Benefits of technology

[0011] The present invention mitigates or solves the above-identified limitations in known solutions, as well as other unspecified deficiencies in known solutions. A number of advantages associated with the present invention are readily evident to those skilled in the art, including economy of design and resources, transparent operation, cost savings, etc.

Problems solved by technology

The rapid increase in the amount of data generated by companies, agencies, and other organizations has taxed the capabilities of current relational database management systems (RDMSs).
This processing lag often prevents access to the data in a timely manner, thereby inhibiting the client in its use of the requested information.
While symmetric multiprocessing (SMP) systems have the potential to improve the efficiency of database operations on large databases by removing the processor as the bottleneck, current implementations have a number of limitations.
For one, the shared memory / disk storage often becomes the limiting factor as a number of processors attempt to access the shared memory / disk storage at the same time.
Simultaneous memory / disk storage accesses in such systems typically result in the placement of one or more of the processors in a wait state until the memory / disk storage is available.
This delay often reduces or eliminates the benefit achieved through the parallelization of the database operation.
Further, the shared memory / disk storage can limit the scalability of the SMP system, where many such systems are limited to eight processors or fewer.
Another limitation common to SMP database systems is the cost of implementation.
SMP systems, as a result of the underlying architecture needed to connect multiple processors to shared resources, are difficult to develop and manufacture, and are, therefore, often prohibitively expensive.
In many cases, the SMP database systems implement a proprietary SMP design, requiring the client of the SMP database system to contract with an expensive specialist to repair and maintain the system.
The development of operating system software and other software for use in the SMP database system is also often complex and expensive.
The performance of parallel processing database systems, SMP or otherwise, is often limited by the underlying software process used to perform the database operation.
It will be appreciated by those skilled in the art that the use of an interpreted language is inherently inefficient from a processing standpoint.
For one, the step of interpreting and then executing a predefined library code segment at run-time often requires considerable processing effort and, therefore, reduces overall efficiency.
Secondly, interpreters often use a predetermined machine-level code sequence for each instruction, thereby limiting the ability to optimize the code on an instruction-by-instruction basis.
Thirdly, because interpreters consider only one node (and its related child nodes) at a time, interpreters typically are unable to globally optimize the database operation by evaluating the instructions of the database operation as a whole.
Current techniques for data storage in conventional parallel-processing database systems also exhibit a number of limitations.
As noted above, current parallel-processing database systems often implement shared storage resources, such as memory or disk storage, which result in bottlenecks when processors attempt to access the shared storage resources simultaneously.
These implementations, however, often have an inefficient or ineffective mechanism for failure protection when one or more of the storage devices fail.
When a failure occurs, the storage device would have to be reinitialized and then repopulated with data, delaying the completion of the database operation.
Additionally, the data may be inefficiently distributed among the storage devices, resulting in data spillover or a lack of proper load-balancing among the processing nodes.

Examples

Embodiment Construction

[0036] The following description is intended to convey a thorough understanding of the present invention by providing a number of specific embodiments and details involving parallel processing of database queries. It is understood, however, that the present invention is not limited to these specific embodiments and details, which are exemplary only. It is further understood that one possessing ordinary skill in the art, in light of known systems and methods, would appreciate the use of the invention for its intended purposes and benefits in any number of alternative embodiments, depending upon specific design and other needs.

[0037] A processor is generally understood in the art to include any of a variety of digital circuit devices adapted to manipulate data or other information by performing one or more tasks embodied as one or more sets of instructions executable by the digital circuit device. Processors typically include some form of an arithmetic logical unit (ALU) adapted to p...

Abstract

A system and method for scheduling database operations to one or more databases in a parallel-processing database system are described herein. After a query server generates a dynamic-link library (DLL) or other executable representative of one or more database operations to a database, the query server notifies a scheduling services module of the generation of the DLL and submits the DLL to a query agent. The query agent notifies the scheduling services module of its receipt of the DLL. Based on any of a variety of considerations, the scheduling services module schedules a time of execution for the DLL by one or more processing matrices that store the database. At the scheduled time, the scheduling services module directs the query agent to submit the DLL to the indicated processing matrices. The scheduling services module also can be adapted to monitor the execution of previously submitted DLLs by one or more processing matrices and adjust the scheduled times of execution for subsequent DLLs accordingly.
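The notification and dispatch flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and method names (`QueryAgent`, `SchedulingServices`, `notify_generated`, `report_overrun`, and so on) are hypothetical, the DLL is modeled as an opaque byte payload, and the only scheduling consideration used is a caller-supplied estimated runtime per processing matrix, whereas the abstract leaves the considerations open.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class ScheduledJob:
    start_time: float                          # ordering key in the queue
    seq: int                                   # tie-breaker for equal times
    dll_name: str = field(compare=False)
    matrices: tuple = field(compare=False)
    est_runtime: float = field(compare=False)  # assumed input, not from the source


class QueryAgent:
    """Holds received executables until the scheduler directs submission."""

    def __init__(self):
        self.held = {}
        self.submitted = []  # log of (time, dll_name, matrix)

    def receive(self, dll_name, payload, scheduler, matrices, est_runtime):
        self.held[dll_name] = payload
        # The agent notifies scheduling services of its receipt of the DLL.
        scheduler.notify_received(dll_name, matrices, est_runtime)

    def submit(self, dll_name, matrices, now):
        self.held.pop(dll_name)
        for matrix in matrices:
            self.submitted.append((now, dll_name, matrix))


class SchedulingServices:
    """Hypothetical scheduler: one job at a time per processing matrix."""

    def __init__(self, agent):
        self.agent = agent
        self.queue = []
        self.busy_until = {}  # matrix -> time it is expected to be free
        self.generated = set()
        self._seq = count()

    def notify_generated(self, dll_name):
        self.generated.add(dll_name)

    def notify_received(self, dll_name, matrices, est_runtime):
        if dll_name not in self.generated:
            raise ValueError(f"{dll_name} was never announced by a query server")
        # Start once every targeted matrix is expected to be free.
        start = max((self.busy_until.get(m, 0.0) for m in matrices), default=0.0)
        for m in matrices:
            self.busy_until[m] = start + est_runtime
        job = ScheduledJob(start, next(self._seq), dll_name, tuple(matrices), est_runtime)
        heapq.heappush(self.queue, job)

    def report_overrun(self, matrix, extra):
        """Monitoring hook: push back queued jobs that target a late matrix.

        Simplification: delays do not cascade to jobs on other matrices.
        """
        self.busy_until[matrix] = self.busy_until.get(matrix, 0.0) + extra
        for job in self.queue:
            if matrix in job.matrices:
                job.start_time += extra
                for m in job.matrices:
                    end = job.start_time + job.est_runtime
                    self.busy_until[m] = max(self.busy_until.get(m, 0.0), end)
        heapq.heapify(self.queue)  # start times changed, so restore heap order

    def run_due(self, now):
        """Direct the agent to submit every job whose scheduled time has come."""
        dispatched = []
        while self.queue and self.queue[0].start_time <= now:
            job = heapq.heappop(self.queue)
            self.agent.submit(job.dll_name, job.matrices, now)
            dispatched.append(job.dll_name)
        return dispatched


def query_server_submit(dll_name, payload, matrices, est_runtime, scheduler, agent):
    """Query server side: announce the generated DLL, then hand it to the agent."""
    scheduler.notify_generated(dll_name)
    agent.receive(dll_name, payload, scheduler, matrices, est_runtime)
```

In this sketch, a query server calls `query_server_submit` once per generated executable, a clock loop calls `run_due` with the current time, and whatever monitors the processing matrices calls `report_overrun` when a running job is taking longer than estimated, which shifts the scheduled times of later jobs on that matrix.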

Description

FIELD OF THE INVENTION

[0001] The present invention relates generally to database management and more particularly to parallel processing of database queries in a parallel processing system.

BACKGROUND OF THE INVENTION

[0002] The rapid increase in the amount of data generated by companies, agencies, and other organizations has taxed the capabilities of current relational database management systems (RDMSs). To illustrate, some organizations have access to databases having hundreds of millions, and even billions, of records available through a RDMS. In such RDMSs, certain database operations (e.g., database joins, complex searches, extract-transform-load (ETL) operations, etc.) can take minutes, hours, and even days to process using current techniques. This processing lag often prevents access to the data in a timely manner, thereby inhibiting the client in its use of the requested information.

[0003] In response to the increasing lag time resulting from increased database sizes, sof...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/30, G06F7/00
CPC: G06F17/30463, Y10S707/99933, G06F17/30545, G06F16/24542, G06F16/2471
Inventors: BAYLISS, DAVID; CHAPMAN, RICHARD; SMITH, JAKE; POULSEN, OLE; HALLIDAY, GAVIN; HICKS, NIGEL
Owner: LEXISNEXIS RISK DATA MANAGEMENT