Storage performance optimization

A storage performance optimization technology, applied in the field of data storage management, addresses the inability of a single storage configuration to provide both security and performance economically, the prohibitive cost of providing performance in situations involving high data volumes or IO-intensive applications, and limited data throughput capabilities, with the aim of optimizing (or at least enhancing) data throughput.

Inactive Publication Date: 2011-05-05
PARACCEL

AI Technical Summary

Benefits of technology

[0005]One aspect of the invention relates to systems and methods that seek to optimize (or at least enhance) data throughput in data warehousing environments by connecting multiple servers having local storages with a designated ESS, such as, for example, a SAN. According to another aspect of the invention, the systems and methods preserve a full reference copy of the data in a protected environment (e.g., on the ESS) that is fully available. According to another aspect of the invention, the systems and methods maximize (or at least significantly enhance) overall IO potential performance and reliability for efficient and reliable system resource utilization.
[0006]Other aspects and advantages of the invention include providing a reliable data environment in a mixed storage configuration, compensating and adjusting for differences in disk (transfer) speed between mixed storage components to sustain high throughput, supporting different disk sizes on server configurations, supporting high performance FR and DR in a mixed storage configuration, supporting dynamic reprovisioning as servers are added to and removed from the system configuration, and supporting database clustering in which multiple servers are partitioned within the system to support separate databases, applications or user groups, and/or other enhancements. Servers within the data warehousing environment may be managed in an autonomous, or semi-autonomous, manner, thereby alleviating the need for a sophisticated central management system.
[0008]The ESS may hold a copy of the entire database. This copy may be kept current in real-time, or near real-time. As such, the copy of the database held by the ESS may be used as a full reference copy for FR or DR on portions of the database stored within the local storage of individual servers. Since the copy of the ESS is kept continuously (or substantially so) current, “snapshots” of the database may be captured without temporarily isolating the ESS artificially from the servers to provide a quiescent copy of the database. By virtue of the centralized nature of the ESS, the database copy may be maintained with relatively high security and/or high availability (e.g., due to standard replication and striping policies). In some implementations, the ESS may organize the data stored therein such that data that is accessed more frequently by the servers (e.g., data blocks not stored within the local storages) is stored in such a manner that it can be accessed efficiently (e.g., for sequential read access). In some instances, the ESS may provide a backup copy of portions of the database that are stored locally at the servers.
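The role of the ESS as a full reference copy can be illustrated with a minimal sketch. It is not the patented implementation: the class name MixedStore, its methods, and the choice to update the ESS synchronously on every write are all assumptions made for illustration (the text only says the ESS copy is kept current in real-time or near real-time).

```python
class MixedStore:
    """Hypothetical model of one server's view of mixed storage."""

    def __init__(self):
        self.ess = {}    # block_id -> bytes: full reference copy (assumed always current)
        self.local = {}  # block_id -> bytes: the portion held in local storage

    def write(self, block_id, data, keep_local):
        # Assumption: the ESS is updated on every write so it stays current.
        self.ess[block_id] = data
        if keep_local:
            self.local[block_id] = data
        else:
            self.local.pop(block_id, None)

    def read(self, block_id):
        # Prefer the local copy; fall back to the reference copy on the ESS.
        if block_id in self.local:
            return self.local[block_id]
        return self.ess[block_id]

    def recover_local(self, block_ids):
        # Failure recovery: rebuild the local portion from the ESS reference copy.
        self.local = {b: self.ess[b] for b in block_ids}
```

In this sketch, losing the local dictionary never loses data, because every block can still be read from, or restored from, the ESS copy.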
[0010]The servers may form a network of server computer nodes, where one or more leader nodes communicate with the client to acquire queries and deliver data for further processing, such as display, and manage the processing of queries by a plurality of compute node servers. Individual servers process queries in parallel fashion by reading data simultaneously from local storage and the ESS to enhance I/O performance and throughput. The proportions of the data read from local storage and the ESS, respectively, may be a function of (i) data throughput between a given server and the corresponding local storage, and (ii) data throughput between the ESS and the given server. In some implementations, the proportions of the data read out from the separate sources may be determined according to a goal of completing the data read out from the local storage and the data read out from the ESS at approximately the same time. Similarly, the given server may adjust, in an ongoing manner, the portion of the database that is stored in the corresponding local storage in accordance with the relative data throughputs between the server and the local storage and between the server and the ESS (e.g., where the throughput between the server and the local storage is relatively high compared to the throughput between the server and the ESS, the portion of the database stored on the local storage may be adjusted to be relatively large). In some implementations, the individual servers may include one or more of a database engine, a distributed data manager, an I/O system, and/or other components.
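One way to read the "finish at approximately the same time" goal is as a proportional split: if the local read and the ESS read run in parallel, they complete together when each source is given a share of the blocks proportional to its throughput. The sketch below works under that assumption only; the function names, the use of MB/s as the unit, and the simplification that any block could be served from either source are hypothetical and not taken from the source.

```python
def target_local_fraction(local_mbps: float, ess_mbps: float) -> float:
    """Fraction of the data to place on (and read from) local storage."""
    if local_mbps < 0 or ess_mbps < 0 or local_mbps + ess_mbps == 0:
        raise ValueError("throughputs must be non-negative and not both zero")
    return local_mbps / (local_mbps + ess_mbps)


def split_read(total_blocks: int, local_mbps: float, ess_mbps: float):
    """Return (local_blocks, ess_blocks) so both reads take about equal time.

    Equal finish time means local_blocks / local_mbps == ess_blocks / ess_mbps,
    with local_blocks + ess_blocks == total_blocks.
    """
    if total_blocks < 0:
        raise ValueError("total_blocks must be non-negative")
    local_blocks = round(total_blocks * target_local_fraction(local_mbps, ess_mbps))
    return local_blocks, total_blocks - local_blocks
```

For example, with measured throughputs of 300 MB/s to local storage and 100 MB/s to the ESS, a 1000-block read splits into 750 local blocks and 250 ESS blocks, and the same 0.75 fraction suggests how large the locally stored portion should be. Re-evaluating the fraction as measured throughputs change corresponds to the ongoing adjustment described above.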
[0015]In some embodiments, snapshots of the database may be captured from the ESS. A snapshot may include an image of the database that can be used to restore the database to its current state at a future time. A method of capturing a snapshot of the database may include monitoring the passage of time since the previous snapshot; if the amount of time since the previous snapshot has breached a predetermined threshold, monitoring the database to determine whether a snapshot can be performed; and performing the snapshot. Determining whether a snapshot can be performed may include determining whether any queries are currently being executed on the database and/or determining whether any queries being executed update the persistent data within the database. This may enhance the capture of snapshots relative to systems in which the database must be isolated from queries, updated from temporary data storage, and then imaged to capture a snapshot, because snapshots can be captured during ongoing operations at convenient intervals (e.g., when no queries that update the data are being performed).
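The snapshot method described in this paragraph can be sketched as a small polling loop. This is an illustrative sketch, not the patent's implementation: the class name SnapshotScheduler, the callbacks has_updating_queries and take_snapshot, and the injectable clock are all hypothetical names introduced here.

```python
import time


class SnapshotScheduler:
    """Hypothetical scheduler: snapshot once the interval has elapsed
    and no running query is updating persistent data."""

    def __init__(self, interval_s, has_updating_queries, take_snapshot,
                 clock=time.monotonic):
        self.interval_s = interval_s                      # predetermined threshold
        self.has_updating_queries = has_updating_queries  # () -> bool
        self.take_snapshot = take_snapshot                # () -> None
        self.clock = clock
        self.last_snapshot = clock()

    def poll(self) -> bool:
        """Call periodically; returns True if a snapshot was taken."""
        # Step 1: has enough time passed since the previous snapshot?
        if self.clock() - self.last_snapshot < self.interval_s:
            return False
        # Step 2: can a snapshot be performed right now?
        if self.has_updating_queries():
            return False
        # Step 3: capture the snapshot and restart the timer.
        self.take_snapshot()
        self.last_snapshot = self.clock()
        return True
```

Because the check in step 2 only defers the snapshot, a later call to poll() captures it at the next moment when no updating query is running, without isolating the database from queries.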

Problems solved by technology

A single access point is typically configured in order to provide security (e.g., an ESS) or performance (e.g., access locally), but usually is not able to provide both economically.
While an ESS can guarantee security, it may be prohibitively expensive to also provide performance in situations involving high data volumes or IO intensive applications.
Conversely, local storage systems typically have high data throughput capabilities, but are not able to store high data volumes effectively or guarantee security without sacrificing storage capacity through excessive redundancy.
Parallel warehousing and DM environments present both opportunities and additional overhead in environments that rely on single storage configurations.
Such systems can double or quadruple storage requirements, hence reduce capacity on each server, which can lead to a proliferation of servers or reduced system capacity.
Shared-storage parallel database systems (e.g., implementing an ESS) typically rely on centralized high-availability and security services, which reduces the FR and DR infrastructure complexity of parallel solutions, but at the cost of reduced data throughput.
This may lead to inefficient use of the parallel systems, limit the expansion capabilities of the system, significantly reduce the system's ability to scale linearly to support increasing data volumes and application demands from expanded user requirements, and/or other drawbacks.



Embodiment Construction

[0025]FIG. 1 illustrates a system 10 configured to provide a database, in accordance with one or more implementations of the invention. System 10 may enhance access of the database by increasing overall data throughput of system 10 in processing queries on the database. System 10 may provide enhancements in one or more of security, FR, DR, and/or other aspects of the database at least in part through a mixed storage configuration. As can be seen in FIG. 1, in some implementations, system 10 may include one or more of a client 12, an ESS 14, one or more servers 16, local storage 18 corresponding to individual ones of servers 16, and/or other components.

[0026]Clients 12 may be operatively connected to servers 16, and may generate database queries that are routed to servers 16. Results of the queries (generated by processing on the server) may be sent to the querying client 12 for disposition and display processing. In some implementations, client 12 may be provided on a computing plat...


Abstract

A system and method for enhancing data throughput in data warehousing environments by connecting multiple servers having local storages with designated external storage systems, such as, for example, those provided by SANs. The system and method may preserve a full reference copy of the data in a protected environment (e.g., on the external storage system) that is fully available. The system and method may enhance overall I/O potential performance and reliability for efficient and reliable system resource utilization.

Description

RELATED APPLICATIONS

[0001]This application is a divisional of U.S. patent application Ser. No. 12/122,579, filed May 16, 2008, and entitled “STORAGE PERFORMANCE OPTIMIZATION”, which is hereby incorporated by reference in its entirety into the present application.

FIELD OF THE INVENTION

[0002]The invention relates to management of data storage in a “database aware” distributed data environment where both local and remote storage systems are used simultaneously to fulfill IO requests.

BACKGROUND OF THE INVENTION

[0003]In traditional data warehousing and Data Mart (DM) environments, data is stored centrally on an External Storage System (ESS), such as, for example, a Storage Area Network (SAN), or locally. A single access point is typically configured in order to provide security (e.g., an ESS) or performance (e.g., access locally), but usually is not able to provide both economically. While an ESS can guarantee security, it may be prohibitively expensive to also provide performance in sit...


Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F17/30
CPC: G06F17/30545, G06F17/30445, G06F16/24532, G06F16/2471
Inventor: ZANE, BARRY M.; STEINHOFF, DAVID
Owner: PARACCEL