91 results about "Parallel database" patented technology

A parallel database system seeks to improve performance by parallelizing operations such as loading data, building indexes and evaluating queries. Although data may be stored in a distributed fashion, the distribution is governed solely by performance considerations. Parallel databases improve processing and input/output speeds by using multiple CPUs and disks in parallel, because centralized and client–server database systems are not powerful enough for applications with such demands. In parallel processing, many operations are performed simultaneously, as opposed to serial processing, in which the computational steps are performed sequentially. Parallel databases can be roughly divided into two groups. The first group is the multiprocessor architecture, the alternatives of which are the following...

Ultra-shared-nothing parallel database

An ultra-shared-nothing parallel database system includes at least one master node and multiple slave nodes. A database consisting of at least one fact table and multiple dimension tables is partitioned and distributed across the slave nodes of the database system so that queries are processed in parallel without requiring the transfer of data between the slave nodes. The fact table and a first dimension table of the database are partitioned across the slave nodes. The other dimension tables of the database are duplicated on each of the slave nodes and at least one of these other dimension tables is partitioned across the slave nodes.
Owner:MICROSOFT TECH LICENSING LLC
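
The partitioning scheme in this abstract can be illustrated with a minimal sketch. It assumes integer keys, a modulo placement rule, and hypothetical table and column names (`FACT`, `CUSTOMERS`, `PRODUCTS`, `cust_id`, `prod_id`), none of which come from the patent. The fact table and one dimension table are partitioned on the same key, the other dimension table is copied to every node, and each node can then join its own rows without exchanging data with other nodes.

```python
# Toy tables (hypothetical names and values, for illustration only).
FACT = [
    {"cust_id": 1, "prod_id": 10, "amount": 5},
    {"cust_id": 2, "prod_id": 11, "amount": 7},
    {"cust_id": 3, "prod_id": 10, "amount": 2},
]
CUSTOMERS = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
PRODUCTS = [{"prod_id": 10, "label": "disk"}, {"prod_id": 11, "label": "cpu"}]


def partition(rows, key, n_nodes):
    """Place each row on node (key value mod n_nodes); assumes integer keys."""
    nodes = [[] for _ in range(n_nodes)]
    for row in rows:
        nodes[row[key] % n_nodes].append(row)
    return nodes


def local_join(fact_part, cust_part, products_replica):
    """Join one node's fact rows using only data stored on that node."""
    cust = {c["cust_id"]: c for c in cust_part}
    prod = {p["prod_id"]: p for p in products_replica}
    return [
        (cust[f["cust_id"]]["name"], prod[f["prod_id"]]["label"], f["amount"])
        for f in fact_part
    ]


N_NODES = 2
# Fact table and first dimension table are co-partitioned on cust_id.
fact_parts = partition(FACT, "cust_id", N_NODES)
cust_parts = partition(CUSTOMERS, "cust_id", N_NODES)
# The other dimension table (PRODUCTS) is duplicated on every node.
per_node = [local_join(fact_parts[i], cust_parts[i], PRODUCTS) for i in range(N_NODES)]
joined = sorted(row for part in per_node for row in part)
```

Because matching `cust_id` values always land on the same node, no lookup in `local_join` ever needs a row held elsewhere.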

System and method for automating data partitioning in a parallel database

A system for automating data partitioning in a parallel database includes plural nodes connected in parallel. Each node includes a database server and two databases connected thereto. Each database server includes a query optimizer. Moreover, a partitioning advisor communicates with the database server and the query optimizer. The query optimizer and the partitioning advisor include a program for recommending and evaluating data table partitions that are useful for processing a workload of query statements. The data table partitions are recommended and evaluated without requiring the data tables to be physically repartitioned.
Owner:AIRBNB

Data partitioning method for distributed parallel database system

Active | CN101916261A | Partially complete | Avoid time-consuming network transmission | Digital data information retrieval | Special data processing applications | Data set | Data stream
The invention discloses a data partitioning method for a distributed parallel database system. The method comprises the following steps of: establishing a fact table and a dimension table according to the constructed distributed parallel database system; inserting records of the dimension table and the fact table on different nodes according to a partitioning rule; copying the records of the dimension table to the nodes of the fact table; and deleting and updating the data. When a data set or data stream is imported or inserted into the distributed database system in a partitioning way, the relation between tables defined by a database schema can be met on each node, particularly the primary key-foreign key restrictive condition, so the data on each node has local completeness of the data. For the query processing on the connection between the tables by using the primary key-foreign key restrictive condition, the data of each node has the local completeness on the query, so dynamic repartitioning of data between the nodes is not needed; and thus the method has the advantages of preventing time-consuming network transmission of the data, shortening the query response time and improving the query efficiency.
Owner:BORQS BEIJING +2

Load balancing for complex database query plans

Methods, systems, and apparatuses for improving performance of parallel database query plans are described. An exchange operator is positioned in a query tree. A child operator of the exchange operator is parallelized into a plurality of parallel child operators, each of the parallel child operators coupled to the exchange operator in a respective branch of a plurality of parallel branches of the query tree. An output of each of the plurality of parallel child operators may be buffered at the exchange operator. Furthermore, child operators of the plurality of parallel child operators may also be parallelized. Query plans of any form and containing any number of operators may be parallelized in this manner. Any number of parallel branches may be used, independent of the number of operators in the original plan. The parallelized query plans achieve effective load balancing across all branches.
Owner:IANYWHERE SOLUTIONS
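
As a rough illustration of the exchange-operator idea, the sketch below runs one copy of a child operator (a filter) per data partition in its own thread and lets an exchange buffer and merge their outputs. The function name `parallel_scan_filter` and the use of Python threads and `queue.Queue` are assumptions made for this example, not details from the patent.

```python
import queue
import threading


def parallel_scan_filter(partitions, predicate):
    """Run one filter child per partition; the exchange buffers and merges rows."""
    exchange = queue.Queue()  # buffer shared by all parallel branches
    done = object()           # sentinel marking the end of one branch

    def child(part):
        for row in part:
            if predicate(row):
                exchange.put(row)
        exchange.put(done)

    workers = [threading.Thread(target=child, args=(p,)) for p in partitions]
    for w in workers:
        w.start()

    merged, finished = [], 0
    while finished < len(workers):
        item = exchange.get()
        if item is done:
            finished += 1
        else:
            merged.append(item)
    for w in workers:
        w.join()
    return merged


evens = sorted(parallel_scan_filter([[1, 2, 3], [4, 5, 6], [7, 8]], lambda x: x % 2 == 0))
```

The number of branches here equals the number of partitions passed in, so it is independent of how many operators the original serial plan contained.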

Optimizing queries of parallel databases

The present invention extends to methods, systems, and computer program products for optimizing queries of parallel databases. Queries can be partially optimized at an optimizer that is unaware of its use to optimize queries for parallel processing. The optimizer can produce a data structure (e.g., a SQL Server MEMO) that encapsulates a logical serial plan search space. The logical serial plan search space may not incorporate any notion of parallelism into the plan space itself. A parallel-aware optimizer can parallelize the logical serial plan search space by augmenting the data structure (e.g., transforming the SQL Server MEMO into a parallel MEMO). Augmentation can be with data movement operations that move data associated with one or more compute nodes in a distributed architecture. Cost estimates can be calculated for the operations contained in the parallelized data structure. The parallel plan with the lowest estimated cost can be selected for the query.
Owner:MICROSOFT TECH LICENSING LLC

Parallel database query processing for non-uniform data sources via buffered access

An apparatus, program product and method utilize a dynamically-populated query buffer to facilitate the handling of at least a portion of a database query in parallel. A query is implemented using at least first and second portions, where the second portion of the query is executed in parallel using a plurality of threads. The first portion of the query is executed to dynamically populate a query buffer with records from a data source, and the plurality of threads that execute the second portion of the query are specified to the query buffer so that the effective data source for the second portion of the query comprises the records that are dynamically populated into the query buffer.
Owner:IBM CORP

Database System Providing Self-Tuned Parallel Database Recovery

A database system providing self-tuned parallel database recovery is described. In one embodiment, for example, in a database system, a method is described for performing recovery operations using an optimal number of recovery threads, the method comprises steps of: (a) spawning an initial recovery thread to perform recovery operations; (b) measuring I/O (input/output) performance with the initial recovery thread; (c) spawning a subsequent recovery thread to perform recovery operations; (d) measuring I/O performance with the subsequent recovery thread; and (e) as long as I/O performance does not degrade beyond a preselected percentage, repeating steps (c) and (d) for spawning a desired number of additional recovery threads. In another embodiment, the database system auto-tunes the cache during performance of database recovery operations to optimize the performance of recovery operations.
Owner:SYBASE INC
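
Steps (a) through (e) can be sketched as a simple tuning loop. The code below is an illustrative assumption rather than the patented implementation: `measure_io` is a hypothetical callback returning an I/O throughput figure for a given thread count, the comparison is made against the best throughput seen so far, and `SIMULATED` holds made-up measurements.

```python
def tune_recovery_threads(measure_io, max_threads, max_drop_pct=10.0):
    """Add recovery threads until measured I/O performance degrades too far.

    measure_io(n) is a hypothetical callback that returns throughput observed
    with n recovery threads (higher is better).
    """
    threads = 1
    best = measure_io(threads)          # steps (a) and (b)
    while threads < max_threads:
        perf = measure_io(threads + 1)  # steps (c) and (d)
        if perf < best * (1 - max_drop_pct / 100):
            break                       # degraded beyond the preselected percentage
        threads += 1                    # step (e): keep the extra thread
        best = max(best, perf)
    return threads


# Made-up throughput numbers per thread count, for illustration only.
SIMULATED = {1: 100, 2: 180, 3: 200, 4: 150, 5: 140}
chosen = tune_recovery_threads(lambda n: SIMULATED[n], max_threads=5)
```

With these numbers the loop stops at three threads, because the fourth thread drops throughput by more than 10 percent relative to the best measurement.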

Data management method for accessing data storage area based on characteristic of stored data

There is provided a data management method for managing data stored in a parallel database system in which a plurality of data servers manage data. The parallel database system manages: correspondence information between a characteristic of the data and each of the plurality of data servers that manages the data; and a data area corresponding to the characteristic of the data. The data management method comprises the steps of: extracting the characteristic of the data from data to be stored in the data area; storing the data in the data area based on the extracted characteristic of the data; specifying a corresponding data area based on the characteristic of the data stored in the data area by referring to the correspondence information; and accessing, by each of the plurality of data servers, the specified data area.
Owner:HITACHI LTD

Distributed database parallel processing system

The invention discloses a distributed database parallel processing system. The technical scheme is: a distributed parallel database controllable redundancy structure is adopted, and different types of database server nodes are used (full database servers, non-full database servers and void database servers). The database cluster mainly comprises non-full database servers which store partial data of the database cluster to share the work load; and multiple non-full servers constitute multiple complete data sets through the void database servers which ensure the completion of the data sets in function. The full database server can independently provide the completion of the data sets. The void database servers select server nodes in the database cluster to form a star network covering all data and used for retrieving all data in the database cluster and dynamically linking the data areas on multiple database server nodes.
Owner:熊凡凡

Multi-threading, multi-tasking architecture for a relational database management system

A parallel processing architecture for a relational database management system (RDBMS) that supports both a process model operating system and a thread model operating system. The RDBMS is implemented as a shared nothing, single database image utilizing Parallel Database Extensions (PDEs) that insulate the RDBMS from the specifics of the operating system and that provide the necessary techniques for accessing common memory segments.
Owner:TERADATA US

Method and system for data processing with parallel database systems

A database processing system including a plurality of partitioned databases. Data processing is performed with pieces of information processing apparatus associated with each of the partitioned databases respectively. In response to a query, a status table indicating availability of each information processing apparatus is read from the storage. Of the pieces of information processing apparatus for processing the received query, at least a serviceable one is determined as a process request destination. A process request corresponding to the query is transmitted to the information processing apparatus determined as the process request destination. The process request is received through a communication unit, and data on the database are consequently processed. A processing result is transmitted to a transmitting source through the communication unit.
Owner:HITACHI LTD

Data analytics platform over parallel databases and distributed file systems

Performing data analytics processing in the context of a large scale distributed system that includes a massively parallel processing (MPP) database and a distributed storage layer is disclosed. In various embodiments, a data analytics request is received. A plan is created to generate a response to the request. A corresponding portion of the plan is assigned to each of a plurality of distributed processing segments, including by invoking as indicated in the assignment one or more data analytical functions embedded in the processing segment.
Owner:EMC IP HLDG CO LLC

Automated partitioning in parallel database systems

Inactive | US20120117065A1 | Improve partition configuration cost determination efficiency | Reduce the cost of determining | Database management systems | Digital data processing details | Query optimization | Workload
Embodiments are directed to determining optimal partition configurations for distributed database data and to implementing parallel query optimization memo data structure to improve partition configuration cost estimation efficiency. In an embodiment, a computer system accesses a portion of database data and various database queries for a given database. The computer system determines, based on the accessed database data and database queries, a partition configuration search space which includes multiple feasible partition configurations for the database data and a workload of queries expected to be executed on that data. The computer system performs a branch and bound search in the partition configuration search space to determine which data partitioning path has the lowest partitioning cost. The branch and bound search is performed according to branch and bound search policies. The computer system also outputs the partition configuration with the determined lowest partitioning cost.
Owner:MICROSOFT TECH LICENSING LLC
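
A branch and bound search over partition configurations can be sketched as follows. The cost model is a toy assumption introduced for this example (a per-table cost plus a penalty when two joined tables are not partitioned on the join key); the abstract instead derives costs from a parallel query optimization memo structure. Because every cost term is non-negative, the cost of a partial configuration is a valid lower bound and any branch whose bound already meets or exceeds the best known cost can be pruned.

```python
def branch_and_bound(tables, options, table_cost, join_penalty):
    """Find the cheapest partition key per table under a toy cost model.

    options[t]            candidate partition keys for table t
    table_cost[(t, key)]  non-negative cost of partitioning t on key
    join_penalty[(a, b)]  (key, cost): cost charged unless a and b are both
                          partitioned on key
    """
    best = {"cost": float("inf"), "config": None}

    def partial_cost(config):
        cost = sum(table_cost[(t, k)] for t, k in config.items())
        for (a, b), (key, penalty) in join_penalty.items():
            if a in config and b in config:
                if not (config[a] == key and config[b] == key):
                    cost += penalty
        return cost

    def search(i, config):
        bound = partial_cost(config)  # lower bound: all cost terms are >= 0
        if bound >= best["cost"]:
            return                    # prune this branch
        if i == len(tables):
            best["cost"], best["config"] = bound, dict(config)
            return
        for key in options[tables[i]]:
            config[tables[i]] = key
            search(i + 1, config)
            del config[tables[i]]

    search(0, {})
    return best["config"], best["cost"]


config, cost = branch_and_bound(
    tables=["orders", "customers"],
    options={"orders": ["order_id", "cust_id"], "customers": ["cust_id", "region"]},
    table_cost={
        ("orders", "order_id"): 1, ("orders", "cust_id"): 2,
        ("customers", "cust_id"): 1, ("customers", "region"): 1,
    },
    join_penalty={("orders", "customers"): ("cust_id", 5)},
)
```

In this example the search settles on partitioning both tables on `cust_id`, which avoids the join penalty even though `orders` has a cheaper stand-alone option.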

Automated partitioning in parallel database systems

Inactive | US8326825B2 | Reduce the cost of determining | Improve partition configuration cost determination efficiency | Database management systems | Digital data processing details | Query optimization | Workload
Embodiments are directed to determining optimal partition configurations for distributed database data and to implementing parallel query optimization memo data structure to improve partition configuration cost estimation efficiency. In an embodiment, a computer system accesses a portion of database data and various database queries for a given database. The computer system determines, based on the accessed database data and database queries, a partition configuration search space which includes multiple feasible partition configurations for the database data and a workload of queries expected to be executed on that data. The computer system performs a branch and bound search in the partition configuration search space to determine which data partitioning path has the lowest partitioning cost. The branch and bound search is performed according to branch and bound search policies. The computer system also outputs the partition configuration with the determined lowest partitioning cost.
Owner:MICROSOFT TECH LICENSING LLC

Database management system cluster node subtasking data query

A cluster node within a cluster of a highly parallel database system includes at least one processing unit that runs a set of first tier threads and a set of second tier threads, a storage disk drive, and a networking interface. When a first tier thread receives a task, it divides the task into a set of subtasks. The first tier thread also assigns the set of subtasks between a subset of the set of second tier threads for execution. Each second tier thread within the subset processes the one or more subtasks it is assigned to. When the task is a work, the subtasks are work units. When the task is a work unit, the subtasks are subwork units.
Owner:OCIENT INC
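
The two-tier arrangement can be approximated with a thread pool. In this sketch, which assumes Python's `concurrent.futures` and a hypothetical `run_task` helper, the caller plays the first tier thread: it splits a task into subtasks and hands them to a pool that stands in for the second tier threads.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task_rows, n_second_tier, chunk_size, work):
    """First tier: divide the task into subtasks and assign them to a pool
    of second tier worker threads; results come back in subtask order."""
    subtasks = [
        task_rows[i:i + chunk_size] for i in range(0, len(task_rows), chunk_size)
    ]
    with ThreadPoolExecutor(max_workers=n_second_tier) as second_tier:
        return list(second_tier.map(work, subtasks))


partial_sums = run_task(list(range(10)), n_second_tier=3, chunk_size=4, work=sum)
```

The same helper could be applied recursively, with a work function that itself calls `run_task`, to mirror the work, work unit and subwork unit levels mentioned in the abstract.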

Implementation method for operator reuse in parallel database

The invention discloses an implementation method for operator reuse in a parallel database, comprising the following steps: step 1, generating a serial query plan for the query through a normal query planning method, wherein the query plan is a binary tree structure; step 2, scanning the query plan from top to bottom, searching for materialized reusable operators, changing the query plan structure, and changing thread-level materialized operators into global reusable materialized operators; step 3, parallelizing the query plan changed in step 2, and generating a plan forest for parallel execution by a plurality of threads; step 4, combining the global reusable operators in the plan forest generated in step 3, and generating a directed graph plan in which the materialized reusable operators can be executed by the plurality of threads in parallel; step 5, each thread executing its own part of the plan in the directed graph in parallel, wherein the thread that first executes the global reusable operator is called the main thread, the main thread locks the global reusable operator and actually executes the operator and its plan, and the other threads wait; step 6, the main thread unlocking the global reusable operator after execution, whereupon the other threads start to read data from the global reusable operator and continue to execute their own plan trees; and step 7, the main thread releasing the materialized data of the operator after all the plans have read the data of the global reusable operator.
Owner:天津神舟通用数据技术有限公司
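
Steps 5 and 6 can be sketched with a lock-protected operator. The class name `SharedOperator` and the use of `threading.Lock` are assumptions for illustration. The first thread to arrive acts as the main thread and materializes the result while the others wait on the lock; afterwards every thread reads the same materialized rows. Step 7 (releasing the materialized data once all plans have read it) is left out of this sketch.

```python
import threading


class SharedOperator:
    """A materialized operator executed once and then read by every thread."""

    def __init__(self, compute):
        self._compute = compute
        self._lock = threading.Lock()
        self._rows = None
        self._done = False
        self.executions = 0  # how many times the operator actually ran

    def rows(self):
        with self._lock:  # other threads wait here while the main thread runs
            if not self._done:
                self._rows = list(self._compute())
                self.executions += 1
                self._done = True
        return self._rows


def expensive_scan():
    return range(5)


op = SharedOperator(expensive_scan)
totals = []


def plan_thread():
    # Each thread continues its own plan using the shared materialized rows.
    totals.append(sum(op.rows()))


threads = [threading.Thread(target=plan_thread) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```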

Data query method and device for parallel database

The invention discloses a data query method and device for a parallel database. The method comprises the steps of: respectively performing target data grouping aggregation on a target data table according to corresponding associated fields between the target data table and other data tables on each database node; respectively performing data re-partitioning on corresponding grouping aggregation results and corresponding other data tables in a Hash manner according to the corresponding associated fields on each database node; collecting data re-partitioning results of the grouping aggregation results and data re-partitioning results of the other data tables on each database node into a target database node; and performing target data connecting aggregation on the data re-partitioning results of the grouping aggregation results and the data re-partitioning results of the other data tables on the target database node. According to the method provided by the invention, data aggregation query can be realized, meanwhile, the parallelism of the query is improved, the resource utilization rate of a cluster is increased, the network cost is reduced, and the query performance is improved.
Owner:DAWNING INFORMATION IND BEIJING
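
The sequence of local grouping aggregation, hash re-partitioning and join aggregation can be sketched as below. The table contents, the `cust_id % n` placement rule, and the function name `total_by_region` are assumptions for this example; the per-node lists simulate what would be separate database nodes.

```python
from collections import Counter


def total_by_region(sales_by_node, customers_by_node):
    """Sum sales per region: pre-aggregate locally, hash re-partition on the
    join field, then join and aggregate on each destination."""
    n = len(sales_by_node)
    agg_buckets = [[] for _ in range(n)]   # re-partitioned partial aggregates
    cust_buckets = [[] for _ in range(n)]  # re-partitioned customer rows

    for node in range(n):
        # Local grouping aggregation on the join field (cust_id).
        partial = Counter()
        for cust_id, amount in sales_by_node[node]:
            partial[cust_id] += amount
        # Hash re-partitioning of the aggregates and of the other table.
        for cust_id, subtotal in partial.items():
            agg_buckets[cust_id % n].append((cust_id, subtotal))
        for cust_id, region in customers_by_node[node]:
            cust_buckets[cust_id % n].append((cust_id, region))

    # Join aggregation over the re-partitioned data.
    result = Counter()
    for target in range(n):
        region_of = dict(cust_buckets[target])
        for cust_id, subtotal in agg_buckets[target]:
            result[region_of[cust_id]] += subtotal
    return dict(result)


totals_by_region = total_by_region(
    sales_by_node=[[(1, 10), (2, 5), (1, 3)], [(2, 4), (3, 6)]],
    customers_by_node=[[(3, "east")], [(1, "east"), (2, "west")]],
)
```

Aggregating before re-partitioning means each node ships one row per customer rather than one row per sale, which is where the reduction in network cost would come from.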

System and method of applying VR device to KTV karaoke

The invention discloses a system and method of applying a VR device to KTV karaoke, characterized in that a central processing module (A0) identifies the program request motion performed by a user through a program request module (A1), searches for program data in a parallel database (A2), and determines the data format; transcodes common 2D data into 3D data if the data format is common 2D; and synchronously outputs 3D or 3D holographic song data to a live audio and video device and to the user wearing a VR device through an output module (A3). The system and method can connect the user's vision system with a motion perception system, so as to realize lifelike virtual 3D panoramic karaoke (for example, one can raise the head to see the sky, and turn around to see a band). In addition, the system and method can convert a common 2D video into a 3D IMAX effect video for users. The system and method are compatible with an original karaoke system without huge technical and financial investment, and at the same time provide a brand new karaoke singing mode, and thereby possess high market popularization value.
Owner:腾叙然

System and method for optimizing large database management systems using bloom filter

A large, highly parallel database management system includes thousands of nodes storing a huge volume of data. The database management system includes a query optimizer for optimizing data queries. The optimizer estimates the column cardinality of a set of rows based on estimated column cardinalities of disjoint subsets of the set of rows. For a particular column, the actual column cardinality of the set of rows is the sum of the actual column cardinalities of the two subsets of rows. The optimizer creates two respective Bloom filters from the two subsets, and then combines them to create a combined Bloom filter using logical OR operations. The actual column cardinality of the set of rows is estimated using a computation from the combined Bloom filter.
Owner:OCIENT INC
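
The Bloom filter combination can be sketched as follows. The filter size, the SHA-256 based hashing, and the class name `Bloom` are assumptions for this example, and the estimator used is the standard fill-ratio formula n ≈ -(m/k) ln(1 - X/m), where X is the number of set bits; the abstract does not specify which computation the patent applies.

```python
import hashlib
import math


class Bloom:
    """A small Bloom filter whose bit array is stored in a Python int."""

    def __init__(self, m=4096, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, value):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def union(self, other):
        """Combine two filters of equal size with a bitwise OR."""
        merged = Bloom(self.m, self.k)
        merged.bits = self.bits | other.bits
        return merged

    def estimate_cardinality(self):
        """Estimate distinct values from the fraction of set bits."""
        set_bits = bin(self.bits).count("1")
        if set_bits == self.m:
            return float("inf")
        return -(self.m / self.k) * math.log(1 - set_bits / self.m)


# Column values from two subsets of rows; 50 values occur in both subsets.
subset_a, subset_b = Bloom(), Bloom()
for v in range(0, 100):
    subset_a.add(v)
for v in range(50, 150):
    subset_b.add(v)
combined = subset_a.union(subset_b)
estimate = combined.estimate_cardinality()
```

Because a value present in both subsets sets the same bits in both filters, the OR-combined filter counts it once, so the estimate tracks the 150 distinct values rather than the 200 rows.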

Load balancing management system of large-power-network real-time database system

The invention discloses a load balancing management system of a large-power-network real-time database system. The real-time database system is a parallel database system, and table files of the real-time database system are divided into M sub-tables and are stored on Q data nodes. The load balancing management system comprises a table file division module, a storage module, a heartbeat module, a metadata management module and a load balancing judgment module. The table file division module is used for dividing the table files according to the number of the nodes and sizes of the table files, the storage module is used for storing the divided table files into the data nodes, the heartbeat module is used for communicating the data nodes with management nodes, the metadata management module is used for managing the correspondence of the sub-tables and the data nodes, and the load balancing judgment module is used for judging whether the load of the system is balanced. The load balancing management system of the large-power-network real-time database system not only guarantees high concurrency and response of the parallel real-time database, but also guarantees load balancing of nodes in the cluster and load balancing of each database table file.
Owner:CHINA ELECTRIC POWER RES INST +2

Formulating global statistics for distributed databases

The present invention extends to methods, systems, and computer program products for formulating global statistics for parallel databases. In general, embodiments of the invention merge (combine) information in multiple compute node level histograms to create a global histogram for a table that is distributed across a number of compute nodes. Merging can include aligning histogram step boundaries across the compute node histograms. Merging can include aggregating histogram step-level information, such as, for example, equality rows and average range rows (or alternately equality rows, range rows, and distinct range rows), across the compute node histograms into a single global step. Merging can account for distinct values that do not appear at one or more compute nodes as well as distinct values that are counted at multiple compute nodes. A resulting global histogram can be coalesced to reduce the step count.
Owner:MICROSOFT TECH LICENSING LLC

Method and system for data processing with data replication for the same

To guarantee the contents of an update by a transaction in a parallel database management system, a database management system includes a replica database management unit that manages the replica database, records synchronous information at a timing at which one of the generated transactions is valid in every database management unit and the other transactions are invalid in every database management unit, extracts update information and the synchronous information for creating the replica database from the update logs, and causes the replica database management unit to import the update information of each transaction that has become valid before the synchronous information was recorded.
Owner:HITACHI LTD

Method and device for querying double-transcript parallel database

The invention provides a method and a device for querying a double-transcript parallel database, and belongs to the technical field of databases. The method comprises the following steps: obtaining a query request and data storage unit information; forming a plurality of execution plans according to the query request and the data storage unit information; calculating resource occupation rates of the execution plans according to the resource utilization rates of execution nodes and the estimated data transmission quantity of the execution nodes in the execution plans; selecting one execution plan from the execution plans according to the resource occupation rate; and querying data according to the selected execution plan. The method provided by the invention is used for calculating the resource occupation rates of the execution plans according to the resource utilization rates of the execution nodes and the estimated data transmission quantity of the execution nodes in the execution plans and selecting one execution plan from the execution plans according to the resource occupation rate, and since the resource occupation condition influences the data transfer time consumption and the data query efficiency, the data transfer time consumption of the execution plan in the final query is short, and the data query efficiency is high.
Owner:DAWNING INFORMATION IND BEIJING +1
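
Plan selection by resource occupation can be sketched with a simple scoring function. The scoring formula, the `transfer_weight` parameter, and the plan and node names are all hypothetical; the abstract only states that the rate is calculated from node resource utilization and estimated data transmission quantity.

```python
def pick_plan(plans, node_utilization, transfer_weight=0.001):
    """Return the plan with the lowest (hypothetical) resource occupation score.

    Each plan lists (node, estimated_mb_transferred) steps. The score adds the
    current utilization of every node used to a weighted transfer volume.
    """
    def occupation(plan):
        return sum(
            node_utilization[node] + transfer_weight * mb
            for node, mb in plan["steps"]
        )

    return min(plans, key=occupation)


plans = [
    {"name": "replica_a", "steps": [("n1", 100), ("n2", 50)]},
    {"name": "replica_b", "steps": [("n3", 100), ("n4", 50)]},
]
utilization = {"n1": 0.9, "n2": 0.8, "n3": 0.2, "n4": 0.3}
selected = pick_plan(plans, utilization)
```

Both plans move the same amount of data here, so the choice is decided by which copy of the data sits on the less busy nodes.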

Distributed parallel database system based on Infiniband network and data processing method

Pending | CN109933631A | Break through the bandwidth bottleneck | High performance service | Database distribution/replication | System configuration | Data node
The invention discloses a distributed parallel database system based on an Infiniband network and a data processing method. The system comprises a scheduling cluster, a data cluster and a management cluster. Each cluster is composed of at least two data nodes, and the data nodes are connected through the Infiniband network. The data processing method comprises a data distribution method, a data loading method, a data query method and a data de-duplication method. Aiming at the current situation that the number of distributed parallel processing database cluster node servers is large, the characteristics of high bandwidth, low delay and low memory of an Infiniband network are fully utilized, and a mode of applying the network to data is designed, including database system configuration, data loading and storage, data query, data calculation and the like. The method is high in universality, breaks through network bandwidth bottleneck, storage space and calculation delay limitation of a current database system, guarantees high availability of the system, and provides high-performance service for users.
Owner:CHINA REALTIME DATABASE +3

Method and system for providing a default role for a user in a remote database

A method and system for assigning a user default role in a remote database of a database system is disclosed. The method and system comprise the steps of activating a default role for the remote database and utilizing the activated default role to access data within the remote database. Accordingly, a system and method is provided that allows a user to access a remote database via a default role. The system and method only require that default role information be stored in a current role database structure and be accessible by a user. In so doing, a user can easily access information in the remote database through the default role. Therefore, this system is compatible with, and easily implemented using, existing parallel database systems.
Owner:IBM CORP