A System for Optimizing Parallel Data Loading Performance of Array Database

A data loading and database technology, applied in the system field, that addresses the performance degradation of conventional data loading methods and their unsuitability for fast loading of scientific data.

Active Publication Date: 2019-03-12
GUIZHOU UNIV +1

AI Technical Summary

Problems solved by technology

However, because these data loading mechanisms adopt traditional relational data loading strategies, they are not suitable for quickly loading scientific data, which is usually represented by array models, into distributed parallel systems. The performance degradation of traditional data loading methods becomes especially pronounced when the size and dimensionality of the scientific data grow very large.

Examples

Embodiment 1

[0038] Example 1. A system for optimizing the parallel data loading performance of an array database, including a monitoring engine 1. The monitoring engine 1 collects monitoring information from the database cluster 2 and feeds it back to the FastLoad system component 3, and the FastLoad system component 3 loads the data to be loaded using an allocation method chosen according to the monitoring information;

[0039] The monitoring engine 1 monitors the database nodes 7 in real time;

[0040] The database cluster 2 executes the data loading;

[0041] The FastLoad system component 3 partitions the data and loads it.
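The flow of monitoring information described in [0038] to [0041] can be pictured with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the `NodeStatus` fields, the probe callables, and the ranking rule are hypothetical and are not taken from the patent, which does not state here which metrics the monitoring engine collects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class NodeStatus:
    """One monitoring sample for a database node (hypothetical fields)."""
    node_id: str
    cpu_load: float      # fraction of CPU in use, 0.0 to 1.0 (assumed metric)
    pending_bytes: int   # bytes already queued for loading (assumed metric)


class MonitoringEngine:
    """Collects one status sample per node through caller-supplied probes."""

    def __init__(self, probes: Dict[str, Callable[[], NodeStatus]]):
        self._probes = probes

    def collect(self) -> List[NodeStatus]:
        # Poll every node once; a real monitor would sample continuously.
        return [probe() for probe in self._probes.values()]


class FastLoadComponent:
    """Receives monitoring information and ranks nodes for loading."""

    def rank_nodes(self, statuses: List[NodeStatus]) -> List[str]:
        # Assumed policy: least queued data first, ties broken by CPU load.
        ordered = sorted(statuses, key=lambda s: (s.pending_bytes, s.cpu_load))
        return [s.node_id for s in ordered]


def demo_ranking() -> List[str]:
    """Wire the two components together with three fake node probes."""
    probes = {
        "node-a": lambda: NodeStatus("node-a", 0.80, 500),
        "node-b": lambda: NodeStatus("node-b", 0.20, 100),
        "node-c": lambda: NodeStatus("node-c", 0.10, 100),
    }
    monitor = MonitoringEngine(probes)
    return FastLoadComponent().rank_nodes(monitor.collect())
```

Calling `demo_ranking()` returns the node identifiers ordered from most to least preferred under the assumed policy. Keeping the probes as plain callables means the monitor can be exercised without a running cluster.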

[0042] The FastLoad system component 3 comprises a data partition engine 5. The data partition engine 5 analyzes the data to be loaded, divides the data file to be loaded into sub-files, and then passes the sub-files to the data loading coordination engine 4, and the data loading coordination engine 4, according to the monitoring ...
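The partition-then-coordinate step in [0042] can be sketched as follows. The source text is truncated before it describes how the data loading coordination engine uses the monitoring information, so the greedy "least-loaded node first" rule below is purely an assumption, as are the function names `partition` and `assign` and the fixed-size chunking of records.

```python
import heapq
from typing import Dict, List


def partition(records: List[str], chunk_size: int) -> List[List[str]]:
    """Split the records of one input file into fixed-size sub-files."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def assign(sub_files: List[List[str]], node_load: Dict[str, int]) -> Dict[str, List[int]]:
    """Map sub-file indices to nodes, always picking the least-loaded node.

    node_load holds the current load per node as reported by monitoring
    (hypothetical unit: number of queued records).
    """
    if not node_load:
        raise ValueError("at least one database node is required")
    # Min-heap keyed on load, so the least-loaded node is popped first.
    heap = [(load, node) for node, load in node_load.items()]
    heapq.heapify(heap)
    plan: Dict[str, List[int]] = {node: [] for node in node_load}
    for index, sub_file in enumerate(sub_files):
        load, node = heapq.heappop(heap)
        plan[node].append(index)
        # The chosen node now carries this sub-file's records as extra load.
        heapq.heappush(heap, (load + len(sub_file), node))
    return plan


def demo_plan() -> Dict[str, List[int]]:
    """Ten records, sub-files of three, two nodes with unequal starting load."""
    records = [str(i) for i in range(10)]
    return assign(partition(records, 3), {"n1": 0, "n2": 5})
```

In `demo_plan()`, node `n1` starts idle and therefore receives more sub-files than `n2`, which starts with queued work. A greedy heap keeps each assignment at logarithmic cost in the number of nodes; a real coordinator would also need to refresh the load figures as monitoring data arrives.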

Abstract

The invention discloses a system for optimizing the parallel data loading performance of array databases. The system comprises a monitoring engine (1), which collects monitoring information from a database cluster (2), returns the monitoring information to a FastLoad system component (3), and monitors the database nodes (7) in real time; the FastLoad system component (3), which partitions the to-be-loaded data and loads it using an allocation method chosen on the basis of the monitoring information; and the database cluster (2), which executes the data loading. The disclosed system meets the loading requirements of array data and serves as a scientific data management tool capable of loading large-scale, array-model-based data in parallel; its performance is 4 to 6 times that of the original data loading mechanisms of the database systems.

Description

Technical field

[0001] The invention relates to a system, in particular to a system for optimizing the parallel data loading performance of an array database.

Background art

[0002] In the era of big data, scientific data in many disciplines, such as bioinformatics, meteorology, and astronomy, has grown very rapidly. Before these data can be analyzed and processed, they must first be imported into a database. As the amount of data increases, the performance of the data loading method becomes more and more important.

[0003] SciDB is an open source scientific database system for scientific data management and analysis. It uses the array data (Array) model, and was mainly developed by Stonebraker and sponsored by Paradigm4. It was originally designed to address problems in scientific research such as large data volumes and data lineage. Unlike a traditional DBMS, and benefiting from the array data model, SciDB can provide large-scale...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/21, G06F9/445, G06F9/50
CPC: G06F9/44521, G06F9/5038, G06F16/217
Inventor: 李晖, 陈梅, 李宏源, 邱能俊
Owner: GUIZHOU UNIV