Kudu database data equalization system based on size and implementation method
A technology for balancing systems and databases, applied in the field of databases, to achieve a wide range of applications
Active Publication Date: 2020-05-12
INSPUR SOFTWARE CO LTD
AI Technical Summary
Problems solved by technology
When the cluster is healthy, the distribution of the created tables among the Tablet Servers does not change. As data is written, the amount of data held by each Tablet Server can therefore become unbalanced, which leads to storage hotspots.
Examples
Embodiment 1
[0060] The size-based data balancing system for Kudu databases of the present invention includes:
[0061] A data balance condition detection module, used to decide whether a data balancing operation should be performed. It works as follows:
[0062] (1) Determine whether a migration task is in progress:
[0063] a. If a migration task is in progress, skip to step (4);
[0064] b. If no migration task is in progress, perform step (2);
[0065] (2) Calculate the difference in disk usage between the node occupying the most disk space and the node occupying the least, and determine whether the difference exceeds the threshold. The threshold is a configured value (for example, 20%) that specifies the maximum tolerated data skew, where data skew is the difference in stored data between the largest and smallest nodes by disk usage. It can be set freely according to the specific conditions of the disks.
[0066] a. If the difference does not exceed the ...
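The detection steps visible in this excerpt can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary of per-node disk usage, and the choice to express skew as (max - min) / max are all assumptions, since the text does not say how the 20% figure is normalized.

```python
from typing import Mapping


def relative_skew(disk_usage: Mapping[str, int]) -> float:
    """Return (max - min) / max over per-node disk usage in bytes.

    Assumption: skew is measured relative to the fullest node. The
    source only says "difference ... exceeds the threshold (such as 20%)".
    """
    if not disk_usage:
        return 0.0
    largest = max(disk_usage.values())
    smallest = min(disk_usage.values())
    if largest == 0:
        return 0.0  # nothing stored anywhere, so nothing to balance
    return (largest - smallest) / largest


def should_balance(
    migration_in_progress: bool,
    disk_usage: Mapping[str, int],
    threshold: float = 0.20,  # hypothetical default, taken from the "such as 20%" example
) -> bool:
    """Step (1): do nothing while a migration runs. Step (2): compare skew to threshold."""
    if migration_in_progress:
        return False
    return relative_skew(disk_usage) > threshold
```

For example, `should_balance(False, {"ts1": 100, "ts2": 70})` evaluates to `True` under these assumptions, because the skew of 0.3 exceeds the 0.2 threshold.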
Embodiment 2
[0080] As shown in Figure 1, the size-based data balancing method for Kudu databases of the present invention comprises the following steps:
[0081] S1. The cache acquires the Table being migrated;
[0082] S2. Use the data balance condition detection module to determine whether a migration task is in progress:
[0083] a. If a migration task is in progress, go to step S10;
[0084] b. If no migration task is in progress, execute step S3;
[0085] S3. Calculate the difference in disk usage between the node occupying the most disk space and the node occupying the least, and determine whether the difference exceeds the threshold:
[0086] a. If the difference does not exceed the threshold, jump to step S10;
[0087] b. If the difference exceeds the threshold, execute step S4;
[0088] S4. Obtain the source host with the largest disk usage;
[0089] S5. Obtain the largest Table of the source host, and use the tablet selection mo...
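Step S4 and the visible first half of step S5 can be sketched as below. The nested dictionary layout (host, then table name, then bytes stored) and the function name `pick_source` are hypothetical; the rest of S5 is cut off in this excerpt and is deliberately not modeled.

```python
from typing import Dict, Tuple

# Hypothetical layout: host name -> {table name -> bytes stored on that host}
TableSizes = Dict[str, Dict[str, int]]


def pick_source(table_sizes: TableSizes) -> Tuple[str, str]:
    """Return (source_host, largest_table) for the fullest host.

    Raises ValueError if there are no hosts or the chosen host has no tables.
    """
    # S4: the source host is the one with the largest total disk usage.
    source_host = max(table_sizes, key=lambda host: sum(table_sizes[host].values()))
    # S5 (first half only): the largest table stored on that host.
    tables = table_sizes[source_host]
    largest_table = max(tables, key=lambda name: tables[name])
    return source_host, largest_table
```

With `{"ts1": {"a": 50, "b": 30}, "ts2": {"a": 20}}` as input, this sketch selects host `ts1` and table `a`.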
Abstract
The invention discloses a size-based Kudu database data equalization system and an implementation method, and relates to the database field. The technical problem to be solved is how to equalize data in a Kudu database based on size. In the adopted technical scheme, the system comprises a data equalization condition detection module, a to-be-migrated Tablet selection module and a Tablet migration execution module. The data equalization condition detection module detects whether a data equalization operation should be executed; the to-be-migrated Tablet selection module selects the Tablet to be migrated and the migrated node; and the Tablet migration execution module executes the actual data migration. The invention also discloses a size-based Kudu database data equalization method.
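One way to picture how the three modules in the abstract fit together is the sketch below. It is only an illustration under stated assumptions: `Balancer`, `MigrationPlan`, and `run_once` are hypothetical names, each module is reduced to a plain callable, and `target_node` reflects a reading of "migrated node" as the destination of the migration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MigrationPlan:
    tablet_id: str    # the Tablet chosen for migration
    target_node: str  # assumed meaning of "migrated node": where the Tablet goes


@dataclass
class Balancer:
    detect: Callable[[], bool]                     # data equalization condition detection module
    select: Callable[[], Optional[MigrationPlan]]  # to-be-migrated Tablet selection module
    execute: Callable[[MigrationPlan], None]       # Tablet migration execution module

    def run_once(self) -> Optional[MigrationPlan]:
        """Run one detect -> select -> execute pass; return the plan executed, if any."""
        if not self.detect():
            return None
        plan = self.select()
        if plan is not None:
            self.execute(plan)
        return plan
```

Keeping the modules as injected callables makes each one replaceable and testable in isolation, for example by passing `executed.append` as the execution module and inspecting the list afterwards.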
Description
Technical field
[0001] The invention relates to the field of databases, in particular to a size-based Kudu database data balancing system and an implementation method.
Background technique
[0002] The Hadoop ecosystem has many components, each with different functions. In real scenarios, users often need to deploy many Hadoop tools at the same time to solve a single problem. For example, users rely on HBase's fast insert and fast random read access to import data, and use HDFS / Parquet + Impala / Hive to query and analyze very large data sets. Many companies have successfully deployed a hybrid HDFS / Parquet + HBase architecture. However, this architecture is complicated, difficult to maintain, and introduces data delays. Massive structured storage aims to store structured data with a simple architecture, achieve HBase's fast import and fast query together with Parquet's analysis of very large data sets, and solve the data delay proble...