Data splitting method and device for a distributed storage system

A distributed storage and data sharding technology, applied in database models, digital data processing, structured data retrieval, etc. It addresses problems such as unbalanced access request load, a mismatch between the distribution of data and the distribution of requests, and unresolved hotspots, so as to eliminate access request hotspots and achieve load balancing.

Active Publication Date: 2020-03-27
BEIJING QIYI CENTURY SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] However, in practical applications, access to data is not uniform. Even if an index-based splitting strategy divides the data volume evenly, access requests may still concentrate on part of the data. The distribution of data volume across fragments may not match the distribution of access requests, so the index-based splitting strategy fails to solve the hotspot problem and the load of access requests remains unbalanced.
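The mismatch described in [0005] can be illustrated with a small sketch. Everything in it is hypothetical and chosen only for illustration: the integer key range 0..99, the skewed access log, and the helper name `load_share` do not come from the patent text.

```python
def load_share(requests, split_point):
    """Count requests landing on each side of a split point (left: key < split_point)."""
    left = sum(1 for key in requests if key < split_point)
    return left, len(requests) - left


# Hypothetical access log over keys 0..99:
# 900 requests hit keys 0..9, 100 requests are spread over keys 10..99.
requests = [k % 10 for k in range(900)] + [10 + k % 90 for k in range(100)]

# Index-based split: key 50 halves the key range (equal data volume per side).
by_volume = load_share(requests, 50)   # (950, 50): almost all load stays on one side

# Splitting at a key near the middle of the *requests* instead balances the load.
by_access = load_share(requests, 5)    # (450, 550)
```

Under these assumed numbers, halving the data leaves 95% of the requests on one fragment, while a split point taken from the access distribution divides the load roughly evenly.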

Embodiment Construction

[0062] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0063] In order to achieve load balancing of access requests, embodiments of the present invention provide a data splitting method and device for a distributed storage system.

[0064] A data splitting method for a distributed storage system provided by an embodiment of the present invention is introduced first below.

[0065] Referring to Figure 1, an embodiment of the present invention provides a data splitting method for a distributed storage system, compri...

Abstract

An embodiment of the invention provides a data splitting method and device for a distributed storage system. The method comprises: monitoring whether the queries per second (QPS) of each data fragment exceeds a first preset threshold; determining a data fragment whose QPS exceeds the first preset threshold as a target data fragment, sampling the access requests of the target data fragment, and forming a sampled data stream of the index key values corresponding to the sampled access requests; judging whether the QPS of the target data fragment exceeds a second preset threshold; if so, determining the median of the index key values in the sampled data stream for each preset time period; predicting, from the determined medians, the median of the index key values in the next preset time period after the current moment, and taking the predicted median as the splitting point; and splitting the target data fragment at the splitting point. Splitting the data fragments at access request hotspots in this way achieves load balancing of access requests.
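The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: index key values are assumed to be numeric, sampling is assumed to have happened upstream, and the prediction step uses a simple linear extrapolation of the last two period medians because the text available here does not say which prediction model is used. All function and parameter names are hypothetical.

```python
from statistics import median


def predict_next_median(period_medians):
    """Predict the next period's median from past period medians.

    Assumption: linear extrapolation of the last two values; the source
    text shown here does not specify the prediction model.
    """
    if not period_medians:
        raise ValueError("need at least one period median")
    if len(period_medians) == 1:
        return period_medians[-1]
    prev, last = period_medians[-2], period_medians[-1]
    return last + (last - prev)


def choose_split_point(qps, sampled_keys_by_period, first_threshold, second_threshold):
    """Return a split point for one data fragment, or None if no split is needed.

    sampled_keys_by_period: one list of sampled index key values per preset
    time period, oldest first (hypothetical input shape).
    """
    if qps <= first_threshold:
        return None  # not a target fragment
    if qps <= second_threshold:
        return None  # target fragment is sampled, but not split yet
    medians = [median(keys) for keys in sampled_keys_by_period if keys]
    if not medians:
        return None  # nothing sampled, so no basis for a split point
    return predict_next_median(medians)


def split_fragment(key_range, split_point):
    """Split a half-open key range (low, high) at split_point."""
    low, high = key_range
    if not low < split_point < high:
        raise ValueError("split point must fall strictly inside the key range")
    return (low, split_point), (split_point, high)


# Hypothetical usage: the hotspot drifts upward across three periods.
periods = [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
point = choose_split_point(1500, periods, first_threshold=500, second_threshold=1000)
halves = split_fragment((0, 100), point)  # ((0, 50), (50, 100))
```

Using a predicted median rather than the latest observed one lets the split point follow a hotspot that is still moving, which is the apparent motivation for the prediction step in the abstract.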

Description

Technical field

[0001] The invention relates to the technical field of data storage, in particular to a data splitting method and device for a distributed storage system.

Background

[0002] In the era of big data, the storage of massive data is a key technology. As data scale grows, when the QPS (queries per second) of a single storage node becomes too high for the node to bear the access pressure, it is necessary to adopt a distributed storage solution and use data sharding to distribute data across different storage nodes, so as to eliminate access request hotspots and achieve load balancing. Here, an access request hotspot refers to the storage node hosting a data fragment whose QPS is too high, and a storage node refers to a storage server, which may be a physical server or a virtual server.

[0003] Among them, traditional relational databases and various NoSQL (Not Only SQL, non-relational database) ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/242, G06F16/2458, G06F16/28
CPC: G06F16/2423, G06F16/2471, G06F16/28
Inventor: 郑浩南
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD