
Data distribution control method, system and device of a distributed storage system

A distributed storage and data distribution technology, applied in database distribution/replication, transmission systems, electronic digital data processing, etc. It addresses the problems of reduced network scale, heavy network burden, and the high cost of network equipment and architecture adjustment, and is intended to improve read and write performance and to address bandwidth costs.

Active Publication Date: 2022-05-06
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

With a single distribution strategy, data with high availability requirements may not be protected against large-scale failures, while data with low availability requirements may not get the read and write speed it needs and also places a heavy traffic burden on the network.
[0007] In addition, related technologies divide fault-tolerant domains at a single level. If all domains are divided by rack, for example, the differences between fault-tolerant domains at different levels are hard to express, which undermines the rationality and effectiveness of data distribution.
For example, because the network has a layered architecture, the bandwidth between two racks is not necessarily the same: traffic may pass through a core switch or even the Internet. If the bandwidth between the two racks across which data is distributed is small, data writing slows down.
One remedy is to adjust the network architecture so that the bandwidth between any two nodes in the system is the same. However, the cost of network equipment and architecture adjustment is relatively high, network wiring inside the computer room becomes more difficult, the scale the network can carry is reduced, and the goal of cost reduction cannot be achieved.



Examples


Embodiment 1

[0026] Figure 1 shows the layered network architecture of the storage nodes in the distributed storage system of this embodiment (also referred to herein as the system), which includes the following nodes:

[0027] Storage node: an entity that stores data, also called a machine (host, Machine).

[0028] Access switch (ASW): a switch that provides network access for storage nodes. In actual deployment, multiple machines on one rack are usually connected to the same access switch.

[0029] Aggregation switch (PSW): the aggregation point of multiple access-layer switches. It handles all traffic from the access-layer devices and provides uplinks to the core layer.

[0030] Core switch (DSW): a switch deployed at the core layer (network backbone). The core layer provides an optimized and reliable backbone transmission structure through high-speed forwarding.

[0031] The distributed storage system in...
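
The layered architecture described above can be illustrated with a small sketch. This is not the patent's implementation: the `StorageNode` record, its field names, and the `highest_switch_on_path` helper are hypothetical, and the sketch assumes each switch has a globally unique name. It shows why two nodes under the same access switch exchange traffic without touching the upper layers, while nodes in different parts of the hierarchy must cross an aggregation or core switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    # All field names are illustrative assumptions, not taken from the patent.
    name: str
    asw: str  # access switch the machine is attached to
    psw: str  # aggregation switch above that access switch
    dsw: str  # core switch above that aggregation switch


def highest_switch_on_path(a: StorageNode, b: StorageNode) -> str:
    """Return the highest switch layer that traffic between a and b must cross."""
    if a.asw == b.asw:
        return "ASW"
    if a.psw == b.psw:
        return "PSW"
    if a.dsw == b.dsw:
        return "DSW"
    return "beyond DSW"


m1 = StorageNode("m1", asw="asw1", psw="psw1", dsw="dsw1")
m2 = StorageNode("m2", asw="asw1", psw="psw1", dsw="dsw1")
m3 = StorageNode("m3", asw="asw2", psw="psw1", dsw="dsw1")
m4 = StorageNode("m4", asw="asw3", psw="psw2", dsw="dsw1")

print(highest_switch_on_path(m1, m2))  # same access switch
print(highest_switch_on_path(m1, m3))  # same aggregation switch
print(highest_switch_on_path(m1, m4))  # same core switch
```

The higher the layer a path must cross, the more likely it is to share a constrained uplink, which is the bandwidth difference the problem statement refers to.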

Embodiment 2

[0111] Figure 4 is the topology diagram of the storage nodes in the distributed storage system of this embodiment, which includes storage nodes, access switches, and core switches; see also Figure 1. In this embodiment, the fault-tolerant domains are divided into three layers: Machine, Rack, and Zone. The distributed storage system includes multiple Zones (ZoneA and ZoneB are shown in the figure), and each Zone includes multiple Racks.

[0112] Based on the above topology, each process of the data distribution control method in this embodiment is described below.

[0113] The data distribution control method in this embodiment includes a process of establishing a topology relationship, a process of writing data, and a process of restoring data, as follows:

[0114] The process of establishing a topology relationship includes:

[0115] Step 1, the storage node generates topology information, and carries the topology information when registering, the topology information incl...
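
The three-layer division of fault-tolerant domains in this embodiment (Machine, Rack, Zone) can be sketched as a path from the outermost to the innermost layer. The `Location` tuple and the `smallest_shared_domain` helper below are hypothetical names chosen for illustration; the content of the topology information that a storage node actually registers is not reproduced here. The sketch only shows how two locations can be compared to find the innermost fault-tolerant domain that contains both.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    # Ordered from the outermost layer to the innermost layer.
    zone: str
    rack: str
    machine: str


LEVELS = ("Zone", "Rack", "Machine")


def smallest_shared_domain(a: Location, b: Location) -> Optional[str]:
    """Return the innermost layer whose domain contains both locations, or None."""
    shared = None
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break  # deeper layers cannot be shared once an outer layer differs
        shared = level
    return shared


a = Location("ZoneA", "rack1", "m1")
b = Location("ZoneA", "rack1", "m2")
c = Location("ZoneA", "rack2", "m3")
d = Location("ZoneB", "rack1", "m1")

print(smallest_shared_domain(a, b))  # same rack
print(smallest_shared_domain(a, c))  # same zone only
print(smallest_shared_domain(a, d))  # nothing shared
```

Two copies whose smallest shared domain is a Rack are lost together if that rack fails, whereas copies that share nothing survive the failure of any single Zone.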

Embodiment 3

[0141] This embodiment relates to data distribution control in an upper-layer application system. A cloud disk system used by users is taken as the example of the upper-layer application system to illustrate how distribution is controlled when data is stored. The cloud disk system uses the distributed storage system as its underlying storage system; for example, the distributed storage system of Embodiment 1 or Embodiment 2 can be used.

[0142] In the application scenario where users store data through cloud disks, the corresponding data distribution control methods include:

[0143] Step 1. The cloud disk system provides users with various cloud disks with different availability: high-availability cloud disks and low-availability cloud disks;

[0144] High-availability cloud disks can be offered for a fee, with availability indicators such as failure recovery time promised to users. Low-availability cloud disks are free, and no such promises are made to users.

[014...
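
As a rough illustration of how the two kinds of cloud disk might be tied to distribution strategies, the sketch below maps each disk tier to one strategy. The remaining steps of this embodiment are not available here, so the mapping is an assumption drawn from the problem statement (high availability calls for protection against large-scale failures, low availability favors read and write speed). The enum and function names are hypothetical.

```python
from enum import Enum


class DiskTier(Enum):
    HIGH_AVAILABILITY = "high"
    LOW_AVAILABILITY = "low"


class DistributionStrategy(Enum):
    CROSS_DOMAIN = "distribute across fault-tolerant domains"
    WITHIN_DOMAIN = "distribute within one fault-tolerant domain"


# Assumed mapping; the source does not spell it out in the visible text.
STRATEGY_FOR_TIER = {
    DiskTier.HIGH_AVAILABILITY: DistributionStrategy.CROSS_DOMAIN,
    DiskTier.LOW_AVAILABILITY: DistributionStrategy.WITHIN_DOMAIN,
}


def strategy_for_disk(tier: DiskTier) -> DistributionStrategy:
    """Look up the distribution strategy assumed for a cloud disk tier."""
    return STRATEGY_FOR_TIER[tier]


print(strategy_for_disk(DiskTier.HIGH_AVAILABILITY).value)
print(strategy_for_disk(DiskTier.LOW_AVAILABILITY).value)
```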


Abstract

A data distribution control method, system, and device for a distributed storage system. The distribution control system determines the distribution strategy to be used for first data from among multiple distribution strategies provided by the distributed storage system; the multiple distribution strategies include a strategy of distributing across fault-tolerant domains and a strategy of distributing within a fault-tolerant domain. Then, according to the chosen distribution strategy and the topology relationship of the distributed storage system, a fault-tolerant domain is allocated for the first data and the data is written. This application adds a data distribution attribute setting, which can accommodate the differing requirements of different data.
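
The two strategies named in the abstract can be sketched as a small placement function. This is a minimal illustration under stated assumptions, not the claimed method: the topology is reduced to a flat list of (domain, machine) pairs, the copy count of 3 is arbitrary, selection is random, and the function name and strategy strings are hypothetical.

```python
import random
from collections import defaultdict


def place_copies(nodes, strategy, copies=3, rng=None):
    """Pick (domain, machine) targets for the copies of one piece of data.

    nodes: list of (domain, machine) pairs describing the topology.
    strategy: "cross_domain" or "within_domain" (illustrative names).
    """
    rng = rng or random.Random()
    by_domain = defaultdict(list)
    for domain, machine in nodes:
        by_domain[domain].append(machine)

    if strategy == "cross_domain":
        # One copy per domain, so no single domain failure loses every copy.
        if len(by_domain) < copies:
            raise ValueError("not enough fault-tolerant domains")
        chosen = rng.sample(sorted(by_domain), copies)
        return [(d, rng.choice(by_domain[d])) for d in chosen]

    if strategy == "within_domain":
        # All copies in one domain, on distinct machines, to keep traffic local.
        candidates = [d for d in sorted(by_domain) if len(by_domain[d]) >= copies]
        if not candidates:
            raise ValueError("no single domain has enough machines")
        domain = rng.choice(candidates)
        return [(domain, m) for m in rng.sample(by_domain[domain], copies)]

    raise ValueError(f"unknown strategy: {strategy}")


topology = [(f"rack{r}", f"m{r}{i}") for r in range(1, 4) for i in range(1, 4)]
print(place_copies(topology, "cross_domain", rng=random.Random(0)))
print(place_copies(topology, "within_domain", rng=random.Random(0)))
```

Distributing within one domain trades resilience against a domain-wide failure for shorter network paths during writes, which is the trade-off the distribution attribute lets each piece of data choose.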

Description

Technical field

[0001] The present invention relates to distributed storage systems, and more specifically to a data distribution control method, system, and device for a distributed storage system.

Background

[0002] In current large-scale distributed storage systems, to keep data accessible when a fault domain has a problem, multiple copies of the data are stored across fault domains, guarding against the availability problems caused by the failure of a single fault domain. For example, in the Hadoop Distributed File System (HDFS), multiple copies of data are distributed to different racks for storage. A rack in HDFS constitutes a fault domain (also known as an error domain), which represents a physical unit that can fail. By placing the copies in different racks, when the power supply of a rack or the corresponding switch fails, the data can still be accesse...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/27, G06F16/182, H04L67/1097
CPC: H04L67/1097, G06F16/182, G06F16/27
Inventor: 姚文辉, 陆靖, 吕鹏程, 常艳军, 朱家稷
Owner: ALIBABA GRP HLDG LTD