High-performance storage control method based on distributed high-capacity fragmentation

A storage control technique for large-capacity values, applicable to special-purpose data processing, instruments, and file systems. It addresses problems such as slow queries, service blocking, and sudden QPS drops, and it enables high-speed complex calculations while improving availability and performance.

Pending Publication Date: 2021-01-12
TIANYI ELECTRONICS COMMERCE

AI Technical Summary

Problems solved by technology

Common distributed cache systems encounter large values in scenarios such as: 1. comments on hot topics and answer-ranking scenarios; 2. follower lists of high-profile verified accounts ("big Vs"); 3. improper use, inaccurate business estimates, or garbage data that is not cleaned up in time; 4. large merchants in a payment system, whose accounts and transactions must be counted over a period of time and may run to millions of records; 5. counting all personal mobile phone numbers and related information under a hot IP address. Many systems with extremely strict timeliness requirements must also compute indicator data for a specific scenario in a very short time. As in the scenarios above, this indicator data has to be stored in the cache system, and its volume is huge.
The value for a single key may reach gigabytes in size. Querying such a value can cause network-related failures, and because many distributed storage architectures use a single-threaded model built on NIO, a large-value query can stall queries across the entire cluster. In cluster mode, even when slots are sharded evenly, data and queries are skewed: nodes holding large values suffer from high memory usage and high QPS.
When a large key is deleted or expires automatically, QPS drops or spikes suddenly. In extreme cases, master-slave replication becomes abnormal and the service blocks and cannot respond to requests.
With a large number of slow queries, severe queuing congestion overloads some nodes with CPU-intensive and network-IO-intensive work, which can paralyze the entire cache system and seriously affect the businesses that depend on it.



Examples


Embodiment 1

[0025] As shown in Figures 1-4, the present invention provides a high-performance storage control method based on distributed large-capacity sharding. It comprises a storage configuration manager, a shard routing controller, a cache storage controller, a cache storage recovery controller, and a cache retrieval controller. The storage structure has two levels. The first level uses the Set data structure: the key is the first-level key, which holds the unique ID of the business metadata, and the value is the distribution path address index, which stores the shard path addresses (these are the keys of the second-level structure, hereinafter called secondary keys). The second level is the shard storage structure, which uses the ZSet structure: the key is the shard address, the score is the timestamp at which the entry was stored, and the value is the metadata content. The specific structure is shown in Figure 1.
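The two-level layout described in [0025] can be pictured with a small in-memory model. This is only an illustrative sketch, not the patented implementation: the class name `TwoLevelStore`, the `business_id:shard:n` key format, and the fixed per-shard entry cap are all assumptions made for the example, and plain Python containers stand in for the Set and ZSet structures of a real cache.

```python
import time


class TwoLevelStore:
    """Toy model of the two-level structure (all names here are hypothetical)."""

    def __init__(self, max_shard_entries=1000):
        # Assumed routing rule: open a new shard once the newest one is full.
        self.max_shard_entries = max_shard_entries
        # First level: business ID -> secondary keys (shard addresses).
        # A list of unique keys stands in for the Set so that order is visible.
        self.index = {}
        # Second level: shard address -> {member: score}, standing in for a ZSet.
        self.shards = {}

    def put(self, business_id, member, timestamp=None):
        """Store one entry and return the shard address it was written to."""
        score = time.time() if timestamp is None else timestamp
        shard_keys = self.index.setdefault(business_id, [])
        newest_full = (
            bool(shard_keys)
            and len(self.shards[shard_keys[-1]]) >= self.max_shard_entries
        )
        if not shard_keys or newest_full:
            shard_key = f"{business_id}:shard:{len(shard_keys)}"
            shard_keys.append(shard_key)
            self.shards[shard_key] = {}
        target = shard_keys[-1]
        self.shards[target][member] = score  # score = storage timestamp
        return target


store = TwoLevelStore(max_shard_entries=2)
first = store.put("merchant42", "tx1", timestamp=100)
second = store.put("merchant42", "tx2", timestamp=101)
third = store.put("merchant42", "tx3", timestamp=102)
print(store.index["merchant42"])
```

With a cap of two entries per shard, the third write opens a second shard, so no single key ever holds more than a bounded amount of data. The cap-based rule is only one possible routing policy; the source does not specify the routing algorithm in the text available here.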

[0026] (1...


Abstract

The invention discloses a high-performance storage control method based on distributed high-capacity fragmentation. A device implementing the method comprises a configuration manager, a fragmentation routing controller, a cache storage controller, a cache storage recovery controller, and a cache retrieval controller. The method avoids storing large values in a distributed storage system and uses an algorithm to divide the original data set in an intelligent, controllable way. When the volume of stored content grows, the fragmentation algorithm adapts dynamically; when a node fails, the method determines whether the tasks on that node can be redistributed evenly to other nodes. It solves the problem of large amounts of data concentrating on one physical node because the feature values of the original data are unevenly distributed. Performance and concurrency improve because read and write operations are spread across different fragments and are independent of one another. System availability improves because other fragments are unaffected even if some fragments are unavailable. The method also supports high-speed retrieval and complex calculation over complex large objects in different scenarios, as well as time-sliding-window extraction of query data.
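Time-sliding-window extraction can be read as a score-range query fanned out over every shard of a business ID, similar in spirit to a Redis `ZRANGEBYSCORE` call per shard. The sketch below is an assumption-laden illustration rather than the patent's retrieval controller: the function name `query_window`, the dictionary-based shard model, and the choice to skip a missing shard are all hypothetical.

```python
def query_window(index, shards, business_id, start, end):
    """Return members with start <= score <= end across all shards, oldest first."""
    hits = []
    for shard_key in index.get(business_id, []):
        shard = shards.get(shard_key)
        if shard is None:
            # Assumed behaviour: an unavailable shard is skipped so that
            # the remaining shards can still answer the query.
            continue
        for member, score in shard.items():
            if start <= score <= end:
                hits.append((score, member))
    hits.sort()  # order by timestamp score, then by member name
    return [member for _, member in hits]


# Hypothetical data: shard 1 is listed in the index but currently unavailable.
index = {
    "merchant42": [
        "merchant42:shard:0",
        "merchant42:shard:1",
        "merchant42:shard:2",
    ]
}
shards = {
    "merchant42:shard:0": {"tx1": 100, "tx2": 105},
    "merchant42:shard:2": {"tx5": 120, "tx6": 130},
}

window = query_window(index, shards, "merchant42", 105, 120)
print(window)  # ['tx2', 'tx5']
```

Because each shard is queried independently, the loop could be run in parallel against different nodes, which is one way to read the abstract's claim that reads and writes on different fragments are mutually independent.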

Description

Technical field

[0001] The invention relates to the technical field of computer software applications, in particular to a high-performance storage control method based on distributed large-capacity fragmentation.

Background technique

[0002] In the prior art, distributed systems, and distributed storage systems in particular, must solve the core problems of data fragmentation and data redundancy. For distributed storage systems with a K/V storage structure, a very difficult problem is how to store and retrieve large-value content with high performance. When using a cache cluster, the two most feared situations are hot keys and large values. A hot key is a key in the cache cluster that is hit by tens of thousands or even hundreds of thousands of concurrent requests in an instant. Common distributed cache systems have large value scenarios such as: 1. Comments on hot topics, answer sorting scenarios; 2. Fan lists of big...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/172, G06F16/13, G06F16/182
CPC: G06F16/13, G06F16/172, G06F16/182
Inventor: 李真, 杨富安, 徐冬冬, 张荣燕, 赵新浪, 杨章春, 王维龙
Owner: TIANYI ELECTRONICS COMMERCE