Redis-based mass data classified storage method and system

A mass data storage technology applied in the computer field. It addresses problems such as rising response timeout rates, growing query times, and difficulty keeping up with the continuous growth of user data volume, and achieves the effect of reducing memory fragmentation and memory occupation.

Active Publication Date: 2021-08-24
上海艾麒信息科技股份有限公司
12 Cites 0 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0003] Storing user data directly in a database: this approach places almost no limit on the growth of user data, and hard disk is cheaper than memory, but query efficiency and response time are far inferior to the in-memory database Redis. For a real-time advertising trading system, it increases the response timeout rate of ad bidding, and as user data grows, the query time increases, wh...

Method used

The present invention converts key-value pair data into hash-type data in which the converted hash key and field are integers, and keeps the number of fields in each hash bucket at no more than 512, so that the data is stored in the ziplist structure, which greatly saves space. Because the space occupied by the data of the integer...

Abstract

The invention provides a Redis-based mass data classified storage method and system. The method comprises the steps: S1, classifying data, and defining a data category ID for each category of data; S2, for each data category, calculating the number N of hash buckets according to the data scale of the corresponding actual service; S3, taking the data identifier ID, the data category ID and the number N of hash buckets as input parameters, and performing hash calculation to obtain a hash key and a field; S4, using the data content corresponding to the data identifier ID as a hash value; and S5, storing the hash key, the field and the hash value in Redis. According to the method and the system, the user data identifier is converted into numeric form and stored in Redis as hash-type data, so that the purpose of reducing memory occupation is achieved.

Application Domain

Redundant data error correction; Special data processing applications; +1

Technology Topic

Data content; Engineering; +6

Example Embodiment

[0052] Example 1
[0053] According to the present invention, a Redis-based mass data classified storage method is provided, including:
[0054] Step S1: Classify the data and define a data category ID for each category of data;
[0055] Step S2: For each data category, calculate the number N of hash buckets according to the data scale of the corresponding actual service. In Redis, hash-type data is stored most compactly in the ziplist data structure, which requires that the number of fields stored under one hash key (that is, in one hash bucket) not exceed 512. The number N of hash buckets is therefore calculated as N = total data volume / 512, rounded up;
[0056] Step S3: Take the data identifier ID, the data category ID, and the number N of hash buckets as input parameters, and perform hash calculations to obtain the hash key and the field;
[0057] Step S4: Use the data content corresponding to the data identifier ID as the hash value;
[0058] Step S5: Store the hash key, field, and hash value in Redis.
[0059] Specifically, step S2 includes:
[0060] N = total data volume / 512, rounded up.
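The bucket-count rule in step S2 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `bucket_count` and the constant `ZIPLIST_MAX_FIELDS` are hypothetical, and the only value taken from the text is the limit of 512 fields per bucket.

```python
ZIPLIST_MAX_FIELDS = 512  # per-bucket field limit stated in the text


def bucket_count(total_records: int, max_fields: int = ZIPLIST_MAX_FIELDS) -> int:
    """Return N = total_records / max_fields, rounded up (at least one bucket)."""
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return max(1, (total_records + max_fields - 1) // max_fields)


print(bucket_count(500_000_000))
```

Note that N is sized for an average of 512 fields per bucket. Because hashing does not spread records perfectly evenly, some buckets will hold more than the average, so choosing a larger N leaves headroom; the text does not discuss this choice.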
[0061] Specifically, the hash key is obtained as follows: the data identifier ID, the data category ID, and the number N of hash buckets are taken as inputs to hash algorithm A, which calculates an integer-type hash result A; the integer-type hash result A is used as the hash key of the Redis hash data type;
[0062] Hash algorithm A applies the CRC32 algorithm to the data identifier ID to scatter it and then maps the result onto the number of hash buckets, so that the keys of different user data are distributed across different data buckets as far as possible;
[0063] The integer-type hash result A is:
[0064] Integer-type hash result A = data category ID * number of hash buckets N + data identifier ID';
[0065] where data identifier ID' is the value obtained by applying the CRC32 algorithm to the data identifier ID and then reducing the result against the number of hash buckets.
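The formula above can be illustrated with Python's standard `zlib.crc32`. This is a sketch under stated assumptions: the text does not spell out how the CRC32 result is mapped onto the buckets, so a modulo by N is assumed here, as is UTF-8 encoding of the identifier; the function name `hash_key` is hypothetical.

```python
import zlib


def hash_key(data_id: str, category_id: int, n_buckets: int) -> int:
    """Integer hash result A = category_id * N + ID'."""
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    # ID': CRC32 of the identifier, reduced to a bucket index in [0, N-1].
    # The modulo step is an assumption; the source only says the CRC32
    # result is combined with the number of buckets.
    bucket_index = zlib.crc32(data_id.encode("utf-8")) % n_buckets
    return category_id * n_buckets + bucket_index


key = hash_key("user-42", category_id=1, n_buckets=1_000_000)
print(key)
```

Under this reading, every key of category c falls in the interval [c * N, c * N + N - 1], which is what lets a category be addressed as a contiguous key range.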
[0066] Specifically, the field in step S3 is obtained as follows: with the data identifier ID as the only input, hash algorithm B calculates an integer-type hash result B, which is used as the field under the hash key;
[0067] Hash algorithm B uses the BKDRHash algorithm to obtain an integer with a low collision rate, so that the fields of different user data stored in the same data bucket differ as far as possible;
[0068] The integer-type hash result B is obtained by applying the BKDRHash algorithm to the data identifier ID.
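For reference, a common form of BKDRHash looks like the sketch below. The text names the algorithm but gives no parameters, so the seed of 131, the 32-bit wraparound, the final 31-bit mask, and the UTF-8 encoding are all assumptions taken from the widely circulated C version rather than from the patent.

```python
def bkdr_hash(data_id: str, seed: int = 131) -> int:
    """BKDRHash of a string, returned as a non-negative 31-bit integer."""
    h = 0
    for byte in data_id.encode("utf-8"):
        # Multiply-and-add, emulating unsigned 32-bit overflow.
        h = (h * seed + byte) & 0xFFFFFFFF
    # Drop the top bit so the result fits in a signed 32-bit integer.
    return h & 0x7FFFFFFF


print(bkdr_hash("ab"))  # 97 * 131 + 98 = 12805
```

Because the field is a hash rather than the identifier itself, two identifiers that land in the same bucket and also share a BKDRHash value would overwrite each other; the text only claims that such collisions are made unlikely, not impossible.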
[0069] Specifically, the method also includes reading the mass data stored by classification in Redis;
[0070] Reading the mass data stored by classification in Redis includes: taking the data identifier ID, the data category ID, and the number N of hash buckets corresponding to that data category as input parameters, and calculating the hash key and field; then, with the hash key and field as parameters, reading the hash value (the data content corresponding to the data identifier ID) from Redis through the HGET command of the Redis hash type.
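The write path (steps S3 to S5) and the read path described above can be sketched together. To keep the example runnable without a server, it uses a hypothetical in-memory class, `InMemoryHashStore`, whose `hset(name, key, value)` and `hget(name, key)` methods mirror the Redis HSET and HGET commands; the modulo reduction and the BKDRHash seed of 131 are assumptions, and all function names are illustrative.

```python
import zlib
from collections import defaultdict


class InMemoryHashStore:
    """Hypothetical stand-in that mimics the HSET/HGET calls of a Redis client."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def hset(self, name, key, value):
        self._hashes[name][key] = value

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)


def bkdr_hash(data_id: str, seed: int = 131) -> int:
    """Hash algorithm B (assumed seed 131, 31-bit result)."""
    h = 0
    for byte in data_id.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h & 0x7FFFFFFF


def locate(data_id: str, category_id: int, n_buckets: int):
    """Return (hash key, field); the same calculation serves writes and reads."""
    bucket_index = zlib.crc32(data_id.encode("utf-8")) % n_buckets  # assumed modulo
    return category_id * n_buckets + bucket_index, bkdr_hash(data_id)


def save(store, data_id, category_id, n_buckets, content):
    key, field = locate(data_id, category_id, n_buckets)
    store.hset(key, field, content)


def load(store, data_id, category_id, n_buckets):
    key, field = locate(data_id, category_id, n_buckets)
    return store.hget(key, field)


store = InMemoryHashStore()
save(store, "user-42", 1, 1000, "profile-data")
print(load(store, "user-42", 1, 1000))
```

With a real Redis client the stand-in would be swapped for the client object; note that Redis stores keys and fields as strings on the wire and may return values as bytes, so a real deployment would need to handle that conversion.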
[0071] A Redis-based mass data classified storage system is provided in accordance with the present invention, including:
[0072] Module M1: Classify the data and define a data category ID for each category of data;
[0073] Module M2: For each data category, calculate the number N of hash buckets according to the data scale of the corresponding actual service. In Redis, hash-type data is stored most compactly in the ziplist data structure, which requires that the number of fields stored under one hash key (that is, in one hash bucket) not exceed 512. The number N of hash buckets is therefore calculated as N = total data volume / 512, rounded up;
[0074] Module M3: Take the data identifier ID, the data category ID, and the number N of hash buckets as input parameters, and perform hash calculations to obtain the hash key and the field;
[0075] Module M4: Use the data content corresponding to the data identifier ID as the hash value;
[0076] Module M5: Store the hash key, field, and hash value in Redis.
[0077] Specifically, module M2 includes:
[0078] N = total data volume / 512, rounded up.
[0079] Specifically, the hash key is obtained as follows: the data identifier ID, the data category ID, and the number N of hash buckets are taken as inputs to hash algorithm A, which calculates an integer-type hash result A; the integer-type hash result A is used as the hash key of the Redis hash data type;
[0080] Hash algorithm A applies the CRC32 algorithm to the data identifier ID to scatter it and then maps the result onto the number of hash buckets, so that the keys of different user data are distributed across different data buckets as far as possible;
[0081] The integer-type hash result A is:
[0082] Integer-type hash result A = data category ID * number of hash buckets N + data identifier ID';
[0083] where data identifier ID' is the value obtained by applying the CRC32 algorithm to the data identifier ID and then reducing the result against the number of hash buckets.
[0084] Specifically, the field in module M3 is obtained as follows: with the data identifier ID as the only input, hash algorithm B calculates an integer-type hash result B, which is used as the field under the hash key;
[0085] Hash algorithm B uses the BKDRHash algorithm to obtain an integer with a low collision rate, so that the fields of different user data stored in the same data bucket differ as far as possible;
[0086] The integer-type hash result B is obtained by applying the BKDRHash algorithm to the data identifier ID.
[0087] Specifically, the system also supports reading the mass data stored by classification in Redis;
[0088] Reading the mass data stored by classification in Redis includes: taking the data identifier ID, the data category ID, and the number N of hash buckets corresponding to that data category as input parameters, and calculating the hash key and field; then, with the hash key and field as parameters, reading the hash value (the data content corresponding to the data identifier ID) from Redis through the HGET command of the Redis hash type.

Example Embodiment

[0089] Example 2
[0090] Example 2 is a preferred example of Example 1.
[0091] The present invention stores mass data in Redis in a way that saves memory, improves access speed, and allows each category to be managed independently.
[0092] The present invention provides a Redis-based mass data classified storage method, including:
[0093] Step 1: Define the user data category ID according to the business category;
[0094] Step 2: Define the number of user data buckets according to the data scale corresponding to the data category ID;
[0095] Step 3: Take the string-type user data identifier, the data category ID, and the number of data buckets as inputs, and use hash algorithm A to calculate the integer-type hash result A;
[0096] Step 4: Use the integer-type hash result A as the hash key of the Redis hash data type;
[0097] Step 5: With the string-type user data identifier as the only input, use hash algorithm B to calculate the integer-type hash result B, and use hash result B as the field under the hash key from step 4;
[0098] Step 6: Use the data content corresponding to the user data identifier as the value of the field;
[0099] Step 7: Store the hash key, field, and value in Redis as hash-type data;
[0100] Step 8: After the above steps, the hash keys of data with the same data category ID are kept independently within a single range of integer values. In addition, because the string-type user data identifier is converted into an integer-type hash key and field, the space occupied in Redis is greatly reduced thanks to the ziplist characteristic.
[0101] The role of hash algorithm A is to spread different user data across different data buckets as far as possible; the role of hash algorithm B is to keep the fields of different user data in the same bucket distinct as far as possible, since otherwise they would collide.
[0102] To read a hash value, the hash key and field are calculated in the same way and passed as parameters to the HGET command of the Redis hash type, which reads the value from Redis.
[0103] The present invention stores data by classification. For different categories of data, the range of hash keys can be partitioned by category ID, so that data can be operated on in batches efficiently and with little memory. For example, if category 1 holds 500 million records, the number of hash buckets can be defined as 1,000,000; since 1,000,000 * 512 > 500 million, this is sufficient to store them. The range of hash keys can then be confined to 1000001-1999999. When a business change means the data is no longer needed, or all of it is to be expired, the corresponding hash keys can be cleaned up very quickly.
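The arithmetic in this example can be checked with a short sketch. It assumes the key formula from Example 1 (hash key = category ID * N + ID', with ID' in [0, N-1]); the function name `key_range` is hypothetical.

```python
def key_range(category_id: int, n_buckets: int) -> range:
    """All hash keys a category can occupy under key = category_id * N + ID'."""
    start = category_id * n_buckets
    return range(start, start + n_buckets)


records = 500_000_000
n_buckets = 1_000_000

# Minimum bucket count for 512 fields per bucket, rounded up.
minimum = (records + 511) // 512
print(minimum, n_buckets >= minimum, n_buckets * 512 > records)

keys = key_range(1, n_buckets)
print(keys[0], keys[-1])
```

Under this assumption the minimum bucket count is 976,563, so rounding up to 1,000,000 is sufficient, and the computed key interval is 1,000,000 to 1,999,999, which contains the range quoted in the text. A cleanup job could iterate over that interval and delete each key. One caveat the text does not address: the ranges of different categories are guaranteed to be disjoint only when the categories share the same N, since with differing bucket counts the products category ID * N can overlap.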
[0104] The present invention converts key-value pair data into hash-type data in which the converted hash key and field are both integers, and keeps the number of fields in each hash bucket at no more than 512, so that the data is stored in the ziplist structure. This greatly saves space, because integer-type data occupies far less space than the same data stored as strings. In testing, when the data volume exceeds 1000 million, more than 80% of the space is saved.
