Data storage method and device, server and storage medium

A data storage technology, applied in the field of data processing, that addresses the large storage space occupied by mass data storage and the low efficiency of data processing. It improves data processing efficiency and avoids excessive storage space occupation.

Pending Publication Date: 2020-11-17
RUN TECH CO LTD BEIJING

AI Technical Summary

Problems solved by technology

[0005] The present invention provides a data storage method, device, server and storage medium that address the large storage space occupied by mass data storage and the low data processing efficiency, so as to achieve fast data storage, reduce the requirements on the hardware environment, and further improve data processing efficiency.

Examples

Embodiment 1

[0028] Figure 1 is a schematic flow diagram of a data storage method provided by this embodiment of the present invention. This embodiment is applicable to the deduplication, merging and storage of massive structured data. The method can be executed by the management system of the storage chip, and the system can be implemented in software and/or hardware.

[0029] Before the technical solution of this embodiment is introduced, an application scenario is briefly described. For example, the method can be applied to offline and/or online tasks, that is, to offline or online data deduplication and merging.

[0030] As shown in Figure 1, the method specifically includes the following steps:

[0031] S110. Acquire data to be stored, and determine a current data identifier and a target storage identifier corresponding to the data to be stored.

[0032] Wherein, the data to be stored may be massive structured data requiring structured dedupl...

Embodiment 2

[0047] Figure 2 is a flowchart of a data storage method provided by Embodiment 2 of the present invention. This embodiment is a further optimization of Embodiment 1. As shown in Figure 2, the method specifically includes:

[0048] S201. For each piece of data to be stored, determine key information corresponding to at least one deduplication field.

[0049] The volume of data to be stored is generally on the order of hundreds of billions of records, and it may occupy tens or even hundreds of terabytes. Therefore, each piece of data to be stored in each data set needs to be deduplicated, merged and stored.

[0050] In this embodiment, the data may first be processed according to a deduplication policy, and the key information in at least one deduplication field may be extracted. The deduplication policy determines which fields need to be deduplicated, and the s...
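
One possible reading of step S201 is sketched below. The deduplication policy is modeled as a list of field names, the key information is the tuple of values found in those fields, and a data identifier is derived by hashing that tuple. The `DedupPolicy` class, the sample field names, and the use of SHA-256 are illustrative assumptions; the text above does not state how the key information is encoded.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupPolicy:
    """Hypothetical policy: the names of the fields to deduplicate on."""
    fields: tuple


def key_info(record: dict, policy: DedupPolicy) -> tuple:
    """Extract the value of each deduplication field from one record."""
    return tuple(str(record.get(name, "")) for name in policy.fields)


def data_identifier(record: dict, policy: DedupPolicy) -> str:
    """Assumed encoding: hash the key information into a fixed-length identifier."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(key_info(record, policy))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


policy = DedupPolicy(fields=("user_id", "event"))
a = {"user_id": 7, "event": "login", "ts": 1}
b = {"user_id": 7, "event": "login", "ts": 2}  # differs only in a non-dedup field
c = {"user_id": 8, "event": "login", "ts": 1}
```

Under these assumptions, records `a` and `b` receive the same identifier because they agree on every deduplication field, while `c` receives a different one.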

Embodiment 3

[0086] Figure 3 is a structural block diagram of a data storage device provided by Embodiment 3 of the present invention. The device is used to execute the data storage method provided by any of the above embodiments, and has the corresponding functional modules and beneficial effects for executing the method. The device includes an identifier determination module 310, a storage block number determination module 320, an identifier acquisition module 330 and a data storage module 340.

[0087] The identifier determination module 310 is used to obtain the data to be stored, and to determine the current data identifier and the target storage identifier corresponding to the data to be stored; the storage block number determination module 320 is used to determine, based on the target storage identifier, the target storage block number corresponding to the data to be stored; the identifier acquisition module 330 is configured to obtain t...

Abstract

The invention discloses a data storage method and device, a server and a storage medium. The method comprises the steps of obtaining to-be-stored data, and determining a current data identifier and a target storage identifier corresponding to the to-be-stored data; determining a target storage block number corresponding to the to-be-stored data based on the target storage identifier; acquiring a historical data identifier corresponding to each piece of historical storage data stored in a target storage block corresponding to the target storage block number and the current data identifier; and when a preset condition is satisfied between the historical data identifier and the current data identifier, storing the to-be-stored data into the target storage block. According to the embodiment of the invention, storage of massive structured data can be realized, and rapid deduplication and merging of the massive structured data are realized.
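
The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the block number is assumed to be the target storage identifier modulo a fixed block count, the "preset condition" is assumed to be that the current identifier matches no historical identifier in the target block, and `BlockStore`, `num_blocks` and `store` are hypothetical names.

```python
from collections import defaultdict


class BlockStore:
    """Toy in-memory stand-in for a set of storage blocks."""

    def __init__(self, num_blocks: int = 4):
        self.num_blocks = num_blocks
        # block number -> {data identifier: stored record}
        self.blocks = defaultdict(dict)

    def block_number(self, target_storage_id: int) -> int:
        # Assumption: map the target storage identifier to a block by modulo.
        return target_storage_id % self.num_blocks

    def store(self, record: dict, current_id: str, target_storage_id: int) -> bool:
        """Store the record unless its identifier already exists in the target block."""
        block = self.blocks[self.block_number(target_storage_id)]
        # The historical identifiers are the keys of the target block only.
        if current_id in block:
            return False  # assumed preset condition not met: duplicate
        block[current_id] = record
        return True


store = BlockStore(num_blocks=4)
first = store.store({"v": 1}, current_id="abc", target_storage_id=10)
dup = store.store({"v": 2}, current_id="abc", target_storage_id=10)
other = store.store({"v": 3}, current_id="xyz", target_storage_id=11)
```

In this sketch only the identifiers held in the target block are consulted, so a duplicate check does not scan the whole data set.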

Description

Technical field

[0001] Embodiments of the present invention relate to data processing technologies, and in particular to a data storage method, device, server, and storage medium.

Background

[0002] With the widespread adoption of Internet applications, the storage of massive data has become an indispensable part of system design.

[0003] At present, mass data is stored by converting object data into strings and storing them in HDFS (Hadoop Distributed File System) in a specific file format. To make the data easier to find when tasks are later run in batches, a directory is created per business and date when the data is stored. Next, MapReduce (simplified data processing on large clusters) offline tasks are run: data is read according to business requirements, all of the massive data is loaded into memory, and operations such as merging and counting are completed in memory according to the merging dimension. Finally, output full am...
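
For contrast, the in-memory merge-and-count step described in the background can be sketched as below. This is a single-process stand-in rather than an actual MapReduce job, and `merge_and_count` and the sample records are hypothetical.

```python
from collections import Counter


def merge_and_count(records, merge_dimension):
    """Baseline: keep every distinct merge key in memory and count its occurrences."""
    counts = Counter()
    for record in records:
        counts[record[merge_dimension]] += 1
    return counts


records = [{"user": "a"}, {"user": "b"}, {"user": "a"}]
counts = merge_and_count(records, "user")
```

The counter holds one entry per distinct merge key, so its memory use grows with the data set, which suggests why this approach places heavy demands on the hardware environment at massive scale.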

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458, G06F16/27, G06F16/182
CPC: G06F16/182, G06F16/2471, G06F16/27
Inventor: 任丽超, 谢永恒, 程强
Owner: RUN TECH CO LTD BEIJING