Data storage method and system

A data storage method and system, applied in the field of data processing, that addresses the problems of increased investment in storage hardware, increased data replication time overhead, and wasted physical storage space, so as to improve storage space utilization, enhance security, and broaden the scope of application.

Active Publication Date: 2011-04-27
INSPUR SUZHOU INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] Because of a series of problems surrounding the standardization of data archiving, physical storage devices hold a large amount of duplicate data, which wastes a great deal of physical storage space. Many companies have to invest more in storage hardware and incur considerable unnecessary overhead.
[0004] The current mainstream remote data replication methods are full replication, incremental replication, and differential replication. None of these three strategies fundamentally solves the performance and efficiency problems of data replication: whichever is used, the replicated data contains a great deal of redundancy, and much of the data is copied again and again. This increases the time overhead of replication and also introduces security risks during the replication process.


Examples

Embodiment 1

[0034] Embodiment 1 is a data storage method, as shown in Figure 1, comprising:

[0035] dividing each stored file into data segments of a predetermined size;

[0036] generating, for each divided data segment, identification information that uniquely corresponds to that data segment, the identification information being used to carry attribute information of the corresponding data segment;

[0037] comparing the contents of the data segments to find duplicate data;

[0038] treating two or more copies of data with the same content as a group; for each group of duplicate data, retaining one copy and saving the physical storage location of that copy as the redundant data watermark of the other copies in the group; and, if a data segment contains duplicate data, replacing the duplicate data in that data segment with its redundant data watermark.
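The steps in [0035] to [0038] can be illustrated with a minimal Python sketch. It is not the patent's implementation and rests on several assumptions: duplicates are detected only at whole-segment granularity, a list index stands in for a physical storage location, and the names `Segment`, `split_file`, `deduplicate`, `restore`, and `SEGMENT_SIZE` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SEGMENT_SIZE = 4  # hypothetical predetermined size, kept tiny for the demo


@dataclass
class Segment:
    segment_id: str                  # identification information (hypothetical format)
    offset: int                      # attribute carried alongside the identifier
    data: Optional[bytes]            # payload; None once replaced by a watermark
    watermark: Optional[int] = None  # location of the retained copy, if a duplicate


def split_file(name: str, content: bytes, size: int = SEGMENT_SIZE) -> list:
    """Divide a file into fixed-size segments, each with a unique ID."""
    return [
        Segment(f"{name}@{off}", off, content[off:off + size])
        for off in range(0, len(content), size)
    ]


def deduplicate(segments: list) -> list:
    """Retain one copy per group of identical segments; watermark the others.

    Returns the simulated physical store, where a list index stands in
    for a physical storage location (an assumption of this sketch).
    """
    store = []
    location_of = {}  # segment content -> location of the retained copy
    for seg in segments:
        if seg.data in location_of:
            # Duplicate: replace the payload with the redundant data watermark.
            seg.watermark = location_of[seg.data]
            seg.data = None
        else:
            # First occurrence: this copy is the one that is retained.
            location_of[seg.data] = len(store)
            store.append(seg.data)
    return store


def restore(segments: list, store: list) -> bytes:
    """Rebuild the original bytes by following watermarks into the store."""
    return b"".join(
        seg.data if seg.watermark is None else store[seg.watermark]
        for seg in segments
    )


original = b"ABCDABCDWXYZABCD"
segments = split_file("demo.bin", original)
store = deduplicate(segments)
print(len(segments), len(store))             # 4 segments, 2 stored copies
print(restore(segments, store) == original)  # True
```

Keying the lookup table on the segment bytes compares contents directly, as [0037] describes. A production system would more likely compare digests first, but that would be a further assumption beyond the text.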

[0039] In this embodiment, the step of dividing each stored file into data segments of a predetermined size may be performed once when ...

Embodiment 2

[0052] Embodiment 2 is a data storage system, as shown in Figure 2, comprising:

[0053] a segmentation module, configured to divide each stored file into data segments of a predetermined size;

[0054] an index module, configured to generate, for each divided data segment, identification information that uniquely corresponds to that data segment, the identification information being used to carry attribute information of the corresponding data segment;

[0055] a comparison module, configured to compare the contents of the data segments and find duplicate data;

[0056] a processing module, configured to treat two or more copies of data with the same content as a group; for each group of duplicate data, to retain one copy and save the physical storage location of that copy as the redundant data watermark of the other copies in the group; and, if a data segment contains duplicate data, to replace the duplicate data in that data segment with its redundant data watermark.
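One way to picture how the four modules in [0053] to [0056] fit together is the following Python sketch. It is only an illustration under stated assumptions: the class and method names are hypothetical, duplicates are matched at whole-segment granularity, and the ID of the retained segment stands in for its physical storage location.

```python
class SegmentationModule:
    """Divides file content into segments of a predetermined size."""

    def __init__(self, size: int):
        self.size = size

    def split(self, content: bytes) -> list:
        return [content[i:i + self.size] for i in range(0, len(content), self.size)]


class IndexModule:
    """Generates identification information carrying segment attributes."""

    def identify(self, file_name: str, seq: int, segment: bytes) -> dict:
        # The ID format and the attribute set are assumptions of this sketch.
        return {"id": f"{file_name}#{seq}", "file": file_name,
                "seq": seq, "length": len(segment)}


class ComparisonModule:
    """Compares segment contents and groups identical ones."""

    def group_duplicates(self, segments: dict) -> list:
        groups = {}
        for seg_id, data in segments.items():
            groups.setdefault(data, []).append(seg_id)
        # Only groups with two or more copies count as duplicate data.
        return [ids for ids in groups.values() if len(ids) > 1]


class ProcessingModule:
    """Retains one copy per group and watermarks the remaining copies."""

    def apply(self, segments: dict, groups: list) -> dict:
        watermarks = {}
        for ids in groups:
            keeper, *others = ids
            for other in others:
                watermarks[other] = keeper  # stand-in for a physical location
                segments[other] = None      # duplicate payload is dropped
        return watermarks


files = {"a.txt": b"helloworld", "b.txt": b"worldhello"}
segmenter, indexer = SegmentationModule(5), IndexModule()

index, segments = {}, {}
for name, content in files.items():
    for seq, seg in enumerate(segmenter.split(content)):
        info = indexer.identify(name, seq, seg)
        index[info["id"]] = info
        segments[info["id"]] = seg

groups = ComparisonModule().group_duplicates(segments)
watermarks = ProcessingModule().apply(segments, groups)
print(watermarks)  # {'b.txt#1': 'a.txt#0', 'b.txt#0': 'a.txt#1'}
```

In this example the two files share both of their segments, so every segment of `b.txt` is replaced by a watermark pointing at the retained copy in `a.txt`.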

[0057...

Abstract

The invention provides a data storage method and system. The method comprises the following steps: dividing each stored file into data segments of a preset size; generating, for each divided data segment, identification information that uniquely corresponds to that data segment, the identification information being used to carry attribute information of the corresponding data segment; comparing the contents of the data segments to find duplicate data; treating two or more copies of data with the same content as a group; for each group of duplicate data, retaining one copy and storing the physical storage location of that copy as the redundant data watermark of the other copies in the group; and, if a data segment contains duplicate data, replacing the duplicate data in that data segment with its redundant data watermark. The method saves space on the physical storage medium and thereby improves the efficiency and security of remote data replication.

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a data storage method and system.

Background

[0002] With the acceleration of digitization and the explosive growth of data volume and access volume, the replication and backup of data as a means of data protection face a severe test.

[0003] Because of a series of problems surrounding the standardization of data archiving, physical storage devices hold a large amount of duplicate data, which wastes a great deal of physical storage space. Many companies have to invest more in storage hardware and incur considerable unnecessary overhead.

[0004] The current mainstream remote data replication methods are full replication, incremental replication, and differential replication. None of these three strategies can fundamentally solve the performance and efficiency problems of data replication, because whether it i...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 文中领, 张雷, 张宇
Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD