Scalable chunk store for data deduplication

Data chunk and data stream technology, applied to scalable storage methods and systems, that addresses technical difficulties in scaling data deduplication.

Active Publication Date: 2012-07-04
MICROSOFT TECH LICENSING LLC

AI Technical Summary

Problems solved by technology

Because the amount of digital data is growing at a very high rate, the size of storage devices (e.g., storage disks) and the overall storage capacity associated with computing devices must also increase. Data deduplication techniques have difficulty scaling to handle this increased storage volume.


Embodiment Construction

[0043] I. Introduction

[0044] This specification discloses one or more embodiments that incorporate the features of this invention. The disclosed embodiments are merely illustrative of the invention. The scope of the invention is not limited to the disclosed embodiments. The invention is defined by the appended claims.

[0045] References in the specification to "one embodiment," "an embodiment," "an example embodiment," etc. indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include that feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described with reference to one embodiment, it is assumed that such feature, structure, or characteristic can be implemented with other embodiments within the knowl...


Abstract

The invention relates to a scalable chunk store for data deduplication. Data streams may be stored in a chunk store in the form of stream maps and data chunks. Data chunks corresponding to a data stream may be stored in a chunk container, and a stream map corresponding to the data stream may point to the data chunks in the chunk container. Multiple stream maps may be stored in a stream container, and may point to the data chunks in the chunk container in such a manner that duplicate data chunks are not present. Techniques are provided herein for localizing the storage of related data chunks in such chunk containers, for locating data chunks stored in chunk containers, for storing data streams in chunk stores in localized manners that enhance locality and decrease fragmentation, and for reorganizing stored data streams in chunk stores.
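The relationship between stream maps and a chunk container can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `ChunkContainer` and `ChunkStore`, the fixed 4-byte chunk size, and the use of SHA-256 to detect duplicate chunks are all assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ChunkContainer:
    """Append-only store of unique data chunks (hypothetical layout)."""
    chunks: list = field(default_factory=list)   # chunk bytes, by position
    index: dict = field(default_factory=dict)    # SHA-256 digest -> position

    def add(self, chunk: bytes) -> int:
        """Store the chunk if it is new; return its position either way."""
        digest = hashlib.sha256(chunk).digest()
        pos = self.index.get(digest)
        if pos is None:
            pos = len(self.chunks)
            self.chunks.append(chunk)
            self.index[digest] = pos
        return pos


@dataclass
class ChunkStore:
    """Holds one chunk container plus a stream map per stored data stream."""
    container: ChunkContainer = field(default_factory=ChunkContainer)
    stream_maps: dict = field(default_factory=dict)  # stream name -> positions

    def put_stream(self, name: str, data: bytes, chunk_size: int = 4) -> None:
        # Assumption: fixed-size chunking, chosen only to keep the sketch short.
        self.stream_maps[name] = [
            self.container.add(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)
        ]

    def get_stream(self, name: str) -> bytes:
        # Rebuild the stream by following its stream map into the container.
        return b"".join(self.container.chunks[p] for p in self.stream_maps[name])


store = ChunkStore()
store.put_stream("a", b"AAAABBBBCCCC")
store.put_stream("b", b"BBBBCCCCDDDD")
```

Here the two streams share the chunks `BBBB` and `CCCC`, so the container holds four chunks rather than six, and each stream map simply points at the positions it needs.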

Description

Technical field

[0001] The invention relates to a method for storing data, in particular to a deduplicated and scalable storage method and system for user data.

Background

[0002] Data deduplication, also known as data optimization, is the act of reducing the physical number of bytes of data that need to be stored on disk or transmitted over a network without compromising the fidelity and integrity of the original data. Data deduplication reduces the storage capacity required to store data and can thus result in savings in storage hardware costs and data management costs. Data deduplication provides a solution for handling rapidly growing digitally stored data.

[0003] Data deduplication may be performed according to one or more techniques for eliminating duplication within or between persistent storage files. For example, according to one technique, unique data regions that occur multiple times in one or more files can be identified, and a single copy of thes...
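The reduction in physical bytes described in the background can be made concrete with a small Python sketch. It assumes fixed-size 4-byte regions and SHA-256 digests to recognize repeated regions; the function name `dedup_savings` and the sample file contents are hypothetical and are not taken from the patent.

```python
import hashlib


def dedup_savings(files: dict, chunk_size: int = 4) -> tuple:
    """Return (logical_bytes, physical_bytes) for a set of files.

    logical_bytes  = total size of all files as written
    physical_bytes = size once each distinct region is kept only once
    """
    logical = sum(len(data) for data in files.values())
    unique = {}  # digest -> length of the region it identifies
    for data in files.values():
        for i in range(0, len(data), chunk_size):
            region = data[i:i + chunk_size]
            unique.setdefault(hashlib.sha256(region).digest(), len(region))
    physical = sum(unique.values())
    return logical, physical


# Two hypothetical files, the second being the first plus one extra region.
files = {"report_v1": b"HEADbodyTAIL", "report_v2": b"HEADbodyTAILmore"}
logical, physical = dedup_savings(files)
```

In this example the two files total 28 logical bytes, but only four distinct regions (16 bytes) would need to be kept, since three regions occur in both files.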


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F12/06, G06F17/30
CPC: G06F17/30082, G06F16/1752, G06F16/122
Inventor: 張震河, P·A·奥尔泰安, R·卡拉赫, A·古普塔, J·R·本顿, R·德塞
Owner MICROSOFT TECH LICENSING LLC