A protection method and system for HDFS access mode

An access-pattern and file technology, applied in digital data protection, instruments, computing, etc. It addresses the problem that leaked access patterns reduce the cost of an attack: with this method, attackers cannot distinguish whether an access is real or forged and cannot further infer the content and importance of user data, which enhances the security of HDFS.

Active Publication Date: 2021-06-01
PEKING UNIV

AI Technical Summary

Problems solved by technology

ORAM (Oblivious RAM) hides the data access pattern by obfuscating each access, so that an attacker cannot tell whether an access is real or fake. The attacker therefore cannot obtain private information such as where user data is stored, how often it is accessed, or in what order, and cannot go on to infer the content or importance of the data.
[0004] The shortcomings and limitations of existing methods are as follows. HDFS, a distributed storage system widely used in industry and academia, cannot resist all types of attack through data encryption alone; attackers can still infer private information from user access patterns. No existing research provides a protection scheme for HDFS user access patterns, which creates a serious security risk for the many companies and individual users who use HDFS for distributed storage.
The number of nodes in a cluster can be very large. An attacker who does not know the user's access frequency would need to attack tens of thousands of nodes, which is unrealistic; leaked access frequency therefore greatly reduces the cost of an attack.
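The point about access frequency can be illustrated with a minimal sketch. It assumes a hypothetical observer who has recorded which data node each request touched; the class name, method name, and node labels are illustrative and do not come from the patent.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: how observed access counts narrow an attacker's targets. */
class FrequencyLeakSketch {
    /** Returns the node that appears most often in the observed access log (first to reach the maximum wins ties). */
    static String hottestNode(List<String> observedAccesses) {
        Map<String, Integer> counts = new HashMap<>();
        String hottest = null;
        for (String node : observedAccesses) {
            int count = counts.merge(node, 1, Integer::sum); // increment this node's counter
            if (hottest == null || count > counts.get(hottest)) {
                hottest = node;
            }
        }
        return hottest;
    }
}
```

With such a count in hand, an attacker can concentrate on the few nodes that are accessed most often instead of the whole cluster, which is the leak the scheme aims to prevent.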



Examples


Embodiment Construction

[0024] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below through specific embodiments and accompanying drawings.

[0025] The implementation of the present invention is based on the Hadoop 2.8.4 source code, which is modified according to the functional modules and the file read/write process. The main modified Java files are CommandWithDestination.java, IOUtils.java and PathData.java; the newly added Java files are FileBuffer.java, TreeNode.java and TreeORAM.java. The total code size is 4686 lines. The specific operation process is described below.

[0026] 1. Initialization

[0027] After the HDFS cluster is started, an initialization operation is required to transfer several dummy files to the data nodes. A dummy file is an invalid file; outside the client, a dummy file cannot be distinguished from a real file. The role of initi...
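The idea of seeding the data nodes with dummy files can be sketched as follows. This is a toy in-memory model, not the patent's Hadoop code: the class `DummyInitSketch`, the fixed `FILE_SIZE`, the random UUID names, and the map standing in for the data nodes are all assumptions made for illustration.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Toy model of the initialization step; all names and sizes are hypothetical. */
class DummyInitSketch {
    static final int FILE_SIZE = 64; // assumed: every stored file has the same size

    final Map<String, byte[]> dataNode = new LinkedHashMap<>(); // stands in for the data nodes
    final Set<String> dummyNames = new HashSet<>();             // known only to the client
    final SecureRandom random = new SecureRandom();

    /** Uploads `count` dummy files filled with random bytes under random names. */
    void initialize(int count) {
        for (int i = 0; i < count; i++) {
            byte[] content = new byte[FILE_SIZE];
            random.nextBytes(content);                  // random filler, no real data
            String name = UUID.randomUUID().toString(); // name reveals nothing about the file
            dataNode.put(name, content);
            dummyNames.add(name);                       // only the client records which files are dummies
        }
    }

    /** Client-side check; an observer of dataNode alone has no such record. */
    boolean isDummy(String name) {
        return dummyNames.contains(name);
    }
}
```

In this sketch the only thing that marks a file as a dummy is the client-side set, which mirrors the statement that the difference cannot be seen outside the client. A real deployment would also need real files to be encrypted and padded so that they look the same on the data node.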


Abstract

The invention relates to a protection method and system for HDFS access patterns. The method decomposes each read or write operation on the data nodes of an HDFS cluster into two atomic operations, a read followed by a write, to hide the operation type. It adds obfuscated data blocks to a file before writing it to a data node, to hide the number of blocks in the file. After each file is read, the file is deleted from the data node, and a file selected at random from the client's file buffer is written back to a data node, to hide which data node stores the file. The constant change of file storage locations hides the frequency and order of file accesses. The invention provides the design and implementation of an ORAM-based HDFS access pattern protection scheme, which fills a gap in HDFS access pattern protection and enhances the security of HDFS while keeping the performance overhead within an acceptable range.
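The steps in the abstract can be sketched as a toy in-memory model. This is not the patent's implementation and uses no Hadoop APIs: the class `AccessPatternSketch`, the slot numbering, the fixed `PADDED_SIZE`, the length-header padding format, and the cover read of another stored file are all assumptions chosen to keep the example small. Encryption, which a real system would need so that the padding is not visible, is omitted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the read-then-write scheme; all names are hypothetical. */
class AccessPatternSketch {
    static final int PADDED_SIZE = 64; // assumed on-node size: 4-byte length + data + random filler

    final Map<Integer, byte[]> dataNode = new HashMap<>();  // slot -> padded bytes (observer's view)
    final Map<String, Integer> location = new HashMap<>();  // client-only: file name -> slot
    final Map<String, byte[]> fileBuffer = new HashMap<>(); // client-only: file name -> contents
    final List<String> trace = new ArrayList<>();           // operations visible at the data node
    final SecureRandom random = new SecureRandom();
    int nextSlot = 0;

    /** Reads `name` if newData is null, otherwise writes newData; returns the contents or null. */
    byte[] access(String name, byte[] newData) {
        if (newData != null && newData.length > PADDED_SIZE - 4) {
            throw new IllegalArgumentException("file too large for this toy padding size");
        }
        // Phase 1 (read): fetch the target, or another stored file as a cover read, and delete it.
        String toFetch = location.containsKey(name) ? name : randomKey(location);
        if (toFetch != null) {
            int slot = location.remove(toFetch);
            fileBuffer.put(toFetch, unpad(dataNode.remove(slot)));
            trace.add("READ");
        }
        if (newData != null) {
            fileBuffer.put(name, newData.clone());
        }
        byte[] result = fileBuffer.get(name);

        // Phase 2 (write): write a randomly chosen buffered file back under a fresh slot.
        String chosen = randomKey(fileBuffer);
        if (chosen != null) {
            int slot = nextSlot++;
            dataNode.put(slot, pad(fileBuffer.remove(chosen)));
            location.put(chosen, slot);
            trace.add("WRITE");
        }
        return result == null ? null : result.clone();
    }

    private String randomKey(Map<String, ?> map) {
        if (map.isEmpty()) return null;
        List<String> keys = new ArrayList<>(map.keySet());
        return keys.get(random.nextInt(keys.size()));
    }

    /** Pads to a fixed size so the stored length does not reveal the real length. */
    private byte[] pad(byte[] data) {
        byte[] out = new byte[PADDED_SIZE];
        random.nextBytes(out); // filler stands in for the obfuscated blocks
        ByteBuffer.wrap(out).putInt(data.length).put(data);
        return out;
    }

    private byte[] unpad(byte[] stored) {
        ByteBuffer buf = ByteBuffer.wrap(stored);
        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        return data;
    }

    public static void main(String[] args) {
        AccessPatternSketch client = new AccessPatternSketch();
        client.access("a.txt", "alpha".getBytes(StandardCharsets.UTF_8));
        client.access("b.txt", "beta".getBytes(StandardCharsets.UTF_8));
        byte[] read = client.access("a.txt", null);
        System.out.println(new String(read, StandardCharsets.UTF_8));
        System.out.println(client.trace);
    }
}
```

In this model, once at least one file is stored, every call to `access` adds exactly one READ and one WRITE to the trace regardless of whether the caller was reading or writing, and every stored entry has the same size and moves to a fresh slot each time it is written back.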

Description

Technical field

[0001] The invention relates to data protection for the Hadoop Distributed File System (HDFS), in particular to a protection method and system for HDFS access patterns.

Background

[0002] HDFS (Hadoop Distributed File System) is the core distributed file system of Hadoop. Like traditional distributed systems, HDFS is often used to store large files: when a data set exceeds the storage capacity of a single computer, it is split and stored across a large number of inexpensive servers. HDFS is currently widely used in industry and academia. In recent years, requirements for data privacy protection have risen steadily, which poses greater challenges for the data privacy protection capabilities of HDFS. In the current version of HDFS, the security design focuses on data availability and data integrity, and there are relatively few strategies for protecting data confidentiality. For example, a...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F21/62
CPC: G06F21/6227, G06F21/6245
Inventor: 沈晴霓, 秦嘉, 吴鹏飞, 康雨城, 刘忠开
Owner: PEKING UNIV