
Big data non-structured file dynamic desensitization method and system

An unstructured data desensitization technology, applied in the field of data security, that addresses problems such as the lack of support for dynamic desensitization rule settings and the inability of rule changes to be reflected in desensitized data in real time. Its effects include a lower workload, a low technical threshold, and reduced development costs.

Active Publication Date: 2017-11-03
北京明朝万达科技股份有限公司

AI Technical Summary

Problems solved by technology

[0006] (1) Once desensitization is complete, the desensitized data is fixed. If different desensitization rules need to be applied, the desensitization process must be run again.
[0007] (2) An additional repository is required to store the desensitized data.
[0008] (3) Dynamic desensitization rule settings are not supported, so changes to desensitization rules cannot be reflected in the desensitized data in real time.



Examples


Embodiment Construction

[0041]

[0042] Hadoop: A software framework for distributed processing of large amounts of data, comprising HDFS, YARN, and MapReduce. HDFS is a distributed file system, YARN is a distributed resource scheduling system, and MapReduce is a distributed programming framework consisting of two atomic operations, Map and Reduce.
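
The two atomic operations named above can be illustrated with a minimal single-process sketch. This is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and the word-count task is an assumed example, not something the patent describes.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values that share a key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run Map over every line, group by key, then Reduce each group."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the Map and Reduce calls would run on different machines; here they run in one process so that the data flow is easy to follow.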

[0043] HDFS: It adopts a master-slave architecture and consists of two parts, the NameNode and the DataNode. The NameNode is the master node and stores file metadata; there may be one or more NameNodes. A DataNode is a slave node that stores the actual file blocks; there may be thousands of DataNodes.
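
The split between metadata and file blocks described above can be modelled with a toy in-memory sketch. The class and method names (`NameNode`, `DataNode`, `add_file`, `get_block_locations`, `read_file`), the round-robin placement, and the 4-byte block size are all assumptions made for illustration; they do not correspond to the real HDFS client API or to its default block size.

```python
class DataNode:
    """Slave node: stores the actual file blocks."""

    def __init__(self, address):
        self.address = address
        self.blocks = {}  # block_id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Master node: stores only metadata (which block lives on which DataNode)."""

    def __init__(self):
        self.metadata = {}  # path -> list of (block_id, datanode_address)

    def add_file(self, path, data, datanodes, block_size=4):
        # Split the file into blocks and place them round-robin (an assumed policy).
        locations = []
        for offset in range(0, len(data), block_size):
            index = offset // block_size
            node = datanodes[index % len(datanodes)]
            block_id = f"{path}#{index}"
            node.write_block(block_id, data[offset:offset + block_size])
            locations.append((block_id, node.address))
        self.metadata[path] = locations

    def get_block_locations(self, path):
        return list(self.metadata[path])


def read_file(namenode, datanodes_by_address, path):
    """Client read: ask the NameNode for locations, then fetch each block from its DataNode."""
    return b"".join(
        datanodes_by_address[address].read_block(block_id)
        for block_id, address in namenode.get_block_locations(path)
    )
```

The point of the sketch is that a read is a two-step lookup: the client first learns DataNode addresses from the metadata, then fetches blocks from those addresses.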

[0044]

[0045] As shown in Figure 2, the system of the present invention includes a client, a central scheduler, a desensitization system, and Hadoop, wherein Hadoop includes HDFS. The client sends a data read instruction to the central scheduler and, according to the DataNode address returned from the central scheduler, obtains data from the host whe...



Abstract

The invention discloses a dynamic desensitization method and system for unstructured big data files. The system comprises a client, a central scheduler, a desensitization system, and HDFS. The client sends a data read instruction to the central scheduler and, according to the DataNode address returned by the central scheduler, obtains data either from the host where the desensitization system is located or from HDFS. The central scheduler parses the data read instruction and determines whether desensitization is required. If it is, the central scheduler replaces the DataNode address of the requested file data block with the address of the host where the desensitization system is located, returns the modified address to the client, and sends the unmodified DataNode address to the desensitization system. The desensitization system desensitizes the data obtained from the DataNode address and returns the desensitized data to the client. The scheme is simple to implement, reduces cost, deploys transparently, and does not affect existing applications.
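
The address-rewriting flow in the abstract can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: the class names, `DESENSITIZER_ADDRESS`, the use of a set of sensitive paths to decide whether desensitization is needed, and the regex-based rules are all hypothetical, since the source does not specify how the decision is made or what form the rules take.

```python
import re
from dataclasses import dataclass, field

DESENSITIZER_ADDRESS = "desensitizer-host:9000"  # hypothetical host address


@dataclass
class DesensitizationSystem:
    datanodes: dict  # address -> {block_id: text}; stands in for real DataNodes
    rules: list      # (compiled regex, replacement) pairs; may be changed at any time
    pending: dict = field(default_factory=dict)  # block_id -> real DataNode address

    def register(self, block_id, real_address):
        """Receive the unmodified DataNode address from the central scheduler."""
        self.pending[block_id] = real_address

    def read_block(self, block_id):
        """Fetch the raw block, apply the current rules, return desensitized text."""
        text = self.datanodes[self.pending[block_id]][block_id]
        for pattern, replacement in self.rules:
            text = pattern.sub(replacement, text)
        return text


@dataclass
class CentralScheduler:
    block_locations: dict  # path -> list of (block_id, DataNode address)
    sensitive_paths: set   # assumed decision criterion, not from the source
    desensitizer: DesensitizationSystem

    def handle_read(self, path):
        locations = self.block_locations[path]
        if path not in self.sensitive_paths:
            return list(locations)  # no desensitization: real addresses go to the client
        redirected = []
        for block_id, real_address in locations:
            self.desensitizer.register(block_id, real_address)
            redirected.append((block_id, DESENSITIZER_ADDRESS))
        return redirected


def client_read(scheduler, datanodes, path):
    """The client simply follows whatever addresses the scheduler returns."""
    parts = []
    for block_id, address in scheduler.handle_read(path):
        if address == DESENSITIZER_ADDRESS:
            parts.append(scheduler.desensitizer.read_block(block_id))
        else:
            parts.append(datanodes[address][block_id])
    return "".join(parts)
```

Because the rules are applied at read time and nothing desensitized is stored, replacing `rules` on the desensitization system changes the result of the next read immediately, and the client code is the same whether or not a file is redirected.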

Description

Technical field

[0001] The invention relates to the field of data security, in particular to a method and system for dynamic desensitization of unstructured big data files.

Background

[0002] In the Hadoop ecosystem, HDFS is used to store unstructured data. While big data processing systems mine value from massive data, some archived data, such as data dictionaries and population information, serves as an important basis for that mining. Once stored, such data is rarely changed, and HDFS files are a common way to store it.

[0003] Permission-based access control gives only a yes-or-no authorization decision on a user's access to data resources, which makes it difficult to meet users' diverse data usage needs. When users who access certain types of sensitive data need only part of that data and do not need data with a higher security level, the permission control model alone cannot meet this dem...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62
CPC: G06F21/6218, G06F2221/2141
Inventor: 李学进, 王志海, 喻波, 魏力
Owner: 北京明朝万达科技股份有限公司