Big data desensitization method and system

A data desensitization technology applied in the field of information security. It addresses the complexity and limited flexibility of desensitization algorithms, the lack of classification and fine-grained management of sensitive data, and the lack of mature methods and processes.

Pending Publication Date: 2019-12-03
北京方盈智能数字科技有限公司

AI Technical Summary

Problems solved by technology

Existing desensitization methods have been widely used with considerable success, but they have several problems: 1) data desensitization technology for the Hadoop big data platform and for databases such as Hive and HBase is still at an exploratory stage, with no mature methods or processes; 2) the management of data masking strategies and the quality of masking are uneven, and some approaches do not fully consider the complexity, risk, and flexibility of masking algorithms; 3) existing data desensitization considers only the identification of sensitive data and lacks classification and finer-grained management of sensitive data.


Examples


Embodiment 1

[0040] The present invention provides a big data desensitization method comprising the following steps.

[0041] As shown in Figure 1, the flow of the big data desensitization method is as follows:

[0042] S1: Sensitive data identification

[0043] S10: Sensitive data scanning. In this embodiment, a full data scan is performed on the Hadoop big data platform to discover HDFS, Hive, and HBase data sources and to collect data samples and metadata.

[0044] S11: Data format analysis. Descriptions such as HDFS file names and Hive and HBase table names and columns are obtained, the format and content of the source data are analyzed, and sensitive information is identified through data feature matching.
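The feature-matching idea in S11 can be sketched as follows. This is a minimal illustration, not the method claimed in the patent: the pattern table `FEATURE_PATTERNS`, the category names, the simplified regular expressions, and the 0.8 match threshold are all assumptions introduced here. The input stands in for column samples collected in S10.

```python
import re
from typing import Dict, List, Optional

# Hypothetical feature library: category name -> pattern.
# The patterns are simplified for illustration and are not taken from the patent.
FEATURE_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "cn_mobile": re.compile(r"1[3-9]\d{9}"),
    "cn_id_card": re.compile(r"\d{17}[\dXx]"),
}


def classify_column(samples: List[str], threshold: float = 0.8) -> Optional[str]:
    """Return the first category whose pattern fully matches at least
    `threshold` of the non-empty samples, or None if no category qualifies."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return None
    for category, pattern in FEATURE_PATTERNS.items():
        hits = sum(1 for v in values if pattern.fullmatch(v))
        if hits / len(values) >= threshold:
            return category
    return None


def scan_table(columns: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each column name to its detected sensitive category.
    Columns with no detected category are omitted from the result."""
    detected: Dict[str, str] = {}
    for name, samples in columns.items():
        category = classify_column(samples)
        if category is not None:
            detected[name] = category
    return detected


if __name__ == "__main__":
    sample_columns = {
        "contact": ["a@example.com", "b@example.org"],
        "phone": ["13812345678", "13900001111"],
        "note": ["hello", "world"],
    }
    print(scan_table(sample_columns))
```

Matching against a sample rather than the full column keeps the scan cheap; the threshold tolerates a few malformed values in an otherwise sensitive column.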

[0045] Note that the detection method for sensitive features differs across data sources and data types. In this embodiment, a sensitive information database can be established for different source data cl...

Embodiment 2

[0095] The present invention provides a big data desensitization system, specifically as follows.

[0096] The big data desensitization system comprises two subsystems: sensitive data identification and data desensitization.

[0097] As shown in Figure 4, the functional modules of the big data desensitization system are as follows:

[0098] S5: Sensitive Data Identification Subsystem

[0099] The sensitive data identification subsystem includes a data source scanning module, a sensitive data identification and grading module, and a sensitive feature database management module.

[0100] S50: The data source scanning module is configured to perform a full scan of the data on the Hadoop platform and to obtain Hive and HBase source data.

[0101] S51: The sensitive data identification and grading module is configured to discover sensitive data, classify and grade it, and automatically detect sensitive information fields in the database through the conf...

Abstract

The invention provides a big data desensitization method and system. In the method, the source data of a big data platform is first scanned and discovered, analyzed, and graded and classified by sensitivity. A data desensitization module then monitors access requests to the big data platform and, using the Hive hook mechanism and the HBase coprocessor mechanism, desensitizes the requested result set according to the user role, the requested data content, and preset desensitization algorithm rules, and finally returns the desensitized data to the big data platform. On this basis, desensitization strategies are formulated, desensitization algorithms and parameters are set, and strategies are managed and issued. The result is a complete solution, from sensitive data identification to dynamic data desensitization, for the big data environment.
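The role-based dynamic desensitization of a result set can be illustrated with a small platform-independent sketch. It does not use the Hive hook or HBase coprocessor mechanisms named in the abstract, which are Java extension points. The `POLICY` table, the role names, the column names, and the masking functions are hypothetical, and masking every column for an unknown role is a design assumption made here, not something stated in the source.

```python
from typing import Any, Callable, Dict, List


def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Replace the middle of a string with '*', keeping a head and a tail."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def mask_all(value: str) -> str:
    """Replace every character with '*'."""
    return "*" * len(value)


# Hypothetical policy: role -> column -> masking function.
# Columns not listed for a role are returned unchanged.
POLICY: Dict[str, Dict[str, Callable[[str], str]]] = {
    "admin": {},
    "analyst": {"phone": mask_middle},
    "guest": {"phone": mask_all, "name": mask_all},
}


def desensitize(rows: List[Dict[str, Any]], role: str) -> List[Dict[str, Any]]:
    """Apply the masking rules for `role` to every row of a result set.
    An unknown role has every column fully masked (assumed fail-closed default)."""
    rules = POLICY.get(role)
    if rules is None:
        return [{col: mask_all(str(val)) for col, val in row.items()} for row in rows]
    return [
        {col: rules[col](str(val)) if col in rules else val for col, val in row.items()}
        for row in rows
    ]


if __name__ == "__main__":
    result_set = [{"name": "Alice", "phone": "13812345678"}]
    for role in ("admin", "analyst", "guest"):
        print(role, desensitize(result_set, role))
```

Because masking is applied when the result set is returned, the stored data stays unchanged and the same query can yield different views for different roles.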

Description

Technical field

[0001] The invention relates to the field of information security, in particular to a big data desensitization method and system.

Background technique

[0002] A large amount of sensitive information may be stored in the data accessed by the big data platform. Once leaked or illegally used, it will bring irreparable losses to individuals, enterprises and even the country. For this reason, big data privacy protection is getting more and more attention. Data desensitization realizes the protection of sensitive data by masking and deforming sensitive data through given rules and policies. Data desensitization expands the use range and sharing objects of original data without reducing security, and is the most effective sensitive data protection method in the big data environment.

[0003] A good data desensitization process must have the following characteristics: the desensitized data should have most of the characteristics of the original data to ensure busi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62
CPC: G06F21/6218
Inventor: not disclosed
Owner: 北京方盈智能数字科技有限公司