A document desensitization system and method based on big data

A big data-based desensitization technology, applied to file systems, digital data protection, and digital data processing, that addresses two problems: the high proportion of unstructured data in big data and the inadequacy of static desensitization.

Inactive Publication Date: 2019-01-29
CHINA ELECTRONICS TECH CYBER SECURITY CO LTD

AI Technical Summary

Problems solved by technology

[0004] 1. The proportion of unstructured data in big data is relatively high. Existing products can only support structured data. How to deal with unstructured documen...




Embodiment Construction

[0088] In order to better understand the present invention, the present invention will be described in detail below in conjunction with the accompanying drawings.

[0089] The document desensitization system desensitizes documents in Word, Excel, PPT, TXT, PDF, and XML formats. The system is mainly composed of a system management module, a data source management module, a sensitive data discovery module, a desensitization task management module, a desensitization configuration management module, a desensitization verification module, a multi-level management module, and a security audit module.
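The format coverage described above can be pictured as a dispatch table keyed by file type. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `EXTRACTORS`, `register`, `extract_plain`, and `extract_text` are hypothetical, the mapping of "Word", "Excel", and "PPT" to specific file extensions is an assumption, and only the plain-text formats have a handler here because the binary formats would need a dedicated parser.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical handler type: raw file bytes in, extracted text out.
Extractor = Callable[[bytes], str]

# Hypothetical registry mapping a lowercase file extension to its handler.
EXTRACTORS: Dict[str, Extractor] = {}

# Assumed extensions for the six formats the text names
# (Word, Excel, PPT, TXT, PDF, XML).
SUPPORTED = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
             ".txt", ".pdf", ".xml"}


def register(*extensions: str) -> Callable[[Extractor], Extractor]:
    """Decorator that registers one handler for several extensions."""
    def decorator(func: Extractor) -> Extractor:
        for ext in extensions:
            EXTRACTORS[ext.lower()] = func
        return func
    return decorator


@register(".txt", ".xml")
def extract_plain(data: bytes) -> str:
    """Decode text-based formats; undecodable bytes are replaced."""
    return data.decode("utf-8", errors="replace")


def extract_text(filename: str, data: bytes) -> str:
    """Route a document to the handler for its format."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported document format: {ext!r}")
    handler = EXTRACTORS.get(ext)
    if handler is None:
        # Word, Excel, PPT, and PDF need a real parser, omitted here.
        raise NotImplementedError(f"no extractor registered for {ext!r}")
    return handler(data)
```

A registry keeps format-specific parsing separate from the discovery and masking steps, so a new format only needs one more registered handler.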

[0090] As shown in Figure 1, the big data-based document desensitization system of the present invention includes a system management module that manages roles and users; a data source management module that provides data source registration and supplies data sources and data descriptions for data desensitization tasks; and a sensitive data discovery module that automatically discovers sensitive data in docume...



Abstract

A big data-based document desensitization system desensitizes documents in Word, Excel, PPT, TXT, PDF, and XML formats. The system is mainly composed of a system management module, a data source management module, a sensitive data discovery module, a desensitization task management module, a desensitization configuration management module, a desensitization verification module, a multi-level management module, and a security audit module. Using natural language processing and semantic analysis, the invention identifies sensitive data in a document with high accuracy. The invention provides both static and dynamic desensitization of unstructured data such as documents, which keeps document sharing and exchange secure in a big data environment. By parsing the document, identifying its sensitive data, and desensitizing that data, the invention leaves the document's original format intact and effectively solves the difficulty of desensitizing documents.
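The discover-then-mask flow in the abstract can be illustrated with a small sketch. It rests on loud assumptions: regular expressions stand in for the natural language processing and semantic analysis that the invention actually describes, the two patterns (an 11-digit number starting with 1, and an email address) are illustrative choices, and the names `Finding`, `PATTERNS`, `discover`, and `desensitize` are hypothetical. Masking each match with the same number of characters is one simple way to leave the surrounding layout untouched.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    """One detected sensitive span: its kind and character offsets."""
    kind: str
    start: int
    end: int


# Illustrative patterns only; a stand-in for NLP-based discovery.
PATTERNS = {
    "phone": re.compile(r"(?<!\d)1\d{10}(?!\d)"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def discover(text: str) -> List[Finding]:
    """Return all pattern matches, ordered by position in the text."""
    findings = [
        Finding(kind, match.start(), match.end())
        for kind, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)


def desensitize(text: str, mask_char: str = "*") -> str:
    """Overwrite every detected span with mask characters, one per
    original character, so the text length and layout are unchanged."""
    chars = list(text)
    for finding in discover(text):
        for i in range(finding.start, finding.end):
            chars[i] = mask_char
    return "".join(chars)
```

Calling `desensitize("Call 13812345678 or mail a.b@example.com today.")` replaces the phone number and the email address with asterisks while every other character keeps its original offset.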

Description

Technical field

[0001] The invention relates to the intersection of computer technology and information security, and in particular to a document desensitization system and method based on big data.

Background technique

[0002] As informatization deepens, enterprises and governments hold more and more data, and data is shared and exchanged within and across departments with increasing frequency, so data security issues are becoming more prominent. The data contains a large amount of sensitive private information that would cause irreparable losses if leaked. According to the research of relevant organizations, 85% of current data is unstructured. How to identify sensitive data in this unstructured data and desensitize it is an urgent problem to be solved.

[0003] At present, most data desensitization products on the market are aimed at static desensitization of structured databases, which can ...


Application Information

IPC(8): G06F21/62, G06F16/13
CPC: G06F21/6245
Inventor: 陈天莹, 李霄, 李全兵, 郭小华
Owner CHINA ELECTRONICS TECH CYBER SECURITY CO LTD