
Data desensitization method and device

A data desensitization technology, applied in the field of data desensitization, that addresses problems such as research being rendered meaningless, high overhead, and reduced data availability, and achieves the effect of simplifying computational complexity and reducing unnecessary overhead.

Active Publication Date: 2020-06-05
HENGRUITONG FUJIAN INFORMATION TECH CO LTD
7 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

If all sensitive attributes in the data are removed, the data becomes meaningless for research.
[0003] At present, existing research on privacy leakage in data release mainly uses methods such as restricting data release, data scrambling, and k-anonymity. Although these methods protect data privacy to a certain extent, they still have some security and usability flaws.
For example, restricting data release mainly cuts off the associations between data, but it reduces the availability of the data, and the amount of data released is not easy to control. Data scrambling perturbs the data by adding appropriate noise, which helps maintain the characteristics of the data, but it has low clustering availability and high computational overhead. k-anonymity requires that the published data contain at least k indistinguishable records, so that an attacker cannot identify the specific individual to whom private information belongs. This protects individual privacy to a certain extent, but it also reduces the clustering availability of the data.
[0004] Therefore, existing privacy protection mechanisms for data publishing have two main problems: complex calculation with high cost, and the difficulty of balancing data availability against privacy.
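The k-anonymity requirement described above (at least k indistinguishable records) can be made concrete with a small check. This is a minimal sketch, not part of the patent: the function name `is_k_anonymous`, the notion of a list of quasi-identifier attributes, and the sample records are all assumptions introduced for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `records` occurs in at least k records."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized records (illustrative data only).
records = [
    {"age": "30-39", "zip": "350*", "disease": "flu"},
    {"age": "30-39", "zip": "350*", "disease": "asthma"},
    {"age": "40-49", "zip": "361*", "disease": "flu"},
]

# The third record is alone in its (age, zip) group, so k=2 fails.
print(is_k_anonymous(records, ["age", "zip"], 2))      # False
print(is_k_anonymous(records[:2], ["age", "zip"], 2))  # True
```

The check only tests the property; it says nothing about how much generalization was needed to reach it, which is where the loss of data availability mentioned above comes from.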



Examples


Embodiment 1

[0028] Referring to Figure 1, a data desensitization method comprises the following steps:

[0029] S1. Obtain the original data and perform kernelization processing to obtain new data;

[0030] Step S1 is specifically:

[0031] The original data is obtained, and the nonlinear data in the original data is converted into linear data by kernelization processing to obtain new data.

[0032] S2. Perform dimensionality reduction processing on the new data to obtain dimensionality-reduced data;

[0033] Step S2 is specifically:

[0034] The dimensionality reduction processing is performed on the new data through principal component analysis to obtain dimensionality-reduced data.
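Steps S1 and S2 can be illustrated with a kernel PCA sketch. The source does not name the kernel or say how the two steps are combined, so this is only one possible reading: it assumes an RBF kernel and the standard kernel PCA formulation, in which kernelization and principal component analysis happen in one pass. The function names and the parameters `gamma` and `n_components` are assumptions introduced here.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma):
    """Pairwise RBF kernel exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)


def kernel_pca(X, n_components, gamma=1.0):
    """Project the rows of X onto the top principal components in the
    feature space induced by an RBF kernel (assumed kernel choice)."""
    n = X.shape[0]
    K = rbf_kernel_matrix(X, gamma)
    # Center the kernel matrix, i.e. center the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]
    # Coordinates of the training points: sqrt(lambda_j) * v_j.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Illustrative data only: 6 records with 3 numerical attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Z = kernel_pca(X, n_components=2, gamma=0.5)
print(Z.shape)  # (6, 2)
```

The number of retained components controls how much redundant information is discarded; the source does not state how that number is chosen.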

[0035] S3. Perform centralized processing on the data after dimension reduction to obtain desensitized data.

[0036] The centralized processing includes:

[0037] An equivalence set corresponding to the dimensionality-reduced data is constructed according to the distance minimization principle.

[0038] The cent...
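The equivalence set construction in [0037] can be sketched as a greedy nearest-neighbour grouping. The source states only the distance minimization principle, so the minimum set size `k`, the Euclidean metric, the choice of seed record, and the handling of leftover records are all assumptions, and `build_equivalence_sets` is a hypothetical name. The sketch stops at grouping because the description of what is done with each set is cut off in the source.

```python
import math


def build_equivalence_sets(points, k):
    """Greedily partition record indices into sets of at least k members,
    each built from a seed record and its k-1 nearest remaining records."""
    if k < 1:
        raise ValueError("k must be at least 1")
    remaining = list(range(len(points)))
    groups = []
    while len(remaining) >= k:
        seed = remaining[0]
        # Order the other unassigned records by distance to the seed.
        others = sorted(
            remaining[1:], key=lambda i: math.dist(points[seed], points[i])
        )
        group = [seed] + others[: k - 1]
        groups.append(group)
        chosen = set(group)
        remaining = [i for i in remaining if i not in chosen]
    if remaining:
        if not groups:
            # Fewer than k records in total: only one undersized set is possible.
            return [remaining]
        # Attach each leftover record to the set whose seed is nearest.
        for i in remaining:
            nearest = min(groups, key=lambda g: math.dist(points[g[0]], points[i]))
            nearest.append(i)
    return groups


# Illustrative dimensionality-reduced records (2 components each).
points = [(0, 0), (0, 1), (10, 10), (10, 11), (4, 4)]
print(build_equivalence_sets(points, 2))  # [[0, 1, 4], [2, 3]]
```

A greedy pass like this is cheap but order-dependent; a different seed order can yield different sets.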

Embodiment 2

[0041] The difference between this embodiment and Embodiment 1 is that this embodiment will further illustrate how the above-mentioned data desensitization method of the present invention is implemented in combination with specific application scenarios:

[0042] The present invention mainly includes two stages: data dimensionality reduction and centralized processing.

[0043] 1. Data dimensionality reduction stage

[0044] Obtain the original table data $S_{n \times h}$ to be published, where n is the number of records in the table data and h is the dimension of the table data. First, kernelization processing converts the numerical nonlinear data in the original table data $S_{n \times h}$ into numerical linear data, giving the new table data $S'_{n \times h}$. Then principal component analysis performs dimensionality reduction on the new data $S'_{n \times h}$ to obtain the dimensionality-reduced table data $S''$.
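In matrix terms, the principal component analysis step of [0044] can be written as follows. This is the textbook formulation, not one quoted from the patent: the source names only S, S′ and S″, so the mean vector $\mu$, the covariance matrix $C$, the projection matrix $W_d$, and the target dimension $d$ are assumptions introduced here.

```latex
\begin{aligned}
\mu &= \frac{1}{n}\, {S'}^{\top} \mathbf{1}_n, \qquad
C = \frac{1}{n-1}\,\bigl(S' - \mathbf{1}_n \mu^{\top}\bigr)^{\top}\bigl(S' - \mathbf{1}_n \mu^{\top}\bigr), \\
C\, w_j &= \lambda_j w_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_h \ge 0, \qquad
W_d = [\, w_1 \;\cdots\; w_d \,] \in \mathbb{R}^{h \times d}, \\
S''_{n \times d} &= \bigl(S'_{n \times h} - \mathbf{1}_n \mu^{\top}\bigr)\, W_d, \qquad d \le h .
\end{aligned}
```

Under this reading, the table keeps its n records while its width shrinks from h to d columns.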

[0045] Each record includes m public attributes and t sensitive at...

Embodiment 3

[0074] Referring to Figure 2, a data desensitization device 1 includes a memory 2, a processor 3, and a computer program stored in the memory 2 and executable on the processor 3; when executing the program, the processor 3 implements each step of Embodiment 1.


Abstract

The invention provides a data desensitization method and device. The method comprises: obtaining original data and performing kernelization processing to obtain new data; performing dimensionality reduction on the new data to obtain dimension-reduced data, so as to remove redundant information in the data, simplify the computational complexity, and reduce unnecessary expenditure; and performing centralized processing on the dimension-reduced data to obtain desensitized data, thereby protecting private data while ensuring the availability of the data.

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a data desensitization method and device.

Background technique

[0002] In recent years, with the continuous development of information technology, the generation of personal data has grown exponentially, and a large amount of personal information has been stored and released by government departments and commercial organizations. As a means of information sharing, data release not only provides convenience for data exchange and data sharing, but also increases the risk of personal privacy data leakage. "Privacy data" refers to sensitive information that the data owner does not want others to know, such as home address, ID number, phone number, disease information, location information, etc. For example, in order to study the usage of various types of drugs and the patient's illness, the relevant departments may require the hospital to provide relevant drug purchase form...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F21/62
CPC: G06F21/6245
Inventor: 张美跃, 周业, 陈佳伟, 周定云, 俞宏青, 俞基锋
Owner: HENGRUITONG FUJIAN INFORMATION TECH CO LTD