Method and system for realizing data set desensitization based on k-anonymity algorithm

A k-anonymity algorithm and data set technology, applied in the field of data desensitization, that addresses the information loss incurred when a data set is desensitized, achieving the effect of reducing information loss and improving data usability.

Active Publication Date: 2019-12-06
JINAN UNIVERSITY

AI Technical Summary

Problems solved by technology

[0017] The prior art does not take into account the distance between adjacent data points in a temporary anonymous group, so information loss is likely when the data set is generated. In addition, existing partitioning produces rectangles, which have region corners, and this reduces the usability of the resulting data. To solve these problems, a method and system for desensitizing data sets based on the k-anonymity algorithm are provided. They address the region-corner problem and, on the premise of protecting privacy, produce more anonymous groups; the more anonymous groups there are, the lower the degree of generalization and the higher the availability of the data.

Examples

Embodiment 1

[0092] A method for realizing data set desensitization based on k-anonymity algorithm, said method comprising the following steps:

[0093] S1: Input the original data set T and set the parameters τ, P, and k, where P represents the record set in the data table, τ represents the hypersphere region occupied by the record set, and k is the parameter of the algorithm, meaning that at least k records share the same quasi-identifier values, so that each record has a probability of at most 1/k of being identified;

[0094] S2: Delete the explicit identifier of each record in the original data set, and define an order on the value domain of each attribute in the quasi-identifier so that it becomes an ordered domain; then map the ordered domains one by one to the real number domain;

[0095] S3: Set range=τ and TMP=P, and express |P| as a linear function of k, |P|=αk+β, where β is a non-negative number smaller than k, and α represents the relationship between the number of records in the anony...
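Steps S2 and S3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the helper names `to_points` and `split_size`, the sample attributes, and the choice of mapping each ordered value to its position in the domain are all assumptions made for illustration. The text of S3 is truncated, so only the arithmetic |P| = αk + β with 0 ≤ β < k is shown.

```python
def to_points(records, explicit_ids, ordered_domains):
    """S2 sketch: drop explicit identifiers, then map each quasi-identifier
    value to a real number (assumption: its position in the ordered domain)."""
    points = []
    for rec in records:
        # Remove explicit identifiers such as a name or ID number.
        kept = {attr: val for attr, val in rec.items() if attr not in explicit_ids}
        # Map every quasi-identifier value to a float coordinate.
        points.append(tuple(float(ordered_domains[attr].index(kept[attr]))
                            for attr in ordered_domains))
    return points


def split_size(num_records, k):
    """S3 arithmetic: write |P| as alpha * k + beta with 0 <= beta < k."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    alpha, beta = divmod(num_records, k)
    return alpha, beta


# Hypothetical sample data; attribute names are illustrative only.
ordered_domains = {
    "education": ["primary", "secondary", "bachelor", "master"],
    "age_band": ["20-29", "30-39", "40-49"],
}
records = [
    {"name": "A", "education": "bachelor", "age_band": "30-39"},
    {"name": "B", "education": "primary", "age_band": "20-29"},
]
points = to_points(records, explicit_ids={"name"}, ordered_domains=ordered_domains)
alpha, beta = split_size(23, 5)
```

With 23 records and k = 5, the decomposition gives α = 4 and β = 3, since 23 = 4·5 + 3.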

Abstract

The invention discloses a method and system for realizing data set desensitization based on a k-anonymity algorithm. The method comprises the following steps: acquiring a data set that has not been desensitized; deleting the explicit identifiers of the data set, and defining an order on the value domain of each attribute in the quasi-identifier to form ordered domains; mapping the ordered domains one by one to the real number domain; defining the distance between data points in the space, calculating the relative distances, determining the division points of the data set from the relative distances in combination with a projection area density division algorithm, recursively solving the division points at each level, and finally establishing a group of hyperspheres; and generalizing the point information contained in each hypersphere of the group so that all records in it have the same quasi-identifier values, which completes the desensitization. The method mitigates the region-corner problem of rectangular partitions and takes into account the distance between adjacent points in a temporary anonymous group, so that more anonymous groups are obtained while privacy protection is still guaranteed; the degree of generalization of the data is therefore lower and the availability of the data is higher.
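The final generalization step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the groups are taken as given (the hypersphere construction is not reproduced here), each quasi-identifier is generalized to the interval [min, max] of its group, and the function names `generalize_group` and `is_k_anonymous` are hypothetical.

```python
from collections import Counter


def generalize_group(points):
    """Replace every point in a group with the same per-dimension (min, max)
    interval, so all records in the group share identical quasi-identifiers.
    Assumption: interval generalization; the source does not fix the encoding."""
    dims = list(zip(*points))  # one tuple of values per dimension
    interval = tuple((min(d), max(d)) for d in dims)
    return [interval for _ in points]


def is_k_anonymous(rows, k):
    """Check that every distinct quasi-identifier row occurs at least k times."""
    counts = Counter(rows)
    return all(c >= k for c in counts.values())


# Hypothetical groups of already-mapped points (two records per group).
groups = [
    [(1.0, 5.0), (3.0, 4.0)],
    [(7.0, 1.0), (9.0, 2.0)],
]
released = [row for group in groups for row in generalize_group(group)]
```

Because each group here holds two records, the released table satisfies 2-anonymity but not 3-anonymity. Tighter groups give narrower intervals, which is the sense in which more anonymous groups mean less generalization.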

Description

Technical field

[0001] The present invention relates to the technical field of data desensitization, and more specifically, to a method and system for desensitizing data sets based on a k-anonymity algorithm.

Background

[0002] The common approach to anonymizing private data originates from data processing methods for statistical databases. It mainly accepts some information loss in the attribute values of the published data in exchange for lowering the accuracy with which individuals can be re-identified through those attribute values, while still guaranteeing the availability of the published data, so as to balance the accuracy of the published data against privacy protection.

[0003] As far as the current technology is concerned, document [1] discloses a division strategy for anonymous groups, the "anonymous algorithm based on rounding and division" (RPF), and a "k-anonymous algorithm based on vertex and edge modification" is disclosed in docume...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62, G06F17/18
CPC: G06F21/6254, G06F17/18
Inventors: 陈成, 赖兆荣
Owner: JINAN UNIVERSITY