Differential privacy protection clustering method based on K-means

A differential privacy and clustering technique, applied in the field of information security, that addresses problems such as slow convergence, a low level of privacy protection, and high computational complexity.

Active Publication Date: 2021-01-08
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0003] From the perspective of privacy protection, the following methods are generally adopted: 1) Cryptography: the information retains full fidelity, but the computational complexity is too high.
2) Anonymization: the level of privacy protection is relatively low, and the approach may run into NP-hard problems.
The second flaw is that these early privacy-preserving models provide no efficient, rigorous way to demonstrate their level of privacy protection, so when the model parameters change, that level cannot be analyzed quantitatively.
However, traditional differentially private k-means algorithms have drawbacks of their own: k-means is extremely sensitive to the initial center points, and the random noise added during iteration slows convergence.


Examples

Embodiment Construction

[0029] The technical solution of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0030] In the k-means-based differential privacy protection clustering algorithm of the present invention, the upper and lower bounds of the privacy budget in each iteration are limited by an arithmetic sequence, and the allocation of the privacy budget is determined according to the clustering effect U(ε) and the privacy protection effect V(ε); this is called the adaptive privacy budget allocation mechanism. When calculating the distance between a data point and a cluster center, the triangle inequality is used to reduce computation time and improve efficiency. Two problems need to be solved: 1. how to find the optimal privacy budget under adaptive privacy budget allocation; 2. how to achieve differential privacy protection in the process of cluster analysis. The following is divided...
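The triangle-inequality speed-up mentioned in [0030] can be illustrated with a minimal sketch. The excerpt does not say which pruning rule the patent applies, so the rule below is an assumption: it is the standard bound that if d(c_best, c_j) >= 2 * d(x, c_best), then c_j cannot be closer to x than c_best. The function name `assign_with_pruning` and the `skipped` counter are hypothetical and exist only for illustration.

```python
import numpy as np

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality proves unnecessary."""
    # Center-to-center distances, computed once per assignment pass.
    cc = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    labels = np.empty(len(points), dtype=int)
    skipped = 0  # number of point-to-center distances never computed
    for i, x in enumerate(points):
        best = 0
        best_d = np.linalg.norm(x - centers[0])
        for j in range(1, len(centers)):
            # Assumed pruning rule: d(c_best, c_j) >= 2 * d(x, c_best)
            # implies d(x, c_j) >= d(x, c_best), so c_j can be skipped.
            if cc[best, j] >= 2.0 * best_d:
                skipped += 1
                continue
            d = np.linalg.norm(x - centers[j])
            if d < best_d:
                best, best_d = j, d
        labels[i] = best
    return labels, skipped

points = np.array([[0.1, 0.0], [9.9, 10.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
labels, skipped = assign_with_pruning(points, centers)
```

In this toy input the first point is so close to the first center that the distance to the second center is never computed, while the second point still needs both distances. The saving grows when clusters are well separated relative to their spread.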

Abstract

The invention discloses a differential privacy protection clustering method based on K-means. The method comprises data preprocessing, clustering iteration, and differential privacy steps. Each iteration proceeds as follows: set a privacy budget, calculate the noise, compute the sum of the data points and the number of points in each cluster, and then add the calculated noise to the data points. For each iteration, an upper bound and a lower bound on the privacy budget are maintained, and an adaptive privacy budget allocation between the upper bound and the lower bound is selected according to the availability and the degree of privacy protection. These steps are repeated until the sum of squared errors converges or the upper limit on the number of iterations is reached. Through adaptive privacy budget allocation during the iterations of the K-means clustering algorithm, the method protects data privacy on the basis of data distortion while guaranteeing data availability.
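The iteration described in the abstract can be sketched roughly as follows. This is not the patented procedure: the adaptive budget allocation is replaced by a fixed per-iteration budget `eps_per_iter`, the noise is assumed to be Laplace noise applied to each cluster's coordinate sum and point count, and the data are assumed to be preprocessed into [0, 1]^d so that the sensitivity is d + 1. The names `noisy_centers` and `dp_kmeans` are hypothetical.

```python
import numpy as np

def noisy_centers(points, labels, k, epsilon, rng):
    """One noisy center update: perturb each cluster's coordinate sum
    and point count with Laplace noise, then divide."""
    d = points.shape[1]
    # Assumption: points lie in [0, 1]^d, so one record changes a sum by
    # at most d (L1 norm) and a count by 1, giving sensitivity d + 1.
    scale = (d + 1) / epsilon
    centers = np.zeros((k, d))
    for j in range(k):
        members = points[labels == j]
        noisy_sum = members.sum(axis=0) + rng.laplace(0.0, scale, size=d)
        noisy_count = len(members) + rng.laplace(0.0, scale)
        # Guard against a tiny or negative noisy count.
        centers[j] = noisy_sum / max(noisy_count, 1.0)
    return np.clip(centers, 0.0, 1.0)

def dp_kmeans(points, k, eps_per_iter, max_iter=10, tol=1e-3, seed=0):
    """Repeat assignment and noisy update until the sum of squared
    errors (SSE) stops changing or max_iter is reached."""
    rng = np.random.default_rng(seed)
    # Data-independent initial centers (an assumption of this sketch).
    centers = rng.random((k, points.shape[1]))
    prev_sse = np.inf
    for _ in range(max_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        sse = float((dists.min(axis=1) ** 2).sum())
        if abs(prev_sse - sse) < tol:
            break
        prev_sse = sse
        centers = noisy_centers(points, labels, k, eps_per_iter, rng)
    return centers, labels

data = np.random.default_rng(1).random((200, 2))
centers, labels = dp_kmeans(data, k=3, eps_per_iter=1.0)
```

Under sequential composition, the noisy updates together consume at most `eps_per_iter` times the number of iterations, which is why how the budget is split across iterations matters. An adaptive scheme such as the one the abstract describes would replace the constant `eps_per_iter` with a value chosen per iteration between the maintained bounds.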

Description

Technical field

[0001] The invention belongs to the technical field of information security and relates to clustering methods and privacy protection technology, in particular to a differential privacy protection clustering method based on K-means.

Background technique

[0002] With the continuing spread and deepening of information technology applications, information systems have stored and accumulated rich data. Faced with massive data, data holders in industry can extract potential value through data mining technology; in academia, data mining technology has also made great progress in research and application. Clustering, as one of the most commonly used data mining techniques, is widely applied. At the same time, these data contain a great deal of sensitive information, which can bring immeasurable threats and losses to users. Therefore, it is necessary to protect data privacy in the process of cluster analysis.

[0003] From the pe...

Application Information

IPC(8): G06F21/62, G06K9/62
CPC: G06F21/6245, G06F18/23213
Inventor: 李鹏, 朱祥, 王汝传, 徐鹤, 程海涛, 朱枫, 张玉杰, 李正材
Owner: NANJING UNIV OF POSTS & TELECOMM