Uncertain data frequent item set publishing method based on differential privacy

A technique for publishing frequent itemsets of uncertain data, applied in digital data protection, electrical digital data processing, instruments, etc. It addresses the large privacy budget of existing methods and their difficulty in balancing privacy protection against data availability, with the effect of ensuring both availability and security, reducing the search space, and improving the efficiency of the algorithm.

Active Publication Date: 2021-03-09
SOUTHEAST UNIV +1

AI Technical Summary

Problems solved by technology

Existing differential-privacy-based methods for publishing frequent itemsets of privacy-preserving uncertain data have the following deficiencies: (1) the privacy protection mechanism is independent of the mining process, so the accuracy of the published top-k frequent itemsets depends on the value of k; (2) the algorithm uses the exponential mechanism, which consumes a large privacy budget when applied to large-scale frequent itemsets, making it difficult to balance data availability and privacy security; (3) privacy protection is applied to the frequent itemsets separately from their supports, so the output may fail to satisfy the top-k descending-order constraint and the correctness constraint.
In summary, in existing methods the privacy protection mechanism is independent of the mining process, and it is difficult to balance privacy security and data availability.

Examples

Embodiment 1

[0029] Example 1: Referring to Figure 1 and Figure 2, the present invention is a method for publishing frequent itemsets of uncertain data based on differential privacy, comprising the following steps:

[0030] Step 1: Set λ0 and mine the frequent itemsets FI from D. From FI, obtain the list ILIST of all frequent 1-itemsets and the support λ of the k-th largest frequent itemset. Set the privacy budget […] and the sensitivity Δq = 1, where q(D, r) is the support count of item r and n is the number of items in ILIST. Choose n items from all 1-itemsets of D, each with probability proportional to […], and add them to […]. Then set a privacy budget […] with Δq = 1 and sample over the range [1, |I|] with probability […] to obtain […], where q(D, i) is the support of the i-th frequent itemset and I is the set of frequent itemsets of D. The parameters obtained in Step 1 serve as the basis for subsequent processing.
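The item selection in Step 1 can be sketched with the standard exponential mechanism. The exact probability expression does not survive in the text above, so this sketch assumes the textbook form, in which item r is drawn with probability proportional to exp(ε · q(D, r) / (2Δq)). The function name, the even split of ε across the n draws, and the drawing without replacement are illustrative assumptions, not details taken from the patent.

```python
import math
import random


def exponential_mechanism_sample(support_counts, n, epsilon, sensitivity=1.0, rng=None):
    """Draw up to n distinct items from support_counts (item -> support count).

    Assumption: each draw uses the textbook exponential mechanism, picking
    item r with probability proportional to
    exp(eps_per_draw * q(D, r) / (2 * sensitivity)).
    """
    rng = rng or random.Random()
    eps_per_draw = epsilon / n  # assumption: budget split evenly across draws
    remaining = dict(support_counts)
    chosen = []
    for _ in range(min(n, len(remaining))):
        items = list(remaining)
        top = max(remaining.values())
        # Subtracting the maximum score keeps exp() from overflowing and
        # does not change the relative selection probabilities.
        weights = [
            math.exp(eps_per_draw * (remaining[item] - top) / (2 * sensitivity))
            for item in items
        ]
        pick = rng.choices(items, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # draw without replacement
    return chosen
```

With a small ε the draws are close to uniform; with a large ε they concentrate on the items with the highest support counts, which is the utility/privacy trade-off the text refers to.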

[0031] Step 2: Transaction truncation. Traverse data set D; for all records whose length exceeds l_opt, truncat...
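A minimal sketch of the truncation in Step 2 follows. The source text breaks off before stating which items a truncated record keeps, so the rule used here (keep the l_opt items with the highest existence probability) is a hypothetical placeholder, as are the function name and the representation of an uncertain record as a mapping from item to probability.

```python
def truncate_transactions(dataset, l_opt):
    """Cap every record at l_opt items.

    dataset: list of records, each a dict item -> existence probability
             (an assumed representation of uncertain data).
    """
    truncated = []
    for record in dataset:
        if len(record) <= l_opt:
            truncated.append(dict(record))  # short records are kept unchanged
            continue
        # Hypothetical rule: keep the l_opt most probable items,
        # breaking ties by item name so the result is deterministic.
        kept = sorted(record.items(), key=lambda kv: (-kv[1], kv[0]))[:l_opt]
        truncated.append(dict(kept))
    return truncated
```

Capping record length in this way is a common device in differentially private itemset mining, because it bounds how many itemset counts a single record can affect.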

Abstract

The invention discloses a method for publishing frequent itemsets of uncertain data based on differential privacy, comprising the following steps: (1) given a data set D, mining frequent itemsets from D to obtain the frequent 1-itemset list ILIST and the support λ of the k-th frequent itemset, and adding noise to both; (3) extracting from D' an estimate of the length of the maximal frequent itemset containing each item x, and using this estimate as hierarchical information; (4) setting the initial node to the empty set in D', and screening with the hierarchical-information constraint and the subset-frequency constraint to obtain the candidate item list NodeList of the current node; and (5) selecting the top-k frequent itemsets and their supports from FISet and publishing them. The whole process satisfies differential privacy and can improve the privacy security of the data.

Description

Technical field

[0001] The invention relates to a method for publishing frequent itemsets of uncertain data based on differential privacy, and belongs to the class of frequent itemset publishing algorithms in the field of uncertain data.

Background

[0002] Research on privacy-preserving publication of frequent itemsets of uncertain data is still in its infancy, and existing work is mainly based on differential privacy. Differential privacy has become a popular privacy protection model because it makes no assumption about the background knowledge held by the attacker. It achieves privacy protection by adding noise to query or analysis results. Existing differential-privacy-based methods for publishing frequent itemsets of uncertain data have the following shortcomings: (1) The privacy protection mechanism is independent of the mining process, which causes the accuracy of the published top-k...
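The background statement that differential privacy protects data by adding noise to query results can be illustrated with the Laplace mechanism, a standard way of doing so. The text does not say which noise distribution the invention uses, so the choice of Laplace noise, the function name, and the default sensitivity of 1 are assumptions made for illustration only.

```python
import random


def noisy_support(true_support, epsilon, sensitivity=1.0, rng=None):
    """Return true_support plus Laplace(0, sensitivity / epsilon) noise.

    Assumption: the Laplace mechanism stands in for the unspecified
    noise-adding step; smaller epsilon means more noise.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    # The difference of two independent exponential variables with
    # mean `scale` follows a Laplace distribution with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_support + noise
```

For a support count with sensitivity 1, calling `noisy_support(count, epsilon=0.5)` perturbs the count by noise with standard deviation of about 2.8, while a larger ε yields values closer to the true count.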

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62
CPC: G06F21/6245
Inventor: 倪巍伟, 邹云峰, 鲍晓涵
Owner: SOUTHEAST UNIV