Anonymous method for missing data and storage medium

An anonymization technique for data sets that contain missing data, applied in digital data protection, electronic digital data processing, instruments, etc. It addresses the information loss that impairs the normal use of data sets, with the effect of improving security, reducing information loss, and ensuring availability.

Pending Publication Date: 2020-10-30
ZHENGZHOU UNIV

AI Technical Summary

Problems solved by technology

Traditional anonymization techniques discard the incomplete records in medical data sets during data processing, which causes excessive information loss and impairs the normal use of the data sets.

Examples

Embodiment Construction

[0051] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments.

[0052] The present invention provides an anonymization method for missing data. The method uses an information-entropy-based distance calculation to anonymize data sets that contain missing data, and includes the following steps:

[0053] ST1: Set the clustering parameter k, which requires that each group formed by clustering contain no fewer than k data records;

[0054] ST2: Set the l-diversity model parameter l, which requires that each group include no fewer than l different quasi-identification attribute types;

[0055] ST3: According to the similarity judgment values of the data records in the data set, the l-diversity model parameter, and the clustering parameter, cluster all the data records in the data set, dividing the data set into multiple clusters and obtaining multiple clustered data s...
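
The constraints that ST1 and ST2 place on each cluster can be checked with a short helper. The sketch below is illustrative only: the record layout, the function name `satisfies_k_and_l`, and the choice to count distinct non-missing values of one named attribute are assumptions, not details taken from the patent. `None` stands in for a missing value.

```python
from typing import Dict, List, Optional

# Hypothetical record layout: attribute name -> value, with None for a missing value.
Record = Dict[str, Optional[str]]


def satisfies_k_and_l(cluster: List[Record], k: int, l: int, diversity_attr: str) -> bool:
    """Return True if the cluster holds at least k records and at least
    l distinct non-missing values of diversity_attr (assumed semantics)."""
    if len(cluster) < k:
        return False
    distinct = {
        record.get(diversity_attr)
        for record in cluster
        if record.get(diversity_attr) is not None
    }
    return len(distinct) >= l


# Made-up example records, not data from the patent.
cluster = [
    {"age": "34", "zip": "450001", "diagnosis": "flu"},
    {"age": None, "zip": "450002", "diagnosis": "asthma"},
    {"age": "36", "zip": None, "diagnosis": "flu"},
]
print(satisfies_k_and_l(cluster, k=3, l=2, diversity_attr="diagnosis"))
print(satisfies_k_and_l(cluster, k=3, l=3, diversity_attr="diagnosis"))
```

With these three records the first call succeeds (three records, two distinct diagnoses) and the second fails, because only two distinct values are present. A clustering loop in the spirit of ST3 could keep adding the nearest records to a cluster until such a check passes.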

Abstract

The invention provides a missing data anonymization method and a storage medium. The method comprises: setting a clustering parameter k; setting an l-diversity model parameter l; clustering all the data records in the data set according to the similarity judgment values of the data records, the l-diversity model parameter, and the clustering parameter, thereby dividing the data set into a plurality of clusters; and generalizing each cluster in the data set to obtain an equivalence class, in which all data records share the same values on the quasi-identifier attributes, thereby completing the anonymization. By processing the incomplete data set, the method preserves the availability of the anonymized data to the greatest extent and reduces the information loss caused by traditional anonymization methods. Meanwhile, l-diversity protects the sensitive attributes related to a user, improving the security of the anonymized data set.
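
The generalization step described in the abstract can be pictured with a small sketch. It is a minimal illustration under stated assumptions, not the patented procedure: the function name `generalize_cluster`, the range and set notations, the `*` fallback, and the decision to ignore missing values (`None`) when computing the shared value are all hypothetical choices.

```python
from typing import Dict, List, Sequence, Union

Value = Union[int, str, None]  # None marks a missing value (assumed encoding)
Record = Dict[str, Value]


def generalize_cluster(
    cluster: List[Record],
    numeric_qis: Sequence[str],
    categorical_qis: Sequence[str],
) -> List[Record]:
    """Give every record in the cluster the same value on each quasi-identifier."""
    shared: Dict[str, Value] = {}
    for attr in numeric_qis:
        # Missing values are skipped when computing the covering range.
        known = [r[attr] for r in cluster if r.get(attr) is not None]
        if not known:
            shared[attr] = "*"  # nothing known: fully suppress
        elif min(known) == max(known):
            shared[attr] = str(min(known))
        else:
            shared[attr] = f"[{min(known)}-{max(known)}]"
    for attr in categorical_qis:
        known = sorted({str(r[attr]) for r in cluster if r.get(attr) is not None})
        if not known:
            shared[attr] = "*"
        elif len(known) == 1:
            shared[attr] = known[0]
        else:
            shared[attr] = "{" + ", ".join(known) + "}"
    # Build new records so the input cluster is left unchanged.
    return [{**record, **shared} for record in cluster]


# Made-up example records, not data from the patent.
records = [
    {"age": 34, "zip": "450001", "diagnosis": "flu"},
    {"age": None, "zip": "450001", "diagnosis": "asthma"},
    {"age": 36, "zip": None, "diagnosis": "flu"},
]
for row in generalize_cluster(records, numeric_qis=["age"], categorical_qis=["zip"]):
    print(row)
```

After the call, all three records carry the same `age` and `zip` values, so they form one equivalence class, while the `diagnosis` column is left untouched. Records with missing quasi-identifier values are kept and simply inherit the shared value instead of being dropped.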

Description

Technical field

[0001] The invention relates to an anonymization method for missing data, in particular to an anonymization method for data sets that contain incomplete data.

Background

[0002] The dissemination and sharing of large amounts of information brings convenience to social development and people's lives, but it also increases the risk of private information being leaked. Although explicit identifiers (such as name, ID number, credit card number, etc.) are removed when data is released, an attacker can still carry out linkage attacks across multiple data sets, which causes a certain degree of privacy leakage. k-anonymity and its series of optimization schemes have been extensively studied and applied to the anonymity protection of published data. These methods reduce the information loss caused by generalization, but at the same time increase the risk of privacy leakage. When the existing ...

Application Information

IPC(8): G06F21/62, G06K9/62
CPC: G06F21/6254, G06F18/23213
Inventor: 刘炜, 佘维, 田钊, 裴孟丽, 程聪聪
Owner: ZHENGZHOU UNIV