A distributed data anonymous processing method based on MapReduce

A distributed data processing technology, applied in the field of data processing, that achieves efficient processing and addresses insufficient server storage and computing power.

Active Publication Date: 2019-05-28

徐工汉云技术股份有限公司

Cites: 3; Cited by: 0

Problems solved by technology

However, no feasible solution exists for global anonymization.

Examples

Embodiment

[0041] Taking a data table with four quasi-identifiers as an example, the specific implementation process is as follows:

[0042] Step 1: first, determine the quasi-identifiers of the data table. Taking a data table with four quasi-identifiers (Supplier, Code, Price, Time) as an example, S0 (supplier), C0 (material code), P0 (material price), and T0 (process time) are the quasi-identifiers, that is, the attributes to be generalized. Based on prior knowledge, formulate a generalization rule for each quasi-identifier:

[0043] For example, generalizing the supplier attribute {a certain limited company in Xuzhou City, a certain limited company in Beijing, a certain limited company in Hefei, a certain limited company in Suzhou, ...} from layer h=0 to layer h=1 yields {Jiangsu Province, Beijing, Anhui Province, ...}, and generalizing from h=1 to h=2 yields {China}; the specific process time (T) is generalized to {≤30min, >30m...
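The layered rule in [0043] can be pictured as a lookup per generalization level. The following is a minimal Python sketch, not the patent's implementation: the function names, the level-2 wildcard for time, and the label of the second time bucket (the source text is truncated at that point) are assumptions made for illustration.

```python
# Hypothetical generalization rules for two of the quasi-identifiers.
# Level h=0 is the original value; higher levels are coarser.

# h=0 -> h=1 mapping for suppliers, using the examples named in [0043].
SUPPLIER_TO_REGION = {
    "a certain limited company in Xuzhou City": "Jiangsu Province",
    "a certain limited company in Beijing": "Beijing",
    "a certain limited company in Hefei": "Anhui Province",
}


def generalize_supplier(value: str, level: int) -> str:
    """Return the supplier value generalized to the given layer h."""
    if level == 0:
        return value
    if level == 1:
        return SUPPLIER_TO_REGION[value]
    if level == 2:
        return "China"
    raise ValueError(f"unsupported supplier level: {level}")


def generalize_time(minutes: float, level: int) -> str:
    """Return the process time generalized to the given layer h."""
    if level == 0:
        return str(minutes)
    if level == 1:
        # The 30-minute threshold comes from the text; the label of the
        # upper bucket is an assumed completion of the truncated source.
        return "<=30min" if minutes <= 30 else ">30min"
    if level == 2:
        # Assumption: the top of the hierarchy suppresses the value entirely.
        return "*"
    raise ValueError(f"unsupported time level: {level}")


if __name__ == "__main__":
    print(generalize_supplier("a certain limited company in Hefei", 1))
    print(generalize_time(12.5, 1))
```

Keeping each attribute's hierarchy as a separate function means a node of the generalization lattice can be written as a tuple of levels, one per quasi-identifier.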

Abstract

The invention discloses a MapReduce-based distributed data anonymization method involving a server side and computer terminals. The original data table is stored on the server side, which performs global generalization on the data and produces a generalization lattice whose nodes may satisfy k-anonymity. The server side uses bisection to allocate a computation node to each computer terminal; the computer terminals compute in parallel, and each returns a value to the server side according to its result. If the returned value does not satisfy k-anonymity, the server side sends a descendant node determined by bisection; otherwise it sends an ancestor node determined by bisection. Each computer terminal then recomputes on the new node given by the server side, until all nodes that satisfy k-anonymity are found. The method resolves the tension between explosive data growth and the limited storage and computing capacity of existing servers, and improves the efficiency of massive data processing.
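The interplay of bisection and parallel k-anonymity checks can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: it searches a single chain of generalization levels rather than a full lattice, it assumes higher levels are coarser, it runs the "terminals" sequentially, and the record schema and all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Optional, Sequence, Tuple

# Hypothetical record schema: (supplier region, process time in minutes).
Record = Tuple[str, int]


def generalize(record: Record, level: int) -> Tuple[str, str]:
    """Assumed chain of levels 0..2; each level coarsens the previous one."""
    region, minutes = record
    if level == 0:
        return (region, str(minutes))
    if level == 1:
        return (region, "<=30min" if minutes <= 30 else ">30min")
    return ("*", "*")


def map_partition(partition: Sequence[Record], level: int) -> Counter:
    """Map step: one terminal counts generalized tuples in its own partition."""
    return Counter(generalize(r, level) for r in partition)


def reduce_counts(partials: Iterable[Counter]) -> Counter:
    """Reduce step: the server merges the per-terminal counts."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def is_k_anonymous(partitions: List[Sequence[Record]], level: int, k: int) -> bool:
    """True if every equivalence class at this level has at least k records."""
    counts = reduce_counts(map_partition(p, level) for p in partitions)
    return all(count >= k for count in counts.values())


def lowest_anonymous_level(
    partitions: List[Sequence[Record]], max_level: int, k: int
) -> Optional[int]:
    """Bisection over the chain: lowest level that is k-anonymous, or None."""
    if not is_k_anonymous(partitions, max_level, k):
        return None
    lo, hi = 0, max_level
    while lo < hi:
        mid = (lo + hi) // 2
        if is_k_anonymous(partitions, mid, k):
            hi = mid  # mid works, so try a less generalized level
        else:
            lo = mid + 1  # mid fails, so move to a more generalized level
    return lo


if __name__ == "__main__":
    data = [
        [("Jiangsu Province", 10), ("Jiangsu Province", 20)],
        [("Jiangsu Province", 45), ("Jiangsu Province", 50)],
    ]
    print(lowest_anonymous_level(data, max_level=2, k=2))
```

Bisection is valid here only because coarsening merges equivalence classes and never splits them, so once a level satisfies k-anonymity every coarser level on the same chain does too.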

Description

Technical field

[0001] The invention relates to a distributed data anonymous processing method based on MapReduce, and belongs to the technical field of data processing.

Background

[0002] Owing to the needs of knowledge-based decision-making, information sharing, and scientific research, data owners need to release data to the outside world. To reduce the possibility of privacy leakage during data release, the data owner must apply privacy-protection processing to the data before releasing it.

[0003] Sweeney and Samarati et al. proposed the k-anonymity privacy protection model. The model can prevent linking attacks and effectively protect private data, but it takes no effective protective measures for sensitive attribute information, so a risk of private data leakage remains. In the case of homogeneity attack, background knowledge attack, simil...
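The limitation noted in [0003] can be made concrete with a small example. The sketch below is illustrative only: the column names, the sample rows, and the sensitive attribute `terms` are hypothetical and do not come from the patent. It shows a table that is 2-anonymous on its quasi-identifiers yet still contains an equivalence class whose sensitive values are all identical, which is the situation a homogeneity attack exploits.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]


def equivalence_classes(rows: Sequence[Row], quasi_ids: Sequence[str]) -> Dict[Tuple[str, ...], List[Row]]:
    """Group rows that share the same quasi-identifier values."""
    groups: Dict[Tuple[str, ...], List[Row]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_ids)].append(row)
    return dict(groups)


def is_k_anonymous(rows: Sequence[Row], quasi_ids: Sequence[str], k: int) -> bool:
    """True if every equivalence class contains at least k rows."""
    return all(len(g) >= k for g in equivalence_classes(rows, quasi_ids).values())


def homogeneous_classes(rows: Sequence[Row], quasi_ids: Sequence[str], sensitive: str) -> List[Tuple[str, ...]]:
    """Return classes in which every row has the same sensitive value."""
    return [
        key
        for key, group in equivalence_classes(rows, quasi_ids).items()
        if len({row[sensitive] for row in group}) == 1
    ]


if __name__ == "__main__":
    # Hypothetical released table; "terms" plays the sensitive attribute.
    table = [
        {"region": "Jiangsu Province", "time": "<=30min", "terms": "prepaid"},
        {"region": "Jiangsu Province", "time": "<=30min", "terms": "prepaid"},
        {"region": "Anhui Province", "time": ">30min", "terms": "prepaid"},
        {"region": "Anhui Province", "time": ">30min", "terms": "net-60"},
    ]
    print(is_k_anonymous(table, ["region", "time"], 2))
    print(homogeneous_classes(table, ["region", "time"], "terms"))
```

Anyone who knows that a supplier falls in the homogeneous class learns its sensitive value without singling out a row, even though the k-anonymity check passes.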

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F21/62, G06F21/57

CPC: G06F21/57, G06F21/6254

Inventor: 黄凯, 张启亮

Owner: 徐工汉云技术股份有限公司
