[0019] The present invention will be described in detail below in conjunction with the examples.
[0020] The present invention provides a decision tree method based on differential privacy protection, which applies differential privacy protection to the classic greedy C4.5 decision tree: a protection mechanism perturbs the query answers in a way that preserves the privacy of every individual in the dataset. The present invention comprises the following steps:
[0021] 1) Use the Bernoulli random sampling principle to sample the original data set with sampling probability p to obtain a data set sample; the obtained data set satisfies ln(1+p(e^ε−1))-differential privacy:
[0022] Perform Bernoulli random sampling on the original data set with the assumed sampling probability p, put the selected records into the sample set, discard the rest, and calculate the privacy budget ε_p required to build the entire decision tree under the sampling probability p. The privacy budget ε_p is pre-specified by the data owner or data publisher according to the user's privacy requirements: the higher the privacy requirement, the smaller the value of ε_p, which is usually set to 0.01, 0.1, 1, etc. Further, ε_p = ε_1 + ε_2, where ε_1 denotes the first-stage privacy budget and ε_2 denotes the second-stage privacy budget.
[0023] To guarantee that the privacy-preserving decision tree algorithm satisfies ε-differential privacy under the calculated privacy budget ε_p, the Bernoulli sampling method needs to satisfy ln(1+p(e^ε−1))-differential privacy:
[0024] Given a data set D and an algorithm A that satisfies ε-differential privacy on D, let algorithm A_p operate as follows: draw a sample from the data set D with probability p to obtain the data set D_p, and then apply algorithm A to D_p. Then A_p satisfies ln(1+p(e^ε−1))-differential privacy, where ε is the privacy budget.
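The sampling step and the amplification bound above can be illustrated with a short sketch. The function names below (bernoulli_sample, amplified_budget, epsilon_for_target) are illustrative only and not part of the invention; the last function is simply the algebraic rearrangement of ln(1+p(e^ε−1)) used to derive the per-run budget from a target budget.

```python
import math
import random

def bernoulli_sample(records, p, rng=random):
    # Bernoulli sampling: keep each record independently with probability p.
    return [r for r in records if rng.random() < p]

def amplified_budget(epsilon, p):
    # Privacy budget actually spent when an eps-DP algorithm is run on a
    # Bernoulli p-sample of the data: ln(1 + p*(e^eps - 1)).
    return math.log(1.0 + p * (math.exp(epsilon) - 1.0))

def epsilon_for_target(eps_target, p):
    # Inverse of the bound: the per-run budget eps whose sampled version
    # yields eps_target (straightforward rearrangement, shown for illustration).
    return math.log((math.exp(eps_target) - 1.0) / p + 1.0)
```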
[0025] ε-differential privacy: a random algorithm A satisfies ε-differential privacy if, for any pair of adjacent data sets D and D' and any S ∈ Range(A), we have:
[0026] Pr[A(D) = S] ≤ e^ε · Pr[A(D') = S];
[0027] In the formula, Pr represents the probability, and S represents the subdivision plan set.
[0028] The ln(1+p(e^ε−1))-differential privacy guarantee makes it possible to construct the corresponding decision tree on the new data set obtained by Bernoulli random sampling while ensuring that the sampled data set also meets a specific privacy cost. The subsequent privacy-preserving decision tree construction can then be carried out on the selected records, which represent the characteristics of the overall data to a certain extent.
[0029] 2) Perform preliminary processing on the sampled data set, and let continuous attributes and discrete attributes jointly participate in decision-making under privacy protection, so as to reduce the number of calls to the exponential mechanism;
[0030] 2.1) Let s represent a scheme in the set S of candidate subdivision schemes for the values of a continuous attribute, and let u(D,s) represent the availability of the current scheme s. In order to let the continuous attribute and the discrete attributes participate in the selection together, the weight of the scheme s in the set S is chosen with the following probability by the exponential mechanism:
[0031] Pr[s] = exp(ε·u(D,s)/(2Δu)) / Σ_{s'∈S} exp(ε·u(D,s')/(2Δu))
[0032] In the formula, Δu represents the sensitivity.
[0033] 2.2) After the weight is determined, the subdivision scheme s of the continuous attribute participates, with this probability, directly in the availability-based attribute selection together with the discrete attributes.
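A minimal sketch of the exponential-mechanism weighting described in steps 2.1 and 2.2 is given below, assuming each candidate scheme s has already been scored by an availability function u(D,s); the helper name and the use of Python's random.choices are illustrative only.

```python
import math
import random

def exponential_mechanism(schemes, utilities, epsilon, sensitivity, rng=random):
    # Weight of scheme s: exp(eps * u(D, s) / (2 * delta_u)), normalised over S.
    weights = [math.exp(epsilon * u / (2.0 * sensitivity)) for u in utilities]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Draw one scheme according to these probabilities.
    return rng.choices(schemes, weights=probs, k=1)[0]
```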
[0034] In the above steps, determine the availability function used to measure the availability of an attribute subdivision scheme: information gain or the maximum class frequency sum.
[0035] Let x represent an attribute in the record. A subdivision scheme for x can be expressed as s: x→{x_1, x_2, ..., x_q}, where x_1, x_2, ..., x_q denote the subdivision values of x. D_x represents the data set associated with attribute x, and |D_x| denotes the number of records in D_x. D_xj denotes the data set formed by the records whose attribute value is x_j (j = 1, 2, ..., q). The subdivision scheme s: x→{x_1, x_2, ..., x_q} divides the data set D_x into sub-data sets D_x1, D_x2, ..., D_xq. Let the classification attribute of the data set D_x take m different values, that is, it defines m different classes C_i (i = 1, 2, ..., m), and the number of records in each class C_i is c_i.
[0036] The availability function of information gain is u(D,s) = InfoGain(D,s); first calculate the entropy of the data set D_x:
[0037] I(D_x) = −Σ_{i=1}^{m} p_i·log_2(p_i)
[0038] where p_i = c_i/|D_x|. The information gain produced by the scheme s: x→{x_1, x_2, ..., x_q} is InfoGain(D,s) = I(D_x) − E(D_x), where E(D_x) = Σ_{j=1}^{q} (|D_xj|/|D_x|)·I(D_xj) is the weighted sum of the entropies of all sub-data sets and I(D_xj) is the entropy of the data set D_xj. Since the maximum value of I(D_x) is log_2 m and the minimum value of E(D_x) is 0, the sensitivity of the information gain function is Δu = log_2 m.
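The information-gain availability function can be computed as in the sketch below; it assumes each record is a tuple whose last field is the class label, a simplifying assumption made only for illustration.

```python
import math
from collections import Counter

def entropy(records):
    # I(D) = -sum_i p_i * log2(p_i), with p_i = c_i / |D|.
    n = len(records)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(r[-1] for r in records).values())

def info_gain(parent, partitions):
    # InfoGain(D, s) = I(D_x) - E(D_x), where E(D_x) is the weighted sum of
    # the entropies of the sub-data sets produced by scheme s.
    n = len(parent)
    e = sum(len(part) / n * entropy(part) for part in partitions if part)
    return entropy(parent) - e
```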
[0039] The availability function of the maximum class frequency sum, namely u(D,s)=max(D,s); where,
[0040] max(D,s) = Σ_{j=1}^{q} max_i |C_i(D_xj)|
[0041] For any subset D_xj of D_x, |C_i(D_xj)| denotes the number of records in D_xj belonging to class C_i, so max_i |C_i(D_xj)| refers to the number of records of the class with the most tuples. It can be seen from the above formula that the sensitivity of max(D,s) is 1. Therefore, the present invention uses the availability function of the maximum class frequency sum.
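A corresponding sketch for the maximum class frequency sum (sensitivity 1), under the same illustrative record layout as above, could look like this:

```python
from collections import Counter

def max_class_frequency_sum(partitions):
    # max(D, s): for each sub-data set D_xj, take the count of its most
    # frequent class and sum these counts over all sub-data sets.
    # Adding or removing one record changes at most one term by 1,
    # so the sensitivity of this availability function is 1.
    return sum(max(Counter(r[-1] for r in part).values())
               for part in partitions if part)
```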
[0042] 3) Initialize the C4.5 decision tree according to the extracted data set samples, and use the SVT (sparse vector technique) method to judge whether the nodes in the decision tree continue to split, so that the allocation of the privacy budget no longer depends on the height of the tree, which solves the problem of rapidly exhausting the privacy budget when constructing the decision tree recursively.
[0043] The allocation of the privacy budget is closely related to the height of the decision tree: if the tree is too high, the privacy budget is exhausted quickly, the privacy budget ε available for each query and each split-attribute selection becomes very small, the noise increases, and the decision accuracy drops rapidly; if the height is too low, the usability and accuracy of the decision tree are directly affected. In earlier privacy protection methods, the decision tree was therefore set to a fixed height according to the needs of users.
[0044] The SVT method is used to find query counts greater than a certain threshold. Using the SVT (sparse vector) method to judge whether the nodes in the decision tree continue to split is as follows:
[0045] 3.1) Determine the threshold θ, compare the count query result count() with the threshold θ, if count() > θ, the query result is found, otherwise continue.
[0046] The method of determining the threshold θ is: count the leaf nodes of the decision tree constructed without adding noise to obtain the count queries {count(v_1), count(v_2), ..., count(v_n)}, and then take the average of these values as the final threshold θ. Here, v_i denotes a leaf node, i = 1, 2, ..., n.
[0047] 3.2) Add Laplace noise to the threshold θ to obtain the threshold noi(θ) after adding Laplace noise;
[0048] 3.3) Add Laplace noise to the query result count(v) of each node to obtain noicount(v), and compare the noisy query result noicount(v) with the noisy threshold noi(θ). If noicount(v) ≥ noi(θ), the node does not yet meet the privacy requirements and needs to be split; if noicount(v) < noi(θ), the node is no longer split and is treated as a leaf node.
[0049] In step 3.3), Laplace noise is added to protect the privacy of the response to the count query:
[0050] noicount(v) = count(v) + Lap(2/ε_1)
[0051] In the formula, Lap(2/ε_1) is Laplace noise.
[0052] When the SVT method is used to judge whether a node is split, privacy is not protected by iteratively dividing the privacy budget as in previous approaches; each judgment requires only the privacy budget ε_1, so the budget is not consumed quickly by repeated iterations and no large amount of noise is introduced.
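The threshold determination and the noisy comparison of steps 3.1-3.3 are sketched below. The Laplace sampler is a generic inverse-CDF implementation, and the noise scale 2/ε_1 for the threshold is an assumption made for illustration (the text fixes Lap(2/ε_1) only for the count query).

```python
import math
import random

def laplace_noise(scale, rng=random):
    # Inverse-CDF sampling of Laplace(0, scale).
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def svt_threshold(leaf_counts, eps1, rng=random):
    # theta = average of the noise-free leaf counts; noi(theta) adds Laplace
    # noise to it (scale 2/eps1 assumed here).
    theta = sum(leaf_counts) / len(leaf_counts)
    return theta + laplace_noise(2.0 / eps1, rng)

def should_split(count_v, noisy_theta, eps1, rng=random):
    # noicount(v) = count(v) + Lap(2/eps1); split the node iff noicount(v) >= noi(theta).
    return count_v + laplace_noise(2.0 / eps1, rng) >= noisy_theta
```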
[0053] 4) Build a decision tree recursively:
[0054] 4.1) Record the root node as layer l_1;
[0055] 4.2) While l_i < h, traverse all nodes v_j in layer l_{i+1}, v_j ∈ l_{i+1}, where l_i is the current layer and h is the tree height;
[0056] 4.3) If v_j is a leaf node, then noicount(p(v_j)) = noicount(p(v_j)) + noicount(v_j), where p(v_j) denotes the parent node of v_j; otherwise, S = S ∪ {v_j};
[0057] 4.4) Add 1 to the variable i, and record layer h−1 as the current layer;
[0058] 4.5) While l_i > 1, traverse the nodes v_j in layer l_i with v_j ∈ S, and update:
[0059] noicount(p(v_j)) = noicount(p(v_j)) + noicount(v_j);
[0060] 4.6) Update the parent node of v_j to complete the construction of the decision tree.
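Steps 4.1-4.6 amount to a bottom-up pass that pushes noisy node counts into their parents. The simplified sketch below assumes each node knows its parent and its layer, with layers listed from the root (layer 1) down to the deepest layer h; the data layout is hypothetical and chosen only to illustrate the aggregation.

```python
def aggregate_noisy_counts(layers, parent, noicount):
    # Walk from the deepest layer up to layer 2; each node adds its noisy
    # count to its parent, so internal nodes end up carrying the noisy
    # totals of their subtrees.
    for layer in reversed(layers[1:]):
        for v in layer:
            noicount[parent[v]] += noicount[v]
    return noicount
```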
[0061] The above-mentioned embodiments are only used to illustrate the present invention, and the structure, size, location and shape of each component can be changed. On the basis of the technical solution of the present invention, all improvements to individual components according to the principles of the present invention and equivalent transformations shall not be excluded from the protection scope of the present invention.