Hadoop-based fast neighborhood rough set attribute reduction method

A neighborhood rough set attribute reduction technology, applied in special data processing applications, instruments, and electrical digital data processing, that improves analysis efficiency, lowers time complexity, and reduces the output of intermediate results.

Active Publication Date: 2013-10-02

HUZHOU TEACHERS COLLEGE



Problems solved by technology

However, there has been little research, either in China or abroad, on distributed attribute reduction methods.


Embodiment Construction

[0031] To achieve the above object, the present invention proposes a Hadoop-based fast neighborhood rough set attribute reduction method comprising the following steps:

[0032] a) Set up a distributed platform based on Hadoop: set up the HDFS distributed file system and the MapReduce parallel programming model. HDFS adopts a master-slave architecture consisting of one manager and multiple workers. The manager manages the file system namespace and maintains the file system tree together with all files and directories in that tree. A worker is a working node of the file system: it stores and retrieves data blocks as needed and periodically sends a "heartbeat" report to the manager. If the manager does not receive a worker's "heartbeat" report within the specified period, it starts a fault-tolerance mechanism to handle the failure. The MapReduce parallel programming model divides the task into several sma...
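The MapReduce model named in step a) can be illustrated with a small in-process simulation. This is only a sketch of the general map, shuffle, and reduce pattern, not the patent's job definition (the source text is truncated at this point) and not the Hadoop API. The function names `map_phase`, `shuffle`, `reduce_phase`, and `run_job`, and the choice of counting decision labels per data split, are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(split):
    """Map step: emit one (decision_label, 1) pair per record in a split."""
    for _features, label in split:
        yield label, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def run_job(splits):
    """Run map over every split, then shuffle, then reduce each key."""
    pairs = [pair for split in splits for pair in map_phase(split)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Two hypothetical splits, as if stored in two separate data blocks.
    splits = [
        [([0.1, 0.5], "yes"), ([0.2, 0.4], "no")],
        [([0.9, 0.3], "yes")],
    ]
    print(run_job(splits))
```

In a real cluster each split would be processed by a separate map task on the worker that holds the block, and the framework would perform the shuffle. Here everything runs in one process so the data flow is easy to follow.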


Abstract

The invention discloses a Hadoop-based fast neighborhood rough set attribute reduction method. The method comprises the following steps: a) establishing a distributed platform based on Hadoop; b) defining a neighborhood rough set; c) generating a candidate set; d) calculating the importance of each attribute; e) selecting the attribute with the largest importance and adding it to the candidate set; f) judging whether a stop condition is met; g) storing the conditions selected by feature selection. Building on the Hadoop distributed platform, the method analyzes how data mining algorithms can be parallelized and uses this to parallelize the neighborhood rough set attribute reduction algorithm. The time complexity of the parallelized attribute reduction is greatly lowered, the output of intermediate results during execution is greatly reduced, and the efficiency of analyzing large-scale data is improved, so that large and varied volumes of data are converted into usable data with informational and business value, completing the mining, analysis, and optimization of the data.
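Steps b) to g) can be sketched on a single machine as a greedy forward selection. The abstract does not give the formulas, so this sketch assumes the definitions commonly used for neighborhood rough sets: the neighborhood of a sample is every sample within distance `delta` on the chosen attributes, a sample belongs to the positive region if its whole neighborhood shares its decision label, the dependency is the fraction of samples in the positive region, and the importance of an attribute is the dependency gained by adding it. The function names, the Euclidean distance, and the parameters `delta` and `eps` are illustrative assumptions, and no Hadoop parallelization is shown.

```python
from math import sqrt


def neighborhood(data, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i on attrs."""
    xi = data[i]
    return [
        j for j, xj in enumerate(data)
        if sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs)) <= delta
    ]


def dependency(data, labels, attrs, delta):
    """Fraction of samples whose neighborhood carries a single decision label."""
    consistent = sum(
        1 for i in range(len(data))
        if all(labels[j] == labels[i]
               for j in neighborhood(data, i, attrs, delta))
    )
    return consistent / len(data)


def reduce_attributes(data, labels, delta=0.2, eps=1e-9):
    """Greedy forward selection: add the most important attribute each round."""
    remaining = list(range(len(data[0])))
    selected = []          # candidate set, initially empty (step c)
    current = 0.0          # dependency of the current candidate set
    while remaining:
        # Step d: importance of each attribute = gain in dependency.
        gains = {
            a: dependency(data, labels, selected + [a], delta) - current
            for a in remaining
        }
        best = max(remaining, key=lambda a: gains[a])   # step e
        if gains[best] <= eps:                          # step f: stop condition
            break
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected        # step g: the selected attributes


if __name__ == "__main__":
    # Toy data: attribute 0 separates the classes, 1 is constant, 2 is noise.
    data = [
        [0.1, 0.5, 0.9],
        [0.2, 0.5, 0.1],
        [0.8, 0.5, 0.8],
        [0.9, 0.5, 0.2],
    ]
    labels = [0, 0, 1, 1]
    print(reduce_attributes(data, labels, delta=0.2))  # → [0]
```

The importance of each remaining attribute in a round is computed independently of the others, which is the kind of work that lends itself to being distributed across map tasks. How the patent actually partitions that work is not stated in the text available here.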

Description

Technical field

[0001] The invention relates to a data attribute reduction method, in particular to a distributed attribute reduction method for big data.

Background technique

[0002] With the rapid development of the high-tech information industry, we have entered an era of data explosion and information expansion. Massive amounts of data are generated, processed, and used every second; the "big data era" is arriving. Every minute, Weibo receives more than 100,000 new posts. The New York Stock Exchange generates 1 TB of transaction data every day, and the world generates 2.5 EB of data every day (1 EB equals 10^18 bytes). IDC's recent Digital Universe research predicts that by 2020 the world's total stored data will reach 35 ZB (1 ZB equals 10^21 bytes). Faced with the rapid growth of massive data, how to more effectively analyze the massive dat...


Application Information


IPC(8): G06F17/30

Inventor: 蒋云良, 杨建党, 刘勇, 范婧, 张雄涛

Owner: HUZHOU TEACHERS COLLEGE
