Data placement method based on distributed cluster

A distributed-cluster data placement technology, applied in electrical components, transmission systems, etc. It addresses problems such as performance loss during data recovery, increased data recovery time, and the effect of node computing power on performance. Its effects are to prevent wasted resources, ensure load balance, and ensure transmission efficiency.

Status: Inactive
Publication Date: 2014-02-19

LANGCHAO ELECTRONIC INFORMATION IND CO LTD

Cites: 3; cited by: 29


AI Technical Summary

Problems solved by technology

When remote nodes are selected at random, a remote node may be too far from the local node, which unnecessarily increases data recovery time. Random selection also cannot guarantee that data storage is balanced across nodes.

Because nodes in the system fail frequently, random selection of remote nodes causes unnecessary performance loss during data recovery, which degrades the performance of the entire storage system.

The network distance to the remote data copy, the data load of each node, and the computing power of each node all affect performance.

Embodiment Construction

[0029] Referring to the accompanying drawings, a specific example is used to describe how the distributed cluster-based data placement method of the present invention is implemented.

[0030] First, deploy a distributed cluster environment: install the Hadoop components on the CentOS 6.3 operating system following the official documentation, then enable the HDFS and MapReduce services. The nodes in rack 1 have ordinary computing capability, and the nodes in racks 2 and 3 have fast computing capability. Each rack contains 5 DataNode nodes. The flow of the data placement method for distributed clusters is shown in Figure 1. When a user submits a data storage request, nodes are first selected from different racks, and the method then checks whether the number of nodes obtained has reached the chosen fixed value. On entering the data placement evaluation module, it is first necessary to calculate the distance information of the current node, the numbe...
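
The first stage of this flow (picking nodes from different racks until a fixed count is reached) can be sketched roughly as follows. The cluster layout mirrors the example above (3 racks, 5 DataNodes each). The names `build_cluster` and `pick_candidates`, and the round-robin pass over racks, are assumptions made for illustration, since the source text is truncated before the details are given.

```python
import random


def build_cluster(racks=3, nodes_per_rack=5):
    """Build a toy cluster map: rack name -> list of DataNode names."""
    return {
        f"rack{r}": [f"rack{r}-dn{i}" for i in range(1, nodes_per_rack + 1)]
        for r in range(1, racks + 1)
    }


def pick_candidates(cluster, count, rng=None):
    """Pick up to `count` nodes, visiting racks in turn (hypothetical strategy).

    Taking one node per rack on each pass spreads the candidates over
    different racks before any rack is used a second time.
    """
    rng = rng or random.Random()
    # Shuffled copy of each rack's node list, so the input map is not mutated.
    pools = {rack: rng.sample(nodes, len(nodes)) for rack, nodes in cluster.items()}
    chosen = []
    while len(chosen) < count:
        progressed = False
        for rack, pool in pools.items():
            if pool and len(chosen) < count:
                chosen.append((rack, pool.pop()))
                progressed = True
        if not progressed:
            break  # the cluster holds fewer than `count` nodes
    return chosen


if __name__ == "__main__":
    for rack, node in pick_candidates(build_cluster(), count=3):
        print(rack, node)
```

In this sketch the candidates would then be handed to an evaluation step; how the patent scores them is not recoverable from the truncated paragraph.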


Abstract

The invention discloses a data placement method based on a distributed cluster. Node load, the computing power of each computing node, and the movement of massive data all affect job performance. To address this, the method combines the three factors to compute an evaluation value for data placement and then selects a node according to that value. The method has the following advantages: data placement is load balanced, which improves parallelism during data reads and writes; the computing power of each node is well used, with computing tasks distributed according to computing power, which reduces job running time; and transmission performance is good, because data is stored on nearby computing nodes, which minimizes data transmission and improves efficiency.
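
The abstract does not state the formula for the evaluation value. As a minimal sketch, assuming a weighted sum over load, computing power, and network distance, node selection by evaluation value might look like the following. The weights, the field names, and the functions `evaluate` and `select_node` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    load: float     # fraction of storage in use, 0.0 (empty) to 1.0 (full)
    power: float    # relative computing power, 0.0 (slow) to 1.0 (fast)
    distance: int   # network distance from the writer, in hops


def evaluate(node, max_distance, w_load=1 / 3, w_power=1 / 3, w_dist=1 / 3):
    """Hypothetical evaluation value; higher means a better placement target."""
    if max_distance > 0:
        closeness = 1.0 - node.distance / max_distance
    else:
        closeness = 1.0  # every candidate is at distance 0
    return w_load * (1.0 - node.load) + w_power * node.power + w_dist * closeness


def select_node(nodes):
    """Return the candidate with the highest evaluation value."""
    max_distance = max(n.distance for n in nodes)
    return max(nodes, key=lambda n: evaluate(n, max_distance))


if __name__ == "__main__":
    candidates = [
        Node("rack1-dn1", load=0.9, power=0.5, distance=4),
        Node("rack2-dn3", load=0.2, power=1.0, distance=2),
        Node("rack3-dn2", load=0.2, power=0.5, distance=4),
    ]
    print(select_node(candidates).name)
```

Equal weights are only a placeholder; a real deployment would presumably tune them to how much each factor matters for its workload.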

Description

Technical field

[0001] The invention relates to a data placement method based on a distributed cluster.

Technical background

[0002] With the continuous development of Internet technology and the rapid increase of network information, the ability to efficiently and reliably process large-scale data sets is crucial to the development of the Internet. MapReduce is an easy-to-write parallel programming framework. Massive data can be processed through the MapReduce framework in the Hadoop cluster to improve efficiency through parallelism. However, since the input data of the operation in MapReduce is usually a large amount of data, if the data is distributed on different racks, a large amount of data will be moved, which will affect the performance of the operation. Therefore, the data should be placed close to the computing nodes to reduce the performance loss caused by large amounts of data movement. Therefore, the data placement method of the distributed cluster is very i...
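
To make the cost of cross-rack placement concrete, the sketch below computes a tree-based network distance in the style used by Hadoop's rack awareness: each node has a path such as `/rack1/dn3`, and the distance between two nodes is the number of hops from each one up to their closest common ancestor. The example paths and the function name `network_distance` are illustrative assumptions, not part of the patent text.

```python
def network_distance(path_a, path_b):
    """Hops between two nodes in a tree topology, given paths like '/rack1/dn3'."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    # Count how many leading path components the two nodes share.
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # Each node climbs to the shared ancestor; the distance is the sum of both climbs.
    return (len(a) - common) + (len(b) - common)


if __name__ == "__main__":
    print(network_distance("/rack1/dn1", "/rack1/dn1"))
    print(network_distance("/rack1/dn1", "/rack1/dn2"))
    print(network_distance("/rack1/dn1", "/rack2/dn4"))
```

With this metric a node is at distance 0 from itself, 2 from another node in the same rack, and 4 from a node in a different rack, which is why keeping data near the computing node reduces movement.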


Application Information


Patent Type & Authority: Applications (China)

IPC(8): H04L29/08

Inventor: 郭美思, 王秀娟

Owner: LANGCHAO ELECTRONIC INFORMATION IND CO LTD
