Method and system for carrying out matrix product operation on computer cluster

A computer cluster and matrix product technology, applied in the fields of complex mathematical operations and multiprogramming arrangements. It addresses problems such as time-consuming computation, the limited number of Reduce tasks, and the resulting bottleneck in product operation speed, and achieves the effect of improving the operation speed.

Inactive Publication Date: 2012-12-19
BEIJING IZP NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0010] In the MapReduce engine, the number of Reduce tasks is usually set to 0.95 or 1.75 × the number of computing nodes × mapred.tasktracker.tasks.maximum, where a computing node (Compute Node) corresponds to a Hadoop server host (Host) on which several tasktracker nodes are deployed. One tasktracker node can be used to execute one Map/Reduce task. mapred.tasktracker.map.tasks.maximum is usually set to the number of CPU cores of the computing node minus 1. If the host of a computing node has 8 cores, the number of Reduce tasks is therefore 6.65 or 12.25 × the number of computing nodes; that is, the number of Reduce tasks is limited. Thus, when the matrices are relatively large, the Reduce stage of job1 must use a limited number of tasks to perform a large number of pairwise combinations of two groups of matrix elements to obtain the products. This is very time-consuming and becomes the bottleneck of the entire matrix product operation.
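The arithmetic in paragraph [0010] can be checked with a short sketch. The function name `suggested_reduce_tasks` and its parameters are hypothetical and are not part of Hadoop; the sketch only assumes, as the paragraph states, that the per-node task maximum equals the number of CPU cores minus 1.

```python
def suggested_reduce_tasks(num_nodes, cpu_cores, factor):
    """Rule-of-thumb Reduce task count from paragraph [0010].

    Hypothetical helper, not a Hadoop API.
    factor is 0.95 or 1.75, as quoted in the text.
    """
    # Assumption from the text: per-node task maximum = CPU cores - 1.
    tasks_maximum = cpu_cores - 1
    return factor * num_nodes * tasks_maximum


if __name__ == "__main__":
    # One 8-core computing node, both factors quoted in the text.
    for factor in (0.95, 1.75):
        print(factor, round(suggested_reduce_tasks(1, 8, factor), 2))
```

For one 8-core node this reproduces the per-node figures 6.65 and 12.25 quoted above; the count grows only linearly with the number of nodes, which is why the Reduce stage becomes the limiting factor for large matrices.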

Detailed Description of the Embodiments

[0069] To make the above objects, features and advantages of the present application clearer and easier to understand, the present application is described in further detail below with reference to the accompanying drawings and specific embodiments.

[0070] As is known in the art, multiple computing nodes for executing Map tasks and Reduce tasks are deployed on a computer cluster. Multiple tasktracker nodes are deployed on one computing node, and one tasktracker node can be used to execute one Map/Reduce task, so one computing node can execute several Map tasks or Reduce tasks, which is how Map tasks and Reduce tasks run in parallel.

[0071] In the MapReduce engine, the parallelism with which one computing node executes Reduce tasks is limited to 6.65 or 12.25; therefore, in the prior art, complex operations involved in the Reduce task (such as combining two groups of large-scale matrix elements to obtain the produ...

Abstract

The invention provides a method and a system for performing a matrix product operation on a computer cluster. A distributed file system and a plurality of computing nodes for executing Map tasks and Reduce tasks are deployed on the computer cluster. The method comprises executing a first Map task, a first Reduce task, a second Map task and a second Reduce task on the computing nodes, wherein: the first Map task performs a first processing step to obtain a corresponding first key-value pair result; the first Reduce task gathers the values that share the same key in the first key-value pair result; the second Map task identifies the elements of a first matrix and a second matrix, combines them pairwise and multiplies them to obtain a second key-value pair result; and the second Reduce task sums the values that share the same key in the second key-value pair result. The method and system can improve the speed of computing the matrix product.
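The four stages named in the abstract can be illustrated with a minimal single-process sketch. It does not use Hadoop and is not the patent's implementation. The key scheme is an assumption, since the abstract does not specify it: the first Map task is assumed to key each element by the shared index (the column of the first matrix, the row of the second). All function names and the `(name, row, col, value)` record layout are hypothetical.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Simulate the shuffle: collect all values emitted under the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map1(records):
    """First Map (assumed keying): key each element by the shared index j."""
    for name, row, col, value in records:
        if name == "A":
            yield col, ("A", row, value)   # A[i][j] is keyed by j
        else:
            yield row, ("B", col, value)   # B[j][k] is keyed by j


def reduce1(key, values):
    """First Reduce: only gathers the values of one key, no multiplication."""
    return key, list(values)


def map2(key, values):
    """Second Map: identify A and B elements, combine pairwise, multiply."""
    a_elems = [(i, v) for tag, i, v in values if tag == "A"]
    b_elems = [(k, v) for tag, k, v in values if tag == "B"]
    for i, a in a_elems:
        for k, b in b_elems:
            yield (i, k), a * b


def reduce2(key, values):
    """Second Reduce: sum the partial products of one output cell."""
    return key, sum(values)


def matrix_product(records):
    job1 = [reduce1(k, vs) for k, vs in group_by_key(map1(records)).items()]
    partials = [pair for k, vs in job1 for pair in map2(k, vs)]
    return dict(reduce2(k, vs) for k, vs in group_by_key(partials).items())


if __name__ == "__main__":
    # A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]] as (name, row, col, value)
    records = [("A", 0, 0, 1), ("A", 0, 1, 2), ("A", 1, 0, 3), ("A", 1, 1, 4),
               ("B", 0, 0, 5), ("B", 0, 1, 6), ("B", 1, 0, 7), ("B", 1, 1, 8)]
    print(matrix_product(records))
```

In this sketch the expensive pairwise multiplication sits in the second Map stage, and both Reduce stages only gather or sum values, which matches the division of work described in the abstract.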

Description

Technical Field

[0001] The present application relates to the technical field of parallel computing, and in particular to a method and system for performing a matrix product operation on a computer cluster.

Background

[0002] With the rapid development of high-performance applications and computing requirements, a single computer can no longer solve some ultra-large-scale application problems, such as spatial joins and nearest-neighbor queries over multiple data sets. Multiple computers must therefore be combined into a computer cluster to solve large-scale application problems jointly. Hadoop is one such distributed cluster architecture, which can achieve efficient parallel computing and massive storage.

[0003] Hadoop consists of many elements, at the bottom of which is the Hadoop Distributed File System (HDFS), which stores files on all storage nodes in the Hadoop cluster; the upper laye...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/16, G06F9/46
Inventor: 张一凡, 张中峰, 罗峰, 黄苏支, 李娜
Owner: BEIJING IZP NETWORK TECH CO LTD