
Disk caching method and device in a parallel computing system

A parallel-computing disk-caching technology, applied in computing, memory systems, and instruments, which solves the problem that existing systems cannot cache to external memory and therefore cannot process oversized jobs, and which achieves fast computation, small fan-in and fan-out cost, and guaranteed operability and extensibility.

Active Publication Date: 2017-03-29
CHINA MOBILE COMM GRP CO LTD

AI Technical Summary

Problems solved by technology

[0023] From the above, it can be seen that the existing parallel iterative processing systems based on the BSP model mainly include Google's Pregel system and Apache's open-source HAMA and Giraph systems. In all of them, both computation data and message data reside entirely in memory, with no ability to cache to external storage. In a cluster system with limited memory resources, a job whose data volume exceeds the memory limit therefore cannot be processed.



Embodiment Construction

[0042] To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

[0043] The problem addressed by the embodiments of the present invention is precisely how to support automatic caching of data to disk in a BSP-model-based parallel iterative computing system: when memory can accommodate the computation data and message data, the job still runs entirely in memory; when the volume of data and messages exceeds memory capacity, the system automatically caches the overflowed portion to disk files. Furthermore, the embodiments of the present invention ensure that when a subsequent computation step needs these data, they can be quickly read from disk into memory with a small fan-in and fan-out cost, thereby giving the system the ability to run large-scale...
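The load-time side of this behavior can be sketched as follows. The bucket count, the record quota, and the pickle-stream spill format are illustrative assumptions for the sketch, not the patent's actual implementation.

```python
import os
import pickle
import tempfile

NUM_BUCKETS = 4      # illustrative bucket count; not fixed by the patent
RECORD_QUOTA = 10    # assumed pre-assigned share of the record area, in records

class BucketStore:
    """Load-time spilling sketch: records hash into buckets, and when the
    in-memory record area is about to exceed its pre-assigned quota, a whole
    bucket is cached to a disk file (whole-bucket granularity keeps the
    fan-in/fan-out cost of later reads small)."""

    def __init__(self, spill_dir=None):
        self.spill_dir = spill_dir or tempfile.mkdtemp()
        self.buckets = {i: [] for i in range(NUM_BUCKETS)}  # in-memory buckets
        self.on_disk = set()   # ids of buckets currently cached on disk
        self.in_mem = 0        # records currently resident in memory

    def _path(self, bid):
        return os.path.join(self.spill_dir, f"bucket_{bid}.pkl")

    def _spill(self, bid):
        # append the bucket's records to its disk file as a pickle stream
        with open(self._path(bid), "ab") as f:
            for rec in self.buckets[bid]:
                pickle.dump(rec, f)
        self.in_mem -= len(self.buckets[bid])
        self.buckets[bid] = []
        self.on_disk.add(bid)

    def load(self, key, record):
        bid = hash(key) % NUM_BUCKETS
        if bid not in self.on_disk and self.in_mem >= RECORD_QUOTA:
            # record area about to exceed its share: spill the fullest bucket
            victim = max(set(self.buckets) - self.on_disk,
                         key=lambda b: len(self.buckets[b]))
            self._spill(victim)
        if bid in self.on_disk:
            # bucket already spilled: append straight to its disk file
            with open(self._path(bid), "ab") as f:
                pickle.dump(record, f)
        else:
            self.buckets[bid].append(record)
            self.in_mem += 1
```

Loading more records than the quota allows leaves at most `RECORD_QUOTA` records in memory, with the overflow cached to per-bucket disk files.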



Abstract

The invention provides a disk caching method and device in a parallel computing system. The method comprises: pre-assigning the respective occupancy rates of record data and message data in a data-processing memory area; during data loading, when the record data in the data-processing memory area is about to exceed its pre-assigned rate, caching part of the record data to disk space with a hash bucket as the unit; and, during a computing task's traversal access of the record data, if a hash bucket to be accessed is in disk space and the remaining record-data space in the data-processing memory area is insufficient to load it, caching the already-accessed hash buckets in the data-processing memory area to disk space one by one until the released space can hold the hash bucket to be accessed. By means of the method and device, automatic caching of data to disk can be achieved in a BSP-model-based parallel iterative computing system.
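The traversal-time policy in the abstract can be sketched as a bucket cache with evict-until-it-fits behavior. The memory capacity, the dict standing in for disk files, and the FIFO victim order are assumptions of this sketch; the abstract does not fix them.

```python
from collections import OrderedDict

MEM_CAPACITY = 100   # record-area capacity in records; illustrative only

class BucketCache:
    """Traversal-time policy from the abstract: if a requested hash bucket is
    in disk space and free memory cannot hold it, already-accessed resident
    buckets are cached to disk one by one until the released space suffices.
    Victim order here is FIFO, an assumption the abstract does not mandate."""

    def __init__(self, disk):
        self.disk = disk               # bid -> record list (stands in for disk files)
        self.resident = OrderedDict()  # bid -> record list, oldest first
        self.used = 0                  # records currently resident in memory

    def access(self, bid):
        if bid in self.resident:       # fast path: bucket already in memory
            return self.resident[bid]
        need = len(self.disk[bid])
        while MEM_CAPACITY - self.used < need:
            # cache the oldest resident bucket back to disk space
            victim, recs = self.resident.popitem(last=False)
            self.disk[victim] = recs
            self.used -= len(recs)
        recs = self.disk.pop(bid)      # read the requested bucket into memory
        self.resident[bid] = recs
        self.used += need
        return recs
```

For example, with buckets of 60, 60, and 30 records and a 100-record capacity, accessing bucket 1 after bucket 0 forces bucket 0 back to disk, and a later re-access of bucket 0 evicts bucket 1 in turn.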

Description

Technical Field

[0001] The present invention relates to the technical field of the Bulk Synchronous Parallel (BSP) computing model, and in particular to a disk caching method and device in a parallel computing system based on the BSP computing model.

Background Technique

[0002] The BSP computing model, also known as the bulk synchronization model or the BSP model, was proposed by Valiant of Harvard University and Bill McColl of Oxford University (1990). A BSP model comprises multiple processing units interconnected by a network. Each processing unit has fast local memory and can launch multiple computing-task threads. A BSP computation consists of a series of iterated global supersteps. As shown in Figure 1, a superstep consists of the following three sequential phases:

[0003] (1) Concurrent local computation: several computing tasks execute on each processing unit, and each computation uses only data stored in local memor...
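The superstep cycle described in [0002] and [0003] can be sketched with threads, per-unit message queues, and a global barrier. The ring-shaped message pattern, unit count, and superstep count are illustrative assumptions, not part of the BSP model itself.

```python
import threading
import queue

N_UNITS = 3       # processing units interconnected by a "network" of queues
SUPERSTEPS = 2

inboxes = [queue.Queue() for _ in range(N_UNITS)]  # per-unit message inboxes
barrier = threading.Barrier(N_UNITS)               # global synchronization point
results = [0] * N_UNITS

def unit(uid):
    value = uid + 1    # each unit's local state
    pending = []       # messages delivered at the end of the previous superstep
    for _ in range(SUPERSTEPS):
        # (1) concurrent local computation, using only local data and messages
        value += sum(pending)
        # (2) communication: send the new value to the next unit in a ring
        inboxes[(uid + 1) % N_UNITS].put(value)
        # (3) barrier synchronization: every unit finishes computing and sending
        barrier.wait()
        pending = []
        while not inboxes[uid].empty():
            pending.append(inboxes[uid].get())
        barrier.wait()  # all inboxes drained before the next superstep sends
    results[uid] = value

threads = [threading.Thread(target=unit, args=(u,)) for u in range(N_UNITS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The second barrier ensures messages sent in one superstep become visible only in the next, which is what makes each superstep's local computation deterministic.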

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06F12/0866, G06F12/0868, G06F12/1009
Inventors: 邓超, 郭磊涛, 钱岭, 孙少陵
Owner: CHINA MOBILE COMM GRP CO LTD