Graph iteration job-oriented running time prediction system and method in a Gaia system

A running time prediction technology for iterative jobs, applied in prediction, structured data retrieval, instrumentation, etc., that addresses coarse input data and low prediction accuracy and achieves low training overhead.

Pending Publication Date: 2021-11-09
BEIJING INSTITUTE OF TECHNOLOGY +1

AI Technical Summary

Problems solved by technology

Although these methods are quite effective, they are all limited to prediction on a cluster with a fixed configuration, and because the data they rely on are either offline data or coarse statistics, their prediction accuracy is low.

Examples

Embodiment 1

[0146] In this embodiment, the running time prediction system integrated into the Gaia system, as shown in Figure 3, is applied to a practical scenario that processes four iterative algorithms: PageRank, Connected Components, SSSP, and Adsorption. The datasets used by these algorithms are listed in Table 1.

[0147] The single-source shortest path (SSSP) algorithm computes the shortest distance from a given source node to every other node in the graph. PageRank is a well-known web page ranking algorithm that iteratively updates the importance score of each page node. The Adsorption algorithm propagates labels through the graph according to a random walk model until the label distribution at each node stabilizes. The Connected Components algorithm searches iteratively for the connected parts of a large graph. The datasets in Table 1 are all real-world datasets downloaded from the Stanford Large...
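To make the iterative character of these workloads concrete, the following is a minimal single-machine sketch of PageRank that also records how much the scores change in each iteration. It is an illustration only, not the Gaia implementation: the function name `pagerank_with_trace`, the damping factor, the tolerance, and the use of the L1 change as a per-iteration convergence signal are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def pagerank_with_trace(
    graph: Dict[str, List[str]],
    damping: float = 0.85,   # assumed value, not taken from the patent
    tol: float = 1e-6,       # assumed convergence threshold
    max_iters: int = 100,
) -> Tuple[Dict[str, float], List[float]]:
    """Power-iteration PageRank that records the L1 change of each iteration."""
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    trace: List[float] = []

    for _ in range(max_iters):
        # Rank held by nodes without out-edges is spread evenly over all nodes.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            for v in outs:
                new_rank[v] += damping * rank[u] / len(outs)

        # The per-iteration change is one possible "convergence feature".
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        trace.append(delta)
        rank = new_rank
        if delta < tol:
            break

    return rank, trace


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    scores, deltas = pagerank_with_trace(toy_graph)
    print(scores)
    print(len(deltas), "iterations")
```

The number of iterations and the shape of the change curve depend on both the algorithm and the input graph, which is why a predictor for such jobs needs information about convergence and not only about input size.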

Abstract

The invention discloses a running time prediction system and method for graph iteration jobs in the Gaia system. The method comprises the following steps: before job execution, quickly capturing the offline features of the current graph iterative algorithm through sampling execution, where the offline features comprise the convergence features and key input features of each iteration; during job execution, continuously capturing runtime features, which comprise job parameters, resource utilization, and detailed statistics; and using the similarity between jobs as the basis for job matching and for calculating the final predicted value, where the similarity mainly comprises static similarity captured through sampling execution and dynamic similarity captured through real execution. The specific parameters of the matching algorithm can be trained against defined similarity evaluation criteria, so that the algorithm adapts automatically to the various similarities among iterative jobs. The invention is an end-to-end running time prediction method that integrates the offline features and runtime features of graph iteration jobs and can accurately predict the running time of distributed graph iteration jobs with low training overhead.
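The abstract does not give the matching formula, so the following is only a rough sketch of how similarity-based matching could produce a predicted value. Everything specific in it is an assumption for illustration: the `HistoricalJob` record, cosine similarity as the similarity measure, the mixing weight `alpha` between static and dynamic similarity (standing in for the trainable parameters the abstract mentions), and the similarity-weighted average over the `k` closest historical jobs.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HistoricalJob:
    """Hypothetical record of a previously executed job."""
    static_features: Sequence[float]   # e.g. captured by sampling execution
    runtime_features: Sequence[float]  # e.g. captured during real execution
    running_time: float                # observed running time in seconds


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_running_time(
    static_features: Sequence[float],
    runtime_features: Sequence[float],
    history: List[HistoricalJob],
    alpha: float = 0.5,  # assumed weight of static vs. dynamic similarity
    k: int = 3,          # assumed number of matched jobs
) -> float:
    """Similarity-weighted average of the running times of the k best matches."""
    scored = []
    for job in history:
        sim = (
            alpha * cosine_similarity(static_features, job.static_features)
            + (1.0 - alpha) * cosine_similarity(runtime_features, job.runtime_features)
        )
        # Negative similarities are clipped so they cannot act as negative weights.
        scored.append((max(sim, 0.0), job.running_time))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    total = sum(sim for sim, _ in top)
    if total == 0.0:
        raise ValueError("no similar historical job found")
    return sum(sim * t for sim, t in top) / total


if __name__ == "__main__":
    history = [
        HistoricalJob([1.0, 0.0], [1.0, 0.0], 100.0),
        HistoricalJob([0.0, 1.0], [0.0, 1.0], 200.0),
    ]
    print(predict_running_time([1.0, 1.0], [1.0, 1.0], history))
```

In this sketch `alpha` is fixed by hand; in a trained setting it would be fitted against whatever similarity evaluation criterion is chosen, and the runtime feature vector would be refreshed as the job executes so the prediction can be updated.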

Description

Technical field

[0001] The invention relates to the technical field of distributed big data computing, and in particular to a running time prediction system for graph iteration jobs in the Gaia system.

Background

[0002] The Gaia system is a new-generation big data computing system, highly timely and scalable, based on the hybrid coexistence of multiple computing models. It aims to solve a series of key technical problems at several core levels of big data analysis systems, such as adaptive and scalable big data storage, fused batch and stream big data computing, high-dimensional large-scale machine learning, and highly time-efficient intelligent interactive guidance for big data. The goal is to build an independent, controllable, timely, and scalable new-generation big data analysis system and to master world-leading core technology for big data analysis systems.

[0003] Iterative computing is usually the core part of machine learning and graph processing algorithms, that...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06Q10/04, G06F16/23, G06F16/2458, G06F16/27, G06K9/62
CPC: G06Q10/04, G06F16/2365, G06F16/27, G06F16/2462, G06F16/2474, G06F18/22
Inventor: 岳晓飞, 王国仁, 赵宇海, 郑军, 李博扬
Owner: BEIJING INSTITUTE OF TECHNOLOGY