
Layout method of scientific workflow data in hybrid cloud environment

A layout method and workflow technology, applied in the fields of electrical digital data processing, genetic models, biological models, etc. It addresses the problem that the mutual influence between datasets is not considered, and achieves effects such as shortening data transmission time.

Status: Inactive
Publication Date: 2018-10-12
FUJIAN NORMAL UNIV
4 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Among existing studies, the most relevant work takes data dependencies into account: it screens out the non-private datasets that depend heavily on private datasets and places them together in the corresponding private cloud data center, which effectively shortens data transfer time. However, that work does not consider the effect of size differences between datasets on the data layout.



Examples


Embodiment Construction

[0068] In order to describe the present invention in more detail, further explanation will be given below in conjunction with the accompanying drawings.

[0069] 1 Problem analysis:

[0070] Figure 1 shows an example scientific workflow of the present invention, and Figures 2 and 3 show two data layout schemes used to describe the problem in detail. The scientific workflow consists of 5 tasks {t1, t2, t3, t4, t5}, 5 input datasets {ds1, ds2, ds3, ds4, ds5} and 1 intermediate dataset {ds6}. Dataset ds4 is a private dataset and must be stored in data center dc2. The input datasets of task t4 are {ds3, ds4, ds6}, so t4 must also run in data center dc2. If only the number of data transfers is considered, the layout in Figure 2 requires 2 transfers and the layout in Figure 3 requires 5. The size of the datasets cannot be ignored, however; the present invention assumes ds1 = 1...
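The distinction between counting transfers and measuring transferred volume can be illustrated with a minimal sketch. The task, dataset names, sizes and the two layouts below are hypothetical toy values, not the workflow of Figure 1, and the model assumes that each input dataset stored away from the data center where its task runs costs exactly one transfer.

```python
from typing import Dict, List, Tuple


def transfer_stats(
    task_inputs: Dict[str, List[str]],   # task -> datasets it reads
    task_site: Dict[str, str],           # task -> data center it runs in
    dataset_site: Dict[str, str],        # dataset -> data center storing it
    size: Dict[str, float],              # dataset -> size (arbitrary unit)
) -> Tuple[int, float]:
    """Return (number of transfers, total volume moved) for one layout."""
    count = 0
    volume = 0.0
    for task, inputs in task_inputs.items():
        for ds in inputs:
            # A dataset stored elsewhere must be moved to the task's site.
            if dataset_site[ds] != task_site[task]:
                count += 1
                volume += size[ds]
    return count, volume


# Hypothetical toy instance: one task reading one large and two small datasets.
task_inputs = {"p": ["x", "y", "z"]}
task_site = {"p": "dc1"}
size = {"x": 100.0, "y": 1.0, "z": 1.0}

layout_a = {"x": "dc2", "y": "dc1", "z": "dc1"}  # moves only the large dataset
layout_b = {"x": "dc1", "y": "dc2", "z": "dc2"}  # moves the two small datasets

stats_a = transfer_stats(task_inputs, task_site, layout_a, size)
stats_b = transfer_stats(task_inputs, task_site, layout_b, size)
print(stats_a, stats_b)
```

In this toy case layout_a needs fewer transfers than layout_b but moves far more data, which is the situation that motivates taking dataset size into account.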


Abstract

The invention discloses a layout method for scientific workflow data in a hybrid cloud environment. The scientific workflow data layout problem is encoded using a discrete particle swarm encoding. The mutation and crossover operations of a genetic algorithm are introduced to update particles, which effectively overcomes the premature convergence problem of discrete particle swarm optimization. An update method is constructed that adaptively adjusts the inertia weight factor according to the degree of difference between the global historical best particle and the current particle, so as to cope with the complex and variable nature of data layout problems. A fitness function is defined that distinguishes infeasible particles, whose stored data exceeds data center capacity, from feasible particles with remaining capacity. Together these reduce the number of data movements between data centers and the amount of data transferred, so that the execution efficiency of the scientific workflow is effectively improved.
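The ideas named in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented procedure: the linear inertia formula, the single-point mutation, the two-point crossover, the probabilities c1 and c2, the `movable` list of non-private dataset positions, and the tie-breaking rule in `better` are all hypothetical choices, since the abstract does not give the exact formulas.

```python
import random
from typing import List, Sequence

Particle = List[int]  # Particle[i] = index of the data center holding dataset i


def difference(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions at which two particles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def adaptive_inertia(current, gbest, w_min=0.4, w_max=0.9) -> float:
    """Assumed rule: the further from the global best, the larger the weight."""
    return w_min + (w_max - w_min) * difference(current, gbest)


def mutate(p, movable, n_centers, rng) -> Particle:
    """GA mutation: reassign one movable dataset to a random data center."""
    child = list(p)
    child[rng.choice(movable)] = rng.randrange(n_centers)
    return child


def crossover(p, guide, rng) -> Particle:
    """GA two-point crossover: copy a random segment from the guide particle."""
    lo, hi = sorted(rng.sample(range(len(p) + 1), 2))
    return list(p[:lo]) + list(guide[lo:hi]) + list(p[hi:])


def update(p, pbest, gbest, movable, n_centers, rng, c1=0.5, c2=0.5) -> Particle:
    """One particle update: mutation (inertia), then crossover with the bests."""
    if rng.random() < adaptive_inertia(p, gbest):
        p = mutate(p, movable, n_centers, rng)
    if rng.random() < c1:
        p = crossover(p, pbest, rng)
    if rng.random() < c2:
        p = crossover(p, gbest, rng)
    return list(p)


def better(a_cost, a_excess, b_cost, b_excess) -> bool:
    """True if layout a beats layout b; excess is capacity overflow (0 = feasible)."""
    if a_excess == 0 and b_excess == 0:
        return a_cost < b_cost        # both feasible: lower transfer cost wins
    if a_excess == 0 or b_excess == 0:
        return a_excess == 0          # a feasible layout beats an infeasible one
    return a_excess < b_excess        # assumed tie-break: smaller overflow wins


# Usage: 4 datasets, 2 data centers, dataset 3 is private and pinned to center 1.
rng = random.Random(0)
new_particle = update([0, 1, 0, 1], [0, 1, 1, 1], [1, 1, 1, 1],
                      movable=[0, 1, 2], n_centers=2, rng=rng)
```

Because mutation only touches the positions listed in `movable`, and the personal and global best particles already keep private datasets in place, the pinned position is preserved by every operator in this sketch.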

Description

Technical field

[0001] The invention relates to a scientific workflow data layout method in the field of parallel and distributed high-performance computing, in particular to a scientific workflow data layout method in a hybrid cloud environment.

Background technique

[0002] In scientific research fields such as astronomy and bioinformatics, scientists analyze data from existing data sources or collect data from physical devices by performing thousands of tasks, and generate a large amount of new data, such as intermediate data or final data results. These data are often on the order of TB or even PB. In the past, scientists typically used simple methods such as the Perl scripting language to orchestrate tasks and manage data, but this approach was time-consuming and error-prone. Computational tasks and scientific applications are therefore usually modeled as workflows that automate data analysis through data or control dependencies. Scientific workflows involving a larg...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/00, G06N3/12, G06F9/48, G06F9/50
CPC: G06F9/4881, G06F9/5027, G06N3/006, G06N3/126
Inventor: 林兵, 陈星, 项滔, 卢宇, 黄志高, 郭文忠
Owner: FUJIAN NORMAL UNIV