Data provenance traceability system and method based on multi-state scientific workflow

A data lineage traceability technology, applied in the field of data lineage traceability systems, that addresses the lack of data lineage description and tracing methods for multi-state scientific workflows.

Active Publication Date: 2014-04-23
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0008] Aiming at the lack of a unified data lineage description and traceability method supporting multi-state scientific workflow in the current complex system design simulation process and scientific research experiment based on scienti


Examples

Embodiment 1

[0143] Figure 7 shows a simple scientific workflow model with 7 nodes in total. The model obeys the following rules:

[0144] 1) Each node will generate a version of data after each execution;

[0145] 2) Each node can start to execute only after all its predecessor nodes are executed and data is generated;

[0146] 3) After a node is executed once, it can be executed again;

[0147] 4) If a node is re-executed, all its successor nodes also need to be re-executed.
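
The four rules above can be sketched as a small state machine. This is a minimal illustration, not the patent's implementation: the `Workflow` class, its method names, and the three-node chain are hypothetical, and treating "needs to be re-executed" as a stale flag that blocks downstream nodes is an interpretation of rule 4.

```python
from collections import defaultdict


class Workflow:
    """Toy workflow obeying the four rules above (hypothetical names)."""

    def __init__(self, edges):
        self.preds = defaultdict(set)
        self.succs = defaultdict(set)
        for src, dst in edges:
            self.preds[dst].add(src)
            self.succs[src].add(dst)
        self.runs = defaultdict(int)  # node -> number of executions so far
        self.stale = set()            # executed nodes awaiting re-execution

    def _descendants(self, node):
        # All transitive successors of `node`.
        seen, stack = set(), [node]
        while stack:
            for nxt in self.succs[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def can_execute(self, node):
        # Rule 2: every predecessor has run. Requiring that it is also
        # not stale is an assumption made for this sketch.
        return all(self.runs[p] > 0 and p not in self.stale
                   for p in self.preds[node])

    def execute(self, node):
        if not self.can_execute(node):
            raise RuntimeError(f"node {node} is not ready")
        rerun = self.runs[node] > 0
        self.runs[node] += 1          # rules 1 and 3: each run, a new version
        self.stale.discard(node)
        if rerun:
            # Rule 4: successors that already ran must run again.
            self.stale |= {d for d in self._descendants(node)
                           if self.runs[d] > 0}
        return f"{self.runs[node]}.0"


wf = Workflow([("A", "B"), ("B", "C")])  # hypothetical three-node chain
first = [wf.execute(n) for n in "AB"]    # A and B each produce version 1.0
again = wf.execute("A")                  # re-running A produces version 2.0
print(first, again, wf.stale)            # B is now stale; C has never run
```

In this sketch, node C cannot start until B has been re-executed, which is how rule 4 propagates through rule 2.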

[0148] For this model, this example will perform the following series of operations:

[0149] 1) Execute nodes 1, 2, 3, 5, 4, and 6 in sequence, and the version of the data obtained by each execution is 1.0;

[0150] 2) Re-execute node 4, and the data version obtained by re-executing is 2.0;

[0151] 3) Re-execute all already-executed successor nodes of node 4, obtaining new data with version 2.0;

[0152] 4) Execute node 7, and the obtained data version is 1.0;

[0153] 5) Query the da...
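
Operations 1) to 4) can be replayed in a short sketch that records, for every data version, which input versions it was derived from. Figure 7 is not reproduced in this text, so the edge list `PREDS` below is an assumed topology chosen only to be consistent with the stated execution order. The `trace` helper is an illustrative lineage lookup and does not reproduce the truncated query step.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed topology (node -> predecessors); NOT taken from Figure 7.
PREDS = {1: [], 2: [1], 3: [1], 4: [2], 5: [3], 6: [4, 5], 7: [6]}

latest = {}   # node -> latest version number
lineage = {}  # (node, version) -> [(pred, pred_version), ...]


def execute(node):
    """Run a node once and record which input versions it consumed."""
    missing = [p for p in PREDS[node] if p not in latest]
    if missing:
        raise RuntimeError(f"node {node} waits on {missing}")
    version = latest.get(node, 0) + 1
    lineage[(node, version)] = [(p, latest[p]) for p in PREDS[node]]
    latest[node] = version
    return f"{version}.0"


def executed_successors(node):
    """Already-executed transitive successors, predecessors first."""
    tainted, result = {node}, []
    for n in TopologicalSorter(PREDS).static_order():
        if n != node and any(p in tainted for p in PREDS[n]):
            tainted.add(n)
            if n in latest:
                result.append(n)
    return result


def trace(node, version):
    """All (node, version) pairs that the given data version derives from."""
    found, stack = set(), [(node, version)]
    while stack:
        for item in lineage[stack.pop()]:
            if item not in found:
                found.add(item)
                stack.append(item)
    return found


first_pass = [execute(n) for n in (1, 2, 3, 5, 4, 6)]      # operation 1
v4 = execute(4)                                            # operation 2
reruns = {n: execute(n) for n in executed_successors(4)}   # operation 3
v7 = execute(7)                                            # operation 4
print(first_pass, v4, reruns, v7)
print(sorted(trace(7, 1)))
```

Under the assumed edges, node 6 is the only executed successor of node 4, so it alone is re-executed in operation 3, and the data of node 7 traces back to version 2.0 of nodes 4 and 6 rather than to their superseded first versions.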

Abstract

The invention discloses a data provenance tracing system and method based on multi-state scientific workflow. The method extends a digraph-based scientific workflow process model to obtain an extended scientific workflow process model; enriches the data model part of the extended model using data provenance technology; and describes the execution of the scientific workflow from both the process and the data perspectives, yielding a unified process-data management model for multi-state scientific workflow that is used to describe and trace data provenance. The system and method better describe the evolution and state of data in large-scale, complex scientific computing and collaborative research and development. This strengthens monitoring of the workflow process, enables a comprehensive workflow management strategy, improves research and development efficiency, and supports scientific and technical progress.

Description

Technical field

[0001] The invention provides a data lineage tracing system and method based on multi-state scientific workflow, and in particular a method for tracing the data lineage relationships between the process nodes of a scientific workflow instance in a multi-task state, together with a method for storing those data lineage relationships.

Background technique

[0002] In large-scale, complex system design and manufacturing processes and scientific experiments, such as spacecraft design and ship manufacturing, many people usually have to cooperate to complete a large number of interdependent tasks of the same magnitude. The distinguishing feature of such work is that design and implementation involve a large number of tasks and massive amounts of data, and the workflow is highly complex.

[0003] For the management of complex workflow, in terms of process, since the design process of complex model products usually includes a large number o...

Application Information

IPC(8): G06Q10/06, G06F17/30
Inventor 黄雨井玉欣王捍贫张世琨
Owner PEKING UNIV