A method and an apparatus for setting data processing flow

A method for setting a data processing flow, applied in the field of data processing. It addresses the time-consuming debugging of processing logic and achieves a lower access threshold, flexible tuning, and faster iteration.

Active Publication Date: 2018-12-11
ADVANCED NEW TECH CO LTD
21 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

If the amount of data is large and exceeds the processing capacity of a single machine, a big data computing platform is needed, and debugging the processing logic on such a platform is often quite time-consuming.

Embodiment Construction

[0025] Embodiments of this specification will be described below with reference to the accompanying drawings.

[0026] Figure 1 shows a schematic diagram of a system 100 according to an embodiment of this specification. The system 100 performs a series of data processing steps (i.e., a data processing flow) on an input data set to obtain the required data set. The input data set can be batch data or streaming data. In one example, the input data set is the source data stream for machine learning (for example, the operation data of shopping platform users within a predetermined period of time, such as click and exposure data). The data processing flow may then include, for example, reading the source data stream, performing processing such as parsing, filtering, and grouping on it, and writing the output data stream, which serves as the sample data set for machine learning.
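The flow described in [0026] (read, parse, filter, group, write) can be illustrated with a minimal Python sketch. This is not the system 100 itself: the record layout ("user,action"), the function names, and the in-memory source list are all assumptions made for illustration only.

```python
from collections import defaultdict

def parse(lines):
    """Parse hypothetical 'user,action' lines into records."""
    for line in lines:
        user, action = line.strip().split(",")
        yield {"user": user, "action": action}

def keep_clicks(records):
    """Filter step: keep only click events (an assumed filter rule)."""
    for record in records:
        if record["action"] == "click":
            yield record

def group_by_user(records):
    """Group step: collect the actions of each user."""
    groups = defaultdict(list)
    for record in records:
        groups[record["user"]].append(record["action"])
    return dict(groups)

def run_flow(source):
    """Chain the steps: read -> parse -> filter -> group."""
    return group_by_user(keep_clicks(parse(source)))

# Stand-in for the source data stream (assumed sample data).
source = ["u1,click", "u2,exposure", "u1,click", "u2,click"]
samples = run_flow(source)
```

Because each step is a generator or a plain function over an iterable, the same chain works on a finite batch or on records consumed one at a time, which loosely mirrors the batch/streaming distinction in the text.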

[0027] As shown in Figure 1, the system 100 includes a develop...


Abstract

Embodiments of the present specification disclose a method and an apparatus for setting a data processing flow. The method comprises: obtaining a language description of the data processing flow, wherein the language description comprises the name of an input data set of the data processing flow, the name of each intermediate data set acquired in the data processing flow, the name of an output data set of the data processing flow, the processing logic between the respective data sets, and a plurality of operators corresponding to the respective data sets, wherein the plurality of operators are used to apply the data processing corresponding to the respective data sets; acquiring configuration information that includes configurations of the respective data sets and of the plurality of operators; and setting up a calculation module for implementing the data processing flow based on the language description and the configuration information.
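As a rough illustration of the idea in the abstract, the sketch below builds a callable "calculation module" from a flow description plus separate configuration. The patent's actual description language and configuration format are not shown in this excerpt, so the dictionary layout, the operator names (`parse`, `filter`), and the helper `build_module` are hypothetical.

```python
# Hypothetical flow description: data set names, the logic between them,
# and the operator that produces each data set.
description = {
    "input": "raw",
    "steps": [
        {"input": "raw", "output": "parsed", "operator": "parse"},
        {"input": "parsed", "output": "filtered", "operator": "filter"},
    ],
    "output": "filtered",
}

# Hypothetical configuration, kept separate from the description.
config = {
    "operators": {
        "parse": {"sep": ","},
        "filter": {"field": 1, "equals": "click"},
    }
}

def make_parse(sep):
    """Return an operator that splits each row on the configured separator."""
    return lambda rows: [row.split(sep) for row in rows]

def make_filter(field, equals):
    """Return an operator that keeps rows whose field matches a value."""
    return lambda rows: [row for row in rows if row[field] == equals]

# Registry mapping operator names to factories (an assumption of this sketch).
FACTORIES = {"parse": make_parse, "filter": make_filter}

def build_module(description, config):
    """Set up a callable that runs the described flow with the given config."""
    steps = [
        (s["input"], s["output"],
         FACTORIES[s["operator"]](**config["operators"][s["operator"]]))
        for s in description["steps"]
    ]

    def module(input_rows):
        datasets = {description["input"]: input_rows}
        for src, dst, op in steps:
            datasets[dst] = op(datasets[src])  # each named data set is kept
        return datasets[description["output"]]

    return module

module = build_module(description, config)
result = module(["u1,click", "u2,exposure"])
```

Keeping the configuration apart from the description means an operator can be retuned (for example, changing the filter value) without touching the flow logic, which is one plausible reading of the "flexible tuning" effect claimed above.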

Description

Technical field

[0001] The embodiments of this specification relate to the field of data processing, and more specifically, to a method and device for setting a data processing flow.

Background

[0002] Given the massive scale and unbounded growth of big data on the Internet, machine learning is a very efficient and useful tool. Big data is a collection of data so large that it exceeds the capabilities of traditional database software tools in terms of acquisition, storage, management, and analysis. In the field of machine learning for big data, users spend a large proportion of their time and energy on sample generation and feature engineering. In data preprocessing especially, users need to read various data sources. If the amount of data is large and exceeds the processing capacity of a single machine, a big data computing platform is needed, and debugging the processing logic on such a platform is often quite time-consuming. Currently commonly used real-time computi...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F8/20, G06F8/30, G06F8/35
CPC: G06F8/24, G06F8/315, G06F8/355
Inventor: 孙尚椿, 王一光, 王琳, 朱冠胤
Owner: ADVANCED NEW TECH CO LTD